h1. Chapter 4: Inter-Process Communication
One of the most important tasks of the kernel, in a client-server model, is to provide a means for clients and servers to send and receive messages. Clients need to be able to talk to servers, servers need to be able to talk back to clients, and servers need to talk to servers.
h2. 1. Communication Mechanisms
There are four parts to this communication infrastructure:
- *Ports*
When a process that wants to be a server is created, it requests a port number from the kernel. Ports are the endpoints to which clients and servers send their messages, much like a mailbox address. Each port has its own queue of messages being sent to, or returned by, the server.
- *Translator*
When a client wants to talk to a server, it needs a way to find which port number that server is bound to. This requires a separate server that has access to the kernel's directory of port numbers and translates server names to port numbers.
- *Message Passing Commands*
These are the actual syscalls that all user processes, clients and servers alike, use to send and receive messages.
- *Interrupt Communication*
There must also be a way for interrupt handlers to communicate with other processes. These calls are separate from the user space message passing commands because they are used within the kernel.
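The first two parts can be pictured with a small sketch. It is only an illustration under assumed names: @port_table@, @port_bind()@, @translate()@, @port_enqueue()@, the table sizes, and the layout of @struct msg@ are all hypothetical and are not taken from any of the kernels discussed here.

```c
#include <assert.h>
#include <string.h>

#define MAX_PORTS 8    /* illustrative limits, not from any real kernel */
#define QUEUE_LEN 4
#define NAME_LEN  16

struct msg { int op; int data; };   /* hypothetical message layout */

/* A port: an endpoint with its own queue of pending messages. */
struct port {
    int        in_use;
    char       name[NAME_LEN];      /* server name known to the translator */
    struct msg queue[QUEUE_LEN];
    int        head, count;
};

static struct port port_table[MAX_PORTS];   /* the kernel's port directory */

/* Server side: request a port from the kernel; returns its number or -1. */
int port_bind(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < MAX_PORTS; i++) {
        if (!port_table[i].in_use) {
            memset(&port_table[i], 0, sizeof port_table[i]);
            port_table[i].in_use = 1;
            strncpy(port_table[i].name, name, NAME_LEN - 1);
            return i;
        }
    }
    return -1;  /* table full */
}

/* Translator: map a server name to its port number, or -1 if unknown. */
int translate(const char *name)
{
    for (int i = 0; i < MAX_PORTS; i++)
        if (port_table[i].in_use && strcmp(port_table[i].name, name) == 0)
            return i;
    return -1;
}

/* Append a message to a port's queue; 0 on success, -1 if full or invalid. */
int port_enqueue(int port, struct msg m)
{
    if (port < 0 || port >= MAX_PORTS || !port_table[port].in_use)
        return -1;
    struct port *p = &port_table[port];
    if (p->count == QUEUE_LEN)
        return -1;
    p->queue[(p->head + p->count) % QUEUE_LEN] = m;
    p->count++;
    return 0;
}
```

A client would call @translate("cons")@ once and then use the returned number for every later send, which is why the translator only sits on the connection path and not on the message path.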
h2. 2. OS Comparisons
h3. MachOS/Hurd
MachOS/Hurd works somewhat like the basic client-server system in that the microkernel (MachOS) handles the IPC. It uses ports and a translator, but its differences are significant enough to warrant a closer look.
h4. Remote Procedure Calls (RPCs)
MachOS utilizes RPCs, which are transport-transparent procedure calls. An RPC system is basically a way of making the functions exported by a server accessible to clients (local or remote). MachOS facilitates this by providing an interface for connecting client RPC calls to the server RPC functions. This is really nice for developers writing on the MachOS platform. However, internally, the kernel has to go through complex steps to make it all work. Because of this complexity I leave further details of an RPC system to the reader (see Le Mignot and The GNU Hurd in the Bibliography).
h4. Asynchronous
The MachOS/Hurd IPC is asynchronous: a thread that sends a message does not block while waiting for the destination thread to receive it. This can easily cause scheduling headaches and other problems, and it adds a lot of overhead (such as complex buffer code).
h4. Translators
When a potential server is requesting a port it calls on the translator. The translator will actually mount the server to Hurd's Virtual File System (VFS and mount: Chapter 9 File Systems) so it is treated like any other directory.
So if a client wanted to talk to @pfinet@ (a TCP/IP internet driver), it would ask the root translator (seen by all applications) for the port right to @/servers/socket/pfinet@. A port right is the right, or permission, to connect to a port. The root translator then finds the TCP/IP translator, which opens the port by giving the port right to the client. At this point the client can make RPC calls to open TCP/IP sockets or send commands such as @ping@.
Any user who wants to run a server process can have it request a translator of its own. Upon bootup, though, most servers' translators are started by the root translator, such as @pfinet@.
This whole system has very powerful implications. Since all of the servers are accessible through the file system, file system commands such as @cd@ or @ls@ can give users access to different parts of a server when they are in the appropriate directory. This access is controlled and restricted, however, by UNIX file permissions (Chapter 11 Security).
h3. FMI/OS and Plan9
FMI/OS took many ideas for its IPC model from Plan9 because of that model's functionality and simplicity. Some things we have kept, and some we have changed.
h4. Namer
Plan9's translator, called namer, is a user space server that facilitates the translation of a server name to its port number. When a server registers itself with the namer, namer assigns it a port number that is provided by the kernel. The kernel keeps track of these associations in a lookup table.
When a process wants to send a message to a server, it must first connect to that server and get its port number. This connection request, via @msg_connect()@, uses namer to get the port number. Once connected, the process can use that port number when sending messages. When the process does send a message, using @msg_send()@, the syscall uses the port/server lookup table in the kernel to translate the port number to the server it is associated with.
This method requires a lot of bouncing back and forth between user space and the kernel. Recently, we have discussed putting namer into the kernel. It will still issue, or bind, a port to a server upon request; however, ports are represented by strings instead of numbers. And instead of these strings being some arbitrary lookup number to the server, they actually contain the address (within the file system) of the server. This accomplishes two things:
1) The user space to kernel bounces are diminished.
2) It eliminates the server name/port number lookup table.
h4. Message Passing Commands
@msg_connect( port )@
"Connecting" to a port first, before the thread can send messages to it, is important because the kernel has to authenticate the thread. For example, you don't want a user program to be able to connect straight to the @wd@ disk server because they could cause some real damage. Instead, the user program has to connect to @fs@ which can connect to @wd@ because it knows how to properly talk to @wd@ without corrupting files or some other haphazard insanity.
@msg_send( port, &msg )@
This is where the thread can actually send a message (@msg@) to the server. Since the FMI/OS IPC is synchronous, every message send causes the sender to block. It can send as many as it wants, one at a time. When it wakes up, the message that @msg@ points to has been changed to contain the server's reply.
@msg_receive( &msg )@
This allows server threads to receive a message (@msg@). Each receive blocks the thread, and it can do as many as it likes, one at a time.
@msg_reply( &msg )@
This is how the server replies to a message sent to it. It also enables the server to pass messages back to the sender by changing what's inside the original @msg@.
@msg_close()@
This closes the connection to the port and the process can no longer send or receive messages from it.
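The blocking behaviour of send, receive, and reply can be sketched as kernel-side bookkeeping. This is a single-threaded model under stated assumptions, not the FMI/OS source: the @struct thread@ and @struct port@ layouts, the state names, and the extra @self@ and @sender@ parameters are hypothetical and exist only so the sketch can show who blocks and who is woken.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 4

enum state { RUNNABLE, BLOCKED_SEND, BLOCKED_RECEIVE };

struct msg { int op; int data; };    /* hypothetical message layout */

struct thread {
    enum state  state;
    struct msg *buf;                 /* the sender's message while it is blocked */
};

/* One server port: a FIFO of senders waiting to be served. */
struct port {
    struct thread *server;
    struct thread *waiting[QUEUE_LEN];
    int head, count;
};

/* Client: queue the message and block until the server replies. */
int msg_send(struct port *p, struct thread *self, struct msg *m)
{
    if (p->count == QUEUE_LEN)
        return -1;
    self->buf = m;
    self->state = BLOCKED_SEND;
    p->waiting[(p->head + p->count) % QUEUE_LEN] = self;
    p->count++;
    if (p->server->state == BLOCKED_RECEIVE)
        p->server->state = RUNNABLE;     /* a message arrived: wake the server */
    return 0;
}

/* Server: take the next message, or block if the queue is empty.
 * Returns the sender to reply to, or NULL if the server blocked. */
struct thread *msg_receive(struct port *p, struct msg *out)
{
    if (p->count == 0) {
        p->server->state = BLOCKED_RECEIVE;
        return NULL;
    }
    struct thread *sender = p->waiting[p->head];
    p->head = (p->head + 1) % QUEUE_LEN;
    p->count--;
    *out = *sender->buf;
    return sender;
}

/* Server: overwrite the sender's message with the reply and wake the sender. */
void msg_reply(struct thread *sender, const struct msg *reply)
{
    assert(sender != NULL && sender->state == BLOCKED_SEND);
    *sender->buf = *reply;
    sender->state = RUNNABLE;
}
```

Because the sender stays blocked until @msg_reply()@ runs, the kernel never needs to buffer more than one outstanding message per sending thread, which is the simplification that synchronous IPC buys over the asynchronous model described earlier.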
h4. Interrupt Service Request (ISR)
Interrupts are implemented in the ISR. This is basically a set of message calls that allow interrupt handlers to directly inject messages into subscribed servers' queues without blocking (@kmesg_send()@ for kernel-message send, etc.). Clients never talk directly to interrupts and they never have to. Besides, the interrupt handler has to have a port to send messages to, and clients can't have ports. They always use servers as an in-between, thus further simplifying the ISR.
When still under the guise of VSTa, our kernel only allowed one server at a time to subscribe to an interrupt. If that server was busy, the interrupt handler would keep track of all of the missed interrupts. This model limited the scalability of the kernel and required more overhead and complexity to handle the backed up, missed interrupts.
Our new implementation takes advantage of shared interrupts. This means that more than one server can subscribe to an interrupt. The handler also keeps two different lists of these subscribers: a ready list and a pending list. A server that is ready to receive a message from the handler is on the ready list. When the interrupt calls the handler, the handler informs every server on the ready list and moves each of them to the pending list. When a server takes the handler's message off its queue, that server is moved back over to the ready list.
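A minimal sketch of the ready/pending bookkeeping follows. It assumes a fixed-size subscriber table indexed by a server id and models the two lists as a state tag per slot; @struct irq@, @irq_subscribe()@, @irq_fire()@, and @irq_receive()@ are hypothetical names, and the counter increment merely stands in for a non-blocking call such as @kmesg_send()@.

```c
#include <assert.h>

#define MAX_SUBS 8    /* illustrative limit */

enum sub_state { UNUSED, READY, PENDING };

/* Per-interrupt subscriber table. */
struct irq {
    enum sub_state state[MAX_SUBS];  /* indexed by server id */
    int            queued[MAX_SUBS]; /* interrupt messages in each server's queue */
};

/* A server subscribes to the interrupt and starts out ready. */
int irq_subscribe(struct irq *q, int server)
{
    if (server < 0 || server >= MAX_SUBS || q->state[server] != UNUSED)
        return -1;
    q->state[server] = READY;
    return 0;
}

/* The interrupt fired: notify every ready server and move it to pending.
 * Returns how many servers were notified. */
int irq_fire(struct irq *q)
{
    int notified = 0;
    for (int s = 0; s < MAX_SUBS; s++) {
        if (q->state[s] == READY) {
            q->queued[s]++;          /* stands in for a non-blocking send */
            q->state[s] = PENDING;
            notified++;
        }
    }
    return notified;
}

/* A server took the interrupt message off its queue: back to ready. */
int irq_receive(struct irq *q, int server)
{
    if (server < 0 || server >= MAX_SUBS || q->state[server] != PENDING)
        return -1;
    q->queued[server]--;
    q->state[server] = READY;
    return 0;
}
```

In this model a server that has not yet picked up its message is simply skipped when the interrupt fires again, so at most one interrupt message per subscriber is ever outstanding.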
This new model does away with missed interrupts and allows for a much cleaner and more flexible way of dealing with numerous subscribers.
h4. Typical IPC Flow
Let's say a thread wants to use the keyboard. The keyboard is an I/O device and uses interrupts. The in-between server is @cons@, for console (its program actually resides in the file system at location @//cons@). The procedure this thread would go through is as follows:
** @msg_connect( "//cons" )@
** @msg_send( "//cons", &msg )@ where @msg@ contains a message that the @cons@ server understands as "get keyboard input". The program blocks until @cons@ replies with the actual keyboard input inside @msg@.
** @msg_close( "//cons" )@ to close the connection after looping through send/receive/send/receive enough times to get all that it needs.
In the case of the @cons@ server, it would look something like this:
** @msg_receive( &msg )@ which will block the server until it receives a message.
** @msg_send()@ to the keyboard interrupt and block, putting returned data into @msg@ when woken.
** @msg_reply( &msg )@ which will wake up (i.e. @setrunnable()@) the blocked thread, with the new data in @msg@.
** The server will then loop back to @msg_receive()@ and block, waiting for more messages.
h4. Critical Region
Due to the nature of multithreaded programming, special attention must be paid to the critical regions of code. These critical regions are parts of a program that access a shared resource. For instance, the message queue for a server is shared by everyone; in other words, any other client or server can send a message to it whenever they please (if they have the appropriate privileges). The part of the program that processes a message send request by putting a message on the server's queue, and then incrementing the index pointer on that queue, is a critical region. Keeping in mind that a preemptive scheduler (such as ours) can interrupt a process anywhere, imagine what may happen if that process were interrupted after adding a message to a queue, but before incrementing the index pointer! Now imagine that the next process to run sends a message to the same queue. Since the index pointer was never incremented, the message from the previous process is overwritten!
This is an example of a race condition: two processes are racing for access to a shared resource, and the one that gets there first alters the state of the resource, adversely affecting the other's work on the resource.
To overcome this common occurrence, the kernel must ensure that all message commands are atomic, and thus are run as one command without interruption. The only case, then, where race conditions could still become a problem is in the context of multiprocessors. This is because two different message syscalls can be running at once (on two different processors). This will be covered more in Chapter 6 section 2.
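The overwritten message can be reproduced by replaying the bad interleaving by hand. The sketch below is a single-threaded illustration under assumed names (@struct queue@, @enqueue_store()@, @enqueue_advance()@, @enqueue_atomic()@, and @replay_race()@ are all hypothetical); the enqueue is split into its two halves so that a "preemption" can be placed between them, and the atomic version only stands for what the kernel guarantees, since how that guarantee is enforced is not shown here.

```c
#include <assert.h>

#define QUEUE_LEN 4

struct queue {
    int slots[QUEUE_LEN];
    int index;                  /* next free slot */
};

/* The two halves of a non-atomic enqueue. */
void enqueue_store(struct queue *q, int msg) { q->slots[q->index] = msg; }
void enqueue_advance(struct queue *q)        { q->index++; }

/* Atomic from the callers' point of view: nothing may run between the
 * store and the increment (the enforcement itself is outside this sketch). */
void enqueue_atomic(struct queue *q, int msg)
{
    assert(q->index < QUEUE_LEN);
    q->slots[q->index] = msg;
    q->index++;
}

/* Replay the race: A stores, is preempted, B enqueues fully, A resumes.
 * Returns what ends up in slot 0. */
int replay_race(struct queue *q, int msg_a, int msg_b)
{
    enqueue_store(q, msg_a);    /* A: store into slot 0 ...             */
                                /* ... A is preempted before advancing  */
    enqueue_store(q, msg_b);    /* B: store into the same slot 0        */
    enqueue_advance(q);         /* B: index becomes 1                   */
    enqueue_advance(q);         /* A resumes: index becomes 2           */
    return q->slots[0];
}
```

After the replay, slot 0 holds B's message, slot 1 was never written, and the index claims two messages are queued; A's message is gone. With @enqueue_atomic()@ the same two sends land in slots 0 and 1 in order.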
Now on to Memory Management!
(c)2006 Dimitri Hammond