[WARNINGS]
link-id not found - [fmiosacomparativestudypartikernellandchapter1bootup|1
link-id not found - [fmiosacomparativestudypartikernellandchapter3scheduling|3
link-id not found - [fmiosacomparativestudypartikernellandchapter5memorymanagement|5
link-id not found - [fmiosacomparativestudypartikernellandchapter7deadlocks|7
link-id not found - [fmiosacomparativestudypartiiuserlandchapter8iolayers|8
link-id not found - [fmiosacomparativestudypartiiuserlandchapter10environmentlayer|10

Intro (I 1 2 3 4 5 6 7) (II 8 9 10 11) (D) (C) (B

I)

h1. Chapter 2: Processes/Threads

(If you want to destroy my sweater ...)

Processes (and threads) are one of the most significant resources of any computer, and managing them is one of the primary objectives when designing an operating system. This chapter is, therefore, one of the most important, and also one of the longest in the entire thesis.

h2. 1. Process

What is a process? Processes are essentially what one would consider a program that has started to execute: a process has instructions that are to be executed by the processor and has memory set aside to keep track of data and its state.

For instance, a word processor is considered a process. It sends instructions to the processor that result in formatting of text, spell checking, even reading-in from the keyboard. It also has some memory set aside to keep track of all of the user's text.

Now, let's say the word processor supports hyperlinks, where you click a link and it'll open a web browser for you. In this case, the word processor starts another process, the web browser. The web browser has its own instructions it needs executed. It also has its own memory to hold web pages, etc. The word processor and the web browser are two distinct processes. Their code is separate, their global data areas are separate, and their stack areas (where local variables are stored) must be separate. Each is a computing entity unto its own. Each is "independent" of the other and may be run in random order by the OS scheduler. In fact, the OS may run each in interwoven bursts of time, making it appear to the user that both are running simultaneously! This is an example of multiprogramming.

Whereas the example web browsers and word processors may alternate execution and continue to present simultaneously changing data to the user, some processes are intended as background processes. These are called daemons. These processes do not interact directly with humans but handle other external events, such as mail arrivals, network connections from external hosts, the "plugging in" of PCMCIA cards, etc.

It is the job of the OS to control processes and schedule each one's use of the processor. These stints of processor usage by each process may be short, measured in milliseconds (msec), yet add up in the end to complete whole tasks. Since they are so quick to the human eye, we may see the results of two (or more) running tasks at roughly the same time. The scheduling of these processes is handled by the OS component called the scheduler. Scheduling is a complicated endeavor that requires complex algorithms. Each OS will implement its scheduler differently and Chapter 3 covers a few examples of scheduling algorithms.

Keeping track of processes is important. A table is maintained within the kernel that has entries for each process. These entries contain everything that the process needs while running: instruction pointer, registers (temporary holding space for data the processor uses when executing a process), stack pointer, process state, priority, scheduler data, etc. When the kernel switches from one process to the next, it saves all of these data about the currently running process into their spot in the table, essentially freezing the process in place. The kernel then loads info about the next process into the processor and runs it. This change is transparent to the process itself. All of its registers are there; its stack is still the same; its files are still open when it is chosen to run again in the future. This procedure, called a context switch, is effective in multitasking between multiple processes, but it may be costly, in terms of processor time.

One important thing to look at is the creation of a process. Think of living cells: to create a new cell they basically split off a duplicate of themselves. This is really the easiest way because the best blueprint with which to make a new cell is their own! This is a lot like how computer processes are created in many operating systems. The two basic requests (syscalls) to the OS to create new processes doing new activities are @fork()@ and @exec()@.

Let's say you were running a command-line shell. This is the typical program that the user directly interacts with to send commands to the operating system, such as "run this" or "list the contents of this directory." Suppose you were to type in @ls@ (on POSIX systems, Chapter 10) to do the latter. The shell process would then make a syscall (commands you send to the kernel) called @fork()@. This @fork@ command will then create a new process identical to the shell process: its new memory will be a copy of the shell's memory, its new table a copy of the shell's. The only thing it gets that's unique is a new process id (PID). Once this copy is made, and the new spawned process is run, it'll act just like another shell. However, the first thing this new shell does is make another syscall, @exec()@, which "executes" the desired command (@ls@ in this case). This execution replaces the memory of the running process with that of the new program and the new shell process is replaced by @ls@, And when @ls@ stops, the process dies with it.

[[Image(http://www.ocgnet.org/~dam/fork.jpeg)]]

Now, there is another way of accomplishing multiple tasks at once without creating whole new processes: threads.


h2. 2.Thread

Lets say you are running your word processor, writing a 600-page paper, and you decide to save your document. However, saving a 600-page document takes some time and you could be stuck waiting for it. It would be much more efficient to enable the word processor to save the document while you were still allowed to type. Well, that's actually what it does, and it does it using threads.

It would be silly for the word program to start a whole new process just to save the document. If it were to do this, it would have to copy over the entire memory of the document to the new process, still taking a good bit of time. However, lets say a new "sub-process" was started within the word program. This sub-process shares the same memory, but has its own task to execute, getting its own time to be scheduled on the processor. This sub-process is called a thread, for "thread of execution". Machines that implement processing of multiple threads are called multi-threaded machines.

Threads also have their own table of information, but it is typically different than the process table. Implementation of threads is also sometimes separate from implementation of processes. Although processes are very much a kernel's responsibility, threads don't have to be. Let's look at the operating system combination MachOS/Hurd, for example.

MachOS is a microkernel that implements only IPC (Inter-Process Communication, see Chapter 4), basic I/O interaction, basic memory handling, and basic process scheduling. Running on top of MachOS in a "multi-server" (Cite: Le Mignot, 3) configuration is the Hurd server system. This server system is basically the interim between the kernel and user environments. This is where services typically seen inside monolithic kernels reside, such as implementation and scheduling of threads, full memory handling, filesystems implementation, and network stacks. This system therefore allows many different servers to run on one single MachOS kernel, allowing a third "environment" layer (on top of each of the servers) with which the user actually interacts (such as a POSIX environment, MacOS, or even DOS).

MachOS and Hurd basically split apart what UNIX contains all within its monolithic kernel design. However, when it comes to threads, UNIX can still sometimes be found implementing them in user space in the form of a library. But the kernel is the most popular place for managing threads. When kernel threads are the case, the syscall @t_fork@ is set aside for creating threads in much the same way @fork@ creates processes.

In UNIX there are two main tables for handling processes (the threads get their own, separate tables).

* *Process table (contains information needed at all times, even when process is not in memory): * Scheduler parameters (Chapter 3) * It's virtual memory image (Chapter 5) * Signals (so the process can still receive them when it's not in memory) * Miscellaneous Information (process state, event it's waiting for, process id (PID) for itself, its parent, and its children, etc)

* *User table (information needed only when process is in current memory and running): * Machine registers (for when trapping, see below) * Syscall state (parameters and results) * File descriptor table (list of files open) * Kernel stack * Miscellaneous (user & system CPU time, limits to CPU time, stack size, etc)

The implementation of these tables is pretty straightforward. The process table is held in memory at all times because it is used by the scheduler, memory manager, and signal manager, even for processes that aren't active. Then, when the scheduler determines that it's ready for a process to come back to life, it performs the context switch, saving the current process' User Table and then loading the new process' User Stack.

One interesting note about UNIX and its process creation is something called copy-on-write. When a new process is forked it actually doesn't get a full copy of the memory of the parent process. In fact, the kernel actually cheats a little and only gives the new process a pointer to the parent's memory pages. This way the child process can read from the memory all it likes. It's not until the moment the child tries to write to the memory that the kernel copies it so the child can write and read to it at will. This, therefore, saves memory and processing power by not creating extraneous memory if a new process never needs to write to it.

Linux tackles the process/thread situation slightly different than its big brother, UNIX. Linux actually combines processes and threads into one general structure. Instead of @fork@ and @t_fork@, it uses @clone@. Whether the new clone is a process or a thread is determined by the parameters sent to @clone@. These particular parameters are centered around sharing flags. These set of flags basically determine which address spaces and tables are shared between clones. Threads share the same address space, but processes have their own. This also implies that the contents of the Process Table and User Table in UNIX are combined under Linux.

To be compliant with the POSIX standard, Linux has Process and User table structures that just point to the appropriate parts of the combined tables. Linux also has @fork@ and @t_fork@ but they're basically just wrapper functions to @clone@ that call @clone@ with the particular share flags appropriate to process or thread.

h2. 3. FMI/OS

FMI/OS implements threads in the kernel, treating them not much differently than processes, much like Linux. Details on their implementation are in the next chapter, Scheduling.

As far as the Process and Thread tables, FMI/OS separates them. These structures contain: ** Process Structure * Permissions * Protections * Debugging information * Pointers to the parent process and child process(es) * List of its threads * Its memory pages, or virtual address space (VAS) * And the ports it has for use by the IPC (Chapter 4)

** Thread Structure * Parent process information * Pointer to its node in the schedule tree * Clock ticks ran for and ticks left to run (for use by scheduler) * State of process (whether running or blocking or sleeping) * Lists of what it's waiting for if blocking or sleeping * Stack info (for context switching) * Number of times it ran over its quantum (Chapter 3) * Miscellaneous flags

For a more detailed look at FMI/OS's thread and process structures check the source:

'http://amatus.g-cipher.net/cgi-bin/archzoom.cgi/fmios-pqm@ocgnet.org--fmios/kernel--devel--1.0--patch-6/include/vsta/proc.h'

'http://amatus.g-cipher.net/cgi-bin/archzoom.cgi/fmios-pqm@ocgnet.org--fmios/kernel--devel--1.0--patch-6/include/vsta/thread.h'

Chapter 3 Scheduling


(c)2006 Dimitri Hammond

h4. Comment by fuzzy@bo2k.net on Thu Nov 30 21:33:10 2006

your archzoom links are dead, probably want to get those fixed up soon. I'll see if I can find the equiv links, or at least post the files somehwere permanent if I can.

h4. Comment by mabri on Sat Dec 2 09:49:40 2006

I've just fixed the links up, they now work (I don't know for how long ;-(, they're to AFAIK now-obsolete archives).

AddComment

fork.jpeg - Diagram of forking (38.4 KB) dam, 2006-12-02 02:08

Also available in: HTML TXT