CS111 Greek to English Dictionary

Instructor: Mark Kampe

Last updated: Feb 15, 2008

Gary Nutt's "Operating Systems, A Modern Perspective" has a lot of good things going for it: the range of subjects, the depth of coverage, frequent examples from real systems, and a good blend of theory and code. Some of the discussions, however, take a more formal approach (e.g. expression of principles in set-theory notation) that may not be well targeted to a general undergraduate audience (who are more concerned with understanding what operating systems are and do than with being able to read theoretical papers in operating systems). Also, as with any presentation, there are inevitably some points that don't come across quite as well as the writer intended.

This "Greek to English Dictionary" attempts to explain, in plainer language, the points that Dr. Nutt is making in some of those sections. Also, for easy reference, it assembles definitions of a few key concepts and descriptions of a few key principles that are not well covered in the text.

Clarifications:

2.3.1 Figure 2.6 - FORK, JOIN, QUIT

For numerous reasons (many of which should be clear after studying this example) fork and join are not generally used as synchronization operations. They are, however, fundamental operations in all discussions of parallel algorithms ... and the join operation does achieve synchronization (by forcing the first thread to join to wait for the second one to do so).

In this example (which continues from the problem posed in figure 2.5), the goal is to let the processes do their computations (A1, A2, B1 and B2) in parallel, but force the parallel computations back into a single thread for the update and retrieve operations. This "serialization" (the elimination of parallelism) enables us to ensure that:

The flow of control in this example is illustrated by the following figure:

[Figure: flow of control in fork/join example]

The way the solution works is:

Critical sections and synchronization are such important subjects that we dedicate three chapters and four lectures to them. You will learn much better ways to solve this kind of problem.
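The basic fork/join pattern described in this section can be sketched with POSIX threads. This is only an illustration of the idea, not a transcription of Nutt's figure: compute_a, compute_b and fork_join_update are hypothetical names, and the two "computations" are trivial stand-ins for A1 and B1.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the parallel computations (A1, B1 in the text). */
static void *compute_a(void *arg) { *(int *)arg = 6 * 7;  return NULL; }
static void *compute_b(void *arg) { *(int *)arg = 10 + 8; return NULL; }

/*
 * Fork two threads, let them compute in parallel, then join both.
 * Everything after the joins runs in a single thread again, so the
 * update of the shared value needs no further locking.
 * Returns 0 on success, -1 if a thread could not be created.
 */
int fork_join_update(int *shared)
{
    pthread_t ta, tb;
    int a = 0, b = 0;

    if (pthread_create(&ta, NULL, compute_a, &a) != 0)    /* fork */
        return -1;
    if (pthread_create(&tb, NULL, compute_b, &b) != 0) {  /* fork */
        pthread_join(ta, NULL);
        return -1;
    }

    pthread_join(ta, NULL);   /* join: wait for A to finish */
    pthread_join(tb, NULL);   /* join: wait for B to finish */

    *shared = a + b;          /* serialized update, one thread only */
    return 0;
}
```

The joins are the synchronization: the update cannot begin until both parallel computations have finished, whichever of them finishes first.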

3.3.3 (Second Edition Only) Requesting Services from the OS

Near the end of this section, Nutt says:

Many students ask what the point of this sentence is. It is actually an introduction to some very interesting issues in OS design. This paragraph is related to, but does not follow directly from the preceding discussion of message-based and trap-based service requests. The discussion is still about service requests, but he has moved from the means of issuing service requests to the process context in which the requested operations will be carried out.

He suggests two ways of organizing an operating system:

  1. An operating system that does not have any processes of its own. When a user application makes a service request, that application traps into supervisor state, where it continues executing as the same process. If the service request is a simple one (e.g. return the time of day) it gets the requested data and returns promptly to user mode.

    If the request is one that might block (e.g. to allocate a resource that is not available or await the completion of an I/O operation) then the requesting process is put to sleep (in supervisor mode) and re-dispatched when the awaited operation has completed. The service request then finishes, the operating system returns to user mode, and the user program resumes execution.

    The key thing here is that the process that blocked and was later re-dispatched was the user process, temporarily executing in supervisor mode only for the duration of the request. Dr. Nutt asserts that the only processes executing in the operating system are of this sort.

  2. An operating system in which services are provided by system processes, and user processes request services by sending messages to those service processes. The requesting user process may or may not block awaiting a response ... but even if it blocks, the requesting process is still entirely distinct from the service process.

    If the service process needs to allocate resources or await I/O, it will do so, perhaps blocking itself. Later, when the requested operations are complete the service process will send a response message back to the user process.

    The key here is that there are some processes that exist entirely within the operating system and solely to perform services for user processes.

In the last two sentences of section 3.3, Dr. Nutt alludes to the fact that it is occasionally awkward to implement a service that only runs in user processes. In the example of waiting for a user I/O operation to complete or waiting for a user-requested resource to be freed, it makes perfect sense to block the user process until the awaited event happens.

There are, however, cases where a service may need to block (awaiting an external event) but where it doesn't make sense to do this in any particular user process:

In fact, it is possible to solve problems like this without resorting to system processes (e.g. interrupt driven chain scheduling). The next I/O requests are sitting in a queue in the driver, and when an I/O completion interrupt comes in, the I/O interrupt handler pulls the next request off the queue and issues the appropriate commands to the device controller. The request waited for the device to become free, but the waiting did not require the blocking of a process.
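A minimal sketch of chain scheduling, assuming a driver that keeps a small FIFO of request ids. The structure and function names (struct driver, submit_request, completion_interrupt, start_device) are hypothetical, and start_device merely records the request where a real driver would program the controller.

```c
#include <stddef.h>

#define QUEUE_MAX 8

/* Hypothetical driver state: a FIFO of pending request ids plus a busy flag. */
struct driver {
    int    queue[QUEUE_MAX];
    size_t head, count;
    int    busy;      /* nonzero while the device is working on a request */
    int    current;   /* id of the request the device is working on       */
};

/* Stand-in for issuing commands to the device controller. */
static void start_device(struct driver *d, int req)
{
    d->current = req;
    d->busy = 1;
}

/* Called by a requester: start at once if the device is idle, otherwise
 * just queue the request. Returns 0 on success, -1 if the queue is full.
 * Nothing here blocks a process. (A real driver would also have to mask
 * interrupts while it touches the queue.) */
int submit_request(struct driver *d, int req)
{
    if (!d->busy) {
        start_device(d, req);
        return 0;
    }
    if (d->count == QUEUE_MAX)
        return -1;
    d->queue[(d->head + d->count) % QUEUE_MAX] = req;
    d->count++;
    return 0;
}

/* Called from the I/O completion interrupt: chain to the next request.
 * Returns the id of the request that just completed. */
int completion_interrupt(struct driver *d)
{
    int done = d->current;

    d->busy = 0;
    if (d->count > 0) {
        int next = d->queue[d->head];
        d->head = (d->head + 1) % QUEUE_MAX;
        d->count--;
        start_device(d, next);
    }
    return done;
}
```

The second request waits in the queue until the first completes, but the waiting is represented by a queue entry rather than by a sleeping process.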

When should we use queued requests and when should we use processes?

Dr. Nutt's last paragraph in 3.3 was contrasting these two extremes of operating system organization, and observing that one of them (no system processes) was very constraining. It is doable (I believe that IBM's OS/MVT worked this way) but it is easier if you can have a few system processes.

Note, however, that this is only loosely coupled to the earlier discussion of messages vs traps. Many systems use traps into supervisor state, and most supervisor execution is on behalf of a specific user process ... but they also have a few system processes to handle particular situations.

6.3 (Second Edition Only) Process Address Spaces

The description of what an address space is may be a little confusing. Most people define an address space as the set of memory (and memory-like) locations that are addressable by a process. Nutt says (and illustrates in figure 6-1) this includes files and resources.

There are a few situations where this is true, but in most cases system services cannot be found in a process' address space. Rather they exist in separate address spaces, and are accessed through call-gate system call trap instructions. Similarly, most of the non-memory resources allocated to a process cannot be found within the process' address space. They too exist in OS (or other server process) address spaces and are accessed through system calls.

Resources that might conform to Nutt's description are:

In all of these situations however, the resources in the process' address space are still some sort of memory. Thus, for all intents and purposes, a process' address space is made up of the memory (and memory-like) locations that the process can directly address. Any "resources" that may be mapped into the address space are located in one of those memory segments.

6.3.3 (Second Edition Only) Maintaining Consistency in Address Space

This is actually a very simple concept, but there are two problems. First, the concept is being discussed in the abstract ... when we will have very concrete opportunities to discuss it when we get to memory scheduling, and page replacement algorithms. The second problem is that the formalisms are much more complex than the statement they are trying to formalize.

What is the real problem we are talking about?

In modern computers and operating systems, there is a hierarchy of storage media. Modern CPUs are much faster than memory. Rather than slow the machine down to memory's speed, modern CPUs include one or more levels of "cache memory". This cache memory is many times faster than main memory.

The basic idea of caching is that each time you have to fetch something from memory, you save a copy of it in the fast cache. Each time you need something, you first look in the fast cache, and only go to memory if the item is not already in cache. When we get to swapping and demand paging (in the memory management and virtual memory lectures) we will develop a very similar relationship between memory and disk.

What does the formalism mean?

In one sentence: "If you have multiple copies of something, and you change one of them (but not the others), they cease to be copies of the same thing (until you update the other copies too)."

We normally think of caching read operations. The first time we have to read something from memory, we copy it into the cache ... and after that we don't have to go to memory anymore. We can still think of the copy in memory as the definitive one ... but the cache is a very convenient local copy. The situation becomes a little more complex when we consider write operations. When we update something, we would much rather update the cache than memory (because the cache is so much faster). However, if we update something in cache ... the cache copy becomes the definitive copy, and the copy in memory is out-of-date.

Having the in-memory copy be out-of-date (the term "stale" is often used) is a bad thing. After the CPU updates some location in the cache, the cache must (in the background) propagate this update back to main memory ... bringing it back up-to-date. When we get to demand paging, we will see the exact same relationship emerge between modified pages in memory and their saved copies on disk.
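The stale-copy problem can be made concrete with a toy model: a one-entry cache in front of a small array standing in for main memory. All of the names here (struct cache, cache_read, cache_write, cache_flush) are invented for the illustration, and real caches are of course hardware, not C code.

```c
#include <stdbool.h>

#define MEM_WORDS 16

/* Toy model: a one-entry cache in front of a small "main memory". */
struct cache {
    int  memory[MEM_WORDS];  /* slow backing store             */
    int  addr;               /* address held in the cache      */
    int  value;              /* cached copy                    */
    bool valid;              /* does the entry hold data?      */
    bool dirty;              /* cached copy newer than memory? */
};

/* Propagate a modified entry back to memory, bringing memory up to date. */
void cache_flush(struct cache *c)
{
    if (c->valid && c->dirty) {
        c->memory[c->addr] = c->value;
        c->dirty = false;
    }
}

/* Make addr the cached entry, writing back the old entry if needed. */
static void cache_load(struct cache *c, int addr)
{
    if (c->valid && c->addr == addr)
        return;                  /* hit: nothing to do */
    cache_flush(c);              /* the old entry may be dirty */
    c->addr  = addr;
    c->value = c->memory[addr];
    c->valid = true;
    c->dirty = false;
}

int cache_read(struct cache *c, int addr)
{
    cache_load(c, addr);
    return c->value;
}

/* Write only to the cache; memory is stale until the next flush. */
void cache_write(struct cache *c, int addr, int value)
{
    cache_load(c, addr);
    c->value = value;
    c->dirty = true;
}
```

After a cache_write, the cached copy is the definitive one and memory still holds the old value; only cache_flush (or eviction of the entry) brings the two back into agreement.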

So if we agree that the other copies have to be updated, the only remaining question becomes when?

6.7 Resource Managers

This is a completely reasonable section, but there might be some value in explaining the set-theory notation.

7.3 Partitioning a Process into small processes

This is another completely reasonable section, but there might be some value in cutting through the Greek letters.

The basic concept here is that (most) processes do not get the CPU and run continuously until they complete. They regularly request resources, services and I/O that may not be immediately available. As a result, a process typically runs for a while, and then waits for something, and then runs some more, and waits some more.

A scheduler is only interested in processes that are ready to run. Processes that are blocked for something are not eligible to run. Thus it is useful to look at the execution of a process, not as a single execution interval, but as an alternating series of computations and waits.

7.4 Approximating System Load

This section introduces some basic concepts and notations from queueing theory.

8.3 Semaphore counts

"s" refers to the value of the semaphore's counter. If s is positive, a P will succeed. If s is zero, a P will block.

This counter makes semaphores more general than a simple lock. If s is initialized to one, the semaphore will act as a mutual exclusion lock.

If it is initialized to a higher number (e.g. 6) it will allow up to six people to enter the critical section before starting to block them. This makes little sense if we are thinking about mutual exclusion ... but suppose the semaphore represented the number of messages that were available to be read. Each time a new message was received, the count would be incremented (V the semaphore). When someone wanted to get a message, they would P the semaphore. If the count was positive (messages already available) they would continue immediately. If the count was zero (no messages available) they would wait until a new message arrived. Nutt discusses this use in section under "The producer-consumer problem".
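The behaviour of the counter in the message example can be traced with a toy, single-threaded model. This is not a real synchronization primitive: try_P is a hypothetical non-blocking stand-in that returns 0 at the point where a real P would put the caller to sleep.

```c
/* Toy, single-threaded model of a counting semaphore. */
struct semaphore {
    int s;   /* the counter: how many P operations can still succeed */
};

/* V: one more message is available. */
void V(struct semaphore *sem)
{
    sem->s++;
}

/* Non-blocking stand-in for P: if the counter is positive, decrement it
 * and return 1; otherwise return 0, where a real P would block until
 * someone performed a V. */
int try_P(struct semaphore *sem)
{
    if (sem->s > 0) {
        sem->s--;
        return 1;
    }
    return 0;
}
```

With s initialized to zero, the first try_P fails (no messages yet); after two V operations, two try_P calls succeed and a third fails again.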

8.3 Figure 8.xx Implementing semaphores w/TS

The semaphore data structure in this example contains three interesting fields:

The key thing to understand about this example is that there is a critical section surrounding the testing/incrementing/decrementing of the semaphore counter. The mutex is used to protect that critical section. Getting the mutex does not mean that your P succeeded ... it merely means that you can now safely check/modify the semaphore counter.

With this in mind, what the code does is:

The only really tricky thing about the code is the use of s.hold:

This is not a great implementation of P and V (mostly because of the wasteful busy-wait), but it is a correct one.

8.3 Atomic Instructions

Nutt mentions the atomic instruction Test-and-Set (TS), but does not describe it in detail. In the lecture, I make use of a more powerful atomic instruction, Compare-and-Swap (CS). This note describes what they actually do.

Like all atomic instructions, they implement a read/modify/write operation with no possibility of interruption or interference from any other device or processor on the bus. The TS instruction has been around for a very long time. Its functionality is illustrated by the following code (although it is actually implemented in hardware):

	char TS( char *p ) {	/* address of byte to test/set	*/
		char rc;

		rc = *p;	/* note current value of byte	*/
		*p = 0xff;	/* set byte to 0xff		*/
		return rc;	/* return previous value	*/
	}

The more interesting CS instruction came about in the '70s and is much more powerful, because it can implement "change the value if no one else has changed it". Its functionality is illustrated as:

	int CS( int *p, int old, int new ) {
		
		if (*p == old) { /* if value hasn't changed	*/
			*p = new;/* replace it with new value	*/
			return 1;/* return success		*/
		} else
			return 0;/* value has changed, failure	*/
	}

In the x86 architecture, this is done with the CMPXCHG instruction.
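As a sketch of how these two instructions are typically used, here is a spin lock built on test-and-set and a lock-free add built on compare-and-swap. It uses the C11 <stdatomic.h> operations rather than the TS and CS functions above (which, written as ordinary C, are not actually atomic); compilers typically translate these operations into the corresponding hardware instructions. The names spin_lock, spin_unlock and atomic_add are made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Spin lock built on test-and-set. */
static atomic_flag lock_flag = ATOMIC_FLAG_INIT;

void spin_lock(void)
{
    /* test_and_set returns the previous value: true means someone
     * else already holds the lock, so keep trying. */
    while (atomic_flag_test_and_set(&lock_flag))
        ;   /* busy-wait */
}

void spin_unlock(void)
{
    atomic_flag_clear(&lock_flag);
}

/* Lock-free add built on compare-and-swap: "change the value if no one
 * else has changed it", and retry if someone has.
 * Returns the new value. */
int atomic_add(atomic_int *p, int delta)
{
    int old = atomic_load(p);

    /* On failure, compare_exchange stores the current value into old,
     * so the next attempt is computed from fresh data. */
    while (!atomic_compare_exchange_strong(p, &old, old + delta))
        ;   /* retry */
    return old + delta;
}
```

Note the difference in style: the TS lock makes other threads wait, while the CS loop never holds a lock at all. It simply redoes its (very short) computation if another thread got there first.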

10. The Banker's Algorithm

I am not aware of any systems that actually use the Banker's algorithm, because it makes some fairly major assumptions. It is, however, a good exploration of safe states, which are a very useful concept. The Banker's algorithm is only applicable to commodity resources (resources like money, memory, or CPU time, where you need a certain amount, but don't particularly care exactly which instances of those resources are allocated to you). It turns out, however, that commodity resources are a place where resource ordering is often not an option, and so it is nice to have another way of dealing with deadlocks involving these resources.

The banker's algorithm takes a recursive approach to the resource allocation question by starting with known safe states, and then saying that any state, from which you can get to a safe state, is also a safe state:

The banker's algorithm is testing for this last condition. A high level but informal summary of the banker's algorithm is:
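As a rough sketch of the kind of safe-state test involved, here is the usual textbook formulation, simplified to a single commodity resource. The function name, parameter names and the MAX_CLIENTS limit are assumptions made for this illustration, not anything taken from Nutt's presentation.

```c
#include <stdbool.h>

#define MAX_CLIENTS 16

/*
 * Safe-state test for one commodity resource.
 *   n        number of clients (at most MAX_CLIENTS)
 *   held[i]  units currently allocated to client i
 *   max[i]   maximum units client i may ever need (declared in advance)
 *   avail    units currently unallocated
 * Returns true if there is some order in which every client can obtain
 * its maximum, finish, and give back everything it holds.
 */
bool is_safe(int n, const int held[], const int max[], int avail)
{
    bool finished[MAX_CLIENTS] = { false };
    int  remaining = n;
    bool progress = true;

    if (n > MAX_CLIENTS)
        return false;

    while (remaining > 0 && progress) {
        progress = false;
        for (int i = 0; i < n; i++) {
            /* Can client i's worst-case remaining need be met right now? */
            if (!finished[i] && max[i] - held[i] <= avail) {
                avail += held[i];    /* it finishes and returns its units */
                finished[i] = true;
                remaining--;
                progress = true;
            }
        }
    }
    return remaining == 0;   /* safe only if everyone could finish */
}
```

For example, with 12 units in total, maximum needs of 10, 4 and 9, and current holdings of 5, 2 and 2 (3 units free), the state is safe. If the third client is given one more unit (holdings 5, 2 and 3, with 2 free), the state becomes unsafe, so a banker would refuse that request even though the unit is available.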

8.3.1 Figures 8.15-16 Reader/Writer coordination

These two code examples are fairly complex and many people have trouble understanding them. The more interesting example is 8.16, but because 8.16 is an attempt to correct problems with 8.15, it is important to understand both samples.

8.15 reader/writer locks

The problem with this code is that a writer could be starved by a long line of readers. It might be more fair to say that once a writer gets in line, new readers have to wait until the writer finishes (a more FCFS approach). Nutt also suggests that more than fairness is at stake, because readers might be eager to see the latest/greatest value.

8.16 reader/writer locks

Note, however, that the reader serialization created by writePending only applies to the critical section that raises readCount and (if necessary) gets writeBlock. Once the readers get past this point, they can all run in parallel (which is, after all, the point of read/write locks).

12.1.1, 12.2.1 Virtual Address Translation

In chapter 12, Nutt uses set-theory notation to describe the mapping of virtual addresses into physical addresses. These formalisms are not used to state any assertions or theorems, they are just the way he chooses to talk about the mapping process.

12.1.1 $\Psi_t$: virtual address space $\rightarrow$ physical address $\cup\ \{\Omega\}$

12.2.1 Virtual Address Translation

This section too includes much more Greek than is required to describe the (relatively simple) mapping from a virtual address to a physical address in a paged system.

Nutt talks about space in terms of words, but this is an imprecise term. The description below is in terms of bytes, and assumes that all addresses are byte addresses (i.e. that they are capable of addressing individual bytes).
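In byte terms, the translation in a paged system comes down to a division, a table lookup and an addition. The sketch below assumes a 4096-byte page and a tiny eight-entry page table; PAGE_SIZE, NUM_PAGES, struct pte and translate are all names and values chosen for the illustration.

```c
#include <stdint.h>
#include <stdbool.h>

#define PAGE_SIZE 4096u   /* assumed page size in bytes         */
#define NUM_PAGES 8u      /* assumed size of the toy page table */

struct pte {
    bool     present;   /* is the page in physical memory? */
    uint32_t frame;     /* physical page frame number      */
};

/*
 * Translate a virtual byte address into a physical byte address.
 * Returns true and stores the result in *paddr on success. Returns
 * false if the page is not present (presumably the case the Omega in
 * Nutt's mapping stands for), which a real system would treat as a
 * page fault.
 */
bool translate(const struct pte table[NUM_PAGES], uint32_t vaddr,
               uint32_t *paddr)
{
    uint32_t page   = vaddr / PAGE_SIZE;   /* which page            */
    uint32_t offset = vaddr % PAGE_SIZE;   /* which byte within it  */

    if (page >= NUM_PAGES || !table[page].present)
        return false;

    *paddr = table[page].frame * PAGE_SIZE + offset;
    return true;
}
```

If virtual page 2 is loaded in frame 5, virtual address 2 * 4096 + 100 translates to physical address 5 * 4096 + 100: the offset within the page is unchanged, and only the page number is replaced by the frame number.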

12.3 Replacement Algorithms

$S_t(m) = S_{t-1}(m) \cup X_t - Y_t$
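Assuming the usual reading of this notation (S is the set of pages loaded in m frames at time t, X is the page fetched at time t, and Y is the page replaced to make room for it), the update can be traced in code. The sketch below picks FIFO as the replacement policy purely for the sake of having one; FRAMES, struct resident_set and reference are hypothetical names.

```c
#define FRAMES 3   /* m: assumed number of page frames */

struct resident_set {
    int pages[FRAMES];   /* S: pages currently loaded              */
    int count;           /* how many frames are in use             */
    int next_victim;     /* FIFO pointer: index of the oldest page */
};

/*
 * Reference page p. Returns 1 on a page fault, 0 on a hit.
 * On a hit, X and Y are both empty and S is unchanged.
 * On a fault, X = {p}; Y is empty if a frame is free, otherwise
 * Y = {the oldest loaded page}.
 */
int reference(struct resident_set *s, int p)
{
    for (int i = 0; i < s->count; i++)
        if (s->pages[i] == p)
            return 0;                        /* hit */

    if (s->count < FRAMES) {
        s->pages[s->count++] = p;            /* free frame: add X only */
    } else {
        s->pages[s->next_victim] = p;        /* remove Y, add X */
        s->next_victim = (s->next_victim + 1) % FRAMES;
    }
    return 1;                                /* page fault */
}
```

With three frames and the reference string 1, 2, 3, 1, 4, 1, the first three references fault and fill the frames, the fourth is a hit, the reference to 4 replaces page 1 (the oldest), and the final reference to 1 faults again and replaces page 2.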

Definitions of a few key (or confusing) concepts

Descriptions of a few key engineering principles from this course