Last updated: Feb 15, 2008
Gary Nutt's "Operating Systems, A Modern Perspective" has a lot of good things going for it: the range of subjects, the depth of coverage, frequent examples from real systems, and a good blend of theory and code. Some of the discussions, however, include a more formal approach (e.g. expression of principles in set-theory notation) that may not be well targeted to a general undergraduate audience (who is more concerned with understanding what operating systems are and do than with being able to read theoretical papers in operating systems). Also, as with any presentation, there are inevitably some points that don't come across quite as well as the writer intended.
This "Greek to English Dictionary" attempts to explain, in plainer language, the points that Dr. Nutt is making in some of those sections. Also, for easy reference, it assembles definitions of a few key concepts and descriptions of a few key principles that are not well covered in the text.
Clarifications:
For numerous reasons (many of which should be clear after studying this example) fork and join are not generally used as synchronization operations. They are, however, fundamental operations in all discussions of parallel algorithms ... and the join operation does achieve synchronization (by forcing the first thread to join to wait for the second one to do so).
In this example (which continues from the problem posed in figure 2.5), the goal is to let the processes do their computations (A1, A2, B1 and B2) in parallel, but force the parallel computations back into a single thread for the update and retrieve operations. This "serialization" (the cancelling of parallelism) enables us to ensure that:
The flow of control in this example is illustrated by the following figure:
The way the solution works is:
Critical sections and synchronization are such important subjects that we dedicate three chapters and four lectures to them. You will learn much better ways to solve this kind of problem.
3.3.3 (Second Edition Only) Requesting Services from the OS
Near the end of this section, Nutt says:
Many students ask what the point of this sentence is. It is actually an introduction to some very interesting issues in OS design. This paragraph is related to, but does not follow directly from the preceding discussion of message-based and trap-based service requests. The discussion is still about service requests, but he has moved from the means of issuing service requests to the process context in which the requested operations will be carried out.
He suggests two ways of organizing an operating system:
If the request is one that might block (e.g. to allocate a resource that is not available or await the completion of an I/O operation) then the requesting process is put to sleep (in supervisor mode) and is re-dispatched when the awaited operation has completed. The service request will then finish, the operating system will return to user mode, and the user program will resume execution.
The key thing here is that the process that blocked and was later re-dispatched was the user process, temporarily executing in supervisor mode only for the duration of the request. Dr. Nutt asserts that the only processes executing in the operating system are of this sort.
If the service process needs to allocate resources or await I/O, it will do so, perhaps blocking itself. Later, when the requested operations are complete the service process will send a response message back to the user process.
The key here is that there are some processes that exist entirely within the operating system and solely to perform services for user processes.
In the last two sentences of section 3.3, Dr. Nutt alludes to the fact that it is occasionally awkward to implement a service that only runs in user processes. In the example of waiting for a user I/O operation to complete or waiting for a user requested resource to be freed, it makes perfect sense to block the user process until the awaited event happens.
There are, however, cases where a service may need to block (awaiting an external event) but where it doesn't make sense to do this in any particular user process:
Consider a network protocol that is not allowed to transmit new messages until it has received an acknowledgment of previously sent messages. An obvious way to implement this would be to have a process that sleeps waiting for the ACK and transmits queued messages whenever it is allowed to do so. Again, which user process should we block while waiting for this acknowledgment? There really is no appropriate user process to block, so creating a system process makes sense.
In fact, it is possible to solve problems like this without making recourse to system processes (e.g. interrupt driven chain scheduling). The next I/O requests are sitting in a queue in the driver, and when an I/O completion interrupt comes in, the I/O interrupt handler pulls the next request off of the queue and issues the appropriate commands to the device controller. The request waited for the device to become free, but the waiting did not require the blocking of a process.
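The chained scheduling described above can be sketched in a few lines of C. This is only a simulation under stated assumptions, not real driver code: the names io_request, driver_queue, submit_request, start_device and completion_interrupt are all hypothetical, and "issuing commands to the device controller" is reduced to remembering which request is in flight.

```c
#include <stddef.h>

/* Hypothetical request descriptor (a few parameters describe the request). */
struct io_request {
    int device;               /* device address */
    void *buffer;             /* memory address */
    size_t length;            /* length of transfer */
    int is_write;             /* type of transfer */
    struct io_request *next;  /* link to the next queued request */
};

/* Hypothetical per-device driver state. */
struct driver_queue {
    struct io_request *head;      /* oldest waiting request */
    struct io_request *tail;      /* newest waiting request */
    struct io_request *in_flight; /* request the device is working on, or NULL */
};

/* Stand-in for programming the device controller. */
static void start_device(struct driver_queue *q, struct io_request *r) {
    q->in_flight = r;
}

/* Start the request if the device is idle, otherwise append it to the queue. */
void submit_request(struct driver_queue *q, struct io_request *r) {
    r->next = NULL;
    if (q->in_flight == NULL) {
        start_device(q, r);
    } else if (q->tail == NULL) {
        q->head = q->tail = r;
    } else {
        q->tail->next = r;
        q->tail = r;
    }
}

/* Called from the completion interrupt: chain straight to the next request.
 * Returns the request that just finished, so its owner can be notified. */
struct io_request *completion_interrupt(struct driver_queue *q) {
    struct io_request *done = q->in_flight;
    struct io_request *next = q->head;
    q->in_flight = NULL;
    if (next != NULL) {
        q->head = next->next;
        if (q->head == NULL)
            q->tail = NULL;
        start_device(q, next);
    }
    return done;
}
```

Note that nothing in this sketch ever sleeps: a request simply sits in the queue until the interrupt handler gets around to it.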
When should we use queued requests and when should we use processes?
In part it is a question of encapsulation and appropriate abstraction. How much information does the agent who completes the request need to know? In the case of an I/O request, a half dozen parameters (device address, memory address, length of transfer, type of transfer, who to notify when done) may adequately describe the request ... and so putting these parameters in the queued request is totally reasonable. In the case of the memory scheduler or network protocol, there may be a very large amount of state and context associated with "doing the next thing" and so it is much easier to dispatch a process (that knows all of this information) than to try to encapsulate it all into a queued request.
Note, however, that this is only loosely coupled to the earlier discussion of messages vs traps. Many systems use traps into supervisor state, and most supervisor execution is on behalf of a specific user process ... but they also have a few system processes to handle particular situations.
6.3 (Second Edition Only) Process Address Spaces
The description of what an address space is may be a little confusing.
Most people define an address space as the set of memory (and memory-like)
locations that are addressable by a process. Nutt says (and illustrates
in figure 6-1) this includes files and resources.
There are a few situations where this is true, but in most cases system services cannot be found in a process' address space. Rather they exist in separate address spaces, and are accessed through call-gate system call trap instructions. Similarly, most of the non-memory resources allocated to a process cannot be found within the process' address space. They too exist in OS (or other server process) address spaces and are accessed through system calls.
Resources that might conform to Nutt's description are:
Later, when we talk about swapping and paging, we will see how the OS can associate a page of secondary storage with each page in a process' address space. The secondary storage pages don't have to be swapped out copies of the process' code or data. They could be pages of a file that the OS has been asked to map into the process' address space. For many applications this is a much more efficient means of reading and writing files than normal, buffered, read and write operations.
Sometimes, when two applications need to regularly exchange large amounts of data, and normal interprocess communication would be too slow, the processes ask the OS to map a shared read/write segment into their address spaces, and they communicate with one another through data structures in that shared segment. This segment refers to a piece of allocated memory in the computer ... just like the code, data, and stack segments. The main difference is that this segment is read/write and shared by multiple processes.
For performance or fire-walling reasons, some device drivers (for devices that don't need interrupts) are implemented in user mode programs. For example, a display server might be implemented in a separate process that asked the operating system to map the graphics frame buffer into the process' address space. This would create a new segment in the process' virtual address space, and by accessing those locations, the process could (very efficiently and directly) update the display.
In all of these situations, however, the resources in the process' address space are still some sort of memory. Thus, for all intents and purposes, a process' address space is made up of the memory (and memory-like) locations that the process can directly address. Any "resources" that may be mapped into the address space are located in one of those memory segments.
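As an illustration of the first case above (a file mapped into the address space), here is a minimal sketch that assumes a POSIX system with mmap. The helper name count_byte_mapped is hypothetical. The point is that, once the mapping exists, the file contents are read with ordinary memory references rather than read calls.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Count occurrences of byte c in a file by mapping the file into our
 * address space. Returns -1 on error. (Hypothetical helper.) */
long count_byte_mapped(const char *path, char c) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {          /* a zero-length mapping is not allowed */
        close(fd);
        return 0;
    }

    char *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                      /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long n = 0;
    for (off_t i = 0; i < st.st_size; i++)  /* plain memory references */
        if (p[i] == c)
            n++;

    munmap(p, (size_t)st.st_size);
    return n;
}
```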
6.3.3 (Second Edition Only) Maintaining Consistency in Address Space
This is actually a very simple concept, but there are two problems. First, the concept is being discussed in the abstract ... when we will have very concrete opportunities to discuss it when we get to memory scheduling, and page replacement algorithms. The second problem is that the formalisms are much more complex than the statement they are trying to formalize.
In modern computers and operating systems, there is a hierarchy of storage media. Modern CPUs are much faster than memory. Rather than slow the machine down to memory's speed, modern CPUs include one or more levels of "cache memory". This cache memory is many times faster than main memory.
The basic idea of caching is that each time you have to fetch something from memory, you save a copy of it in the fast cache. Each time you need something, you first look in the fast cache, and only go to memory if the item is not already in cache. When we get to swapping and demand-paging (in the memory management and virtual memory lectures) we will develop a very similar relationship between memory and disk.
In one sentence: "If you have multiple copies of something, and you change one of them (but not the others), they cease to be copies of the same thing (until you update the other copies too)."
We normally think of caching read operations. The first time we have to read something from memory, we copy it into the cache ... and after that we don't have to go to memory anymore. We can still think of the copy in memory as the definitive one ... but the cache is a very convenient local copy. The situation becomes a little more complex when we consider write operations. When we update something, we would much rather update the cache than memory (because the cache is so much faster). However, if we update something in cache ... the cache copy becomes the definitive copy, and the copy in memory is out-of-date.
Having the in-memory copy be out-of-date (the term "stale" is often used) is a bad thing. After the CPU updates some location in the cache, the cache must (in the background) propagate this update back to main memory ... bringing it back up-to-date. When we get to demand paging, we will see the exact same relationship emerge between modified pages in memory and their saved copies on disk.
So if we agree that the other copies have to be updated, the only remaining question becomes when?
This is a completely reasonable section, but there might be some value in explaining the set-theory notation.
There are "m" different classes of resources. We can call them R1, R2 ... Rm. R1 might be memory, and R2 might be printers, etc. Realize, however that these resources could be physical or abstract.
For each resource, there are some number of them available. We can call the number of resources of type j C_j.
Each of these different classes of resource has a resource manager. The resource manager is responsible for keeping track of all of the resources of that type.
A process can request some amount of any given resource. A process cannot request more of any resource than exists. (actually it can ask, but it won't get them :-)
If a process requests resources that are not currently available, the resource manager for that resource will cause the process to block until the requested resources become available.
This is another completely reasonable section, but there might be some value in cutting through the Greek letters.
The basic concept here is that (most) processes do not get the CPU and run continuously until they complete. They regularly request resources, services and I/O that may not be immediately available. As a result, a process typically runs for a while, and then waits for something, and then runs some more, and waits some more.
A scheduler is only interested in processes that are ready to run. Processes that are blocked for something are not eligible to run. Thus it is useful to look at the execution of a process, not as a single execution interval, but as an alternating series of computations and waits.
This section introduces some basic concepts and notations from queueing theory.
lambda always represents the arrival rate of new requests. Consider a web-server that was getting 72,000 hits per hour. We would call the request arrival rate 72,000/hour (or 20/second).
µ always represents the service rate. Suppose that the web server could (on average) handle 40 requests per second. We would call the service rate 40/second.
Sometimes people prefer to talk about times rather than rates. If the average service rate (µ) is 40 requests per second, the average service time (1/µ) is 25ms. If the average request arrival rate (lambda) is 20/second, then the average inter-request-arrival time (1/lambda) is 50ms.
The load on a system is often expressed as a fraction of its capacity. If requests are arriving at rate (lambda) 20/second, and the system can process them at rate (µ) 40/second, the load (rho = lambda/µ) = 1/2. The system is currently operating at 50% capacity.
If the request arrival rate exceeds the rate at which the system can service requests, the system will get farther and farther behind, and the response time will get longer and longer.
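The arithmetic in the web-server example can be checked with a few one-line helpers. The function names here (per_second, mean_time_ms, utilization, falls_behind) are hypothetical and exist only for this illustration.

```c
/* Convert an hourly rate to a per-second rate: 72,000/hour is 20/second. */
double per_second(double per_hour) {
    return per_hour / 3600.0;
}

/* Average time between events (in ms) for a rate given in events/second:
 * this is 1/lambda or 1/mu, expressed in milliseconds. */
double mean_time_ms(double rate_per_second) {
    return 1000.0 / rate_per_second;
}

/* Load as a fraction of capacity: rho = lambda / mu. */
double utilization(double lambda, double mu) {
    return lambda / mu;
}

/* Nonzero if requests arrive faster than they can be served,
 * in which case the backlog grows without bound. */
int falls_behind(double lambda, double mu) {
    return lambda > mu;
}
```

With lambda = 20/second and µ = 40/second, these give an inter-arrival time of 50ms, a service time of 25ms, and a load of 1/2, matching the numbers in the text.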
8.3 Semaphore counts
"s" refers to the value of the semaphore's counter.
If s is positive, a P will succeed. If s is zero, a P will
block.
This counter makes semaphores more general than a simple lock. If s is initialized to one, the semaphore will act as a mutual exclusion lock.
If it is initialized to a higher number (e.g. 6) it will allow up to six people to enter the critical section before starting to block them. This makes little sense if we are thinking about mutual exclusion ... but suppose the semaphore represented the number of messages that were available to be read. Each time a new message was received, the count would be incremented (V the semaphore). When someone wanted to get a message, they would P the semaphore. If the count was positive (messages already available) they would continue immediately. If the count was zero (no messages available) they would wait until a new message arrived. Nutt discusses this use in section under "The producer-consumer problem".
8.3 Figure 8.xx Implementing semaphores w/TS
The semaphore data structure in this example contains three interesting fields:
The key thing to understand about this example is that there is a critical section surrounding the testing/incrementing/decrementing of the semaphore counter. The mutex is used to protect that critical section. Getting the mutex does not mean that your P succeeded ... it merely means that you can now safely check/modify the semaphore counter.
With this in mind, what the code does is:
P:
    get the mutex so we can manipulate the semaphore
    decrement the semaphore counter
    if it goes negative (we have to wait)
        release the mutex
        wait for the count to go back up to zero (s.hold)
    else (we got it)
        release the mutex
    return success
V:
    get the mutex so we can manipulate the semaphore
    increment the semaphore counter
    if someone is waiting (value <= 0)
        wait for the waiter to set hold (meaning he is watching)
        clear hold (which will wake him up)
    release the semaphore mutex
The only really tricky thing about the code is the use of s.hold:
The "while(TS(s.hold))" loop in P is waiting for s.hold to be set to false. Note that since the TS atomically checks s.hold and sets it to be TRUE, only one P'er will get false returned ... after which s.hold will be true again.
In V, you might think that all you have to do is set s.hold to FALSE if someone was waiting. The problem is what happens if multiple people were waiting, and there were two V's in a row. The first V would set s.hold to FALSE ... but if a second V happened before the first P'er could run (and have his TS succeed), we would have no means of waking up the second process. By waiting to ensure that s.hold is TRUE before setting it to false, V deals with this possibility.
This is not a great implementation of P and V (mostly because of the wasteful busy-wait), but it is a correct one.
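For readers who would like to experiment, here is one way the P and V logic described above might be written with C11 atomics. This is my own sketch and not a copy of Nutt's figure: the struct layout, the sem_setup helper, and the use of atomic_exchange to play the role of TS are all assumptions. It also assumes that hold starts out TRUE, so that a waiter spins until some V clears it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Test-and-set: atomically set *flag to true, return its previous value. */
static bool TS(atomic_bool *flag) {
    return atomic_exchange(flag, true);
}

struct semaphore {
    int         value;  /* the counter; only touched while holding mutex */
    atomic_bool mutex;  /* spin lock protecting value */
    atomic_bool hold;   /* true until a V clears it to release one waiter */
};

/* Hypothetical initializer: count = initial, mutex free, hold set. */
void sem_setup(struct semaphore *s, int initial) {
    s->value = initial;
    atomic_init(&s->mutex, false);
    atomic_init(&s->hold, true);
}

void P(struct semaphore *s) {
    while (TS(&s->mutex))
        ;                               /* get the mutex */
    s->value--;
    if (s->value < 0) {                 /* we have to wait */
        atomic_store(&s->mutex, false);
        while (TS(&s->hold))
            ;                           /* spin until a V clears hold */
    } else {                            /* we got it */
        atomic_store(&s->mutex, false);
    }
}

void V(struct semaphore *s) {
    while (TS(&s->mutex))
        ;                               /* get the mutex */
    s->value++;
    if (s->value <= 0) {                /* someone is waiting */
        while (!atomic_load(&s->hold))
            ;                           /* previous wakeup not yet consumed */
        atomic_store(&s->hold, false);  /* release exactly one waiter */
    }
    atomic_store(&s->mutex, false);
}
```

Like the original, this busy-waits in both P and V, so it is useful for understanding the logic rather than for production use.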
Nutt mentions the atomic instruction Test-and-Set (TS), but does not describe it in detail. In the lecture, I make use of a more powerful atomic instruction, Compare-and-Swap (CS). This note describes what they actually do.
Like all atomic instructions, they implement a read/modify/write operation with no possibility of interruption or interference from any other device or processor on the bus. The TS instruction has been around for a very long time. Its functionality is illustrated by the following code (although it is actually implemented in hardware):
char TS( char *p ) {    /* address of byte to test/set */
    char rc;
    rc = *p;            /* note current value of byte */
    *p = 0xff;          /* set byte to 0xff */
    return rc;          /* return previous value */
}
The more interesting CS instruction came about in the '70s and is much more powerful, because it can implement "change the value if no one else has changed it". Its functionality is illustrated as:
int CS( int *p, int old, int new ) {
    if (*p == old) {    /* if value hasn't changed */
        *p = new;       /* replace it with new value */
        return 1;       /* return success */
    } else
        return 0;       /* value has changed, failure */
}
In the x86 architecture, this is done with the CMPXCHG instruction.
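To see why "change the value if no one else has changed it" is so useful, consider adding to a shared counter without taking a lock. The sketch below assumes a C11 compiler and builds CS on the standard atomic_compare_exchange_strong call. The name add_with_cs is hypothetical, and the parameter is called new_value rather than new.

```c
#include <stdatomic.h>

/* CS in the style illustrated above, built on the C11 primitive.
 * Returns 1 if *p still held old and was replaced, 0 otherwise. */
int CS(atomic_int *p, int old, int new_value) {
    return atomic_compare_exchange_strong(p, &old, new_value);
}

/* Lock-free add: read the value, compute the result, and install it only
 * if no one changed the value in the meantime. Otherwise start over. */
int add_with_cs(atomic_int *p, int delta) {
    int seen;
    do {
        seen = atomic_load(p);
    } while (!CS(p, seen, seen + delta));
    return seen + delta;
}
```

If another processor updates the counter between the load and the CS, the CS fails and the loop simply retries with the fresh value, so no update is ever lost.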
I am not aware of any systems that actually use the Banker's algorithm, because it makes some fairly major assumptions. It is, however, a good exploration of safe states, which are a very useful concept. The Banker's algorithm is only applicable to commodity resources (resources like money, memory, or CPU time, where you need a certain amount, but don't particularly care exactly which instances of those resources are allocated to you). It turns out, however, that commodity resources are a place where resource ordering is often not an option, and so it is nice to have another way of dealing with deadlocks involving these resources.
The banker's algorithm takes a recursive approach to the resource allocation question by starting with known safe states, and then saying that any state, from which you can get to a safe state, is also a safe state:
if some people need more resources to complete, and we have enough resources to satisfy their needs, we are obviously in a "safe" state.
if we have enough resources to enable some people to complete, and when they complete they will free up enough resources to enable other people to complete, and eventually everyone can complete ... we are also in a safe state.
The banker's algorithm is testing for this last condition. A high level but informal summary of the banker's algorithm is:
figure out how many resources of each type are currently available
see if that is enough to allow ANY (one will do) process to complete. If not, we are in an UNSAFE state.
if a process can complete, consider all of the resources it currently holds to have been freed (add them to the available vector). This is because this process is guaranteed to be able to complete, and so will (eventually) free all of the resources that have been reserved for it.
repeat this procedure until all processes have been found to be able to complete. If all processes can complete, the state is safe. If any processes cannot complete, the state is unsafe.
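The informal summary above translates fairly directly into code. The following is a sketch of the safety check only, under stated assumptions: the fixed sizes NPROC and NRES, the function name is_safe, and the array names available, held and need are all choices made for this illustration, and need[i][j] is taken to mean the additional units of resource j that process i may still ask for.

```c
#include <stdbool.h>

#define NPROC 3   /* number of processes (assumed, for illustration) */
#define NRES  2   /* number of resource types (assumed, for illustration) */

/* Returns true if every process can eventually complete.
 *   available[j]  free units of resource j
 *   held[i][j]    units of resource j currently allocated to process i
 *   need[i][j]    additional units of j that process i may still request */
bool is_safe(int available[NRES], int held[NPROC][NRES], int need[NPROC][NRES]) {
    int work[NRES];
    bool finished[NPROC] = { false };

    for (int j = 0; j < NRES; j++)
        work[j] = available[j];

    for (int done = 0; done < NPROC; ) {
        bool progressed = false;
        for (int i = 0; i < NPROC; i++) {
            if (finished[i])
                continue;
            bool can_finish = true;
            for (int j = 0; j < NRES; j++)
                if (need[i][j] > work[j])
                    can_finish = false;
            if (!can_finish)
                continue;
            /* Process i can complete, so treat its resources as freed. */
            for (int j = 0; j < NRES; j++)
                work[j] += held[i][j];
            finished[i] = true;
            progressed = true;
            done++;
        }
        if (!progressed)
            return false;   /* nobody left can complete: unsafe */
    }
    return true;            /* everyone can complete: safe */
}
```

A resource manager using this approach would run the check on the state that would result from granting a request, and make the requester wait if that state turned out to be unsafe.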
These two code examples are fairly complex and many people have trouble understanding them. The more interesting example is 8.16, but because 8.16 is an attempt to correct problems with 8.15, it is important to understand both samples.
when the first reader starts, he P's writeblock
    if a writer is currently busy, this blocks until the writer finishes
    if not, the P succeeds, and future writers are blocked
when the last reader finishes, he V's writeblock
    this unblocks a waiting writer (if any)
the writer just P's writeblock to start, and V's it when done.
The problem with this code is that a writer could be starved by a long line of readers. It might be more fair to say that once a writer gets in line, new readers have to wait until the writer finishes (a more FCFS approach). Nutt also suggests that more than fairness is at stake, because readers might be eager to see the latest/greatest value.
writeCount is a symmetric count of the number of waiting writers, and is protected by mutex2. Similar to the operation of readCount in 8.15, readers will not be allowed in to the resource until writeCount goes to zero.
writeBlock is used (as in 8.15) to manage the trade-off between readers and writers in the critical section. Either the readers have it or the writers have it.
readBlock is used by writers to stop new readers from getting into line for the critical section.
writePending is more subtle, and is used to give writers priority over new readers. Suppose that multiple readers and writers all want to go at about the same time. The first reader gets writePending, and then he and the first writer have a race to see who gets to readBlock first. This is OK, but if a second reader shows up, the first writer should not have to race against him too. The writePending semaphore ensures that only one reader at a time can race against a writer.
Note, however, that the reader serialization created by writePending only applies to the critical section that raises readCount and (if necessary) gets writeBlock. Once the readers get past this point, they can all run in parallel (which is, after all, the point of read/write locks).
12.1.1, 12.2.1 Virtual Address Translation
In chapter 12, Nutt uses set-theory notation to describe the mapping of virtual addresses into physical addresses. These formalisms are not used to state any assertions or theorems, they are just the way he chooses to talk about the mapping process.
Since the base registers or page table can be reloaded (to change the virtual-to-physical mapping), the mapping function changes with time. Let us designate the mapping function in effect at time t as Psi_t.
Not all virtual addresses are valid (i.e. not all virtual addresses actually map to some physical address). We will define Omega to be the invalid physical address.
We can now talk about this time varying function (Psi_t) that maps virtual addresses into physical addresses (or into the invalid address, Omega).
When we say Psi_t(i) = k, we are saying that k is the physical address that virtual address i maps to at time t.
This section too includes much more Greek than is required to describe the (relatively simple) mapping from a virtual address to a physical address in a paged system.
Nutt talks about space in terms of words, but this is an imprecise term. The description below is in terms of bytes, and assumes that all addresses are byte addresses (i.e. that they are capable of addressing individual bytes).
There is also a physical address space M that consists of m pages ... which we might as well say can be numbered 0 through m-1.
As we will soon see, it makes a great deal of sense to have the number of bytes in a page, the number of pages in a virtual address space, and the number of pages in a physical address space all be powers of 2. If we make this assumption:
the number of bytes in a page can be assumed to be 2^h.
the number of pages in the virtual address space (n) can be assumed to be 2^g.
the number of pages in the physical address space (m) can be assumed to be 2^j.
This having been said, we can now represent a virtual address as a g+h bit number, where the first g bits are a page number, and the last h bits are a byte offset within that page.
We can similarly represent a physical address as a j+h bit number, where the first j bits are a page number, and the last h bits are a byte offset within that page.
The mapping function Psi_t can now be viewed as one that maps a virtual address (a number between 0 and 2^(g+h) - 1) into a physical page number (between 0 and 2^j - 1) and an offset within that page (between 0 and 2^h - 1). (Nutt calls the offset within the page a "line number").
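The power-of-2 assumption is what makes the mapping cheap: splitting an address into page number and offset is just a shift and a mask. The sketch below assumes 32-bit addresses, h = 12 (4096-byte pages), and a simple one-level table. PAGE_BITS, INVALID, translate and page_table are hypothetical names, with INVALID standing in for Omega.

```c
#include <stdint.h>

#define PAGE_BITS 12u          /* h: assumed 2^12 = 4096 bytes per page */
#define INVALID   UINT32_MAX   /* plays the role of Omega */

/* page_table[v] holds the physical page number for virtual page v,
 * or INVALID if virtual page v is not currently mapped. */
uint32_t translate(const uint32_t *page_table, uint32_t npages, uint32_t vaddr) {
    uint32_t vpage  = vaddr >> PAGE_BITS;               /* the first g bits */
    uint32_t offset = vaddr & ((1u << PAGE_BITS) - 1);  /* the last h bits */

    if (vpage >= npages || page_table[vpage] == INVALID)
        return INVALID;

    /* Physical address = physical page number followed by the same offset. */
    return (page_table[vpage] << PAGE_BITS) | offset;
}
```

Reloading the table changes the mapping, which is exactly the sense in which the mapping function varies with time.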
S_t(m) = S_{t-1}(m) U X_t - Y_t
The virtual pages in those frames at time t are going to be the same pages that were in those frames earlier (say at time t-1), less those pages that were replaced in the interim, plus those pages that were fetched in the interim. In other words, the only things that change which virtual pages are in memory are page fetching and replacement.
The fetch policy (X) is pretty boring ... we fetch those pages that the process needs that are not already in memory. Thus, the entire remainder of this section will focus on the replacement policy (Y).
The set of locations used by the process to address primary memory locations, and any other resources or services that can be accessed with ordinary instructions. This is often referred to as a "virtual address space" to distinguish it from physical memory, which most processes cannot directly address.
An atomic instruction is one that performs its function (typically a read/modify/write of a small number of contiguous memory locations) with no possibility of interference from interrupts, other processors, or devices.
An atomic operation is one that must be completed without interference from other threads, processes, interrupts or processors. An atomic operation need not be implemented with atomic instructions. It can be protected by a semaphore or mutex. The semaphore or mutex may be implemented with atomic instructions, but given such protection atomic operations can then be implemented with ordinary instructions.
A running process that is willing to give up the CPU for a while so that other processes can run can yield the CPU. In a system with preemptive scheduling, a process that has run too long can be forced to yield the CPU. Either way, the process is still ready and eligible to run the next time its turn comes around.
A process that has requested some resource that is not currently available (e.g. memory allocation or the completion of some disk I/O) cannot continue executing until the requested resource becomes available. In this case, the process is blocked. When a process is blocked its process descriptor includes a note to the scheduler indicating that this process is no longer eligible to be run. Once a process becomes blocked it will yield the CPU, and will not be eligible to be executed again until the blocked notation has been turned off.
If a desired resource is not currently available, the requestor will often choose to wait until it becomes available. In real systems, there is usually a way to say "put me to sleep and wake me up when the resource becomes available".
Nutt's text contains numerous examples where the process sits in a tight loop until the resource becomes available:
Such loops are called "busy waits" because they keep the computer busy, even though all it is doing is waiting. As we discussed in class, these are very wasteful, and seldom done in actual practice.
The term "spin lock" refers to a lock where the locker awaits the availability of the lock with a busy wait.
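A minimal spin lock can be sketched with the C11 atomic_flag type, whose atomic_flag_test_and_set operation behaves like the TS instruction. The type and function names here (spinlock, spin_init, spin_trylock, spin_lock, spin_unlock) are hypothetical and are not taken from Nutt's text.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag busy;   /* set while someone holds the lock */
} spinlock;

void spin_init(spinlock *l) {
    atomic_flag_clear(&l->busy);
}

/* One test-and-set attempt: true if we obtained the lock. */
bool spin_trylock(spinlock *l) {
    return !atomic_flag_test_and_set(&l->busy);
}

/* The busy wait: keep retrying until test-and-set finds the lock free. */
void spin_lock(spinlock *l) {
    while (!spin_trylock(l))
        ;
}

void spin_unlock(spinlock *l) {
    atomic_flag_clear(&l->busy);
}
```

The empty loop body in spin_lock is the "busy" part: the CPU does nothing useful there except repeatedly ask whether the lock is free yet.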
An architecture in which it is possible to divide systems into servers (systems with unique resources like fast CPUs, large disks, high quality printers, a direct internet connection, etc) and clients (that use the resources on the servers). Such an architecture simplifies and lowers the cost of the client systems, while permitting the expensive server resources to be amortized over a large number of clients.
Client/server architectures generally include discovery protocols (that enable clients to find servers), and remote access protocols (that enable clients to use the servers' resources).
A deterministic process is one that yields consistent or predictable results, because there are no random or statistical activities in the process. Example: the results of a long division operation are deterministic, but the results of trying to teach a child how to perform long division are not :-)
An indeterminate result is one that is undefined or cannot be stated with any confidence. The results of dividing a number by zero are indeterminate.
The order of execution of instructions in unsynchronized parallel processes may be non-deterministic, perhaps making the results of those computations indeterminate.
We say that an activity is "distributed" when it involves multiple computers that are interconnected by a network. The opposite of "distributed" is "local". An activity involving multiple processors on the same bus is considered local rather than distributed.
Because the nodes in a distributed service do not share memory, they cannot all see the same data at the same time. Distributed services are more complex than local services because communication between the nodes is relatively slow, and because there are many more possible modes of failure.
Originally these terms referred to the frequencies that were used to transmit radio signals. In digital communication they now refer to distinct data paths.
In-band communication is sequenced (in a first-come, first-served order) into an ongoing communications byte-stream. Out-of-band communication travels over a separate channel so that it can be received promptly (ahead of normal traffic that might have been sent before it).
Out-of-band communication is often used to provide prompt notification of failures.
A family of computers are all members of the same Instruction Set Architecture if they all support the same instruction sets (i.e. if they interpret the same bit-patterns in a program to mean the same operations). We generally talk about ISAs rather than manufacturers because multiple manufacturers may make computers that support the same ISA (e.g. Intel and AMD, or SUN and Fujitsu) and a single manufacturer may make computers that support different ISAs (e.g. Motorola makes processors in both the 68000 and PowerPC families).
Within an ISA there may be many subsets, typically with upwards compatibility. For example, the 486 architecture is a superset of the 386 architecture, and the Pentium is a superset of the 486.
A library is a collection of (usually object) modules that is likely to be useful to applications software. They might be general (e.g. string functions), system service related (e.g. threads packages), or application specific (e.g. MPEG decoding).
With static libraries, the linkage editor searches the library to find needed modules, and then copies those modules into the load module for the new program.
With traditional shared libraries, the linkage editor finds an address with which to resolve the references, but does not actually copy the needed modules into the new program. Rather, when the program is loaded into memory, the loader notices that the program needs shared libraries and maps them into the new process' address space at the expected addresses.
Dynamically Loadable Libraries are similar to shared libraries, but employ a run-time loader that does not load a module until it is referenced, and is capable of performing much more sophisticated initialization for each newly loaded module (as described in lecture). What the initialization includes is, in principle, open-ended - but the most common activity is the allocation of space for and initialization of static data - which is not possible with a simple mapped shared library.
Both traditional shared libraries and DLLs have two advantages over static libraries: reduced memory requirements (because many processes can share a single in-memory copy of the library) and deferred loading (so that programs automatically get the latest local version of the desired library each time they are loaded).
In situations where deadlocks are prevented by resource ordering (e.g. always lock type A resources before trying to lock a corresponding type B resource) it may occasionally be necessary for a process to release a type B resource, lock a type A resource, and then re-obtain a lock on the original type B resource.
This obligatory releasing of held resources so they can be reobtained in the correct order is sometimes referred to as a "lock dance".
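The following is a minimal sketch of a lock dance, not code from the text. It assumes two hypothetical locks, `lock_a` and `lock_b`, with the rule that A must always be taken before B. The function starts out holding only B, discovers it also needs A, and performs the dance.

```python
import threading

lock_a = threading.Lock()   # hypothetical type A resource: always locked first
lock_b = threading.Lock()   # hypothetical type B resource: locked only after A

def lock_dance():
    """Called while holding lock_b; returns holding both lock_a and lock_b."""
    steps = []
    lock_b.release()        # give up B rather than violate the A-before-B order
    steps.append("released B")
    lock_a.acquire()
    steps.append("locked A")
    lock_b.acquire()        # now B can be re-obtained in the correct order
    steps.append("relocked B")
    return steps

lock_b.acquire()            # we start out holding only B
steps = lock_dance()
# B was free for a moment, so anything it protects may have changed.
# A real caller would have to re-check its assumptions at this point.
holding_both = lock_a.locked() and lock_b.locked()
lock_b.release()
lock_a.release()
print(steps, holding_both)
```

The cost of the dance is the window in which B is not held: any state that B protects has to be treated as stale once B is re-obtained.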
In advisory locking, a lock is (by convention) associated with a shared object, and users of the object are expected to obtain the lock before using the object, and release the lock when they are done. If a user fails to follow these rules correctly, the resource may be compromised or a deadlock may result.
The disadvantage of advisory locking is (obviously) its dependence on correct application behavior. The benefit, however, is that applications can use a locking protocol that minimizes overhead and bottleneck potential.
With enforced locking the only way to access the object is through official methods that guarantee to perform correct serialization. Usually this means that the applications are in user mode, and the resource manipulation is performed through system calls.
Enforced locking should be quite safe (assuming the OS code is well written), but a one-size-fits-all locking discipline may prove to be much more conservative than a particular application requires. As a result, enforced locking usually involves higher overhead (going through the OS to access objects) and longer lines (from unnecessarily coarse lock granularity).
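The contrast between the two disciplines can be sketched in a few lines. This is only an illustration under stated assumptions: the class names are hypothetical, and Python cannot truly hide `_value` the way a user-mode/kernel-mode boundary can, so the "enforced" class merely imitates access through official methods.

```python
import threading

class AdvisoryCounter:
    """Advisory: the lock sits beside the data; callers must remember to use it."""
    def __init__(self):
        self.lock = threading.Lock()
        self.value = 0          # nothing stops a caller from touching this directly

class EnforcedCounter:
    """Enforced (imitated): the data is reached only through locking methods."""
    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def increment(self):
        with self._lock:
            self._value += 1

    def read(self):
        with self._lock:
            return self._value

def run_threads(worker, count=4):
    threads = [threading.Thread(target=worker) for _ in range(count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

advisory = AdvisoryCounter()
enforced = EnforcedCounter()

def advisory_worker():
    # The application picks its own protocol: one lock acquisition per batch.
    with advisory.lock:
        for _ in range(1000):
            advisory.value += 1

def enforced_worker():
    # The official method takes and releases the lock on every single update.
    for _ in range(1000):
        enforced.increment()

run_threads(advisory_worker)
run_threads(enforced_worker)
print(advisory.value, enforced.read())
```

Both counters end up correct, but for different reasons: the advisory one because every worker followed the convention, the enforced one because there was no other way in.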
In file systems and databases, the term "data" generally refers to user supplied information that is to be stored for later retrieval. Meta-data is data that describes user data. Examples of meta-data might be:
A multi-processor (MP) system is a single computer system that contains multiple CPUs, each capable of executing its own program. Multi-processor systems are distinguished from "distributed systems" by the fact that they can share hardware resources (like memory) and can communicate at very high rates with very low latency.
The most common MP architecture in use today is the "symmetric multi-processor", where all of the processors have access to the same memory and devices. There are others (like Non-Uniform Memory Architectures, NUMA) where each processor has its own memory, but accessing another processor's memory comes at a cost in time and/or complexity.
A multi-tasking system is one that can run multiple independent tasks (processes) in parallel. The same term can be used whether application programs can directly exploit it (e.g. by creating multiple threads or processes) or not (e.g. the system creates one process for each logged-in user).
The parallelism may be real (if it is running on a multi-processor machine) or virtual (accomplished by time-sharing a single CPU).
A policy is a rule that tells a system how to behave.
A serial program in execution, or an executing instance of a sequential program. Less formally, it is an abstracted private virtual computer that preserves the illusion of continuous execution even though the CPU may be running many other processes as well.
A process differs from a program in that a program contains instructions to execute (and perhaps initial data values) but a program alone is not actually executing. Two twins might be analogous to processes that share the same program (DNA) but they are distinct instantiations that have accumulated different life experiences, think different thoughts and engage in different actions.
A process differs from a thread in that the process can have resources (e.g. memory and files) allocated to it, and a process has its own address space.
The "context" of a process is the saved state of its execution (data, stack, registers, PC, PS) plus all of the state of all resources currently being used by the process (e.g. files, locks, network connections).
When a process yields the CPU its context must be saved, and the context must be completely restored before the process can resume execution again.
A reentrant piece of code is one that will work correctly, even if invoked recursively, simultaneously in multiple concurrent threads, or from nested interrupts. Reentrancy is usually achieved by ensuring that all resources are either private (per instance, and therefore dynamic), or properly synchronized.
A slightly weaker notion is "Multi-Thread Safe" (or more simply MT-safe). An MT-safe routine can safely be invoked simultaneously in multiple concurrent threads. Critical updates to global variables can be performed under the control of mutual exclusion locks in MT-safe routines, whereas such exclusion could result in a deadlock if used in a routine that might be subject to recursive or nested invocations.
A much weaker related concept is "serial reusability". Some code is so poorly written that after a routine has been called the first time, it cannot be invoked again (probably because a crucial initialized variable was changed and no longer has the required initial value). Such routines are not even "serially reusable".
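The three notions can be told apart with a small sketch. All names here are hypothetical, and a lock timeout stands in for the deadlock so that the demonstration terminates.

```python
import threading

_total = 0
_total_lock = threading.Lock()      # a plain (non-recursive) mutex

def add_mt_safe(n, nested=False):
    """MT-safe: the global update is made under a mutex.
    Not reentrant: a nested invocation finds the lock already held."""
    global _total
    # The timeout stands in for what would otherwise be a deadlock.
    if not _total_lock.acquire(timeout=0.1):
        return "would deadlock"
    try:
        _total += n
        if nested:
            # Nested invocation, as from recursion or an interrupt handler.
            return add_mt_safe(n)
        return "ok"
    finally:
        _total_lock.release()

def add_reentrant(total, n):
    """Reentrant: all state is private to the invocation (arguments and locals)."""
    return total + n

_remaining = [3, 2, 1]              # initialized once, when the module is loaded

def countdown_once():
    """Not serially reusable: it consumes its own initial data."""
    out = []
    while _remaining:
        out.append(_remaining.pop(0))
    return out

if __name__ == "__main__":
    print(add_mt_safe(1))               # ok
    print(add_mt_safe(1, nested=True))  # would deadlock
    print(add_reentrant(2, 3))          # 5
    print(countdown_once())             # [3, 2, 1]
    print(countdown_once())             # []
```

The second call to `countdown_once` returns nothing useful because the first call destroyed the initial value it depends on.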
An abstract entity, needed by the executing program and allocated to the process to permit the program to run. Something for which a process/thread might have to block.
We use this word in many different contexts, and with different context-sensitive meanings. Fortunately, all of those meanings derive from one of the most common English language definitions:
When we talk about the state of a computation, we are referring to what instruction it is executing, and the values of all relevant variables. When we talk about the state of a process, we are talking about the entirety of its state (address space, register contents, and all system resources). When we talk about a process's "scheduling state" we are talking about only the subset of the process's attributes that affect its scheduling (e.g. blocked/runnable, priority, time in queue, ...). When we talk about the state of a system, we are talking about the states of all of the processes and resources in that system.
Because the meaning of the word is context sensitive, it is important that we qualify it with other words (e.g. "the scheduling state of process 4"). When you see the word "state" on a quiz, homework, or exam problem, make certain that you understand the context in which the word is being used.
From the Greek "syn" (together) and "chronos" (time), it refers to things that agree on, and happen in, a common time-line.
In synchronous operations, things happen in a distinct and specific order. In asynchronous operations the relative order of events may be indeterminate or variable. Asynchronous events can be "synchronized" (usually by forcing one to conform to the other's time line), after which those events become synchronous.
When we talk about a synchronous completion, we mean that the operation does not return until it has completed (e.g. a system call to return the time of day). An asynchronous completion refers to an operation that returns fairly promptly, but will actually complete at a later time (e.g. sending a network message).
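A rough illustration of the two kinds of completion, using a thread pool as a stand-in for an operation that finishes later. The function `slow_operation` and its 0.2 second delay are assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slow_operation(x):
    time.sleep(0.2)                 # stands in for I/O or a network round trip
    return x * 2

# Synchronous completion: the call does not return until the work is done.
start = time.monotonic()
sync_result = slow_operation(21)
sync_elapsed = time.monotonic() - start

# Asynchronous completion: submit() returns promptly with a Future,
# and the operation actually completes at a later time.
with ThreadPoolExecutor(max_workers=1) as pool:
    start = time.monotonic()
    future = pool.submit(slow_operation, 21)
    submit_elapsed = time.monotonic() - start
    async_result = future.result()  # wait here for the deferred completion

print(sync_result, async_result, submit_elapsed < sync_elapsed)
```

Calling `future.result()` turns the asynchronous completion back into a synchronous one: the caller blocks until the result is available.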
We can also talk about asynchronous notifications. If something happens, and the process should be told about it, but the process is not currently blocked awaiting such a notification, we will have to interrupt the process in order to draw its attention to this event. Interrupting a process to inform it of some event that happened while the process was doing other things is called an asynchronous notification. The notification may describe an asynchronous completion (e.g. that a network connection has been established) or it might describe some exception (e.g. an addressing error, the death of a process, the loss of a network connection, etc.).
There are a variety of mechanisms for communicating asynchronous notifications (signals, messages, the V-ing of a semaphore, the posting of an event, etc.). Some of these mechanisms (e.g. semaphores and events) are more appropriate for asynchronous completions. Others (e.g. signals) are more appropriate for exception notifications.
As an example of how these things might combine, consider the collection of dead child processes in the first lab project. The termination of the child process is an asynchronous event (with respect to the parent process). The parent process would like to await the completion of this asynchronous event. It can receive this notification as a synchronous completion (the wait system call, which will wait until the child process terminates) or via an asynchronous notification mechanism (the delivery of a SIGCLD signal, which will interrupt the parent when the child process terminates).
To further complicate things, the words have different meanings in the context of hardware communications protocols. The RS232 (standard terminal/modem communications) protocol is asynchronous because a sender can start transmitting a character at any time. Other communications protocols are synchronous (e.g. the receiver sends a clock signal to the sender, who must transmit his data synchronously with, i.e. in time with, the rising and falling of the provided clock).
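The two ways of collecting a dead child might look roughly like this. It is a sketch, not the lab solution: it assumes a Unix system, must run in the main thread (Python only installs signal handlers there), and uses the POSIX spelling SIGCHLD for the signal. The exit codes 7 and 9 are arbitrary.

```python
import os
import signal
import time

def wait_synchronously():
    """The parent blocks in waitpid() until the child terminates."""
    pid = os.fork()
    if pid == 0:                        # child
        os._exit(7)
    _, status = os.waitpid(pid, 0)      # synchronous completion
    return os.WEXITSTATUS(status)

def wait_via_signal():
    """The parent keeps working and is interrupted when the child dies."""
    notified = []
    previous = signal.signal(signal.SIGCHLD,
                             lambda signum, frame: notified.append(signum))
    try:
        pid = os.fork()
        if pid == 0:                    # child
            time.sleep(0.1)
            os._exit(9)
        work_done = 0
        deadline = time.monotonic() + 5     # safety net for the demo
        while not notified and time.monotonic() < deadline:
            work_done += 1              # the parent does other things meanwhile
            time.sleep(0.01)
        _, status = os.waitpid(pid, 0)  # collect the child's exit status
        return os.WEXITSTATUS(status), work_done
    finally:
        # Restore whatever handler was in place before the demo.
        signal.signal(signal.SIGCHLD,
                      signal.SIG_DFL if previous is None else previous)

if __name__ == "__main__":
    print(wait_synchronously())
    print(wait_via_signal())
```

In the second version the handler only records that the signal arrived; the exit status is still collected with a wait call, which by then returns without blocking for long.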
An independently schedulable unit of sequential execution that shares the address space and other resources of a process.
Typically, the only context associated with a thread is a distinct stack, a private set of registers, and a private program-counter and processor status word.
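A small sketch of what is shared and what is private. The names are hypothetical: `shared_log` lives in the address space that all threads of the process share, while `local_sum` lives on each thread's own stack.

```python
import threading

shared_log = []                 # in the process's address space: seen by all threads
log_lock = threading.Lock()

def worker(name, count):
    local_sum = 0               # on this thread's own stack: private to it
    for i in range(count):
        local_sum += i
    with log_lock:              # shared data needs synchronization
        shared_log.append((name, local_sum))

threads = [threading.Thread(target=worker, args=(f"t{i}", 10 * (i + 1)))
           for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_log))
```

Each thread computes its own sum without interference, but all three results land in the one list that they share.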
A program or file that appears to be something useful (e.g. a text editor, game, or Pamela Anderson honeymoon video), but secretly does something malicious (e.g. copying or destroying your files).
The term is a reference to the legend of the Trojan War (recounted in Virgil's Aeneid and mentioned in Homer's Odyssey), in which a large wooden horse was built and left outside the gates of Troy by the Greeks who had been laying siege to the city. When the Greeks left, the Trojans (thinking the horse a tribute to their valor) brought it into the city and had a victory party. That night, Greek soldiers hidden within the horse came out and overran the city.
A virus is a program that makes copies of itself, and uses them to reinfect the same or other systems. Unlike a worm (which can penetrate a system by itself), a virus is carried into a system on a seemingly useful Trojan Horse (e.g. a program or e-mail enclosure).
A worm is a program that penetrates the protective outer layers of an operating system and gains access that should not otherwise have been available. Once in the system it may promptly do something, or it may wait around until it is needed.
It can be argued that this was one of the key concepts that enabled the industrial revolution.
In this context, an interface is a complete specification of the behavior that some component should have. The specification might be in terms of an Application Program Interface, an Application Binary Interface, a network protocol, a data format, device registers, or times and voltages. In all of these cases we specify, not how the component must be built, but how the component should behave.
This interface specification becomes the basis for interoperability and upgradability. Clients design their products around specified components, and component providers build widgets that meet those specifications. If the specifications are clear, and everyone has complied with them, the client's product should work with the supplied components.
Later, if the supplier decides to change their component implementation (e.g. to reduce cost or improve performance), the change should not affect their clients (as long as the new component still conforms to the original standards). Similarly, the client can decide to change suppliers ... as long as the new supplier's components also conform to the original specification.
If the client had engineered their product directly to the supplier's original component (rather than to a neutral specification), it might not be possible for the supplier to change implementation, or for the client to change suppliers. Engineering to interfaces (rather than implementations) is the basis for component interchangeability and interoperability.
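As a sketch of engineering to an interface: the `Storage` specification, the two suppliers, and the client below are all hypothetical, invented for this illustration. The client is written against the specification only, so either supplier's component can be swapped in.

```python
from typing import Protocol

class Storage(Protocol):
    """The interface: what a conforming component must do, not how."""
    def put(self, key: str, value: str) -> None: ...
    def get(self, key: str) -> str: ...

class DictStorage:
    """Supplier 1: implemented with a hash table."""
    def __init__(self):
        self._items = {}

    def put(self, key, value):
        self._items[key] = value

    def get(self, key):
        return self._items[key]

class ListStorage:
    """Supplier 2: different internals (an append-only list), same behavior."""
    def __init__(self):
        self._entries = []

    def put(self, key, value):
        self._entries.append((key, value))

    def get(self, key):
        # The most recent entry for a key wins.
        for k, v in reversed(self._entries):
            if k == key:
                return v
        raise KeyError(key)

def client(storage: Storage) -> str:
    """Depends only on the interface, never on a particular supplier."""
    storage.put("greeting", "hello")
    storage.put("greeting", "hello again")
    return storage.get("greeting")

if __name__ == "__main__":
    print(client(DictStorage()), client(ListStorage()))
```

Had `client` reached into `DictStorage._items` directly, switching to the second supplier would have broken it.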
A system that exhibits Mechanism/Policy separation is one where it is possible to change the behavior of the system without changing its underlying implementation.
It is rare to find a single resource management policy that meets the needs of all customers for all times. This is true of all types of resources, whether we are talking about CPU scheduling, file access control, or the allocation of network bandwidth. Different customers have different needs, and a policy that worked well at one time may not work well as applications and usage evolve. Because of this, hard-coding resource management policy (who gets what resources when) into a resource management mechanism almost guarantees that it will not satisfy anyone.
When designing resource management mechanisms, we must consider the range of policies that people might want to impose on them, and then we should design our mechanisms so that externally specifiable parameters can be used to control the resource management policies.
A good non-OS example of this is electronic card keys. We do not encode keys with a list of doors that they can open. Rather we encode the key with the identity of its owner, and then we create a database that specifies which people are allowed in which rooms at which times. The mechanisms (cards, readers, electronically controlled locks, and an access database) enable, but do not control, the policy. The access control database can be used to implement a myriad of policies on top of those basic mechanisms.
Modularity refers to the implementation of desired functionality in independently changeable/replaceable modules. The key to modularity is well specified interfaces and minimal interdependencies between distinct modules.
Well engineered modules usually hide details (both mechanism and behavior) of their implementation, exposing only a standard interface (as above). The implementation of a standard function may involve complex mechanisms (e.g. synchronization or error recovery), but these mechanisms should (as much as possible) be opaquely encapsulated within the module so that clients do not have to be aware of them.
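The card key example can be sketched as follows. The `Rule` record, the people, the doors, and the hours are all made up for the illustration. The point is that `may_open` (the mechanism) never changes, while the database handed to it (the policy) can be replaced at will.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    owner: str
    door: str
    first_hour: int     # inclusive
    last_hour: int      # exclusive

def may_open(policy, owner, door, hour):
    """Mechanism: check the card's owner against whatever policy it is given."""
    return any(rule.owner == owner and rule.door == door
               and rule.first_hour <= hour < rule.last_hour
               for rule in policy)

# Two different policies, expressed purely as data.
office_hours = [Rule("alice", "lab", 8, 18), Rule("bob", "lobby", 0, 24)]
lockdown = [Rule("alice", "lab", 0, 24)]

if __name__ == "__main__":
    print(may_open(office_hours, "alice", "lab", 9))    # True
    print(may_open(office_hours, "alice", "lab", 20))   # False
    print(may_open(lockdown, "alice", "lab", 20))       # True
    print(may_open(lockdown, "bob", "lobby", 12))       # False
```

Changing who may go where, and when, means editing the rule list only; the checking code and the cards themselves stay as they are.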
Hiding such details has at least two major advantages: