All of the quizzes combined are worth 10% of the grade, and each quiz covers one day's assigned reading. One will be given in the first five minutes of every lecture period for which reading was assigned. The primary purpose of the quizzes is to encourage you to do the reading, and thus come to each lecture well prepared to understand the material that will be presented. A secondary purpose of the quizzes is to enable me to assess what concepts you are having trouble with, so that I can give them greater emphasis in future lectures.
Each quiz will have four questions. Quiz questions usually ask for the definition of a key term, distinctions between two key terms, or examples of a key concept. Most quiz questions can be answered in a single sentence (or even a few words).
There are no make-ups for missed quizzes.
What is a "serially reusable resource"?
a resource that can be used by one process/user at a time
There are three normal exams, each worth 10% of the grade, and each covering approximately six lectures' worth of material. The first will be given at the end of the third week or the beginning of the fourth week (so it can be returned to you before the final drop date). The second is usually given during the seventh week. The last is given during the first 90 minutes of the final exam period. The purpose of these exams is to determine whether or not you understand the key concepts that have been discussed in the preceding lectures and chapters.
A typical exam will consist of roughly 10-15 questions. Some may ask for definitions and examples, but most will ask you to describe how or why something works, to contrast related concepts, to explain which principles are applicable, or to predict what would happen in some situation. The vast majority of these questions will pertain to designated key concepts, and in most cases the answers will have been presented in the text, the lectures or both. Most exam questions have brief (2-4 sentence or a simple diagram) answers.
Make-ups for missed exams will be administered only if you were medically incapacitated at the time of the exam, and will require a physician's certification. In those cases, a (very different) make-up exam will be administered after the end of the quarter.
What does it mean for a protocol to be "stateless" and why is this a desirable thing in client/server protocols?
The server is not assumed to maintain "session state" (i.e. remember anything about previous operations in the session).
Stateless protocols make it relatively easy to recover from server failures. A back-up server can take over the session at any time, because the protocol does not assume that the new server remembers what the old server had done in the past.
There is also a comprehensive final exam, worth 20% of the grade and covering the entire course. The final exam is given during the last 90 minutes of the final exam period. The purpose of the final exam is to determine whether or not you understand the key principles well enough to work with them and apply them to real problems.
The final exam questions are much harder than the questions on the normal exams. A few students always score in the 90s, but the median scores tend to be in the 55-60 range. The final exam is usually what separates the A students from the B students.
The final exam contains multiple (6-12) multi-part problems, half conceptual and half practical. You can choose a few of those questions (e.g. 2 conceptual and 2 practical) to answer. All of the questions will center around the designated key concepts, and they will all involve applications or extensions of those concepts that were never discussed in class. The required answers are not necessarily long, but may require considerable thought.
A Cable File System
A cable TV provider has moved to ultra-high-speed (10^12 bytes per second) digital fiber (direct to hundreds of thousands of subscribers) and has enough bandwidth to offer a wide range of new data services. The out-bound broadcast data stream will be a combination of service information (e.g. program guides and Pay Per View Authorizations), popular information (stock quotes, newspaper articles, encyclopedias, ...), and content requests for individual subscribers (e.g. HTML and e-mail).
Standard information will be broadcast on a regular basis. Small and popular things (like the program guide) will be sent at least once per second. Larger and less frequently used things (like the Encyclopedia Britannica) may be sent only once every few minutes, and will be delayable during times of high demand. Content requested by a particular subscriber would be broadcast only once (on the assumption that the subscriber will reissue the request if they don't receive an answer).
As the engineers started designing the data stream format, they realized that they were actually designing a file system, where data was stored in blocks of time rather than in blocks of disk ... but they still had to solve all of the problems that have to be solved in every file system ... plus many of the problems encountered by schedulers!
Would you make all files contiguous, or would you break some (or all) files up into multiple blocks? If "no", explain why this is not necessary. If "yes", describe the data structure you would use to find a particular block in a particular file.
The cable is probably fast enough to send most files in much less than one microsecond, but huge files (like the Britannica) might take so long to send that they could delay higher priority tasks, and so we should have the ability to break up a file into multiple chunks (effectively allowing preemptive scheduling).
Each chunk of file could be prefixed with a header that identified the file, and the range of bytes contained in this chunk.
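As an illustration only, the chunk header described in this sample answer could be sketched as follows. The names `ChunkHeader`, `split_into_chunks`, and `reassemble` are hypothetical, and the field layout is an assumption; the answer itself only requires that each chunk identify its file and its byte range.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Tuple


@dataclass(frozen=True)
class ChunkHeader:
    """Hypothetical header broadcast in front of every chunk."""
    file_id: str     # identifies the file this chunk belongs to
    first_byte: int  # offset of the first byte carried by this chunk
    last_byte: int   # offset of the last byte carried (inclusive)


def split_into_chunks(file_id: str, data: bytes,
                      chunk_size: int) -> Iterator[Tuple[ChunkHeader, bytes]]:
    """Yield (header, payload) pairs that together cover the whole file."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(data), chunk_size):
        payload = data[start:start + chunk_size]
        yield ChunkHeader(file_id, start, start + len(payload) - 1), payload


def reassemble(chunks: Iterable[Tuple[ChunkHeader, bytes]]) -> bytes:
    """Rebuild a file from its chunks, which may arrive in any order."""
    ordered = sorted(chunks, key=lambda chunk: chunk[0].first_byte)
    return b"".join(payload for _, payload in ordered)
```

Because every header carries its own byte range, a receiver can collect chunks in whatever order the scheduler happens to send them.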
What kind of data structures would you use to locate a desired file, and what information would be included in a directory entry? How would you deal with files that were regularly repeated?
A "directory" would be a list of files (or file chunks) and times when they were next scheduled to be transmitted. I would put (in each entry) the expected time of transmission, the name of the file, the range of bytes in the chunk, and perhaps a brief description (for browsing purposes).
Clients probably don't care how often a file is to be sent. They just need to know when it will next be sent.
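A directory of this kind might look like the sketch below. `DirectoryEntry` and `next_transmission` are hypothetical names, and representing times as seconds on a shared clock is an assumption made only for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class DirectoryEntry:
    """Hypothetical directory entry for one scheduled chunk."""
    send_time: float       # expected transmission time (seconds, assumed clock)
    name: str              # name of the file
    first_byte: int        # first byte of the file carried by this chunk
    last_byte: int         # last byte carried (inclusive)
    description: str = ""  # optional text for browsing


def next_transmission(directory: Iterable[DirectoryEntry], name: str,
                      now: float) -> Optional[DirectoryEntry]:
    """Return the earliest entry for `name` at or after `now`, if any."""
    upcoming = [e for e in directory if e.name == name and e.send_time >= now]
    return min(upcoming, key=lambda e: e.send_time, default=None)
```

A regularly repeated file simply appears several times in the directory with different transmission times; the client only ever asks for the next one.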
How would you structure and maintain the "free-time-list" and what strategies would you use for allocating time to a particular file?
Unlike disk space, which is bounded, the amount of available "time" extends to eternity, and so is infinite. This makes it impractical to maintain a list of free time slots. I would use the directory of currently scheduled transmissions as my used-space list. Any time slots not used can be assumed to be free.
Whenever I needed time for a file, I would first break the file into reasonable sized chunks, and then do first-fit allocation to schedule each of the successive chunks. The use of a standard chunk size would be analogous to paged allocation, and would help to control both internal and external fragmentation.
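The allocation strategy in this answer can be sketched roughly as follows. This is a minimal illustration, not a definitive design: the names `first_fit` and `schedule_file` are hypothetical, and it assumes the list of scheduled transmissions is kept sorted by start time and never overlaps.

```python
from typing import List, Tuple

Slot = Tuple[float, float]  # (start_time, end_time) of a scheduled transmission


def first_fit(scheduled: List[Slot], duration: float,
              earliest: float = 0.0) -> float:
    """Reserve the first gap of at least `duration` starting at or after
    `earliest`, and return its start time.

    `scheduled` is the used-time list (sorted, non-overlapping); any time
    not listed in it is assumed to be free.
    """
    candidate = earliest
    for index, (start, end) in enumerate(scheduled):
        if start - candidate >= duration:
            scheduled.insert(index, (candidate, candidate + duration))
            return candidate
        candidate = max(candidate, end)
    # No gap was large enough, so use the unbounded free time at the end.
    scheduled.append((candidate, candidate + duration))
    return candidate


def schedule_file(scheduled: List[Slot], file_bytes: int, chunk_bytes: int,
                  bytes_per_second: float, earliest: float = 0.0) -> List[float]:
    """Break a file into fixed-size chunks and first-fit each one in order."""
    starts = []
    remaining = file_bytes
    while remaining > 0:
        size = min(chunk_bytes, remaining)
        duration = size / bytes_per_second
        start = first_fit(scheduled, duration, earliest)
        starts.append(start)
        earliest = start + duration  # keep successive chunks in order
        remaining -= size
    return starts
```

Since the free time after the last scheduled transmission is unbounded, `first_fit` always succeeds; the only question is how long the file has to wait.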
Most of the data broadcast over the cable would be public, but some requested content (like e-mail) would be private. How could we control access to private data?
Encryption. The cable operator could use the requestor's public key, or the requestor could specify an asymmetric session key with each request. Either way, the broadcast data would only be decipherable by the intended recipient.
Paging vs. Swapping
In class, we discussed waiting time in the ready queue as a reasonable metric for CPU scheduler performance. This question deals with finding metrics for the memory scheduler performance. In a landmark paper, Peter Denning asserted that the cost of any memory management strategy can be measured by the sum (over all processes) of the time-integral, for each process, of the number of page frames assigned to that process (or more simply the total number of page-seconds consumed by all of the processes in the system). If, for instance, a process has an image size of "N" pages and we leave it in memory for "T" seconds, the time-space cost of this decision is N*T page-seconds.
Why is this time-space product a reasonable measure for the cost or performance of a memory management strategy?
If each process uses fewer pages of memory, more processes can be fit in memory at one time, meaning that the scheduler is more likely to have ready tasks to run, and the CPU will be kept busy, and system throughput will be maximized. Thus, if the memory manager can run more processes in fewer pages, it is doing a good job.
If we were to use pure demand-paging (w/o any pre-loading) rather than swapping, what (approximately) would be the time-space cost of allowing the same process to run for the same "T" seconds (if we assume that we started with no pages and that the process faulted for its N pages uniformly over the "T" second period)?
At the beginning there would be no pages in memory, and by the end there would be N pages. If the fault rate is uniform, the area under that curve is T*N/2.
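The area in this answer can be written out explicitly. The symbol p(t), the number of pages resident at time t, is introduced here only for the derivation; with a uniform fault rate it grows linearly from 0 to N.

```latex
p(t) = \frac{N}{T}\,t, \qquad 0 \le t \le T
\qquad\Longrightarrow\qquad
\int_0^T p(t)\,dt = \frac{N}{T}\cdot\frac{T^2}{2} = \frac{N\,T}{2}
```

Under these assumptions the demand-paged process costs half of the N*T page-seconds charged to the swapped process.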
Is it likely that the demand paged process would require the same number of pages (n) as the swapped process (N)? Why or why not?
It is most likely that (n < N). Most programs exhibit reasonable code and data locality, so that during a small period of time they reference only a fraction of their pages (perhaps because they are executing one loop, calling only a few subroutines, and manipulating only a small amount of data).
The amount of time that a process is in memory is not merely the time it takes the process to run (T) but also includes the time it takes to swap the process in and out of memory. If we can swap N pages in or out in Ts(N) seconds, the time-space product for swapping becomes N*(T+2Ts(N)). Suppose that we can fault-in or replace-out N pages in Tp(N) seconds. Which is likely to take longer: swapping in N pages, or page-faulting for N pages? Why?
Tp(N) is likely to be greater than Ts(N) for a few reasons:
A process is usually swapped out to one contiguous region on disk, which means that the I/O involves little or no head motion. A process that is demand paging might have pages all over the disk.
There is start-up and completion overhead associated with each I/O operation. In swapping, there is likely to be only one large request, whereas in demand paging, there is likely to be one request per page, each with its own start-up and completion overheads.
Beyond the I/O time, there is also an overhead associated with taking each page-fault.
What mathematical relationship must hold (between T, Ts, Tp, N and n) in order for demand paging to out-perform swapping? What does this mean about the programs, operating system, or hardware?
The cost of swapping is (from above) N*(T+2Ts(N)). The corresponding cost of demand paging is n*(T+2Tp(n)). Demand paging will win if "n*(T+2Tp(n)) < N*(T+2Ts(N))", or (perhaps more clearly) if "n/N < (T+2Ts(N))/(T+2Tp(n))".
This means that either n is much smaller than N (the programs exhibit good locality) or that Tp is not much greater than Ts (the performance penalty for page faulting and scattered I/O is low).
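To make the comparison concrete, the two cost formulas from this answer can be evaluated for some made-up numbers. The function names and the sample values below are purely illustrative; `ts` and `tp` stand for the already-evaluated times Ts(N) and Tp(n).

```python
def swapping_cost(N: float, T: float, ts: float) -> float:
    """Time-space cost of swapping, N*(T + 2*Ts(N)), in page-seconds."""
    return N * (T + 2 * ts)


def paging_cost(n: float, T: float, tp: float) -> float:
    """Time-space cost of demand paging, n*(T + 2*Tp(n)), in page-seconds."""
    return n * (T + 2 * tp)


def paging_wins(N: float, n: float, T: float, ts: float, tp: float) -> bool:
    """True when demand paging has the lower time-space cost."""
    return paging_cost(n, T, tp) < swapping_cost(N, T, ts)


# Illustrative numbers only: a 100-page image that runs for 10 seconds,
# 1 second to swap it, and 5 seconds of total page-fault I/O.
print(paging_wins(N=100, n=40, T=10, ts=1, tp=5))  # good locality
print(paging_wins(N=100, n=90, T=10, ts=1, tp=5))  # poor locality
```

With these numbers the swapping cost is 1200 page-seconds. Demand paging costs 800 page-seconds when n = 40 and wins, but 1800 page-seconds when n = 90 and loses, which matches the condition n/N < (T+2Ts(N))/(T+2Tp(n)) = 0.6.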
Last updated: May 4, 2001