Original link: http://harttle.com/2014/04/10/modern-os.html
The first digital computer was designed by Charles Babbage. Ada Lovelace was the first programmer (hired by Babbage).
batch system
IBM System/360
A series of software-compatible machines differed only in price and performance.
The 360 was the first major computer line to use ICs (Integrated Circuits).
OS/360 had to work on all models. The result was an enormous and complex OS; each release fixed some bugs and introduced new ones.
multiprogramming
Partition memory into several pieces, with a different job in each partition. While one job was waiting for I/O, another job could be using the CPU.
spooling
Simultaneous Peripheral Operation On Line. The OS read jobs from cards onto the disk as soon as they were brought to the computer room. Whenever a running job finished, the operating system could load a new job from the disk and run it.
timesharing
Providing quick response times. The first general-purpose timesharing system, CTSS (Compatible Time Sharing System), was developed at MIT.
MULTICS(MULTiplexed Information and Computing Service)
Developed by MIT, Bell Labs, and General Electric, MULTICS was designed to support hundreds of users on a machine only slightly more powerful than an Intel 386-based PC.
minicomputer
DEC PDP-1 (1961) and other PDPs (all incompatible)
UNIX
Developed by Ken Thompson (initially on a PDP-7 minicomputer). There are 2 major versions:
System V (from AT&T) and BSD (Berkeley Software Distribution).
In 1987, MINIX was released for educational purposes.
With the development of LSI (Large Scale Integration) circuits, microcomputers appeared.
Kildall developed CP/M (Control Program for Microcomputers), a disk-based OS that later supported the Intel 8080, Zilog Z80, and other CPU chips.
Bill Gates offered DOS (Disk Operating System, later renamed MS-DOS) to IBM.
Engelbart invented the GUI (Graphical User Interface), which was adopted by Xerox PARC.
Apple
Steve Jobs visited PARC and embarked on building an Apple with a GUI.
Windows
Microsoft released Windows 95 and Windows 98 (both still containing 16-bit Intel code), Windows NT (New Technology, fully 32-bit), Windows Me (Millennium Edition), Windows 2000 (1999, renamed from Windows NT 5), and Windows XP (2001).
UNIX
FreeBSD, originating from the BSD project at Berkeley. The X Window System (X11), from MIT.
Mainframe OS
Oriented toward processing many jobs at once, most of which need prodigious amounts of I/O.
Server OS
Serve multiple users at once over a network and allow the users to share hardware and software resources.
Multiprocessor OS
Special features for communication, connectivity, and consistency.
Personal Computer OS
Provide good support to a single user.
Handheld Computer OS
PDAs (Personal Digital Assistants) and mobile phones.
Embedded OS
Do not accept user-installed software.
Sensor Node OS
Tiny computers that communicate with each other and with a base station using wireless communication.
Real-Time OS
Hard real-time: the action absolutely must occur at a certain moment.
Soft real-time: missing an occasional deadline is acceptable and does not cause any permanent damage.
Smart Card OS
Some are Java-oriented. The ROM holds an interpreter for the JVM.
Resource management and protection.
Basic structures
THE system built by E. W. Dijkstra.
layer | function |
---|---|
5 | The operator |
4 | User programs |
3 | I/O management |
2 | Operator-process communication |
1 | Memory and drum management |
0 | Processor allocation and multiprogramming |
MULTICS was described as having a series of concentric rings, with the inner ones being more privileged than the outer ones.
The advantage is that it can be easily extended to structure user subsystems.
Achieve high reliability by splitting the OS into small, well-defined modules, only one of which (the microkernel) runs in kernel mode.
An idea related to having a minimal kernel is to put the mechanism for doing something in the kernel but not the policy.
A few of the better-known microkernels: Integrity, K42, Symbian, and MINIX 3.
A slight variation of the microkernel idea is to distinguish 2 classes of processes: the servers (which provide services) and the clients (which use these services).
Also known as a virtual machine monitor, it runs directly on the hardware.
CP/CMS, later renamed VM/370, is a timesharing system that provides
CMS (Conversational Monitor System), a single-user, interactive processing OS.
Runs upon the Host OS, like other applications.
Rather than cloning the actual machine, another strategy is partitioning it, giving each user a subset of the resources.
At the bottom layer, running in kernel mode, is a program called the exokernel.
pseudoparallelism
The illusion of parallelism while CPU is switching from process to process quickly.
A process is an instance of an executing program; it is the activity of a program.
If a program is running twice, it counts as two processes.
Processes are created when
In UNIX, the `fork` system call creates an exact clone of the calling process. `execve` is another syscall, used to change the process's memory image and run a new program.
This separation allows the child to manipulate its file descriptors after `fork` but before the `execve` in order to accomplish redirections.
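A minimal sketch of that redirection window, assuming a UNIX-like system and using Python's thin wrappers over the same system calls. The helper name `run_redirected` and the output file are made up for illustration; the child rewires descriptor 1 after `fork` and before `exec`, so the new program never knows its output goes to a file.

```python
import os
import sys
import tempfile

def run_redirected(code, out_path):
    """Run `python -c code` in a child whose stdout is redirected to out_path."""
    pid = os.fork()                      # clone the calling process
    if pid == 0:                         # child: still running this program
        try:
            fd = os.open(out_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
            os.dup2(fd, 1)               # descriptor 1 (stdout) now refers to the file
            os.close(fd)
            # replace the memory image with a new program
            os.execv(sys.executable, [sys.executable, "-c", code])
        finally:
            os._exit(127)                # only reached if something above failed
    _, status = os.waitpid(pid, 0)       # parent: wait for the child to finish
    return os.WEXITSTATUS(status)

out_file = os.path.join(tempfile.mkdtemp(), "out.txt")
exit_code = run_redirected("print('hello from child')", out_file)
with open(out_file) as f:
    captured = f.read()
```

The parent's own standard output is untouched, because the descriptor change happened only in the child's copy of the descriptor table.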
In Windows, the `CreateProcess` system call handles both process creation and loading, and takes 10 parameters.
It’s possible for a newly created process to share resources such as open files.
When processes terminate
Voluntary termination: `exit` in UNIX and `ExitProcess` in Windows.
Killing some other process: `kill` in UNIX and `TerminateProcess` in Windows.
In UNIX, a process and all its children and further descendants form a process group. User signals are sent to all members of the group.
`init` is the first process created at boot time. Thus all processes in the whole system belong to a single tree with `init` as the root.
Windows has no concept of a process hierarchy, although a handle (a special token used to control the child) is returned when a new process is created.
The OS maintains a process table, with one entry (process control block) per process. Each entry contains the program counter, stack pointer, memory allocation, status of open files, scheduling information, registers, and everything else that must be saved when the process is swapped out.
With a PCB, process can be saved when interrupted and swapped out. The interrupt routine may be as follows:
Suppose p is the fraction of time a process spends waiting for I/O and n is the current number of processes. Then
$$\text{CPU utilization} = 1 - p^n$$
where n is called the degree of multiprogramming.
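A small numeric sketch of the utilization formula, $1 - p^n$, where p is the I/O wait fraction and n the degree of multiprogramming. The 80% I/O wait figure is an assumed value, and the model treats the processes' waits as independent, so the numbers are only approximate.

```python
def cpu_utilization(p, n):
    """Probability that at least one of n processes is not waiting for I/O."""
    return 1 - p ** n

# With an assumed 80% I/O wait, utilization climbs slowly as processes are added.
for n in (1, 2, 4, 8):
    print(n, round(cpu_utilization(0.8, n), 3))
```

Under these assumptions, going from 4 to 8 processes raises utilization from about 59% to about 83%.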
Three ways to construct a server
Processes are used to group resources together; threads are the entities scheduled for execution on the CPU. There is no protection between threads because (1) it is impossible, and (2) it should not be necessary.
Because threads share one memory space, it takes less space to maintain a thread: only its program counter, registers, stack, and state are kept per thread.
- `pthread_create`: create a new thread.
- `pthread_exit`: terminate the calling thread.
- `pthread_join`: wait for a specific thread to exit.
- `pthread_yield`: release the CPU.
- `pthread_attr_init`: create and initialize a thread's attribute structure.
- `pthread_attr_destroy`: remove a thread's attribute structure.
Advantages
Problems
Implementation of blocking system calls.
These calls would block the whole process (all threads in it), since the kernel knows nothing about the threads. This problem can be solved by adding wrappers to all blocking system calls.
Page faults.
The same as above.
Due to the relatively greater cost of creating and destroying threads in the kernel, some systems take an environmentally correct approach and recycle their threads.
Problems
Programmers can determine how many kernel threads to use and how many user-level threads to multiplex on each one.
The kernel notifies the process’ run-time system to switch thread, thus avoiding unnecessary transitions between user and kernel space.
This implementation violates the structure inherent in any layered system.
On arrival of a message, the system creates a new thread to handle it. Since a pop-up thread has no history, it is quicker to create one than to restore an existing thread.
Problems to be solved
global variables
Private global variables. A new library to create, set, read these variables is needed.
many library procedures are not reentrant
A jacket for each of these procedures is needed.
stack management
Stack overflows may go undetected.
race conditions
Two or more processes are reading or writing some shared data and the final result depends on who runs precisely when.
mutual exclusion
Prohibit more than one process from reading and writing the shared data at the same time.
critical region
Also called critical section , the part of the program where the shared memory is accessed.
Conditions to a good solution
Problems
This cannot work without atomic operations.
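```c
/* Process 1 */
while (TRUE) {
    while (turn != 0);      /* busy-wait until it is our turn */
    critical_region();
    turn = 1;               /* hand the turn to the other process */
    noncritical_region();
}

/* Process 2 */
while (TRUE) {
    while (turn != 1);      /* busy-wait until it is our turn */
    critical_region();
    turn = 0;               /* hand the turn back */
    noncritical_region();
}
```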
Problem: neither of them could run twice in a row, which violates condition 3.
Continuously testing a variable until some value appears is called busy waiting. A lock that uses busy waiting is called a spin lock.
Busy waiting lock variables.
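```c
#define FALSE 0
#define TRUE  1
#define N     2                     /* number of processes */

int turn;                           /* whose turn is it? */
int interested[N];                  /* all values initially 0 (FALSE) */

void enter_region(int process)      /* process is 0 or 1 */
{
    int other = 1 - process;        /* number of the other process */
    interested[process] = TRUE;     /* show that we are interested */
    turn = process;                 /* set the flag */
    while (turn == process && interested[other] == TRUE)
        ;                           /* busy-wait */
}

void leave_region(int process)      /* process: who is leaving */
{
    interested[process] = FALSE;    /* indicate departure from the critical region */
}
```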
The `TSL RX, LOCK` (Test and Set Lock) instruction reads the content of the memory word `lock` into register `RX` and then stores a nonzero value at the memory address `lock`.
It is guaranteed by the hardware that the read and set operations are indivisible.
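```
enter_region:
    TSL REGISTER, LOCK    ; copy lock to register and set lock to nonzero
    CMP REGISTER, #0      ; was lock zero?
    JNE enter_region      ; if not, the lock was already set, so loop
    RET                   ; return; critical region entered
leave_region:
    MOVE LOCK, #0         ; store 0 in lock
    RET                   ; return
```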
An alternative instruction to `TSL` is `XCHG`. The implementations are similar.
Problems in busy waiting:
```
Producer loop:
    if count == N:
        sleep
    else:
        produce one
        if count == 1:
            wakeup consumer
Consumer loop:
    if count == 0:
        sleep
    else:
        consume one
        if count == N-1:
            wakeup producer
```
Since access to `count` is unconstrained, a race condition can occur. If the consumer reads `count == 0` but is preempted before it actually sleeps, the wakeup signal from the producer is lost, and eventually both of them sleep forever.
Semaphores are used to buffer wakeup signals so that none are lost. A value of 0 indicates that no wakeups are saved; a positive value indicates that some wakeups are pending.
`down` (Proberen, try in Dutch) operation: check the value and either consume one wakeup or sleep until one arrives.
`up` (Verhogen, raise in Dutch) operation: produce one wakeup; if a process is waiting, wake it.
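A sketch of the producer-consumer problem solved with semaphores, using Python's `threading.Semaphore`, where `acquire` plays the role of `down` and `release` the role of `up`. The buffer size, item count, and the names `empty`, `full`, and `mutex` are choices made for this illustration.

```python
import threading
from collections import deque

N = 4                                  # buffer capacity (assumed value)
buffer = deque()
mutex = threading.Semaphore(1)         # guards access to the buffer
empty = threading.Semaphore(N)         # counts free slots
full = threading.Semaphore(0)          # counts filled slots

def producer(items):
    for item in items:
        empty.acquire()                # down(empty): wait for a free slot
        mutex.acquire()
        buffer.append(item)
        mutex.release()
        full.release()                 # up(full): one more item is available

def consumer(count, out):
    for _ in range(count):
        full.acquire()                 # down(full): wait for an item
        mutex.acquire()
        item = buffer.popleft()
        mutex.release()
        empty.release()                # up(empty): one more slot is free
        out.append(item)

consumed = []
threads = [
    threading.Thread(target=producer, args=(range(20),)),
    threading.Thread(target=consumer, args=(20, consumed)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the semaphores remember pending wakeups, a signal sent before the other side goes to sleep is not lost, which is exactly the failure of the plain sleep/wakeup version.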
mutex
A kind of semaphore, a variable that can be in 1 of 2 states: unlocked or locked, used when the semaphore’s ability to count is not needed.
When several mutexes are involved, a subtle error can easily cause a deadlock. Monitors are provided by some programming languages to manage a group of mutually exclusive threads.
Only one of the threads in this group can be active in the monitor at any moment. It is up to the compiler to arrange the mutexes that implement the monitor.
Message passing is used for information exchange between machines. This method of interprocess communication uses 2 primitives, `send` and `receive`.
Approaches:
mailbox
A mailbox is a place to buffer a certain number of messages.
rendezvous
No buffering: each primitive blocks until the other one occurs.
With multiple processes, a barrier can be placed at the end of each phase. When a process reaches the barrier, it’s blocked until all processes have reached the barrier.
Scheduling is simpler on personal computers because (1) there is usually only one active process, and (2) the CPU is much faster than I/O.
Compute-bound: long CPU bursts and thus infrequent I/O waits.
I/O-bound: short CPU bursts and thus frequent I/O waits.
Types of scheduling algorithms
All systems
Batch systems
Interactive systems
Real-time systems
Reduce the mean turnaround time.
When a new process arrives, its execution time is compared with the remaining time of the current process. If the new process needs less time, it preempts the current one.
Each process is assigned a time interval, called its quantum , during which it is allowed to run. The CPU switches when the process blocks, of course.
Setting the quantum too short causes too many process switches and lowers the CPU efficiency, but setting it too long may cause poor response to short interactive requests. A quantum of around 20-50 msec is often a reasonable compromise.
Each process is assigned a priority, and the runnable process with the highest priority is allowed to run. The priority of the running process decreases at each clock tick.
A simple algorithm for giving a good service to I/O bound processes is to set the priority to 1/f, where f is the fraction of the last quantum that a process used.
Set up priority classes, a group of processes sorted by priority in each class.
Whenever a process used up all the quanta allocated to it, it was moved down one class, saving the CPU for short, interactive processes.
Whenever a carriage return was typed at a terminal, the process belonging to that terminal was moved to the highest priority class.
This prevents a process that needed to run for a long time when it first started, but became interactive later, from being punished forever.
To a certain extent, it would be nice if this algorithm used in batch systems could be used for interactive processes.
aging
Estimating the running time as $t = aT_0 + (1 - a)T_1$
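A short sketch of aging applied repeatedly, where each new estimate is $a$ times the previous estimate plus $(1 - a)$ times the latest measured run time. The weight `a = 0.5` and the sample run times are assumed values for illustration.

```python
def aged_estimate(measurements, a=0.5):
    """Fold successive run-time measurements into a single estimate."""
    estimate = measurements[0]                        # T0 is the initial estimate
    for measured in measurements[1:]:
        estimate = a * estimate + (1 - a) * measured  # t = a*T0 + (1 - a)*T1
    return estimate

# Measured run times of 8, 4 and 2 time units:
# 8 -> 0.5*8 + 0.5*4 = 6 -> 0.5*6 + 0.5*2 = 4
print(aged_estimate([8, 4, 2]))
```

With `a = 0.5`, the weight of an old measurement halves with every new one, so recent behavior dominates the estimate.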
Make real promises to the users about performance and then live up to those promises.
Give processes lottery tickets for various system resources. Whenever a scheduling decision has to be made, a lottery ticket is chosen at random, and the process holding that ticket gets the resource.
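A minimal sketch of the ticket draw, assuming three hypothetical processes holding 20, 60, and 20 tickets. The function name `lottery_pick` and the seeded generator are choices made for this illustration, not part of any real scheduler.

```python
import random

def lottery_pick(tickets, rng):
    """Pick a process with probability proportional to its ticket count."""
    winner = rng.randrange(sum(tickets.values()))   # draw one ticket number
    for process, count in tickets.items():
        if winner < count:
            return process
        winner -= count                             # skip this process's tickets

rng = random.Random(0)                  # seeded so runs are repeatable
tickets = {"A": 20, "B": 60, "C": 20}   # hypothetical ticket allocation
wins = {name: 0 for name in tickets}
for _ in range(10000):
    wins[lottery_pick(tickets, rng)] += 1
```

Over many draws, each process wins roughly in proportion to the tickets it holds, so B should receive about 60% of the scheduling decisions.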
Advantages
Each user is allocated some fraction of the CPU and the scheduler picks processes in such a way to enforce it.
For example, if user 1 created processes A, B, C, and D while user 2 created only process E, the scheduling sequence should be:
A E B E C E …
If user 1 is entitled to twice as much CPU time as user 2, the sequence should be:
A B E C D E …
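The events that a real-time system may have to respond to can be categorized as periodic (occurring at regular intervals) or aperiodic (occurring unpredictably).
Depending on how much time each event requires for processing, it may not even be possible to handle them all. If there are m periodic events and event i occurs with period $P_i$ and requires $C_i$ seconds of CPU time, the load can be handled only if $\sum_{i=1}^{m} C_i / P_i \le 1$. A real-time system that meets this requirement is said to be schedulable.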
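A sketch of the usual schedulability check for periodic events: the CPU time each event needs, divided by its period, summed over all events, must not exceed 1. The periods and CPU times below are assumed example values in milliseconds.

```python
def is_schedulable(events):
    """events: list of (period, cpu_time) pairs expressed in the same time unit."""
    load = sum(cpu_time / period for period, cpu_time in events)
    return load <= 1

# Three periodic events: periods 100, 200, 500 ms; CPU times 50, 30, 100 ms.
events = [(100, 50), (200, 30), (500, 100)]
print(is_schedulable(events))                        # load is 0.5 + 0.15 + 0.2
print(is_schedulable(events + [(1000, 200)]))        # a fourth event adds 0.2 more
```

The first set uses 85% of the CPU and fits; adding the fourth event pushes the load past 100%, so no scheduler could meet every deadline.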
Separate the scheduling mechanism from the scheduling policy to allow more flexible scheduling. What this means is that the scheduling algorithm is parameterized in some way.
Running multiple programs by static relocation: modify the second program on the fly as it is loaded into memory.
The lack of memory abstraction is still common in embedded and smart card systems.
Address space is the set of addresses that a process can use to address memory.
Dynamic relocation uses base and limit registers to map each process’ address space onto a different part of physical memory in a simple way.
The disadvantage of relocation using base and limit registers is the need to perform an addition and comparison on every memory reference.
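A sketch of what the hardware does on each memory reference under base-and-limit relocation: one comparison against the limit and one addition of the base. The register values are assumed for illustration, and `MemoryError` stands in for the hardware fault.

```python
def translate(virtual_address, base, limit):
    """Map a process-relative address to a physical one, or fault."""
    if not 0 <= virtual_address < limit:       # the comparison
        raise MemoryError("address outside the process's address space")
    return base + virtual_address              # the addition

# A process loaded at physical address 16384 with a 16384-byte address space.
physical = translate(28, base=16384, limit=16384)

try:
    translate(20000, base=16384, limit=16384)
    out_of_range_faulted = False
except MemoryError:
    out_of_range_faulted = True
```

Here address 28 in the process maps to physical address 16412, while address 20000 lies beyond the limit and is rejected.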
Swapping is used to deal with memory overload, bringing in each process in its entirety, running for a while, then putting it back on the disk.
When swapping creates multiple holes in memory, it’s possible to combine them all into one big one, which is called memory compaction.
Free memory can be recorded as bitmaps or linked lists.
Processes use program-generated addresses, called virtual addresses. They go to an MMU (Memory Management Unit) that maps the virtual addresses onto physical addresses when a memory access occurs.
Virtual memory allows programs to run even when they are only partially in main memory.
The virtual address space is divided into fixed-size units called pages. The corresponding units in the physical memory are called page frames.
The virtual address is split into a virtual page number and an offset. The virtual page number is used as an index into the page table to find the entry for that page. Each page table entry (PTE) consists of a caching-disabled bit, a referenced bit, a modified bit, a present/absent bit, and the page frame number.
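A sketch of the address split and page-table lookup, assuming 4 KB pages and a tiny made-up page table that stores only the present/absent bit and the frame number. `LookupError` stands in for a page fault.

```python
PAGE_SIZE = 4096                       # assumed 4 KB pages

# Hypothetical page table: virtual page number -> (present bit, page frame number)
page_table = {0: (True, 2), 1: (True, 1), 2: (False, None)}

def to_physical(virtual_address):
    """Translate a virtual address, raising LookupError on a page fault."""
    page, offset = divmod(virtual_address, PAGE_SIZE)   # split the address
    present, frame = page_table.get(page, (False, None))
    if not present:
        raise LookupError(f"page fault on virtual page {page}")
    return frame * PAGE_SIZE + offset                   # frame number + offset

print(to_physical(8))      # page 0 lives in frame 2: 2 * 4096 + 8
print(to_physical(4100))   # page 1 lives in frame 1: 1 * 4096 + 4
```

The offset passes through unchanged; only the page number is replaced by the frame number.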
The TLB (Translation Lookaside Buffer) is used to speed up paging. It is usually inside the MMU and consists of a small number of entries. Each entry contains information about one page, including the virtual page number, a bit that is set when the page is modified, the protection code, and the physical page frame in which the page is located.
A multilevel page table avoids keeping all the page tables in memory all the time.
Inverted page table is a solution for 64-bit computers. There is one entry per page frame in real memory, rather than one entry per page of virtual address space.
Each page can be labeled with the number of instructions that will be executed before that page is first referenced. And the page with the highest label should be removed.
The only problem with this algorithm is that it’s unrealizable.
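Although the optimal algorithm cannot be implemented in a running system, it can be simulated when the full reference string is known in advance, which makes it useful as a yardstick. The sketch below uses the position of the next reference in the string as a stand-in for the instruction-count label; the reference string is a commonly used textbook example, not data from these notes.

```python
def optimal_faults(references, frame_count):
    """Count page faults when evicting the page whose next use is farthest away."""
    frames, faults = [], 0
    for i, page in enumerate(references):
        if page in frames:
            continue                                  # hit: nothing to do
        faults += 1
        if len(frames) < frame_count:
            frames.append(page)                       # a free frame is available
            continue
        future = references[i + 1:]

        def label(p):
            # distance to the next reference; pages never used again get infinity
            return future.index(p) if p in future else float("inf")

        frames.remove(max(frames, key=label))         # evict the highest label
        frames.append(page)
    return faults

refs = [7, 0, 1, 2, 0, 3, 0, 4, 2, 3, 0, 3, 2, 1, 2, 0, 1, 7, 0, 1]
print(optimal_faults(refs, 3))
```

Any realizable algorithm can then be judged by how close its fault count comes to this lower bound on the same reference string.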
Keep track of each process’ working set and make sure that it is in memory before letting the process run, which is called prepaging.
Forms of directory system:
Absolute path name: consists of the path from the root directory to the file.
Relative path name: used in conjunction with the concept of the working directory (current directory).
The MBR (Master Boot Record) is the first sector (sector 0) of the disk. It contains the partition table, in which one partition is marked as active.
Every partition contains a superblock, which lies after the boot block and holds all the key parameters about the file system.
The MBR is loaded and executed by the BIOS. The MBR program locates the active partition, reads in its first block (the boot block), and executes it. Then the program in the boot block loads the OS contained in that partition.
Support for longer, variable-length file names. There are 3 implementations:
2 solutions:
All pending writes are buffered in memory, collected into a single segment, and written to the disk as a single contiguous segment at the end of the log.
An i-node map is maintained to make it possible to find i-nodes.
LFS has a cleaner thread that spends its time scanning the log circularly to compact it (removing overwritten blocks).
Keeps a log of what the file system is going to do before it does it, so that if the system crashes before it can do its planned work, upon rebooting the system can look in the log to see what was going on at the time of the crash and finish the job.
Only after the log entry has been written, do the various operations begin. After the operations complete successfully, the log entry is erased.
The logged operations must be idempotent.
VFS tries to integrate multiple file systems into an orderly structure.
In fact, the original motivation for Sun to build the VFS was to support remote file systems using the NFS (Network File System) protocol.
If the allocation unit is too large, we waste space; if it is too small, we waste time.
The disk operation time is the sum of the seek time, rotational delay, and transfer time. The first 2 of them dominate the access time.
Each control register is assigned a unique memory address to which no memory is assigned.
Pros:
Cons:
Direct memory access (DMA) is a feature of modern computers that allows certain hardware subsystems within the computer to access system memory independently of the CPU.
Bus mode thus DMA controller mode:
Precise interrupt
An interrupt that does not meet these requirements is called an imprecise interrupt.
In UNIX, a device name such as `/dev/disk0/` uniquely specifies the i-node for a special file. This i-node contains the major device number, used to locate the appropriate driver, and the minor device number, which is passed as a parameter to the driver and specifies the unit to be read or written.
Double buffering provides a second buffer for characters that keep arriving while the first buffer is being copied out.
A circular buffer is another widely used form of buffering.
A preemptable resource is one that can be taken away from the process owning it with no ill effects.
A nonpreemptable resource is one that cannot be taken away from its current owner without causing the computation to fail.
Resource Acquisition
A possible implementation:
`down` on the semaphore to acquire the resource.
`up` on the semaphore to release the resource.
A set of processes is deadlocked if each process in the set is waiting for an event that only another process in the set can cause.
Conditions for Resource Deadlocks:
Strategies used for dealing with deadlocks:
Just ignore the problem when deadlocks are rare and not that serious.
Few current systems will detect the deadlock between CD-ROM and printer.
For such a system, we can construct a resource graph of the resources and processes. If this graph contains cycles, a deadlock exists.
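A sketch of cycle detection on a resource graph using depth-first search. The graph encoding is an assumption made for this example: an arc from a resource to a process means the process holds it, and an arc from a process to a resource means the process is waiting for it.

```python
def has_cycle(graph):
    """graph: dict mapping each node to the list of nodes its arcs point to."""
    WHITE, GREY, BLACK = 0, 1, 2            # unvisited, on current path, finished
    color = {node: WHITE for node in graph}

    def visit(node):
        color[node] = GREY
        for nxt in graph.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GREY:               # reached a node on the current path
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color[node] == WHITE and visit(node) for node in graph)

# A holds R and wants S; B holds S and wants R: a cycle, hence a deadlock.
deadlocked = {"R": ["A"], "A": ["S"], "S": ["B"], "B": ["R"]}
# Same holdings, but B is not waiting for anything.
fine = {"R": ["A"], "A": ["S"], "S": ["B"], "B": []}
```

This applies when there is one instance of each resource type; with several instances per type, a cycle alone is not enough to conclude that a deadlock exists.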
E is the existing resource vector, which gives the total number of instances of each resource in existence.
A is the available resource vector, which gives the number of instances of resource that are currently available.
C is the current allocation matrix, the i-th row of C tells how many instances of each resource class Pi currently holds.
R is the request matrix, holds the number of instances of resource that processes want.
The following algorithm will mark all processes that are not deadlocked.
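One way to sketch such a marking algorithm, using the available vector A, the allocation matrix C, and the request matrix R as defined in these notes: repeatedly find an unmarked process whose request row fits in A, assume it runs to completion, add its row of C back to A, and mark it. Whatever remains unmarked is deadlocked. The matrices below are illustrative values, and the function name is made up for this example.

```python
def find_deadlocked(available, allocation, request):
    """Return the indices of processes that can never finish."""
    work = list(available)                       # running copy of A
    unmarked = set(range(len(allocation)))
    progress = True
    while progress:
        progress = False
        for i in sorted(unmarked):
            if all(r <= w for r, w in zip(request[i], work)):
                # process i can finish and return everything it holds
                work = [w + c for w, c in zip(work, allocation[i])]
                unmarked.remove(i)
                progress = True
    return sorted(unmarked)

A = [2, 1, 0, 0]
C = [[0, 0, 1, 0], [2, 0, 0, 1], [0, 1, 2, 0]]
R = [[2, 0, 0, 1], [1, 0, 1, 0], [2, 1, 0, 0]]
no_deadlock = find_deadlocked(A, C, R)

# If the third process also needs one unit of the last resource, nobody can proceed.
R_stuck = [[2, 0, 0, 1], [1, 0, 1, 0], [2, 1, 0, 1]]
all_deadlocked = find_deadlocked(A, C, R_stuck)
```

In the first case the third process can run first, which frees enough resources for the second and then the first; in the second case no request can be satisfied, so all three are deadlocked.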
A state is said to be safe if there is some scheduling order in which every process can run to completion even if all of them suddenly request their maximum number of resources immediately.
It is worth noting that an unsafe state is not a deadlocked state. The system can run for a while and one process can even complete. The difference between a safe state and an unsafe state is that from a safe state the system can guarantee that all processes will finish; from an unsafe state, no such guarantee can be given.
This algorithm considers each request as it occurs and checks whether granting it leads to a safe state. If it does, the request is granted; otherwise, it is postponed until later.
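A sketch of the safety check behind this algorithm (commonly known as the banker's algorithm), under the assumption that each process declares its maximum need in advance. The function names `is_safe` and `grant` and the single-resource numbers are choices made for this illustration.

```python
def is_safe(available, allocation, maximum):
    """True if some order lets every process obtain its maximum and finish."""
    work = list(available)
    need = [[m - c for m, c in zip(max_row, alloc_row)]
            for max_row, alloc_row in zip(maximum, allocation)]
    pending = set(range(len(allocation)))
    while pending:
        runnable = [i for i in sorted(pending)
                    if all(n <= w for n, w in zip(need[i], work))]
        if not runnable:
            return False                  # nobody can be guaranteed to finish
        i = runnable[0]
        work = [w + c for w, c in zip(work, allocation[i])]   # i finishes, frees all
        pending.remove(i)
    return True

def grant(request, i, available, allocation, maximum):
    """Grant process i's request only if the resulting state is safe."""
    if any(r > a for r, a in zip(request, available)):
        return False                      # not enough free resources right now
    new_available = [a - r for a, r in zip(available, request)]
    new_allocation = [row[:] for row in allocation]
    new_allocation[i] = [c + r for c, r in zip(allocation[i], request)]
    return is_safe(new_available, new_allocation, maximum)

# One resource type with 10 units: processes hold 3, 2, 2 and may need up to 9, 4, 7.
available, allocation, maximum = [3], [[3], [2], [2]], [[9], [4], [7]]
```

With these numbers the current state is safe. Granting one more unit to the second process keeps it safe, while granting one more to the first process leaves a state from which completion can no longer be guaranteed, so that request would be postponed.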
Condition | Approach |
---|---|
Mutual exclusion | Spool everything |
Hold and wait | Request all resources initially |
No preemption | Take resources away |
Circular wait | Order resources numerically |
two-phase locking
Communication deadlocks: unlike resource deadlocks, these are caused by communication errors.
Livelock: polling (busy-waiting) processes use up their CPU quantum without making progress but also without blocking.
Starvation: some policy is needed to decide who gets which resource when. This policy may lead to some processes never getting service even though they are not deadlocked.
Starvation can be avoided by using a first-come, first-served, resource allocation policy.
Signals (sent by the `kill` command, for example) are used for inter-process communication.
`fork` will create an exact duplicate of the original process (file descriptors, registers, and everything else), and applies the copy-on-write mechanism.
`exec` will cause its entire core image to be replaced by the file named in its 1st parameter.
`wait` is used to collect information and clean up the zombie process left by the terminated child process.
`pause` tells Linux to suspend the process until a signal arrives.
Kernel threads are supported by Linux using the `clone` system call.
When a thread was created, the original thread and the new one shared everything but their registers.
3 classes of threads for scheduling purposes:
`boot` reads in the OS kernel and jumps to it.
`init` (process 1), and page daemon (process 2).
`init` execs `/etc/rc/*`, and finally opens the tty and prints `login:`
`brk`).
The text segment is read-only. Self-modifying programs went out of style in the 1950s because they were too difficult to understand and debug.
The existence of uninitialized data is actually just an optimization to make binary programs smaller.
File systems under the VFS include:
Network socket(with Protocol drivers and Network device driver)
History of Linux FS:
Linux allows directory and file locking (by byte range) with semaphores, including shared locks and exclusive locks.
4 main structures of VFS: