A computer system consists of hardware and systems software that work together to run application programs.
We begin our study of systems by tracing the lifetime of the hello program, from the time it is created by a programmer, until it runs on a system, prints its simple message, and terminates.
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
}
The hello.c program is stored in a file as a sequence of bytes. Each byte holds the ASCII code of one character of the source text, so the file is simply the ASCII representation of the program.
Files such as hello.c that consist exclusively of ASCII characters are known as text files. All other files are known as binary files.
The representation of hello.c illustrates a fundamental idea: all information in a system--including disk files, programs stored in memory, user data stored in memory, and data transferred across a network--is represented as a bunch of bits. The only thing that distinguishes different data objects is the context in which we view them. For example, the same sequence of bytes might represent an integer, a floating-point number, a character string, or a machine instruction.
In order to run hello.c on the system, the individual C statements must be translated by other programs into a sequence of low-level machine-language instructions. These instructions are then packaged in a form called an executable object program and stored as a binary disk file. Object programs are also referred to as executable object files.
On a Unix system, we compile hello.c with gcc and then run the resulting executable:
lgtdeMacBook-Pro:~ lgt$ gcc -o hello hello.c
lgtdeMacBook-Pro:~ lgt$ ls hello*
hello hello.c
lgtdeMacBook-Pro:~ lgt$ ./hello
hello, world
The gcc compiler driver performs this translation in four phases:
Preprocessing phase: The preprocessor (cpp) modifies the original C program according to directives that begin with the # character. For example, the #include <stdio.h> directive in line 1 of hello.c tells the preprocessor to read the contents of the system header file stdio.h and insert it directly into the program text. The result is another C program, typically with the .i suffix.
Compilation phase: The compiler (cc1) translates the text file hello.i into the text file hello.s, which contains an assembly-language program. Each statement in an assembly-language program describes exactly one low-level machine-language instruction in a standard text form. Assembly language is useful because it provides a common output language for different compilers for different high-level languages.
Assembly phase: The assembler (as) translates hello.s into machine-language instructions, packages them in a form known as a relocatable object program, and stores the result in the object file hello.o. The hello.o file is a binary file whose bytes encode machine-language instructions rather than characters.
Linking phase: The printf function resides in a separate precompiled object file called printf.o, which must somehow be merged with our hello.o program. The linker (ld) handles this merging. The result is the hello file, which is an executable object file (or simply executable) that is ready to be loaded into memory and executed by the system.
Understanding how the compilation system works pays off in three ways: optimizing program performance, understanding link-time errors, and avoiding security holes.
To understand what happens to our hello program when we run it, we need to understand the hardware organization of a typical system:
Buses
Running throughout the system is a collection of electrical conduits called buses that carry bytes of information back and forth between the components. Buses are typically designed to transfer fixed-sized chunks of bytes known as words. The number of bytes in a word (the word size) is a fundamental system parameter that varies across systems.
I/O Devices
Input/output (I/O) devices are the system's connection to the external world.
Each I/O device is connected to the I/O bus by either a controller or an adapter. Controllers are chip sets in the device itself or on the system's main printed circuit board (often called the motherboard). An adapter is a card that plugs into a slot on the motherboard. Regardless, the purpose of each is to transfer information back and forth between the I/O bus and an I/O device.
Main Memory
The main memory is a temporary storage device that holds both a program and the data it manipulates while the processor is executing the program. Physically, main memory consists of a collection of dynamic random access memory (DRAM) chips. Logically, memory is organized as a linear array of bytes, each with its own unique address (array index) starting at zero. In general, each of the machine instructions that constitute a program can consist of a variable number of bytes. The sizes of data items that correspond to C program variables vary according to type.
Processor
The central processing unit (CPU), or simply processor, is the engine that interprets (or executes) instructions stored in main memory. At its core is a word-sized storage device (or register) called the program counter (PC). At any point in time, the PC points at (contains the address of) some machine-language instruction in main memory.
From the time that power is applied to the system until the time that the power is shut off, a processor repeatedly executes the instruction pointed at by the program counter and updates the program counter to point to the next instruction. A processor appears to operate according to a very simple instruction execution model, defined by its instruction set architecture. In this model, instructions execute in strict sequence, and executing a single instruction involves performing a series of steps. The processor reads the instruction from the memory location pointed at by the program counter (PC), interprets the bits in the instruction, performs some simple operation dictated by the instruction, and then updates the PC to point to the next instruction, which may or may not be contiguous in memory to the instruction that was just executed.
There are only a few of these simple operations, and they revolve around main memory, the register file, and the arithmetic/logic unit (ALU). The register file is a small storage device that consists of a collection of word-sized registers, each with its own unique name. The ALU computes new data and address values. Here are some examples of the simple operations that the CPU might carry out at the request of an instruction:
Load: Copy a byte or a word from main memory into a register, overwriting the previous contents of the register.
Store: Copy a byte or a word from a register to a location in main memory, overwriting the previous contents of that location.
Operate: Copy the contents of two registers to the ALU, perform an arithmetic operation on the two words, and store the result in a register, overwriting the previous contents of that register.
Jump: Extract a word from the instruction itself and copy that word into the program counter (PC), overwriting the previous value of the PC.
When we type the characters "./hello" at the keyboard, the shell program reads each one into a register, and then stores it in memory.
When we press the Enter key, the shell loads the executable hello file by executing a sequence of instructions that copies the code and data in the hello object file from disk to main memory. The data include the string of characters "hello, world\n" that will eventually be printed out.
Once the code and data in the hello object file are loaded into memory, the processor begins executing the machine-language instructions in the hello program's main routine.
The processor is much faster than main memory, and main memory is much faster than disk, when it comes to moving information from one place to another.
To deal with the processor-memory gap, system designers include smaller, faster storage devices called cache memories (or simply caches) that serve as temporary staging areas for information that the processor is likely to need in the near future.
An L1 cache on the processor chip holds tens of thousands of bytes and can be accessed nearly as fast as the register file. A larger L2 cache with hundreds of thousands to millions of bytes is connected to the processor by a special bus. It might take 5 times longer for the processor to access the L2 cache than the L1 cache, but this is still faster than accessing the main memory. The L1 and L2 caches are implemented with a hardware technology known as static random access memory (SRAM).
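Memory hierarchy: the storage devices in a system are organized as a hierarchy, with smaller, faster devices near the top serving as caches for the larger, slower devices below them.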
Back to our hello example. When the shell loaded and ran the hello program, and when the hello program printed its message, neither program accessed the keyboard, display, disk, or main memory directly. Rather, they relied on the services provided by the operating system. We can think of the operating system as a layer of software interposed between the application program and the hardware. All attempts by an application program to manipulate the hardware must go through the operating system.
The operating system has two primary purposes: (1) to protect the hardware from misuse by runaway applications, and (2) to provide applications with simple and uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices.
The operating system achieves both goals through three fundamental abstractions: files, virtual memory, and processes. Files are abstractions for I/O devices, virtual memory is an abstraction for both the main memory and disk I/O devices, and processes are abstractions for the processor, main memory, and I/O devices.
A process is the operating system's abstraction for a running program.
The operating system keeps track of all the state information that the process needs in order to run. This state, which is known as the context, includes information such as the current values of the PC, the register file, and the contents of main memory. At any point in time, a uniprocessor system can only execute the code for a single process. When the operating system decides to transfer control from the current process to some new process, it performs a context switch by saving the context of the current process, restoring the context of the new process, and then passing control to the new process.
A process can consist of multiple execution units, called threads, each running in the context of the process and sharing the same code and global data.
Virtual memory is an abstraction that provides each process with the illusion that it has exclusive use of the main memory. Each process has the same uniform view of memory, which is known as its virtual address space.
A file is a sequence of bytes.
The network can be viewed as just another I/O device. With the advent of global networks such as the Internet, copying information from one machine to another has become one of the most important uses of computer systems.
Returning to our hello example, we could use the familiar telnet application to run hello on a remote machine.
We want computers to do more, and we want them to run faster. Both of these factors improve when the processor does more things at once. We use the term concurrency to refer to the general concept of a system with multiple, simultaneous activities, and the term parallelism to refer to the use of concurrency to make a system run faster. Parallelism can be exploited at multiple levels of abstraction in a computer system.
Thread-Level Concurrency
Multi-core processors and hyperthreading let a single system execute multiple threads of control at the same time, much as a system built from multiple processors does.
Multi-core processors have several CPUs integrated onto a single integrated-circuit chip.
Hyperthreading, sometimes called simultaneous multi-threading, is a technique that allows a single CPU to execute multiple flows of control. It involves having multiple copies of some of the CPU hardware, such as program counters and register files, while having only single copies of other parts of the hardware, such as the units that perform floating-point arithmetic.