Read CSAPP (1)

A Tour of Computer Systems

A computer system consists of hardware and systems software that work together to run application programs.

Information is Bits + Context

All information in a system is represented as a bunch of bits.

The only thing that distinguishes different data objects is the context in which we view them.

Programs Are Translated by Other Programs into Different Forms

The compilation system

The programs that perform the four phases are known collectively as the compilation system; a concrete gcc walk-through of these phases follows the list below.

  • Preprocessing phase: the preprocessor (cpp) modifies the original C program according to directives that begin with the # character. The result is another C program, typically with the .i suffix.

  • Compilation phase: the compiler (cc1) translates the text file hello.i into the text file hello.s, which contains an assembly-language program.

  • Assembly phase: the assembler (as) translates hello.s into machine-language instructions, packages them in a form known as a relocatable object program, and stores the result in the object file hello.o.

  • Linking phase: the linker (ld) merges hello.o with precompiled object files from the C standard library, such as the one containing printf, producing the executable object file hello, which is ready to be loaded into memory and run.
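
As a concrete sketch, assuming gcc and the book's hello.c example, the intermediate file of each phase can be produced by hand with standard gcc flags. The commands are listed as comments above the program:

```c
/* hello.c -- the running example used throughout the chapter.
 *
 * The individual phases can be observed with standard gcc flags
 * (the intermediate file names follow the book's example):
 *
 *   gcc -E hello.c -o hello.i   # preprocessing: expand #include and macros
 *   gcc -S hello.i -o hello.s   # compilation: C -> assembly
 *   gcc -c hello.s -o hello.o   # assembly: assembly -> relocatable object file
 *   gcc hello.o -o hello        # linking: merge with library code (e.g. printf)
 */
#include <stdio.h>

int main(void)
{
    printf("hello, world\n");
    return 0;
}
```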

It Pays to Understand How Compilation Systems Work

  • Optimizing program performance: in order to make good coding decisions in our C programs, we need a basic understanding of machine-level code and how the compiler translates different C statements into machine code.

  • Understanding link-time errors: some of the most perplexing programming errors are related to the operation of the linker.

  • Avoiding security holes: a first step in learning secure programming is to understand the consequences of the way data and control information are stored on the program stack.

Processors Read and Interpret Instructions Stored in Memory

Hardware Organization of a System

Hardware Organization of a typical system

  • Buses: running throughout the system is a collection of electrical conduits called buses that carry bytes of information back and forth between the components. Buses transfer fixed-size chunks of bytes known as words. The word size varies across systems. We assume buses transfer a word at a time.

  • I/O Devices: are the system's connection to the external world. Each I/O device is connected to the I/O bus by either a controller or an adapter. Controllers are chip sets in the device itself or on the system's main printed circuit board. An adapter is a card that plugs into a slot on the motherboard.

  • Main Memory: a temporary storage device that holds both a program and the data it manipulates while the processor is executing the program. Logically, memory is organized as a linear array of bytes, each with its own unique address starting at zero.

  • Processor: the engine that interprets instructions stored in main memory. At its core is a word-sized register called the program counter (PC). A processor repeatedly executes the instruction pointed at by the PC and updates the PC to point to the next instruction. Simple operations revolve around main memory, the register file, and the arithmetic/logic unit (ALU). The ALU computes new data and address values. Some simple CPU operations (a toy sketch of this fetch-execute loop follows the list):

    • Load: Copy a byte or a word from main memory into a register
    • Store: Copy a byte or a word from a register to a location in main memory
    • Operate: Copy the contents of two registers to the ALU, perform an arithmetic operation and store the result in a register
    • Jump: Extract a word from the instruction itself and copy that word into the PC.
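
The fetch-execute model behind these operations can be sketched as a tiny interpreter. Everything below (the instruction encoding, register count, and memory size) is invented purely for illustration and does not correspond to any real instruction set:

```c
#include <stdio.h>

/* A toy machine: 4 word-sized registers, a small word-addressed memory,
 * and a program counter. The encoding is hypothetical. */
enum { LOAD, STORE, ADD, JUMP, HALT };

typedef struct { int op, a, b, c; } Instr;

int main(void)
{
    int reg[4] = {0};            /* register file */
    int mem[16] = {5, 7};        /* main memory: mem[0]=5, mem[1]=7 */
    Instr prog[] = {
        {LOAD,  0, 0, 0},        /* Load:    reg0 <- mem[0]               */
        {LOAD,  1, 1, 0},        /* Load:    reg1 <- mem[1]               */
        {ADD,   2, 0, 1},        /* Operate: reg2 <- reg0 + reg1 (ALU)    */
        {STORE, 2, 2, 0},        /* Store:   mem[2] <- reg2               */
        {JUMP,  5, 0, 0},        /* Jump:    copy the word 5 into the PC  */
        {HALT,  0, 0, 0},
    };

    int pc = 0;                  /* program counter */
    for (;;) {
        Instr i = prog[pc++];    /* fetch, then point PC at the next instruction */
        if (i.op == LOAD)       reg[i.a] = mem[i.b];
        else if (i.op == STORE) mem[i.b] = reg[i.a];
        else if (i.op == ADD)   reg[i.a] = reg[i.b] + reg[i.c];
        else if (i.op == JUMP)  pc = i.a;
        else break;              /* HALT */
    }
    printf("mem[2] = %d\n", mem[2]);   /* prints 12 */
    return 0;
}
```
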

Running the hello program

The shell loads the executable hello file by executing a sequence of instructions that copies the code and data in the hello object file from disk to main memory.

Reading the hello command from the keyboard

Using DMA (direct memory access), the data travels directly from disk to main memory, without passing through the processor.

Loading the executable from disk into main memory

Once the code and data in the hello object file are loaded into memory, the processor begins executing the machine-language instructions in the hello program's main routine. These instructions copy the bytes in the "hello world\n" string from memory to the register file, and from there to the display device.

Writing the output string from memory to the display

Caches Matter

To deal with the processor-memory gap, system designers include smaller faster storage devices called cache memories that serve as temporary staging areas for information that the processor is likely to need in the near future.

cache memory

Systems can get the effect of both a very large memory and a very fast one by exploiting locality, the tendency of programs to access data and code in localized regions.

By setting up caches to hold data that is likely to be accessed often, we can perform most memory operations using the fast caches.
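
As a small illustration of locality, the two loops below compute the same sum over the same array, but the row-wise version walks memory sequentially and therefore makes much better use of the caches; the actual speedup depends on the machine's cache sizes and is not specified by the book:

```c
#include <stdio.h>

#define N 2048
static double a[N][N];           /* C stores 2D arrays in row-major order */

/* Good spatial locality: touches memory sequentially, so most accesses
 * hit in the cache once the containing block has been loaded. */
double sum_rowwise(void)
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            s += a[i][j];
    return s;
}

/* Poor spatial locality: strides through memory N doubles at a time,
 * so on large arrays nearly every access can miss in the cache. */
double sum_colwise(void)
{
    double s = 0.0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            s += a[i][j];
    return s;
}

int main(void)
{
    printf("%f %f\n", sum_rowwise(), sum_colwise());
    return 0;
}
```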

The Operating System Manages the Hardware

The operating system is a layer of software interposed between the application program and the hardware.

The operating system has two primary purposes:

  1. to protect the hardware from misuse by runaway applications
  2. to provide applications with simple and uniform mechanisms for manipulating complicated and often wildly different low-level hardware devices.

Processes

A process is the operating system's abstraction for a running program.

The operating system keeps track of all the state information that the process needs in order to run. This state, which is known as the context, includes information such as the current values of the PC, the register file, and the contents of main memory.

The transition from one process to another is managed by the operating system kernel. The kernel is the portion of the operating system code that is always resident in memory.

The kernel is not a separate process. It is a collection of code and data structures that the system uses to manage all the processes.
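
A minimal sketch of the process abstraction, assuming a POSIX system: fork creates a second process with its own context (PC, registers, memory), and the kernel decides when each one runs.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();          /* create a second process */
    if (pid < 0) {
        perror("fork");
        exit(1);
    }
    if (pid == 0) {
        /* Child: runs in its own context, scheduled by the kernel. */
        printf("child:  pid=%d\n", (int)getpid());
        exit(0);
    }
    /* Parent: the kernel interleaves the two processes on the CPU. */
    printf("parent: pid=%d, child=%d\n", (int)getpid(), (int)pid);
    waitpid(pid, NULL, 0);       /* wait for the child to terminate */
    return 0;
}
```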

Threads

A process can actually consist of multiple execution units, called threads, each running in the context of the process and sharing the same code and global data.
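
A small sketch assuming POSIX threads (compile with -pthread): both threads below run inside one process and update the same global variable, which is exactly the "shared code and global data" point above.

```c
#include <pthread.h>
#include <stdio.h>

int shared = 0;                  /* global data shared by all threads */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   /* both threads touch the same variable */
        shared++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t1, t2;
    pthread_create(&t1, NULL, worker, NULL);
    pthread_create(&t2, NULL, worker, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    printf("shared = %d\n", shared);  /* 200000: both threads saw the same data */
    return 0;
}
```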

Virtual Memory

Virtual memory is an abstraction that provides each process with the illusion that it has exclusive use of the main memory.

Process virtual address space

In Linux, the topmost region of the address space is reserved for code and data in the operating system, while the lower region holds the code and data defined by the user's process. Addresses in the figure increase from the bottom to the top (a small program that prints an address from each region follows the list).

  • Program code and data: Code begins at the same fixed address for all processes, followed by data locations that correspond to global C variables. The code and data areas are initialized directly from the contents of an executable object.

  • Heap: The code and data areas are followed immediately by the run-time heap. The heap expands and contracts dynamically at run time as a result of calls to C standard library routines such as malloc and free.

  • Shared libraries: Near the middle of the address space is a region that holds the code and data for shared libraries such as the C standard library and the math library.

  • Stack: At the top of the user's virtual address space is the user stack that the compiler uses to implement function calls. Each time we call a function, the stack grows; each time we return from a function, it contracts.

  • Kernel virtual memory: The top region of the address space is reserved for the kernel.
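
A hedged way to see these regions from a running process is to print the address of a code symbol, a global variable, a heap allocation, and a stack variable. The exact addresses vary from run to run (modern Linux randomizes them), but they fall in the regions described above:

```c
#include <stdio.h>
#include <stdlib.h>

int global = 1;                  /* data region (global C variables) */

int main(void)
{
    int local = 2;                      /* lives on the user stack    */
    int *heap = malloc(sizeof *heap);   /* lives on the run-time heap */

    printf("code  : %p\n", (void *)main);
    printf("data  : %p\n", (void *)&global);
    printf("heap  : %p\n", (void *)heap);
    printf("stack : %p\n", (void *)&local);

    free(heap);
    return 0;
}
```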

Files

A file is a sequence of bytes.
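
A minimal Unix I/O sketch: every file is manipulated through the same small set of system calls (open, read, write, close), no matter what device is behind it. The file name hello.c is just an example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    int fd = open("hello.c", O_RDONLY);   /* any readable file works here */
    if (fd < 0) {
        perror("open");
        return 1;
    }
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0)   /* read a sequence of bytes   */
        write(STDOUT_FILENO, buf, (size_t)n);     /* ...and copy them to stdout */
    close(fd);
    return 0;
}
```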

Systems Communicate with Other Systems

The network can be viewed as just another I/O device.

A network is another I/O device
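
A sketch of that idea, assuming the POSIX sockets API and a hypothetical server listening at 127.0.0.1:8000: once connected, the socket is just a file descriptor that is read and written like any other file.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* Connect to a hypothetical server at 127.0.0.1:8000. */
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(8000);
    inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        perror("connect");
        return 1;
    }

    /* From here on, the socket behaves like a file: plain read/write. */
    const char *msg = "hello over the network\n";
    write(fd, msg, strlen(msg));

    char buf[256];
    ssize_t n = read(fd, buf, sizeof buf - 1);
    if (n > 0) {
        buf[n] = '\0';
        printf("server said: %s", buf);
    }
    close(fd);
    return 0;
}
```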

Important Themes

A system is more than just hardware. It is a collection of intertwined hardware and systems software that must cooperate in order to achieve the ultimate goal of running application programs.

Amdahl's Law

The main idea is that when we speed up one part of a system, the effect on the overall system performance depends on both how significant this part was and how much it sped up.

Consider a system in which executing some application requires time Told. Suppose some part of the system requires a fraction α of this time, and that we improve its performance by a factor of k. That is, the component originally required time αTold, and it now requires time αTold/k. The overall execution time would thus be:

Tnew = (1 − α)Told + αTold/k = Told[(1 − α) + α/k]

From this, we can compute the speedup S = Told/Tnew as:

S = 1 / [(1 − α) + α/k]

To significantly speed up the entire system, we must improve the speed of a very large fraction of the overall system.

If we are able to take some part of the system and speed it up to the point where it takes a negligible amount of time, we get:

S∞ = 1 / (1 − α)
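
A quick numeric check of the formula (the numbers are made up for illustration): if a part accounting for 60% of the run time (α = 0.6) is sped up by k = 3, the overall speedup is only 1/(0.4 + 0.6/3) ≈ 1.67, and even k → ∞ caps out at 1/(1 − 0.6) = 2.5.

```c
#include <stdio.h>

/* Amdahl's Law: overall speedup when a fraction alpha of the run time
 * is accelerated by a factor of k. */
double speedup(double alpha, double k)
{
    return 1.0 / ((1.0 - alpha) + alpha / k);
}

int main(void)
{
    double alpha = 0.6;                       /* made-up example fraction */
    printf("k = 3   : %.2fx\n", speedup(alpha, 3.0));     /* 1.67x */
    printf("k = 10  : %.2fx\n", speedup(alpha, 10.0));    /* 2.17x */
    printf("k -> inf: %.2fx\n", 1.0 / (1.0 - alpha));     /* 2.50x */
    return 0;
}
```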

Concurrency and Parallelism

We use the term concurrency to refer to the general concept of a system with multiple, simultaneous activities, and the term parallelism to refer to the use of concurrency to make a system run faster. Parallelism can be exploited at multiple levels of abstraction in a computer system.

  • Thread-Level Concurrency

Multi-core processors have several CPUs (referred to as cores) integrated onto a single integrated-circuit chip. The cores share higher levels of cache as well as the interface to main memory.

Hyperthreading, sometimes called simultaneous multi-threading, is a technique that allows a single CPU to execute multiple flows of control.

The use of multiprocessing can improve system performance in two ways:
  1. It reduces the need to simulate concurrency when performing multiple tasks.
  2. It can run a single application program faster, but only if that program is expressed in terms of multiple threads that can effectively execute in parallel.

  • Instruction-Level Parallelism

Instruction-level parallelism allows modern processors to execute multiple instructions at one time. Processors that can sustain execution rates faster than one instruction per cycle are known as superscalar processors.

  • Single-Instruction, Multiple-Data Parallelism

Many modern processors have special hardware that allows a single instruction to cause multiple operations to be performed in parallel, a mode known as single-instruction, multiple-data (SIMD) parallelism. These SIMD instructions are provided mostly to speed up applications that process image, sound, and video data.
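
A hedged sketch: with vectorization enabled (for example gcc -O3), a compiler can typically turn the simple loop below into SIMD instructions that process several floats per instruction; whether it actually does depends on the compiler and the target architecture.

```c
#include <stdio.h>

#define N 1024

/* A simple element-wise add. Loops of this shape are the classic
 * candidate for auto-vectorization into SIMD instructions. */
void vec_add(const float *a, const float *b, float *c, int n)
{
    for (int i = 0; i < n; i++)
        c[i] = a[i] + b[i];
}

int main(void)
{
    static float a[N], b[N], c[N];
    for (int i = 0; i < N; i++) {
        a[i] = (float)i;
        b[i] = 2.0f * (float)i;
    }
    vec_add(a, b, c, N);
    printf("c[10] = %.1f\n", c[10]);   /* 30.0 */
    return 0;
}
```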

The Importance of Abstractions in Computer Systems

The use of abstractions is one of the most important concepts in computer science. On the processor side, the instruction set architecture provides an abstraction of the actual processor hardware: by keeping the same execution model, different processor implementations can execute the same machine code while offering a range of cost and performance. On the operating system side, we have introduced three abstractions: files as an abstraction of I/O devices, virtual memory as an abstraction of program memory, and processes as an abstraction of a running program. The virtual machine provides an abstraction of the entire computer, including the operating system, the processor, and the programs.

Summary

A computer system consists of hardware and systems software that cooperate to run application programs. Information inside the computer is represented as groups of bits that are interpreted in different ways, depending on the context. Programs are translated by other programs into different forms, beginning as ASCII text and then translated by compilers and linkers into binary executable files.

Processors read and interpret binary instructions that are stored in main memory. Since computers spend most of their time copying data between memory, I/O devices, and the CPU registers, the storage devices in a system are arranged in a hierarchy, with the CPU registers at the top, followed by multiple levels of hardware cache memories, DRAM main memory, and disk storage.

The operating system kernel serves as an intermediary between the application and the hardware.

Finally, networks provide ways for computer systems to communicate with one another. From the viewpoint of a particular system, the network is just another I/O device.