DV1460 / DV1492:

Realtime- (and) Operating-Systems

08:15-10:00 Thursday September 1st, 2016

An introduction to operating systems.

Chapter 1.4 - 1.12 (pg 35-81)

Table of Contents
History
Hardware Review
Concepts
System Calls

1. History of Computing

  • The history of computing is driven by advances in fabrication technology.
  • Vacuum tubes (light bulbs): originally amplifiers, improved to switches.
  • When a digital signal can switch another current on/off: computation.
  • Transistor was invented in 1947: a much smaller switch.
  • In 1965 multiple transistors were integrated into a single chip.
  • MSI / LSI / VLSI reduced feature size, improved density, lowered cost.
  • Mid-90s System-On-a-Chip: mixed digital/analog components.
  • Reduction of feature size is stalling: "end of Moore's law".
  • Further improvements: stacked chip 3D designs, nano-tubes etc.

2. History of Computing

  • Advances in fabrication enable changes in system design.
  • Originally these were considered "generations" of computing technology.
  • Now we think of them as "classes" of devices that are enabled.

3. The sprawling OS "Zoo"

  • OS development has stretched more than 50 years.
    • Some aspects are common across many device classes.
    • Some aspects are specialised to particular classes.
  • Each class of device requires an OS: lots of differences.
  • Mainframes: I/O dominated workload, reliability, expensive.
  • Server: services over networks, scalability, independent jobs.
  • Multiprocessor: compute-heavy scientific computing, many cores.
  • PC: commodity hardware, wide variety, low cost.
  • Mobile: power-constrained, efficiency, battery-life.
  • Embedded: custom hardware, slow processors, cheap.
  • Sensor networks: many small machines working collaboratively.
  • Realtime: hard (industrial, robotics), soft (phones).

4. Ontogeny Recapitulates Phylogeny

Basic Idea
Patterns of development repeat in each class of device.
  • e.g. generations 1, 2, 3: single program, then batch, then time-sharing with memory protection.
  • Same progression repeats for devices in each new class.
  • Consequence of economics vs engineering (old techniques at new scales).
  • OS design covers both currently used techniques and some that are "resting".
  • Now a quick review of modern hardware that an OS manages.

5. Hardware: Instruction sets

  • Each CPU has an instruction set:
    • A set of logical opcodes and operands.
    • Each performs a specific operation.
    • Single binary encoding.
  • Working in a compiled language:
    • Portability is not too difficult.
    • Use the constructs the language supports.
    • Compiler translates them into instructions.
  • To port to a new processor (e.g. ARM):
    • Just use a new compiler (hopefully).

6. Hardware: Language support

  • Some operations are common to many CPUs.
    • Arithmetic (expressions).
    • Logic + PSW (conditions).
    • Memory access (variables, pointers etc).
    • Stack (recursive procedures).
  • The common operations are accessible from C.
  • Some operations cannot be used from C:
    • Saving / resuming state (partly).
    • Privileged mode.
    • Hardware I/O.
    • Changing memory hierarchy.
    • Interrupts.

7. Hardware: Processor review

  • Basic operation of a processor is simple:
    • PC: address of next instruction.
    • Fetch - load bytes from memory.
    • Decode - which instruction / data?
    • Execute - activate circuitry.
  • Building a functional processor is easy; making it fast is hard.
  • More parallelism is easy: economics.
  • Communicating faster is hard: physics.
  • Pipelining is an optimisation:
    • Fetch / Decode / Execute in parallel.
    • Prefetch bytes - no waiting.
    • "state" more complex: where is PC?

8. Hardware: Further optimisations

  • Modern processors use superscalar despatch.
    • Multiple pipelines filled in parallel.
    • Allows multiple instructions per cycle.
  • Processor state is now more complex.
    • Including "what is the next instruction?"
  • Some instructions cannot be executed yet.
    • Depend on results of other instructions.
    • Maybe not all the EUs are occupied.
    • Avoid these "bubbles" for performance.
  • Hyperthreading fills EUs from another program (requires OS support).
  • Interrupts require "rewinding" of state.

9. Hardware: Memory

  • The storage system holds data for the processor to work upon.
  • Ideally it would be large, fast, and cheap.
  • Current technology provides only one or two of these properties at once.
  • Normal behaviour for a program is not "random" access:
    • Often reuse recent data.
    • Pick "nearby" data next.
  • Take advantage of this with the memory hierarchy.
  • Fast, small, expensive technology at the top.
  • Slow, large, cheap technology at the bottom.
  • When data is read (upwards) a copy is stored - faster next time.
  • Copies in cache layers need to be written back downwards (persistence).

10. Hardware: Mo' Memory Mo' Problems

  • Multiple layers of caches within the storage hierarchy.
  • Managed by a mixture of hardware, program control and the OS.
  • Only the bottom layer is persistent.
  • Volatile storage is lost without power.
  • Consequence: programs need to explicitly move data to/from storage.
  • Data in the top layer is lost when we switch programs.
  • Consequence: OS needs to save/reload it ("context switch").

11. Hardware: Multi-processors

  • Progress in making processors faster has slowed down (stopped?).
  • Increases in performance need to come from parallelism.
  • Different approaches may be combined, the OS needs to be aware of them:
  • Multiple processors (packages / chips):
    • External bus, caches must be consistent, memory must be coherent.
  • Multiple cores within a single package:
    • Internal bus, caches, probably a single memory controller.
  • Multithreading (hyperthreading):
    • Keeping execution units full, scheduler needs to be aware.

12. Hardware: Storage

  • Mechanical devices are slow: but magnetism provides dense storage.
  • Many generations of improvement: highly engineered machines.
  • OS needs to be aware of geometry: surfaces, cylinders, tracks.
  • Performance is maximised by choosing access patterns.
  • Near end-of-life: replacement is flash.
  • Faster, no moving parts.
  • Less mature: higher cost per GB.
  • OS hides actual devices behind the file-system: easy replacement for users.

13. Hardware: I/O

  • All other hardware: consider as generic I/O.
  • Device has a controller: talks to the OS.
  • OS has a separate driver for each device.
  • Epic slowness relative to CPU.
  • OS should do useful work waiting on responses.
  • Interrupt is a signal from device to CPU.
  • CPU stops what it is currently doing.
  • Handles the interaction with device.
  • Returns to normal work.
  • DMA decreases the CPU involvement further.

14. Hardware: Buses

  • Typical PC architecture: several buses that devices are connected on.
  • Bus: shared communication channel, time-multiplexed by different devices.
  • Several "buses" (e.g. PCIe) are actually fully switched, name is legacy.
  • Basic constraints: longer wires are slower (lower frequency).
  • Parallel wires increase bandwidth; latency is fixed by physics.

Break (15mins)


15. Concepts: Address Spaces

  • Now we look at some central concepts used in operating systems.
  • Each is an abstraction to simplify the programming model.
Address Space
The mapping applied to memory accesses.
  • Addresses are explicit inside programs: encoded inside instructions.
  • The overall range is determined by the processor model, e.g. \(0-2^{32}\).
  • There is no way for programs to negotiate how to divide this range between themselves.
  • OS needs a mapping in hardware.

16. Concepts: Data Storage

  • Programs manipulate volatile data inside their address space.
  • Need to choose which data persists, how to organise it for later reuse.
File
Persistent data storage with explicit control of read/write.
File System
An organisation of files so that they can be chosen for use.
  • Interface: open / close (choose data).
  • Interface: read / write (variable length).
  • Interface: seek (random access).
  • Conventionally a tree of named files.
  • OS maps this onto a device that can read/write fixed-size blocks.

17. Concepts: Process

  • OS must run multiple programs, multiple copies of the same program.
  • Separate the "program" (i.e code) from "a copy of the program in memory".
Process
An executing copy of a program: unique address-space and resources
  • Example: different processes executing same program.
  • Separate (overlapping) address spaces.
  • Unique resources (open files).
  • Memory: implicit / private.
  • Storage: explicit / public.
  • Interface: create, fork, terminate.

18. Concepts: Privilege

  • Simple system: only one program is running, full access to the machine.
  • Programmers are given the process abstraction to use multi-processing in the OS.
  • Appears that each process executes on an isolated machine.
  • Isolation prevents unauthorised access, denial of resources, injection of faults.
  • How does the OS enforce these properties?
  • Processor needs privileged instructions: alter processes (e.g. remap addresses).
  • Prevent access by regular programs.

19. Concept: Rings

  • Idea: Concentric rings - inner rings have lower numbers and more privileges.
  • A ring is a protection mode on the processor.
    • Defines which instructions are allowed.
  • Programs run in ring 3: user mode.
    • Using privileged instructions causes an error.
    • These exceptions stop execution.
    • Cannot change address space mapping.
    • Illegal access outside address space.
  • The OS runs in ring 0: kernel mode.
    • Hardware access, memory control, changing mode.
  • Negative rings are for virtualisation.

20. Concept: System calls

  • We cannot allow user-mode processes to change mode.
    • Could simply change to kernel-mode and do anything.
  • So how can a process make a request to the kernel?
  • Basic idea: let the program crash (a little bit).
Software interrupt (trap)
Instruction that causes a specific exception (error handled by the OS).
  • Processor switches into kernel-mode to handle the error.
    • Jumps to a specific point in the OS code - the trap handler.
    • The user-mode process is not in control of jumping into the kernel.
  • Trap handler can look at the registers (select a function).
    • Process the request.
    • Change mode to user-mode and resume the process.

21. System Call example (read a file)

  1-4. Program calls read().
  5. Syscalls have an id number.
  6. Exception (freezes program).
  7. Processor switches to ring 0, calls the exception handler in the kernel.
  8. Kernel looks up the syscall id.
  9. Jumps to the syscall handler.
  10. On return, switches back to ring 3 and jumps to the next instruction in the program.

  Cost: 200-15000 clock cycles.

21. System Calls

  • The performance of system calls is highly variable.
    • How much context (i.e. register values) in the program did the kernel need to save and restore?
    • How much data did the kernel access - what was the effect on the cache?
    • How much actual code did the kernel need to execute?
  • In OS design we assume the worst case, use them sparingly.
Design Principle
Assume syscalls are expensive: minimize the number that a process must make.
  • The minimum set of functions that a process needs to ask the OS to perform:
    • Tends to be common across UNIX / Windows.
    • Operations that would break protection if the process could do them.

22. System Calls

  • Each sub-system needs some core functionality.
  • Specific white-listing of allowed access.
    • Memory.
    • File-system.
    • Process management.
    • Specific hardware facilities.
  • Looks a lot like the C standard library.
  • POSIX defines the calls.
    • Evolved along with C.
  • Most languages then provide a similar runtime.
    • Call down to the C standard library.

23. Language

  • The C language was invented to implement the UNIX system.
    • Tied to low-level system programming.
    • More portable than the alternative: raw assembly.
    • Porting C programs between architectures is still quite painful.
  • Primitive data-types: integers, floating point numbers, characters.
    • These are the "most efficient" sizes for the target machine.
    • All variables / data have concrete representation in memory.
  • Mostly a subset of C++.
    • No classes - structure through procedures / modules.
    • No polymorphism or inheritance: performance more predictable.
  • Implementing OSs in high-level languages is possible.

24. Structure in C Programs

  • Programs in C are made of translation units.
    • One .c file, includes text from .h files.
  • The inclusion of files and expansion of macros is handled by the preprocessor.
  • This takes each unit, loads in all included files and outputs a single raw C intermediate file.
  • Each unit is compiled separately into an object file.
    • This describes the x86 instructions, data and dangling references to other units.
  • The linker combines these separate objects, and any statically linked libraries into an output binary (normally ELF).

25. Structure: Choices

  • A monolith is an ancient mysterious artifact.
    • Monolithic kernel: a single large program.
    • Everything is compiled in together.
    • Syscalls for access from user-mode.
  • Microkernels are modular.
    • Offloads services to user-mode servers.
    • Message-passing to communicate.
    • Less privileged code: should be more robust.
    • Drivers are more isolated: should contain crashes.
  • There is more overhead in microkernel communication.

26. Structure: Debate

  • The Tanenbaum-Torvalds debate (1992): microkernels vs monolithic kernels.
  • 25 years later: GNU/Hurd is still not ready; GNU/Linux (monolithic) is quite popular.
  • For a long time it looked like monolithic kernels were winning the performance debate.
  • OS-X runs on Mach (microkernel), performance is debatable.
  • Robustness vs Performance: still unclear.
  • Windows is mostly monolithic (with some services pushed to user-mode).
  • Modern trends: pushing drivers to user-mode for robustness.

27. Structure: Further Layers

  • Stronger protection: virtual machines.
  • Isolate OSs from hardware / one another.
  • Hypervisor supervises multiple OSs.
  • Basic idea: "An OS for OSs".
  • Operates on a lower protection ring.
  • Traps / simulates privileged instructions.
  • Virtual machine believes it has hardware access.
  • Type-1 (KVM, virtual servers, cloud etc).
  • Type-2 (Virtualbox, app inside OS).
  • Type-2 can simulate machine (slow) or break protection on the host (faster).

28. Summary

  • Modern hardware is quite complex.
    • Flat uniform memory is quite slow.
    • Many optimisations in the architecture.
  • Some of the OS core functionality needs specific hardware support.
    • Memory protection manager.
    • Privileged instruction modes.
    • Persistent data storage.
  • But this complexity is hidden behind simple abstractions.
    • Processes, Address spaces, Files, File-system.
  • Next week:
    • The process model and scheduling of programs.