# Realtime- (and) Operating-Systems

08:15-10:00 Tuesday September 27th, 2016

FS Overview and Implementation.

§ 4-4.4.3 pg 263-290

Overview
Free space
File structure and formats
File organisation
File / Directory Organisation

# 1. Introduction

• Computers generally do three things.
• We've already looked at Processing.
• Memory was the short-term overlap.
• Transient: lost when process exits.
• Long-term information storage.
• Data should be stored until further notice.
• User needs to be confident that data cannot be lost accidentally / unexpectedly.
• Data inside programs is explicit for the programmer, but implicit for the user...

# 2. Explicit storage of data

• Users require explicit control over data storage.
• Persistence: decouples lifetime of data from process.
• Allows user to control when it is written.
• How long it lasts before it is removed.
• Stored data should be available until explicitly deleted.
• Copying: guarantees duplication of data.
• Allows checkpointing (recovering intermediate states, e.g. thesis_v5.tex).
• Allows sharing and communication (e.g. taking data from word processor, sending it by email).
• Organisation: finding data at a later time.
• Expressing relationships between data (e.g. part of same project).

# 3. Definitions

File
A logical unit of information.
• Different meanings in different contexts.
• Each application can map data onto a collection of files.
• User can control which information is stored together.
Persistence
Remaining in existence until explicitly removed.
• Implies that stored data should not be lost when a process exits or the system reboots.
File System (FS)
The storage and management of collections of files.

# 4. Issues

• Some of the issues in design and implementation of a FS:
• How do we locate information?
• How do we provide confidentiality?
• How do we provide integrity?
• How do we manage free space?
• What structure is supported for data within files?
• What operations are supported on files?
• How does the logical organisation map onto disk operations?
• We look at the user interface to storage first.
• This leads into how we implement what the user sees.
• Which design choices arise.

# 5. File Structure

• We have different options for a "Logical unit of information".
• Different forms of structure for the contents within each file.
• Affects how much freedom each program has in choosing structure.
• Raw byte sequence: least structured, program has complete control.
• Raw abstraction of most systems; program implements everything.
• Records: sequence of fixed length pieces (array).
• Not used directly anymore, indirectly implemented by programs.
• Tree: highly structured, less flexible (in terms of what can be written).
• Seen on mainframes, OSX resource forks, the fabled Reiser4 etc.

# 6. File Structure

• There is a wide design-space of options for file structure.
• Files are used to transfer data between systems.
• Some systems have richer representation, some less so.
• How should files be exported for use on other systems?
• e.g. mounting HFS on non-mac systems and accessing resource forks.
• The trend over time has been towards least-common-denominator.
Observation
Providing file structure in the OS has largely been superseded by applications defining structure over a raw byte stream.
• Optimising the simplest case for data storage, adding more structure in file-formats, seems to work better overall.

# 7. File Types (Formats)

• Problem: When files are raw streams of bytes programs need to agree on the structure of data within.
• Solution: Define structure of each kind case-by-case; specify a file-format.
• Many common formats are textual: sequence of symbols.
• ASCII - 128 characters (the other 128 byte values depend on context).
• Each symbol is always a byte.
• 7-bit clean files are easily recognisable.
• UTF-8 (unicode) represents more symbols.
• Breaks the connection between a symbol and a byte.
• A code-point (basically a symbol) can be a sequence of bytes.
• Harder to process, but more useful.
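The break between symbols and bytes can be seen directly by encoding a few strings. This is a small illustrative sketch; the sample characters and the helper name `is_seven_bit_clean` are chosen here for illustration and are not from the lecture.

```python
# Compare the number of symbols (code points) with the number of stored bytes.
samples = ["A", "\u00e9", "\u20ac"]   # 'A', e-acute, euro sign

for symbol in samples:
    encoded = symbol.encode("utf-8")
    print(f"U+{ord(symbol):04X} is stored as {len(encoded)} byte(s): {encoded.hex()}")

text = "na\u00efve"                   # five code points
utf8_bytes = text.encode("utf-8")     # six bytes: the i-diaeresis needs two


def is_seven_bit_clean(data: bytes) -> bool:
    """True if every byte is below 128, i.e. the data is plain ASCII."""
    return all(byte < 0x80 for byte in data)


print(len(text), len(utf8_bytes), is_seven_bit_clean(utf8_bytes))
```

Indexing into the byte sequence no longer lands on symbol boundaries, which is the sense in which UTF-8 is harder to process.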

# 8. Text vs Binary

• ASCII (or UTF-8) provides a common interface.
• Many viewers, editors, processing tools.
• Least common denominator (tools are available everywhere).
• More specific structure can be layered on top.
• e.g. Tree data as XML, Records as CSV.
• Anything not encoded as UTF-8 or ASCII is lumped together as "binary".
• Each binary format is different - needs specific tools.
• More difficult to process (both for users and programs).
• Better optimised for specific tasks.
• Smaller file-sizes.
• Faster access.

# 9. Examples of binary formats

• Magic number: explicit constant to indicate file type.
• File formats are similar to network message specification.
• The nice clean lines don't really exist.
• The sequence of bytes needs to express divisions into different pieces.
• Fixed length parts can be implicit.
• Variable length parts need well-defined encoding of size (e.g. known position/size before field).
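As a rough sketch of these ideas, the toy format below has a fixed-length header (magic number, version, payload length) followed by a variable-length payload. The magic value `DEMO`, the field layout and the function names are assumptions made for this example; they do not describe any real file format.

```python
import struct

MAGIC = b"DEMO"                   # hypothetical magic number, not a real format
HEADER = struct.Struct("<4sHI")   # magic, version, payload length (little-endian)


def write_record(version: int, payload: bytes) -> bytes:
    """Fixed-length header followed by a variable-length payload."""
    return HEADER.pack(MAGIC, version, len(payload)) + payload


def read_record(data: bytes):
    """Check the magic number, then use the stored size to cut out the payload."""
    magic, version, length = HEADER.unpack_from(data, 0)
    if magic != MAGIC:
        raise ValueError("not a DEMO file")
    start = HEADER.size
    payload = data[start:start + length]
    if len(payload) != length:
        raise ValueError("truncated payload")
    return version, payload


blob = write_record(1, b"hello")
print(read_record(blob))
```

The header fields need no delimiters because their sizes are fixed; only the payload needs its size written down before it.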

# 10. Explicit Naming

• Method to choose between files: names as human-readable labels.
• Either strings in ASCII or UTF-8.
• Design choice: should names be case-sensitive?
• NTFS and UNIX generally choose yes - only equal strings name the same file (although the Win32 API on top of NTFS ignores case by default).
• The Mac chooses to be different - HFS+ is insensitive by default.
• Making it easier for people vs programs (programmers).
• Fixed-length names are easier to work with (but can get awkward).
• Variable-length names make the implementation more intricate.
• Names allow the selection of files.
• Need to be stored in containers for collections of data...

# 11. Flat Directory Structures

• Within a single-level directory system all names must be unique.
• e.g. naming scheme on a camera IMG00000.JPG, IMG00001.JPG ...
• Doesn't scale up well:
• Typical desktops / small networks store millions of files.
• Uniqueness becomes difficult.
• Same name often works in many contexts (e.g. readme.txt).
• Independent naming contexts.
• Remembering which files amongst millions relate to each other is hard.
• Users want to group files together.
• Allowing directories of files and also directories...
• ... solves both issues of uniqueness and grouping...

# 12. Hierarchical Directory Structure

• Grouping files together allows users to organise data.
• Tagging is slightly more powerful (database style queries on relations).
• Hierarchy is cheaper to implement.
• Directories can contain both files and directories.
• Arbitrary number of groups, arbitrary depth of nesting.
• Powerful enough to cover most use-cases.
• Can indicate privacy over entire group (directory or sub-tree).
• Good organisational tools for multi-user systems.
• Names must be unique within a single directory, not the entire FS.

# 13. Paths

• Path: How to navigate a tree.
• e.g. /usr/jim or /usr/lib/dict
• Component: each name in the path.
• Separator: illegal character for a component.
• The path is a sequence of directions.
• Each component is where to go next: name in directory.
• Most UNIX systems use / as separator.
• Windows uses \.

# 14. Paths

• Absolute path: starts with the separator.
• Instructions start at the root.
• e.g. /usr or "\Program Files".
• UNIX uses a single root: FS is one tree.
• Windows uses a root per drive (C:, D: etc), forest of trees.
• Locates a single file or directory.
• Relative path: starts with a component.
• Different targets, relative to a location.
• Current Working Directory (CWD) is stored in each process.
• Relative paths are instructions from the CWD.
• e.g. with CWD=/usr, lib/dict refers to /usr/lib/dict
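The difference between absolute and relative paths can be sketched at the string level with Python's `posixpath` module, which follows the UNIX conventions (single root, `/` as separator). The function name `resolve` is an assumption for this example.

```python
import posixpath


def resolve(cwd: str, path: str) -> str:
    """Turn a path into an absolute one, as a process would using its CWD."""
    if path.startswith("/"):
        # Absolute: the instructions start at the root, the CWD is ignored.
        return posixpath.normpath(path)
    # Relative: the instructions start at the current working directory.
    return posixpath.normpath(posixpath.join(cwd, path))


print(resolve("/usr", "lib/dict"))   # relative to the CWD
print(resolve("/usr", "/usr/jim"))   # already absolute
```

The same relative path names different targets when the CWD changes, which is why the CWD has to be stored per process.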

Intermission

# 15. Directories

• Special names: exist inside every directory.
• . is the directory itself, e.g. filename = ./filename
• .. is the parent directory i.e. .. is always up in the tree.
• .. of root is the root (a loop).
• Every directory is a namespace (dictionary).
• Names within each directory are unique keys.
• Values are locations of other directories or files.
• We will look at different schemes for writing locations.
• They are equivalent to pointers to disk blocks.
• Multiple pointers can refer to the same location.
• Allows different names (paths) for a file / directory.
• Directories and paths are a logical organisation.
• Using pointers separates it from the physical organisation.
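A minimal sketch of directories as dictionaries: each directory maps names to locations, and walking a path is a sequence of dictionary lookups. The integer ids stand in for disk block pointers, and the whole tree below is a made-up example.

```python
ROOT = 0

# Each directory is a dictionary from names to "locations".
directories = {
    0: {".": 0, "..": 0, "usr": 1},               # root: ".." loops back to itself
    1: {".": 1, "..": 0, "jim": 2, "lib": 3},
    2: {".": 2, "..": 1},
    3: {".": 3, "..": 1, "dict": 4, "words": 4},  # two names, one location
}


def lookup(path: str, cwd: int = ROOT) -> int:
    """Walk the path one component at a time and return the final location."""
    location = ROOT if path.startswith("/") else cwd
    for component in path.split("/"):
        if not component:        # skip empty pieces from a leading or doubled "/"
            continue
        location = directories[location][component]
    return location


print(lookup("/usr/lib/dict"), lookup("lib/words", cwd=1))
```

Both `dict` and `words` resolve to location 4, showing how several names (paths) can refer to one file without the file being stored twice.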

# 16. Typical Directory API

• To manipulate directories the system provides an API (normally POSIX).
• These operations are exposed in UNIX as shell commands.
• Deletion: only empty directories can be deleted.
• Multiple ways to handle deleting trees with links.
• The program must decide, by emptying the directory itself before deleting it.
• Open/Close/Read: allows a program to access the list of entries.
• Rename: move.
• This concludes the user view of the FS, move on to implementation.
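The directory operations listed above can be tried through Python's `os` module, which wraps the underlying system calls. This is only a sketch: the directory and file names are invented, and everything happens inside a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as base:
    project = os.path.join(base, "project")
    os.mkdir(project)                                  # create
    with open(os.path.join(project, "readme.txt"), "w") as handle:
        handle.write("hello\n")

    try:
        os.rmdir(project)                              # refused: not empty
        removed_non_empty = True
    except OSError:
        removed_non_empty = False

    with os.scandir(project) as entries:               # open / read / close
        names = sorted(entry.name for entry in entries)

    renamed = os.path.join(base, "archive")
    os.rename(project, renamed)                        # rename is a move

    os.remove(os.path.join(renamed, "readme.txt"))     # empty it first...
    os.rmdir(renamed)                                  # ...then deletion succeeds
    still_exists = os.path.exists(renamed)

print(removed_non_empty, names, still_exists)
```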

# 17. File-system Layout

• Program (user) view is the logical organisation of the data in the FS.
• FS Implementation is mainly concerned with the physical organisation.
• How to map the structure described onto the disk API.
Disk API
Normally a disk has a fixed number of fixed-size blocks (e.g. $$10^9$$ blocks of 4 KB) and is accessed by read(k) and write(k), where k is a block number.
• Typically the disk is divided into smaller logical pieces (partitions).
• Each partition (drive) contains an independent file-system.
• Isolate failures (generally mechanical devices).
• Each partition is accessed through an API similar to the raw disk.
• Read($$k$$) or write($$k$$) for $$k<n$$, where $$n$$ is the number of blocks in the partition.
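A sketch of this interface, assuming an in-memory list of blocks in place of real hardware. The class names `Disk` and `Partition` are illustrative; a partition is modelled simply as an offset and a length on the underlying disk.

```python
BLOCK_SIZE = 4096  # bytes per block, matching the 4 KB example


class Disk:
    """In-memory stand-in for a disk with a fixed number of fixed-size blocks."""

    def __init__(self, block_count: int) -> None:
        self.block_count = block_count
        self.blocks = [bytes(BLOCK_SIZE) for _ in range(block_count)]

    def read(self, k: int) -> bytes:
        self._check(k)
        return self.blocks[k]

    def write(self, k: int, data: bytes) -> None:
        self._check(k)
        if len(data) != BLOCK_SIZE:
            raise ValueError("must write exactly one block")
        self.blocks[k] = bytes(data)

    def _check(self, k: int) -> None:
        if not 0 <= k < self.block_count:
            raise IndexError(f"block {k} out of range")


class Partition:
    """A contiguous slice of a disk, exposing the same read/write interface."""

    def __init__(self, disk: Disk, first_block: int, block_count: int) -> None:
        self.disk = disk
        self.first_block = first_block
        self.block_count = block_count

    def read(self, k: int) -> bytes:
        self._check(k)
        return self.disk.read(self.first_block + k)

    def write(self, k: int, data: bytes) -> None:
        self._check(k)
        self.disk.write(self.first_block + k, data)

    def _check(self, k: int) -> None:
        if not 0 <= k < self.block_count:
            raise IndexError(f"block {k} outside partition")


disk = Disk(16)
second_half = Partition(disk, first_block=8, block_count=8)
second_half.write(0, b"x" * BLOCK_SIZE)   # lands in disk block 8
```

Because the partition checks its own bounds, a file system living in one partition cannot touch blocks that belong to another.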

# 18. File-system Layout

• Boot-strap problem: code to access the FS inside the OS inside the FS...
• Example of a possible file-system layout.
• Boot-block in a known location: solves the boot-strap problem.
• Simple interface for BIOS/EFI to load known blocks, execute contents.
• Superblock contains administrative data: size, FS type, locations of the other regions.
• Description of which blocks on the disk are free.
• I-nodes are the tree nodes for the FS.
• Link into the files and directories.

# 19. File Allocation: Contiguous

• Allocating a file as a contiguous range of disk blocks:
• Easy to represent, file is start block and size.
• Indexing into block $$k$$ of the file is just addition.
• Maximum read performance on spinning disks.
• When files are deleted they leave holes...
• Same fragmentation problem we saw in memory allocation.
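Both the cheap indexing and the fragmentation problem can be sketched in a few lines. The free map and the function names below are assumptions for illustration; `True` marks a free block.

```python
def block_of(start: int, size: int, k: int) -> int:
    """Disk block holding block k of a contiguous file (start block, size)."""
    if not 0 <= k < size:
        raise IndexError("block index outside file")
    return start + k


def first_fit(free_map: list, size: int):
    """Return the start of the first run of `size` free blocks, or None."""
    run = 0
    for index, is_free in enumerate(free_map):
        run = run + 1 if is_free else 0
        if run == size:
            return index - size + 1
    return None


# Holes left behind by deleted files: five free blocks, but no run longer than two.
free_map = [False, True, True, False, True, True, False, True]
print(sum(free_map), first_fit(free_map, 2), first_fit(free_map, 3))
```

A three-block file cannot be placed even though five blocks are free, which is the fragmentation problem in miniature.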

# 20. File Allocation: Contiguous

• Avoiding fragmentation: we could preallocate (max) space for the file.
• Prevents deletion and reallocation when it changes.
• Works pretty well in write-once applications: all file sizes are known.
• e.g. burning UDF FS onto an optical disk.
• e.g. creating read-only boot systems (building a server image).
• Also works well if we write out incremental snapshots.
• Version-control at the FS level, e.g. Fossil (Plan9) or ZFS.
• Another use is to split the file into extents.
• A collection of contiguous pieces of the file.

# 21. File Allocation: Linked List

• Another approach to file allocation is a linked list for each file.
• Each block in a file stores the next block in the file.
• Zero fragmentation.
• Sequential file access becomes random access on the disk.
• Mechanical disks are slower at random access than sequential.
• Problem: Indexing into block $$k$$ (random access in the file) requires reading $$k$$ blocks to follow the list.
• Problem: The pointer takes space, so each block holds slightly less than $$2^n$$ bytes of data: alignment problems, e.g. a 4 KB access will span two blocks.
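Both problems show up in a small model where each block starts with a "next block" field. The 4-byte pointer, the `-1` end marker and the dictionary standing in for the disk are assumptions made for this sketch.

```python
import struct

BLOCK_SIZE = 4096
POINTER = struct.Struct("<i")               # assumed 4-byte "next block" field
PAYLOAD_SIZE = BLOCK_SIZE - POINTER.size    # data bytes left in each block
END = -1


def make_block(next_block: int, data: bytes) -> bytes:
    """Build one disk block: the pointer first, then zero-padded data."""
    return POINTER.pack(next_block) + data.ljust(PAYLOAD_SIZE, b"\0")


def seek_block(disk: dict, first: int, k: int):
    """Follow the chain to block k of the file; returns (disk block, reads used)."""
    current, reads = first, 0
    for _ in range(k):
        (current,) = POINTER.unpack_from(disk[current])   # one disk read per hop
        reads += 1
        if current == END:
            raise IndexError("file is shorter than k blocks")
    return current, reads


# A file scattered over disk blocks 4, 7, 10 and 12.
disk = {
    4: make_block(7, b"a"),
    7: make_block(10, b"b"),
    10: make_block(12, b"c"),
    12: make_block(END, b"d"),
}
print(seek_block(disk, 4, 3), PAYLOAD_SIZE)
```

With 4092 data bytes per block, a read of 4096 bytes starting at file offset 0 already needs the first two blocks.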

# 22. File Allocation Table (FAT)

• File Allocation Table (FAT): use a single table to hold all lists together.
• Directory entry points to start.
• e.g. A: 4,7,10,12.
• No pointer inside blocks: each block stores a full $$2^n$$ bytes of data.
• Keep table in memory - faster to index into for random access.
• One entry for each block on disk: problem for larger FS.
• Sentinel (-1) at end of each list.
• FAT was used by MS-DOS, FAT32 arrived with Windows 95 (OSR2) - FAT variants are still the standard for most removable media.
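The example file A can be modelled with a plain list as the table. The table size of 16, the `FREE` marker and the 4-byte entry size used in the size estimate are assumptions for this sketch, not properties of any particular FAT version.

```python
END = -1      # sentinel marking the last block of a file
FREE = None   # hypothetical marker for an unallocated block

# One entry per disk block; entry i holds the block that follows block i.
fat = [FREE] * 16
fat[4], fat[7], fat[10], fat[12] = 7, 10, 12, END   # file A: 4, 7, 10, 12

directory = {"A": 4}   # the directory entry points at the first block


def blocks_of(name: str) -> list:
    """Follow the chain in the table; no data blocks are read."""
    chain, current = [], directory[name]
    while current != END:
        chain.append(current)
        current = fat[current]
    return chain


def table_size_bytes(block_count: int, entry_bytes: int = 4) -> int:
    """Memory needed to hold the whole table, one entry per disk block."""
    return block_count * entry_bytes


print(blocks_of("A"), table_size_bytes(10**9))
```

For the earlier example of $$10^9$$ blocks, 4-byte entries would need about 4 GB of table, which is the scaling problem for larger file systems.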

# 23. I-nodes

• Last approach for file allocation: index-nodes (i-nodes), a per-file block listing the file's blocks.
• Splits FAT into many indices - one per file.
• Don't put entire disk table in memory - only open files.
• Scales much better for large FS.
• Problem: FAT allows arbitrary length files, i-node can only hold fixed number of pointers.
• Solution: chain i-nodes together (linked-list of index blocks).
• Used in UNIX / NTFS.
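The chained scheme described above can be sketched as a linked list of small index blocks. The class name `IndexBlock` and the deliberately tiny limit of four pointers are assumptions for illustration; real i-node layouts differ in detail.

```python
POINTERS_PER_INODE = 4   # tiny on purpose; real index blocks hold far more


class IndexBlock:
    """A fixed number of block pointers plus a link to the next index block."""

    def __init__(self) -> None:
        self.pointers = []
        self.next = None


def append_block(head: IndexBlock, disk_block: int) -> None:
    """Record one more data block, chaining a new index block when full."""
    node = head
    while len(node.pointers) == POINTERS_PER_INODE:
        if node.next is None:
            node.next = IndexBlock()
        node = node.next
    node.pointers.append(disk_block)


def block_of(head: IndexBlock, k: int) -> int:
    """Disk block for block k of the file: skip whole index blocks, then index."""
    node = head
    for _ in range(k // POINTERS_PER_INODE):
        if node.next is None:
            raise IndexError("block index outside file")
        node = node.next
    return node.pointers[k % POINTERS_PER_INODE]


inode = IndexBlock()
for disk_block in range(100, 110):      # a ten-block file
    append_block(inode, disk_block)
print(block_of(inode, 9))
```

Random access costs one hop per index block rather than one per data block, and only the index blocks of open files need to be in memory.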

# 24. Implementing directories

• The file-allocation records where each file is on the disk.
• But to open/read these files by their path, the OS must:
• Walk through the directory tree, according to the path components.
• Find the location of the file (block-range, first block or inode).
• The implementation of a directory must:
• Map the ASCII name string onto some disk block.
• Could be the files (as above) or the next directory component in the path.
• Mapping could be fixed-length names:
• 8+3 in MS-DOS, a label and an extension indicating file format.
• 14 characters in older UNIX, any mix of labels and/or extensions.
• 255 characters in most modern systems (almost arbitrary strings).
• Variable-length names are slightly more challenging...

# 25. Variable-length filenames

• a) shows file-names inline.
• Names are padded to 4-byte boundaries.
• Each entry is a different size.
• Making listing the directory more complex.
• Fragmentation inside listing if files are deleted.
• b) shows a heap approach.
• Each entry is the same size.
• Simpler to list entries, no fragmentation in entries themselves.
• Still need to compact the heap when it is fragmented (can be done in memory).
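A sketch of the inline scheme a): each entry carries its own length, so a listing hops from entry to entry. The header layout (entry length, name length, i-node number) is an assumption made for this example and is not taken from the figure.

```python
import struct

# Assumed entry header: entry length, name length, i-node number.
ENTRY_HEADER = struct.Struct("<HHI")


def pack_entry(name: str, inode: int) -> bytes:
    """One directory entry with the name padded to a 4-byte boundary."""
    raw = name.encode("ascii")
    padded_len = (len(raw) + 3) // 4 * 4
    total = ENTRY_HEADER.size + padded_len
    return ENTRY_HEADER.pack(total, len(raw), inode) + raw.ljust(padded_len, b"\0")


def list_entries(directory: bytes) -> list:
    """Hop from entry to entry using each entry's own length field."""
    entries, offset = [], 0
    while offset < len(directory):
        total, name_len, inode = ENTRY_HEADER.unpack_from(directory, offset)
        start = offset + ENTRY_HEADER.size
        name = directory[start:start + name_len].decode("ascii")
        entries.append((name, inode))
        offset += total
    return entries


listing = pack_entry("readme.txt", 19) + pack_entry("a", 7)
print(list_entries(listing))
```

Deleting the first entry would leave a 20-byte hole that only a name of at most 12 bytes can reuse, which is the fragmentation inside the listing.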

# 26. Scalability of directory structures

• Schemes shown so far assume linear search for filenames.
• So walking a path is $$\mathcal{O}(n)$$ for each component, where $$n$$ is the number of entries in that directory.
• Must search for name inside directory at each step.
• Large directories slow down the system (even opening their subdirectories).
• What about servers, NFS, NAS - millions / billions of files?
• Hash filenames.
• Split the lists seen so far into one chain per hash bucket.
• Average lookup is $$\mathcal{O}(1)$$.
• Implementation is more complex.
• inode requires hashtable structure followed by multiple lists.
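A minimal sketch of a hashed directory with one chain per bucket. The bucket count, the byte-sum hash and the class name are assumptions chosen for simplicity; a real file system would use a better hash and an on-disk layout.

```python
BUCKETS = 8  # hypothetical table size


def bucket_of(name: str) -> int:
    # Deterministic toy hash, so the bucket for a name is stable across runs
    # (Python's built-in hash() is randomised per process for strings).
    return sum(name.encode("utf-8")) % BUCKETS


class HashedDirectory:
    def __init__(self) -> None:
        self.chains = [[] for _ in range(BUCKETS)]

    def add(self, name: str, location: int) -> None:
        chain = self.chains[bucket_of(name)]
        if any(existing == name for existing, _ in chain):
            raise FileExistsError(name)      # names stay unique per directory
        chain.append((name, location))

    def lookup(self, name: str) -> int:
        # Only the chain of one bucket is searched, not the whole directory.
        for existing, location in self.chains[bucket_of(name)]:
            if existing == name:
                return location
        raise FileNotFoundError(name)


directory = HashedDirectory()
directory.add("readme.txt", 19)
directory.add("a", 7)
print(directory.lookup("a"))
```

Lookups stay fast only while the chains stay short, so the table has to grow with the directory; that is part of the extra implementation complexity.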

# 27. Attributes in directories

• Attributes are general meta-data for files.
• Track usage information: creation / access times.
• Ownership
• Access rights
• More general tags / categories.
• These can be stored in the directory entry as shown in a).
• Can be put into a separate structure: b) shows dedicated i-node.
• Flexibility.
• Scope for optimising small files (e.g. reusing the i-node to hold the data).

# 28. Summary

• Logical organisation: user visible organisation of files and directories.
• Physical organisation: layout of blocks on the disk.
• We've seen most of the implementation section:
• File Allocation.
• Directories.
• Rest is in the next lecture.
• Links, Journalling File Systems and VFS.