Chapter 8 -- Memory Management

  1. Memory management background.

    1. What you can do depends on hardware support

    2. What you can do depends on the software you want to run.

  2. Goals

    1. Processes can access RAM

    2. RAM use minimized

    3. Processes protected

    4. Everything is fast

    5. Ease of programming for applications

  3. Address binding.

    1. Assuming the file to be run resides on a disk as a binary image ...

    2. Addresses of variables/functions could be fixed at compile time

      1. Generates absolute code (the fastest kind).

      2. Code must be loaded into a fixed location and kept there.

    3. Addresses could be fixed at load time

      1. Uses absolute code

      2. Load step can be long.

      3. Code cannot be moved once loaded.

    4. Addresses could be variable during run time

      1. Uses relocatable code; final addresses are not fixed until run time.

      2. Can be moved during execution.

      3. Requires hardware support

        1. Most CPUs have this support. Some (like car computers or Nintendo) do not.

        2. Hardware provides the illusion that the process has not moved, even though it has.

        3. Distinguishes between logical and physical addresses.

  4. Swapping

    1. Bring whole process in

    2. Run process

    3. Swap whole process out

    4. But what about speed?

  5. Dynamic loading

    1. Idea: Don't load code until it's actually called.

    2. Linux Implementation: Load a first small chunk, then jump into it. Let the rest come in on page faults.

    3. Advantage: produces quick loading code at first. Saves memory.

    4. Disadvantage: Total loading time can be increased.

  6. Dynamic Linking

    1. Idea: share libraries between processes.

    2. Implementation:

      1. Process includes a stub for each dynamically linked subroutine.

      2. When the stub is called, it loads the library function (hopefully already in shared memory) and gets the routine's address.

        1. Can call routine ...

        2. Can change caller to call routine directly, then call routine.

  7. Overlays

    1. Idea: Have a fixed portion of memory to bring in routines.

    2. Implementation:

      1. Partition program into phases

      2. Load and run phase one

      3. Load and run phase two ...

    3. Advantages

      1. Can work in a very small amount of memory

      2. Don't need any O.S. support

    4. Disadvantages

      1. Need to partition application

      2. Need to program phases explicitly.

    5. Used especially under DOS, which had a 640K limit.

  8. Hardware support

    1. Logical vs. Physical addressing

      1. Logical is what the program generates and sees. Process only cares about logical addresses.

      2. Physical is what the memory (RAM) sees. RAM only cares about physical.

      3. Change comes from the MMU.

    2. Segmentation, Partitioning and Swapping

      1. Swapping

        1. Idea: Free up memory by storing other partitions on disk.

        2. Advantage:

          1. Can run more processes than have memory for.

          2. Can use swapping to relocate and help external fragmentation.

        3. Disadvantage:

          1. Must swap WHOLE PARTITIONS. That's expensive.

          2. A 100K partition on a 1M/sec disk requires 2 * 100 / 1000 = 0.2 second to swap (swap out plus swap in).

        4. Used a lot as a medium term scheduler. Toss processes out when you get overloaded.
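The swap-time estimate above can be written out as a short calculation (the 100K partition and 1M/sec disk rate are the figures from the notes):

```python
# Cost of swapping a whole partition: transfer it out, then back in.
partition_kb = 100            # 100K partition
disk_rate_kb_per_sec = 1000   # 1M/sec disk

one_way = partition_kb / disk_rate_kb_per_sec   # 0.1 second each way
swap_time = 2 * one_way                         # out + in = 0.2 second
```

Doubling the partition size doubles this cost, which is why swapping whole partitions is expensive for large processes.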

      2. Segmentation

        1. Idea: Use this simple hardware to provide multiple processes memory at the same time.

        2. Scheme:

          1. Give each process a place in RAM. Set the base register to the start of this segment, and the limit register to its length.

          2. When process generates an address A,

            1. MMU generates A + base

            2. MMU checks to make sure that does not exceed limit.

            3. Can store a process anywhere in memory, and have it think it's at a different place.

          3. Advantages: Simple, can relocate things at will.
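The base/limit scheme above can be sketched as a tiny MMU model. The base, limit, and address values are invented for illustration:

```python
def translate(logical_addr, base, limit):
    """Relocate a logical address, faulting if it exceeds the limit."""
    if logical_addr >= limit:            # MMU check against segment length
        raise MemoryError("protection fault: address exceeds limit")
    return base + logical_addr           # MMU generates A + base

# A process placed at physical 14000 with a 6000-byte segment
# thinks it lives at address 0:
phys = translate(346, base=14000, limit=6000)   # -> 14346
```

Moving the process only requires copying its memory and updating the base register; the logical addresses it generates never change.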

        3. Disadvantage:

          1. All segments must be contiguous.

          2. Problems

            1. External fragmentation: Free space exists but not in a big enough single piece to be useful.

            2. Internal fragmentation: Applications have free space in each segment, but the operating system cannot use it.

          3. Allocation schemes can help

            1. First fit. Put new segment in first free hole that is big enough. (cheap, easy).

            2. Best fit. Put new segment in smallest free hole that is big enough. (saves big free holes, produces many small free holes.)

            3. Worst fit. Put new segment in biggest free hole. (produces no small free holes, but uses up big free holes.)

            4. In a personal experiment of 10 tries, first fit won 1.5, worst fit won 1, best fit won 3.5, and 4 were ties.

            5. Example Problem

              Process   Memory   Start Time   Required Time
              1          600K     1           10
              2         1000K     5            5
              3          300K     7           20
              4          700K    10            8
              5          500K    15           15
              6          600K    16           10
              7         1000K    20            5
              8          300K    21           20
              9          700K    24            8
              10         500K    27           15

          4. Compaction can help.

            1. Simple to do (just move memory and change relocation registers)

            2. Can be expensive (lots of memory copying).

    3. PAGING ...

      1. Basic Idea: Permit noncontiguous memory allocation.

        1. Break memory up into pages, each contiguous and normally a power of two in size, usually between 512 bytes and 8K bytes.

        2. Break logical addresses into page-num and offset on a bit boundary.

        3. Translate logical into physical via PAGETABLE[page-num] + offset (the table entry holds the frame's starting address).

        4. Associate with each page a set of permissions bits, and a valid-invalid bit.
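The translation in steps 2-3 can be sketched for 2K (2^11) pages. The page table contents are invented for illustration:

```python
PAGE_BITS = 11
PAGE_SIZE = 1 << PAGE_BITS        # 2K pages

page_table = {0: 5, 1: 9, 2: 3}   # page-num -> frame-num

def translate(logical_addr):
    page_num = logical_addr >> PAGE_BITS        # high bits of the address
    offset   = logical_addr & (PAGE_SIZE - 1)   # low bits, on a bit boundary
    frame    = page_table[page_num]             # PAGETABLE[page-num]
    return frame * PAGE_SIZE + offset

translate(2048 + 100)   # page 1, offset 100 -> frame 9 -> 18532
```

Because the split is on a bit boundary, the page number and offset come out with a shift and a mask rather than a division, which is what makes the hardware cheap.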

      2. Problems

        1. Can slow a CPU down. Normally the page table is cached in a Translation Lookaside Buffer, which is associative memory.

          1. Now the TLB must be flushed on every page table change.

          2. Normally, can get 98% hit rates.
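The cost of that 98% hit rate can be made concrete with an effective-access-time calculation. The 20ns TLB and 100ns memory timings are assumed values for illustration, not figures from the notes:

```python
tlb_ns, mem_ns, hit_rate = 20, 100, 0.98

hit_time  = tlb_ns + mem_ns        # TLB hit: one memory access
miss_time = tlb_ns + 2 * mem_ns    # miss: page-table access, then the access
eat = hit_rate * hit_time + (1 - hit_rate) * miss_time   # 122.0 ns
```

Even a 2% miss rate adds measurable overhead versus the 100ns raw memory time, which is why TLB flushes on page table changes matter.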

        2. Can slow process switch time down. The page table must be switched. Luckily, most CPUs offer a "page table base register", and only that need be changed. Or can associate a PID field with the CPU and with each page. On the IBM/360, access is allowed only if the fields are equal or one is zero.

        3. Allocating pages can be slow, since the OS might need to keep track of hundreds of pages. (A linked list of free pages can help.)

        4. Internal fragmentation. Average waste is 1/2 page per segment.

        5. For large address space machines, the page table can become ridiculous.

          1. A 32 bit CPU with 2K pages has 2^21 pages!!

          2. This is a problem for large address-space machines, not large memory machines!

          3. A two-level page table can be used, with un-needed sections simply not allocated (VAX).

          4. For 64 bit machines, this is not enough. Instead, can use 3-level (SPARC) and 4-level (Motorola) paging schemes.

          5. Can also use an INVERTED PAGE TABLE. (IBM RT, HP)

            1. Table contains one entry for every physical page, not every virtual one. The table holds tuples of the form <PID, PAGE-NUM>.

            2. When a virtual address is to be translated, look up <PID, PAGE-NUM> in the table. Find it at location 'i'.

            3. Real memory is at <i, offset>.

            4. Page table no longer has all associations, since some virtual pages will be swapped out. Each process must keep a whole page table with it.

            5. Not sure how to share memory without redoing the inverted page table on each switch.

            6. Not sure how to map in the O.S.
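The inverted lookup in steps 2-3 can be sketched as follows. The table has one entry per physical frame, the entry's index is the frame number, and the table contents and PIDs are invented for illustration:

```python
PAGE_SIZE = 2048

# One entry per physical frame: entry i holds <PID, PAGE-NUM>
# for the virtual page currently occupying frame i.
inverted = [("P1", 0), ("P2", 0), ("P1", 1), ("P2", 3)]

def translate(pid, page_num, offset):
    for i, entry in enumerate(inverted):      # search for <PID, PAGE-NUM>
        if entry == (pid, page_num):
            return i * PAGE_SIZE + offset     # real memory is <i, offset>
    raise MemoryError("page fault: page not in physical memory")

translate("P1", 1, 10)   # found at i=2 -> 2*2048 + 10 = 4106
```

The linear search here is the scheme's weakness: real implementations hash on <PID, PAGE-NUM> to avoid scanning the whole table on every reference.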