Page: https://student.cs.uwaterloo.ca/~cs350/S21/

Instructor: Lesley Istead

Time: T,Th 11:15am-12:40pm

Week 1. May 11

normal = (.02 * A0 + .1 * A1 + .15 * A2 + .13 * A3) + .075 * R1 + .075 * R2 + .1 * Q + .35 * F

OS/161 essentials:

defn. generally, an OS is a system that manages a computer's resources on behalf of the programs that use them

three views of an OS: the application view (what services does it provide?), the system view (what problems does it solve?), and the implementation view (how is it built?)

defn. the kernel is the part of the operating system that responds to system calls, interrupts and exceptions

defn. monolithic kernel: includes device drivers, file system, virtual memory, IPC, etc.

defn. microkernel: only absolutely necessary components are a part of the kernel. all other elements are user programs (QNX)

defn. real-time OS: an OS with stringent event response times, guarantees, and preemptive scheduling (QNX)

threads and concurrency

reasons to use threads:

OS/161 thread interface

// kern/include/thread.h

// creation
int thread_fork(const char *name, struct proc *proc,
                void (*func)(void *, unsigned long),
                void *data1, unsigned long data2);

// termination
void thread_exit(void);

// yielding
void thread_yield(void);

other thread libraries:

review: MIPS registers

num     name    use
0       z0      always zero
1       at      assembler reserved
2       v0      return val / syscall #
3       v1      return val
4-7     a0-a3   subroutine args
8-15    t0-t7   temps (caller-save)
16-23   s0-s7   saved (callee-save)
24,25   t8,t9   temps (caller-save)
26,27   k0,k1   kernel temps
28      gp      global ptr
29      sp      stack ptr
30      s8/fp   frame ptr (callee-save)
31      ra      return addr (jal)

concurrent program execution (two threads)

(diagram) one CPU per thread: each CPU holds its own PC, SP, and register contents. both threads share a single address space; the code and data are shared, but each thread has its own thread stack within that address space, and each SP points at its own stack.

implementing concurrent threads:

options

timesharing

Week 2. May 18

context switch

context switch in MIPS

/* kern/arch/mips/thread/switch.S */
switchframe_switch:
   /*
    * a0 contains the address of the switchframe pointer in the old thread.
    * a1 contains the address of the switchframe pointer in the new thread.
    */

   /* Allocate stack space for saving 10 registers. 10*4 = 40 */
   addi sp, sp, -40

   /* Save the registers */
   sw ra, 36(sp)
   sw gp, 32(sp)
   sw s8, 28(sp)
   sw s6, 24(sp)
   sw s5, 20(sp)
   sw s4, 16(sp)
   sw s3, 12(sp)
   sw s2, 8(sp)
   sw s1, 4(sp)
   sw s0, 0(sp)

   /* Store the old stack pointer in the old thread */
   sw sp, 0(a0)

   /* Get the new stack pointer from the new thread */
   lw sp, 0(a1)
   nop                  /* _delay slot_ for load (because load is slow in pipeline) */

   /* Now, restore the registers */
   lw s0, 0(sp)
   lw s1, 4(sp)
   lw s2, 8(sp)
   lw s3, 12(sp)
   lw s4, 16(sp)
   lw s5, 20(sp)
   lw s6, 24(sp)
   lw s8, 28(sp)
   lw gp, 32(sp)
   lw ra, 36(sp)
   nop                  /* delay slot for load (not needed?) */

   /* and return. */
   j ra                 /* jump to new thread */
   addi sp, sp, 40      /* in delay slot */
   .end switchframe_switch

the caller of switchframe_switch() has to save t registers (kern/thread/thread.c:thread_switch).

reasons for a context switch:

thread states

ready, running, blocked

(state diagram)
running -> ready:   preemption or thread_yield()
ready -> running:   dispatch
running -> blocked: sleep() when a resource is not available
blocked -> ready:   wchan_wakeall()/wchan_wakeone() when the resource becomes available
blocked threads sit on wait channels

voluntary context switch

timesharing and preemption

interrupt

a preemptive scheduler uses the scheduling quantum to impose a time limit on running threads

os/161 threads use preemptive round-robin scheduling.

eg. interruption

(stack diagrams, growing toward 0x0)
thd 1 (interrupted): ... program stack frames -> trap frame -> interrupt handler frame(s) -> thread_yield() -> thread_switch() -> switchframe_switch()
thd 2 (taken on, switched out voluntarily earlier): ... program stack frames -> thread_yield() -> thread_switch() -> switchframe_switch()

questions:

eg.

void helper(void *unused, unsigned long i);

int main() {
    helper(NULL, 0);
}

void helper(void *unused, unsigned long i) {
    if (i < 3) {
        kprintf("%ld", i);
        thread_fork("helper1", NULL, helper, NULL, i+1);
        thread_fork("helper2", NULL, helper, NULL, i+1);
    }
    if (i == 0) kprintf("%ld", i);
    thread_exit();
}

/* which of the following outputs are not possible?
   a) 01221220
   b) 01120222
   c) 01342560   x flow stops when i==3
   d) 01222120   x cannot have three 2's after one fork (at most 2)
   e) 01122220
   f) 01234560   x flow stops when i==3
   g) 01212220
   h) 00112222
*/

eg.

for i = 1 to N:
    busy(C ms)
    sleep(S ms)

there is one processor and two threads running this program. assume C < quantum and C < S. both threads start at t = 0; at what time have both threads finished? answer: N(C + S) + C

let C = xx, S = yyy th0| xxyyy|xxyyy|xxyyy| ... |xxyyy th1| xxy|yyxxy|yyxxy| ... | xxy|yy

what if C > S? answer: 2NC + S

let C = xxx, S = yy, extra waiting = _ th0| xxx|yy_|xxx|yy_|xxx|yy_| ... |xxx|yy_ th1| xxx|yy_|xxx|yy_|xxx| ... |yy_|xxx|yy
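a quick check of both formulas with illustrative numbers (N = 2, quantum larger than C):

C < S: C = 2, S = 5  =>  N(C + S) + C = 2*(2+5) + 2 = 16
    th0: busy 0-2, sleep 2-7,  busy 7-9,  sleep 9-14     (done at 14 = N(C + S))
    th1: busy 2-4, sleep 4-9,  busy 9-11, sleep 11-16    (done at 16 = N(C + S) + C)

C > S: C = 3, S = 2  =>  2NC + S = 2*2*3 + 2 = 14
    th0: busy 0-3, sleep 3-5,  busy 6-9,  sleep 9-11     (done at 11)
    th1: busy 3-6, sleep 6-8,  busy 9-12, sleep 12-14    (done at 14 = 2NC + S)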

Week 3. May 25

synchronization

(spin) lock acquire and release

hardware-specific synchronization instructions:

#if x86

void Acquire(bool *lock) {
    while (Xchg(true, lock));              // spin until the old value comes back false
}

#elif arm

w ARMTestAndSet(w value, void *addr) {
    w tmp = LDREX addr;                    // load current value
    w result = STREX value, addr;          // attempt the store; may fail
    if (result == SUCCEED)
        return tmp;                        // successfully changed it, return the old value
    return true;                           // store failed, so return true to keep trying
}

void Acquire(bool *lock) {
    while (ARMTestAndSet(true, lock));
}

#endif

void Release(bool *lock) {
    *lock = false;
}

void my_program() {
    extern bool lk;
    Acquire(&lk);
    // critical section
    Release(&lk);
}

os/161

test and set in os/161

/* return value 0 indicates lock was acquired */
spinlock_data_t
spinlock_data_testandset(volatile spinlock_data_t *sd) {
    spinlock_data_t x, y;
    y = 1;
    __asm volatile(
        /* assembly instructions: x = %0, y = %1, sd = %2 */
        ".set push;"        /* save assembler mode */
        ".set mips32;"      /* allow MIPS32 instructions */
        ".set volatile;"    /* avoid unwanted optimization */
        "ll %0, 0(%2);"     /* x = *sd */
        "sc %1, 0(%2);"     /* *sd = y; y = success? */
        ".set pop"          /* restore assembler mode */
        : "=r" (x), "+r" (y)    /* outputs */
        : "r" (sd));            /* inputs */
    if (y == 0)
        return 1;
    return x;
}
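a simplified sketch of how a spin lock can be built on top of testandset (the real kern/thread/spinlock.c also raises the interrupt priority level and records the holding CPU; Spinlock_Acquire/Release here are illustrative names):

void Spinlock_Acquire(volatile spinlock_data_t *sd) {
    while (1) {
        if (*sd != 0) continue;                            /* lock looks held: spin on a plain read */
        if (spinlock_data_testandset(sd) != 0) continue;   /* lost the race: try again */
        break;                                             /* testandset returned 0: we hold the lock */
    }
}

void Spinlock_Release(volatile spinlock_data_t *sd) {
    *sd = 0;    /* a plain store releases the lock */
}

spinning on the plain read first avoids hammering the bus with ll/sc pairs while the lock is held.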

assumption:

os/161 locks

thread blocking

in os/161, wait channels (queues) are used to implement thread blocking

typedef struct Lock {
    char *name;       // dbg
    bool held;
    Thread *owner;
    wchan *wc;
    Spinlock *spin;
} Lock;

void lock_acquire(Lock *lk) {
    KASSERT(lk);
    KASSERT(!(lk->held && lk->owner == currthread));   // I don't already own it
    spin_acquire(lk->spin);
    while (lk->held) {
        wchan_lock(lk->wc);       // lock the wait channel *before* releasing the spinlock
        spin_release(lk->spin);   // so a wakeup cannot be lost in between
        wchan_sleep(lk->wc);
        // after waking up
        spin_acquire(lk->spin);
    }
    lk->held = true;
    lk->owner = currthread;
    spin_release(lk->spin);
}

void lock_release(Lock *lk) {
    KASSERT(lk);
    KASSERT(lk->held && lk->owner == currthread);
    spin_acquire(lk->spin);
    lk->held = false;
    lk->owner = NULL;
    wchan_wakeone(lk->wc);
    spin_release(lk->spin);
}

semaphore impl is almost identical..

semaphores

types of semaphores:

differences between lock and semaphore:

Week 4. June 1

void P(Semaphore *sem) {
    KASSERT(sem);
    spin_acquire(sem->spin);
    while (sem->sem_count == 0) {
        wchan_lock(sem->wc);      // prevent sleeping on an available resource
        spin_release(sem->spin);
        wchan_sleep(sem->wc);
        // after waking up
        spin_acquire(sem->spin);
    }
    --sem->sem_count;
    spin_release(sem->spin);
}

void V(Semaphore *sem) {
    KASSERT(sem);
    spin_acquire(sem->spin);
    ++sem->sem_count;
    wchan_wakeone(sem->wc);
    spin_release(sem->spin);
}

eg. producer/consumer synchronization with bounded buffer

Semaphore *items  = sem_create("", 0);
Semaphore *spaces = sem_create("", N);

void Producer() {
    P(spaces);
    // add item to buffer (has to be atomic)
    V(items);
}

void Consumer() {
    P(items);
    // remove item from buffer (has to be atomic)
    V(spaces);
}
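a minimal sketch of what the 'atomic' buffer operations could look like, using a fixed array and an os/161-style lock to protect the indices (item_t, buffer, head, tail, and mutex are illustrative names, not from the course code):

item_t buffer[N];
int head = 0, tail = 0;      /* remove at head, insert at tail */
Lock *mutex;                 /* protects buffer, head, tail */

void Producer(item_t it) {
    P(spaces);               /* wait for a free slot */
    lock_acquire(mutex);
    buffer[tail] = it;
    tail = (tail + 1) % N;
    lock_release(mutex);
    V(items);                /* one more item available */
}

item_t Consumer(void) {
    P(items);                /* wait for an item */
    lock_acquire(mutex);
    item_t it = buffer[head];
    head = (head + 1) % N;
    lock_release(mutex);
    V(spaces);               /* one more free slot */
    return it;
}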

condition variables

void cv_wait(cv *cv, Lock *lk) {
    KASSERT(cv && lk);
    KASSERT(lk->held && lk->owner == currthread);
    wchan_lock(cv->wc);
    lock_release(lk);
    wchan_sleep(cv->wc);
    lock_acquire(lk);
}

void cv_signal(cv *cv, Lock *lk) {
    // body omitted in the notes; it wakes one thread sleeping on cv->wc
}

eg. bounded buffer producer & consumer

int volatile count = 0;
Lock *mutex;
cv *notfull, *notempty;

void Produce(item) {
    lock_acquire(mutex);
    while (count == N) {
        cv_wait(notfull, mutex);
    }
    // add item to buffer
    ++count;
    cv_signal(notempty, mutex);
    lock_release(mutex);
}

void Consume() {
    lock_acquire(mutex);
    while (count == 0) {
        cv_wait(notempty, mutex);
    }
    // remove item from buffer
    --count;
    cv_signal(notfull, mutex);
    lock_release(mutex);
    cb(item);
}

volatile (os/161)

deadlocks

eg.

void a() {
    acquire(A);
    acquire(B);
    // body
}

void b() {
    acquire(B);
    acquire(A);
    // body
}

// deadlock: if one thread holds A and the other holds B,
// each waits forever for the lock the other holds

techniques:

eg. no hold and wait

int try_acquire(Lock *lk) {
    spin_acquire(lk->spin);
    if (lk->held) {
        spin_release(lk->spin);
        return error;
    }
    lk->held = true;
    lk->owner = currthread;
    spin_release(lk->spin);    // release the spinlock on success too
    return 0;
}

void a() {
    acquire(A);
    while (try_acquire(B) != 0) {
        release(A);            // give up A so b() can make progress, then start over
        acquire(A);
    }
    // body
}

void b() {
    acquire(B);
    while (try_acquire(A) != 0) {
        release(B);
        acquire(B);
    }
    // body
}

eg. resource ordering

void a() {
    acquire(A);
    acquire(B);
    // body
}

void b() {
    acquire(A);    // both threads acquire in the same (A, B) order, so no deadlock
    acquire(B);
    // body
}

eg. we have n queues and need to support an operation transfer(i, j) that moves an element from queue i to queue j unless i has no elements. how should locks be used?

  1. use one lock per queue and no hold and wait: no deadlock. can do n/2 transfers (# unique pairs) at the same time
  2. use one lock per queue and resource ordering:
    transfer(q1, q2) {
        if (q1->num < q2->num) {    // num = queue number, giving a global lock order
            acquire(q1); acquire(q2);
        } else {
            acquire(q2); acquire(q1);
        }
    }
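a fuller sketch of option 2, assuming each queue carries an index that fixes a global lock order (queue_t, q_index, q_lock, queue_empty/queue_pop/queue_push are illustrative names, not from the course code):

typedef struct queue {
    int   q_index;    /* position in the global lock order */
    Lock *q_lock;     /* protects this queue's contents */
    /* ... queue storage ... */
} queue_t;

void transfer(queue_t *src, queue_t *dst) {
    /* always acquire the lower-indexed lock first: no cycle in the waits-for graph, so no deadlock */
    queue_t *first  = (src->q_index < dst->q_index) ? src : dst;
    queue_t *second = (first == src) ? dst : src;

    lock_acquire(first->q_lock);
    lock_acquire(second->q_lock);
    if (!queue_empty(src)) {               /* 'unless i has no elements' */
        queue_push(dst, queue_pop(src));
    }
    lock_release(second->q_lock);
    lock_release(first->q_lock);
}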

Week 5. June 8

processes

os/161 process management calls:

eg. how many processes in total? 2^3 = 8 (each fork doubles the number of processes)

int main() { fork(); fork(); fork(); }

system calls

system calls are the interface between processes and the kernel.

+-------------------+
|    application    |
+-------------------+
|system call library|
+-------------------+   unprivileged code
=====================
+-------------------+   privileged code
|       kernel      |
+-------------------+

exceptions

there are only two things that make kernel code run: interrupts and exceptions (including system calls)

in mips, the same mechanism handles exceptions and interrupts.

EX_IRQ   0    /* Interrupt */
EX_MOD   1    /* TLB Modify (write to read-only page) */
EX_TLBL  2    /* TLB miss on load */
EX_TLBS  3    /* TLB miss on store */
EX_ADEL  4    /* Address error on load */
EX_ADES  5    /* Address error on store */
EX_IBE   6    /* Bus error on instruction fetch */
EX_DBE   7    /* Bus error on data load *or* store */
EX_SYS   8    /* Syscall */
EX_BP    9    /* Breakpoint */
EX_RI    10   /* Reserved (illegal) instruction */
EX_CPU   11   /* Coprocessor unusable */
EX_OVF   12   /* Arithmetic overflow */

how does the kernel know which syscall the app is requesting?
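os/161's answer, as a simplified sketch of kern/arch/mips/syscall/syscall.c: the library wrapper places the syscall number in register v0 before executing the syscall instruction, and the kernel reads it back out of the trap frame; the result goes back in v0 with a3 used as a success/error flag. SYS_write is real, but sys_write here is just a placeholder for whichever calls the kernel actually implements.

void syscall(struct trapframe *tf) {
    int callno = tf->tf_v0;     /* syscall number, placed in v0 by the library wrapper */
    int32_t retval = 0;
    int err;

    switch (callno) {
    case SYS_write:
        err = sys_write(tf->tf_a0, (void *)tf->tf_a1, tf->tf_a2, &retval);
        break;
    /* ... other syscalls ... */
    default:
        err = ENOSYS;
        break;
    }

    if (err) {
        tf->tf_v0 = err;        /* errno in v0 */
        tf->tf_a3 = 1;          /* a3 != 0 tells the wrapper the call failed */
    } else {
        tf->tf_v0 = retval;     /* result in v0 */
        tf->tf_a3 = 0;          /* a3 == 0 means success */
    }
    tf->tf_epc += 4;            /* advance past the syscall instruction before returning */
}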

how do syscalls work

  1. app calls library wrapper function for syscall
  2. library function performs syscall instruction
  3. kernel flips to privileged mode and exception handler runs
    1. create trap frame
    2. determines this is a syscall exception
    3. determine which syscall
    4. does the work for requested syscall
    5. restores the application program state from the trap frame
    6. returns from exception
  4. library wrapper finishes and returns

user and kernel stacks

every os/161 process thread has two stacks though it uses one at a time

inter-process communication

Week 6. June 15

A2

virtual memory

os/161:

growth: |code|data| -> ... <- |stack|   (data grows toward higher addresses, the stack grows toward lower ones)

address translation

dynamic relocation

(diagram) dynamic relocation: each process sees a virtual address space starting at 0x0 (process A: 28KB, 0x0-0x6fff; process B: 48KB, 0x0-0x8fff). each process's memory is placed as one contiguous chunk in physical memory at its own offset, and the MMU holds a per-process relocation (offset) register and a limit register.
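the translation rule this picture implies, with a worked check (the relocation value 0x20000 is illustrative):

if virtual addr >= limit: raise an exception
else: physical addr = relocation + virtual addr

e.g. if process A's relocation register is 0x20000 and its limit is 0x7000 (28KB),
then virtual 0x1240 -> physical 0x21240, while virtual 0x7100 -> exception.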

problem:

segmentation

translation:

eg.

proc A
---------------------------------------------------------
segment    limit register    relocation register
0          0x2000            0x38000
1          0x5000            0x10000
---------------------------------------------------------
translation for A:
---------------------------------------------------------
addr      segment    offset    physical addr    remark
0x1240    0          0x1240    0x39240          addr = 0|001 0010 0100 0000
0xA0A0    1          0x20A0    0x120A0          addr = 1|010 0000 1010 0000
---------------------------------------------------------

Week 7. June 22

paging

process A vmem
+-+-+---+
|o|o|   |
+-+-+---+

proc A page table
-----------------------
page    frame    valid?
0x0     0x0f     1
0x1     0x26     1
0x3     0x00     0
0x4     0x00     0
-----------------------

formulas (note there is only one page table!):

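the standard single-level paging formulas, consistent with the examples below:

offset bits      = log2(page size)
page-number bits = V - offset bits                 (V = # of virtual address bits)
# of PTEs        = 2^V / page size = 2^(page-number bits)
page table size  = (# of PTEs) * (PTE size)

translation: physical addr = (frame # from the PTE) * page size + offset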

other info found in PTEs:

eg. if V = 32 bits, then the maximum virtual mem size is 4GB. assuming the page size is 4KB and the PTE size is 4 bytes, the page table is 4GB / 4KB * 4B = 4MB. if V = 48 bits, then the page table size is 256GB!

eg. consider 32-bit virtual and physical addresses and a page size of 2^12 bytes (4KB). suppose process P is running using only 128KB of vmem. the first 5 entries in its page table are as follows:

page    frame      v
----------------------
0x0     0x00234    1
0x1     0x00235    1
0x2     0x0023f    1
0x3     0x00ace    1
0x4     0x00004    1
  1. what is total number of entries in P's page table? 2^32 / 2^12 = 2^20 PTEs => 20 bits for page num
  2. how many valid entries? 128KB / 4KB = 32 entries
  3. translate following virtual addresses:
  4. if page size were 16KB. how many entries in P page table? 2^32 / 2^14 = 2^18 => 18 bits for page num

why do we keep all pages in the page table instead of only the valid ones? because the table is indexed directly by page number and the valid entries are not contiguous, so they cannot be compacted without losing that constant-time lookup.

multi-level paging

in practice, a multi-level page table is smaller than a single-level page table, because page tables for unused parts of the address space never need to be allocated.

formulas:
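the level-count formula the next example applies (each level resolves log2(page size / PTE size) bits, i.e. the number of PTEs that fit in one page):

levels = ceil( (V - log2(page size)) / log2(page size / PTE size) )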

eg. V=32, PTE size = 2^2, page size = 2^12. levels = ceil((32 - log(2^12)) / log(2^12 / 2^2)) = ceil((32 - 12) / 10) = 2. if V=48, then 4 levels.

eg. consider virtual addresses and physical addresses are 64 bits. page size is 1MB (2^20 bytes). the size of page table entry is 16 (2^4) bytes.

  1. how many bits for virtual address page offset? log(2^20) = 20
  2. what is maximum number of entries in an individual page table? 2^20 / 2^4 = 2^16
  3. how many levels? ceil((64 - 20) / log(2^20 / 2^4)) = ceil(44 / 16) = 3
  4. a process uses 128MB (2^27) of vmem, with a virtual address range from 0 to 2^27-1. how many individual page tables at each level required to translate the process' address space? 1 1 1

TLB

software-managed TLBs:

mips R3000 tlb:

TLB entry (64 bits):
  high word: | page # (20 bits) | PID (6 bits) | (not used) |
  low word:  | frame # (20 bits) | valid / permission bits | (not used) |

why the PID field is not used (in os/161): the TLB is flushed on every context switch anyway.
benefit of using the PID: entries from different processes can coexist, so no flush is needed on a switch (cache affinity), and entries only match their own process (security)...

conclusion:

os/161 (dumbvm)

(diagram) dumbvm layout: the virtual address space has a code segment (vbase1, npages1 pages), a data segment (vbase2, npages2 pages), and a fixed-size stack ending at USERSTACK; each segment is mapped to its own contiguous chunk of physical memory (pbase1, pbase2, stackpbase), not necessarily in the same order as in virtual memory.
struct addrspace {
    vaddr_t as_vbase1;      /* base virtual address of code segment */
    paddr_t as_pbase1;      /* base physical address of code segment */
    size_t  as_npages1;     /* size (in pages) of code segment */
    vaddr_t as_vbase2;      /* base virtual address of data segment */
    paddr_t as_pbase2;      /* base physical address of data segment */
    size_t  as_npages2;     /* size (in pages) of data segment */
    paddr_t as_stackpbase;  /* base physical address of stack */
};

...

/* USERSTACK = 0x8000 0000, DUMBVM_STACKPAGES = 12, PAGE_SIZE = 4KB */
vaddr_t vbase1 = as->as_vbase1;
vaddr_t vtop1  = vbase1 + as->as_npages1 * PAGE_SIZE;
vaddr_t vbase2 = as->as_vbase2;
vaddr_t vtop2  = vbase2 + as->as_npages2 * PAGE_SIZE;
const vaddr_t stackbase = USERSTACK - DUMBVM_STACKPAGES * PAGE_SIZE;
const vaddr_t stacktop  = USERSTACK;

if (faultaddress >= vbase1 && faultaddress < vtop1)             // code
    paddr = (faultaddress - vbase1) + as->as_pbase1;
else if (faultaddress >= vbase2 && faultaddress < vtop2)        // data
    paddr = (faultaddress - vbase2) + as->as_pbase2;
else if (faultaddress >= stackbase && faultaddress < stacktop)  // stack
    paddr = (faultaddress - stackbase) + as->as_stackpbase;
else
    return EFAULT;

os/161 uses ELF binary files:

Week 8. June 29

virtual memory for kernel

challenges if kernel lives in virtual memory:

user | kernel +-------------------+------+-----+-----+----------+ | |stack |kseg0|kseg1| kseg2 | +-------------------+------+-----+-----+----------+ 0x0 |0x8000 0000 0xffff ffff

os/161:

address translation in os/161:

using secondary storage

reducing page faults:

measuring usage:

Week 9. July 6

scheduling

simple scheduling model

simple scheduling model:

first come, first served:

A B C D +-----+----+-----------+----+ order of arrival: A(a=0,r=6) B(a=0,r=5) C(a=0,r=12) D(a=10,r=5)

round robin:

A B C A D C C +----+----+----+-+----+----+-+ max time: 5 order of arrival: A(a=0,r=6) B(a=0,r=5) C(a=0,r=12) D(a=10,r=5)

shortest job first:

B A D C +----+-----+----+----------+ order of arrival: A(a=0,r=6) B(a=0,r=5) C(a=0,r=12) D(a=10,r=5)

shortest remaining time first:

B A A D C +----+----+-+----+----------+ order of arrival: A(a=0,r=6) B(a=0,r=5) C(a=0,r=12) D(a=10,r=5)

cpu scheduling

multi-level feedback queues:

algo:

  1. n round-robin ready queues Q_1 ... Q_n, where Q_i has higher priority than Q_j if i > j
  2. threads in Q_i use quantum q_i, and q_i <= q_j if i > j
  3. scheduler selects a thread from the highest-priority non-empty queue to run
    1. threads in Q_{n-1} are only selected when Q_n is empty
  4. if a thread from Q_n is preempted, it is pushed down to Q_{n-1}
  5. when a thread wakes after blocking, it is put onto the highest-priority queue
  6. to prevent starvation, all threads are periodically placed in the highest-priority queue
  7. since interactive threads tend to block frequently, they will 'live' in the higher-priority queues, while non-interactive (CPU-bound) threads sink to the bottom

(diagram: a woken thread re-enters Q_n, the highest-priority / lowest-quantum queue; each preemption pushes a thread down one queue toward Q_1, the lowest-priority / highest-quantum queue)
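a minimal sketch of the selection/demotion logic, assuming simple FIFO ready queues (NQUEUES, rq, quantum_ms, queue_push/queue_pop, and struct thread are illustrative names, not os/161's scheduler):

#define NQUEUES 8                   /* Q1..Qn; index NQUEUES-1 is the highest priority */

struct thread;                                      /* opaque here */
struct queue { struct thread *head, *tail; };       /* FIFO ready queue */

struct thread *queue_pop(struct queue *q);          /* assumed helpers */
void queue_push(struct queue *q, struct thread *t);

static struct queue rq[NQUEUES];
static const int quantum_ms[NQUEUES] = {80, 70, 60, 50, 40, 30, 20, 10};   /* q_i <= q_j if i > j */

/* pick the next thread to run and report its quantum */
struct thread *mlfq_next(int *quantum) {
    for (int i = NQUEUES - 1; i >= 0; i--) {        /* highest-priority non-empty queue */
        if (rq[i].head != NULL) {
            *quantum = quantum_ms[i];
            return queue_pop(&rq[i]);
        }
    }
    return NULL;                                    /* nothing is ready */
}

/* a thread at level 'level' used up its quantum: demote it one queue */
void mlfq_preempted(struct thread *t, int level) {
    queue_push(&rq[level > 0 ? level - 1 : 0], t);
}

/* a thread woke up after blocking: back to the top queue */
void mlfq_woken(struct thread *t) {
    queue_push(&rq[NQUEUES - 1], t);
}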

linux completely fair scheduler:

scheduling on multi-core processors

shared ready queue: contention

per core ready queue: scalable, cache affinity. downside: load-balancing

device and IO

sys/161 clock:

offset size type desc
0 4 status current time (s)
4 4 status current time (ms)
8 4 command restart-on-expiry
12 4 status and command interrupt (reading clears)
16 4 status and command countdown time (ms)
20 4 command speaker (beeping)

clock is used to implement preemption.
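a sketch of that idea: the timer interrupt handler counts ticks and forces a yield when the quantum is used up (os/161's real handler is hardclock() in kern/thread/clock.c; TICKS_PER_QUANTUM and timer_tick here are assumed names, not its exact code):

#define TICKS_PER_QUANTUM 4          /* assumed: clock interrupts per scheduling quantum */

static volatile int ticks_left = TICKS_PER_QUANTUM;

/* called from the clock device's interrupt handler on every tick */
void timer_tick(void) {
    if (--ticks_left <= 0) {
        ticks_left = TICKS_PER_QUANTUM;
        thread_yield();              /* quantum expired: preempt the running thread;
                                        the context switch happens before returning
                                        from the interrupt */
    }
}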

os/161 serial console:

offset size type desc
0 4 command and data character buffer
4 4 status writeIRQ (complete status)
8 4 status readIRQ

if a write is in progress, the device exhibits undefined behaviour if another write happens. (mutex needed for writing)

device drivers

eg. write character to tty

P(output device write semaphore)
write character to device data register
do:
    read writeIRQ register          // polling
while status is not 'completed'
V(output device write semaphore)

why use a semaphore? the V may happen in a different thread/context than the P (e.g. in the interrupt handler below), which a lock would not allow since only the owner may release it.

// use interrupts to avoid polling

// device driver write handler
P(output device write semaphore)
write character to device data register

// interrupt handler for serial device
write writeIRQ register to ack completion
V(output device write semaphore)

accessing devices:

mips/os161: each device is assigned to one of 32 64KB device slots. a device's registers and data buffers are mem-mapped into its assigned slot.

larger data transfer:

Week 10. July 13

persistent (non-volatile) storage

persistent storage is any device where data persists even when device is without power.

hard disks

spindle | r/w =============== platter (glass or porcelain, ferromagnetic-coated) head ---------+ | =============== platter =============== platter | tracks (concentric circles)

cost model for disk io: service time = seek time + rotational latency + transfer time

eg. a disk has total capacity of 2^32 bytes. disk has single platter with 2^20 tracks. each track has 2^8 sectors. disk operates at 10000RPM and has max seek time of 20ms.
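some quantities that follow directly from these numbers:

sector size                  = 2^32 B / (2^20 tracks * 2^8 sectors/track) = 2^4 = 16 bytes
time per rotation            = 60 s / 10000 = 6 ms
expected rotational latency  = 6 ms / 2 = 3 ms       (half a rotation on average)
transfer time for one sector = 6 ms / 2^8 ≈ 0.023 ms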

performance implications:

disk head scheduling

elevator algorithms (SCAN):

sys/161 disk controller:

offset size type desc
0 4 status # of sectors
4 4 status and command status
8 4 command sector number
12 4 status RPM
32768 512 data transfer buffer

writing:

// only one disk request at a time

// device driver write handler:
P(disk semaphore)
copy data from mem to device transfer buffer
write 'write' command to disk status register
// wait for request to complete
P(disk completion semaphore)
V(disk semaphore)

// interrupt handler
write disk status register to ack completion
V(disk completion semaphore)

reading:

// only one disk request at a time

// device driver read handler:
P(disk semaphore)
write target sector number to disk sector number register
write 'read' command to disk status register
// wait for request to complete
P(disk completion semaphore)
copy data from device transfer buffer to mem
V(disk semaphore)

// interrupt handler
write disk status register to ack completion
V(disk completion semaphore)

thread initiating r/w must wait until r/w is completed.

SSD

persistent ram

A3

Week 11. July 20

file system

basic file interfaces:

directory and file names

/root --> 2 +- bin/ --> 312 | +- prog1 --> 654 | +- prog2 --> 100 | +- misc/ --> 145 | +- test --> 5 +- foo/ --> 24 +- uu --> 654 +- docs/ --> 93

links:

multiple file systems

VSFS

persistent data: file data, metadata, directories and links, fs metadata

non-persistent: per process open file fd table (file handle, file position); system wide (open file table, cached copies of persistent data)

example:

inodes data +--+-----+--------------+ +-----------------------+ |su|i |d |I |I |I |I |I | |D |D |D |D |D |D |D |D | ... +--+-----+--------------+ +-----------------------+ 0 7 8 15

to know which i-nodes and blocks are free: bitmaps (the 'i' and 'd' blocks in the picture above: the i-node bitmap and the data-block bitmap)

superblock:

i-nodes:

(figure: i-node with direct, single-indirect, double-indirect, and triple-indirect block pointers)

eg. block size is 4 KB, pointer size is 4 B.

  1. inode has 12 direct pointers, how big is file we can access from direct pointers?
  2. inode has 1 single indirect pointer, how big can we access from single indirect pointers?
  3. inode has 1 double indirect pointer, how big can we access from double indirect pointers?
  4. inode has 1 triple indirect pointer, how big can we access from triple indirect pointers?
  5. how big a file can we store?
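worked answers (4 KB blocks and 4 B pointers give 4 KB / 4 B = 1024 pointers per indirect block):

1. direct:          12 * 4 KB                 = 48 KB
2. single indirect: 1024 * 4 KB               = 4 MB
3. double indirect: 1024 * 1024 * 4 KB        = 4 GB
4. triple indirect: 1024 * 1024 * 1024 * 4 KB = 4 TB
5. max file size   ≈ 48 KB + 4 MB + 4 GB + 4 TB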

eg. assume the fs uses a block size of 1024 bytes, a pointer size of 4 bytes, and that i-nodes include 9 direct pointers, 1 indirect pointer, and 1 double-indirect pointer. The fs has a block cache (initially empty) and an i-node cache.

char x[1000];
// assume that f is a descriptor referring to a
// large file that has already been opened
lseek(f, 10500, SEEK_SET);
write(f, x, 1000);
read(f, x, 1000);
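what this sets up (1024-byte blocks, 9 direct pointers, 1024 B / 4 B = 256 pointers per indirect block), assuming the question asks which file blocks each call touches:

byte offset 10500 is in block 10500 / 1024 = 10 (0-indexed)
direct pointers cover blocks 0..8; the single indirect pointer covers blocks 9..264
write(f, x, 1000): bytes 10500..11499 -> blocks 10 and 11, reached via the indirect block
read(f, x, 1000):  bytes 11500..12499 -> blocks 11 and 12, reached via the indirect block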

eg. reading from file (/foo/bar):

assuming the foo directory fits in a single block. no caching.

(on-disk structures involved: data bitmap, inode bitmap, root/foo/bar inodes, root/foo data, bar data blocks)

op          disk accesses, in order (standard VSFS access sequence)
open(bar)   read root inode -> read root data -> read foo inode -> read foo data -> read bar inode
read()      read bar inode -> read bar data[0] -> write bar inode (update last-access time)

Week 12. July 27

eg. create file (/foo/bar)

(on-disk structures involved: data bitmap, inode bitmap, root/foo/bar inodes, root/foo data, bar data blocks)

op           disk accesses, in order (standard VSFS access sequence)
create(bar)  read root inode -> read root data -> read foo inode -> read foo data ->
             read inode bitmap -> write inode bitmap -> write foo data ->
             read bar inode -> write bar inode -> write foo inode
write()      read bar inode -> read data bitmap -> write data bitmap ->
             write bar data[0] -> write bar inode

delete:

  1. remove entry from directory;
  2. remove file index from i-node table;
  3. mark file's data blocks free in free space index

how does a quick format work? clear the bitmaps (and i-nodes).

chaining

file system design

for a general-purpose file system, design it to be efficient for the common case

file system parameters:

failures

virtual machines

pros:

first attempt

guest OS vm host OS physical CPU +------+ +--------+ +-----+ +---------+ |os/161| --> |sys/161 | --> |linux| --> |intel cpu| +------+ +--------+ +-----+ +---------+

privileges

type 1 hypervisor: VM -> type 1 hypervisor -> CPU  (the hypervisor runs directly on the hardware)
type 2 hypervisor: VM -> type 2 hypervisor -> host OS -> CPU  (the hypervisor runs as a program on a host OS)

how does the hypervisor differentiate between the guest OS and a guest user program executing a privileged instruction?

virtual memory

io and devices

Opt. Aug 3

networking

5-layer model:

transport layer:

network layer:

link layer:

physical layer:

kernel's role in 5-layer model

int ping() {
    // all syscalls
    int sfd = socket(AF_INET, SOCK_STREAM, 0);
    bind(sfd, addr, addrlen);
    listen(sfd, backlog);
    send(sfd, msg, msglen, 0);
}

not (usually) a part of kernel: