COT 5611 Spring 2012 Operating Systems Design Principles

COT 5611 – Spring 2012
Operating Systems Design Principles
Dan C. Marinescu
Office: HEC 304
Office hours: M-Wd 5:00-6:00 PM

Lecture 22 – Monday April 2, 2012
• Reading assignment:
  - Chapter 9 from the on-line text
• Last time:
  - ADVANCE
  - SEQUENCE
  - TICKET
  - Events
  - Coordination with events
  - Virtual memory and multi-level memory management
9/3/2021 Lecture 22 2

Today
• Atomic actions
• All-or-nothing and before-or-after atomicity
• Applications of atomicity

Atomicity
• Atomicity: the ability to carry out an action involving multiple steps as an indivisible action; it hides the structure of the action from an external observer.
• All-or-nothing atomicity (AONA)
  - To an external observer (e.g., the invoker) an atomic action appears as if it either completes or has never taken place.
• Before-or-after atomicity (BOAA)
  - Allows several actions operating on the same resources (e.g., shared data) to act without interfering with one another.
  - To an external observer (e.g., the invoker) the atomic actions appear as if they completed either before or after each other.
• Atomicity
  - simplifies the description of the possible states of the system, as it hides the structure of a possibly complex atomic action;
  - allows us to treat systematically, and with the same strategy, two critical problems in system design and implementation: (a) recovery from failures and (b) coordination of concurrent activities.

Atomicity in computer systems
1. Hardware: interrupt and exception handling (AONA) + register renaming (BOAA)
2. OS: SVCs (AONA) + non-sharable device (e.g., printer) queues (BOAA)
3. Applications: layered design (AONA) + process coordination (BOAA)
4. Database: updating records (AONA) + sharing records (BOAA)
• Example: exception handling when one of the following events occurs
  - Hardware faults
  - External events
  - Program exception
  - Fair-share scheduling
  - Preemptive scheduling when priorities are involved
  - Process termination to avoid deadlock
  - User-initiated process termination
• Register renaming: avoids unnecessary serialization of program operations imposed by the reuse of registers by those operations. High-performance CPUs have more physical registers than may be named directly in the instruction set, so they rename registers in hardware to achieve additional parallelism.
  Example: the last three instructions reuse r1; renaming it to r2 lets them proceed independently of the first three.
    r1 ← m(1000)
    r1 ← r1 + 5
    m(1000) ← r1
    r1 ← m(2000)        renamed: r2 ← m(2000)
    r1 ← r1 + 8         renamed: r2 ← r2 + 8
    m(2000) ← r1        renamed: m(2000) ← r2

Atomicity in databases and application software
• Recovery from system failures and coordination of multiple activities are not possible if actions are not atomic.
• Database example: a procedure to transfer from a debit account (A) to a credit account (B)
    Procedure TRANSFER (debit_account, credit_account, amount)
        GET (temp, A)
        temp ← temp – amount
        PUT (temp, A)
        GET (temp, B)
        temp ← temp + amount
        PUT (temp, B)
  What if: (a) the system fails after the first PUT; (b) multiple transactions on the same account take place?
• Layered application software example: a calendar program with three layers of interpreters:
  - Calendar program
  - JVM
  - Physical layer
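
Case (a) can be made concrete with a minimal sketch. It assumes the accounts live in a Python dictionary, and the `Crash` exception and the `crash_after_first_put` flag are hypothetical devices for simulating the failure; they are not part of the slides.

```python
class Crash(Exception):
    """Simulated system failure (hypothetical, for illustration only)."""


def transfer(accounts, debit_account, credit_account, amount,
             crash_after_first_put=False):
    # GET, subtract, PUT on the debit account
    temp = accounts[debit_account]
    temp = temp - amount
    accounts[debit_account] = temp
    if crash_after_first_put:
        raise Crash("system failed after the first PUT")
    # GET, add, PUT on the credit account
    temp = accounts[credit_account]
    temp = temp + amount
    accounts[credit_account] = temp


accounts = {"A": 100, "B": 50}
try:
    transfer(accounts, "A", "B", 30, crash_after_first_put=True)
except Crash:
    pass

# The debit happened, the credit did not: 30 units have vanished.
total_after_crash = accounts["A"] + accounts["B"]
```

Because the procedure is not all-or-nothing, the observer sees a state (A debited, B not credited) that corresponds to neither "completed" nor "never took place".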

All-or-nothing atomicity
• AONA is required to
  (1) handle interrupts (e.g., a page fault in the middle of a pipelined instruction). AONA must be retrofitted at the machine language interface: if every machine instruction is an all-or-nothing action, then the OS can save, as the next instruction, the one where the page fault occurred. There are additional complications with a user-supplied exception handler.
  (2) handle supervisor calls (SVCs). An SVC requires a kernel action to change the PC, the mode bit (from user to kernel), and the code to carry out the required function. The SVC should appear as an extension of the hardware.
• Design solutions: a typewriter driver activated by a user-issued SVC, READ.
  - Implement the "nothing" option: blocking read; when no input is present, reissue the READ as the next instruction. This solution allows a user to supply its own exception handler.
  - Implement the "all" option: non-blocking read; if no input is available, return control to the user program with a zero-length input.
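
The two READ designs can be contrasted in a small sketch. The `Typewriter` class, the function names, and the integer program counter are all hypothetical stand-ins chosen for illustration; a real driver would live in the kernel.

```python
from collections import deque


class Typewriter:
    """Hypothetical input device with a buffer of pending lines."""

    def __init__(self):
        self.buffer = deque()

    def type_line(self, text):
        self.buffer.append(text)


def read_nonblocking(device):
    # "All" option: the READ always completes; with nothing pending
    # it returns a zero-length input.
    return device.buffer.popleft() if device.buffer else ""


def read_blocking_step(device, pc):
    # "Nothing" option: one attempt at READ, returning (result, next_pc).
    # With no input, nothing happens and the PC still points at the READ,
    # so the same instruction is reissued.
    if not device.buffer:
        return None, pc
    return device.buffer.popleft(), pc + 1


dev = Typewriter()
empty_result = read_nonblocking(dev)          # ""
retry = read_blocking_step(dev, pc=10)        # (None, 10)
dev.type_line("hello")
done = read_blocking_step(dev, pc=10)         # ("hello", 11)
```

In both variants the caller never observes a half-finished READ.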

Before-or-after atomicity
• Two approaches to concurrent action coordination:
  - Sequence coordination, e.g., "action A should occur before B": strict ordering.
  - BOAA: the effect of A and B is the same whether A occurs before B or B before A: non-strict ordering.
• BOAA is more general than sequence coordination.
• Example: two transactions operating on account A; each performs a GET and a PUT.
  - Six possible sequences of actions: (G1, P1, G2, P2), (G2, P2, G1, P1), (G1, G2, P1, P2), (G1, G2, P2, P1), (G2, G1, P1, P2), (G2, G1, P2, P1).
  - Only the first two lead to correct results.
  - Solution: the sequence Gi, Pi should be atomic.
• Correctness condition for coordination: every possible result is guaranteed to be the same as if the actions were applied one after another in some order.
• Before-or-after atomicity guarantees the correctness of coordination; indeed, it serializes the actions.
• Stronger correctness requirements are sometimes necessary:
  - External time consistency, e.g., in banking, transactions should be processed in the order they are issued.
  - Sequential consistency, e.g., instruction reordering should not affect the result.
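
The claim about the six sequences can be checked by brute force. The sketch below assumes, purely for illustration, that transaction 1 adds 10 and transaction 2 adds 20 to an initial balance of 100; the function names are hypothetical.

```python
from itertools import permutations

STEPS = ("G1", "P1", "G2", "P2")


def valid_sequences():
    # Keep only orderings in which each transaction's GET precedes its own PUT.
    return [s for s in permutations(STEPS)
            if s.index("G1") < s.index("P1") and s.index("G2") < s.index("P2")]


def run(sequence, balance=100, deltas=(10, 20)):
    temp = {}                                   # private copy per transaction
    for step in sequence:
        kind, who = step[0], int(step[1])
        if kind == "G":
            temp[who] = balance                 # GET: read the shared balance
        else:
            balance = temp[who] + deltas[who - 1]   # PUT: write back the update
    return balance


sequences = valid_sequences()
correct = [s for s in sequences if run(s) == 100 + 10 + 20]
```

Here `sequences` has six entries and `correct` contains only the two serial orders; in the other four, one transaction overwrites the other's update.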

Common strategy and side-effects of atomicity
• The common strategy for BOAA and AONA: hide the internal structure of a complex action; prevent an external observer from discovering the structure and the implementation of the atomic action.
• Atomic actions can have "good" (benevolent) side effects:
  - An audit log records the cause of a failure and the recovery steps for later analysis.
  - Performance optimization: when a record is added to a file, the data management system may restructure/reorganize the file to improve the access time.

Bootstrapping for the development of atomic actions
• Bootstrapping informally means building complex actions from simple ones.
• Bootstrapping steps:
  - Find a systematic way to reduce a general problem to a particular one.
  - Solve the particular problem.
  - See how this solution can be generalized.

All-or-nothing disk storage
• The commit point in CAREFUL_PUT will be discussed later.
• The three sectors S1, S2, and S3 form a virtual sector.
• Writes and reads are sequential; each waits for the previous one to complete.
• We consider only system crashes; the disk does not fail.
  - If the system crashes during a PUT, only one of the sectors is affected.
  - On a GET we compare the contents of the sectors.
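
A minimal sketch of the three-sector scheme follows. It assumes that GET returns S1 when S1 and S2 agree and S3 otherwise; the `VirtualSector` class, the `Crash` exception, and the `crash_while_writing` parameter are hypothetical devices for simulating a crash during a PUT.

```python
class Crash(Exception):
    """Simulated system crash (hypothetical)."""


class VirtualSector:
    def __init__(self, value):
        self.sectors = [value, value, value]        # S1, S2, S3

    def put(self, data, crash_while_writing=None):
        # Write S1, S2, S3 one after another; a crash damages only the
        # sector being written at that moment.
        for i in range(3):
            if crash_while_writing == i + 1:
                self.sectors[i] = object()          # garbage, equal to nothing else
                raise Crash(f"crash while writing S{i + 1}")
            self.sectors[i] = data

    def get(self):
        s1, s2, s3 = self.sectors
        return s1 if s1 == s2 else s3


def value_after_crash(sector_number):
    v = VirtualSector("old")
    try:
        v.put("new", crash_while_writing=sector_number)
    except Crash:
        pass
    return v.get()


results = [value_after_crash(n) for n in (1, 2, 3)]
```

Starting from three identical sectors, a single crash leaves GET returning either the old or the new value, never garbage.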

Is there anything wrong with this implementation?
• We assumed that the three sectors have the same contents at the beginning of an operation. A previous failure would invalidate this assumption.
• Example:
  - the running thread is interrupted while writing S3; data3 is garbage;
  - the next GET will find data1 = data2, so data1 will be used;
  - a new thread calls PUT but is interrupted while writing S2;
  - the next call to GET will find that data1 ≠ data2, so it will use data3, which is garbage.
• The fix: guarantee that the three sectors are identical before updating.

The implementation of the all-or-nothing disk
• Assumes that only one thread at a time attempts to use PUT and GET; it does not implement before-or-after atomicity.
• CHECK_AND_REPAIR is idempotent: it does not have any side effects and can be interrupted and restarted.
• The commit point occurs after writing to S2 is finished; after that point the new data is available to a GET. Before writing to S2 starts, the old data is available to a new GET.
• Writing to the three sectors is not done to improve durability; it is important that the writing is done sequentially.
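
The repair step can be sketched as follows. The slide with the actual procedures is a figure that did not survive, so the branch order below is an assumption modeled on the three-copy scheme described above. The list representation and the `crash_while_writing` parameter are hypothetical.

```python
def check_and_repair(s):
    """Make S1, S2, S3 identical; s is a three-element list [S1, S2, S3]."""
    if s[0] == s[1] == s[2]:
        return                      # nothing to repair
    if s[0] == s[1]:
        s[2] = s[0]                 # S3 is damaged or stale: copy S1 into it
    elif s[1] == s[2]:
        s[0] = s[1]                 # S1 is damaged or ahead: restore it from S2
    else:
        s[1] = s[0]                 # S2 is damaged: propagate S1
        s[2] = s[0]


def all_or_nothing_get(s):
    check_and_repair(s)
    return s[0]


def all_or_nothing_put(s, data, crash_while_writing=None):
    check_and_repair(s)             # guarantee identical sectors before updating
    for i in range(3):
        if crash_while_writing == i + 1:
            s[i] = object()         # garbage, equal to nothing else
            return False            # simulated crash
        s[i] = data
    return True


# The failure scenario from the previous slide, now handled correctly.
sectors = ["old", "old", "old"]
all_or_nothing_put(sectors, "new", crash_while_writing=3)
first_read = all_or_nothing_get(sectors)
all_or_nothing_put(sectors, "newer", crash_while_writing=2)
second_read = all_or_nothing_get(sectors)
```

Calling `check_and_repair` again on an already repaired sector changes nothing, which is what makes it safe to restart after an interruption.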

Generalization: the golden rule of atomicity
• An all-or-nothing action should consist of
  - A pre-commit phase: it should be possible to back out of it without leaving any trace.
  - A post-commit phase: this phase should be able to run to completion.
• The semantics for programming all-or-nothing actions

Dos and don'ts for pre-commit and post-commit phases
• Carry out all steps necessary to prepare the post-commit phase (which should run to completion), e.g., check permissions, bring in all pages that may be needed, mount removable media, allocate stack space; do not expose any results or carry out actions that are irreversible.
• Shared resources allocated during the pre-commit phase cannot be released until after the commit point.
• The commit step should be the last step of an all-or-nothing action.

Shadow copies: a technique to implement AONA
• Used by programs that modify existing files, e.g., text editors, calendar management programs, compilers, etc.
• Pre-commit: create a duplicate working copy of the file and make changes to the working copy.
• Commit point: exchange the working copy with the original; use atomic actions supported by the OS, e.g., RENAME.
• Post-commit: clean up, release the space of the working copy.
• Note that the new copy is not available before the RENAME.
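
The three phases map naturally onto a few standard-library calls. This is a minimal sketch, assuming a POSIX-like file system where `os.replace` renames atomically within one file system; the function name `atomic_update` and the `transform` callback are hypothetical.

```python
import os
import tempfile


def atomic_update(path, transform):
    """Rewrite the file at `path` with transform(old_text), all-or-nothing."""
    with open(path, "r", encoding="utf-8") as f:
        old_text = f.read()
    directory = os.path.dirname(os.path.abspath(path))
    # Pre-commit: build the working copy next to the original so that both
    # are on the same file system.
    fd, working_copy = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            f.write(transform(old_text))
            f.flush()
            os.fsync(f.fileno())
        # Commit point: atomically exchange the working copy for the original.
        os.replace(working_copy, path)
    except BaseException:
        # Back out of the pre-commit phase without leaving a trace.
        if os.path.exists(working_copy):
            os.remove(working_copy)
        raise
```

With a rename-based commit, the post-commit cleanup is largely implicit: once the new name is in place, the file system reclaims the space of the replaced original.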

Version histories
• Maintain the history of a variable; do not delete older versions.
• Accept tentative values but ignore them until they are committed.
• A journal storage manager
  - Allows the creation of a more complex storage model than cell storage (where an item fits into one cell).
  - Provides atomic actions to the end user.

New procedures for a journal storage

The state transitions of a journal storage system

Read and write procedures: caller_id is the action identifier returned by NEW_ACTION

The attributes of the read and write
• If the current all-or-nothing action fails before the COMMIT, then the new version is not visible to the next read.
• They support writing an entire record with multiple fields. If, say, a record has 13 fields and action 1234 fails, then none of the fields written by that action is visible. A read will access the version written by the last committed action, say action 1232.

Example: a thread has created a new record with data_id=A, new_value=75, and caller_id=1794. The procedure READ_CURRENT_VALUE will return the value 24 for A and ignore versions that are aborted or pending.
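
A minimal sketch of journal storage reproduces this behavior. The procedure names follow the slides (NEW_ACTION, WRITE_NEW_VALUE, READ_CURRENT_VALUE, COMMIT, ABORT); the `JournalStorage` class, its in-memory dictionaries, and the sequential action ids are assumptions made for illustration.

```python
import itertools

PENDING, COMMITTED, ABORTED = "PENDING", "COMMITTED", "ABORTED"


class JournalStorage:
    def __init__(self):
        self.outcome = {}       # action id -> PENDING / COMMITTED / ABORTED
        self.history = {}       # data_id -> list of (value, action id), oldest first
        self._ids = itertools.count(1)

    def new_action(self):
        action_id = next(self._ids)
        self.outcome[action_id] = PENDING
        return action_id

    def write_new_value(self, data_id, new_value, caller_id):
        # Append a tentative version; older versions are never deleted.
        self.history.setdefault(data_id, []).append((new_value, caller_id))

    def read_current_value(self, data_id, caller_id):
        # caller_id is kept for parity with the slides; this sketch does not use it.
        # Scan from the newest version back, skipping pending and aborted ones.
        for value, action_id in reversed(self.history.get(data_id, [])):
            if self.outcome[action_id] == COMMITTED:
                return value
        raise KeyError(f"no committed version of {data_id!r}")

    def commit(self, action_id):
        self.outcome[action_id] = COMMITTED

    def abort(self, action_id):
        self.outcome[action_id] = ABORTED


js = JournalStorage()
first = js.new_action()
js.write_new_value("A", 24, first)
js.commit(first)
second = js.new_action()
js.write_new_value("A", 75, second)             # still pending
reader = js.new_action()
seen_while_pending = js.read_current_value("A", reader)
js.commit(second)
seen_after_commit = js.read_current_value("A", reader)
```

The tentative value 75 becomes visible only once its action commits; had the action aborted, readers would keep seeing 24.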

An all-or-nothing transfer using journal storage
• Note: the transaction is not before-or-after. It checks whether there are enough funds in the debit_account. The order of the steps is unconstrained.
• Problems: updates to the version history and changes to the outcome must be all-or-nothing. But these can be done by overwriting a single cell.
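
The transfer can be sketched on top of a stripped-down stand-in for journal storage. The module-level dictionaries, the `open_account` helper, and the choice to abort when the debit balance would become negative are assumptions for illustration; the slide's own procedure is a figure that is not reproduced here.

```python
outcome = {}     # action id -> "PENDING" | "COMMITTED" | "ABORTED"
history = {}     # account -> list of (value, action id), oldest first


def new_action():
    action_id = len(outcome) + 1
    outcome[action_id] = "PENDING"
    return action_id


def write_new_value(data_id, value, caller_id):
    history.setdefault(data_id, []).append((value, caller_id))


def read_current_value(data_id, caller_id):
    # Latest committed version; tentative and aborted versions are skipped.
    for value, action_id in reversed(history[data_id]):
        if outcome[action_id] == "COMMITTED":
            return value
    raise KeyError(data_id)


def open_account(name, balance):
    setup = new_action()
    write_new_value(name, balance, setup)
    outcome[setup] = "COMMITTED"


def transfer(debit_account, credit_account, amount):
    my_id = new_action()
    xvalue = read_current_value(debit_account, my_id) - amount
    write_new_value(debit_account, xvalue, my_id)
    yvalue = read_current_value(credit_account, my_id) + amount
    write_new_value(credit_account, yvalue, my_id)
    if xvalue >= 0:
        outcome[my_id] = "COMMITTED"    # commit: overwrite a single cell
        return True
    outcome[my_id] = "ABORTED"          # abort: both tentative versions stay invisible
    return False


open_account("A", 100)
open_account("B", 50)
ok = transfer("A", "B", 30)
rejected = transfer("A", "B", 500)
balances = (read_current_value("A", 0), read_current_value("B", 0))
```

Both new versions become visible together at the single assignment to `outcome[my_id]`, which is why changing the outcome has to be a one-cell, all-or-nothing update.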

Atomicity logs and journal storage
• Log: an interleaved version history of all variables; the information about each update forms a record appended at the end of the log. Access to the log is easy: only the pointer to the last record is needed.
• Combine the all-or-nothing atomicity of journal storage with the speed of cell storage. Two steps:
  - Log: carry out the change in the journal storage.
  - Install: change the cell storage by overwriting the previous version of each record.
• The log is the authoritative record of the outcome of an action; the cell storage can be reconstructed using the log. The log should reside in non-volatile memory.

Types of logs
1. Atomicity log. Allows a crash recovery procedure to undo all-or-nothing actions that didn't complete, or to finish all-or-nothing actions that committed but didn't record all of their effects.
2. Archive log. Many uses for archive information: watching for failure patterns, reviewing the actions of the system preceding and during a security breach, recovery from application-layer mistakes, fraud control, and compliance with record-keeping requirements.
3. Performance log. Most mechanical storage media have much higher performance for sequential access than for random access. Since logs are written sequentially, they are ideally suited to such storage media. When combined with a cache that eliminates most disk reads, a performance log can provide a significant speed-up.
4. Durability log. If the log is stored on a non-volatile medium (e.g., magnetic tape) that fails in ways and at times that are independent from the failures of the cell storage medium (e.g., magnetic disk), then the copies of data in the log are replicas that can be used as backup in case of damage to the copies of the data in cell storage. Any log that uses a non-volatile medium also helps support durability.

Logging configurations

Logging protocols
• Reason for the write-ahead-log protocol: logging appends, while install overwrites.
• A log record contains:
  - The id of the all-or-nothing action performing the update.
  - The do (or redo) action: the component action that can perform the install if the system crashes before the install.
  - The undo action: the component action that can reverse the effects if the system crashes during the install.
• Four types of log records:
  1. BEGIN: NEW_ACTION writes this record and records the action id.
  2. CHANGE: written by the pre-commit phase.
  3. OUTCOME: written by the COMMIT or by the ABORT procedures.
  4. END: the final step of an action.
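
The record types and the write-ahead discipline can be sketched together. The `LogRecord` fields, the in-memory `log` list standing in for non-volatile storage, and the `logged_write` helper are assumptions for illustration; redo and undo are represented simply as the new and old values.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class LogRecord:
    kind: str                       # "BEGIN", "CHANGE", "OUTCOME", or "END"
    action_id: int
    data_id: Optional[str] = None   # CHANGE only
    redo: Any = None                # CHANGE only: value to install
    undo: Any = None                # CHANGE only: value to restore
    status: Optional[str] = None    # OUTCOME only: "COMMITTED" or "ABORTED"


log = []                            # append-only; stands in for non-volatile storage
cell_storage = {"A": 100}


def logged_write(action_id, data_id, new_value):
    # Write-ahead: append the CHANGE record first, then overwrite the cell.
    # The append destroys nothing; the overwrite does, so it must come second.
    log.append(LogRecord("CHANGE", action_id, data_id,
                         redo=new_value, undo=cell_storage.get(data_id)))
    cell_storage[data_id] = new_value


log.append(LogRecord("BEGIN", 1))
logged_write(1, "A", 70)
log.append(LogRecord("OUTCOME", 1, status="COMMITTED"))
log.append(LogRecord("END", 1))
```

If a crash hits between the append and the overwrite, the CHANGE record still holds enough information to either redo or undo the install.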

Example of log records

Example: all-or-nothing TRANSFER with logging

Recovery procedures
• We need recovery procedures in case of a system crash.
• Assume an in-core database.
  - The log is not affected, as it resides in non-volatile memory.
  - Abandon the in-core database and the all-or-nothing actions in progress.
• Two steps:
  - Scan the log backwards and identify all actions with an OUTCOME record showing that the action has been COMMITTED; call them winners.
  - Scan the log forward and perform the REDO actions of every winner whose OUTCOME record shows COMMITTED.
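
The two scans can be sketched as follows, assuming log records are plain dictionaries and that a redo action simply installs a value; the record layout and the `recover` function name are hypothetical.

```python
def recover(log):
    """Rebuild an in-core database from the log after a crash."""
    # Backward scan: collect the winners, i.e. actions whose OUTCOME
    # record says COMMITTED.
    winners = set()
    for record in reversed(log):
        if record["kind"] == "OUTCOME" and record["status"] == "COMMITTED":
            winners.add(record["action_id"])
    # Forward scan: redo every CHANGE of a winner, in log order.
    database = {}
    for record in log:
        if record["kind"] == "CHANGE" and record["action_id"] in winners:
            database[record["data_id"]] = record["redo"]
    return database


log = [
    {"kind": "BEGIN", "action_id": 1},
    {"kind": "CHANGE", "action_id": 1, "data_id": "A", "redo": 70},
    {"kind": "CHANGE", "action_id": 1, "data_id": "B", "redo": 80},
    {"kind": "OUTCOME", "action_id": 1, "status": "COMMITTED"},
    {"kind": "BEGIN", "action_id": 2},
    {"kind": "CHANGE", "action_id": 2, "data_id": "A", "redo": 0},
    # crash: action 2 never logged an OUTCOME record
]
database = recover(log)
```

Action 2 was in progress at the crash, so its change is abandoned; running `recover` a second time on the same log yields the same database.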
