SQCK: A Declarative File System Checker
Haryadi S. Gunawi, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau
University of Wisconsin-Madison
OSDI ’08, December 9, 2008

Corrupt file systems
• File systems
  − Store massive amounts of data
  − Must be reliable
• Corrupted file system images
  − Due to hardware errors, file system bugs, etc.
  − Need to be repaired as soon as possible

Who should repair?
• Does journaling (write-ahead log) help?
  − No, only for crashes
• Does the file system repair itself online?
  − No, not enough machinery
• Fsck: the last line of defense
  − It’s a “must have” utility
    − XFS: “no need fsck ever”, but deploys fsck at the end
  − Must be fully reliable

But … fsck is complex
• Fsck has a big task
  − Turn any corrupt image into a consistent image
  − E.g. check if a data block is shared by two inodes
• How are they implemented?
  − Written in C, hence hard to reason about
  − Large and complex
    − ext2 fsck: 150 checks in 16 KLOC
    − XFS fsck: 340 checks in 22 KLOC
  − Hundreds of cluttered if-check statements
• Bottom line: fsck code is “untouchable”
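As a rough illustration of the kind of cross-check mentioned above (a data block shared by two inodes), here is a minimal sketch in Python. The `inodes` mapping and the function name `find_shared_blocks` are hypothetical toy stand-ins, not e2fsck data structures or code.

```python
from collections import defaultdict

def find_shared_blocks(inodes):
    """Return {block: sorted inode numbers} for blocks claimed by more than one inode."""
    owners = defaultdict(set)
    for ino, blocks in inodes.items():
        for blk in blocks:
            owners[blk].add(ino)
    return {blk: sorted(inos) for blk, inos in owners.items() if len(inos) > 1}

# Hypothetical toy metadata: inode number -> data blocks it points to.
inodes = {12: [100, 101], 13: [102], 14: [101, 103]}
shared = find_shared_blocks(inodes)
print(shared)  # {101: [12, 14]}
```

Even this single check needs a global view of all inodes, which hints at why hundreds of such checks written by hand become hard to reason about.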

Two Questions
• Are current checkers really reliable?
• If not, how should we build robust checkers?

e2fsck is unreliable
• Analyze e2fsck (ext2 file system checker)
• Findings:
  − Inconsistent repair
    − The file system becomes unreadable
  − Consistent but not “correct”
    − Fsck deletes valid directory entries
    − Fsck loses a huge number of files

SQCK
• Lesson: Complexity is the enemy of reliability
  − Big task + bad design → complexity → unreliability
  − Need a higher-level approach for simplicity
• SQCK (SQL-based Fsck)
  − Use a declarative query language to write checks
  − Put simply: write fewer lines of code
• Evaluation
  − Simple and reliable: e2fsck in 150 queries (vs. 16 KLOC of C)
  − More: Great flexibility and reasonable performance

Outline
• Introduction
• Analysis of e2fsck
• SQCK Design
• SQCK Evaluation
• Conclusion

Methodology
• E2fsck task: cross-check all ext2 metadata
  − An indirect pointer should not point to the superblock
  − A subdir should only be accessible from one directory
• Inject single corruption
  − Observe how e2fsck repairs a single corruption
  − Only corrupt on-disk pointers
    − Corrupt an indirect pointer to point to the superblock
    − Corrupt a directory entry to point to another directory
  − Usually, a corrupt pointer is simply cleared to zero

Inconsistent (Out-of-order) Repair
[Figure: an inode whose indirect pointer (*ind) has been corrupted to point to the superblock. Ideal fsck: (1) check the bad indirect pointer and clear it to 0, then (2) check the indirect content. e2fsck: runs the same two checks in the opposite order, checking the indirect content first (while *ind still points to the superblock) and only then checking the bad indirect pointer.]

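Consistent but Incorrect Repair (1)
[Figure: directory trees rooted at / with directories a1 and b1 and subdirectories a2 and b2, comparing the repair made by an ideal fsck with the repair made by e2fsck, which involves lost+found (LF).]
• Kidnapping problem!
• E2fsck does not use all available information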

Result Summary
• Four problems
  − Inconsistent
  − Information-incomplete
  − Policy-inconsistent
  − Insecure
• E2fsck does not handle all corruptions
  − “Warning: Programming bug in e2fsck! Or some bonehead (you) is checking a mounted (live) filesystem.”
• Not simple implementation bugs
  − Difficult to combine available information
  − Difficult to ensure correct ordering

Outline
• Introduction
• Analysis
• SQCK Design
• SQCK Evaluation
• Conclusion

Fsck Properties
• Hundreds of checks
• Complex cross-checks
• Taxonomy of checks in e2fsck:

                          Single instance   Multiple instances
    Same structure              63                 11
    Different structures        12                 35

• Must be ordered correctly
[Figure: schematic of structures A { x, y } and B { m, n } illustrating cross-checks over single and multiple instances of the same or different structures.]

A Declarative Approach
• Lesson: Complexity is the enemy of reliability
• SQCK
  − Use a declarative query language (e.g. SQL), why?
  − It is declarative: high-level intent is clear
  − Fit for cross-checking massive information
• Goals achieved
  − Simple: e2fsck in 150 queries (vs. 16 KLOC of C)
  − Reliable: Each check/query is easy to understand
  − Flexible: Plug in/out different queries

Using SQCK
• Take a fs image
• Load metadata to db tables
  − Temporary tables
  − Ex: InodeTable, GroupDescTable, DirEntryTable
• Run checks and repairs (in the form of queries)
• Flush any modification, and delete tables
[Figure: File system image → Scanner → Loader → Database tables → Checks + Repairs → Flush → File system image]
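The load, check/repair, flush cycle above can be sketched end to end with Python's built-in sqlite3 module. This is only a toy under stated assumptions: the slides say the first generation of SQCK uses MySQL, the `image` dictionary stands in for a scanned file system image, and the column names `gdNum`, `firstBlock`, `lastBlock`, and `dirty`, the `CHECKS` list, and `run_sqck` are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical metadata: (group number, first block, last block, block bitmap location).
image = {"group_desc": [(0, 1, 8192, 3), (1, 8193, 16384, 20000)]}  # group 1 is corrupt

# Each check pairs a detection query with a repair query, so checks can be plugged in or out.
OUTSIDE = "blockBitmap NOT BETWEEN firstBlock AND lastBlock"
CHECKS = [
    ("block bitmap outside its group",
     "SELECT gdNum FROM GroupDescTable WHERE " + OUTSIDE + " ORDER BY gdNum",
     # Clearing the field to 0 mirrors what the C code on the slides does.
     "UPDATE GroupDescTable SET blockBitmap = 0, dirty = 1 WHERE " + OUTSIDE),
]

def run_sqck(image):
    db = sqlite3.connect(":memory:")
    # 1. Scanner + loader: copy metadata into a temporary table.
    db.execute(
        "CREATE TEMP TABLE GroupDescTable (gdNum INTEGER PRIMARY KEY, "
        "firstBlock INTEGER, lastBlock INTEGER, blockBitmap INTEGER, "
        "dirty INTEGER DEFAULT 0)")
    db.executemany(
        "INSERT INTO GroupDescTable (gdNum, firstBlock, lastBlock, blockBitmap) "
        "VALUES (?, ?, ?, ?)", image["group_desc"])
    # 2. Run every check and its repair, in list order.
    report = {}
    for name, check_sql, repair_sql in CHECKS:
        report[name] = [row[0] for row in db.execute(check_sql).fetchall()]
        db.execute(repair_sql)
    # 3. Flush only modified rows back to the image, then delete the table.
    dirty_rows = db.execute(
        "SELECT gdNum, firstBlock, lastBlock, blockBitmap "
        "FROM GroupDescTable WHERE dirty = 1").fetchall()
    for row in dirty_rows:
        image["group_desc"][row[0]] = row
    db.execute("DROP TABLE GroupDescTable")
    db.close()
    return report

report = run_sqck(image)
print(report)  # {'block bitmap outside its group': [1]}
```

Keeping the checks in an ordered list makes the ordering requirement explicit and lets a check be added or removed without touching the driver.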

Declarative check (example 1)
• Cross-checking a single instance of a structure
• “Find block bitmap that is not located within its block group”

e2fsck (C):

    first_block = sb->s_first_data_block;
    last_block = first_block + blocks_per_group;
    for (i = 0, gd = fs->group_desc; i < fs->group_desc_count; i++, gd++) {
        /* the last group may be shorter than the others */
        if (i == fs->group_desc_count - 1)
            last_block = sb->s_blocks_count;
        if ((gd->bg_block_bitmap < first_block) ||
            (gd->bg_block_bitmap >= last_block)) {
            px.blk = gd->bg_block_bitmap;
            if (fix_problem(BB_NOT_GROUP, ...))
                gd->bg_block_bitmap = 0;
        }
        ...
    }

SQCK (SQL):

    SELECT *
    FROM GroupDescTable G
    WHERE G.blockBitmap NOT BETWEEN G.start AND G.end

Declarative check (example 2)
• Cross-checking multiple instances of the same structure
• “Find false parents (i.e. directory entries that point to a subdirectory that already belongs to another directory)”
  − Must read all directory entries in dir data blocks
  − Wrong implementation in e2fsck (the kidnapping problem)

Declarative check (example 2)

e2fsck (C):

    if ((dot_state > 1) &&
        (ext2fs_test_inode_bitmap(ctx->inode_dir_map, dirent->inode))) {
        // e2fsck_get_dir_info is 20 lines long
        subdir = e2fsck_get_dir_info(dirent->inode);
        ...
        if (subdir->parent) {
            // a parent was already recorded: treat this entry as the false one
            if (fix_problem(LINK_DIR, ...)) {
                dirent->inode = 0;
                goto next;
            }
        } else {
            subdir->parent = ino;
        }
    }

Declarative check (example 2)

SQCK (SQL):

    -- returns the false parent(s)
    SELECT F.*
    FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
    WHERE
      -- P says C is its child
      P.entry_num >= 3 AND P.entry_ino = C.ino AND
      -- and C says P is its parent
      C.entry_num = 2 AND C.entry_ino = P.ino AND
      -- F also says C is its child
      F.entry_num >= 3 AND F.entry_ino = C.ino AND
      F.ino <> P.ino

[Figure: directories F and P both pointing to child C]
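To see the false-parent query at work, the sketch below runs it with Python's built-in sqlite3 module on a toy directory tree. The inode numbers and the three-column layout of `DirEntryTable` (`ino` is the directory holding the entry, `entry_num` its position with 1 for "." and 2 for "..", `entry_ino` the inode it points to) are assumptions for illustration; the real SQCK schema may differ.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE DirEntryTable (ino INTEGER, entry_num INTEGER, entry_ino INTEGER)")

# Toy tree: / (2) contains a1 (11) and b1 (12); a1 contains a2 (21); b1 contains b2 (22).
rows = [
    (2, 1, 2), (2, 2, 2), (2, 3, 11), (2, 4, 12),   # root
    (11, 1, 11), (11, 2, 2), (11, 3, 21),           # a1 -> a2
    (12, 1, 12), (12, 2, 2), (12, 3, 21),           # b1: corrupted, should point to 22
    (21, 1, 21), (21, 2, 11),                       # a2: ".." says a1 is the parent
    (22, 1, 22), (22, 2, 12),                       # b2: ".." says b1 is the parent
]
db.executemany("INSERT INTO DirEntryTable VALUES (?, ?, ?)", rows)

FALSE_PARENTS = """
SELECT F.ino, F.entry_num, F.entry_ino
FROM DirEntryTable P, DirEntryTable C, DirEntryTable F
WHERE P.entry_num >= 3 AND P.entry_ino = C.ino   -- P says C is its child
  AND C.entry_num = 2  AND C.entry_ino = P.ino   -- C's ".." agrees with P
  AND F.entry_num >= 3 AND F.entry_ino = C.ino   -- F also claims C
  AND F.ino <> P.ino
"""

false_parents = db.execute(FALSE_PARENTS).fetchall()
print(false_parents)  # [(12, 3, 21)]
```

Because the query consults the child's ".." entry, it flags b1's entry as the false parent regardless of the order in which directories are visited, which is the information the C version on the previous slide does not use.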

Declarative Repairs
• Running declarative checks is only part of the problem
  − Must also perform the declarative repairs
• A repair = An update query
  − Some repairs simply update a few fields

        ... SET T.field = newValue, T.dirty = 1

• A repair = A series of queries
  − Ex: Reconnect an orphan directory to the lost+found directory
  − Combine a series of queries with C code
    − All repairs are written in SQL
    − C code is only used for connecting them
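A multi-query repair of the kind described above might look like the following sketch, which reconnects an orphan directory to lost+found. It uses Python's built-in sqlite3 module, with Python playing the role the slides assign to C glue code. The table layout, the `dirty` column on `DirEntryTable`, the inode numbers, and the query names are all hypothetical.

```python
import sqlite3

ROOT_INO, LOST_FOUND_INO = 2, 11  # hypothetical inode numbers for this toy image

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE DirEntryTable (ino INTEGER, entry_num INTEGER, "
    "entry_ino INTEGER, dirty INTEGER DEFAULT 0)")
db.executemany(
    "INSERT INTO DirEntryTable (ino, entry_num, entry_ino) VALUES (?, ?, ?)",
    [
        (2, 1, 2), (2, 2, 2), (2, 3, 11),  # root: ".", "..", lost+found
        (11, 1, 11), (11, 2, 2),           # lost+found: ".", ".."
        (30, 1, 30), (30, 2, 99),          # orphan: no directory entry points to it
    ],
)

# Step 1: directories (other than root) that no parent entry points to.
FIND_ORPHANS = """
SELECT D.ino FROM DirEntryTable D
WHERE D.entry_num = 1 AND D.ino <> ?
  AND NOT EXISTS (SELECT 1 FROM DirEntryTable P
                  WHERE P.entry_num >= 3 AND P.entry_ino = D.ino)
ORDER BY D.ino
"""
# Step 2: next free entry slot in lost+found.
NEXT_SLOT = "SELECT MAX(entry_num) + 1 FROM DirEntryTable WHERE ino = ?"
# Step 3: add the orphan to lost+found and point its ".." there; mark both rows dirty.
ADD_ENTRY = ("INSERT INTO DirEntryTable (ino, entry_num, entry_ino, dirty) "
             "VALUES (?, ?, ?, 1)")
FIX_DOTDOT = ("UPDATE DirEntryTable SET entry_ino = ?, dirty = 1 "
              "WHERE ino = ? AND entry_num = 2")

def reconnect_orphans(db):
    """Host code only sequences the queries; each repair step is itself SQL."""
    orphans = [row[0] for row in db.execute(FIND_ORPHANS, (ROOT_INO,)).fetchall()]
    for ino in orphans:
        slot = db.execute(NEXT_SLOT, (LOST_FOUND_INO,)).fetchone()[0]
        db.execute(ADD_ENTRY, (LOST_FOUND_INO, slot, ino))
        db.execute(FIX_DOTDOT, (LOST_FOUND_INO, ino))
    return orphans

repaired = reconnect_orphans(db)
print(repaired)  # [30]
```

Marking touched rows with `dirty = 1` is what lets a later flush step write back only the modified metadata.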

Outline
• Introduction
• Analysis
• SQCK Design
• SQCK Evaluation
• Conclusion

SQCK Evaluation
• Complexity
  − 150 queries in 1100 lines of SQL statements (compared to 16,000 lines of C in e2fsck)
• Reliability
  − Pass hundreds of corruption scenarios
• Flexibility
  − Add new checks/repairs
  − Enable different versions of e2fsck
• Performance
  − Introduce some optimizations

SQCK vs. e2fsck
• Reasonable performance
  − First generation of SQCK (with MySQL)
  − Within 1.5x of e2fsck
• Future optimizations
  − Hierarchical checks
  − Concurrent queries

Conclusion
• Complexity is the enemy of reliability
• Recovery code is complex
• SQCK: Build recovery tools with a higher-level approach

Thank you! Questions?
ADvanced Systems Laboratory
www.cs.wisc.edu/adsl