The Google File System (GFS)

Introduction • Special Assumptions • Consistency Model • System Design • System Interactions • Fault Tolerance • (Results)

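Assumptions • Something in the system will always be broken (component failures are the norm) • Files are BIG • Large streaming reads / small random reads • Large sequential writes (appends) • Many clients appending to the same file concurrently • High sustained bandwidth matters more than low latency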

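Consistency Model • Consistent: All readers see the same thing, whichever replica they read. • Defined: Consistent, and you see exactly what you wrote. • Undefined: Consistent, but the data might not be exactly what any one writer wrote (e.g. after concurrent writes).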

How do Apps Deal? • Parts of files may be inconsistent • Apps must check the data themselves: – Application-level checksums
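A minimal sketch of application-level checksums, assuming a hypothetical record layout (4-byte length, 4-byte CRC-32, then payload). The layout and the names `encode_record` and `read_valid_records` are illustrative, not taken from GFS. This simple reader stops at the first region that fails validation; a real reader would more likely try to resynchronize and keep going.

```python
import struct
import zlib

# Hypothetical header: payload length, then CRC-32 of the payload.
HEADER = struct.Struct("<II")


def encode_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and checksum."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def read_valid_records(buf: bytes) -> list[bytes]:
    """Return payloads whose checksum matches, stopping at the first bad region."""
    records, pos = [], 0
    while pos + HEADER.size <= len(buf):
        length, crc = HEADER.unpack_from(buf, pos)
        start = pos + HEADER.size
        payload = buf[start:start + length]
        # Zero length is treated as padding; a short or mismatching payload as damage.
        if length == 0 or len(payload) != length or zlib.crc32(payload) != crc:
            break
        records.append(payload)
        pos = start + length
    return records
```

A writer would call `encode_record` before appending; a reader passes whatever bytes it read to `read_valid_records` and ignores the rest.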

Single Master Architecture • Good: – Has global knowledge • Can make intelligent placement/replication decisions. • Bad: – Becomes a bottleneck • Must limit its involvement in reads/writes

Architecture • Master – Keeps track of everything • Chunk Servers – Where the data lives – Each chunk is 64 MB • Typical block size on other file systems: ~8 KB
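With fixed 64 MB chunks, a byte offset in a file maps to a chunk by plain integer division. A small sketch of that arithmetic (the function name `locate` is made up for illustration):

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, as on the slide


def locate(offset: int) -> tuple[int, int]:
    """Map a byte offset in a file to (chunk index, offset within that chunk)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, CHUNK_SIZE)
```

A client would compute the chunk index this way and then ask the master only for that chunk's location, which keeps requests to the master small.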

Let The Master Rule • Namespace Locking • Replica placement • Creation • (Garbage Collection)
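As a rough illustration of namespace locking: in the scheme described for GFS, an operation takes read locks on every ancestor directory name and a read or write lock on the full path. The helper below only computes which locks would be needed; the name `locks_needed` and the tuple format are assumptions for this sketch, and no real locking is done.

```python
def locks_needed(path: str, write: bool) -> list[tuple[str, str]]:
    """List (pathname, mode) locks for an operation on an absolute path."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        raise ValueError("path must name a file or directory below the root")
    locks = []
    prefix = ""
    # Read-lock each ancestor so it cannot be deleted or renamed meanwhile.
    for name in parts[:-1]:
        prefix += "/" + name
        locks.append((prefix, "read"))
    # Lock the target itself in the mode the operation requires.
    locks.append((prefix + "/" + parts[-1], "write" if write else "read"))
    return locks
```

Because only read locks are taken on the parent directories, two files can be created in the same directory at the same time.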

Metadata • In Memory – Fast – Limited space • Chunk Locations – No persistent record (the master asks the chunk servers) • Op Log – Records every change to metadata
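A toy sketch of the in-memory metadata plus op log idea: every change is written to the log before it is applied in memory, so the state can be rebuilt by replaying the log. The class name, the JSON-lines log format, and the three operation types are all invented for illustration; chunk locations are deliberately not logged.

```python
import json
import os


class MasterMetadata:
    """In-memory file -> chunk-handle map backed by an append-only op log."""

    def __init__(self, log_path: str):
        self.log_path = log_path
        self.files = {}  # path -> list of chunk handles

    def apply(self, op: dict) -> None:
        """Apply one logged operation to the in-memory state."""
        if op["type"] == "create":
            self.files[op["path"]] = []
        elif op["type"] == "add_chunk":
            self.files[op["path"]].append(op["handle"])
        elif op["type"] == "delete":
            self.files.pop(op["path"], None)

    def mutate(self, op: dict) -> None:
        """Log the operation durably first, then apply it in memory."""
        with open(self.log_path, "a") as log:
            log.write(json.dumps(op) + "\n")
            log.flush()
            os.fsync(log.fileno())
        self.apply(op)

    @classmethod
    def recover(cls, log_path: str) -> "MasterMetadata":
        """Rebuild the in-memory state by replaying the log from the start."""
        master = cls(log_path)
        if os.path.exists(log_path):
            with open(log_path) as log:
                for line in log:
                    master.apply(json.loads(line))
        return master
```

Replaying from the start gets slower as the log grows, which is why a real system would also checkpoint the state periodically.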

System Interactions • To write: – Ask master for chunk locations (cache them) – Push data to all replicas (held in a buffer, not yet written) – Send write request to primary – Primary writes changes (in order received) – Primary forwards requests to secondaries (in the same order) – Secondaries write changes, confirm.
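The write steps can be sketched as a small in-process simulation. This is a simplification under stated assumptions: the master lookup is left out, there is no networking or failure handling, and each write is modeled as appending to the chunk rather than writing at an offset. The class and function names are hypothetical.

```python
class Replica:
    """One copy of a chunk, with a buffer for data pushed but not yet written."""

    def __init__(self):
        self.buffer = {}          # data_id -> bytes waiting to be written
        self.chunk = bytearray()  # the chunk contents

    def push(self, data_id: str, data: bytes) -> None:
        """Data flow: store the bytes without changing the chunk."""
        self.buffer[data_id] = data

    def apply(self, data_id: str) -> None:
        """Control flow: write previously pushed bytes into the chunk."""
        self.chunk += self.buffer.pop(data_id)


class Primary(Replica):
    """The replica that decides the order in which writes are applied."""

    def __init__(self, secondaries: list):
        super().__init__()
        self.secondaries = secondaries

    def write(self, data_id: str) -> bool:
        # Apply locally, then forward in the same order to every secondary.
        self.apply(data_id)
        for secondary in self.secondaries:
            secondary.apply(data_id)
        return True  # stands in for "all secondaries confirmed"


def push_to_all(primary: Primary, data_id: str, data: bytes) -> None:
    """Client step: push data to every replica before requesting the write."""
    for replica in [primary, *primary.secondaries]:
        replica.push(data_id, data)
```

If two clients push "a" and "b" and the write requests reach the primary as "b" then "a", every replica ends up with the bytes of "b" followed by the bytes of "a": the primary's order wins everywhere.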

Record Append • Atomic • Allows for multiple writers • May cause inconsistent states between successful appends.
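One way to picture why regions between successful appends can look odd: if a record does not fit in what is left of the current chunk, the chunk is padded to its full size and the client retries on the next chunk. A minimal single-replica sketch of that rule (the function name and the `None` return convention are assumptions for illustration):

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB


def record_append(chunk: bytearray, record: bytes, chunk_size: int = CHUNK_SIZE):
    """Append record at an offset chosen by the system.

    Returns the offset on success, or None if the chunk was padded
    and the caller should retry on the next chunk.
    """
    if len(record) > chunk_size:
        raise ValueError("record larger than a chunk")
    if len(chunk) + len(record) > chunk_size:
        # Not enough room: pad to the chunk boundary and ask for a retry.
        chunk.extend(b"\0" * (chunk_size - len(chunk)))
        return None
    offset = len(chunk)
    chunk.extend(record)
    return offset
```

Readers therefore have to tolerate padding, and, because a failed append is retried, possibly duplicate records as well.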

Fault Tolerance • Restore state fast • Copies, copies: chunks and master state are both replicated • Checksums for data integrity
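A sketch of block-level checksumming for data integrity, assuming 64 KB blocks with a 32-bit checksum each. CRC-32 from `zlib` is used here as a stand-in for whatever checksum a real chunk server uses, and the function names are illustrative.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # 64 KB blocks


def block_checksums(data: bytes) -> list[int]:
    """Compute one 32-bit checksum per 64 KB block."""
    return [
        zlib.crc32(data[i:i + BLOCK_SIZE])
        for i in range(0, len(data), BLOCK_SIZE)
    ]


def corrupt_blocks(data: bytes, expected: list[int]) -> list[int]:
    """Return indexes of blocks whose checksum no longer matches.

    Only blocks present in both lists are compared.
    """
    actual = block_checksums(data)
    return [i for i, (a, e) in enumerate(zip(actual, expected)) if a != e]
```

On a mismatch, a server would refuse to return the block and the data would be re-read from another copy.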

Results summary • When you build a file system around the specific applications that use it, it works well.