Mobile Handset Storage and File System

Outline
- Storage and File System Basics
- Android File System
- iOS File System

Storage Hierarchy
- Almost all computers use a storage hierarchy
- Put fast but expensive and small storage close to the CPU
- Put slower but larger and cheaper storage far away from the CPU

Primary and Secondary Storage
- Primary storage (or main memory or internal memory) is the storage directly accessible to the CPU
  - Loses information when not powered
  - Examples: cache, registers and main memory
- Secondary storage (or external memory or auxiliary storage) is not directly accessible by the CPU
  - The computer usually uses input/output channels to access secondary storage
  - Does not lose data when the device is powered down
  - Examples: hard disks, CD/DVD, flash memory (e.g., an SD card)

Flash Memory
- Flash memory is a computer memory chip that maintains stored information without requiring a power source. It belongs to secondary storage in the computer storage hierarchy
- It can be electrically erased and overwritten
- Mobile devices use flash memory to store data
- There are two major types of flash memory
  - NAND flash memory and NOR flash memory
  - They are named after the NAND and NOR logic gates

Flash Memory History
- Flash memory was invented by Dr. Fujio Masuoka (Toshiba)
  - First presented at the IEEE International Electron Devices Meeting (IEDM) in 1984
  - Intel introduced the first commercial NOR-type flash in 1988
  - Toshiba announced NAND flash at IEDM 1987
  - The first NAND-based removable media format was SmartMedia in 1995

NOR Flash Memory
- Random access to any memory location
- Uses blocks as the storage units (typical block sizes are 64 KB, 128 KB or 256 KB)
- Erasure must happen at the block level, one block at a time. Writes happen at the byte level
- Long write and erasure times
- Developed as a replacement for ROM (read often, rarely updated)

NAND Flash Memory (1)
- Uses a page (a group of memory words) as the basic unit to store data. Typical page sizes are 512, 2048 or 4096 bytes
- Associated with each page are 12 to 16 bytes for a checksum
- Pages are combined into blocks
- Reads and writes happen at the page level. Erasure can only happen at the block level
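
The page/block asymmetry described on this slide can be illustrated with a toy model. This is only a sketch: the class name `NandBlock` and its methods are hypothetical, and real NAND controllers also deal with spare bytes, bad blocks and wear, which are left out here.

```python
class NandBlock:
    """Toy NAND block: pages are written one at a time, erased only all together."""

    def __init__(self, pages_per_block=32, page_size=512):
        self.page_size = page_size
        self.pages = [None] * pages_per_block  # None means "erased"

    def program(self, page_no, data):
        """Write one page. A page cannot be rewritten until the block is erased."""
        if self.pages[page_no] is not None:
            raise ValueError("page already programmed; erase the block first")
        if len(data) > self.page_size:
            raise ValueError("data larger than a page")
        self.pages[page_no] = bytes(data)

    def read(self, page_no):
        """Read one page. Erased flash cells read back as all ones (0xFF)."""
        page = self.pages[page_no]
        return page if page is not None else b"\xff" * self.page_size

    def erase(self):
        """Erase is only available for the whole block."""
        self.pages = [None] * len(self.pages)
```

Because a page cannot be rewritten in place, a file system on NAND typically writes updated data to a fresh page and reclaims the old one later.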

NAND Flash Memory (2)
- Write and erasure times are reduced
- Suitable for replacing disks
- Typical block sizes:
  - 16 KB: 32 pages of (512 + 16 spare bytes)
  - 128 KB: 64 pages of (2048 + 64 spare bytes)
  - 256 KB: 64 pages of (4096 + 128 spare bytes)
  - 512 KB: 128 pages of (4096 + 128 spare bytes)
  - Spare bytes can be used for a checksum
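
The block sizes listed above follow directly from pages per block times data bytes per page. A short arithmetic check, where the function name `block_summary` is only illustrative:

```python
# (pages per block, data bytes per page, spare bytes per page), taken from the list above
GEOMETRIES = [
    (32, 512, 16),
    (64, 2048, 64),
    (64, 4096, 128),
    (128, 4096, 128),
]

def block_summary(pages, data_bytes, spare_bytes):
    """Return (data KB per block, spare bytes per block, spare-to-data ratio)."""
    data_kb = pages * data_bytes // 1024
    spare_total = pages * spare_bytes
    overhead = spare_bytes / data_bytes
    return data_kb, spare_total, overhead

for pages, data_bytes, spare_bytes in GEOMETRIES:
    data_kb, spare_total, overhead = block_summary(pages, data_bytes, spare_bytes)
    print(f"{data_kb} KB block: {pages} pages, "
          f"{spare_total} spare bytes ({overhead:.1%} of data size)")
```

In all four geometries the spare area is the same fraction of the data area: 16 spare bytes for every 512 data bytes.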

NOR vs. NAND

Property                | NOR                                     | NAND
Performance             | Very slow erase, slow write, fast read  | Fast erase, fast write, fast read
Reliability             | Standard reliability                    | Low reliability; needs bad block management
Erase Times (cycles)    | 10,000-100,000                          | 100,000-1,000,000
Life Span               | Less than 10% of the life span of NAND  | Over 10 times more than NOR
Access                  | Random                                  | Sequential
Hardware Implementation | Easy                                    | Complicated
Spare Bytes             | No                                      | Yes (16 bytes)

File System
- A file system is a computer program which controls how data is stored and retrieved
- Primary roles:
  - Provides an abstraction for secondary storage
  - Provides a logical organization of files
  - Enables sharing data between processes, users and machines
  - Protects data from unwanted access

Why We Need a File System
- Disks are messy physical devices
  - Errors, bad blocks, missed seeks, etc.
- The job of the OS is to hide the mess from higher-level software
  - It needs to handle low-level device control (start a disk read, etc.)
  - It needs to provide higher-level abstractions (files, databases, etc.)
- The file system handles the mess for the OS

File Concept
- A file is a logically contiguous address space which stores a collection of data. It has the following attributes:
  - File name
  - File identifier (a unique number for the file)
  - File type
  - File location (pointer to the file location on disk)
  - File size
  - File protection (controls who can read, write or execute), etc.
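
Most of these attributes can be inspected from user space. The sketch below uses Python's standard `os.stat` on a temporary file; mapping "file type" to the file name extension and "identifier" to the inode number (`st_ino`) are assumptions made for illustration, and the on-disk location is not exposed through this interface.

```python
import os
import tempfile

# Create a small temporary file to inspect
with tempfile.NamedTemporaryFile(delete=False, suffix=".txt") as f:
    f.write(b"hello world")
    path = f.name

info = os.stat(path)
attributes = {
    "name": os.path.basename(path),
    "identifier": info.st_ino,                    # unique number within the file system
    "type": os.path.splitext(path)[1],            # here: the extension
    "size": info.st_size,                         # in bytes
    "protection": oct(info.st_mode & 0o777),      # permission bits
}
os.remove(path)
print(attributes)
```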

File Operations
- Most common operations:
  - Create
  - Write
  - Read
  - Reposition within a file
  - Delete
  - Truncate
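
Each of the operations above has a direct counterpart in most programming environments. A minimal walk-through using Python's standard library on a temporary file (the variable names are only illustrative):

```python
import os
import tempfile

fd, path = tempfile.mkstemp()          # create
os.close(fd)

with open(path, "r+b") as f:
    f.write(b"hello flash storage")    # write
    f.seek(6)                          # reposition within the file
    word = f.read(5)                   # read five bytes starting at offset 6
    f.truncate(5)                      # truncate the file to its first 5 bytes

size_after_truncate = os.path.getsize(path)
os.remove(path)                        # delete
exists_after_delete = os.path.exists(path)
```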

File Protection (1)
- A file system must implement some kind of protection to control who can access a file and how they can access it
- Types of users
  - Owner: the user who created the file
  - Group: the users who are in the same group as the owner
  - Others: any other users in the system
  - Super user: the administrator of the system
- Types of access are read (r), write (w) and execute (x)

File Protection (2)
- Protection in the Unix file system
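
In Unix, the owner, group and others each get three permission bits (r, w, x), usually written as an octal number such as 754. The following sketch decodes such a number with Python's standard `stat.filemode`; the helper name `describe` is hypothetical.

```python
import stat

def describe(mode):
    """Split Unix permission bits into owner/group/others strings such as 'rwx'."""
    text = stat.filemode(stat.S_IFREG | mode)   # regular file, e.g. '-rwxr-xr--'
    return {"owner": text[1:4], "group": text[4:7], "others": text[7:10]}

perms = describe(0o754)
print(perms)
```

With mode 754 the owner can read, write and execute, the group can read and execute, and others can only read. The super user is not subject to these checks.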

Directories
- A directory, also known as a folder, is a structure which allows the user to group files into separate collections
- The root directory is the first or top-most directory in tree-structured directories. It is the starting point from which all branches originate
  - E.g., the / directory in Unix systems

Tree-Structured Directories
[Figure: tree-structured directory diagram]

Block (1)
- A block is a sequence of bytes or bits with a maximum length, the block size. It is the basic unit used by most file systems to store data
- File systems define a block size (e.g., 4 KB)
  - Disk space is allocated in granularity of blocks
- A “Master Block” stores the location of the root directory
  - Always at a well-known disk location
  - Often replicated across the disk for reliability

Block (2)
- A map stores which blocks are free and which are allocated
  - Usually a bitmap, one bit per block on the disk
  - Also stored on disk, cached in memory for performance
- Remaining disk blocks are used to store files and directories
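
A free-block bitmap of this kind can be sketched in a few lines. The class `BlockBitmap`, its first-fit allocation policy and the bit order within each byte are assumptions chosen for illustration; real file systems differ in all three.

```python
class BlockBitmap:
    """One bit per block: 1 = allocated, 0 = free."""

    def __init__(self, total_blocks):
        self.total = total_blocks
        self.bits = bytearray((total_blocks + 7) // 8)

    def is_allocated(self, n):
        return bool(self.bits[n // 8] & (1 << (n % 8)))

    def allocate(self):
        """Mark the first free block as allocated and return its number."""
        for n in range(self.total):
            if not self.is_allocated(n):
                self.bits[n // 8] |= 1 << (n % 8)
                return n
        raise RuntimeError("no free blocks")

    def free(self, n):
        """Mark block n as free again."""
        self.bits[n // 8] &= ~(1 << (n % 8)) & 0xFF
```

At one bit per block the map itself is small: with 4 KB blocks, a single 4 KB bitmap block describes 32,768 blocks, or 128 MB of disk space.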

Outline
- Storage and File System Basics
- Android File System
- iOS File System

Overview
- Android uses flash memory as its storage medium, so it can use flash file systems such as exFAT, YAFFS2, JFFS2, etc.
- Android is based on the Linux kernel, so it can use a Linux file system such as ext2, ext3, ext4, etc.
- It may also use a proprietary file system developed by the manufacturer, depending on who made the device
- The most commonly used file system on Android
  - Yet Another Flash File System 2 (YAFFS2)

YAFFS
- YAFFS is a flash file system developed for NAND flash
  - YAFFS1: designed for early generations of NAND flash memory (512-byte pages)
  - YAFFS2: supports newer NAND with 2 KB pages and a strictly sequential page writing order
- It uses chunks to manage data. A chunk is YAFFS terminology for a page.

YAFFS1 Chunks
- File data is stored in fixed-size “chunks”, i.e., NAND pages (512 bytes)
- Two types of chunk:
  - Data chunk: holds regular file contents
  - File header: holds a file’s metadata such as file name, parent directory, etc.

YAFFS1 Tags
- Each chunk has tags associated with it. The tags comprise the following fields (8 bytes in total):

Field         | Bits | Meaning
File ID       | 18   | Identifies which file the chunk belongs to
Chunk ID      | 20   | Identifies where in the file this chunk belongs. 0 means the chunk contains a file header, 1 is the first data chunk, 2 is the next chunk, and so on
Serial Number | 2    | Differentiates chunks with the same file ID and chunk ID
Byte Count    | 10   | Number of bytes of data if this is a data chunk
Checksum      | 12   | Checksum for the tags
Reserved      | 2    | Unused
Total         | 64   |
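
The field widths above add up to exactly 64 bits, so the tags fit in 8 bytes. The sketch below packs and unpacks them as one 64-bit integer. The function names, the field order within the word and the big-endian byte order are assumptions for illustration and are not taken from the YAFFS source code.

```python
FIELDS = [  # (name, bit width), in the order of the tags table
    ("file_id", 18), ("chunk_id", 20), ("serial", 2),
    ("byte_count", 10), ("checksum", 12), ("reserved", 2),
]

def pack_tags(**values):
    """Pack the tag fields into 8 bytes; fields not given default to 0."""
    word = 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word.to_bytes(8, "big")

def unpack_tags(raw):
    """Inverse of pack_tags: return a dict of field values."""
    word = int.from_bytes(raw, "big")
    out = {}
    for name, width in reversed(FIELDS):
        out[name] = word & ((1 << width) - 1)
        word >>= width
    return out
```

The 18-bit file ID limits a volume to 2^18 = 262,144 objects, and the 10-bit byte count is enough for a 512-byte chunk.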

YAFFS1 Serial Number
- When data is overwritten, the relevant chunks are replaced by writing new pages containing the new data to the flash. The old page is then marked as “discarded”
- If a power loss, crash or other problem happens before the old page is marked as discarded, two pages can end up with the same tags
  - Solution: increase the 2-bit serial number by 1 every time a chunk is overwritten, to distinguish the new data from the old data
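
Because the serial number has only 2 bits, it wraps around from 3 back to 0, so "newer" cannot simply mean "larger". One way to resolve this, sketched here under the assumption that at most two copies of a chunk exist at a time, is to compare modulo 4. The function name `newer_chunk` is hypothetical.

```python
def newer_chunk(serial_a, serial_b):
    """Given the 2-bit serial numbers of two chunks with the same file ID and
    chunk ID, return 'a' or 'b' for whichever was written later."""
    if (serial_a - serial_b) % 4 == 1:
        return "a"      # a is exactly one increment ahead of b
    if (serial_b - serial_a) % 4 == 1:
        return "b"      # b is exactly one increment ahead of a
    raise ValueError("serial numbers are not consecutive; cannot decide")
```

For example, a chunk with serial 0 is newer than a chunk with serial 3, because 3 + 1 wraps around to 0.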

YAFFS1 Garbage Collection
- A block in which all pages are discarded is an obvious candidate for garbage collection
- Otherwise, the valid pages are copied out of a block, and the whole block is then marked as discarded and ready for garbage collection
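
The two cases above can be sketched as follows. This is a simplified model, not the YAFFS algorithm itself: blocks are plain Python lists of page dictionaries, and choosing the block with the fewest valid pages when no fully discarded block exists is an assumption made for illustration.

```python
def live_pages(block):
    """Pages in a block that still hold valid (not discarded) data."""
    return [p for p in block if not p["discarded"]]

def pick_victim(blocks):
    """Index of the block with the fewest valid pages.
    A block whose pages are all discarded has zero valid pages and always wins."""
    return min(range(len(blocks)), key=lambda i: len(live_pages(blocks[i])))

def collect(blocks, victim, spare):
    """Copy valid pages from blocks[victim] into the list `spare`,
    then erase the victim block so it can be reused."""
    for page in live_pages(blocks[victim]):
        spare.append({"data": page["data"], "discarded": False})
    blocks[victim] = []   # an erased block holds no pages
    return spare
```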

YAFFS1 Page Layout

Byte Range | Field        | Details
0-511      | Data         | Either file data or file header, depending on tags
512-515    | Tags         |
516        | Data Status  | If more than 4 bits are zero, this page is discarded
517        | Block Status | Shows whether the block is damaged
518-519    | Tags         | Tags
520-522    | Checksum     | Checksum for the second 256 bytes of data
523-524    | Tags         | Tags
525-527    | Checksum     | Checksum for the first 256 bytes of data
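
Given the byte ranges of this layout, a raw 528-byte page (512 data bytes plus 16 spare bytes) can be split into its fields by slicing. The function name `split_page` and the order in which the three tag fragments are joined are assumptions for illustration.

```python
PAGE_WITH_SPARE = 528  # 512 data bytes + 16 spare bytes

def split_page(raw):
    """Split one raw YAFFS1 page into its fields, following the byte ranges above."""
    if len(raw) != PAGE_WITH_SPARE:
        raise ValueError("expected 528 bytes")
    data_status = raw[516]
    zero_bits = 8 - bin(data_status).count("1")
    return {
        "data": raw[0:512],
        # The 8 tag bytes are scattered over three ranges: 4 + 2 + 2
        "tags": raw[512:516] + raw[518:520] + raw[523:525],
        "discarded": zero_bits > 4,          # more than 4 zero bits in the status byte
        "block_status": raw[517],
        "checksum_second_half": raw[520:523],
        "checksum_first_half": raw[525:528],
    }
```

Counting zero bits rather than testing for an exact value lets the page still be recognised as discarded, or as valid, when a few bits of the status byte are corrupted.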

YAFFS2 vs. YAFFS1 (1)
- YAFFS2 is very similar in concept to YAFFS1, and they share much of the same source code
- Adds support for newer NAND with 2 KB pages
- Marks every newly written block with a sequence number
  - The order of the chunks can be inferred from the block sequence number and the chunk offset within the block
  - When it detects two chunks with the same file ID and chunk ID, it can choose the newer chunk by taking the greater sequence number
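
The selection rule can be sketched as an ordering on the pair (block sequence number, chunk offset within the block). The `Chunk` type and the function `latest_chunks` are hypothetical names used only for this illustration.

```python
from typing import NamedTuple

class Chunk(NamedTuple):
    file_id: int
    chunk_id: int
    block_seq: int   # sequence number of the block the chunk was written to
    offset: int      # position of the chunk within that block
    data: bytes

def latest_chunks(chunks):
    """For every (file ID, chunk ID) pair keep only the most recently written chunk."""
    latest = {}
    for c in chunks:
        key = (c.file_id, c.chunk_id)
        best = latest.get(key)
        if best is None or (c.block_seq, c.offset) > (best.block_seq, best.offset):
            latest[key] = c
    return latest
```

Since pages within a block are written strictly in order, a larger offset in the same block also means a later write, which is why the offset serves as the tie-breaker.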

YAFFS2 vs. YAFFS1 (2)
- Introduces the concept of shrink headers for efficiency
  - When a file is resized to a smaller size, YAFFS1 marks all of the affected chunks as discarded. YAFFS2 instead writes a “shrink header”, which indicates that a certain number of pages before this header are invalid
- Improves performance relative to YAFFS1
  - Write: 1.5-5x
  - Delete: 4x
  - Garbage collection: 2x

Outline
- Storage and File System Basics
- Android File System
- iOS File System

Overview
- In 1985 Apple developed a new file system called the Hierarchical File System (HFS) for use in Mac OS
- Hierarchical File System Plus (HFS+) was introduced in 1998 for use in Mac OS 8.1
- HFSX was introduced in Mac OS X 10.3 in 2003. It is now the file system for iOS

HFS Blocks
- At the physical level, the disk is divided into blocks of 512 bytes
- There are two types of blocks:
  - Logical blocks: numbered from the first to the last on the disk. They are static and the same size as the physical blocks, 512 bytes
  - Allocation blocks: groups of logical blocks used by HFS to track data in a more efficient way

HFS Structure (1)
- Logical blocks 0 and 1
  - The boot blocks, which contain system startup information
- Logical block 2
  - Contains the master directory block (MDB), which defines a wide variety of data such as date and time stamps for when the partition was created, the location of the bitmap, etc.
- Logical block 3
  - The starting block of the bitmap, which keeps track of which allocation blocks are in use and which are free. Each allocation block is represented by a bit in the map: if the bit is set, the block is in use. Otherwise it is free to use.
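
With 512-byte logical blocks, the byte offsets of these structures follow by multiplication, and the bitmap can be tested one bit at a time. In this sketch the function names are hypothetical, and the bit order (most significant bit of each byte describes the lowest-numbered allocation block) is an assumption.

```python
LOGICAL_BLOCK_SIZE = 512

def block_offset(logical_block):
    """Byte offset of a logical block from the start of the volume."""
    return logical_block * LOGICAL_BLOCK_SIZE

def allocation_block_in_use(bitmap, n):
    """True if the bit for allocation block n is set in the volume bitmap.
    Assumes the most significant bit of each byte comes first."""
    return bool(bitmap[n // 8] & (0x80 >> (n % 8)))

mdb_offset = block_offset(2)      # master directory block
bitmap_offset = block_offset(3)   # start of the volume bitmap
```

So the master directory block starts 1024 bytes into the volume and the bitmap 1536 bytes in.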

HFS Structure (2)
- The extents overflow file
  - Keeps track of which allocation blocks are allocated to which files
- Catalog file
  - Describes the folder and file hierarchy on the disk. It contains metadata about all the files and folders on the disk, including their modify, access and create times

HFS+ vs. HFS
- HFS+ has three more parts in its structure
  - Attributes file: contains attribute information for all files and folders
  - Startup file: designed to assist in booting non-Mac OS systems that don’t have HFS or HFS+ support
  - Reserved block: reserved for use by Apple

HFSX vs. HFS+
- All Apple mobile devices use HFSX as the file system. There is one major difference between HFSX and HFS+: HFSX is case sensitive.
  - For example, Case_sensitive.doc and Case_Sensitive.doc are treated as two different files. They can both exist on HFSX but not on HFS+
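
The difference comes down to how two file names are compared. In the sketch below, the function `can_coexist` is hypothetical and Python's `str.casefold` stands in for the file system's own case-folding rules, which are not reproduced here.

```python
def can_coexist(name_a, name_b, case_sensitive):
    """True if both names may exist in the same directory."""
    if case_sensitive:                                 # HFSX-style comparison
        return name_a != name_b
    return name_a.casefold() != name_b.casefold()      # case-insensitive comparison

on_hfsx = can_coexist("Case_sensitive.doc", "Case_Sensitive.doc", case_sensitive=True)
on_hfs_plus = can_coexist("Case_sensitive.doc", "Case_Sensitive.doc", case_sensitive=False)
```

Here `on_hfsx` is true and `on_hfs_plus` is false, matching the example on the slide.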
