NeST: Network Storage
Flexible Commodity Storage Appliances
John Bent, Miron Livny, Andrea Arpaci-Dusseau and Remzi Arpaci-Dusseau
Terms
  - Appliance (Merriam-Webster)
      - b : an instrument or device designed for a particular use; specifically a household or office device
  - Storage appliance
      - Storage plus access methods
What storage users want
  - Reliability and availability
  - Manageability
      - Cost of management > cost of storage itself
      - “No futz” computing
  - Scalability
  - Performance
What storage vendors have
  - NetApp, EMC, others make storage appliances (network-attached storage)
  - Manageable
      - Just plug it in and it works
      - Administrative web interface
  - Reliable and available
      - Standard RAID techniques
  - High performance
      - Specialized, thin OS focused on serving files
What storage vendors get: annual revenues
  - NetApp: $800 million in 2000
  - EMC: $9 billion in 2000
What’s the problem?
  - False coupling between HW and SW
  - “Playground syndrome”
  - Myth of specialization
H/W and S/W are bundled
  - Hardware decisions are imposed
  - Hard to ride commodity curve
      - Example:
          - NetApp F720: $35,000.00, 252 GB, $138 / GB
          - Maxtor DiamondMax: $279.00, 80 GB, $3.50 / GB
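The per-gigabyte figures in this example follow from dividing price by capacity. A minimal Python sketch of that arithmetic, using only the prices and capacities quoted on the slide (the function name `cost_per_gb` is illustrative and not from the talk):

```python
def cost_per_gb(price_usd: float, capacity_gb: float) -> float:
    """Return price divided by capacity, in dollars per gigabyte."""
    return price_usd / capacity_gb


# Prices and capacities as quoted on the slide.
netapp = cost_per_gb(35_000.00, 252)  # roughly 138.89; the slide shows $138
maxtor = cost_per_gb(279.00, 80)      # 3.4875; the slide rounds to $3.50

# How many times more the bundled appliance costs per gigabyte.
ratio = netapp / maxtor               # a little under 40

print(f"NetApp F720:       ${netapp:.2f}/GB")
print(f"Maxtor DiamondMax: ${maxtor:.2f}/GB")
print(f"Ratio:             {ratio:.1f}x")
```

The comparison covers raw disk price only; it says nothing about the RAID, management and support that the appliance price also pays for.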
“Playground syndrome”
  - “We have storage appliances...
      - if you use these protocols,
      - if you use these security mechanisms,
      - if you are comfortable with our data semantics”
  - Non-flexible software entity
Myth of specialization
  - Specialize for one protocol on one machine
  - Specialization decreases over time as
      - Protocols are added
      - Product line expands
  - Example: NetApp software
      - Generation 1 fit on a single floppy
      - Generation 2 took six
      - Generation 3?
Alternatives?
  - Appliance (Merriam-Webster)
      - a : a piece of equipment for adapting a tool or machine to a special purpose
Our game?
  - Flexible, commodity-based, software-only storage appliances
  - Goal
      - Find a networked machine
      - “Drop” some software on it
      - Have a ready-to-use storage appliance with flexible mechanisms
New worlds, new problems
  - Diverse hardware, software platforms
      - NetApp, EMC advantage
          - Fewer platforms, control over OS
      - Our approach
          - Automate configuration to each host system
              - Hardware example: use file system or self-manage
              - Software example: use either read/write or mmap
  - Cost of flexibility
  - Key is design of the software
Outline
  - Introduction
  - Building flexible storage modules
      - Big picture
      - Protocol layer
      - Concurrency architecture
      - Storage layer
  - Motivations for flexible storage appliances
  - Conclusion and current status
NeST structure
  - Cleanly separated modules for communication, transfer and storage
      - Protocol layer
          - Maps diverse protocols into common control flows
      - Concurrency architectures
          - Different models to maximize system throughput
      - Storage layer
          - Provides abstract interface to disks
NeST structure
  [Diagram: central control coordinates three layers. Protocol layer: GFTP, NeST, WiND, HTTP, NFS. Concurrency architecture, which receives the transfer request: event-driven, multi-process, multi-threaded. Storage layer: raw disk, local FS, RAID.]
Protocol layer
  - A collection of servers is less than the sum of their parts.
  [Diagram labels: NFSd, HTTPd, Operating system; HTTP, NeST, Operating system.]
Consolidate protocols
  - Single point of control
      - Storage quotas and guarantees can be supported across multiple protocols.
      - Bandwidth can be controlled and quality of service can be guaranteed.
  - Single administrative interface
      - Set policies
      - Manage user accounts
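One way to picture a quota that holds across protocols is a single accounting object that every protocol handler charges before it writes. The sketch below is only an illustration of that idea: `QuotaManager`, `http_put` and `nfs_write` are hypothetical names, not NeST's actual interfaces.

```python
class QuotaExceeded(Exception):
    """Raised when a write would push a user past their quota."""


class QuotaManager:
    """Hypothetical single point of control shared by every protocol handler."""

    def __init__(self):
        self._limit = {}  # user -> bytes allowed
        self._used = {}   # user -> bytes currently charged

    def set_limit(self, user, limit_bytes):
        self._limit[user] = limit_bytes
        self._used.setdefault(user, 0)

    def charge(self, user, nbytes):
        """Account for nbytes of new data, whichever protocol delivered it."""
        if user not in self._limit:
            raise KeyError(f"no quota configured for {user!r}")
        if self._used[user] + nbytes > self._limit[user]:
            raise QuotaExceeded(user)
        self._used[user] += nbytes

    def used(self, user):
        return self._used.get(user, 0)


# Two stand-in protocol handlers; both go through the same manager.
def http_put(quotas, user, body):
    quotas.charge(user, len(body))
    return "201 Created"


def nfs_write(quotas, user, data):
    quotas.charge(user, len(data))
    return len(data)
```

Because both handlers charge the same object, a user who fills most of a quota over HTTP cannot exceed it by switching to NFS, which separate per-protocol servers cannot easily guarantee.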
Protocol layer implementation
  - Each protocol listens on a well-defined port
  - Central control accepts connections
  - Protocol layer reads from connection and returns generic request object
  - Like Linux V-nodes
      - Add new protocol by writing a couple of methods
Protocol layer example: directory list request
  [Diagram: an FTP request (“31: LIST”, “ftp, ftp”) and a NeST speak request (“5”, “nest, nest”) enter the protocol layer; a generic directory list request flows through central control to the storage layer, and a linked list flows back.]
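The directory-list example can be sketched as two small parsers that produce the same generic request object. This is a hedged illustration, not NeST code: the class names, the `Request` fields and the dictionary standing in for the storage layer are all assumptions, and the only detail taken from the slide is that FTP uses `LIST` while NeST speak uses the code `5`.

```python
from dataclasses import dataclass
from enum import Enum


class Op(Enum):
    LIST_DIR = "list_dir"


@dataclass(frozen=True)
class Request:
    """Generic request handed to central control (hypothetical fields)."""
    op: Op
    path: str
    protocol: str


class FtpProtocol:
    name = "ftp"

    def parse(self, line):
        verb, _, arg = line.strip().partition(" ")
        if verb.upper() == "LIST":
            return Request(Op.LIST_DIR, arg or ".", self.name)
        raise ValueError(f"unsupported FTP command: {verb!r}")


class NestProtocol:
    name = "nest"
    OPCODES = {5: Op.LIST_DIR}  # wire format "<code> <path>" is assumed

    def parse(self, line):
        code, _, arg = line.strip().partition(" ")
        op = self.OPCODES.get(int(code))
        if op is None:
            raise ValueError(f"unsupported NeST opcode: {code!r}")
        return Request(op, arg or ".", self.name)


def handle(request, storage):
    """Central control: dispatch on the generic operation only."""
    if request.op is Op.LIST_DIR:
        return sorted(storage.get(request.path, []))
    raise NotImplementedError(request.op)


storage = {"/data": ["b.txt", "a.txt"]}  # stand-in for the storage layer
print(handle(FtpProtocol().parse("LIST /data"), storage))
print(handle(NestProtocol().parse("5 /data"), storage))
```

Adding a protocol in this sketch means writing one more class with a `parse` method; `handle` never needs to know which protocol the request arrived on.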
Concurrency architecture
  - Three difficult goals
      - Low latency
      - High bandwidth
      - Multiple simultaneous clients
  - No single portable solution
      - Provide multiple models to provide solutions on a range of different platforms
          - Multi-threaded
          - Multi-process
          - Event-driven
Concurrency architecture
  [Diagram: event-driven, multi-process and multi-threaded models inside the concurrency architecture.]
  - Central control creates transfer object
      - Socket descriptor from the protocol layer
      - File descriptor from the storage layer
  - Transfer object passed to concurrency architecture
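A transfer object that pairs a source with a destination lets the same transfer be handed to different concurrency models. The sketch below makes several assumptions: `Transfer`, `run_threaded` and `run_interleaved` are invented names, in-memory streams stand in for the socket and file descriptors, and the interleaved loop is only a simplified stand-in for a real event-driven model, which would use non-blocking I/O and readiness notification.

```python
import io
import threading
from dataclasses import dataclass
from typing import BinaryIO


@dataclass
class Transfer:
    """Pairs a source and a destination (stand-ins for socket and file)."""
    src: BinaryIO
    dst: BinaryIO
    chunk_size: int = 4096

    def step(self) -> bool:
        """Copy one chunk; return False once the source is exhausted."""
        buf = self.src.read(self.chunk_size)
        if not buf:
            return False
        self.dst.write(buf)
        return True


def run_threaded(transfers):
    """One thread per transfer, each draining its own source."""
    def drain(t):
        while t.step():
            pass

    workers = [threading.Thread(target=drain, args=(t,)) for t in transfers]
    for w in workers:
        w.start()
    for w in workers:
        w.join()


def run_interleaved(transfers):
    """Single thread; every pending transfer advances one chunk per pass."""
    pending = list(transfers)
    while pending:
        pending = [t for t in pending if t.step()]


jobs = [Transfer(io.BytesIO(b"x" * 10_000), io.BytesIO()) for _ in range(3)]
run_interleaved(jobs)
print([len(t.dst.getvalue()) for t in jobs])
```

Because both runners accept the same list of transfers, the choice between them can be made per platform without touching the protocol or storage code.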
Concurrency on Linux
Storage layer
  - Three needed areas of flexibility
      - File system interfaces
          - Example: read()/write() or mmap()
      - Abstract storage models
          - RAID, JBOD, etc.
      - User account administration
          - Creation and removal
          - Quotas and guarantees for users and groups
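Choosing between read() and mmap() on a given host can be automated by timing both on a probe file and keeping the faster one. The following is a rough sketch under stated assumptions: the helper names are hypothetical, a single timing pass is far too noisy for a real decision, and nothing here is taken from NeST's actual configuration logic.

```python
import mmap
import os
import tempfile
import time


def read_with_read(path, chunk=64 * 1024):
    """Read the whole file with buffered read() calls; return bytes read."""
    total = 0
    with open(path, "rb") as f:
        while True:
            buf = f.read(chunk)
            if not buf:
                break
            total += len(buf)
    return total


def read_with_mmap(path):
    """Map the file read-only and copy it out; return bytes read."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return len(m[:])  # slicing copies the mapped bytes


def pick_interface(path):
    """Time each interface once and return (name of the faster, timings)."""
    timings = {}
    for name, fn in (("read", read_with_read), ("mmap", read_with_mmap)):
        start = time.perf_counter()
        fn(path)
        timings[name] = time.perf_counter() - start
    return min(timings, key=timings.get), timings


def make_probe_file(size=1 << 20):
    """Create a non-empty temporary file to benchmark against."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size))
    return path


probe = make_probe_file()
choice, timings = pick_interface(probe)
os.remove(probe)
print("faster interface on this host:", choice)
```

A more careful probe would repeat each measurement, vary file sizes and account for the page cache, since the first interface timed warms the cache for the second.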
File system interfaces on Linux
Outline
  - Introduction
  - Building flexible storage modules
  - Motivations for flexible storage appliances
  - Conclusion and current status
Clients have different needs
  - Communication protocols
  - Replacement costs
  - Data semantics
  - Security and authentication
Communication protocols
  - The Esperanto problem
  - Too many protocols to implement them all
  - Too many clients use proprietary protocols
  Storage must allow pluggable protocols.
Replacement costs
  - Infinite cost to replace first-class data.
  - Variable cost to replace cached data, depending on size and distance.
  - Variable cost to replace job output files, depending on computation cost.
  [Diagram labels: cheap cached files, first-class data.]
  Cost-aware storage can effectively increase its own capacity.
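One reading of cost-aware storage is an eviction policy that frees space by discarding whatever is cheapest to recreate and never touches data with infinite replacement cost. The sketch below assumes that reading; `StoredFile`, `choose_victims` and the cost units are hypothetical, and ordering by raw cost rather than, say, cost per byte is a simplification.

```python
import math
from dataclasses import dataclass


@dataclass
class StoredFile:
    name: str
    size: int                 # bytes
    replacement_cost: float   # math.inf marks first-class data


def choose_victims(files, bytes_needed):
    """Pick the cheapest-to-replace files until enough space is freed.

    Returns the list of files to evict, or None if the request cannot be
    met without evicting first-class (infinite-cost) data.
    """
    victims, freed = [], 0
    for f in sorted(files, key=lambda f: f.replacement_cost):
        if freed >= bytes_needed:
            break
        if math.isinf(f.replacement_cost):
            break  # only irreplaceable data remains
        victims.append(f)
        freed += f.size
    return victims if freed >= bytes_needed else None


files = [
    StoredFile("thesis.tex", 300, math.inf),  # first-class data
    StoredFile("job.out", 200, 50.0),         # costly to recompute
    StoredFile("cache.bin", 100, 1.0),        # cheap to fetch again
]
plan = choose_victims(files, 250)
print([f.name for f in plan])
```

Under this policy the space held by cheap cached files behaves like spare capacity, which is one way to understand the claim that cost-aware storage effectively increases its own capacity.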
Data semantics
  - Must stored objects be protected from read and write dependencies?
  - Is transaction support necessary?
  - What replies to storage requests are acceptable?
Data semantics, example
  - Problem
      - PFS on top of FTP fakes open
      - Read may then return file not found
  - Solution
      - Mechanisms are needed to support flexible semantics independent of the transfer protocol.
  Divorce semantics from the protocol.
Security and authentication
  - Ownership
  - Privacy
  - Encryption
  - Authentication
  - Access rights
Who, when, how and how much?
  - Who is allowed to use the storage?
  - Promiscuity and monogamy are easy
  - Polygamy is also easy
  [Diagram labels: abstinent, promiscuous.]
Do I know you?
  - Problem
      - Migrant grid users may need temporary, preferential storage access
  - Solution
      - Provide mechanisms to
          - advertise available storage
          - create self-destructing user accounts
  Matchmake applications with storage opportunities.
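A self-destructing account can be modeled as a lease that carries its own expiry time. The following is a minimal sketch of that idea only; `Lease` and `LeaseTable` are hypothetical names, and the current time is passed in explicitly so the behavior is easy to check.

```python
from dataclasses import dataclass


@dataclass
class Lease:
    user: str
    quota_bytes: int
    expires_at: float  # seconds on whatever clock the caller uses


class LeaseTable:
    """Hypothetical table of temporary accounts that expire on their own."""

    def __init__(self):
        self._leases = {}

    def create(self, user, quota_bytes, ttl_seconds, now):
        lease = Lease(user, quota_bytes, now + ttl_seconds)
        self._leases[user] = lease
        return lease

    def is_active(self, user, now):
        lease = self._leases.get(user)
        return lease is not None and now < lease.expires_at

    def purge(self, now):
        """Drop every expired account and return the removed user names."""
        expired = [u for u, l in self._leases.items() if now >= l.expires_at]
        for u in expired:
            del self._leases[u]
        return expired


table = LeaseTable()
table.create("visitor", quota_bytes=1 << 20, ttl_seconds=3600, now=1000.0)
print(table.is_active("visitor", now=2000.0))  # within the hour
print(table.is_active("visitor", now=4600.0))  # exactly at expiry
```

In a real deployment `purge` would also have to reclaim the account's stored data, which this sketch leaves out.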
Outline
  - Introduction
  - Building flexible storage solutions
  - Motivations for flexible storage appliances
  - Conclusion
      - Current status
      - Future work
      - Concluding remarks
Current status
  - Concurrency architectures are done
      - Gets, puts, reads and writes perform well
  - Virtual protocol class interface is built
      - NeST speak is fully implemented
      - GridFTP coming soon!
  - Simple first implementation of storage reservations and remote quota management is done
      - Venkateshwaran Venkataramani
Future work
  - Discovery process of client storage requirements
  - Quality of service guarantees for bandwidth and storage
  - Support for transient and opportunistic users
Concluding remarks
  - Return storage to the commodity curve by creating software-only storage appliances
  - Allow greater storage flexibility for a wide range of application needs