NFS on a Database: Structure and Performance
Alan Halverson, Babis Samios

Motivation
Goal: NFS Server / Database Backend
Why Database?
- Transactions provide idempotency naturally
- Graceful backup/recovery
Why NFS?
- Nearly universal client availability
- Transparent access for existing applications
- Ease of implementation

Approach
Implement standard UNIX file API
- open(), read(), write(), etc.
- All routines talk to the database
Modify NFS server to use new API
… Profit!
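A minimal sketch of what "all routines talk to the database" could look like. It is not the authors' code: SQLite stands in for PostgreSQL so the example is self-contained, and the class name `DbFileStore`, the `chunks` table, and `CHUNK_SIZE` are all hypothetical. Each write runs inside one transaction, so replaying the same request leaves the stored data unchanged.

```python
import sqlite3

CHUNK_SIZE = 8192  # hypothetical chunk size, not taken from the slides


class DbFileStore:
    """Toy file API whose routines all talk to a database (SQLite stand-in)."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE chunks (path TEXT, chunk_id INTEGER, data BLOB, "
            "PRIMARY KEY (path, chunk_id))"
        )

    def _chunk(self, path, chunk_id):
        """Fetch one chunk row, or empty bytes if it does not exist."""
        row = self.db.execute(
            "SELECT data FROM chunks WHERE path = ? AND chunk_id = ?",
            (path, chunk_id),
        ).fetchone()
        return bytes(row[0]) if row else b""

    def write(self, path, offset, payload):
        """Write payload at offset, chunk by chunk, in a single transaction."""
        with self.db:  # commits on success, rolls back on error
            pos = 0
            while pos < len(payload):
                chunk_id, start = divmod(offset + pos, CHUNK_SIZE)
                n = min(CHUNK_SIZE - start, len(payload) - pos)
                # Zero-fill the chunk up to the write position if it is short.
                old = self._chunk(path, chunk_id).ljust(start, b"\0")
                new = old[:start] + payload[pos:pos + n] + old[start + n:]
                self.db.execute(
                    "INSERT OR REPLACE INTO chunks VALUES (?, ?, ?)",
                    (path, chunk_id, new),
                )
                pos += n
        return len(payload)

    def read(self, path, offset, length):
        """Read up to length bytes starting at offset; stop at end of file."""
        out = b""
        end = offset + length
        while offset < end:
            chunk_id, start = divmod(offset, CHUNK_SIZE)
            n = min(CHUNK_SIZE - start, end - offset)
            piece = self._chunk(path, chunk_id)[start:start + n]
            out += piece
            if len(piece) < n:  # short chunk means end of file
                break
            offset += n
        return out


store = DbFileStore()
store.write("/a.txt", 0, b"hello")
store.write("/a.txt", 0, b"hello")  # replayed request: same final state
```

A modified NFS server would call routines like these instead of the kernel's file system calls; the real system would issue the equivalent SQL to PostgreSQL over its client/server connection.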

Main Results
Efficient Implementation is Possible
- Same order of magnitude as the native file system for read/write operations
Choice of Database Schema is Important
Server Cache Usage is Critical
- Avoids database round-trips

Roadmap
Approach
- NFS server choices
- Database choices
- Architecture/Design
Experimental Setup & Results
Summary/Conclusions

Database Choices
Many available DBMSs
We chose PostgreSQL
- Free, open source
- Inspiration for our work was the Inversion File System, also implemented on top of Postgres
- Uses client/server model

NFS Server Choices
Kernel mode
- Pros: included in Linux, supports NFS v3
- Cons: difficult to debug
User mode - UNFSD
- Pros: easier to debug, communication with PostgreSQL possible!
- Cons: only supports NFS v2
Our choice: User mode

Architecture

Database Schema
meta-data -> file_attributes
dir hierarchy -> naming
data -> Many options
- Table/File (used by Inversion FS)
- Single Table (avoids table creation overhead)
- Intermediate solutions (e.g. table/dir)

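Single Table Schema
[Schema diagram: three tables related through inode, with 1 and N cardinality labels on the links]
file_attributes(inode, uid, gid, mode, nlinks, size, ctime, mtime, atime)
naming(name, parent, inode)
all_files(inode, chunk_id, data)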
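The single-table layout can be sketched as DDL plus a path lookup. The table and column names come from the slide; the column types, the primary keys, the root inode number, and the `resolve` helper are assumptions, and SQLite again stands in for PostgreSQL.

```python
import sqlite3

# Column names follow the slide; types and keys are assumptions.
SCHEMA = """
CREATE TABLE file_attributes (
    inode INTEGER PRIMARY KEY,
    uid INTEGER, gid INTEGER, mode INTEGER, nlinks INTEGER, size INTEGER,
    ctime INTEGER, mtime INTEGER, atime INTEGER
);
CREATE TABLE naming (
    name TEXT, parent INTEGER, inode INTEGER,
    PRIMARY KEY (parent, name)
);
CREATE TABLE all_files (
    inode INTEGER, chunk_id INTEGER, data BLOB,
    PRIMARY KEY (inode, chunk_id)
);
"""

ROOT_INODE = 1  # hypothetical convention for the root directory


def resolve(db, path):
    """Walk the naming table one path component at a time.

    Returns the inode of the final component, or None if any step is missing.
    """
    inode = ROOT_INODE
    for name in filter(None, path.split("/")):
        row = db.execute(
            "SELECT inode FROM naming WHERE parent = ? AND name = ?",
            (inode, name),
        ).fetchone()
        if row is None:
            return None
        inode = row[0]
    return inode


db = sqlite3.connect(":memory:")
db.executescript(SCHEMA)
# /home is inode 2, /home/notes.txt is inode 3 with two data chunks.
db.executemany("INSERT INTO naming VALUES (?, ?, ?)",
               [("home", 1, 2), ("notes.txt", 2, 3)])
db.executemany("INSERT INTO all_files VALUES (?, ?, ?)",
               [(3, 0, b"first chunk "), (3, 1, b"second chunk")])

# Reassemble a file by reading its chunks in chunk_id order.
content = b"".join(row[0] for row in db.execute(
    "SELECT data FROM all_files WHERE inode = ? ORDER BY chunk_id",
    (resolve(db, "/home/notes.txt"),)))
```

Because every file's chunks live in `all_files`, creating a file is a few row inserts rather than a table creation, which is the overhead the single-table option avoids. Each path component still costs one query in this sketch.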

Caching
Old Story: Client Side Caching
- Buffer cache
New Story: Server Side Caching
- Minimize the number of round-trips to the DB by maintaining three different caches:
  - Stat cache [callout: Major Contribution]
  - Naming cache
  - Buffer cache (significantly beneficial only in a multi-client environment)
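A server-side stat cache might look roughly like the sketch below. The class `StatCache`, its methods, and the `fetch_from_db` callback are hypothetical; the slides do not describe the cache's interface or its invalidation policy. The counter only illustrates how repeated attribute lookups stop costing a database round-trip.

```python
class StatCache:
    """Toy stat cache: remembers file attributes keyed by inode."""

    def __init__(self, fetch_from_db):
        self._fetch = fetch_from_db  # callable: inode -> attribute dict
        self._entries = {}
        self.round_trips = 0  # number of times the database was consulted

    def lookup(self, inode):
        """Return cached attributes, going to the database only on a miss."""
        if inode not in self._entries:
            self.round_trips += 1
            self._entries[inode] = self._fetch(inode)
        return self._entries[inode]

    def invalidate(self, inode):
        """Drop an entry, e.g. after a write changes size or mtime."""
        self._entries.pop(inode, None)


# A dict stands in for the file_attributes table in this sketch.
fake_table = {3: {"size": 24, "mode": 0o644}}
cache = StatCache(lambda inode: dict(fake_table[inode]))

for _ in range(100):
    attrs = cache.lookup(3)  # 100 lookups, one database round-trip

fake_table[3]["size"] = 48
cache.invalidate(3)          # the next lookup will refetch
fresh = cache.lookup(3)
```

A naming cache could follow the same pattern with `(parent, name)` as the key and the child inode as the value.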

Binary Data
SQL statements issued to PostgreSQL must contain ASCII data only
Provides escaping function
- escape(data) ≤ 4 x data
We used base64 encoding
- base64(data) = 4/3 x data
- Best case raw write time is 4/3 of the native file system write time
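The 4/3 figure can be checked with Python's standard `base64` module. The helper `encoded_size` and the 8192-byte payload are illustrative choices, not values from the slides. Base64 emits 4 ASCII characters for every 3 input bytes, rounding up to a whole group.

```python
import base64


def encoded_size(n):
    """Length of the padded base64 encoding of n bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)


payload = bytes(range(256)) * 32       # 8192 bytes covering every byte value
encoded = base64.b64encode(payload)    # ASCII-only, safe inside SQL text
ratio = len(encoded) / len(payload)    # close to 4/3

restored = base64.b64decode(encoded)   # decoding recovers the original bytes
```

The expansion is fixed regardless of content, whereas an escaping function can expand some inputs up to the 4x bound quoted above.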

Experimental Setup

Summary/Conclusions
Design and implementation of NFS operating on top of PostgreSQL
Use of 3-tier architecture for maximum flexibility
Performance comparable to native UNIX FS for read/write operations
Factors that affect performance
- Caching (both server and client side)
- Chunk size and NFS r/w message size
- Database Schema

Things we will not do
Asynchronous database writes (for both data and meta-data)
Compare recovery times with both ext2 and ext3
Test multi-client environment
Add mechanism for querying system meta-data