Mobile File Systems

Mobility and Distributed File-systems
• Examples of distributed file-systems
  – NFS (Network File System), AFS (Andrew File System), etc.
• Key problems in a wireless/mobile environment?
  – Disconnections
  – Low bandwidths
2020-09-10 2 of 28

File-systems …
• CODA – supports disconnected operation
• PFS – targets partially connected environments
• Bayou – support for data-sharing among mobile users
• others …

CODA
• Three key components
  – Hoarding
  – Emulating
  – Reintegrating

Hoarding
• Cache maintained locally at clients
• Hoarding – fills in cache when mobile client is connected
• Two types of hoarding
  – Usage based
  – User specified
• Cache management
  – User driven
  – LRU
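A minimal sketch of how user-specified hoarding and LRU cache management could coexist in one client cache. The class and method names (HoardCache, hoard, access) are hypothetical and are not CODA's actual interface; the assumption here is that user-specified files are pinned and everything else is evicted in LRU order.

```python
from collections import OrderedDict


class HoardCache:
    """Toy client cache: pinned (user-specified) files plus LRU-managed files."""

    def __init__(self, capacity):
        self.capacity = capacity      # maximum number of cached files
        self.pinned = {}              # user-specified entries, never evicted
        self.lru = OrderedDict()      # usage-based entries, least recent first

    def hoard(self, name, data):
        """User-specified hoarding: pin a file while connected."""
        self.lru.pop(name, None)
        self.pinned[name] = data
        self._evict()

    def access(self, name, fetch):
        """Usage-based hoarding: cache whatever the user touches."""
        if name in self.pinned:
            return self.pinned[name]
        if name in self.lru:
            self.lru.move_to_end(name)   # mark as most recently used
            return self.lru[name]
        data = fetch(name)               # assumed to work only while connected
        self.lru[name] = data
        self._evict()
        return data

    def _evict(self):
        # Drop least recently used unpinned files until within capacity.
        while len(self.pinned) + len(self.lru) > self.capacity and self.lru:
            self.lru.popitem(last=False)
```

With a capacity of two, pinning one file and then touching two others leaves the pinned file and only the most recently touched file in the cache.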

Emulating
• When the mobile client is disconnected, CODA enters the emulation mode
• All requests are served from the local cache
• Client modify log (CML) maintained based on user writes
• CML used later in conflict detection and resolution
• CML optimized periodically to reduce its size
  – delete unnecessary entries
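A minimal sketch of what "delete unnecessary entries" could mean for a modify log. The entry format (op, path, data) and the two cancellation rules are assumptions chosen for illustration, not CODA's actual log format or optimizer: a later store supersedes earlier stores to the same file, and removing a file that was created during disconnection cancels all of its entries.

```python
def optimize_cml(log):
    """Return a shorter log with the same net effect on the server.

    Each entry is (op, path, data) with op in {"create", "store", "remove"}.
    """
    out = []
    for op, path, data in log:
        if op == "store":
            # A later store supersedes earlier stores to the same file.
            out = [e for e in out if not (e[0] == "store" and e[1] == path)]
            out.append((op, path, data))
        elif op == "remove":
            created_here = any(e[0] == "create" and e[1] == path for e in out)
            # Creates and stores for a removed file no longer need replaying;
            # earlier removes of a server-side copy are kept.
            out = [e for e in out if e[1] != path or e[0] == "remove"]
            if not created_here:
                # The file exists on the server, so the remove must be shipped.
                out.append((op, path, None))
        else:
            out.append((op, path, data))
    return out
```

A temporary file that is created, written, and deleted while disconnected leaves no trace in the optimized log, which is the kind of saving that matters on a low-bandwidth link.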

Reintegrating
• Transparent conflict resolution for directories
  – Except in cases that cannot be resolved automatically – directory permission changes, addition/deletion of the same directories …
• File conflict resolution
  – ASR – application-specific resolvers
    • Possess information about the nature of files (say calendars) – resolve based on application semantics
  – Manual repair
    • User is provided with a view-graph of the conflicts and asked to resolve them

Optimizations
• Rapid cache validations
  – Original mechanism in CODA for cache coherence based on callbacks
  – Callbacks cannot be used when clients are disconnected
  – Client validates cached copies explicitly upon reconnection
  – Potentially time-consuming
  – CODA uses cache coherence checks at multiple levels of granularity

Rapid cache validations - Illustration
[Figure: volumes Vol 1 through Vol 4 with version stamps (Version x, Version y, Version z)]
• Check volume version stamp
• If version stamps different, check individual object version stamps
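The two-level check can be sketched as follows. The dictionary layout and the function name validate_cache are assumptions for illustration; the point is only that a matching volume stamp validates every object in that volume with a single comparison.

```python
def validate_cache(cached_volumes, server):
    """Return the set of (volume, object) pairs whose cached copy is stale.

    Both arguments map volume name -> {"stamp": int, "objects": {name: version}}.
    """
    stale = set()
    for vol, cached in cached_volumes.items():
        current = server[vol]
        if cached["stamp"] == current["stamp"]:
            continue  # one comparison validates the whole volume
        # Volume stamps differ: fall back to per-object version checks.
        for name, version in cached["objects"].items():
            if current["objects"].get(name) != version:
                stale.add((vol, name))
    return stale
```

Only volumes whose stamp changed pay the per-object cost, so reconnection after a quiet period needs roughly one check per volume.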

Trickle Re-integration
• When network is weakly connected, propagate updates to server asynchronously
• Trade-offs
  – Lower bandwidth efficiency as fewer CML optimizations are possible
  – More up-to-date copies at server – fewer conflicts possible
• CODA allows users to dynamically set the period of the asynchronous updates based on requirements
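One way to picture the trade-off is an aging window: log entries are shipped only after they have sat in the log for a configurable period, so younger entries remain available for optimization. The function below is a sketch under that assumption; the name split_for_trickle and the (timestamp, entry) log format are hypothetical.

```python
def split_for_trickle(log, now, aging_window_s):
    """Split a log into entries old enough to ship and entries to keep.

    log is a list of (timestamp_s, entry) pairs, oldest first.
    """
    ship = [entry for t, entry in log if now - t >= aging_window_s]
    keep = [(t, entry) for t, entry in log if now - t < aging_window_s]
    return ship, keep
```

A short window keeps the server copy fresher but ships entries before they can be cancelled or merged; a long window does the opposite.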

User-assisted cache miss handling
• When a cache miss occurs, what should be done?
• Option 1 – fetch from server
• Option 2 – convey error message to user
• Trade-offs?
  – Latency vs. availability
  – Patience vs. importance
• CODA uses a “patience threshold” to decide whether a file can be retrieved within this threshold
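A sketch of a patience-threshold decision, assuming the threshold grows linearly with a user-assigned importance value. The formula, the default constants, and the name handle_miss are illustrative assumptions, not CODA's actual model.

```python
def handle_miss(size_bytes, bandwidth_bps, priority,
                base_s=2.0, per_priority_s=0.5):
    """Choose between fetching transparently and reporting the miss.

    The estimated fetch time is compared with a threshold that grows
    with the file's importance (priority).
    """
    fetch_time_s = size_bytes * 8 / bandwidth_bps
    threshold_s = base_s + per_priority_s * priority
    return "fetch" if fetch_time_s <= threshold_s else "report_miss"
```

Small or important files are fetched without bothering the user; large, unimportant files over a slow link are reported so the user can decide.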

Mimic: Raw Activity Shipping for File Synchronization in Mobile File Systems GNAN Research Group School of Electrical and Computer Engineering Georgia Institute of Technology Atlanta, GA 30332, USA

File Synchronization in Mobile File Systems
• File synchronization in distributed file systems is a consistency restoration process between a file server and a client
• Over mobile networks such as a WWAN, file synchronization in traditional distributed file systems cannot be used effectively due to limited bandwidth
➢ Bandwidth usage efficiency of file synchronization is important

Current File Synchronization Schemes (1)
• Data compression
  – Compresses each block of data or meta-data
  – On-line data compression in a log-structured file system [Burrows ’92]
• Differential update
  – Exploits similarities between versions of the same file
    • Based on the diff scheme of the UNIX systems
  – Rsync [Tridgell ’00]
  – Low-bandwidth network file systems [Muthi ’01]

Current File Synchronization Schemes (2)
• Operation shipping
  – Logs and ships user operations that update the files
    • Session/application level logging and replaying
    • Needs modification for GUI-based interactive applications
  – Corrects replaying errors by forward error correction (FEC)
    • Minor re-execution discrepancies caused by non-repeatable operations can be detected by fingerprint algorithms
  – Operation shipping for mobile file systems [Lee ’02]

Motivation
Update size comparison in Microsoft Word

Activity Description                           Activity Events                  Full File Size  diff Size  Activity Size
Insert a line                                  98 keystrokes                    29341 B         1543 B     236 B
Insert a paragraph                             476 keystrokes                   29356 B         2111 B     848 B
Copy and paste a paragraph from the same file  6 keystrokes + 12 mouse-clicks   33449 B         1119 B     72 B
Change the font type of a paragraph            7 mouse-clicks                   40611 B         1660 B     30 B

Goal and Overview
• Goal
  – To design an application-unaware activity shipping scheme for file synchronization in interactive applications
➢ Mimic is a file synchronization strategy that records raw activity at the client side, ships the records to the server during synchronization, and replays the activities at the server

Mimic System Overview
[Figure: components shown are Input Device (Event), Input Driver, Input Message Queue, Input Message, Application, File, Mimic Recorder, Record, Local File System, Network File System (File / Record), File Server, Mimic Replayer, and Mimic File Verifier (Fingerprint / FEC)]

Mimic Design Elements
• Mimic client design: recording raw input activity
  – Mapping filename to process/session
  – Record optimization
• Mimic server design: replaying raw input activity
  – Environment synchronization
  – Replaying speed optimization
• Integration of Mimic with file systems
• Verification/error correction of replayed files
  – Presented in [Lee ’02]

Mimic Client Design: Mimic Client to Window Manager Interface
[Figure: on the client, the Mimic Client (Record) calls trapSystemMessage(), getClipboard(), getEnvironment(), and processToWindowHandle() on the Window Manager and receives system messages from the OS; Mimic Synchronization connects the Mimic Client and the Mimic Server, alongside Native File Synchronization between the client and server file systems]

Mimic Client Design: Background
• Translation of input activity
  – Device driver translates an input event into an input system message
  – Handle of the corresponding window is also written in the message
• Delivery of input system messages
  – An interactive application has a set of application windows
  – System message queue demultiplexes system messages according to their window handle

Mimic Client Design: Window Handle Table (WHT)
• WHT maps the process (or session) handle and filename of an application to the corresponding set of window handles
  – When a file is opened with a filename, the mapping information is acquired and added to the WHT
    • through processToWindowHandles()
  – When the file is closed, the mapping is removed from the WHT
• Mimic recorder captures system messages having the window handles listed in the WHT or the system window handles
  – through trapSystemMessage()
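A sketch of the WHT as a dictionary keyed by (process handle, filename). The class and method names are hypothetical, and the window handles passed to on_open stand in for whatever processToWindowHandles() would return on a real system.

```python
class WindowHandleTable:
    """Toy WHT: (process handle, filename) -> set of window handles."""

    def __init__(self):
        self._by_file = {}

    def on_open(self, process_handle, filename, window_handles):
        # Mapping acquired when the file is opened.
        self._by_file[(process_handle, filename)] = set(window_handles)

    def on_close(self, process_handle, filename):
        # Mapping removed when the file is closed.
        self._by_file.pop((process_handle, filename), None)

    def should_record(self, hwnd, system_handles=frozenset()):
        """True if a message addressed to hwnd should be captured."""
        if hwnd in system_handles:
            return True
        return any(hwnd in handles for handles in self._by_file.values())
```

The recorder would consult should_record for each trapped message and ignore input addressed to windows of applications that do not have a shared file open.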

Mimic Client Design: Descriptors
• Each captured message is translated into a descriptor
• Activity Descriptor (AD)
  – Describes an individual user input activity for an application
• Meta Descriptor (MD)
  – Captures the changes of system environment during recording
    • Corresponding system messages are generated
    • Includes screen resolution, color depth, keyboard layout, clipboard content, etc.
• Environment Descriptor (ED)
  – Describes the initial system environment when recording begins
    • Same structure as that of an MD
    • through getEnvironment() and getClipboard()

Mimic Client Design: Descriptor Structures
[Figure: bit layouts of the descriptors, with these panels and fields:
 EVENTMSG in Windows (20 bytes): message identifier, lParam, wParam, time, hwnd
 Activity Descriptor (Keyboard Activity): message type, virtual-key code, time
 Activity Descriptor (Mouse Activity): message type, x-coordinate, y-coordinate, time
 Activity Descriptor (Meta Descriptor Link): message type, MD link information, time
 Meta Descriptor (Clipboard Content): message type, clipboard size, variable-length clipboard information, time
 Meta Descriptor (Screen Resolution Change): message type, x-coordinate, y-coordinate, time
 Meta Descriptor (Keyboard Layout Change): message type, keyboard layout information, time
 Meta Descriptor (Color Depth Change): message type, color information, time
 The fixed-size descriptors are labeled 2 bytes or 4 bytes.]
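To illustrate how a 2-byte descriptor can replace a 20-byte EVENTMSG, here is a sketch of packing a keyboard activity descriptor. It assumes one reading of the figure's bit offsets: a 4-bit message type (bits 0 to 3), an 8-bit virtual-key code (bits 4 to 11), and a 4-bit time field (bits 12 to 15). The field widths, the bit ordering, the little-endian byte order, and the function names are all assumptions.

```python
def pack_keyboard_ad(msg_type, vkey, time_delta):
    """Pack a keyboard activity into 2 bytes (assumed 4/8/4-bit layout)."""
    if not (0 <= msg_type < 16 and 0 <= vkey < 256 and 0 <= time_delta < 16):
        raise ValueError("field out of range for the assumed layout")
    word = msg_type | (vkey << 4) | (time_delta << 12)
    return word.to_bytes(2, "little")


def unpack_keyboard_ad(raw):
    """Inverse of pack_keyboard_ad: return (msg_type, vkey, time_delta)."""
    word = int.from_bytes(raw, "little")
    return word & 0xF, (word >> 4) & 0xFF, (word >> 12) & 0xF
```

Under these assumptions each keystroke costs 2 bytes on the wire instead of 20, which is the kind of reduction the record optimization aims for.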

Mimic Client Design: Records
• Each descriptor is recorded in a record
• File Activity Record (FAR)
  – Maintained per file
  – Consisting of file information, a set of environment descriptors (EDs), and a sequence of activity descriptors (ADs)
  – Linked to a meta descriptor (MD) of a meta activity record (MAR) when an environment change happens
• Meta Activity Record (MAR)
  – Shared by file activity records (FARs)
  – Consisting of a sequence of meta descriptors (MDs)
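The relationship between the two record types can be sketched with plain data classes. The field and method names are hypothetical; the assumption illustrated is that an environment change is stored once in the shared MAR and each affected FAR stores only a link to it.

```python
from dataclasses import dataclass, field


@dataclass
class MetaActivityRecord:
    """Shared sequence of meta descriptors (MDs)."""
    meta_descriptors: list = field(default_factory=list)

    def add(self, md):
        self.meta_descriptors.append(md)
        return len(self.meta_descriptors) - 1   # index used as the link


@dataclass
class FileActivityRecord:
    """Per-file record: file information, EDs, and a sequence of ADs."""
    filename: str
    environment: dict                            # environment descriptors
    activities: list = field(default_factory=list)

    def record_input(self, ad):
        self.activities.append(("input", ad))

    def link_meta(self, md_index):
        # An AD of type "meta descriptor link" pointing into the MAR.
        self.activities.append(("md_link", md_index))
```

Storing the MD once and linking to it avoids copying, for example, a large clipboard content into every open file's record.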

Mimic Client Design: Mimic Client to File System Interface
[Figure: the window manager interface as before (trapSystemMessage(), getClipboard(), getEnvironment(), processToWindowHandle(), system messages from the OS), extended with the file system interface between the File System and the Mimic Client: open(), close(), rename(), delete(), finish(), and synch() returning Synch_Status; Mimic Synchronization runs between the Mimic Client and the Mimic Server, alongside Native File Synchronization between the file systems]

Mimic Client Design: Integration with File Systems
• open(filename, mode, process_handle)
  – Called when a shared file is opened
• close(filename), rename(filename), delete(filename)
  – Called when a shared file is closed, renamed, or deleted
• synch(filename, diff_size)
  – Called when a shared file needs to be synchronized
    • Decides update mode by comparing FAR_size with diff_size
  – Returns Synch_Status
    • SYNCH_FAIL if Mimic synchronization fails or diff is chosen
    • Falls back to an update in diff mode when Mimic synchronization fails
• finish()
  – Called when the synchronization process is terminated
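The mode decision inside synch() can be sketched as below. Only SYNCH_FAIL is named in the slides; SYNCH_OK, the extra parameters far_size and ship_and_replay, and the tie-breaking rule (diff wins when the sizes are equal) are assumptions added to make the sketch self-contained.

```python
SYNCH_OK = "SYNCH_OK"      # hypothetical success status
SYNCH_FAIL = "SYNCH_FAIL"  # status named in the slides


def synch(filename, diff_size, far_size, ship_and_replay):
    """Pick Mimic or diff mode and report the outcome.

    ship_and_replay is a callable that ships the FAR for filename and
    returns True if the server replayed and verified it successfully.
    """
    if far_size >= diff_size:
        return SYNCH_FAIL            # diff is the cheaper update mode
    if ship_and_replay(filename):
        return SYNCH_OK
    return SYNCH_FAIL                # caller falls back to diff mode
```

In both SYNCH_FAIL cases the file system proceeds with its native diff-based update, so the caller does not need to know which of the two reasons applied.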

Mimic Server Design: Mimic Server to Window Manager/OS Interface
[Figure: the client-side interfaces as before (Record: trapSystemMessage(), getClipboard(), getEnvironment(), processToWindowHandle(); file system: open(), close(), rename(), delete(), finish(), synch() / Synch_Status), plus the server-side Replay interface from the Mimic Server to the Window Manager and OS: setEnvironment(), playActivity(), setClipboard(), executeDefaultApplication(), and waitForProcessIdle()]

Mimic Server Design: Initialization
• Environment synchronization
  – Based on the environment descriptor (ED) of the FAR
  – Initial system environment synchronization
    • through setEnvironment()
  – Clipboard content synchronization
    • through setClipboard()
• Application synchronization
  – Opens a corresponding application of the file activity record
    • Based on the file extension of the filename
    • Sets the same environment such as window size and location
    • through executeDefaultApplication()
  – Moves the system focus to the application

Mimic Server Design: Replaying
• System message/function generation
  – Activity descriptor (AD) is mapped into an input system message
  – Meta or environment descriptor (MD or ED) is mapped into a system function to set the environment
  – through playActivity()
• System message/function re-execution
  – Deliver the messages to system message queue
  – Run the system functions

Mimic Server Design: Replaying Speed Control
• User activity skipping and misinterpretation
  – Certain inputs are relevant to the application only in particular states
  – Replaying too fast may not give the application time to reach the required state, causing replay errors
• Replaying speed control in Mimic
  – Monitors the CPU utilization of the process after every message playback
  – Plays back the next AD only when the process is idle
    • through waitForProcessIdle()
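The pacing loop can be sketched with two injected callables, so the example runs without a real window manager. The names replay, play_activity, and is_idle are hypothetical stand-ins for playActivity() and the idle check behind waitForProcessIdle(); the polling interval and the timeout are assumptions.

```python
import time


def replay(activities, play_activity, is_idle, poll_s=0.01, timeout_s=5.0):
    """Play each activity, waiting for the application to go idle in between."""
    for ad in activities:
        play_activity(ad)
        deadline = time.monotonic() + timeout_s
        # Do not deliver the next input until the process has settled.
        while not is_idle():
            if time.monotonic() > deadline:
                raise TimeoutError("application did not become idle")
            time.sleep(poll_s)
```

Waiting on idleness rather than replaying at the recorded timing lets the server run as fast as the application allows, without dropping inputs that arrive while a dialog or menu is still opening.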

Experiment Setup
• Wireless wide area network (WWAN)
  – CDMA2000 1X cellular network
    • Effective data rate: about 17 Kbps
    • Round-trip time between the client and server: about 300 ms
• Operating system/application
  – Microsoft Windows 2000 Professional
  – Microsoft Office 2000 suite
• Metrics
  – Transfer size
  – Synchronization latency
    • Includes shipping, replaying, and verification delays
  – Compared with the differential update (diff)
    • xDelta for Windows

Transfer Size Results (1)

Transfer Size Results (2)

Transfer Size Results (3)
• Transfer size in Mimic is generally less than that in diff
  – Except when copying from outside the file
    • Bandwidth-inefficient clipboard structure (OLE)
• Mimic overhead is generally proportional to activity size
  – Except when copying from outside the file
    • Transfer size relies on the size of the copied object
  – Diff overhead is not proportional to activity size
    • A single line insertion in diff may consume more bandwidth than a single paragraph insertion
  – Delete or modify operations in Mimic incur significantly smaller overheads than those in diff

Synchronization Latency Results (1)

Synchronization Latency Results (2)

Synchronization Latency Results (3)
• Mimic still shows better latency performance for certain activities
  – For small insertions, deletions, internal copies, and meta data changes
  – Even though the latency in Mimic includes its playback time, its total update time does not exceed that of diff
    • Benefit of small transfer size for those operations is larger than playback overhead
• However, for the other types of activities, diff performs better in terms of latency
  – For large insertions, modifications, and external copies

Conclusions
• We propose an application-unaware and OS-independent approach called Mimic that relies on transferring raw user activity records to the server, where the file is updated through a playback of the raw user activity on the old copy of the file
• We show that Mimic performs much better than diff in most scenarios in terms of the transfer file sizes
• We conclude that Mimic can be used in tandem with diff to substantially improve file synchronization performance

Puzzle
• Lateral thinking
  – E.g.
    • A man approaches the center of a field, frantically trying to open a package. When he reaches the center of the field, he dies. Why?
    • A man lives on the 11th floor of a building
    • The elevator is a small one (can accommodate only 1 person)
    • When he goes to work, he takes the elevator to the 1st floor
    • When he comes back from work, he takes the elevator to the 6th floor, and walks up 5 floors
    • When it rains, he takes the elevator up to the 11th floor
    • Why?