Simple introduction to HDFS Jie Wu

Some Useful Features
– File permissions and authentication.
– Rack awareness: takes a node's physical location into account while scheduling tasks and allocating storage.
– Safemode: an administrative mode for maintenance.
– fsck: a utility to diagnose the health of the file system and to find missing files or blocks.
– Rebalancer: a tool to balance the cluster when data is unevenly distributed among DataNodes.
– Upgrade and rollback: after a software upgrade, it is possible to roll back to HDFS's state before the upgrade in case of unexpected problems.
– Secondary NameNode: performs periodic checkpoints of the namespace and helps keep the size of the file containing the log of HDFS modifications within certain limits at the NameNode.
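
To make rack awareness concrete, here is a minimal Python sketch of one possible rack-aware placement rule for three replicas: the first on the writing node, the second on a node in a different rack, the third on another node in that same remote rack. This is a simplified illustration and not HDFS's actual implementation; the function name `place_replicas`, the node names, and the rack names are all hypothetical.

```python
from typing import Dict, List


def place_replicas(writer: str, topology: Dict[str, str]) -> List[str]:
    """Pick three nodes for one block, spread over two racks.

    `topology` maps a node name to its rack name (hypothetical values).
    Toy policy only: real placement also weighs load, free space, etc.
    """
    local_rack = topology[writer]

    # Replica 1: the node that is writing the block.
    chosen = [writer]

    # Replica 2: the first node (in sorted order) on a different rack.
    remote = sorted(n for n, rack in topology.items() if rack != local_rack)
    if not remote:
        raise ValueError("need at least two racks")
    chosen.append(remote[0])
    remote_rack = topology[remote[0]]

    # Replica 3: another node on the same rack as replica 2.
    same_remote = [n for n in remote[1:] if topology[n] == remote_rack]
    if not same_remote:
        raise ValueError("remote rack needs at least two nodes")
    chosen.append(same_remote[0])
    return chosen


topology = {"n1": "rackA", "n2": "rackA", "n3": "rackB", "n4": "rackB"}
print(place_replicas("n1", topology))  # ['n1', 'n3', 'n4']
```

With this rule the block survives the loss of a whole rack, while only one replica has to cross the rack boundary during the write.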

Goals of HDFS
• Hardware Failure: detection of faults and quick, automatic recovery
• Streaming Data Access: designed for batch processing rather than interactive use
• Large Data Sets
• Simple Coherency Model: write-once-read-many access model, but there is a plan to support appending writes to files in the future
• Moving Computation is Cheaper than Moving Data
• Portability
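
The goal "Moving Computation is Cheaper than Moving Data" can be illustrated with a small Python sketch of locality-aware task assignment: prefer a free node that already stores the block, and fall back to any free node otherwise. The function `pick_node`, the block IDs, and the node names are assumptions made for this example, not part of any Hadoop API.

```python
from typing import Dict, List, Optional


def pick_node(block: str,
              locations: Dict[str, List[str]],
              free_nodes: List[str]) -> Optional[str]:
    """Choose where to run a task that reads `block`.

    `locations` maps a block ID to the nodes holding a replica.
    Returns a data-local node if one is free, else the first free
    node, else None when nothing is free.
    """
    holders = set(locations.get(block, []))
    for node in free_nodes:
        if node in holders:
            return node  # data-local: no block transfer needed
    # No local node is free; the block must be read over the network.
    return free_nodes[0] if free_nodes else None


locations = {"blk_1": ["n1", "n3"], "blk_2": ["n2"]}
print(pick_node("blk_1", locations, ["n2", "n3"]))  # n3 (holds blk_1)
print(pick_node("blk_2", locations, ["n3"]))        # n3 (remote read)
```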

Architecture

What It Can Support
• Create, read, write (once), remove, copy and rename a file; no modification of existing content
• A very simple permission model like Linux; no user authentication like Kerberos and no encryption of data transfers
• Directory quotas, no user quotas
• No hard links or soft links
• Recycle Bin
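
The operations listed above can be pictured with a toy in-memory namespace in Python. It is only a sketch of the semantics (write once, rename, remove into a recycle bin); the class `ToyNamespace` and its methods are hypothetical and do not correspond to the HDFS client API.

```python
class ToyNamespace:
    """In-memory stand-in for a write-once file namespace."""

    def __init__(self) -> None:
        self.files = {}  # path -> bytes
        self.trash = {}  # removed files, kept until purged

    def create(self, path: str, data: bytes) -> None:
        # Write once: an existing file cannot be overwritten or modified.
        if path in self.files:
            raise FileExistsError(path)
        self.files[path] = data

    def read(self, path: str) -> bytes:
        return self.files[path]

    def rename(self, src: str, dst: str) -> None:
        if dst in self.files:
            raise FileExistsError(dst)
        self.files[dst] = self.files.pop(src)

    def remove(self, path: str) -> None:
        # Removal moves the file to the recycle bin instead of erasing it.
        self.trash[path] = self.files.pop(path)


ns = ToyNamespace()
ns.create("/logs/a.txt", b"hello")
ns.rename("/logs/a.txt", "/logs/b.txt")
ns.remove("/logs/b.txt")
print(sorted(ns.files), sorted(ns.trash))  # [] ['/logs/b.txt']
```

Keeping removed files in a separate map mirrors the idea behind the Recycle Bin: an accidental delete can be undone until the bin is emptied.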