Overview of Azure Data Lake Store

Fundamentals

Reliable
• Automatically replicates your data
• Three copies within a single region
• Highly available

Unlimited Storage
• Unlimited account sizes
• Individual file sizes from gigabytes to petabytes
• No limits to scale

Optimized for Analytics
• Built for running large analytics systems that require massive throughput
• Optimized for parallel computation over petabytes of data
• Automatically optimizes for any throughput

Secure your Data

Access control
• POSIX-compliant Access Control Lists (ACLs) on files and folders *
• Integrated with Azure Active Directory

Auditing
• Audit logs for all operations
• Audit logs that can be analyzed with ADL U-SQL scripts

Encryption
• Transparent server-side encryption *
• Azure-managed (Azure Key Vault) and customer-managed keys *

* Features arriving by GA
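To make the idea of POSIX-style ACLs on files and folders more concrete, here is a minimal sketch of how entries such as `user:alice:rw-` could be parsed and checked. It is a simplified model for illustration only: the `AclEntry` type and the `parse_acl` and `is_allowed` helpers are hypothetical names, and the evaluation order (named user, then named groups, then other) deliberately ignores owners and masks. It is not the service's actual authorization algorithm.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    scope: str  # "user", "group", or "other"
    name: str   # empty string when the entry names nobody in particular
    perms: str  # three characters such as "r-x"


def parse_acl(spec):
    """Parse a comma-separated spec like 'user:alice:rw-,other::---'."""
    entries = []
    for part in spec.split(","):
        scope, name, perms = part.strip().split(":")
        if scope not in ("user", "group", "other") or len(perms) != 3:
            raise ValueError(f"bad ACL entry: {part!r}")
        entries.append(AclEntry(scope, name, perms))
    return entries


def is_allowed(entries, user, groups, wanted):
    """Simplified check (assumption): named user, then groups, then other."""
    if wanted not in ("r", "w", "x"):
        raise ValueError("wanted must be one of 'r', 'w', 'x'")
    # A matching named-user entry decides the outcome on its own.
    for entry in entries:
        if entry.scope == "user" and entry.name == user:
            return wanted in entry.perms
    # Otherwise any matching named-group entry may grant the permission.
    group_hits = [e for e in entries if e.scope == "group" and e.name in groups]
    if group_hits:
        return any(wanted in e.perms for e in group_hits)
    # Fall back to the "other" entry, if one exists.
    for entry in entries:
        if entry.scope == "other":
            return wanted in entry.perms
    return False


acl = parse_acl("user:alice:rw-,group:analysts:r--,other::---")
print(is_allowed(acl, "alice", set(), "w"))
print(is_allowed(acl, "bob", {"analysts"}, "w"))
```

In this sketch a user named in the list is judged only by their own entry, which mirrors the general POSIX ACL idea that the most specific entry wins.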

HDFS for the Cloud

Built from the ground up as a Hadoop file system

HDI Cluster Types
• Hadoop: works today
• Storm: works today
• HBase: works today
• Spark: by GA

Hadoop Distros
• Hortonworks *
• Cloudera *

Tools running in HDI
• Sqoop: works today
• DistCp: works today

Other
• Microsoft R Services (Revolution R)
• Apache Hadoop: works today (version 2.8 and above)

* Features arriving by GA

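ADL Store and Azure Blob Storage

• Scenarios: ADL Store is optimized for analytics; Azure Blob Storage is general-purpose bulk storage
• Billing: pay for amount stored and for I/O operations
• WebHDFS: ADL Store implements WebHDFS; Azure Blob Storage has no WebHDFS
• Authentication: ADL Store uses Azure Active Directory; Azure Blob Storage uses access keys
• Authorization: ADL Store uses POSIX-style ACLs; Azure Blob Storage uses access keys
• Data Encryption: ADL Store offers transparent server-side encryption *; Azure Blob Storage offers client-side encryption

* Features arriving by GA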
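Because ADL Store implements WebHDFS, directory listings come back in the standard WebHDFS `LISTSTATUS` JSON shape (`FileStatuses` containing a `FileStatus` array). The sketch below parses such a response. The sample payload and the `summarize_listing` helper are illustrative assumptions, not output captured from a real account.

```python
import json

# Hypothetical sample shaped like a WebHDFS LISTSTATUS response.
SAMPLE_RESPONSE = """
{"FileStatuses": {"FileStatus": [
  {"pathSuffix": "logs", "type": "DIRECTORY", "length": 0, "permission": "770"},
  {"pathSuffix": "events.csv", "type": "FILE", "length": 1048576, "permission": "640"}
]}}
"""


def summarize_listing(body):
    """Return (name, type, size in bytes) for each entry in the listing."""
    statuses = json.loads(body)["FileStatuses"]["FileStatus"]
    return [(s["pathSuffix"], s["type"], s["length"]) for s in statuses]


for name, kind, size in summarize_listing(SAMPLE_RESPONSE):
    print(f"{kind:<10} {size:>10} {name}")
```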

Ingress and Egress

Services
• Azure Data Factory
• ADL Copy Service
• Azure Import/Export Service
• Azure Stream Analytics *

Tools
• Apache Sqoop™
• DistCp
• Azure Portal
• Azure PowerShell
• Azure X-Platform CLI

ADL SDKs
• .NET SDK
• Node.js SDK
• Java SDK *
• Python SDK *

ADL REST endpoints
• Curl
• Any HTTP REST client

* Features arriving by GA
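Since any HTTP REST client can talk to the ADL REST endpoints, a request can be assembled with nothing beyond the Python standard library. The following is a minimal sketch under stated assumptions: the host pattern `<account>.azuredatalakestore.net`, the `/webhdfs/v1` prefix, and the bearer-token header are not given on the slides, and `build_liststatus_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_liststatus_request(account, path, token):
    """Build (but do not send) a WebHDFS-style LISTSTATUS request."""
    # Assumed endpoint layout: https://<account>.azuredatalakestore.net/webhdfs/v1/<path>
    base = f"https://{account}.azuredatalakestore.net/webhdfs/v1"
    query = urlencode({"op": "LISTSTATUS"})
    url = f"{base}/{quote(path.strip('/'))}?{query}"
    # Assumed: an Azure Active Directory access token passed as a bearer token.
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


req = build_liststatus_request("myaccount", "/data/raw", "TOKEN")
print(req.get_method(), req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would issue the call; the token would have to come from Azure Active Directory first.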

Integration with Azure Data Factory

Sources
• Azure Blob
• Azure Table
• Azure SQL Database
• Azure SQL Data Warehouse
• Azure DocumentDB
• Azure Data Lake Store
• SQL Server
• File system
• Oracle database
• MySQL database
• DB2 database
• Teradata database
• Sybase database
• PostgreSQL database

Sinks
• Azure Blob
• Azure Table
• Azure SQL Database
• Azure SQL Data Warehouse
• Azure DocumentDB
• Azure Data Lake Store
• SQL Server

http://aka.ms/AzureDataLake
- Slides: 8