Big Data as a cornerstone of Cortana Intelligence
[Diagram: data sources (apps, sensors and devices, data) feed Cortana Intelligence, which serves people (web, mobile, bots) and automated systems. Flow: Data → Intelligence → Action]
Big Data as a cornerstone of Cortana Intelligence
• Data Sources: Apps; Sensors and devices; Data
• Information Management: Data Factory, Data Catalog, Event Hubs
• Big Data Stores: Data Lake Store, SQL Data Warehouse
• Machine Learning and Analytics: Machine Learning, Data Lake Analytics, HDInsight (Hadoop and Spark), Stream Analytics
• Intelligence: Cognitive Services, Bot Framework, Cortana
• Dashboards & Visualizations: Power BI
• People (Web, Mobile, Bots) and Automated Systems
• Data → Intelligence → Action
Bringing Big Data to everybody
Accelerate the pace of innovation through a state-of-the-art cloud platform
[Diagram: user adoption across offerings that range from control to ease of use]
• IaaS Hadoop: Hadoop technology (Azure Marketplace HDP)
• Managed Hadoop: workload-optimized, managed clusters (Azure HDInsight)
• Big Data as-a-service: specific apps in a multi-tenant form factor (Azure Data Lake Analytics)
• Big Data analytics: Azure Data Lake Analytics, Azure HDInsight, Azure Marketplace HDP
• Big Data storage: Azure Data Lake Store, Azure Storage
Azure HDInsight
Hadoop and Spark as a Service on Azure
• Fully managed Hadoop for the cloud with services like Spark, Storm, HBase, Hive, Tez, Pig, MapReduce (Python, Java, C#) …
• 100% open source Hortonworks Data Platform
• Clusters up and running in minutes
• Managed, monitored and supported by Microsoft with the industry’s best SLA
• Familiar BI tools for analysis, or open source notebooks for interactive data science
• 63% lower TCO than deploying your own Hadoop on-premises*
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
Azure Data Lake Store
A hyper-scale repository for Big Data analytics workloads
• Hadoop File System (HDFS) for the cloud
• No limits to scale
• Store any data in its native format
• Enterprise-grade access control, encryption at rest
• Optimized for analytic workload performance
Azure Data Lake Analytics
A new distributed analytics service
• Distributed analytics service built on Apache YARN
• Elastic scale per query lets you focus on business goals, not on configuring hardware
• Includes U-SQL, a language that unifies the benefits of SQL with the expressive power of C#
• Integrates with Visual Studio to develop, debug, and tune code faster
• Federated query across Azure data sources
• Enterprise-grade role-based access control
Azure Data Lake
• Store and analyze data of any kind and size
• Develop faster, debug and optimize smarter
• Interactively explore patterns in your data
• No learning curve
• Managed and supported
• Dynamically scales to match your business priorities
• Enterprise-grade security
• Built on YARN, designed for the cloud
[Diagram labels: HDInsight, Analytics, U-SQL, Hive, R Server, YARN, HDFS, Store]
Azure Data Lake: Big Data made easy
• Analytics on any data, any size
• Easier and more productive for all users
• Enterprise-ready
Store any size of data, optimized for high performance
• Store data in its native format
• No fixed limits on file sizes: petabyte-sized files are supported
• Ultra-fast read/write access
• Optimized for large analytic systems with massive throughput
• Optimized for IoT with high availability
[Graphic: Store, scaling from TBs to EBs]
Any type of analytics
• Batch, interactive, streaming, machine learning
• Allows for exploratory analytics over data
• Analyze with Hadoop and Microsoft solutions
[Diagram labels: HDInsight, Analytics, U-SQL, Hive, R Server, YARN, HDFS, Store, Cortana Intelligence Suite]
Analytics that dynamically scale to match your needs
• Architected for cloud scale and performance
• Provision any amount of resources with a few clicks
• Dynamically provisions and winds down resources
• Frees you up to focus only on your business logic
Easy for administrators to spin up quickly
• Deploy big data projects in minutes
• No hardware to install, tune, configure or deploy
• No infrastructure or software to manage
• Scale from tens to thousands of machines instantly
Easy for developers, from novice to expert
• Deep integrations with IntelliJ and Visual Studio
• Easy for novices to write simple queries
• Robust environment for experts
• Integrated with U-SQL, Hive, and Storm
• Playback visually displays performance to identify bottlenecks and areas for optimization
Easy for developers with a familiar language
• Leverages U-SQL: a simple, powerful and easily extensible language that is familiar to SQL and/or .NET developers
• Unifies the declarative nature of SQL with the expressive power of C#
• Familiar syntax to millions of developers
Easy notebook experience for data scientists
• Most popular notebook, Jupyter, out of the box
• Combine code, statistical equations and visualizations
• Worked with the Jupyter community to enhance the kernel to allow Spark execution through a REST endpoint
Easy for data scientists with the familiar R language
R Server for HDInsight*
• Largest R-compatible parallel analytics library
• Terabyte-scale machine learning: 1,000x larger than in open source R
• Up to 100x faster performance using Spark and optimized vector/math libraries
• Enterprise-grade security and support
*Applies to HDInsight only
Easy for business analysts with interactive reports over big data
• Interactive BI with big data
• Spark integration with Power BI, Tableau, SAP Lumira and Qlik
• Power BI offers a streaming connector with Spark Streaming
Highest availability guarantee in the industry for peace of mind*
• Managed, monitored and supported by Microsoft
• Enterprise-leading SLA: 99.9% uptime
• No IT resources needed for upgrades and patching
• Microsoft monitors your deployment so you don’t have to
*Applies to HDInsight only
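To put the 99.9% figure in concrete terms, the sketch below converts an uptime percentage into the downtime it still permits. The function name and the 30-day month are illustrative assumptions, not part of the SLA; the actual agreement defines how uptime is measured.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float = 30.0) -> float:
    """Return the downtime, in minutes, that still meets the given uptime SLA.

    Assumption: the period is a flat number of days (30 by default);
    the real SLA terms may measure uptime differently.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


# 99.9% uptime leaves 0.1% of the period as permitted downtime.
monthly = allowed_downtime_minutes(99.9)
yearly = allowed_downtime_minutes(99.9, period_days=365)
print(f"{monthly:.1f} minutes per 30-day month, {yearly / 60:.2f} hours per year")
```

Under these assumptions, a 99.9% SLA corresponds to roughly 43 minutes of downtime in a 30-day month.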
Runs in the most datacenters worldwide
• North Central US: Illinois
• Central US: Iowa
• South Central US: Texas
• East US: Virginia
• East US 2: Virginia
• West US: California
• Brazil South: Sao Paulo State
• North Europe: Ireland
• West Europe: Netherlands
• East Asia: Hong Kong
• SE Asia: Singapore
• Japan East: Tokyo, Saitama
• Japan West: Osaka
• India Central: Pune
• China North*: Beijing
• China South*: Shanghai
• Australia East: New South Wales
• Australia South East: Victoria
Azure doubling compute and storage every 6 months
*Applies to HDInsight only
Manage and secure your data by leveraging existing IT investments
• Auditing, alerting, access control, all from within a single web-based portal
• Azure Active Directory integration for identity and access management
• Leverage existing investment in Active Directory on-premises
Lower total cost of ownership
• No hardware
• Hadoop support included with Azure support
• Pay only for what you use
• Independently scale storage and compute
• No need to hire a specialized operations team
• 63% lower total cost of ownership than on-premises*
*IDC study “The Business Value and TCO Advantage of Apache Hadoop in the Cloud with Microsoft Azure HDInsight”
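As a rough illustration of what the 63% figure implies, the sketch below applies it to an on-premises baseline. The baseline amount and the function name are hypothetical and chosen only for the arithmetic; they do not come from the IDC study.

```python
def hosted_cost(on_prem_cost: float, reduction_percent: float = 63.0) -> float:
    """Return the cost remaining after a percentage reduction in TCO."""
    return on_prem_cost * (1 - reduction_percent / 100)


# Hypothetical on-premises TCO, used only to make the percentage concrete.
on_prem_baseline = 1_000_000.0
print(f"Hosted TCO: {hosted_cost(on_prem_baseline):,.0f}")
```

A 63% reduction means the hosted deployment costs 37% of the on-premises figure, whatever that baseline is.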
Recognized by top analysts
Forrester Wave for Big Data Hadoop Cloud
• Named industry leader by Forrester with the most comprehensive, scalable, and integrated platforms*
• Recognized for its cloud-first strategy that is paying off*
*The Forrester Wave™: Big Data Hadoop Cloud Solutions, Q2 2016.
Get started now
• Learn more on the Data Lake website: http://azure.com/datalake
• Watch videos on Azure Data Lake: https://channel9.msdn.com/Series/AzureDataLake
• Take courses and read documentation on Azure Data Lake: http://aka.ms/hditraining http://aka.ms/adlanalytics http://aka.ms/adlstore
© 2016 Microsoft Corporation. All rights reserved.