BUSINESS INTELLIGENCE NEW TECHNOLOGIES, METHODOLOGY IMPLICATIONS, ENTERPRISE ARCHITECTURE AND CONTROL
NEW TECHNOLOGIES – “BIG DATA”

What is it?
- New buzz word everyone wants to talk about

What does it mean?
- Simply, data sets large enough that they are not easily managed or analyzed using standard relational or OLAP toolsets

Where does it apply?
- Born out of web traffic analysis, advertising targeting and product suggestion
- Scientific Applications
- Language Analysis

History & Technologies
- Historical BI/DW practices driven by four variables: Disk, Memory, Processor Power and Licensing
- Publication of a paper by Google on a process called MapReduce
  - Parallel processing in a highly distributed environment
  - Many relatively simple machines running MapReduce processes
- Hadoop was born as the Apache implementation of MapReduce
- Challenges
  - Not ACID and no SQL language support
  - Legacy reporting tools do not understand these sources
- NoSQL
  - Key Value Pairs
  - Document model
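To make the MapReduce idea on this slide concrete, here is a minimal single-machine sketch of its three steps (map, group by key, reduce) applied to a word count. It is an illustration only: the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` are made up for this example and are not the Hadoop API, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) key-value pair for every word in one line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence (hypothetical helper)."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "Big Data big"]))
```

Because each line is mapped independently and each key is reduced independently, both steps can be spread across many relatively simple machines, which is the property the slide highlights.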
NEW TECHNOLOGIES – “BIG DATA”

Vendors
- Legacy versus New Challengers (Commercial/Open Source)
- Legacy Data Warehouse Vendors: Oracle, IBM, Microsoft, Teradata, Netezza
- Many New Entrants
  - Cloud Based: Amazon Elastic MapReduce (EMR)
  - Hadoop Meets SQL: NuoDB, Cloudera, Cassandra, Accumulo, MS PolyBase
  - NoSQL (http://nosql-database.org/): MongoDB, CouchDB, RavenDB

Basic Question: Do you run an infrastructure with multiple BI platforms (Relational/OLAP and Hadoop), or do you wait for one of the legacy vendors to supply enough Hadoop functionality in its core offering to suffice?
NEW TECHNOLOGIES – IN MEMORY ANALYTICS

What is it?
- Full (or targeted bits of) data set in system memory
- Moving from Appliance based to Cloud based
- Initially was analytics focused
- Intended to be high performance
- Beginning to spread into the Transactional/Relational space
- Becoming mainstream technology

History & Technologies
- Enterprise Deployed – Small & Mid Size Enterprise
  - QlikView was a pioneer in this space
  - Tableau
  - Self Service Analytics using disparate data sources
  - No ETL
  - No central data architecture control
- Cloud Based
  - SAP HANA: from SAP or Amazon, priced in dollars per hour of use
  - Oracle TimesTen
  - Microsoft SQL Server 2014
NEW TECHNOLOGIES – BI IN THE CLOUD

Cloud Based BI – Traditional Vendors
- The “Cloud” has many definitions
- Virtual Machines versus “Cloud” processes
- Major Players
  - Amazon Web Services
    - (Almost) Everyone is welcome: Microsoft, Oracle, SAP HANA, NoSQL, Hadoop
    - Largely Virtual Machine based
    - Slowly adding options for higher performance
  - Microsoft Azure
    - SQL Server progressively moving to Azure
    - Reporting Service / HDInsight
    - Oracle in Azure… Wait… What?
  - SAP
    - HANA

Cloud Based BI – New Entrants
- Vendors: Birst, DOMO, GoodData, Indicee, Jaspersoft
- Cloud Only Deployment
- SaaS pricing
- Typically full life cycle solutions (ETL, Reporting)
NEW TECHNOLOGIES – DATA SERVICES IN THE CLOUD

New Uses
- Disaster Recovery
  - Tight integration of local SQL Engines and Cloud based failovers
  - Backup and Restore using the Cloud
- Complex Event Processing
  - Microsoft StreamInsight
NEW TECHNOLOGIES – METHODOLOGY IMPACTS – BIG DATA

“We need BIG DATA!”
- Sometimes more is not better

John Snow’s Cholera Map
- The cause of a particular cholera epidemic, and with it the general concept of infectious disease, was discovered by analyzing 620 data points.
- Locations of infections on a map of London were limited to a particular area.
- Initial analysis pointed towards water pumps in the vicinity.
- Confirming data was that the monks drank only beer, not pump water.
- A “Big Data” project might have involved the compilation and analysis of all infection locations worldwide, integrated with all activities performed by those individuals. The analysis would likely have been swamped by noise.

Takeaways
- Avoid the temptation to push for bigger and bigger data sets without a clear objective in mind and some scientific reasoning as to why more will be better.
- Make sure a limited scope data set is also an option for analysis when looking for specific causation.
- Consider the role of Data Scientist within the organization.
NEW TECHNOLOGIES – METHODOLOGY IMPACTS – IN MEMORY

“Who needs a data warehouse anymore?”
- I blame QlikView for the above statement.
- Cloud Based BI tools are heading down the same road. I’m looking at you, SAP.
- Vendors perceived a market opportunity to gain customers by claiming that In Memory technology eliminates the costs of data architecture and ETL development.
- Statements you may hear:
  - “I don’t need good data architecture because the speed will make up for inefficiencies in joins or storage of the data.”
  - “The users want the flexibility to join to any data source at any time. ETL just slows us down.”
- The above runs contrary to another concept that is (finally) gaining traction: Master Data Management.
NEW TECHNOLOGIES – METHODOLOGY IMPACTS – IN MEMORY

“Who needs a data warehouse anymore?”

Observations
- Results are very mixed
- Very hard to maintain a proper Data Governance/MDM process
- The best results I’ve seen have involved the use of In Memory tools on top of quality data mart/warehouse environments
- There is no free lunch: buying more memory or more virtual servers will only take you so far

BUT, there is some merit here
- Pure speed does give you options
- We see utility in prototyping new additions to the formal Data Warehouse structure, or in giving users some room to roam from the base
- Data Governance needs to maintain control
NEW TECHNOLOGIES – ARCHITECTURE & CONTROL

“Beware the Zombie Clouds!”
- Clouds are the new Flash drives with regard to data control and security
- There are a million new low cost Software as a Service options on the market
  - No or low up front adoption costs
  - Can be initiated by the user/business side of the enterprise, as well as by IT personnel outside the data governance process
  - Many are designed to quickly accept your data and make it easily accessible to an audience (which you don’t control or might not even know about)
  - Some offer Single Sign-On, but it is not required
  - Some might be quickly abandoned, leaving the data in a zombie state in perpetuity
- Data timeliness and provenance become very suspect
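One way a data governance team might act on the “zombie” warning is to keep an inventory of cloud-hosted data sets and periodically flag the ones with no accountable owner or no recent refresh. The sketch below assumes such an inventory already exists; the `CloudDataset` record, the `find_zombies` helper and the 90-day threshold are all hypothetical choices for illustration, not features of any particular SaaS product.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class CloudDataset:
    """One entry in a (hypothetical) governance inventory of cloud data sets."""
    service: str               # name of the SaaS tool holding the data
    owner: Optional[str]       # accountable person, or None if unknown
    last_refreshed: date       # last time the data was loaded or updated


def find_zombies(datasets: List[CloudDataset], today: date,
                 max_age_days: int = 90) -> List[CloudDataset]:
    """Return data sets that have no owner or are older than the threshold."""
    cutoff = today - timedelta(days=max_age_days)
    return [d for d in datasets
            if d.owner is None or d.last_refreshed < cutoff]


if __name__ == "__main__":
    # Made-up inventory entries, for illustration only.
    inventory = [
        CloudDataset("svc-a", "alice", date(2014, 5, 20)),
        CloudDataset("svc-b", None, date(2014, 5, 25)),
        CloudDataset("svc-c", "bob", date(2013, 1, 1)),
    ]
    for dataset in find_zombies(inventory, today=date(2014, 6, 1)):
        print(dataset.service, dataset.owner, dataset.last_refreshed)
```

The hard part in practice is building the inventory in the first place, since many of these services are adopted outside the data governance process.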
NEW TECHNOLOGIES – CONTACT INFO

Paul Dausman
pdausman@valordevelopment.com
twitter: @pdausman
www.valordevelopment.com
www.valianthealth.com
www.techweuse.com