DBM 380 BIG DATA
Big Data
• Big Data refers to a set of data that displays the characteristics of volume, velocity, and variety to an extent that makes the data unsuitable for management by a relational database management system (Coronel & Morris, 2017)
The 3 V’s
• Volume is the quantity of data to be stored
• Velocity is the speed at which data enters the system
• Variety is the variation in the structure of the data to be stored
Other Characteristics
• Variability – changes in the meaning of the data based on context
• Sentiment analysis – a method of text analysis that determines whether a statement conveys a positive, negative, or neutral attitude about a topic
• Veracity – refers to the integrity of the data
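The sentiment analysis bullet above can be illustrated with a very small sketch. It assumes a lexicon-based approach, which is only one of several ways to do sentiment analysis; the word lists and the function name `classify_sentiment` are hypothetical and chosen purely for the demo.

```python
# Toy lexicon-based sentiment classifier (illustrative only).
# POSITIVE and NEGATIVE are hypothetical word lists, not a real lexicon.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "poor", "hate", "terrible", "slow"}


def classify_sentiment(statement: str) -> str:
    """Return 'positive', 'negative', or 'neutral' for a statement."""
    # Lowercase each word and strip surrounding punctuation.
    words = [w.strip(".,!?;:").lower() for w in statement.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for text in [
        "I love this phone, the camera is great!",
        "Terrible battery and slow updates.",
        "The package arrived on Tuesday.",
    ]:
        print(classify_sentiment(text), "-", text)
```

A word-count score like this ignores context, which is exactly the variability problem listed on the same slide: the same word can carry a different meaning in a different setting.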
Internet Companies
• Google, Amazon, and Facebook are some of the companies that currently experience issues with Big Data
  • Each company created a tool to help store and manage the growing amount of data
    ○ Google has the BigTable data store
    ○ Amazon created Dynamo
    ○ Facebook created Cassandra
Challenges of Internet Companies
• Increasing quantity of data needing to be stored
  • Generates a need for larger databases
  • Systems can scale up (move to a larger system) or scale out (add more systems) to accommodate the increase in storage
• Storing data at increasing rates
  • As technology advances, people can access information at faster speeds
  • Tracking data is generated faster because people can browse websites, tweet, or post pictures with the touch of a button
Challenges cont’d
• Rate of storing data
  • Broken down into two categories
    ○ Stream processing
    ○ Feedback loop processing
• Storing data in a variety of formats or structures
  • Data can be structured, unstructured, or semi-structured
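The three kinds of data named above can be contrasted with a short sketch. All sample records are made up for illustration, and the helper `field_names` is a hypothetical name, not part of any library.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields
# (the shape a relational table expects). Sample values are invented.
structured = io.StringIO("user_id,country,age\n101,US,34\n102,DE,27\n")
rows = list(csv.DictReader(structured))

# Semi-structured: self-describing records whose fields may differ (e.g. JSON).
semi_structured = [
    json.loads('{"user_id": 101, "tags": ["sale", "mobile"]}'),
    json.loads('{"user_id": 102, "device": {"os": "Android"}}'),
]

# Unstructured: free text with no predefined fields at all.
unstructured = "Just posted a picture from the game, what a night!"


def field_names(record: dict) -> set:
    """Return the set of field names present in one record."""
    return set(record.keys())


# Structured rows share one schema; the semi-structured records do not.
same_structured = field_names(rows[0]) == field_names(rows[1])
same_semi = field_names(semi_structured[0]) == field_names(semi_structured[1])
print(same_structured, same_semi)
```

The second comparison comes out different because each JSON record carries its own set of fields, which is what makes a fixed relational schema an awkward fit for this kind of data.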
Tool Recommendations
• Hadoop Distributed File System (HDFS)
• MapReduce
• Use technologies that complement each other
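To make the MapReduce recommendation concrete, here is a minimal single-machine sketch of the map, shuffle, and reduce steps using word counting as the example task. It is plain Python, not the Hadoop API; the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are assumptions made for this illustration.

```python
from collections import defaultdict


def map_phase(document: str):
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of input splits."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    grouped = shuffle(pairs)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    # Each string stands in for one input split stored on a different node.
    splits = ["big data big clusters", "data nodes store data"]
    print(word_count(splits))
```

In a real cluster the map calls would run in parallel on the nodes that already hold each split, which is why MapReduce and HDFS are typically recommended together.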
Key Functions
• Processes large data sets across clusters of computers
• Write-once, read-many
• Streaming access
• Fault tolerance
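Assuming the functions above describe HDFS, the fault tolerance bullet can be sketched as block replication: a file is split into blocks, each block is copied to more than one node, and a read succeeds as long as one copy of every block survives. This is a toy model, not the HDFS API; `BLOCK_SIZE`, `REPLICATION`, the node names, and all function names are hypothetical values chosen for the demo.

```python
# Toy model of block replication (NOT the HDFS API).
# BLOCK_SIZE and REPLICATION are tiny hypothetical values for the demo.
BLOCK_SIZE = 4
REPLICATION = 2


def split_into_blocks(data: bytes, block_size: int = BLOCK_SIZE):
    """Split a file's bytes into fixed-size blocks (the last may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(blocks, nodes, replication: int = REPLICATION):
    """Store each block on `replication` different nodes, round-robin."""
    storage = {node: {} for node in nodes}
    for index, block in enumerate(blocks):
        for r in range(replication):
            node = nodes[(index + r) % len(nodes)]
            storage[node][index] = block
    return storage


def read_file(storage, block_count, failed=frozenset()):
    """Reassemble the file from any surviving replica of each block."""
    out = []
    for index in range(block_count):
        replica = next(
            (blocks[index] for node, blocks in storage.items()
             if node not in failed and index in blocks),
            None,
        )
        if replica is None:
            raise IOError(f"block {index} lost: no surviving replica")
        out.append(replica)
    return b"".join(out)


if __name__ == "__main__":
    data = b"write once, read many"
    blocks = split_into_blocks(data)
    storage = place_blocks(blocks, ["node-a", "node-b", "node-c"])
    # The file is still readable with one node down.
    print(read_file(storage, len(blocks), failed={"node-b"}))
```

With two copies of each block, the loss of any single node is survivable, but losing both nodes that hold the same block makes the read fail.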
References
• Coronel, C., & Morris, S. (2017). Database Systems: Design, Implementation, and Management. Boston: Cengage Learning.
• Baesens, B., Bapna, R., Marsden, J. R., Vanthienen, J., & Zhao, J. L. (2016). Transformational Issues of Big Data and Analytics in Networked Business. MIS Quarterly, 40(4), 807-818.
• Hiltz, A. (2017). Big Data: Strategic Assets. State Legislatures, 43(5), 8.