Big Data, a great potential
Sofia Baltsa
Vienna, 26 January 2017

Big data: what does it mean to you?
"The world is one big data problem, there's a bit of arrogance in that, and a bit of truth as well." (Andrew McAfee)
The term Big Data is relatively new, but the act of gathering and storing large amounts of information for eventual analysis is ages old.
Doug Laney articulated the now-mainstream definition of big data as the 3 Vs:
- Volume
- Variety
- Velocity

Gathering data from your environment and using it
1. Identify issues and/or opportunities for collecting data.
2. Select issues and/or opportunities and set goals.
3. Plan an approach and methods:
   - Who will the data be collected about?
   - Who will the group of interest be compared to?
   - What locations or geographical areas will the data be gathered from?
   - How should the data be collected (qualitative data, quantitative data)?
4. Collect data.
5. Analyze and interpret data.
6. Act on the results.

Where does Big Data come from?
In 2012 the amount of information stored worldwide exceeded 2.8 zettabytes.
[Pie chart: share of stored data by region (United States, Western Europe, China, India, Rest of world). Values shown: 13%, 4%, 32%, 19%, 32%.]

Where does Big Data come from?
By 2020 the total amount of data stored is expected to be 50 times larger than today.
[Pie chart: share of stored data by region (United States, Western Europe, China, India, Rest of world). Values shown: 21%, 23%, 6%, 15%, 35%.]

Big Data: sources
- Official company data
- Public data
- Private company data
- Social networks
- News
- Web
Information is now available about both the Italian and foreign markets. In Italy the following are processed and constantly updated:
- 6 million companies
- 12 million members and company representatives
- 1 million professional websites
- 70 thousand items of information daily

Knowledge Graph
A system is "big" when the volume of data increases and, at the same time, so does the speed/flow of information that the system must acquire and manage.
[Diagram: Sources → Big Data / semantic technology → Knowledge graph]
- Sources: info providers, web, news, social, open data, private sources
- Big Data / semantic technology: data ingestion, data curation, data linking, text analytics
- Knowledge graph: companies, persons, documents, data and concepts

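The curation and linking stages named on this slide can be illustrated with a minimal Python sketch. The records, field names and normalization rules below are hypothetical and chosen only for illustration; the slides do not describe the actual pipeline.

```python
from collections import defaultdict

# Hypothetical raw records, as they might arrive from different sources.
RAW_RECORDS = [
    {"source": "official register", "company": " ACME S.p.A. ",
     "person": "Mario Rossi", "role": "director"},
    {"source": "web", "company": "acme s.p.a.",
     "person": "Anna Bianchi", "role": "sales manager"},
    {"source": "news", "company": "Beta Srl",
     "person": "mario rossi", "role": "shareholder"},
]


def curate(record):
    """Data curation: normalize names so one entity matches across sources."""
    return {
        "source": record["source"],
        "company": " ".join(record["company"].split()).upper(),
        "person": " ".join(record["person"].split()).title(),
        "role": record["role"],
    }


def build_graph(records):
    """Data linking: connect persons and companies through role-labelled edges."""
    graph = defaultdict(set)
    for rec in map(curate, records):
        company = ("company", rec["company"])
        person = ("person", rec["person"])
        graph[company].add((rec["role"], person))
        graph[person].add((rec["role"], company))
    return graph


graph = build_graph(RAW_RECORDS)
```

After curation, the two spellings of the same company collapse into a single node, so the person "Mario Rossi" ends up linked to both companies he appears with.
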
Semantic Engine
A "semantic engine" is software capable of extracting meaning from electronic documents and organizing it in a structured way: categorizing documents, suggesting appropriate labels (tags), finding similar (related) documents, and extracting and highlighting entities (e.g. people, books, locations) as well as significant concepts.
[Diagram labels: companies, descriptions, contacts, persons, keys, bio and roles, business metrics, activity and products, social accounts, creating tasks]

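Three of the functions listed on this slide (suggesting tags, extracting entities, finding similar documents) can be sketched in a few lines of Python. This is a deliberately naive illustration: the tag vocabulary is hypothetical, entity spotting is a simple capitalization rule, and similarity is bag-of-words cosine. A real semantic engine would rely on far richer linguistic resources.

```python
import math
import re
from collections import Counter

# Hypothetical tag vocabulary: tag -> keywords that suggest it.
TAG_KEYWORDS = {
    "finance": {"loan", "credit", "bank"},
    "technology": {"software", "data", "cloud"},
}


def tokens(text):
    """Lower-case the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def suggest_tags(text):
    """Suggest every tag whose keywords appear in the text."""
    words = set(tokens(text))
    return sorted(tag for tag, keys in TAG_KEYWORDS.items() if words & keys)


def extract_entities(text):
    """Naive entity spotting: runs of two or more capitalized words."""
    return re.findall(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)+\b", text)


def cosine_similarity(text_a, text_b):
    """Bag-of-words cosine similarity between two documents."""
    a, b = Counter(tokens(text_a)), Counter(tokens(text_b))
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0
```

For the sentence "Sofia Baltsa spoke in Vienna about credit and data.", this sketch suggests the tags finance and technology and extracts "Sofia Baltsa" as an entity. The single word "Vienna" is missed, which shows the limits of the capitalization rule.
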
SEO, Search Engine Optimization
Web search engines (Google above all) try to understand the searcher's intent and the contextual meaning of the terms used in order to generate more relevant results.
Semantic search considers the search context, the location of the searcher, the intent, word variations, synonyms, generalized and specialized queries, concept matching and natural-language queries.
Words do not have a single meaning. A word can express different needs depending on the context, and the same need can be expressed in different words.
Algorithms can only infer the essence of content and its proximity to a particular topic, following a process of associative understanding.
How does the search engine understand the meaning? It uses a knowledge base in which entities and co-occurrences are registered for each topic.

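The idea of a knowledge base of co-occurrences per topic can be sketched as follows. The topics, terms and counts are invented for illustration, and real search engines use far more elaborate models; the sketch only shows how context can disambiguate a word such as "apple".

```python
from collections import Counter

# Hypothetical knowledge base: for each topic, the terms that co-occur with it
# and how often (counts are made up for illustration).
KNOWLEDGE_BASE = {
    "fruit": Counter({"apple": 5, "juice": 4, "tree": 3, "pie": 2}),
    "technology": Counter({"apple": 5, "iphone": 6, "laptop": 4, "store": 2}),
}


def infer_topic(query):
    """Score each topic by the co-occurrence weight of the query terms."""
    terms = query.lower().split()
    scores = {
        topic: sum(cooccurrences[t] for t in terms)
        for topic, cooccurrences in KNOWLEDGE_BASE.items()
    }
    best = max(scores, key=scores.get)
    # No known term at all: the topic cannot be inferred.
    return best if scores[best] > 0 else None
```

The word "apple" alone is ambiguous between the two topics; the surrounding words decide. `infer_topic("apple pie")` resolves to the fruit topic, while `infer_topic("apple laptop")` resolves to technology.
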
How to use Big Data: Marketing Intelligence
An example: a search for a list of companies (6 million!) with certain characteristics (location, type of activity, etc.). This is research we could do using standard tools and chamber of commerce data.
With Big Data:
- The search (e.g. by keyword) runs simultaneously across multiple sources of different types, not only the chamber of commerce register. For example, the corporate purpose is taken from what the company states on its website (and is therefore certainly more representative), and employment data from the national institution's database.
- The network of relationships linking the company to other entities, companies or people is reconstructed in order to exploit the concept of proximity (similarity).
- Lead generation is also based on news.
[Diagram: customer portfolio + external data (cloud and databases) → machine learning and fine-tuning → prioritized list of prospects/customers]

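The notion of proximity (similarity) leading to a prioritized prospect list can be sketched with a simple set-overlap measure. The company names and features are hypothetical, and Jaccard similarity is used here only as a stand-in; the slides do not say which similarity measure or machine learning model is actually applied.

```python
def jaccard(a, b):
    """Jaccard similarity: shared features divided by all features."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical customer portfolio and candidate companies, each described
# by a small set of features (location, activity, segment).
CUSTOMERS = {
    "Alfa Srl": {"milan", "software", "b2b"},
    "Gamma SpA": {"rome", "software", "cloud"},
}
PROSPECTS = {
    "Delta Srl": {"milan", "software", "cloud"},
    "Omega Snc": {"naples", "bakery", "retail"},
    "Sigma Srl": {"rome", "consulting", "b2b"},
}


def rank_prospects(customers, prospects):
    """Order prospects by their similarity to the closest existing customer."""
    scored = [
        (max((jaccard(features, c) for c in customers.values()), default=0.0), name)
        for name, features in prospects.items()
    ]
    # Highest score first; ties are broken alphabetically by name.
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]
```

With this toy data, the company sharing the most features with an existing customer comes first and the unrelated one comes last.
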
Lead Generation
Functionalities:
- Advanced search for companies
- Detailed information about every company
- Exploration of the network of shareholdings and roles of people and companies
- Lists and export
- Collaboration tools between users
Opportunities:
- Lead generation and identification of possible new customers
- Enrichment of information assets on customers and leads
- Network analysis and scouting of new opportunities in the territory
- Segmentation and exploration of niche markets and trends
- Monitoring of company lists (customers, suppliers, competitors, etc.)

Intelligence Analysis
Improving the capacity of analysis: dealing with data and information in depth.
Intelligence analysis management is the process of managing and organizing the analytical procedure that moves from "raw" to "finished" intelligence. Creating an "intelligence mosaic" is an effective and imaginative way to describe this process. Analysis, processing and production are phases of the organization and evaluation of information.
The mosaic is represented as a navigable graph (knowledge graph). The size and variety of data are now such that an operator alone cannot grasp the overall picture and the connections.
[Figure labels: Merge points (Sherman Kent), Geospatial Crime Mapping, MI6]

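What "navigable" can mean in practice is illustrated below with a breadth-first search that finds the shortest chain of connections between two entities. The graph content is hypothetical and the choice of breadth-first search is an assumption for illustration, not the method described in the slides.

```python
from collections import deque

# Hypothetical knowledge graph: each entity lists the entities it is linked to.
GRAPH = {
    "Mario Rossi": ["Alfa Srl", "Beta Srl"],
    "Alfa Srl": ["Mario Rossi", "Anna Bianchi"],
    "Beta Srl": ["Mario Rossi", "Gamma SpA"],
    "Anna Bianchi": ["Alfa Srl"],
    "Gamma SpA": ["Beta Srl"],
    "Isolated Ltd": [],
}


def find_connection(graph, start, goal):
    """Breadth-first search: return the shortest path from start to goal, or None."""
    if start not in graph or goal not in graph:
        return None
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(path + [neighbor])
    return None
```

On this toy graph, `find_connection(GRAPH, "Anna Bianchi", "Gamma SpA")` returns the chain through Alfa Srl, Mario Rossi and Beta Srl. On a graph with millions of nodes, this is the kind of connection an operator could not trace by hand.
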
How the knowledge graph helps MFIs (microfinance institutions)
Thanks to the knowledge graph we are able to:
• Find the right information (narrow the search results down to what you are really looking for);
• Provide an adequate and complete summary (summarize relevant content around the searched topic through related information);
• Go deeper and extend the results (add extra information to the search that allows us to discover relevant data).
This means for MFIs:
I. Cost reductions
II. Time reductions
III. New product development and optimized offerings
IV. Smart decision making
• Because microcredit carries low commissions, MFIs need to make the maximum possible use of IT tools to reduce the cost of analysis.

Integration into IT systems is the key
1. Marketing intelligence
   • Quick and easy integration for sending lists to CRM and other applications
   • Sharing of operational data to evaluate similarity criteria and to identify prospects similar to customers already covered
2. Intelligence analysis
   • Saving in the software all the elements acquired by the Big Data search engine to support the analysis phase (preliminary investigation)
   • Extraction of information in the form of reusable data (e.g. corporate offices)
3. Knowledge analysis
   • Automatic update of the system with the latest information (news, social media feeds, changes in official company data and websites...)

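One simple way to send a list to a CRM is a CSV export, sketched below in Python. The field names are hypothetical and CSV is only an assumed interchange format; the slides do not specify how the integration is implemented.

```python
import csv
import io


def prospects_to_csv(prospects, fields=("name", "city", "score")):
    """Serialize a prospect list to CSV text, keeping only the chosen fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # drop internal fields the CRM does not need
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(prospects)
    return buffer.getvalue()
```

Restricting the export to an explicit field list keeps internal notes and other non-shareable attributes out of the file handed to other applications.
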
Big Data means too much data, not all the data
- Soft information
- Contribution of private and business data?
- Sensitive data, protected by privacy (!)

Thank you for your attention.
- Slides: 16