Big Data Test Infrastructure Connecting Europe Facility DIGIT
DIGIT: Directorate-General for Informatics. DG Connect: Directorate-General for Communications Networks, Content and Technology
Agenda
1. The CEF programme overview
2. Big Data Test Infrastructure (BDTI) definition
3. BDTI applicability
4. BDTI service offerings – map overview
5. How does the test infrastructure work?
6. Examples of on-boarded projects
7. How to get started?
8. Use cases acceptance criteria
9. Annex
The CEF building blocks are funded by the Connecting Europe Facility
• CEF Regulation: defines how the Commission can finance support for the establishment of trans-European networks to reinforce an interconnected Europe.
• CEF Telecom Guidelines: cover the specific objectives and priorities as well as eligibility criteria for funding of broadband networks and Digital Service Infrastructures (DSIs).
• CEF Work Programmes: translate the CEF Telecom Guidelines into general objectives and actions planned on a yearly basis.
Budget: CEF Transport € 26.25 bn; CEF Energy € 5.85 bn; CEF Telecom: CEF Digital € 970 M*, Broadband € 170 M.
* - 100 M Juncker Package
How does CEF support projects to use the building blocks? In two ways: • One, it provides services to help you implement them in your system. There is a range of services across the building blocks, but they typically include training, sample software and testing services.
Free services: Training sessions, Sample software, Testing services
How does CEF support projects to use the building blocks? In two ways:
• One, it provides services to help you implement them in your system. There is a range of services across the building blocks, but they typically include training, sample software and testing services.
• Two, CEF provides grant funding. You can apply for grant funding to pay for the implementation of a building block in your system. More information on how you can apply, grant winners and ongoing projects is available via INEA's website. Visit INEA Website
Funding opportunities:
• Call CEF-TC-2019-1 Automated Translation (indicative budget: € 4 M)
• Call CEF-TC-2019-1 eID & eSignature (indicative budget: € 5 M)
• Call CEF-TC-2019-1 eDelivery (indicative budget: € 1 M)
• Call CEF-TC-2019-2 Public Open Data (indicative budget: € 5 M)
Open calls / deadline for submissions: 14 February 2019, 14 May 2019, 4 July 2019, 14 November 2019
The Big Data Test Infrastructure
What is the Big Data Test Infrastructure? The Big Data Test Infrastructure will provide a set of data and analytics services, from infrastructure and tools to stakeholder onboarding services, allowing European public organisations to experiment with Big Data technologies and move towards data-driven decision making.
Is BDTI for me? Yes, if you need to experiment with big data in a safe environment.
What we can help you achieve:
• Avoid setting up and maintaining a complex experimental environment
• Develop pilot projects on big data in a virtual environment
• Gather knowledge, insight and value from your data
• Experiment with and create quick prototypes to verify and test data hypotheses or data visualizations
Main Benefits
1. Access to a ready-to-use testing environment for your analytics experiments
2. The possibility to share and re-use data across policy domains and organizations
3. Access to best practices on big data
How do we support you?
1. Various data sources, software tools and big data techniques
2. Advice and ongoing onboarding support from the CEF BDTI Team
3. Knowledge base that provides insights into the platform and big data techniques
BDTI applicability
• Descriptive analysis: Use of statistics to quantitatively describe features of a collection of information
• Social media analysis: Gather and analyse data from social media to improve business decisions
• Time-series analysis: Analyse time series data in order to extract meaningful statistics and other data characteristics
• Predictive analysis: Use statistical techniques that analyse current and historical facts to make predictions about future or unknown events
• Network analysis: Investigate any structures through the use of network and graph theories
• Text analysis: Use natural language processing to analyse unstructured text data, to derive patterns and trends
Business Services Overview map
Services already implemented (Q1 2019): Test Infrastructure; Onboarding & stakeholders follow-up; Service Desk
Services being implemented:
• Q2 2019: Community building and Innovation Portal
• Q3 2019: Data Catalogue; Big Data and Analytics software catalogue
• Q4 2019: Support for Analytics Implementation
Test Infrastructure (service already implemented)
The Test Infrastructure provides the big data platform and all the data analytics tools supplied by the European Commission through a Platform as a Service. Through this service, public administrations can implement their own pilot projects in the big data field of expertise or experiment with big data technologies.
A public administration needs an analytical "sandbox" environment to experiment with big data tools and test specific big data use cases. The Test Infrastructure provides a ready-to-use environment, respecting privacy policies and using open source tools. The public administration can test its big data use case through a pilot project before deploying it into its production environment.
Onboarding & stakeholders follow-up (service already implemented)
Onboarding & stakeholders follow-up facilitates the onboarding process for stakeholders interested in using the CEF BDTI building block. Public administrations receive support in the definition of their pilot scope, identification of data sources, or analytical and technical assistance.
A public administration decides to experiment with the test infrastructure and needs guidance for the duration of the pilot. The onboarding service assists the public administration throughout the end-to-end pilot development. The public administration successfully completes the test infrastructure pilot.
Service desk (service already implemented)
The service desk acts as a Single Point of Contact: it collects and classifies the tickets and solves them by interacting with the users. During the pilot execution, users can contact the service desk for any kind of technical issue.
A public administration needs support using the test infrastructure. Through CEF Digital, it can create and send a ticket to the Service Desk. The Service Desk takes care of the tickets (e.g., configuration problems, crashes or failures that affect BDTI software) within 8 hours. The public administration receives updates regarding its issue and the BDTI technical team closes the ticket.
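As a rough illustration of the 8-hour handling target mentioned for the Service Desk, the Python sketch below computes when a ticket would be due. The `Ticket` class, its field names and the reading of "8 hours" as elapsed wall-clock time are assumptions made for this example; the slides do not say whether business hours apply or how the Service Desk tool represents tickets.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: "within 8 hours" is read as elapsed wall-clock time.
HANDLING_TARGET = timedelta(hours=8)


@dataclass
class Ticket:
    """Hypothetical ticket record; field names are illustrative only."""
    summary: str
    category: str          # e.g. "configuration", "crash", "failure"
    created_at: datetime

    def handling_deadline(self) -> datetime:
        # Time by which the ticket should have been taken care of.
        return self.created_at + HANDLING_TARGET

    def is_overdue(self, now: datetime) -> bool:
        # A ticket is overdue only strictly after its deadline.
        return now > self.handling_deadline()


ticket = Ticket(
    summary="Analytics tool does not start",
    category="crash",
    created_at=datetime(2019, 3, 4, 9, 30),
)
print(ticket.handling_deadline())  # 2019-03-04 17:30:00
```

A real service desk would more likely count business hours and pause the clock while waiting for the user; both are left out here to keep the sketch short.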
Community building and Innovation portal (service to be implemented)
A big data community where users can share knowledge and big data artefacts (e.g. methodologies, statistical models, pilots' outcomes and datasets). It will provide interactive sessions, where public administrations can contribute their own ideas and launch new proposals.
• Share BDTI know-how: Users will be able to access the forums and discussions and share their big data experiences and pilots' results.
• Share artefacts: Users will be able to access big data artefacts: methodologies, best practices, datasets and code shared among PAs.
• Access to interactive content: Users will be able to access interactive content, such as tutorial videos and material on big data.
• Access to news and events: Users will be able to access news and events related to the area of big data.
Data Catalogue and Data exchange APIs
The service provides a structured data catalogue (classified by policy domains) of sample datasets which the European Commission provides in a centralised repository. In addition, the implementation of exchange APIs is under analysis.
A public administration needs to identify which data sources and related datasets are needed for its big data pilot. A data repository shared among public administrations provides sample datasets to be combined with the user dataset. The public administration can easily find useful public datasets to be used in its pilot project.
Big Data and Analytics software catalogue
The service provides a data analytics software catalogue that users will be able to download for implementing big data solutions. The software will be clearly classified by use case and could be tested with the Test Infrastructure service.
A public administration needs to understand which analytics software it can use for a specific big data use case. The structured software catalogue allows public administrations to choose which software best fits their use cases. The public administration can easily find all the information / documentation for the selected software.
Support for Analytics implementation
The service provides support during the implementation of a pilot by laying out specialised features/services, which help public administrations to implement big data pilots.
A public administration needs support during the pilot execution since it lacks big data skills. A technical team is available to support the public administration's pilot execution. The public administration can easily implement its big data pilot thanks to the availability of resources.
How BDTI works (1/3) Technical architecture
The BDTI architecture includes three parts: the software stack (i.e., data analytics tools grouped in Data Ingestion, Data Elaboration, Data Consumption and Governance & Security), the infrastructure (used through a set of different templates, depending on the pilot) and the different data sources to be used by users, currently under analysis.
How BDTI works (2/3) Data sources – You can bring your own data or use the data provided by BDTI
• BDTI (internal) data sources: EC / National Data Portal, European Data Portal, EU Open Data Portal, Social Media, Context Broker, Third party Data Provider ("Available in 2020" label on the diagram)
• User (external) data sources: bring your own data and/or use open data
• Deploy your data into BDTI storage and/or access it via the provided APIs
• Storage: RDBMS (highly structured), Distributed File System (semi-structured), Distributed File System (unstructured)
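The storage options named on this slide can be read as a simple lookup from how structured the data is to where it might be deployed. A minimal Python sketch of that reading follows; the `Structure` enum and the `suggest_storage` function are hypothetical names introduced only for illustration and are not part of any BDTI interface.

```python
from enum import Enum


class Structure(Enum):
    """Degree of structure of a dataset, using the labels from the slide."""
    HIGHLY_STRUCTURED = "highly structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


# Mapping copied from the slide: only highly structured data goes to an
# RDBMS, the other two categories go to a distributed file system.
STORAGE_BY_STRUCTURE = {
    Structure.HIGHLY_STRUCTURED: "RDBMS",
    Structure.SEMI_STRUCTURED: "Distributed File System",
    Structure.UNSTRUCTURED: "Distributed File System",
}


def suggest_storage(structure: Structure) -> str:
    """Return the storage type the slide associates with this structure."""
    return STORAGE_BY_STRUCTURE[structure]


print(suggest_storage(Structure.HIGHLY_STRUCTURED))  # RDBMS
```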
How BDTI works (3/3) Solution Architecture – Platform as a Service, the BDTI offers SW to access and analyse data
BDTI possible use cases
• WEB ANALYSIS (SCRAPING / MONITORING): Gather information from websites, involving data scraping (using a bot or web crawler) and data parsing to extract unorganised web data, as well as data from APIs, into a manageable format.
• IoT & SMART CITY: Gather relevant information on the usage of several interconnected devices (Internet of Things environment) in a Smart City context.
• IoT SECURITY: Safeguard connected devices and networks in the Internet of Things, since security has often not been considered in IoT product design.
• IMAGE PROCESSING: Computational operations using any form of signal processing for which the input is an image, a series of images, or frames of a video; the output of image processing may be either an image or a set of characteristics / parameters related to the image.
• ROUTE-TRACEABILITY / FLOW MONITORING: Everything that concerns the tracking and detection of objects through the use of sensors (e.g. GPS, mobile phone signals, road cameras) or any other types of data usable for this purpose.
• APPLYING BIOINFORMATICS TO GENETIC DATA: The use of computational biology, applying "informatics" techniques to macromolecules, to understand and organise the associated information and analyse genetic data.
• POPULATION / CUSTOMER SEGMENTATION: Divide a broad population into sub-groups of consumers based on some types of shared characteristics such as common needs, interests, similar lifestyles or even similar demographic profiles.
EU Big Data Hackathon 2019 – ESTAT
Stakeholders: ESTAT & 17 teams chosen by National Statistical Offices
Objectives: • to create a data product that solves statistical problems; • to produce innovative products, including visualisation tools, developing prototypes that official statistics will be able to integrate at European and national level; • to promote partnerships with the research community and the private sector.
Big Data Test Infrastructure solution: BDTI provided a ready-to-use cloud-based test infrastructure that: • enabled experimentation with data analytics and the visualisation of its results; • was easy to use and helped increase the adoption of big data technologies and the acquisition of analytics skills; • facilitated the combination of the relevant data with open government data, through built-in APIs for data ingestion from different data sources.
Tracking ships (AIS) – EMSA/ESTAT
Stakeholders: EMSA/ESTAT
Duration: 5 months
Objectives: Use big data on geo-positioning of ships (generated by Automatic Identification Systems) to: • improve the quality and internal comparability of existing statistics; • produce new statistical products.
Big Data Test Infrastructure solution: BDTI provides the test infrastructure to help: • build a reference frame of maritime ships; • improve data on departing ships regarding the next destination (traffic matrices) and the average distance matrix for all ports; • model and monitor CO2 emissions from ships, amongst others.
Ready to get started?
1. Get familiar with the services of BDTI: Big data technologies provide a large number of service solutions for PAs. The user identifies the most suitable service for its business.
2. Define your project and its scope: The user provides the following high-level information: • Name of the pilot; • Short description of the pilot; • Scope of the pilot; • The policy domain for which the pilot is developed; • The involved stakeholders.
3. Find, format and specify your data: The user provides information on the dataset in order to identify how the data can be imported into BDTI (e.g. the name of the dataset, the size of the dataset, the type of data, the storage type).
4. Identify software requirements: The user tells the BDTI team which software tools it intends to use during the pilot.
5. Fill in the pilot request form*: After gathering all the information, the user fills in the form.
6. Wait for the approval of your pilot: Based on the information provided, the BDTI team contacts the user for further specifications, planning and approving the project.
7. Start the pilot: Once the user has received approval for the pilot, the BDTI team gives support for starting to work with the infrastructure, getting the user up and running.
*https://ec.europa.eu/cefdigital/tracker/plugins/servlet/desk/portal/2/create/63
#ConnectingEurope
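The information gathered before filling in the pilot request form can be pictured as a small record with a completeness check. The Python sketch below is only an illustration under stated assumptions: the class names, field names and the gigabyte unit are hypothetical and do not reflect the actual form behind the link above.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Dataset:
    """Dataset details named on the slide; the size unit is an assumption."""
    name: str
    size_gb: float
    data_type: str
    storage_type: str


@dataclass
class PilotRequest:
    """Hypothetical container for the information the user gathers."""
    name: str = ""
    description: str = ""
    scope: str = ""
    policy_domain: str = ""
    stakeholders: List[str] = field(default_factory=list)
    datasets: List[Dataset] = field(default_factory=list)
    software_tools: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        # A field counts as missing when it is an empty string or list.
        return [f.name for f in fields(self) if not getattr(self, f.name)]


request = PilotRequest(
    name="Example pilot",
    description="Placeholder description",
    scope="Placeholder scope",
    policy_domain="Transport",
    stakeholders=["Example administration"],
)
print(request.missing_fields())  # ['datasets', 'software_tools']
```

Checking for gaps before submission mirrors the order of the steps: the form is filled in only after the project, data and software information has been collected.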
Use cases acceptance criteria
Business and functional criteria:
• Pilot duration: 6 months.
• Potential users: Member State or public administration at national level.
• Pilot use cases: only use cases in scope*.
• Clear value added: business and technical.
• Pilot BDTI geographical distribution / resource allocation.
• Resource usage limit: based on CEF budget.
• Clear contact point for the entire pilot.
• Skills/maturity level: adequately skilled resources and/or level of maturity on the big data subject.
*Predictive analysis, Route-traceability / flow monitoring, Web analysis (scraping / monitoring), Text analysis, Descriptive analysis, Time-series analysis, Social media analysis, Network analysis, Population / customer segmentation
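Some of these criteria are concrete enough to be checked mechanically. The Python sketch below screens a pilot proposal against four of them; the function name, its parameters and the reading of "6 months" as an upper bound are assumptions made for illustration, and the remaining criteria (value added, resource limits, skills) are left to human judgement.

```python
from typing import List

# Use cases in scope, as listed in the footnote (lower-cased for matching).
IN_SCOPE_USE_CASES = {
    "predictive analysis",
    "route-traceability / flow monitoring",
    "web analysis (scraping / monitoring)",
    "text analysis",
    "descriptive analysis",
    "time-series analysis",
    "social media analysis",
    "network analysis",
    "population / customer segmentation",
}

# Assumption: the stated pilot duration is treated as a maximum.
MAX_DURATION_MONTHS = 6


def screen_pilot(use_case: str, duration_months: int,
                 national_level: bool, has_contact_point: bool) -> List[str]:
    """Return a list of unmet criteria; an empty list means none were found."""
    problems = []
    if use_case.strip().lower() not in IN_SCOPE_USE_CASES:
        problems.append("use case is not in scope")
    if duration_months > MAX_DURATION_MONTHS:
        problems.append("pilot duration exceeds 6 months")
    if not national_level:
        problems.append("user is not a Member State or national administration")
    if not has_contact_point:
        problems.append("no clear contact point for the pilot")
    return problems


print(screen_pilot("Text analysis", 6, True, True))  # []
```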
CEF building blocks (grouped on the slide under Data Economy, eInvoicing Directive and eIDAS enablers)
• Big Data Test Infrastructure: Experiment with big data in a safe environment
• Context Broker: Collect & share real-time data from multiple sources
• eArchiving: Store and preserve digital information
• eID: Securely identify European citizens
• eDelivery: Exchange data and documents securely
• eSignature: Create and verify legally recognised electronic signatures
• eTranslation: Provide a multi-lingual public service
• eInvoicing: Send and receive electronic invoices
Find out more! How can the CEF building blocks help you achieve your objectives?
Visit our website: Learn more on how to get started with the building blocks and get access to specific content such as our Success Stories, tech articles, sample software / specifications, etc. http://ec.europa.eu/cefdigital
Contact us: Do you want to use the building blocks for your project? Do you want to tell us more about your project? Contact us: DIGIT-BDTI-CEFSUPPORT@ec.europa.eu, cef-building-block@ec.europa.eu