Centralised System for Administrative Data Collection at Statistics Finland

Centralised System for Administrative Data Collection at Statistics Finland
Kirsi-Marja Aalto, Perttu Muurimäki
14 March 2017, Brussels
New Techniques and Technologies for Statistics 2017

Centralized System for Administrative Data Collection at Statistics Finland: AVa
• Most of the data used in statistical production are administrative data or registers: some 150 individual administrative data sets from 50 different data providers
• The Statistics Act obliges official statistical organisations and some other organisations to give Statistics Finland (statistical) data for producing the statistics needed by both national and European Union authorities
• More than two thirds of the administrative data sets (by count) arrive via the administrative data collection system, AVa
2 30 November 2020 Kirsi-Marja Aalto

Cooperation with data providers
• Data providers are mostly government agencies and official organisations
• With the biggest data providers Statistics Finland has skeleton agreements: the general terms of cooperation between Statistics Finland and the organisation
• More specific matters are agreed in the data acquisition agreement, which is included in the skeleton agreement as an enclosure
  • when and how often the data are sent
  • how the data will be transferred
• With smaller data providers Statistics Finland has only data acquisition agreements

Phases of administrative data collection system AVa
• The AVa system consists of three phases: automatic phase, centralised phase and statistical production process.

AVa process needs standardising

Data in transit 1/2
• For file transfer AVa uses a generic file transfer service (FiTS) developed at Statistics Finland.
• FiTS provides several key features:
  • Bi-directional file transfers from and to external data providers (file exchange)
  • Passive/active file transfers, i.e. FiTS either waits for data providers to fetch/push files or FiTS fetches/pushes the files itself
  • Pluggable file transfer methods for external data providers, i.e. sftp, https and ftp protocols with various authentication methods (password, public key, certificate)
  • Pluggable file transfer methods to internal data receivers (sftp for unix/linux servers, smb for Windows shares and servers)
6 30 November 2020 Perttu Muurimäki
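
The slides do not describe how FiTS implements its pluggable transfer methods. As a rough illustration of the idea only, the sketch below registers one handler class per protocol and looks the handler up from the connection details. All names (`Endpoint`, `TransferMethod`, `register`, `method_for`) are hypothetical, and the handlers only describe the transfer instead of performing any network I/O.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """Hypothetical connection details for one provider or internal receiver."""
    protocol: str   # e.g. "sftp", "https", "ftp", "smb"
    host: str
    directory: str
    auth: str       # e.g. "password", "public_key", "certificate"


class TransferMethod(ABC):
    """Interface that every pluggable transfer method implements."""

    @abstractmethod
    def describe(self, endpoint: Endpoint, filename: str) -> str:
        """Return a description of the transfer; a real plugin would do the I/O here."""


# Registry mapping a protocol name to the class that handles it (an assumption,
# not the documented FiTS design).
_REGISTRY: dict[str, type[TransferMethod]] = {}


def register(protocol: str):
    """Class decorator that plugs a transfer method into the registry."""
    def decorator(cls: type[TransferMethod]) -> type[TransferMethod]:
        _REGISTRY[protocol] = cls
        return cls
    return decorator


@register("sftp")
class SftpTransfer(TransferMethod):
    def describe(self, endpoint: Endpoint, filename: str) -> str:
        return f"sftp put {filename} -> {endpoint.host}:{endpoint.directory} (auth: {endpoint.auth})"


@register("smb")
class SmbTransfer(TransferMethod):
    def describe(self, endpoint: Endpoint, filename: str) -> str:
        return f"smb copy {filename} -> //{endpoint.host}{endpoint.directory} (auth: {endpoint.auth})"


def method_for(endpoint: Endpoint) -> TransferMethod:
    """Pick the transfer method registered for the endpoint's protocol."""
    try:
        return _REGISTRY[endpoint.protocol]()
    except KeyError:
        raise ValueError(f"no transfer method registered for {endpoint.protocol!r}") from None


if __name__ == "__main__":
    receiver = Endpoint("sftp", "unix01.example", "/data/in", "public_key")
    print(method_for(receiver).describe(receiver, "persons.csv"))
```

With this shape, supporting a further protocol means adding one registered class; the calling code does not change.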

Data in transit 2/2
• Method for grouping files into datasets by filename or filename wildcard and provider id
• Method for routing datasets to one or more recipients
• Configurable fetch poll scheduling (hourly, "sensible", daily, weekly)
• Possibility to use hook scripts in various phases of file transfer, e.g. to unzip files, rename files or split files into several smaller files. Other uses are also possible.
• Establishing a FiTS connection for AVa requires that the connection details are first sorted out (provider id, protocols, usernames, servers, directories). The AVa team then uses the provider id to route data to the target directories (on target servers) of one or more recipients. After that, data "just flows".
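
The grouping and routing described on this slide can be sketched as a small rule table: each rule pairs a provider id and a filename wildcard with a dataset name and its recipients' target directories. This is a minimal sketch under assumed names. `DatasetRule`, `route`, the provider ids and the target paths are invented for illustration and are not taken from FiTS or AVa.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class DatasetRule:
    """Hypothetical rule: which files from which provider form which dataset."""
    provider_id: str
    pattern: str                  # exact filename or shell-style wildcard
    dataset: str
    recipients: tuple[str, ...]   # target directories on target servers


# Example rules; every value here is made up for the illustration.
RULES = [
    DatasetRule("prov-001", "income_*.csv", "income", ("unix01:/data/income",)),
    DatasetRule("prov-001", "persons.zip", "persons",
                ("unix01:/data/persons", "winshare:/persons")),
]


def route(provider_id: str, filename: str, rules=RULES):
    """Return (dataset, recipients) for the first matching rule, or None."""
    for rule in rules:
        # fnmatchcase keeps matching case-sensitive on every platform.
        if rule.provider_id == provider_id and fnmatchcase(filename, rule.pattern):
            return rule.dataset, list(rule.recipients)
    return None


if __name__ == "__main__":
    print(route("prov-001", "income_2017.csv"))
    print(route("prov-002", "income_2017.csv"))  # unknown provider: no rule matches
```

Matching on the provider id as well as the filename means that two providers can use the same filename without their files ending up in the same dataset.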

Process flow for getting new data

Future challenges
• New forms of data and the use of big data
• The volume and the number of incoming data sets increase → more automation, i.e. automatic technical validation, transformation of data into a SAS dataset and routing of data to their predetermined target folder
• Developing an interface that could be used to control and manage the whole process of centralised administrative data collection
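
The slides do not say what "automatic technical validation" would check. One plausible reading, sketched below purely as an assumption, is a structural check on a delimited text file (expected header, same number of fields on every row), followed by a choice of target folder. The function names, the semicolon delimiter and the `quarantine` folder are hypothetical. The SAS transformation step is left out.

```python
import csv
import io


def validate_delimited(text, expected_columns, delimiter=";"):
    """Return a list of problems found; an empty list means the file passed."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ["file is empty"]
    problems = []
    header = rows[0]
    if header != list(expected_columns):
        problems.append(f"unexpected header: {header}")
    # Row numbers start at 2 because row 1 is the header.
    for row_no, row in enumerate(rows[1:], start=2):
        if len(row) != len(expected_columns):
            problems.append(
                f"row {row_no}: expected {len(expected_columns)} fields, got {len(row)}"
            )
    return problems


def target_folder(dataset, problems, targets, quarantine="quarantine"):
    """Route a clean dataset to its predetermined folder, anything else aside."""
    return quarantine if problems else targets[dataset]


if __name__ == "__main__":
    targets = {"income": "/data/income"}   # hypothetical dataset-to-folder map
    sample = "id;value\n1;10\n2;20;extra\n"
    found = validate_delimited(sample, ["id", "value"])
    print(found)
    print(target_folder("income", found, targets))
```

Returning a list of problems instead of raising on the first one lets a single report go back to the data provider for each file.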

Benefits of centralized administrative data collection
• More efficient, safe and systematic administrative data receiving processes
• Better transparency of the utilized administrative data
• Rationalised data requests and advanced data usage
[Diagram: "Before" and "After" panels; labels: Statistics Finland, Data provider, Statistical unit, AVa]

Kirsi-Marja Aalto, kirsi-marja.aalto@stat.fi
Perttu Muurimäki, perttu.muurimaki@stat.fi
NTTS 2017