ROUTINE HEALTH INFORMATION SYSTEMS
A Curriculum on Basic Concepts and Practice
MODULE 8: Information and Communication Technology for RHIS
SESSION 4: Data Repository/Data Warehouse
The complete RHIS curriculum is available here: https://www.measureevaluation.org/our-work/routine-health-information-systems/rhis-curriculum

Learning Objectives and Topics Covered
Objective
• By the end of this session, participants will be able to explain the basic definitions and concepts of data repositories and data warehousing architectures.
Topics covered
Patient-centered information systems:
• Electronic medical records (EMRs) and aggregate information systems
• Readiness for patient-centered IT solutions
• Development of patients’ unique identifiers
• Data repositories/data warehouses

Data Repository/Data Warehouse
• Central place where an aggregation of data is kept and maintained in an organized way
• May consist of several databases linked to one another by a common search engine
• Can support data that are
  o Open access only
  o Mediated access only
  o Closed/private only

Data Repository/Data Warehouse
• Defined as a place that holds data, makes data available to use, and organizes data in a logical manner
• Enables deposit, preservation, and access to digital content
• Real-time databases that consolidate data from a variety of sources to present a unified single view

Repository Examples
• Research data repositories
• Clinical data repositories
• PHC or district data repositories

Data Warehouse Concepts
Distinction between data and information
• Data are observable and recordable facts that are often found in operational or transactional systems
• Data only have value to end-users when they are organized and presented as information
• Information is an integrated collection of facts and is used as the basis for decision making

Data Warehouse Concepts
• A data warehouse is designed for query and analysis rather than for transaction processing
• A data warehouse separates analysis workload from transaction workload. This helps:
  o Maintain historical records
  o Analyze data to better understand the business
  o Improve the business

Data Warehouse Definition (A Practitioner’s Viewpoint)
“A data warehouse is simply a single, complete, and consistent store of data obtained from a variety of sources and made available to end users in a way they can understand and use in a business context.”
– Barry Devlin, IBM consultant

Data Warehouse Definition (An Alternative Viewpoint)
“A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile collection of data in support of management's decision-making process.”
– W. H. Inmon, computer scientist

Data Warehouse Definition

Subject-Oriented Stored Data
• Targets specific subjects: a data warehouse can be used to analyze a particular subject area, for example, high drop-off rates between the 1st and 4th antenatal care visits in a particular region
• Provides a simple and concise view of a particular subject by excluding data that are not useful in the decision support process
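A minimal sketch of the subject-area question used as the example above, assuming the drop-off rate is defined as (ANC1 - ANC4) / ANC1. The function name and the counts are hypothetical and serve only as an illustration.

```python
def anc_dropoff_rate(anc1_visits: int, anc4_visits: int) -> float:
    """Share of clients with a 1st ANC visit who did not reach a 4th visit."""
    if anc1_visits <= 0:
        raise ValueError("anc1_visits must be positive")
    return (anc1_visits - anc4_visits) / anc1_visits


# Hypothetical counts for one region and period (illustrative, not real data).
rate = anc_dropoff_rate(anc1_visits=1200, anc4_visits=780)
print(f"ANC1 to ANC4 drop-off: {rate:.1%}")
```

A warehouse organized around this subject would hold only the fields needed to answer such a question, such as region, period, and visit counts.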

Integrated
• Data may be distributed across heterogeneous sources that have to be integrated.
• A data warehouse must put data from disparate sources into a consistent format.
• Data cleaning and data integration techniques are applied.
• Standardization is emphasized at this point:
  o Naming conventions
  o Coding structures
  o Data attributes
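The three standardization points can be sketched as follows, assuming two source systems that report the same indicator with different field names, facility-type codes, and value types. All field names, codes, and records below are hypothetical.

```python
# Hypothetical standard agreed for the warehouse (not from any real system).
FIELD_MAP = {  # naming conventions: source field name -> standard name
    "fac": "facility", "Facility Name": "facility",
    "type": "facility_type", "Level": "facility_type",
    "anc_1": "anc1_visits", "ANC 1st visit": "anc1_visits",
}
TYPE_CODES = {  # coding structures: source code -> standard code
    "hc": "health_centre", "health center": "health_centre",
    "hosp": "hospital", "district hospital": "hospital",
}


def standardize(record: dict) -> dict:
    """Return the record with standard names, codes, and attribute types."""
    out = {FIELD_MAP[name]: value for name, value in record.items()}
    out["facility"] = out["facility"].strip().title()
    out["facility_type"] = TYPE_CODES[out["facility_type"].strip().lower()]
    out["anc1_visits"] = int(out["anc1_visits"])  # data attributes: counts are integers
    return out


# Two illustrative records, one from each hypothetical source.
source_a = {"fac": "north clinic", "type": "HC", "anc_1": "42"}
source_b = {"Facility Name": "Hilltop Hospital ", "Level": "District Hospital",
            "ANC 1st visit": 130}
rows = [standardize(source_a), standardize(source_b)]
print(rows)
```

In practice the mappings would come from an agreed data dictionary rather than being hard-coded.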

Time Variant
• A data warehouse focuses on change over time.
• Data are stored as a series of snapshots, each representing a period in time.
• The stored data may not be current but vary with time; every record has a time element, for example, antenatal care 1st visits in the past 5 years.
• Analysts need large amounts of data in order to discover trends in data.
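As an illustration of snapshots and trends, the sketch below keeps one record per period and derives the change between consecutive periods. The yearly figures and the function name are hypothetical.

```python
# Hypothetical yearly snapshots of ANC 1st visits (illustrative numbers only).
snapshots = [
    {"year": 2021, "anc1_visits": 1150},
    {"year": 2019, "anc1_visits": 1000},
    {"year": 2020, "anc1_visits": 1080},
]


def year_over_year_change(records):
    """Map each year (after the first) to its change from the previous year."""
    ordered = sorted(records, key=lambda r: r["year"])
    return {
        current["year"]: current["anc1_visits"] - previous["anc1_visits"]
        for previous, current in zip(ordered, ordered[1:])
    }


changes = year_over_year_change(snapshots)
print(changes)
```

Because each snapshot carries its own period, earlier values stay available for comparison when new periods are loaded.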

Nonvolatile
• Once entered in the data warehouse, data should not change, and should not be subjected to frequent modification.
• Stored data generally are subject to only two operations:
  o Loading of data
  o Access to data
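One possible way to enforce the two permitted operations is sketched below with Python's built-in sqlite3 module. The table name, trigger names, and rows are hypothetical, and blocking changes with triggers is an assumption made for illustration, not something the source prescribes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE anc_fact (facility TEXT, period TEXT, anc1_visits INTEGER);
-- Reject any attempt to change or remove rows once they are loaded.
CREATE TRIGGER anc_fact_no_update BEFORE UPDATE ON anc_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
CREATE TRIGGER anc_fact_no_delete BEFORE DELETE ON anc_fact
BEGIN SELECT RAISE(ABORT, 'warehouse rows are read-only'); END;
""")

# Operation 1: loading of data (illustrative rows).
conn.executemany(
    "INSERT INTO anc_fact VALUES (?, ?, ?)",
    [("North Clinic", "2023-01", 42), ("North Clinic", "2023-02", 47)],
)

# Operation 2: access to data.
total = conn.execute("SELECT SUM(anc1_visits) FROM anc_fact").fetchone()[0]

# A modification attempt is rejected by the trigger.
try:
    conn.execute("UPDATE anc_fact SET anc1_visits = 0")
    update_blocked = False
except sqlite3.DatabaseError:
    update_blocked = True

print(total, update_blocked)
```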

Data Warehouse Architectures
Data warehouses and their architectures vary depending on the specifics of an organization's situation. Three common architectures are:
• Data warehouse architecture (basic)
• Data warehouse architecture (with a staging area)
• Data warehouse architecture (with a staging area and data marts)

Data Warehouse Architecture (Basic)

Data Warehouse Architecture (with Staging Area)

Data Staging Area
Definition: Any data store that is designed primarily to receive data in a warehousing environment
Data staging involves:
• Extraction (E): Reading source data and copying data needed for the data warehouse into the staging area for further manipulation
• Transformation (T): Converting the read data to a common data format; cleaning, auditing, and combining data from various sources
• Load (L): Writing the data into the data warehouse
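The three staging steps can be sketched with the Python standard library. This is a simplified illustration under stated assumptions: a single CSV source, an in-memory SQLite database standing in for the warehouse, and hypothetical field, table, and facility names.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; the last row is missing its count.
SOURCE_CSV = """facility,month,anc_1
north clinic,2023-01,42
Hilltop Hospital ,2023-01,130
north clinic,2023-02,
"""


def extract(text):
    """E: copy the source rows into a staging list without changing them."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(staged):
    """T: clean names, convert counts to integers, skip incomplete rows."""
    clean = []
    for row in staged:
        if not row["anc_1"].strip():
            continue  # audit step: a row without a count is not loaded
        clean.append(
            (row["facility"].strip().title(), row["month"], int(row["anc_1"]))
        )
    return clean


def load(conn, rows):
    """L: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS anc_fact "
        "(facility TEXT, period TEXT, anc1_visits INTEGER)"
    )
    conn.executemany("INSERT INTO anc_fact VALUES (?, ?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(SOURCE_CSV)))
loaded = warehouse.execute(
    "SELECT facility, period, anc1_visits FROM anc_fact ORDER BY facility"
).fetchall()
print(loaded)
```

A real staging area would usually combine several sources and keep a record of rejected rows for follow-up, which this sketch omits.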

Data Warehouse Architecture (with Staging Area and Data Marts)

Data Mart
• Repository of data designed to serve a particular community of knowledge workers, in order to meet the demands of specific groups of users within the organization, such as human resource management (HRM)
• A data mart represents data from a single “business process,” such as:
  o ANC first visits
  o Completed reports and requisitions
  o Store inventory
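A rough illustration of carving a single-process data mart out of a broader warehouse table, using sqlite3. The table, view, and indicator names are hypothetical, and modeling the mart as a database view is a simplifying assumption; a data mart is often a separate physical store.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE service_fact "
    "(facility TEXT, period TEXT, indicator TEXT, value INTEGER)"
)
# Illustrative warehouse rows covering more than one business process.
conn.executemany(
    "INSERT INTO service_fact VALUES (?, ?, ?, ?)",
    [
        ("North Clinic", "2023-01", "anc1_visits", 42),
        ("North Clinic", "2023-01", "stock_items", 310),
        ("Hilltop Hospital", "2023-01", "anc1_visits", 130),
    ],
)

# The mart exposes only the ANC first-visit process to its user group.
conn.execute(
    "CREATE VIEW anc1_mart AS "
    "SELECT facility, period, value AS anc1_visits "
    "FROM service_fact WHERE indicator = 'anc1_visits'"
)

mart_rows = conn.execute(
    "SELECT facility, anc1_visits FROM anc1_mart ORDER BY facility"
).fetchall()
print(mart_rows)
```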

Comparison of Data Warehouse and Operational Data
Operational Data | Warehouse Data
Application-oriented | Subject-oriented
Detailed | Summarized, otherwise refined
Accurate, as of the moment of access | Represents values over time: snapshots
Serves the clerical community | Serves the managerial community
Can be updated | Are not updated
Run repetitively and nonreflectively | Are run heuristically
Performance-sensitive (immediate response required when entering a transaction) | Performance relaxed (immediacy not required)
Transaction-driven | Analysis-driven
High availability | Relaxed availability

Data Warehousing Process
Iterative Development Process
• Start with one subject area (or subset or superset) and one target user group.
• Continue and add subject areas, user groups, and informational capabilities to the architecture.
• The process is based on the organization’s requirements for information, not technology.
• Improvements are made from what was learned from previous increments.
• Improvements are made from what was learned about warehouse operation and support.
• The technical environment may have changed.
• Results are seen very quickly after each iteration.
• The end-user requirements are refined after each iteration.

This presentation was produced with the support of the United States Agency for International Development (USAID) under the terms of MEASURE Evaluation cooperative agreement AID-OAA-L-14-00004. MEASURE Evaluation is implemented by the Carolina Population Center, University of North Carolina at Chapel Hill in partnership with ICF International; John Snow, Inc.; Management Sciences for Health; Palladium; and Tulane University. The views expressed in this presentation do not necessarily reflect the views of USAID or the United States government.
- Slides: 23