Faculty of Computer Science
A Data Warehouse Architecture for Clinical Data Warehousing
Tony R. Sahama and Peter R. Croll
Amit Satsangi, amit@cs.ualberta.ca
CMPUT 603 © 2006

Department of Computing Science
Focus
  • Why are Clinical Data Warehouses (CDW) needed?
  • Issues in their construction
  • Design & design choices in the construction of a CDW
CMPUT 605 © 2006

Why Clinical Data Warehouse?
  • Efficient storage
  • Uniformity in storage and querying of data
  • Timely analysis
  • Quality of decision making and analytics
      - Decisions based on larger datasets
      - More accurate information
      - Better strategies and research methods

Why Clinical Data Warehouse?
  • Measurement of the effectiveness of treatment
  • Relationships between causality and treatment protocols
  • Safety
  • Management
      - Breakdown of cost and charge information
      - Forecasting demand
      - Better strategies and research methods

Some Facts…
  • Large volume of data distributed across a number of small repositories ("islands" of information)
  • Data has great scientific and medical insight
  • Great potential for people practicing clinical medicine

Issues
  • Heterogeneity: different clinical practices, e.g. public vs. private hospitals
  • Data location
  • Technical platforms & data formats
  • Organizational behaviors on processing the data
  • Varying cultures amongst the data management population

Past efforts
  • Szirbik et al.: medical data warehouse for elderly patients
      - Six methodological steps to build medical data warehouses for research. International Journal of Medical Informatics 75 (9): 683-691
  • Used the Rational Unified Process (RUP) framework
  • Identification of current trends (critical requirements of the future)
  • Data modelling
  • Ontology building
  • Quality management and exception handling

Different DW Architectures (Sen & Sinha 2005)

Design and Planning
  • Business analytics approach: understand the key processes of the business
  • DW architect + business analyst + expected users
  • Understand key business processes + the questions that would be asked of those processes
  • Analysis might be conducted on demographics, diagnosis, severity of illness, and length of stay

Approach
  • Integration of data from two Biomedical Knowledge Repositories (BKRs): Oncology & Mental care
  • Used SAS Data Warehouse Administrator (SAS 2002)
      - Flexibility to integrate external data repositories
      - Hassle-free ETL
      - Analytics with Data Miner
      - Reporting using SAS Enterprise Guide (EG)
  • Operational Data Store Architecture & Distributed Data Warehouse Architecture

  • Several data marts to include different administration and management operations
      - Summary reports
      - Monitoring of clinical outcomes by management

Oncology Patient Management

Mental Health Patient Management

Data Transformation
  • Source systems to CDW (ETL: Extraction-Transformation-Load)
  • Data preparation & integration takes 90% of the effort in a given CDW project
  • Excel, SAS External File Interface (EFI) & SAS Enterprise Guide (EG) used to clean the data
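The cleaning work described on this slide was done with Excel, SAS EFI, and SAS EG. As a rough illustration of what that kind of preparation involves, here is a minimal Python sketch, not the authors' procedure. The field names (`patient_id`, `sex`, `admitted`) and the two date layouts are assumptions chosen for the example.

```python
from datetime import datetime

# Hypothetical date layouts that two source systems might use.
DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def parse_date(text):
    """Return an ISO date string, or None if no known layout matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def clean_records(rows):
    """Trim fields, unify codes and dates, and drop repeated admissions."""
    seen = set()
    cleaned = []
    for row in rows:
        record = {
            "patient_id": row["patient_id"].strip(),
            "sex": row["sex"].strip().upper()[:1],  # "female" becomes "F"
            "admitted": parse_date(row["admitted"]),
        }
        key = (record["patient_id"], record["admitted"])
        if record["admitted"] is None or key in seen:
            continue  # skip unparseable dates and repeats of the same admission
        seen.add(key)
        cleaned.append(record)
    return cleaned
```

Silently dropping rows with unparseable dates keeps the sketch short; a real project would more likely route them to an exception report for review.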

Steps in creation of CDW
  • Step 1: Data imported in SAS
      - Standardization into SAS table format
      - Opportunity for data manipulation: create/delete columns
  • Step 2: Creation of metadata using Operational Data definition
  • Step 3: Creation and loading of Data Tables
      - Different tables for predictive and database analysis
      - Creation of multi-dimensional cubes
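The presentation carried out these steps in SAS. To make the idea of loading data tables and building a multi-dimensional cube concrete, the following is a small sketch using Python's built-in `sqlite3` module instead. The table names (`dim_diagnosis`, `fact_admission`), the columns, and the choice of length of stay as the measure are assumptions for illustration only.

```python
import sqlite3


def build_warehouse(admissions):
    """Load cleaned admissions into a small in-memory star schema."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE dim_diagnosis (
            diagnosis_id INTEGER PRIMARY KEY,
            name TEXT UNIQUE
        );
        CREATE TABLE fact_admission (
            patient_id TEXT,
            diagnosis_id INTEGER REFERENCES dim_diagnosis(diagnosis_id),
            unit TEXT,
            length_of_stay INTEGER
        );
    """)
    for patient_id, diagnosis, unit, stay in admissions:
        # Add the diagnosis to the dimension table only if it is new.
        db.execute(
            "INSERT OR IGNORE INTO dim_diagnosis (name) VALUES (?)", (diagnosis,)
        )
        (diagnosis_id,) = db.execute(
            "SELECT diagnosis_id FROM dim_diagnosis WHERE name = ?", (diagnosis,)
        ).fetchone()
        db.execute(
            "INSERT INTO fact_admission VALUES (?, ?, ?, ?)",
            (patient_id, diagnosis_id, unit, stay),
        )
    return db


def stay_cube(db):
    """Aggregate length of stay over the (unit, diagnosis) dimensions."""
    query = """
        SELECT f.unit, d.name, COUNT(*), AVG(f.length_of_stay)
        FROM fact_admission AS f
        JOIN dim_diagnosis AS d USING (diagnosis_id)
        GROUP BY f.unit, d.name
        ORDER BY f.unit, d.name
    """
    return db.execute(query).fetchall()
```

Each row returned by `stay_cube` is one cell of a two-dimensional cube: a unit, a diagnosis, the number of admissions, and the average length of stay.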

Discussion
  • Data acquisition step took very long, leaving very little time for cleaning and transformation
  • Not enough time left to refine the shared environment (no modifications to their interface implementation, etc.)
  • Security issues of federated data warehouses: anonymization of records
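The slides raise anonymization as an open issue without describing a method. One common building block is pseudonymization with a keyed hash, sketched below under that assumption. The function name `pseudonymize`, the `patient_key` field, and the list of direct identifiers are hypothetical, and this step alone does not make records anonymous, since the remaining attributes may still identify a patient.

```python
import hashlib
import hmac


def pseudonymize(record, secret_key, direct_identifiers=("name", "address")):
    """Return a copy with the patient id replaced by a keyed hash.

    Direct identifiers are dropped. The same id and key always give the
    same patient_key, so records can still be linked across repositories.
    """
    digest = hmac.new(
        secret_key, record["patient_id"].encode("utf-8"), hashlib.sha256
    )
    safe = {
        field: value
        for field, value in record.items()
        if field not in direct_identifiers and field != "patient_id"
    }
    safe["patient_key"] = digest.hexdigest()[:16]
    return safe
```

Using an HMAC rather than a plain hash means someone without the secret key cannot recompute the mapping by hashing candidate patient ids.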

Discussion
  • SAS EM used to interpret relationships between seemingly unconnected data
  • Newer CDW models coming from case-based, role-based & evidence-based data structures need to be incorporated

Steps in creation of CDW
  • Step 4: Data Mining
      - Tools that integrate with or within SAS were used (EM, EG, etc.)

Thank You For Your Attention!