METADATA Data Warehouse success depends on metadata
Overview • What is metadata? • Why is it needed? • Types of metadata • Metadata life cycle
Better end user data access and analysis tools can help users figure out how to get information they need out of the warehouse, but only good, easily accessible metadata can help them figure out what is available in the data warehouse and how to ask for it.
Data Warehouse Process (diagram) • Stages: Source OLTP Systems; Data Warehouse; Data Marts • Data Characteristics: Raw Detail; Integrated; History; No/Minimal History; Scrubbed; Summaries; Targeted; Specialized (OLAP) • Process steps: Design; Mapping; Extract; Scrub; Transform; Load; Index; Aggregation; Replication; Data Set Distribution; Access & Analysis; Resource Scheduling & Distribution • Supporting layers: Meta Data; System Monitoring Copyright © 1997, Enterprise Group, Ltd.
Meta Data Description • Information about the data warehouse system – Content – Organizational – Structural – Management Information – Scheduling Information – Contact Information – Technical Information
Why Do You Need Meta Data? • Share resources – Users – Tools • Document system • Without metadata – Not Sustainable – Not able to fully utilize resource
Metadata Life Cycle • Collection - Identify metadata and capture into repository; automate • Maintenance - Put in place processes to synchronize metadata automatically with changing data architecture; automate • Deployment - Provide metadata to users in the right form and with the right tools; match metadata offered to specific needs of each audience
Metadata Collection • Right metadata at the right time • Variety of collection strategies • Sources – potential sources of data for DW – external data – data structures • Data Models – enterprise data model as starting point – import from CASE tool – correlate enterprise and warehouse models
Metadata Collection • Warehouse mappings – map operational data into warehouse data structure – Need record of logical connection used for mapping and transformation • Warehouse usage information – After roll out – What tables accessed, by whom and for what – What queries written – Capture nature of business problem or query
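The usage information listed above (which tables are accessed, by whom, for what, and with which queries) can be held in a very small record structure. The sketch below is only an illustration: the `UsageEvent` layout, the helper names, and the sample users and tables are all hypothetical, not part of any particular warehouse tool.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageEvent:
    """One query run against the warehouse (hypothetical record layout)."""
    user: str
    table: str
    query_text: str
    business_purpose: str  # the business problem behind the query
    run_at: datetime


def tables_by_popularity(events):
    """Count how often each table was accessed, most used first."""
    return Counter(e.table for e in events).most_common()


def users_of_table(events, table):
    """Return the sorted list of distinct users who touched a table."""
    return sorted({e.user for e in events if e.table == table})


# Made-up sample events for illustration only.
now = datetime.now(timezone.utc)
events = [
    UsageEvent("alice", "sales_fact", "SELECT ...", "monthly revenue review", now),
    UsageEvent("bob", "sales_fact", "SELECT ...", "regional forecast", now),
    UsageEvent("alice", "customer_dim", "SELECT ...", "churn analysis", now),
]

print(tables_by_popularity(events))
print(users_of_table(events, "sales_fact"))
```

Recording the business purpose next to the query text is what lets later users see why a table was queried, not just that it was.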
Maintaining Metadata • Up to date with reality • Capture incremental changes
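One way to capture incremental changes is to compare two snapshots of the data structures and record only the differences. The sketch below assumes each snapshot is a plain dictionary of the form `{table: {column: type}}`; the function name `diff_schema` and the sample tables are hypothetical.

```python
def diff_schema(old, new):
    """Compare two {table: {column: type}} snapshots and list the changes."""
    changes = []
    for table in sorted(set(old) | set(new)):
        old_cols = old.get(table, {})
        new_cols = new.get(table, {})
        for col in sorted(set(old_cols) | set(new_cols)):
            if col not in old_cols:
                changes.append(("added", table, col))
            elif col not in new_cols:
                changes.append(("removed", table, col))
            elif old_cols[col] != new_cols[col]:
                changes.append(("retyped", table, col))
    return changes


# Made-up snapshots taken before and after a schema change.
yesterday = {"sales_fact": {"amount": "DECIMAL(10,2)", "region": "CHAR(2)"}}
today = {"sales_fact": {"amount": "DECIMAL(12,2)", "channel": "VARCHAR(20)"}}

for change in diff_schema(yesterday, today):
    print(change)
```

Running a comparison like this on a schedule is one possible route to the automation the life cycle slide asks for.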
Metadata Deployment • Warehouse developers need: – physical structure info for data sources – enterprise data model – warehouse data model – concerned with accuracy, completeness and flexibility of metadata – Need access to comprehensive impact analysis capabilities – Need to defend against accuracy & integrity questions
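Impact analysis, as mentioned above, amounts to following the recorded mappings downstream from a changed field. The following is a minimal sketch under the assumption that mappings are stored as a dictionary from one field to the fields derived from it; `impacted_fields` and every field name are hypothetical.

```python
from collections import deque


def impacted_fields(mappings, changed_field):
    """Return every downstream field reachable from changed_field."""
    seen = set()
    queue = deque([changed_field])
    while queue:
        current = queue.popleft()
        for target in mappings.get(current, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return sorted(seen)


# Made-up mappings: operational field -> warehouse field -> data mart fields.
mappings = {
    "oltp.orders.amt": ["dw.sales_fact.amount"],
    "dw.sales_fact.amount": ["mart.sales_summary.total", "mart.finance.revenue"],
}

print(impacted_fields(mappings, "oltp.orders.amt"))
```

The breadth-first walk also terminates on circular mappings, because each field is visited at most once.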
Meta Data • Types – Technical – Business / User • Levels – Core – Basic – Deluxe
Core Technical Meta Data • Source • Target • Algorithm
Basic Technical Meta Data • History of transformation changes • Business rules • Source program / system name • Source program author / owner • Extract program name & version • Extract program author / owner • Extract JCL / Script name • Extract JCL / Script author / owner • Load JCL / Script name
Basic Technical Meta Data (cont’d) • Load JCL / Script author / owner • Load frequency • Extract dependencies • Transformation dependencies • Load completion date / time stamp • Load completion record count • Load status
Deluxe Technical Meta Data • Source system platform • Source system network address • Source system support contact • Source system support phone / beeper • Target system platform • Target system network address • Target system support contact • Target system support phone / beeper • Etc.
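The three levels of technical meta data can be pictured as one record in which the core fields are mandatory and the basic and deluxe fields are optional. This is a sketch only: the class `TechnicalMetadata`, its `level` method, and the small subset of basic and deluxe fields chosen here are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TechnicalMetadata:
    # Core: always required.
    source: str
    target: str
    algorithm: str
    # Basic: a small subset of the slide's list, optional.
    extract_program: Optional[str] = None
    load_frequency: Optional[str] = None
    load_status: Optional[str] = None
    # Deluxe: a small subset of the slide's list, optional.
    source_platform: Optional[str] = None
    source_support_contact: Optional[str] = None

    def level(self):
        """Report the highest level whose fields are all filled in."""
        basic = [self.extract_program, self.load_frequency, self.load_status]
        deluxe = [self.source_platform, self.source_support_contact]
        if all(v is not None for v in basic + deluxe):
            return "deluxe"
        if all(v is not None for v in basic):
            return "basic"
        return "core"


# Made-up example record with only the core fields populated.
record = TechnicalMetadata(
    source="orders.amt",
    target="sales_fact.amount",
    algorithm="sum per order",
)
print(record.level())
```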
Core Business Meta Data • Field / object description • Confidence level • Frequency of update
Basic Business Meta Data • Source system name • Valid entries (e.g. “There are three valid codes: A, B, C”) • Formats (e.g. Contract Date: 82/4/30) • Business rules used to calculate or derive the data • Changes in business rules over time
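Valid entries and formats become more useful when they are stored in a form that can be checked mechanically. The sketch below reuses the two examples from the slide (codes A, B, C and a contract date such as 82/4/30); the field names, the `FIELD_RULES` layout, and the assumption that the date format is two-digit year, month, day are all illustrative.

```python
import re

# Hypothetical business meta data for two fields.
FIELD_RULES = {
    "status_code": {"valid_entries": {"A", "B", "C"}},
    # Assumed format: two-digit year / month / day, as in 82/4/30.
    "contract_date": {"format": re.compile(r"\d{2}/\d{1,2}/\d{1,2}")},
}


def check_value(field, value):
    """Return True if value satisfies the recorded rules for field."""
    rules = FIELD_RULES[field]
    if "valid_entries" in rules and value not in rules["valid_entries"]:
        return False
    if "format" in rules and not rules["format"].fullmatch(value):
        return False
    return True


print(check_value("status_code", "A"))
print(check_value("status_code", "D"))
print(check_value("contract_date", "82/4/30"))
```

The pattern only checks the shape of the value; it does not verify that the month and day are real calendar values.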
Deluxe Business Meta Data • Data owner contact information • Typical uses • Level of summarization • Related fields / objects • Existing queries / reports using this field / object • Estimated size (tables / objects)
Amount of Meta Data • How much Meta Data do I need? • As much as you can support!
The Meta Data Conundrum • Meta Data is absolutely required for success • Meta Data is 99% Manual • Cold, Hard Reality – 5,000 data mart fields – 7 manually populated and maintained meta data fields per data mart field – 35,000 total manual meta data fields • Are you ready for this, forever?
The Meta Data Conundrum • Can you support 35,000 Meta Data fields? • Calculate available ongoing resources • Commit only to what you can maintain • You MUST deliver core, probably some basic to be viable
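The arithmetic behind the conundrum (5,000 data mart fields with 7 manual meta data fields each) and the advice to commit only to what can be maintained can be worked through in a few lines. The function names and the capacity figures (two people, 10,000 meta data fields per person) are purely hypothetical assumptions, not numbers from the slides.

```python
def manual_metadata_fields(data_fields, manual_per_field):
    """Total manually maintained meta data fields."""
    return data_fields * manual_per_field


def maintainable_per_field(data_fields, fields_per_person, people):
    """Manual meta data fields per data field the team can keep current."""
    return (fields_per_person * people) // data_fields


# Figures from the slide: 5,000 data mart fields, 7 manual fields each.
total = manual_metadata_fields(5_000, 7)
print(total)

# Hypothetical capacity: 2 people, each maintaining 10,000 fields.
print(maintainable_per_field(5_000, fields_per_person=10_000, people=2))
```

Under these assumed figures the team could sustain 4 manual fields per data mart field, fewer than the 7 in the slide's example, which is the kind of gap the slide warns about.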
Meta Data Functions - Technical • Maintenance • Troubleshooting • Documentation • Logging / Metrics
Meta Data Location • DB Resident – Almost always relational – C/S predominantly – Normalized design – OODB is popular option for proprietary solutions
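A DB-resident, relational, normalized meta data store could look roughly like the following. This sketch uses Python's built-in `sqlite3` module with an in-memory database; the three tables, their columns, and the sample rows are hypothetical and cover only the core source, target, and algorithm information.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE source_system (
    system_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL UNIQUE
);
CREATE TABLE warehouse_field (
    field_id    INTEGER PRIMARY KEY,
    table_name  TEXT NOT NULL,
    column_name TEXT NOT NULL,
    description TEXT,
    UNIQUE (table_name, column_name)
);
CREATE TABLE field_mapping (
    field_id     INTEGER NOT NULL REFERENCES warehouse_field(field_id),
    system_id    INTEGER NOT NULL REFERENCES source_system(system_id),
    source_field TEXT NOT NULL,
    algorithm    TEXT NOT NULL,
    PRIMARY KEY (field_id, system_id, source_field)
);
""")

# Made-up sample rows.
conn.execute("INSERT INTO source_system VALUES (1, 'order_entry')")
conn.execute(
    "INSERT INTO warehouse_field VALUES (1, 'sales_fact', 'amount', 'Order value')"
)
conn.execute("INSERT INTO field_mapping VALUES (1, 1, 'orders.amt', 'sum per order')")

# Join the normalized tables back into one source-to-target view.
rows = conn.execute("""
    SELECT f.table_name, f.column_name, s.name, m.source_field, m.algorithm
    FROM field_mapping m
    JOIN warehouse_field f ON f.field_id = m.field_id
    JOIN source_system s ON s.system_id = m.system_id
""").fetchall()
print(rows)
```

Keeping source systems and warehouse fields in their own tables means a renamed system or field is updated in one place.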
Repository • Specialized databases designed to maintain metadata, together with tools and interfaces that allow a company to collect and distribute its metadata • Repository Requirements – Logically Common – Open – Extensible
Multiple Repository • Upside – Local instance, quick response – Local view • Users don’t have to wade through others’ material • Downside – More challenging implementation – Advanced replication – Requires maintenance resources – More susceptible to architecture modification to remote instances
Multiple Repository (diagram of data marts and their meta data) • User question: “Where do I find all the information about sales?” • Requires multiple access points • Requires more system resources
Common Repository • Upside – Optimum solution – Avoids replication challenges – Allows central management/access • Downside – Requires remote access for remote DM’s – More network infrastructure – May require gateways
Common Repository (diagram of data marts sharing one meta data store) • User question: “Where do I find all the information about sales?” • Single access point for all information resources • Low system resources required
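The single access point can be illustrated with a toy in-memory catalog that answers the "sales" question across every data mart in one call. The classes `CatalogEntry` and `CommonRepository`, the keyword matching rule, and the sample entries are hypothetical and stand in for whatever repository product is actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    data_mart: str
    object_name: str
    description: str


class CommonRepository:
    """Toy central catalog covering all data marts (illustrative only)."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        self._entries.append(entry)

    def find(self, keyword):
        """Return entries whose name or description mentions keyword."""
        needle = keyword.lower()
        return [
            e for e in self._entries
            if needle in e.object_name.lower() or needle in e.description.lower()
        ]


# Made-up entries from three data marts.
repo = CommonRepository()
repo.register(CatalogEntry("finance_mart", "revenue_summary", "Monthly sales revenue by region"))
repo.register(CatalogEntry("marketing_mart", "sales_leads", "Leads passed to the sales team"))
repo.register(CatalogEntry("hr_mart", "headcount", "Employees by department"))

for entry in repo.find("sales"):
    print(entry.data_mart, entry.object_name)
```

With multiple repositories the same question would need one such lookup per repository, which is the extra access-point cost noted on the previous slide.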
Meta Data Process • Integrated with entire process and data flow – Populated from beginning to end – Begin population at design phase of project – Dedicated resources throughout: Build; Maintain • Process steps (diagram): Design; Mapping; Extract; Scrub; Transform; Load; Index; Aggregation; Replication; Data Set Distribution; Access & Analysis; Resource Scheduling & Distribution • Supporting layers (diagram): Meta Data; System Monitoring
Meta Data Vision vs. Reality • Standards – OMG standard (June 2000): Common Warehouse Metamodel (CWM); XML based; supported by Oracle; designed by Oracle, Unisys, IBM, NCR and Hyperion • Industry initiatives just taking hold • Proprietary solutions inadequate • Who is missing?
Meta Data Challenges • The Meta Data conundrum • Thin tool support (pairing standards, MSFT coming) • Hidden resource trap • Absolute requirement for success
Web Sites • List of metadata tools http://www.dwinfocenter.org/catalog.html • Universal Metadata http://www.eaijournal.com/Data.Integration/Holy.Grail.asp • Metadata Project http://www.dis.state.ar.us/DIS_Proj/EDWR/Metadata/MD_Home.htm