Tackle Data Quality Issues

  • Slides: 12
Introduction

Data Quality (DQ) is a significant issue facing many organizations: poor data quality is associated with a variety of hard and soft costs. Most organizations struggle to define and implement a formal strategy for addressing DQ problems. Solve data quality woes by adopting a systematic approach for identifying and correcting current issues, then put processes in place to stop data quality problems from resurfacing.

This solution set will help you:
• Understand the five types of data quality problems, and identify symptoms and causes.
• Assess what types of data quality problems are prevalent in your organization.
• Formulate effective strategies for cleaning your data and keeping it clean.

This research is ideal for:
• IT & business managers responsible for data quality processes
• IT professionals responsible for data integration & integrity
• Business professionals with data ownership responsibilities

Executive Summary

DQ Overview
• Data quality is the overall ability of data to fulfill its intended purpose. Organizations of all sizes and in many different industries need clean data to operate.
• Poor data quality negatively impacts a wide array of functions and business processes. Both the business and IT are responsible for solving data quality issues.

DQ Under the Microscope
• There are five distinct data quality problems: data duplication, stale data, incomplete data, invalid data, and data conflicts.
• Each one of these problems requires a different approach. There are specific steps that can be taken by the business and IT to address these issues on both a corrective and preventative basis.

Driving DQ Change
• A massive data clean-up initiative is effective at correcting current problems, but unless process improvements are made, problems will continue to crop up.
• Securing executive buy-in for improving data quality is essential.
• Create ownership and accountability for data by using data stewards.
• Establish clear policies and procedures for carrying out data integration.

DQ Tool Landscape
• There are several classes of vendors that provide either problem-specific solutions or full data quality suites.
• Several vendors provide automated tools that speed up detection and elimination of data quality problems (e.g., de-duplication tools).
• Data quality tools are not as effective as other methods for cleaning data.

Step 1: Data Quality Overview

Data Quality Overview: Data quality problems are endemic across many organizations. Take steps to stop poor data in its tracks.

• Data quality is the overall ability of data to effectively fulfill its intended purpose (operations, decision making, planning, etc.).
• Data quality impacts virtually every organization; internally, it can have a snowball effect across silos.
• IT and the business both have a role to play in ensuring the accuracy of data. While the business is responsible for the overall ownership of data, IT needs to take a proactive role in offering solutions that address and mitigate recurring data quality headaches.

Data is high quality if it can be successfully utilized for its intended purposes

• Data quality expert Joseph Juran defines data quality as whether or not data is “fit for intended uses in operations, decision making and planning.”
• Data quality does not refer to a single problem: it’s an umbrella term referring to a family of different issues.
• Data quality is not a matter of “excellent” vs. “poor.” An organization may excel in some areas of data quality, but not in others. For example, an organization may struggle with duplicate data, but have processes in place for ensuring data remains fresh.
• Different data quality problems are often interrelated. For example, duplicate data can give rise to data conflicts. Taking steps to fix one problem can have a positive “halo effect” on other problems.

There are five different issues under the DQ umbrella:
• Duplicate Data
• Incomplete Data
• Stale Data
• Invalid Data
• Conflicting Data

Info-Tech Insight: Data quality doesn’t refer to a single problem. It’s a broad term for several specific types of maladies.

Organizations of all sizes should be concerned about data quality

Data quality impacts organizations of all sizes across a variety of industries.

• Contrary to popular belief, data quality is not a problem that’s limited to larger organizations or those that deal with large volumes of data. Smaller organizations also rely on accurate data for decision making.
• For example, accurate and timely contact information is essential for the operations of virtually all organizations. But poor data quality in the form of duplicate records and stale data undermines the efficacy of e-mail, phone, and mail communications.
• As a percentage of total costs, poor data can actually impact smaller organizations more than large firms. The loss of a key client due to erroneous data is often felt more by a firm with smaller overall revenues.

[Chart: survey responses on data quality, rated from Adequate to Severe.] A recent Info-Tech survey found that organizations of all sizes encounter problems with inadequate data quality. There was no strong difference between small and large enterprises.

Info-Tech Insight: Even smaller organizations need to unearth and solve data quality problems. Poor data erodes business processes and leads to wasted time, effort, and money.

Data quality problems are rarely isolated: they create headaches for a wide variety of user groups within the organization

• Poor data quality is particularly problematic because it is rarely limited to a single business process or department.
• Low-quality data has a snowball effect.
• Incorrect information entered into a sales database can quickly spill over into other departments and degrade other business processes.
• If a salesperson enters incorrect or incomplete contact information for a customer, there will be problems with order fulfillment and direct marketing.
• In the short term, this can lead to increased shipping and rework/discount costs. In the long term, the damaged customer relationship means a lack of repeat orders.

Data Quality Costs Snowball from Point-of-Origin
• $: Issue at point-of-origin, e.g., duplicate data is entered into the CRM system rather than updating a stale record.
• $$: Problem becomes larger, e.g., duplicate data creates data conflicts between systems.
• $$$: Problem snowballs into escalating costs, e.g., the billing department sees the conflict but uses the stale record for customer billing, driving up average A/R cycle time.

Info-Tech Insight: Poor data quality is a serious issue that impacts a wide array of business processes. Viewing data quality as isolated within a particular silo can be dangerous: problems in one silo can easily spill into others.

The business needs to have ownership of the data, but IT must provide solutions for fixing DQ problems

• Conventional wisdom holds that the business is responsible for ensuring the integrity and accuracy of data. It’s not uncommon for IT to downplay its role in addressing data quality issues.
• However, poor data quality is an endemic problem that often permeates the organization. Individual business units rarely have the resources or authority to unilaterally solve their data quality problems.
• While the business needs to recognize that it is ultimately accountable for data ownership, IT must take a proactive stance on providing solutions and assistance with data quality.
• It’s important to delineate the relationship between IT and the business and specify who is responsible for what. IT should not be taking charge of the data; rather, it should provide tools and assistance with data cleansing.

The business needs to:
• Set policies for matters such as refresh cycles for stale data.
• Determine which systems will be “systems of record” to reduce conflicts.
• Determine access privileges and data validation rights.

IT needs to:
• Advise the business on software tools for improving data quality.
• Provide assistance with major cleansing efforts.
• Provide assistance with database and interface design (e.g., locking down certain fields from end users, and setting up data validation).

Info-Tech Insight: IT and the business often try to “pass the buck” for data quality issues to one another. The business must own the data, but IT needs to have an active role in offering solutions to help the business address data quality problems.
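As one illustration of the interface-design help IT can offer (locking down fields and setting up data validation), here is a minimal Python sketch. The field names, the `LOCKED_FIELDS` and `REQUIRED_FIELDS` sets, and the `validate_update` function are all hypothetical; in practice the business would decide which fields are locked or required, and the rules might live in the database or the application's forms instead.

```python
# Hypothetical policy, set by the business: which fields end users may not
# change, and which fields must always be filled in.
LOCKED_FIELDS = {"customer_id"}
REQUIRED_FIELDS = {"name", "email"}


def validate_update(current, changes):
    """Return a list of error messages; an empty list means the update passes."""
    errors = []

    # Reject any attempt to alter a locked field.
    for field in sorted(changes):
        if field in LOCKED_FIELDS and changes[field] != current.get(field):
            errors.append(f"{field}: locked field cannot be changed")

    # Check required fields against the record as it would look after the update.
    merged = {**current, **changes}
    for field in sorted(REQUIRED_FIELDS):
        if not str(merged.get(field) or "").strip():
            errors.append(f"{field}: required field is empty")

    return errors


record = {"customer_id": 7, "name": "Ann", "email": "ann@example.com"}
bad = validate_update(record, {"customer_id": 8, "email": " "})
good = validate_update(record, {"name": "Ann B."})
```

Here `bad` holds two errors (a locked-field change and a blanked-out required field), while `good` is an empty list.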

Organizations experience fewer data quality problems when responsibility is evenly distributed between end users & IT

Data Quality Problem Score is defined as how problematic (on average) the following are at your organization, from not a concern (0%) to significant concern (100%):
• Data Duplication
• Stale Data
• Incomplete Data
• Invalid Data
• Data Conflicts

[Chart: Data Quality Problem Score vs. User Responsibility for Data Quality (Low, Medium, High). N = 113. Source: Info-Tech Research Group.]

• Giving end users too little responsibility for data quality can encourage them to become sloppy or uncaring about data.
• Giving end users too much responsibility for data quality can overwhelm them, especially if they have insufficient tools to reduce problems.
• Shared responsibility is ideal.
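The Data Quality Problem Score described above can be sketched as a simple average. This assumes an unweighted mean of the five ratings, each on a 0 to 100 scale; the slide does not state the survey's exact method, and the `problem_score` function, the key names, and the sample ratings are hypothetical.

```python
from statistics import mean

# The five problem types named in the deck (key names are illustrative).
PROBLEMS = ("duplication", "stale", "incomplete", "invalid", "conflicts")


def problem_score(ratings):
    """Average five concern ratings (0 = not a concern, 100 = significant concern)."""
    for problem in PROBLEMS:
        if problem not in ratings:
            raise ValueError(f"missing rating: {problem}")
        if not 0 <= ratings[problem] <= 100:
            raise ValueError(f"rating out of range: {problem}")
    # Assumption: every problem type carries equal weight.
    return mean(ratings[problem] for problem in PROBLEMS)


# Made-up ratings for a single organization.
example = {"duplication": 40, "stale": 30, "incomplete": 20,
           "invalid": 10, "conflicts": 25}
score = problem_score(example)
```

With these sample ratings the score works out to 25, i.e. (40 + 30 + 20 + 10 + 25) / 5.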

Step 2: Data Quality Under the Microscope

Data Quality Under the Microscope: Identify & remedy the five problems that are undermining your data.

• There are five distinct data quality issues:
  • Data duplication
  • Stale data
  • Incomplete data
  • Invalid data
  • Data conflicts
• Remedying each issue requires a combination of corrective and preventative measures by IT and the business.

It’s necessary to identify the five data quality problems & ascertain which of them must be remedied in your organization

The five data quality problems are distinct issues, but they may have similar underlying causes.

Data Duplication
• What is it? Multiple copies of the same piece of data.
• What causes it? Incorrect data entry; poor integration; faulty database design.
• What does it impact? Wasted storage space; ongoing problems with direct sales and/or marketing communications.
• What can be done about it? Data quality tools; better integration; unique indices for data.

Stale Data
• What is it? Data being incorrectly used on the assumption that it is current.
• What causes it? Contacts changing position; one-time integration with no ongoing delta import; data not being available fast enough from source systems.
• What does it impact? Problems with marketing correspondence, leading to lost sales and damaged customer relationships.
• What can be done about it? Establish clear data refresh cycles; pull customer information from user-supplied sources, such as social networking sites.

Incomplete Data
• What is it? Key fields are missing or not filled out.
• What causes it? End user apathy; required fields not being enforced; poor user interface.
• What does it impact? Missing data can lead to productivity losses and flawed decision-making.
• What can be done about it? End-user training; strong data validation; easy-to-use interfaces.

Invalid Data
• What is it? The wrong data or poorly formatted data is stored in columns.
• What causes it? Ineffective or non-existent validation rules; data type mismatches between integrated systems.
• What does it impact? Creates integration exception reports, which must be investigated; interferes with operational reporting.
• What can be done about it? Strong data validation; elimination of extraneous use of note fields; user training.

Data Conflicts
• What is it? Data contained in one system is at odds with data contained in another system.
• What causes it? No designated system of record; poor integration; lack of data interchange between systems.
• What does it impact? Data conflicts confuse users; wasted time and effort; threat of using incorrect data.
• What can be done about it? Tighter system integration; data auditing.
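To make the five problem types concrete, the sketch below runs one simple check per problem over a handful of made-up customer records. Everything here is an illustrative assumption, not part of the original deck: the record layout, the one-year staleness threshold, the deliberately loose e-mail pattern, and the choice of e-mail as the duplicate key. Real de-duplication and auditing tools use far more sophisticated matching.

```python
import re
from collections import Counter
from datetime import date, timedelta

# Hypothetical CRM records and a second system (billing) keyed by customer id.
crm = [
    {"id": 1, "email": "ann@example.com", "phone": "555-0100", "updated": date(2024, 1, 10)},
    {"id": 2, "email": "ANN@example.com ", "phone": "555-0100", "updated": date(2024, 1, 12)},
    {"id": 3, "email": "bob@example.com", "phone": "", "updated": date(2021, 6, 1)},
    {"id": 4, "email": "not-an-email", "phone": "555-0199", "updated": date(2024, 2, 1)},
]
billing = {1: "ann@example.com", 3: "robert@example.com"}

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # loose format check only


def norm(email):
    """Normalize an e-mail so trivially different spellings compare equal."""
    return email.strip().lower()


def duplicates(rows):
    """Ids of records that share a normalized e-mail with another record."""
    counts = Counter(norm(r["email"]) for r in rows)
    return [r["id"] for r in rows if counts[norm(r["email"])] > 1]


def stale(rows, today, max_age_days=365):
    """Ids of records not updated within the refresh cycle."""
    cutoff = today - timedelta(days=max_age_days)
    return [r["id"] for r in rows if r["updated"] < cutoff]


def incomplete(rows, required=("email", "phone")):
    """Ids of records with a required field missing or blank."""
    return [r["id"] for r in rows
            if any(not str(r.get(f) or "").strip() for f in required)]


def invalid(rows):
    """Ids of records whose e-mail fails the format check."""
    return [r["id"] for r in rows if not EMAIL_RE.match(norm(r["email"]))]


def conflicts(rows, other):
    """Ids of records whose e-mail disagrees with the other system."""
    return [r["id"] for r in rows
            if r["id"] in other and norm(other[r["id"]]) != norm(r["email"])]


today = date(2024, 3, 1)
report = {
    "duplication": duplicates(crm),
    "stale": stale(crm, today),
    "incomplete": incomplete(crm),
    "invalid": invalid(crm),
    "conflicts": conflicts(crm, billing),
}
```

On this sample data, records 1 and 2 are flagged as duplicates, record 3 as stale, incomplete, and in conflict with billing, and record 4 as invalid. A report like this only supports the corrective side; the preventative measures listed above (unique indices, refresh cycles, validation, a designated system of record) still need to be put in place.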
