Sep 2007

The Death of Dirty Data

By creating and maintaining accurate data sources insurers can go a long way to improving efficiency and enhancing business opportunities.


The data management challenge insurance companies face is enormous. Every major decision is driven by information on customers, policies and claims. That same information is the foundation for both industry and regulatory compliance initiatives. In addition, a steady stream of mergers and acquisitions introduces new customer bases and policy information (often containing non-standard or non-conforming records). The result is a confused, disjointed jumble of corporate data.

Bad data can affect any business, and insurers are especially vulnerable. The adage "garbage in, garbage out" applies perfectly to the problem many organizations face. Yet companies often do not devote the time, money and resources needed to improve and manage the quality of this vital information.

The best way to create and maintain accurate data sources is through effective data management processes and technologies. With these tools, information can be standardized, validated and updated across an organization in real time.


Why is clean data so important? Put simply, inconsistent or inaccurate data can make even the simplest tasks -- such as creating an accurate list of customers -- difficult. Consider, for example, an insurance Web site that allows consumers to compare multiple rates for the best automobile, life, home and other types of insurance coverage. The company processes around 400,000 transactions each day and, before cleansing its data, believed it had between 10 million and 14 million customers. The number was imprecise because customers often created duplicate identities when visiting the site multiple times. This company faced a major hurdle in attracting and retaining customers: if it didn't even know how many customers it had, how could it possibly communicate effectively with them?

Given the uncertainty about the size of its potential customer base - caused by rampant data quality problems in its customer data warehouse - the insurance comparison site started a data quality project to ensure it was making decisions based on accurate information. After the company implemented data-matching and data-correction technologies, it began to improve both the quality of its database and its confidence that newly entered data was correct.

After the company re-engineered its Web site's method of managing data, online figures revealed the company had only eight million unique customers, thereby removing a sizable percentage of its perceived customer base. Although this "shrinking" customer base could be seen as a negative, the company understood that a more accurate customer count meant it was building a better understanding of its customers and the services they needed.
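The kind of de-duplication that produces such a count can be sketched in a few lines. The sketch below is purely illustrative -- the field names and normalization rules are assumptions, and real data quality tools apply far richer matching logic:

```python
# Minimal de-duplication sketch: normalize the fields that should identify
# one customer, then count distinct normalized keys. Field names ("name",
# "email") are hypothetical.

def normalize(record):
    """Build a matching key from identifying fields, smoothing case and spacing."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)

def unique_customers(records):
    """Count distinct customers after normalizing obvious variations."""
    return len({normalize(r) for r in records})

records = [
    {"name": "Jane  Doe", "email": "jane@example.com"},
    {"name": "jane doe",  "email": "JANE@EXAMPLE.COM "},  # same person, re-registered
    {"name": "Bob Roe",   "email": "bob@example.com"},
]
print(unique_customers(records))  # 2
```

Even this crude normalization collapses the duplicate registration; production matching would also handle nicknames, typos and changed addresses.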

Data quality technology can also help you extend and enhance the value of customer information. By matching and clustering similar names at the same address, you can create "households" of records. This allows you to send only one message to that address, as well as to learn more about the aggregate household value for this existing or potential customer.
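Householding amounts to clustering records on a normalized address key. A minimal sketch, with invented sample data and a deliberately simple normalizer:

```python
from collections import defaultdict

# Hypothetical "householding" sketch: cluster customer records that share a
# normalized mailing address so only one message goes to each household.

def address_key(addr):
    """Normalize an address string: uppercase, strip periods, collapse spaces."""
    return " ".join(addr.upper().replace(".", "").split())

def households(customers):
    """Group (name, address) pairs into households keyed by normalized address."""
    groups = defaultdict(list)
    for name, addr in customers:
        groups[address_key(addr)].append(name)
    return dict(groups)

customers = [
    ("John Smith",  "100 E. Main St."),
    ("Mary Smith",  "100 E Main St"),
    ("Alice Jones", "42 Oak Ave."),
]
print(households(customers))
# {'100 E MAIN ST': ['John Smith', 'Mary Smith'], '42 OAK AVE': ['Alice Jones']}
```

The two Smith records collapse into one household, so one mailing replaces two, and the household's aggregate value can be computed from its member records.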

You can also use data quality technology to add information from third-party resources, such as:

* demographic data - help refine your search for the right customer based on age, income and other characteristics;

* geo-spatial data - use postal information to assign geographic information to accounts to help learn more about potential customers and where they live (which can affect the type and cost of policies); and

* business information - for business and commercial insurance policies, third-party information can add data about parent companies or subsidiaries to confirm that you are reaching the right people at the right business unit.
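Enrichment from any of these sources is essentially a lookup-and-merge on a shared key such as a postal code. The sketch below uses entirely invented field names and values; no real vendor feed looks exactly like this:

```python
# Illustrative third-party enrichment: attach demographic attributes to
# accounts by postal code. All names and figures are hypothetical.

demographics = {  # stand-in for a third-party lookup keyed by postal code
    "M5V 2T6": {"median_income": 78000, "median_age": 34},
}

accounts = [
    {"id": 1, "postal_code": "M5V 2T6"},
    {"id": 2, "postal_code": "K1A 0B1"},  # no demographic match available
]

def enrich(accounts, demographics):
    """Merge demographic attributes into each account where a match exists."""
    out = []
    for acct in accounts:
        extra = demographics.get(acct["postal_code"], {})
        out.append({**acct, **extra})
    return out

for row in enrich(accounts, demographics):
    print(row)
```

Accounts without a match pass through unchanged, which keeps the enrichment step safe to run over an entire customer base.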

By instituting a data quality program, a company can create and better manipulate customer information to drive better service and support. The same rules that drive the initial clean-up can then be enacted in real time to ensure that incoming data from Web sites or internal applications meet the same standards. This is a tremendous step forward for an organization, as it creates a more effective, responsive IT environment.


Data quality is much more than a tool for effective marketing. Information contained in an insurer's IT infrastructure can run the gamut from customer relationship management to regulatory compliance. Regardless of the data type, or the use of the data, data quality technology and processes can analyze the condition of existing data, implement rules that govern acceptable data quality and monitor the health of information over time. The goal here is to use data quality to minimize risk and aid in compliance.

One of the most compelling uses of data quality is in the area of fraud detection. Recent industry and regulatory requirements have compelled insurers to use data quality capabilities to help find and eliminate fraudulent information. Through intelligent and tunable matching algorithms (for example, is the John Smith at 100 E. Main St. the same as J. Smith at 100 Main St.?), data quality technologies can uncover suspicious or misleading claim information.
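The flavour of that matching question can be illustrated with a toy comparison: normalize both strings, then score their similarity. This sketch uses Python's standard-library SequenceMatcher as a stand-in; commercial data quality tools use far more sophisticated, tunable algorithms, and the 0.8 threshold here is an arbitrary assumption:

```python
from difflib import SequenceMatcher

# Toy fuzzy match in the spirit of the article's example: does
# "John Smith, 100 E. Main St." match "J. Smith, 100 Main St."?

def normalize(s):
    """Uppercase, drop punctuation, collapse whitespace."""
    return " ".join(s.upper().replace(".", "").replace(",", "").split())

def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 on the normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def likely_same(rec_a, rec_b, threshold=0.8):
    """Flag a pair as a probable match when similarity clears the threshold."""
    return similarity(rec_a, rec_b) >= threshold

a = "John Smith, 100 E. Main St."
b = "J. Smith, 100 Main St."
print(likely_same(a, b))  # True
```

Tuning the threshold trades false positives against missed matches -- the "tunable" part of the algorithms the article describes.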

By applying these matching routines to data sources, a data steward - a person or team responsible for making sure everyone at an organization applies strict business rules to their data - can identify duplicate customer information and fraudulent claims. Data quality technology can also check several similar characteristics for duplication, helping enhance the quality of potential matches across data sources.

Aside from its use in fraud detection, data quality technology is often used to establish and enforce business rules that monitor data for quality (is the data accurate?) and intent (is the data doing what we need it to do?). By monitoring business rules, data stewards can codify already-established rules and procedures, and apply those rules to applications and data sources across the enterprise. For example, a company may track monthly payments in one application but store information on overall customer value in another. The company would want to know whether the total of the monthly payments in Table A equals the expected annual revenue for that customer in Database B. An automated business rule can examine those data points - and send users an email or system alert when the data is out of compliance.
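The cross-application check described above reduces to a simple reconciliation rule. A minimal sketch, with invented customer IDs, amounts and tolerance:

```python
# Sketch of the cross-application business rule: compare the sum of monthly
# payments from one system against expected annual revenue from another, and
# raise an alert on any mismatch. All data here is illustrative.

monthly_payments = {  # customer_id -> monthly payments (e.g. from "Table A")
    "C001": [100.0] * 12,
    "C002": [250.0] * 12,
}

expected_annual = {  # customer_id -> expected annual revenue (e.g. "Database B")
    "C001": 1200.0,
    "C002": 2900.0,   # out of line with the monthly figures
}

def check_revenue_rule(payments, expected, tolerance=0.01):
    """Return (customer, actual_total, expected_total) for every violation."""
    alerts = []
    for cust, months in payments.items():
        total = sum(months)
        if abs(total - expected[cust]) > tolerance:
            alerts.append((cust, total, expected[cust]))
    return alerts

for cust, actual, exp in check_revenue_rule(monthly_payments, expected_annual):
    print(f"ALERT: {cust} monthly total {actual} != expected annual {exp}")
```

In practice the alert list would feed an email or dashboard notification rather than a print statement, but the shape of the rule is the same.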

This effort - often known as data governance - allows organizations to create a standardized approach to managing data. Because insurance companies often have silos of data, a critical success factor for insurers will be the ability to "speak the same language" across their data sources. This can be as simple as ensuring that a policyholder has the same information across applications, or as complex as standardizing policies across the enterprise.

By standardizing data quality technologies and procedures, you can eliminate potential points of failure. Even poor-quality address data can affect a company's ability to manage and mitigate risk. When querying the risk associated with policies in New York, for example, varying representations of that data element ("New York," "NY" or "N.Y.") can confuse reporting and produce invalid assessments. By creating and enforcing a business rule for this, you can intelligently aggregate and maintain better information about customers, policies and claims.
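Such a standardization rule is, at its simplest, a variant-to-canonical mapping applied before any aggregation. A minimal sketch, with an intentionally incomplete mapping:

```python
# Minimal standardization rule for the "New York" example: map known
# variants of a state name to one canonical code before aggregating.

STATE_VARIANTS = {  # illustrative, not a complete mapping
    "NEW YORK": "NY",
    "N.Y.": "NY",
    "NY": "NY",
}

def standardize_state(value):
    """Return the canonical code for a known variant; pass unknowns through."""
    key = value.strip().upper()
    return STATE_VARIANTS.get(key, key)

policies = [("P1", "New York"), ("P2", "NY"), ("P3", "N.Y.")]
by_state = {}
for policy_id, state in policies:
    by_state.setdefault(standardize_state(state), []).append(policy_id)
print(by_state)  # {'NY': ['P1', 'P2', 'P3']}
```

With the rule in place, a risk query for New York sees all three policies instead of three one-policy fragments.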

Data is the foundation of practically everything that goes on in today's organizations, and so managing the quality of data is reaching "business-critical" status. An organization's data is one of the true competitive advantages it can employ. Data contains insight about the company, its health, its customers and its finances. If the data is not consistent, accurate and reliable, the company will make poor decisions and experience poor results.

The good news is that insurers are perfectly positioned to embrace the tenets of data quality to create a more informed, responsive enterprise. The tools and best practices are available. With cooperation across and within organizations, better data can drive effective compliance, minimize risk and enhance customer relationships.


Tony Fisher, President and CEO, Dataflux Corporation
