Question and Answer: Identity Resolution Reveals
Identity resolution engines provide the "smarts" to discover "who's who" by examining data in multiple sources and silos to give you a complete picture of your customer.
By Linda L. Briggs
April 1, 2009
Identity resolution software helps companies look across multiple data sources and silos in the enterprise to discover "who's who and who knows whom." That's according to Douglas Wood, senior vice president of global sales for Infoglide Software, which develops and markets a standards-based identity resolution solution. Government has been a primary user of identity resolution engines, especially since 9/11, but in the past several years, the software has expanded into many fraud detection uses within the commercial sector.
Wood says that the term identity resolution is sometimes misapplied in the industry to describe software that performs less-rigorous functions such as data matching. Although identity resolution engines do have data matching at the core, he says, they provide much more functionality and flexibility. Wood encourages readers who want more information to visit a recognized industry meeting place for sharing thoughts and articles about the topic at http://www.identityresolutiondaily.com.
BI This Week: You've said that the term "identity resolution technology" is used in various ways in the software industry. Can you start by defining the term?
Douglas Wood: You can think of identity resolution as an operational business intelligence process. It's software that allows organizations to connect disparate data sources in order to understand possible entity matches and non-obvious relationships. It boils down to this: providing capabilities for organizations to understand "who's who" and "who knows whom" across multiple data silos.
Occasionally, when we introduce the concept of identity resolution technology to a new customer, their immediate response is "I see, but we already have a data matching engine." The truth is, identity resolution engines have data matching at the core, but provide much more functionality and flexibility than that.
For example, the entire process of identity resolution should be data-agnostic. In other words, a true identity resolution engine should provide a variety of connectivity options and should have capabilities to analyze identities and entities both inside and outside the data warehouse. More powerful identity resolution engines allow analysis of remote data as well.
Perhaps the key element of an identity resolution engine is the ability to take the entity match and relationship results, then apply domain-specific rules to them. How does the enterprise treat Customer A by virtue of the fact that he or she has some non-obvious relationship with Customer B? The answer to that question is specific to the domain.
For example, an insurance company may decide to reject Customer A's claim if Customer B is a known fraudster. On the other hand, an airline may decide to give Customer A some free upgrades based upon Customer B's "platinum" status. Identity resolution engines automate decisions like these, then push those decisions out to users.
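The idea of applying domain-specific rules to a relationship result can be sketched in a few lines of Python. This is only an illustration of the insurance and airline examples above, not a description of any vendor's product: the `Relationship` record, the flag values, and the rule functions are all hypothetical names invented for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A relationship the engine has already found (hypothetical structure)."""
    subject: str       # the customer being evaluated, e.g. "Customer A"
    related_to: str    # the customer they are linked to, e.g. "Customer B"
    related_flag: str  # what is known about the related customer


def insurance_rule(rel: Relationship) -> str:
    # Assumed policy: a link to a known fraudster leads to rejecting the claim.
    return "reject claim" if rel.related_flag == "known_fraudster" else "no action"


def airline_rule(rel: Relationship) -> str:
    # Assumed policy: a link to a platinum member earns an upgrade offer.
    return "offer upgrade" if rel.related_flag == "platinum" else "no action"


# The same relationship result is routed to a different rule per domain.
RULES = {"insurance": insurance_rule, "airline": airline_rule}


def decide(domain: str, rel: Relationship) -> str:
    """Apply the rule registered for the given domain to a relationship."""
    return RULES[domain](rel)
```

The point of the design is that the matching result stays the same; only the rule looked up for the domain changes the decision that is pushed out to users.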
What are some of the ways that identity resolution software is used today?
With identity resolution technology, data isn't subjected to deterioration processes such as cleansing or record merging. Rather, the data can remain in its original state and in its original location. This means that the "forensic value" of individual records is preserved for ongoing analysis.
With that in mind, identity resolution is especially applicable to solutions that seek to uncover risk, fraud, and conflicts of interest. It's also a powerful matching and clustering tool for use within data warehouse and master data management applications.
Government remains a primary user of identity resolution engines, but the industry has expanded into commercial sectors, particularly within the past few years. Financial institutions can use identity resolution for PATRIOT Act compliance or credit card and loan fraud analysis. Retailers can analyze organized retail crime activities by comparing shoplifters with employees or frequent merchandise returners. Workers compensation insurance companies can detect employers that change attributes to avoid paying premiums, or detect non-obvious relationships between claimants and incident witnesses. Lottery corporations can compare winning ticket holders against lottery retail employees to discover potential fraud.
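As a rough illustration of the lottery example, the sketch below compares two separate lists and reports pairs of people who share an attribute even though their names differ. It is a deliberately simplified stand-in: the field names (`name`, `address`, `phone`) and the normalization step are assumptions made for the example, and a real engine would use far more sophisticated comparison. Note that the source records are only read, never cleansed or merged.

```python
def normalize(value: str) -> str:
    """Lower-case a value and keep only letters and digits for comparison."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def shared_attributes(a: dict, b: dict, fields=("address", "phone")) -> list:
    """Return the fields on which two records agree after normalization."""
    return [
        f for f in fields
        if a.get(f) and normalize(a[f]) == normalize(b.get(f, ""))
    ]


def find_links(winners: list, employees: list) -> list:
    """List (winner, employee, shared fields) for every pair sharing an attribute."""
    links = []
    for w in winners:
        for e in employees:
            shared = shared_attributes(w, e)
            if shared:
                links.append((w["name"], e["name"], shared))
    return links
```

A winner and a retail employee with different names but the same home address would surface here as a link worth reviewing, which is the kind of non-obvious relationship described above.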
What are some of the problems with traditional matching engines, which you've said sometimes call themselves identity resolution solutions?
There are no problems with matching engines per se. They just aren't identity resolution engines. Matching engines typically use one or two algorithms, perhaps mathematical in nature, that examine structured data looking for name matches. They provide an improvement over Soundex, but little more. What they do is say "Yes, this is a match" or "No, this is not a match." Often, that isn't enough.
Identity resolution engines are different because they have the capability to connect to wider varieties of data types, to handle larger volumes of data, and to provide much deeper analysis into why something matched rather than if something matched. Additionally, identity resolution engines provide the non-obvious relationship detection element missing from simple matching engines.
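The difference between answering "if" and answering "why" can be shown with a small Python sketch. It uses the standard library's `difflib.SequenceMatcher` purely as a stand-in similarity measure; the field names, the threshold, and the simple averaging are assumptions for illustration and do not represent the algorithms any particular engine uses.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def yes_no_match(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Simple matching-engine style answer: name match or no name match."""
    return similarity(a["name"], b["name"]) >= threshold


def explained_match(a: dict, b: dict, fields=("name", "address", "phone")) -> dict:
    """Score each attribute separately so a reviewer can see why records matched."""
    by_field = {f: round(similarity(a[f], b[f]), 2) for f in fields}
    overall = round(sum(by_field.values()) / len(by_field), 2)
    return {"overall": overall, "by_field": by_field}
```

The first function returns only a verdict. The second returns the per-attribute evidence behind the verdict, which is what makes it possible to apply further rules or investigate a result.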
In a nutshell, if the software requires the data to be pre-processed in some way, it's not a true identity resolution engine.
Why are identity resolution software solutions so popular right now?
Identity resolution engines have been successfully implemented for many years now. Even so, the technology was seen as niche up until the last few years. Unfortunately, it was the events of September 11, 2001 that brought the value of this technology to light. As part of the 9/11 Commission recommendations, the Department of Homeland Security began a robust search for identity resolution technology that could keep terrorists off airplanes by comparing passenger attributes against terrorist watch lists, no-fly lists, and other types of threat-related data.
Through a few iterations, the selection and implementation process evolved into the Transportation Security Administration's (TSA) Secure Flight Program, which ultimately thrust identity resolution technology into the limelight.
Secure Flight is now widely recognized as the pre-eminent use case for identity resolution technology, and a number of companies are involved in delivering that solution.
How does the software typically work with a BI or data warehouse application?
Data warehouse and BI applications have always gone hand in hand. Together, they go a long way toward providing a single version of the truth. What has been missing until recently, though, is the deeper analysis of identity and entity matches and clusters. Identity resolution engines provide this functionality, and then make automated decisions that affect business processes in real time, limiting the need for human intervention.
How does Infoglide address some of what we've talked about today?
Infoglide develops and markets a standards-based identity resolution solution called Identity Resolution Engine (IRE). At its core is a library of over 50 algorithms that allow organizations to connect to a variety of data sources -- in or out of the data warehouse -- with a view to understanding "who's who" and "who knows whom" across those silos.
With over ten years of research and development, we've perfected the combination of lexigraphic and domain-specific algorithms to deliver a high degree of precision that virtually precludes false positives -- something that can't be approached using a generic matching engine.
Additionally, IRE has patented capabilities to search and analyze free text, and compare different elements in an unstructured blob of text to find similarities. Our product is highly configurable and easy to integrate and includes a robust link-visualization tool for additional data analysis.