Textual Data Marches On: Anticipating and Preventing Lawsuits

Natural language processing has its limits in text analytics.

By W. H. Inmon

In the beginning, relational data bases attempted to address the challenges of textual data with "blobs." Then there was the technique known as "tagging." After tagging came NLP -- natural language processing. Each new approach to managing narrative text brought new opportunities to base corporate decisions on text, and those opportunities soon took shape as NLP-based analysis of customer sentiment.

Indeed, NLP addressed some of the challenges of text and decision making. With NLP, people could at least start to include customer sentiment in the corporate decision-making process.

However, there were limitations to using NLP for customer sentiment. Most sentiment analysis was based on reading and interpreting social media data. Although social media data is good for expressing some limited feelings, it leaves a lot to be desired when it comes to longer, fuller, more sophisticated expressions of feeling. In addition, the techniques used in NLP are themselves somewhat limited and artificial.

However you want to look at it, there is only so much value in using NLP for sentiment analysis of social media.

Anticipating Lawsuits

There are other, much more productive venues where analytical processing of text can serve as the basis for corporate decisions. One of those venues is the anticipation and prevention of lawsuits. A lawsuit damages the enterprise's image: once word gets out that a lawsuit (or a series of related lawsuits) has been filed, the corporate image suffers even if the lawsuit is ultimately defended successfully.

There is also the direct cost. It is estimated that settling a lawsuit costs the enterprise approximately $300,000 on average, a tremendous justification for minimizing lawsuits.

With this background information in mind, how is this for an idea: anticipate that a lawsuit is in the works and resolve the problem before the problem turns into a lawsuit. Create an "early warning system" that a lawsuit is brewing and address the issues and the parties involved before the lawsuit is filed. Just think how much money that can save your enterprise!
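To put the savings in rough numbers, the short sketch below multiplies out the approximately $300,000 average settlement estimate quoted above. The function names are illustrative, and the calculation deliberately ignores legal fees, the cost of building the early warning system, and any damage to the corporate image, so treat it as back-of-the-envelope arithmetic only.

```python
import math

# Average settlement per lawsuit, in dollars (the estimate quoted in the article).
AVERAGE_SETTLEMENT = 300_000


def settlement_cost_avoided(lawsuits_prevented: int) -> int:
    """Settlement dollars avoided if this many lawsuits never get filed."""
    return lawsuits_prevented * AVERAGE_SETTLEMENT


def lawsuits_needed(target_savings: int) -> int:
    """Smallest number of prevented lawsuits that reaches the target savings."""
    return math.ceil(target_savings / AVERAGE_SETTLEMENT)


if __name__ == "__main__":
    needed = lawsuits_needed(1_000_000)
    print(needed, settlement_cost_avoided(needed))
```

Under that estimate, heading off four lawsuits is already enough to pass the million-dollar mark in avoided settlements.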

Two Technological Advances

Two new technological breakthroughs make it possible to build an early warning system that anticipates a lawsuit. The first is the advent of the technology known as "textual disambiguation." Textual disambiguation has some similarities to NLP but has some novel features as well. For example, textual disambiguation relies heavily on identifying and using the context of text, not just the text itself. Textual disambiguation finds and addresses the text, and it also finds the context of that text and treats that context as an equal partner to the text itself. There are other important differences between NLP and textual disambiguation as well.
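The idea of keeping context as an equal partner to the text can be illustrated with a very small sketch. This is not the author's textual disambiguation technology; it is a toy lookup that assumes a hand-built taxonomy. The `TAXONOMY` entries, the `ContextualizedTerm` record, and the `contextualize` function are all hypothetical names invented for this illustration.

```python
import re
from typing import List, NamedTuple

# Hypothetical taxonomy: maps a raw term to the context it is read in.
# These entries are illustrative assumptions, not taken from the article.
TAXONOMY = {
    "fired": "employment action",
    "demoted": "employment action",
    "too old": "age reference",
    "injured": "product harm",
    "recall": "product harm",
}


class ContextualizedTerm(NamedTuple):
    term: str      # the text that was found
    context: str   # the context assigned to that text
    offset: int    # character position of the term in the document


def contextualize(document: str) -> List[ContextualizedTerm]:
    """Return every taxonomy term found in the document, paired with its context."""
    lowered = document.lower()
    found = []
    for term, context in TAXONOMY.items():
        # Word boundaries keep "fired" from matching inside "misfired".
        pattern = r"\b" + re.escape(term) + r"\b"
        for match in re.finditer(pattern, lowered):
            found.append(ContextualizedTerm(term, context, match.start()))
    # Report the terms in the order they appear in the document.
    return sorted(found, key=lambda item: item.offset)


if __name__ == "__main__":
    for record in contextualize("He said I was too old and then I was fired."):
        print(record)
```

Each output record carries the term and its context side by side, so a downstream query can ask for every "age reference" rather than searching for particular words.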

The second technological advance that makes an early warning system for anticipating lawsuits possible is "big data." With the technology surrounding big data, it is now possible to store and process huge amounts of data. In years past there was always a limit on how much data a system could store and process. The limit was both a technological one and an economic one. At the back of every system manager's mind was the notion that the system must not consume too much data.

With the advent of big data, that limitation no longer exists. In today's world, it is technologically and economically feasible to build systems of enormous size.

Because of these two technological advances, it is now possible to construct an early warning system for corporate litigation. Now the organization can read, organize and analyze all sorts of textual information in search of the potential lawsuit.


So what kinds of lawsuits can be identified and anticipated? Many kinds of lawsuits can be detected. Two obvious types (among many others) are employment discrimination lawsuits and product liability lawsuits.

Discrimination lawsuits can focus on sex, age, religion, race, and sexual orientation.

Where is the basic early warning information for these types of lawsuits found? The answer is in lots of places, including e-mails, call center conversations, help desk conversations, warranty claims, and insurance claims. Wherever it is found, the information is inevitably in the form of text.

Heretofore it has not been possible to analyze these sources of information effectively. But with the advances we described, it is now possible to perform analysis that leads to early detection of the causes of potential lawsuits. Note that there are many kinds of lawsuits, not just the ones we mentioned.
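As a rough illustration of what scanning these sources might look like, the sketch below counts how many text records from different sources contain a warning phrase for each lawsuit type, and raises an alert once a type reaches a threshold. Everything here is an assumption made for the example: the phrases in `WARNING_PHRASES`, the `(source, text)` record shape, and the threshold of two are invented, and simple substring matching stands in for real contextual analysis.

```python
from collections import Counter
from typing import Iterable, List, Tuple

# Hypothetical warning phrases per lawsuit type (assumptions for illustration).
WARNING_PHRASES = {
    "employment discrimination": ["passed over", "because of my age", "hostile"],
    "product liability": ["caught fire", "injured", "defective"],
}

Record = Tuple[str, str]  # (source, text), e.g. ("e-mail", "...")


def scan(records: Iterable[Record]) -> Counter:
    """Count, per lawsuit type, how many records contain a warning phrase."""
    alerts = Counter()
    for _source, text in records:
        lowered = text.lower()
        for lawsuit_type, phrases in WARNING_PHRASES.items():
            # A record counts once per lawsuit type, however many phrases it hits.
            if any(phrase in lowered for phrase in phrases):
                alerts[lawsuit_type] += 1
    return alerts


def early_warnings(records: Iterable[Record], threshold: int = 2) -> List[str]:
    """Return the lawsuit types whose record count reaches the threshold."""
    return sorted(t for t, n in scan(records).items() if n >= threshold)


if __name__ == "__main__":
    sample = [
        ("e-mail", "I was passed over again because of my age."),
        ("help desk", "The charger caught fire."),
        ("warranty claim", "Unit is defective and my son was injured."),
        ("call center", "Please reset my password."),
    ]
    print(early_warnings(sample))
```

The threshold reflects one possible design choice: a single complaint may be noise, while the same signal arriving from several independent sources is more likely to deserve attention before a lawsuit is filed.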

The Business Value of Anticipating Lawsuits

An effective early warning system has the potential for a payback measured in the millions of dollars, and the value of protecting and enhancing the corporate image goes beyond anything that can be measured in dollars.

The business value of lawsuit anticipation and prevention eclipses the payback of sentiment analysis of social media.

To Find Out More

For more information about such an early warning system for anticipating lawsuits and how such a system might be built, refer to the newly published book Preventing Litigation: An “Early Warning System to Get Big Value from Big Data” from Business Expert Press. For more information about textual disambiguation implementation and the contextualization of text into a standard data base management system, visit

Bill Inmon has written 54 books published in 9 languages. Bill's company -- Forest Rim Technology -- reads textual narrative, disambiguates the text, and places the output in a standard data base. Once in the standard data base, the text can be analyzed using standard analytical tools such as Tableau, Qlikview, Concurrent Technologies, SAS, and many more analytical technologies.
