"Turning Text into Gold" Offers Concise Overview of Text Analytics
This short introduction to taxonomies and text analysis is perfect for the analytics newbie or manager who wants an easy-to-understand overview of the technology and its benefits.
- By James E. Powell
- May 19, 2017
There's plenty of information in your data; the trick is to find it. Bill Inmon's new book, Turning Text into Gold (2017, Technics Publications, $24.95), provides a smart introduction to taxonomies and text analytics. If your (or your manager's) eyes glaze over when you start to hear a litany of tech terms, you'll appreciate Inmon's direct, down-to-earth, keep-it-simple approach. He keeps tech babble to a minimum, offers plenty of examples, and doesn't talk down to you.
The author explains the fundamentals of taxonomies -- what they are, how they work, and why they're important. He starts with simple examples (the taxonomy of cars, of animals, and of sports), then explains more complex arrangements (hierarchical taxonomies and relationships between taxonomies, for example).
He's also keen to explain where taxonomies come from -- the advantages of homegrown taxonomies and the benefits and challenges of buying commercial taxonomies geared to your industry (and why you may want to customize them). He explores the world of taxonomies as they apply to data models.
Inmon discusses how enterprises combine text from multiple sources to take advantage of integrated data and the role taxonomies play in this. His everyday examples of text analytics make it clear where all the "gotchas" are hiding.
The first half of the book takes you through the tech terms with clear examples, but the meat of the book comes in the second half, in which Inmon examines several real-world examples of how taxonomies are used to turn text into decisions. This includes a discussion of textual disambiguation -- a fancy term for processing text into a form that lets you take action, such as turning social media comments into a weekly report on your enterprise's reputation. Inmon also tackles how you can take that social media data and other unstructured data and merge them with records in your existing data warehouse.
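To make the idea concrete, here is a minimal Python sketch of how a taxonomy might be used to turn free-form comments into counts you could feed into a report. It is not taken from the book: the taxonomy contents, the function names (tag_text, category_counts), and the sample comments are all assumptions made up for illustration, and real textual disambiguation handles far more than single-word matches.

```python
import re
from collections import Counter

# Hypothetical taxonomy (category -> terms); the entries are illustrative only.
TAXONOMY = {
    "car": ["sedan", "suv", "pickup"],
    "sport": ["soccer", "tennis", "golf"],
}

# Invert the taxonomy to term -> category for quick lookup.
TERM_TO_CATEGORY = {
    term: category
    for category, terms in TAXONOMY.items()
    for term in terms
}


def tag_text(text):
    """Return (term, category) pairs for each taxonomy term found in text."""
    words = re.findall(r"[a-z]+", text.lower())
    return [(w, TERM_TO_CATEGORY[w]) for w in words if w in TERM_TO_CATEGORY]


def category_counts(comments):
    """Count how often each taxonomy category appears across comments."""
    counts = Counter()
    for comment in comments:
        counts.update(category for _, category in tag_text(comment))
    return counts


# Made-up sample comments standing in for social media text.
comments = [
    "Love my new SUV, perfect for hauling golf clubs.",
    "The sedan handles well, but the pickup is more practical.",
]
report = category_counts(comments)
print(dict(report))
```

The counts are structured data, so in principle they could be loaded into a table alongside existing warehouse records; how that merge is done would depend on your own schema.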
I particularly appreciated the copious examples that illustrate the key points of his text. It's also helpful that, at 122 pages divided into short chapters, Inmon keeps things moving along. He provides just enough detail so you'll understand the fundamentals without feeling as if the book is "dumbed down."
Inmon has a good sense of what the reader is thinking. The first few chapters continually raised new questions for me, and remarkably, he always seemed to answer my question on the very next page. It was uncanny.
If you're pressed for time but want to know more about how you can turn unstructured text into meaningful, analytics-ready data, Inmon's book is a great place to start.
About the Author
James E. Powell is the editorial director of TDWI, where his responsibilities include research reports, the Business Intelligence Journal, and the Upside newsletter. You can contact him via email here.