Do Your Analytics Speak to You?
Advances in technology make it possible to deliver your analytics in a new way using natural language generation.
- By Troy Hiltbrand
- September 17, 2019
Now that personal digital assistants are widely accepted around the world, end users are more comfortable with and open to getting information through conversation. For analytics professionals looking to use conversation as an analytics interface, two technologies must come together. The first is natural language processing (NLP) -- interpreting human conversation as a system input. The second is natural language generation (NLG) -- the machine-based creation of a response.
Although neither of these technologies is new, increased availability of computing power, the growing maturity of analytics algorithms, and the modern collection of human-tagged training data have greatly moved NLP and NLG forward in recent years -- both in academia and industry.
Of the two, NLP is more mature. Code libraries in multiple programming languages offer built-in support for natural language processing, and many analytics platforms have added natural language query to their feature lists. Natural language generation is more complex but is gaining momentum, and analytics tool providers are taking notice.
How do you leverage this technology within your analytics program? The answer is to mature your approach progressively: start with the basics, then add complexity and robustness until you are delivering high-quality, automatically generated narrative based on the data in your systems.
Companies evaluating language generation usually begin by defining template messages as the basis of their natural language generation. At the time the message is needed, the template is merged with a set of input data to create an output message that can be delivered to the end user.
An example of this is an alert that can be the same every time with a few details changed. For example, if you are setting up an appointment system, the message template might be "[First Name], you have an appointment at [Appointment Time]." When you need to send the message, replacing the name and appointment time with data creates an effective message.
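The appointment alert above can be sketched in a few lines. This is a minimal illustration of template merging; the field names and template wording follow the example in the text, and `render_alert` is a hypothetical helper, not part of any specific product.

```python
from string import Template

# Message template with named placeholders, mirroring
# "[First Name], you have an appointment at [Appointment Time]."
APPOINTMENT_TEMPLATE = Template(
    "$first_name, you have an appointment at $appointment_time."
)

def render_alert(first_name: str, appointment_time: str) -> str:
    """Merge the input data into the template to produce the message."""
    return APPOINTMENT_TEMPLATE.substitute(
        first_name=first_name,
        appointment_time=appointment_time,
    )

print(render_alert("Maria", "3:00 PM"))
# Maria, you have an appointment at 3:00 PM.
```

At this maturity level the template is fixed and only the data varies, which is why the approach breaks down as soon as the surrounding grammar has to change with the data.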
The next level of maturity includes adjusting the grammar of the message based on the data to be injected into the message. This would include correct utilization of indefinite articles such as a and an, pluralization, making the verbs agree with the subject, and matching reflexive pronouns with their counterparts. Through the application of defined grammatical rules to a templated message, the sentence will not only deliver information but will also be grammatically correct. This reduces the awkwardness that may arise from system-generated messages.
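A rough sketch of such grammatical rules follows. The heuristics here (vowel-initial article choice, suffix-based pluralization) are deliberately naive illustrations, assumed for this example; production systems rely on pronunciation lexicons and richer morphological rules.

```python
def indefinite_article(noun: str) -> str:
    # Naive heuristic: "an" before a vowel letter. Real systems check
    # pronunciation (e.g., "an hour", "a unicorn"), not spelling.
    return "an" if noun[0].lower() in "aeiou" else "a"

def pluralize(noun: str, count: int) -> str:
    # Simplified English pluralization rules.
    if count == 1:
        return noun
    if noun.endswith(("s", "x", "ch", "sh")):
        return noun + "es"
    if noun.endswith("y") and noun[-2] not in "aeiou":
        return noun[:-1] + "ies"
    return noun + "s"

def describe(count: int, noun: str) -> str:
    # Subject-verb agreement plus article/plural handling in one sentence.
    verb = "is" if count == 1 else "are"
    qty = indefinite_article(noun) if count == 1 else str(count)
    return f"There {verb} {qty} {pluralize(noun, count)}."

print(describe(1, "anomaly"))  # There is an anomaly.
print(describe(3, "anomaly"))  # There are 3 anomalies.
```

Without these rules, the same template would emit "There is 3 anomaly," which is exactly the awkwardness the text describes.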
As narratives get longer and more complex, simple templates are no longer sufficient to effectively communicate information. An example of this is the automatic creation of the narrative surrounding scientific results. Depending on the nature of the results, the narrative would include different levels of content aggregation. If an experiment were successful, the detail level would differ from that of a failed experiment. The results may also require restructuring the generated content, prioritizing some details over others, depending on the outcome.
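Outcome-driven restructuring can be sketched as selecting both the sections and their detail level from the result data. Everything here, including the section wording, the result fields, and the idea that failures expand diagnostic detail, is an illustrative assumption rather than a prescribed design.

```python
def build_narrative(result: dict) -> list[str]:
    """Choose section content and ordering based on the experiment outcome."""
    headline = (
        f"The experiment {'met' if result['success'] else 'missed'} its target "
        f"({result['metric']:.1%} vs. a {result['target']:.1%} goal)."
    )
    sections = [headline]
    if result["success"]:
        # Successful runs lead with the result and summarize the detail.
        sections.append("Detailed measurements are summarized in the appendix.")
    else:
        # Failed runs prioritize diagnostics and expand the detail level.
        sections.append(f"Largest contributing factor: {result['top_factor']}.")
        sections.append("Full per-trial measurements follow for diagnosis.")
    return sections
```

The key point is that structure, not just wording, is computed from the data: the failure path produces more sections and reorders the emphasis toward diagnosis.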
In addition to content restructuring, some circumstances require adjusting the tone of the output. Consider the automatic creation of financial reports. A report of positive financial results or upbeat leading indicators should sound hopeful and optimistic. If the financial results are not favorable, the report must adopt a somber tone and focus its content on corrective actions.
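Tone selection can be modeled as choosing between phrase banks keyed to the data. The phrases, the `revenue_growth` trigger, and the `financial_summary` helper below are all illustrative assumptions for the sketch.

```python
# Phrase banks for the two tones described in the text.
POSITIVE = {
    "opener": "Results this quarter were encouraging:",
    "closer": "Momentum is expected to continue.",
}
NEGATIVE = {
    "opener": "Results this quarter fell short of plan:",
    "closer": "Management is focused on the corrective actions below.",
}

def financial_summary(revenue_growth: float) -> str:
    # Pick the tone from the sign of the result, then wrap the facts in it.
    tone = POSITIVE if revenue_growth >= 0 else NEGATIVE
    body = f"revenue changed {revenue_growth:+.1%} year over year."
    return f"{tone['opener']} {body} {tone['closer']}"
```

The same underlying number is reported either way; only the framing around it shifts, which is what distinguishes tone augmentation from content restructuring.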
Deep Learning/Generative Adversarial Networks
The future of natural language generation is the use of deep learning and generative adversarial networks to dynamically create complex narratives based on models trained on historical data. With what are being called deepfakes, we have seen multilayer neural networks, trained and refined iteratively, seamlessly superimpose one image or video onto another. This technology has also been used to add motion to historical still photos and give them life. The same class of networks can develop robust models from a corpus of historical documents and generate new documents with similar structure, tone, and phrasing. As this happens, the length and depth of the generated content will increase, and as the technology matures, the output will become virtually indistinguishable from that of human writers.
In addition to dynamic narratives, machine learning-based translation engines will make it possible to effectively distribute automatically generated content in multiple languages nearly simultaneously. As artificial intelligence algorithms mature, tools such as Google Translate refine their capability to translate content into many languages, keeping grammar and sentiment intact.
Text to Speech
As users become more comfortable with conversational agents, they will expect to hear rather than read information. Beyond creating these complex messages, solutions must be able to convert text to speech so content can be effectively delivered to the end user.
Advances in natural language generation are quickly making it possible for machines to automatically deliver high-quality content. Companies can move beyond delivering simple numbers to their end users and improve their content delivery strategies so their analytics provide context and tone appropriate to the information.
As you examine your analytics program, start with the fundamentals and explore how you can improve grammar, tone, and structure within your narrative output. Watch for opportunities to use new and advanced technologies to completely revolutionize what you can deliver.
Troy Hiltbrand is the chief information officer at Amare Global where he is responsible for its enterprise systems, data architecture, and IT operations. You can reach the author via email.