TDWI Upside - Where Data Means Business

Tackling Information Overload in the Age of AI

Agile decision-making is often hampered by the volume and complexity of unstructured data. That’s where AI can help.

In 2022, the U.S. Congress passed the Inflation Reduction Act (IRA), which allocated billions in investment to clean energy. This set off a race among private equity and credit firms to identify potential beneficiaries -- the companies throughout the clean energy supply chain that may need additional capital to take advantage of the new opportunities the IRA would create.

It turned out to be quite a data challenge.

Clean energy supply chains are an intricate network of tens of thousands of companies that produce and process raw materials, manufacture and install equipment, generate and transmit clean energy, and lead the development of large and complex clean energy projects.

Private markets firms struggle with data just like every other knowledge-intensive organization -- that is, any organization in the business of collecting, creating, and applying knowledge. Every one of these organizations employs highly skilled, highly compensated professionals who must work with that data to solve problems and make informed decisions.

For private equity and credit, this kind of data work answers strategic questions: Can we identify deals quickly and engage before other firms? Can we screen out poor fits quickly and move on? Can we conduct thorough due diligence quickly and close fast?

There are many private companies, and good data on them is hard to come by. Thanks to the explosion of digital media, there’s far more information about them than ever before, but it’s usually hidden in noise. Private company operations have also grown more complex as their ability to source materials and market their products around the world has increased considerably. As a result, there’s more to understand about the risks and opportunities associated with any given company.

The problem for knowledge professionals such as those in private markets is that although they excel at analyzing data, reasoning through complex problems, and using those insights to make decisions, they just aren’t spending much of their time doing those things. Rather, they’re bogged down by tedious and repetitive data work: $200-per-hour workers performing $20-per-hour work.

Information overload is the default state of today’s knowledge worker, who invests much of his or her time managing it in highly inefficient ways because there’s been no alternative available.

The Problem: Unstructured Data

In private markets, this means professionals spend much of their time sourcing and sifting through reams of company filings, news articles, lawsuits, shipping data, and sustainability reports in the form of web pages, PDFs, documents, charts, and images. In healthcare, it’s patient data, medical research, and protocols. In law, it’s legal documents, case law, and evidence. The story is the same in engineering, scientific research, education, and many other industries.

The reason this story is so universal is that the kind of information that drives knowledge-intensive workflows is unstructured data, which has stubbornly resisted the automation wave that has taken on so many other enterprise workflows using software and software-as-a-service (SaaS). SaaS has empowered teams with tools they can use to efficiently manage a wide variety of workflows involving structured data.

However, SaaS offerings have been unable to take on the core “jobs to be done” in the knowledge-intensive enterprise because they can’t read and understand unstructured data. They aren’t capable of performing human-like services with autonomous decision-making abilities. As a result, knowledge workers are still stuck doing a lot of monotonous and undifferentiated data work.

Newly available large language models (LLMs) and generative AI, by contrast, excel at processing and extracting meaning from unstructured data. LLM-powered “AI agents” can perform services such as reading and summarizing content and prioritizing work, and they can carry out multistage knowledge workflows autonomously.

Tech analyst Benedict Evans describes these AI agents as “infinite interns.” Consider all of the $20-per-hour jobs to be done in knowledge-based organizations -- work ideally suited for bright and capable interns to take on -- if only recruiting, training, and managing such a large group of interns were economically feasible.

Armed with an “infinite number of interns” in the form of generative AI agents or “copilots,” it is now economically feasible to liberate knowledge workers from tedious and repetitive data work so they can focus on what they do best: higher-value analysis and strategic decision-making.


However, deploying a virtual army of AI agents is not without its challenges, and it can’t be accomplished using LLMs alone. Although the data used to train leading LLMs is massive, those LLMs lack access to the domain-specific data required for enterprise workflows. That’s why general-purpose tools such as ChatGPT are unsuitable on their own for these workflows. They are great at producing well-written responses, but without that data they can’t give us the answers we need, and they tend to hallucinate.

By integrating information retrieval-based models with LLMs (generative models), it is possible to build and deploy generative AI tools that can access external data and limit output to only information derived from that data. This technique, called retrieval-augmented generation (RAG), can be used to autonomously handle complex knowledge-intensive workflows where accuracy and trust requirements are high and output needs to be tailored to specific use cases.
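
To make the retrieve-then-generate idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a description of any particular product: the three documents, their IDs, and the function names are all hypothetical, simple word-overlap scoring stands in for the retrieval model a real system would use, and the final call to an LLM is left out, so the code stops at the prompt that would be sent to one.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for filings, news, and reports.
DOCUMENTS = {
    "filing-001": "Acme Solar expands panel manufacturing capacity in Ohio and seeks growth capital.",
    "news-002": "Borealis Wind faces a lawsuit over turbine blade defects.",
    "report-003": "Acme Solar sustainability report highlights recycled silicon sourcing.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=2):
    """Return IDs of up to k documents that share words with the query, best first."""
    query_vec = Counter(tokenize(query))
    scored = [
        (cosine(query_vec, Counter(tokenize(text))), doc_id)
        for doc_id, text in documents.items()
    ]
    scored.sort(reverse=True)
    # Drop documents with no overlap so irrelevant text never reaches the prompt.
    return [doc_id for score, doc_id in scored[:k] if score > 0]


def build_prompt(query, documents, k=2):
    """Assemble a prompt that restricts the model to the retrieved sources."""
    doc_ids = retrieve(query, documents, k)
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id in doc_ids)
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n"
        f"Sources:\n{context}\n"
        f"Question: {query}"
    )


if __name__ == "__main__":
    # The printed prompt is what would be passed to an LLM of your choice.
    print(build_prompt("Which company faces a lawsuit?", DOCUMENTS))
```

The design choice worth noting is that the generative model only ever sees the retrieved passages, each tagged with a source ID, which is what allows answers to be limited to, and traced back to, the underlying data.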

In our private markets example, professionals will leverage “infinite interns” to sift through the constant flow of news, filings, and other company data so they can focus on identifying and prescreening potential deals faster, conducting due diligence quickly and efficiently, and surfacing risks before they blow up into front-page controversies.

Generative AI will bring a platform shift to the knowledge-intensive enterprise, one comparable in its potential impact to the transformative role of cloud computing and SaaS over the past two decades, and it will reshape the nature of knowledge work.

About the Author

Chandini Jain is the founder and CEO of Auquan, an AI innovator transforming the world’s unstructured data into actionable intelligence for financial services customers. Prior to founding Auquan, Jain spent 10 years in global finance, working as a trader at Optiver and Deutsche Bank. She is a recognized expert and speaker in the field of using AI for investment and ESG risk management. Jain holds a master’s degree in mechanical engineering/computational science from the University of Illinois at Urbana-Champaign and a B.Tech from IIT Kanpur.
