TDWI Articles

Agentic BI Is Still Not Ready for Enterprise Prime Time

The limitations of LLM-powered analysis tools mean we should be cautious in trusting them.

Agentic productivity tools are everywhere, and they’ve invaded the business analytics field in force.

Personal LLM Experiments

Like many professionals, I’ve experimented with the BI, analytics, and data visualization capabilities of ChatGPT, Copilot, and other LLM-powered tools. On the whole, I’m quite impressed with their ability to use natural language queries to accelerate distillation of quick insights from spreadsheets, CSV files, and other structured data files.

But I’m also a bit dismayed by the limitations of these tools. For example, I’ve used ChatGPT Data Analyst to do trend analyses, forecasts, and what-if simulations of my investment portfolio, only to find that results from my ad hoc analyses deviate greatly from corresponding analyses prepared by my professional wealth manager using his firm’s high-powered analytics tools.

This parallels my dissatisfaction with ChatGPT’s tendency to hallucinate wildly when analyzing my personal corpus of unstructured content, both professional publications and self-indulgent poetic noodlings. For example, the generative AI tool has no qualms about ignoring the existence of two-thirds of my published works and fabricating titles that I know perfectly well I never wrote.

So I have serious misgivings about whether agentic, generative, and other LLM-powered analytics tools are ready for prime-time deployment in our lives.

Problems with Using Generative AI for BI

There are clear disadvantages to using LLMs to extract and analyze business intelligence. Their main shortcomings include:

  • Propensity to hallucinate: LLMs are probabilistic language generators, not deterministic analytic engines. They may fabricate facts, statistics, or citations that look plausible. Furthermore, their generated explanations can be internally coherent but analytically wrong. When summarizing data sets, the model may invent correlations or causal relationships. This problem may be especially acute when models are used to synthesize information from unstructured sources.
  • Weakness at quantitative reasoning: LLMs struggle with rigorous numeric analysis compared to traditional BI tools. They often commit arithmetic errors in multi-step calculations. They may misinterpret statistical distributions. And they may confuse correlations with causation.
  • Failure to account for data lineage and traceability: Traditional BI pipelines emphasize data lineage—the ability to trace results back to specific data sources and transformations. This capability is conspicuous by its absence from LLM-powered analytics tools. These typically produce opaque reasoning chains and fail to cite source records. They may also surreptitiously blend multiple input records into entirely synthetic, hence mostly false, data.
  • Susceptibility to context rot: Agentic BI tools can easily fail at the historical and multidimensional analysis that is central to enterprise decision-making. This happens when the contextual variables framing an analytic result exceed an LLM’s context window. That produces what is often called context rot. Token limits truncate historical information. Summaries compress earlier reasoning inaccurately. Agents that recursively call models accumulate semantic drift. Consequently, insights generated late in an analysis session may no longer reflect the original data and inferences may diverge from underlying evidence. This context rot may be particularly prevalent in agentic BI systems that autonomously run multi-step analyses.
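The weakness at quantitative reasoning suggests a practical guardrail: recompute any LLM-reported figure deterministically before trusting it. Here is a minimal sketch in Python; the `verify_aggregate` helper, the sample revenue data, and the tolerance are illustrative assumptions, not a feature of any particular product.

```python
# Sketch of a guardrail that cross-checks an LLM-reported statistic against
# a deterministic recomputation. All names and data here are illustrative.

def verify_aggregate(reported: float, values: list[float], agg: str = "sum",
                     rel_tol: float = 1e-6) -> bool:
    """Return True if the reported figure matches a deterministic recomputation."""
    funcs = {
        "sum": sum,
        "mean": lambda v: sum(v) / len(v),
        "max": max,
        "min": min,
    }
    expected = funcs[agg](values)
    # Relative tolerance absorbs benign floating-point differences only;
    # a hallucinated figure will miss by far more than this.
    return abs(reported - expected) <= rel_tol * max(abs(expected), 1.0)

quarterly_revenue = [1200.0, 1350.5, 980.25, 1410.0]

# A model summarizing this data might report a plausible but fabricated total.
print(verify_aggregate(4940.75, quarterly_revenue, "sum"))  # True: matches recomputation
print(verify_aggregate(5100.00, quarterly_revenue, "sum"))  # False: hallucinated figure
```

The point of the sketch is that the check itself never touches the model: the system of record, not the language generator, decides whether a number is real.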

I’m not entirely down on agentic business intelligence tools. But LLMs work best as interfaces to business analytic engines, not replacements for them.
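That division of labor can be sketched concretely: the LLM's only job is to translate a natural-language request into a structured intent, which is then dispatched to a whitelisted deterministic function. The intent names, sample data, and routing table below are assumptions for illustration only.

```python
# Sketch: the LLM maps a request to a structured intent; every number comes
# from a deterministic engine. The schema and routing table are illustrative.

from statistics import mean

SALES = {"Q1": 1200.0, "Q2": 1350.5, "Q3": 980.25, "Q4": 1410.0}

# Whitelisted analytic operations -- the only computations the system will run.
ANALYTIC_ENGINE = {
    "total_sales": lambda data: sum(data.values()),
    "average_sales": lambda data: mean(data.values()),
    "best_quarter": lambda data: max(data, key=data.get),
}

def answer(intent: str):
    """Dispatch a structured intent (as an LLM might emit) to a deterministic
    function. Unknown intents are rejected rather than improvised."""
    if intent not in ANALYTIC_ENGINE:
        raise ValueError(f"Unsupported intent: {intent}")
    return ANALYTIC_ENGINE[intent](SALES)

print(answer("total_sales"))   # 4940.75 -- computed, not generated
print(answer("best_quarter"))  # Q4
```

Because unsupported intents raise an error instead of producing text, the model can phrase the question and narrate the answer, but it can never invent the figure.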

Final Thoughts

Enterprises should regard LLMs as natural-language interfaces to BI platforms. They are especially useful as summarization layers over validated analytic back-ends. They are also useful as explanation generators for analytics delivered through dashboards.

Nevertheless, organizations should keep their production analytics in deterministic systems rather than trust their single version of truth to the proverbial stochastic parrot.

About the Author

James Kobielus is a veteran industry analyst, consultant, author, speaker, and blogger in analytics and data management. He was recently the senior director of research for data management at TDWI, where he focused on data management, artificial intelligence, and cloud computing. Previously, Kobielus held positions at Futurum Research, SiliconANGLE Wikibon, Forrester Research, Current Analysis, and the Burton Group. He has also served as senior program director, product marketing for big data analytics for IBM, where he was both a subject matter expert and a strategist on thought leadership and content marketing programs targeted at the data science community. You can reach him on X (@jameskobielus) and on LinkedIn (https://www.linkedin.com/in/jameskobielus/).
