The Inferencing Cost Problem No One Is Talking About: Unstructured Data Quality
How much is powering AI with poor-quality data costing your enterprise?
- By Krishna Subramanian
- June 11, 2026
Enterprise AI budgets are expanding at a pace that is making CFOs pay close attention. According to a 2026 survey of 2,360 senior executives, companies expect to spend approximately 1.7% of their revenue on AI this year, more than double the 0.8% average from 2025. Yet measuring ROI from AI is still a nascent practice. CFOs and IT leaders will need to go through AI-related spend with a fine-toothed comb, especially since 60% of IT organizations are not increasing their budgets for AI.
The Meter Is Always Running
When deploying AI in production, the dominant cost is not the one-time investment in training a model. It is inferencing: the compute cost incurred every time a model generates a response. Every API call, every prompt, and every user interaction triggers a billable event.
Retrieval-augmented generation (RAG) has become a valuable strategy for containing that cost. Instead of feeding an entire data set into a model, a RAG pipeline pulls in only the most relevant documents or facts from an external source when they are needed.
The FinOps community has developed frameworks for managing these inferencing costs: model routing, prompt compression, caching, batching, and tighter context window management. These are legitimate levers and the best engineering teams are pulling them hard. But there is a dimension of inferencing cost that the FinOps conversation has largely skipped over: the quality of the unstructured data being fed into AI pipelines.
Unstructured Data: Unknown and Unexamined
Enterprises are sitting on enormous volumes of unstructured data. Documents, images, medical files, contracts, research archives, email threads, and more form the raw material for many of the most valuable AI use cases. But unstructured data is largely unclassified, unsegmented, and unknown. Adding context and structure to this data is imperative for AI and other needs, yet the challenge is different from the one posed by the structured data of old.
File data is vast, growing quickly, and sometimes hidden, compared to data parked in a database. Making it searchable and curatable for AI requires adding metadata that serves as labels. Given the size of unstructured data, metadata enrichment must be automated and must run continuously as new data is created. For instance, a hospital may want to extract DICOM image header data to tag each image with the body part and type of study, so that clinical researchers using AI for diagnostic analysis can find the images they need.
Why Metadata Management Matters to the CFO
Inferencing costs are billed by the token. The number of tokens processed per request is the single most controllable lever in the cost equation. The most direct way to inflate token count is to feed an AI pipeline data that is poorly curated, undifferentiated, unfiltered, redundant, or contextually inappropriate for the task at hand.
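To make the per-token arithmetic concrete, here is a minimal sketch of how context size drives monthly spend under simple linear billing. The prices, request volume, and token counts are hypothetical placeholders chosen for illustration, not figures from any provider or from this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-token prices in USD; real rates vary by provider and model."""
    input_per_million: float
    output_per_million: float


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost of one inference request, assuming linear per-token billing."""
    return (
        input_tokens * pricing.input_per_million
        + output_tokens * pricing.output_per_million
    ) / 1_000_000


def monthly_cost(requests: int, input_tokens: int, output_tokens: int,
                 pricing: Pricing) -> float:
    """Monthly spend if every request has the same token profile."""
    return requests * request_cost(input_tokens, output_tokens, pricing)


# Assumed prices and workload, for illustration only.
pricing = Pricing(input_per_million=3.00, output_per_million=15.00)
uncurated = monthly_cost(1_000_000, input_tokens=20_000, output_tokens=500, pricing=pricing)
curated = monthly_cost(1_000_000, input_tokens=4_000, output_tokens=500, pricing=pricing)

print(f"uncurated context: ${uncurated:,.0f} per month")
print(f"curated context:   ${curated:,.0f} per month")
print(f"savings:           {1 - curated / uncurated:.0%}")
```

Under these assumed numbers, trimming the retrieved context from 20,000 to 4,000 input tokens per request is the only change between the two scenarios, which is why curation upstream of the model shows up directly in the bill.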
It can be extremely expensive, not to mention time-consuming, to move large amounts of unstructured data for each AI process, especially when much of the data is unnecessary. The IT teams that understand this are not managing compute costs after the fact, but instead managing data quality before the inference request is ever made. That is a fundamentally different and far more efficient posture.
What Metadata Enrichment Does for the Balance Sheet
When unstructured data is enriched with contextual metadata, users can search and filter it by keyword according to their interests and project requirements. It then becomes possible to send the right data to an AI pipeline rather than all available data.
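The filtering step described above can be sketched in a few lines. This is an illustrative outline only: the `FileRecord` type, the tag names (`body_part`, `study_type`), and the sample paths are hypothetical, and a real deployment would query a metadata index rather than an in-memory list.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FileRecord:
    """One file plus the metadata tags attached during enrichment (hypothetical schema)."""
    path: str
    modified: date
    tags: dict[str, str] = field(default_factory=dict)


def select_for_pipeline(records, required_tags, modified_since):
    """Keep only records that are recent enough and match every required tag."""
    return [
        r for r in records
        if r.modified >= modified_since
        and all(r.tags.get(key) == value for key, value in required_tags.items())
    ]


# Made-up catalog entries for illustration.
catalog = [
    FileRecord("/scans/a.dcm", date(2026, 5, 20), {"body_part": "CHEST", "study_type": "CT"}),
    FileRecord("/scans/b.dcm", date(2024, 1, 3), {"body_part": "CHEST", "study_type": "CT"}),
    FileRecord("/scans/c.dcm", date(2026, 5, 22), {"body_part": "KNEE", "study_type": "MR"}),
    FileRecord("/docs/draft.docx", date(2026, 6, 1)),  # untagged, so never selected
]

selected = select_for_pipeline(
    catalog,
    required_tags={"body_part": "CHEST", "study_type": "CT"},
    modified_since=date(2026, 1, 1),
)
print([r.path for r in selected])
```

Only the first record satisfies both the tag filter and the date cutoff, so one file out of four would be sent on for AI processing in this toy catalog.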
The cost difference is not marginal. Investing in metadata enrichment for unstructured data can reduce AI compute and storage costs by up to 80%, because only the right data is fed into expensive GPU pipelines. That figure should get the attention of CFOs and CIOs building a multi-year AI budget where infrastructure costs can run $10 million or $20 million annually.
The logic mirrors a well-established FinOps principle: match workloads to the right compute tier. The same discipline applies one layer down. Matching AI workloads to the right data, curated precisely for the task, avoids the waste of processing irrelevant, duplicate, or inappropriate content at model inference time.
A Healthcare Case That Quantifies the Gap
The NewYork-Presbyterian digital pathology initiative is a useful proof point on the financial ROI of metadata for AI. The hospital's IT team is managing an AI workflow for scanned lab image files, a high-volume and data-intensive domain where AI is delivering faster, more accurate diagnostic results to overstretched clinical teams. Yet high cloud storage costs were becoming a barrier.
By implementing a curated, metadata-driven approach with unstructured data management software, the team sent only the most relevant and recent files to cloud storage for AI processing. The automated workflow removed data copies after 30 days rather than accumulating everything indefinitely. The result was a 96% reduction in cloud storage costs and 10x faster AI data ingestion speeds. Persistent cloud storage dropped from 1 petabyte down to a rolling 33 terabytes.
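As a rough check on those figures, the sketch below shows a 30-day retention sweep and the storage arithmetic. It is not the hospital's actual workflow: the function names and sample copies are hypothetical, 1 petabyte is taken as 1,000 terabytes, and the comparison to cost assumes spend scales roughly linearly with stored volume.

```python
from datetime import date, timedelta

RETENTION_DAYS = 30  # from the article: cloud copies are removed after 30 days


def expired_copies(copies, today, retention_days=RETENTION_DAYS):
    """Return names of cloud copies uploaded more than `retention_days` ago."""
    cutoff = today - timedelta(days=retention_days)
    return [name for name, uploaded in copies.items() if uploaded < cutoff]


def reduction(before_tb, after_tb):
    """Fractional reduction in stored volume."""
    return 1 - after_tb / before_tb


# Hypothetical cloud copies, keyed by name with their upload date.
copies = {
    "slide-001.svs": date(2026, 6, 1),
    "slide-002.svs": date(2026, 4, 1),
}
to_delete = expired_copies(copies, today=date(2026, 6, 11))

# 1 PB taken as 1,000 TB (decimal units; an assumption).
storage_reduction = reduction(before_tb=1_000, after_tb=33)

print(to_delete)
print(f"{storage_reduction:.1%}")
```

The volume reduction works out to just under 97%, which is in line with the reported 96% drop in cloud storage costs under the linear-cost assumption.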
The discipline of knowing which data matters, tagging it accurately, and routing only that data through the AI pipeline is what produced the economics. When less irrelevant data enters the pipeline, less compute is consumed to process it. Deleting data from cloud storage after jobs have completed was a significant contributor to the savings.
The Governance Risk of Unstructured Data Quality Problems
An additional cost vector connected to unstructured data quality is the cost of feeding the wrong data to an AI model. This includes prompting models on outdated documents, duplicate files, internal drafts that should never have been shared, or data containing regulated information.
The ethical and legal consequences range from degraded model accuracy that delivers suboptimal outputs to libelous or otherwise damaging outcomes. Exposure of PII and IP brings regulatory injunctions and fines, loss of competitive advantage, and customer defection. Some of these costs are harder to measure than token spend, but they are real and they scale with data volume.
A metadata-enriched approach gives organizations several advantages to mitigate these risks:
- The ability to discover and exclude all PII from AI pipelines based on policy, before the data ever reaches a model.
- The ability to scan specific file shares, directories, or data center sites for sensitive keywords that are unique to the organization.
- The ability to scan for and delete files based on date or owner (such as ex-employees or C-suite) to purge data estates of irrelevant, old, and protected information.
- The ability to scan for and delete duplicate files based on XYZ metadata.
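Two of the controls in the list above, PII exclusion and duplicate detection, can be sketched as a pre-pipeline screen. The assumptions here are significant: the two regular expressions are simplistic stand-ins for a real PII policy, and duplicates are detected by SHA-256 content hash, which is one possible choice since the article does not specify the metadata used. All names are hypothetical.

```python
import hashlib
import re

# Illustrative patterns only; a real PII policy would be far broader.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),         # US SSN-like number
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),  # email-like string
]


def contains_pii(text):
    """True if any illustrative PII pattern appears in the text."""
    return any(pattern.search(text) for pattern in PII_PATTERNS)


def screen(documents):
    """Split documents into (allowed, excluded_for_pii, duplicates) by name.

    Duplicates are detected by content hash (an assumption); the first
    copy seen is allowed and later identical copies are flagged.
    """
    seen_digests = set()
    allowed, excluded, duplicates = [], [], []
    for name, text in documents.items():
        if contains_pii(text):
            excluded.append(name)
            continue
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest in seen_digests:
            duplicates.append(name)
        else:
            seen_digests.add(digest)
            allowed.append(name)
    return allowed, excluded, duplicates


# Made-up documents for illustration.
documents = {
    "a.txt": "Quarterly summary of storage costs.",
    "b.txt": "Quarterly summary of storage costs.",
    "c.txt": "Employee SSN 123-45-6789 on file.",
    "d.txt": "Contact jane@example.com for access.",
}
allowed, excluded, duplicates = screen(documents)
print(allowed, excluded, duplicates)
```

Screening before the data reaches a model means excluded and duplicate files never consume tokens, so the same policy serves both the governance goal and the cost goal.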
The Reframe Finance Leaders Need
The enterprise AI cost conversation has matured considerably. FinOps frameworks now exist for GPU procurement, model routing, token budgeting, and unit economics at the inference level. That progress is real and valuable.
But sustainable AI economics require one more discipline: treating data preparation, governance, and metadata enrichment as a cost management function. The quality of data entering an AI pipeline directly determines the volume of compute that pipeline consumes, along with storage costs. Organizations that bring the same financial rigor to their unstructured data layer that they now apply to cloud infrastructure will find themselves with a durable advantage: lower and more predictable inferencing costs, better model performance, and a much cleaner answer when the board asks what they are getting for the AI spend.
The meter is running. What feeds it matters as much as how fast it runs.
About the Author
Krishna Subramanian is COO, president, and co-founder of Komprise. In her career, Subramanian has built three successful venture-backed IT businesses and was named a “2021 Top 100 Women of Influence” by Silicon Valley Business Journal. You can reach the author via email, Twitter, or LinkedIn.