BI Tools and Data Responsibility
Although well-managed, integrated environments are still important, modern BI tools will need to get data from a variety of sources, not just those sanctioned and blessed by IT.
By Bob Potter, Senior Vice President and General Manager, Rocket Software
Traditionally, industry-leading BI tool vendors relied heavily on other technologies to access, transform, cleanse, integrate, and manage the data their tools would bring to life in the form of reports, briefing books and dashboards. Skilled IT professionals created data warehouses, data marts, OLAP cubes, and other intermediary managed environments. These environments were modeled and developed based on well-understood rules about how business worked at the time the BI project started.
For the most part, these were reliable systems, and decision makers became accustomed to the 24-48 hour delay in seeing what was going on in their business. However, if aspects of the business changed, the decision makers were left at a severe disadvantage because it was extremely difficult to adapt the systems and models to accommodate the changes in a reasonable timeframe.
This challenge opened the door for a new BI genre to emerge -- data discovery -- with its focus on improving the experience of the business users and decision makers. Data discovery was transformative because business users no longer had to rely on someone else to prepare the data or wait on the IT professionals to protect and control access to the data. A user simply connected to a data source with an ODBC driver and pulled the data into an in-memory prepared file that was already dashboard-aware. Voilà -- instant insight! More data discovery vendors entered the market over the last decade, capitalizing on the needs of the business user.
I've been hearing rumors about the demise of the data warehouse for over 20 years, yet data warehouses are still a common facet of IT infrastructure. That being said, the era of real-time analytics and business intelligence that can be traced back to the source system for compliance purposes is definitely upon us. BI vendors can't abdicate their responsibilities when it comes to guaranteeing the reliability of the data. Even the data in phenomenally expensive data warehouses can be wrong. Although well-managed, integrated environments still have their role, modern BI tools will need to get data from a variety of sources in addition to those sanctioned and blessed by IT.
What are the implications for modern BI tools? It starts with access. A vendor-supplied ODBC or JDBC driver, typically offered by the customer's database vendor, is not enough to enable the desired coverage for most BI applications. Neither is accessing everything through a CSV file, which almost every transactional system can generate. That's like drinking an extra thick milkshake through a very narrow straw. The BI vendor must take responsibility for getting to the source data with a high-quality, high-performance connector framework or a portfolio of tested connectors.
Additionally, the BI vendor must provide some integration capabilities. This can be done through a data virtualization engine or a dataflow engine that reconciles disparate data on its way to being joined, aggregated, filtered, assembled, processed, and so on. Most BI tools do some calculations on the data, and it would be negligent not to provide some level of integration and governance to help prevent decision makers from making risky or flawed decisions that negatively impact careers and businesses.
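To make the idea of reconciling disparate data concrete, here is a minimal Python sketch of one dataflow step. Everything in it is an assumption made for illustration: the two sources (a CRM extract and a billing extract), the field names, and the key formats are hypothetical, and a real dataflow or virtualization engine would do far more. The sketch reconciles mismatched customer keys and units, joins the two sources, filters, aggregates, and sets aside rows it cannot match instead of silently dropping them.

```python
from collections import defaultdict

# Hypothetical rows from two sources that describe the same customers differently.
crm_rows = [
    {"customer_id": "C-001", "region": "EMEA"},
    {"customer_id": "C-002", "region": "APAC"},
]
billing_rows = [
    {"cust": "c001", "amount_cents": 125000},
    {"cust": "c002", "amount_cents": 40000},
    {"cust": "c001", "amount_cents": 75000},
    {"cust": "c999", "amount_cents": 5000},  # no matching CRM record
]


def normalize_key(raw):
    """Reconcile key formats: 'C-001' and 'c001' both become 'C001'."""
    return raw.replace("-", "").upper()


def revenue_by_region(crm, billing, min_amount=0.0):
    """Join billing to CRM on the reconciled key, filter, and aggregate by region."""
    region_of = {normalize_key(r["customer_id"]): r["region"] for r in crm}
    totals = defaultdict(float)
    unmatched = []  # kept for review rather than discarded
    for row in billing:
        amount = row["amount_cents"] / 100  # reconcile units: cents to currency units
        if amount < min_amount:
            continue  # filter step
        region = region_of.get(normalize_key(row["cust"]))
        if region is None:
            unmatched.append(row)
            continue
        totals[region] += amount  # aggregate step
    return dict(totals), unmatched


totals, unmatched = revenue_by_region(crm_rows, billing_rows)
print(totals)
print(unmatched)
```

Returning the unmatched rows alongside the totals is the governance point in miniature: a decision maker can see how much data failed to reconcile before trusting the aggregate.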
Finally, whatever persistent storage mechanism is employed by the BI vendor -- and today these are mostly in-memory data stores -- it must be synchronized with an enterprise's existing data warehouses and operational data stores. Changed data and newly created data from the sources and warehouses must be reflected in the BI models quickly, accurately and efficiently. Low latency has new meaning these days. Whether you're preventing fraud, stopping a crime, getting inventory to shelves or keeping the plant running, I can guarantee you, insight that is 24 to 48 hours old is not going to cut it.
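One common way to keep an in-memory store in step with its sources is incremental synchronization against a high-water mark, sketched below in Python. This is an illustration under stated assumptions, not a description of any vendor's product: the change-record layout (`id`, `version`, `value`, `deleted`) and the `apply_changes` function are hypothetical, and the source is assumed to stamp every change with an increasing version number.

```python
def apply_changes(store, change_rows, last_version):
    """Apply changes newer than last_version to the in-memory store.

    Returns the new high-water mark to use for the next sync.
    """
    high = last_version
    # Apply in version order so the latest change to a row wins.
    for row in sorted(change_rows, key=lambda r: r["version"]):
        if row["version"] <= last_version:
            continue  # already reflected in the store
        if row.get("deleted"):
            store.pop(row["id"], None)  # source row was removed
        else:
            store[row["id"]] = row["value"]  # insert or update
        high = row["version"]
    return high


# Hypothetical in-memory BI store, last synced at version 6.
store = {1: "widget", 2: "gadget"}
changes = [
    {"id": 1, "version": 5, "value": "widget"},      # older than the mark, skipped
    {"id": 2, "version": 7, "value": "gadget v2"},   # update
    {"id": 3, "version": 8, "value": "gizmo"},       # new row
    {"id": 1, "version": 9, "deleted": True},        # delete
]
mark = apply_changes(store, changes, last_version=6)
print(store, mark)
```

Because only rows past the mark are touched, the sync can run frequently with little overhead, which is what brings latency down from days toward minutes or seconds.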
The BI data fabric has to be fast, adaptive, and agile -- but most important, it has to be reliable. That's why all the top BI vendors are investing in their data preparation and metadata management facilities. Beautiful visualizations sell, but companies are peeling back the veneer and seeing how robust those data layers are in the BI products.
Bob Potter is senior vice president and general manager of Rocket Software's business information/analytics business unit. He has spent 33 years in the software industry with start-ups and mid-size and large public companies with a focus on BI and data analytics. You can contact the author at firstname.lastname@example.org.