Prepare For Tomorrow By Creating an Agile Data Warehouse Environment Today
Be agile in more than just your development projects or prototypes. We offer 11 tools and techniques to help you build an agile data warehouse environment.
- By Mike Schiff
- September 30, 2014
Although most data warehouse practitioners have embraced agility as a best practice, some may not recognize that it extends beyond software development or prototyping efforts.
Most of us recognize that our organizations will have future analysis needs that are difficult, if not impossible, to predict today. We can, however, take steps to create an agile environment that will help us react quickly when these needs arise. To do so, we need to embrace technologies and strategies that let us respond to new requirements arising from internal and external factors, such as changing customer preferences, competitor moves, government regulations, compliance or tax rules, new technologies, or even management changes and internal reorganizations.
For instance, when a company hires a new chief financial, operating, or marketing officer, the new executive is likely to ask for (or, more likely, demand!) reports and analyses that were not previously generated. Some of these may be relatively simple; others are likely to be complex and might require access to data not previously collected.
Consider also how, in their early days, data warehouses were used almost exclusively to analyze and compare historical values. Today, they may also hold data that is analyzed in real time as part of an operational application, for example to up-sell or cross-sell to customers as they order products, or to prevent fraud.
Consider the vast amounts of data generated by social media. Many organizations now follow and analyze this data for both offensive (e.g., recognize and react to new sales opportunities) and defensive (e.g., quickly react to negative comments or concerns) purposes. A decade or so ago, many of today's social media data sources did not exist; we can safely assume that some of tomorrow's data sources do not exist today.
For instance, although many organizations currently analyze data generated by RFID chips, the much-hyped Internet of Things may soon lead to vast amounts of data generated by Internet-enabled devices that organizations will want to analyze. I suspect that today's "big data" may give way to tomorrow's "vast data." Think of applications that could use data generated by mobile devices, motor vehicles, public monitoring devices (such as traffic or street cameras), personal fitness monitoring devices (e.g., Fitbits), pacemakers, vending machines, smart homes, or even your refrigerator or coffee maker.
Despite the uncertainty of future requirements, we can assume that tomorrow's data warehouse environments will need to accommodate new data sources, database structures, analytic tools, and delivery vehicles. To prepare for these, our organizations need to create agile data warehouse environments.
Tools and techniques that can facilitate data warehouse agility include (but are certainly not limited to):
- Agile development methodologies and rapid prototyping so that analysis is delivered while it is still timely, not long after the opportunity has passed.
- Cloud computing options that allow us to quickly ramp up when additional storage and processing power are needed or to use on-demand business intelligence applications.
- Virtualization to leverage and optimize what is already in place and thus reduce the need for additional hardware.
- An overall data warehouse architecture that is not limited to enterprise data warehouses but also includes special-purpose appliances, operational data stores, and data marts.
- Data virtualization techniques that access data stored in multiple operational systems without first moving this data into a data warehouse. Caution: make sure you are not integrating apples with oranges. The fact that two systems share a data element name does not guarantee that the elements use the same unit of measure or even represent the same thing.
- Database options that embrace data structures in addition to relational and column-oriented databases. If not already doing so, organizations need to gain experience with NoSQL data structures and their numerous subsets, including key-value, big table, document, and graph.
- Business intelligence tools that allow business users to generate their own reports and analyses with minimal dependence on IT staff; examples include interactive dashboards and generalized parameter-driven analysis templates.
- Closely monitoring the data integration market so you can quickly find vendors of tools that can access future data sources.
- Following advances in mobile technology and experimenting with delivering reports and analyses to new devices so we are prepared when our users (or perhaps only our CEO!) acquire and embrace them.
- Relaxation of policies such as those that forbid acquisition of technology from vendors not already on an "approved vendor" list so that appropriate technology can be quickly acquired and deployed with a minimum of organizational bureaucracy, especially when responding to short-term or one-time user requests.
- Data policies that allow new sources to be collected and analyzed while still protecting the overall integrity of the data in an enterprise data warehouse. For instance, an appliance might be a suitable platform for one-time or limited-scope (e.g., departmental) analyses.
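The caution in the data virtualization item above (a shared element name does not imply a shared unit of measure) can be made concrete with a small sketch. Everything in it is a hypothetical illustration rather than part of any particular product: the `orders` and `shipping` systems, the `weight` field, the `SourceField` class, and the conversion table are all assumed for the example. It covers only the unit-of-measure half of the warning; deciding whether two fields represent the same business concept still requires human review of the source metadata.

```python
from dataclasses import dataclass

# Hypothetical conversion factors to one canonical unit (kilograms).
TO_KILOGRAMS = {"kg": 1.0, "lb": 0.45359237, "g": 0.001}


@dataclass(frozen=True)
class SourceField:
    """Metadata describing one field exposed by a (hypothetical) source system."""
    system: str
    name: str
    unit: str


def to_kilograms(value: float, field: SourceField) -> float:
    """Convert a value to kilograms, refusing units we do not recognize."""
    try:
        factor = TO_KILOGRAMS[field.unit]
    except KeyError:
        raise ValueError(
            f"{field.system}.{field.name}: unknown unit {field.unit!r}"
        ) from None
    return value * factor


def combine_weights(rows):
    """Sum (value, SourceField) pairs after normalizing each value to kilograms."""
    return sum(to_kilograms(value, field) for value, field in rows)


# Two hypothetical systems share the field name "weight" but not the unit.
orders = SourceField(system="orders", name="weight", unit="kg")
shipping = SourceField(system="shipping", name="weight", unit="lb")

total_kg = combine_weights([(10.0, orders), (10.0, shipping)])
print(round(total_kg, 4))
```

Adding the two raw values would report 20, silently mixing kilograms and pounds. Normalizing through explicit metadata avoids that, and raising an error on an unrecognized unit forces someone to look at a new source before its data is blended with existing data.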
In summary, organizations should take steps today so they will be sufficiently agile to meet tomorrow's analytical requirements. Data warehouse environments should be able to quickly react to new data content and new data sources, changing analysis requirements, and new applications. These environments must provide quicker response times, handle increasing usage demands, and support new delivery vehicles. Organizations can achieve this by creating a data warehouse environment that leverages both in-house and on-demand technology; reduces bureaucracy; embraces a variety of data warehouse platforms, data structures, and analysis tools; and can accommodate additional delivery vehicles.