How to Create an Agile Analytics Process: RIP, Star Schema
Having the proper data modeling and transformation tools will help you create an agile, modern analytics process.
- By John Morrell
- January 24, 2022
For many years, analytics has been defined by how data warehouses are structured. Companies would build enormous data warehouses to support their analytics. They would use structured approaches advocated by Bill Inmon and Ralph Kimball, build out star or snowflake schemas, and create data marts as offshoots with multidimensional OLAP models and structures. To generate analytics results, data was often exported into popular BI tools such as Tableau, Qlik, or Power BI.
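For readers unfamiliar with the pattern, a star schema places a central fact table at the hub, joined to surrounding dimension tables on surrogate keys. Here is a minimal sketch using Python's built-in SQLite; the table and column names are hypothetical, chosen only to illustrate the shape:

```python
import sqlite3

# A tiny star schema: one fact table, two dimension tables,
# joined on surrogate keys. Names are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_date (date_key INTEGER PRIMARY KEY, calendar_date TEXT);
CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
CREATE TABLE fact_sales (
    date_key INTEGER REFERENCES dim_date(date_key),
    product_key INTEGER REFERENCES dim_product(product_key),
    quantity INTEGER,
    revenue REAL
);
""")
cur.execute("INSERT INTO dim_date VALUES (1, '2022-01-24')")
cur.execute("INSERT INTO dim_product VALUES (1, 'Widget')")
cur.execute("INSERT INTO fact_sales VALUES (1, 1, 3, 29.97)")

# A typical analytical query joins the fact table to a dimension
# and aggregates a measure.
row = cur.execute("""
    SELECT p.product_name, SUM(f.revenue)
    FROM fact_sales f
    JOIN dim_product p ON f.product_key = p.product_key
    GROUP BY p.product_name
""").fetchone()
print(row)
```

The fact table holds the measures (quantity, revenue); the dimensions hold the descriptive attributes analysts slice by. Designing these schemas up front, for every anticipated question, is what made the traditional process so slow.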
The process was complex. Data warehouse projects would take a minimum of 12 to 24 months, and building each data mart would take another 3 to 6 months. In the end, data was scattered, with the final stop being extracts residing in your BI tool of choice.
In the era of modern analytics in a cloud data warehouse, those days are over. In today’s business landscape, new business questions arise daily, requiring rapid analytics-driven answers. Analytics projects need to be agile.
Teams no longer gather large-scale requirements and create all-encompassing star schema-based data models. Every time a new use case comes up (e.g., a new business question needs to be answered via analytics), a data model can rapidly be created to address it.
One reason this approach can be used is because of the scale, flexibility, and cost-efficiency of cloud data warehouses. New virtual data warehouses, models, and schemas can be easily added to scale out the number of use cases while taking advantage of the cost-effective computational power and storage resources of a cloud data warehouse.
However, there is still a snag in the process. It requires complex coding in SQL and Python or Python-like languages (such as Jinja) for data modeling and transformation. Tools such as dbt give you a nice integrated development environment (IDE) to organize and write your code, but you still need to write a lot of complex code, which leaves non-programmer analysts out of the process. Analytics teams still depend heavily on data engineering teams to execute new analytics projects and often face delays.
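To give a sense of what that code looks like, here is a hypothetical dbt-style model mixing SQL with Jinja. The file name, referenced staging models, and columns are illustrative, not from any real project:

```sql
-- models/customer_revenue.sql (hypothetical dbt-style model)
{{ config(materialized='table') }}

select
    c.customer_id,
    sum(o.amount) as total_revenue
from {{ ref('stg_customers') }} c
join {{ ref('stg_orders') }} o
    on o.customer_id = c.customer_id
group by c.customer_id
```

Even a simple model like this assumes comfort with SQL, Jinja templating, and the project's dependency graph, which is exactly the barrier that keeps non-programmer analysts dependent on data engineers.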
Requirements of Modern Data Transformation Tools
True agility in the modern analytics life cycle requires modern data modeling and transformation tools that embrace the need for analyst self-service. Doing this requires a hybrid code/no-code toolset along with integrated data model cataloging, sharing, collaboration, and discovery.
Organizations armed with hybrid code/no-code data modeling and transformation tools can reach the speed they need in their analytics because:
- Data engineers can do their jobs much faster by mixing SQL and no-code approaches
- Data analysts who are less programming-centric can model and transform data themselves to speed up their analytics processes
- The data and analytics teams can collaborate on and share data models, so they don’t need to reinvent the wheel with each project
- All data models receive a rich catalog of searchable data documentation to promote discovery and sharing and increase data literacy
A final requirement is deep integration with your cloud data warehouse. This includes creating a searchable inventory of existing schemas and data assets in the data warehouses, deploying new data models into the data warehouse in an optimized manner, integrating security and access controls, and leveraging the scalable, cost-efficient computational power and storage resources of the cloud data warehouse.
Creating a Modern Analytics Process
Hybrid code/no-code data modeling and transformation tools with deep cloud data warehouse integration allow organizations to create an agile “fail fast” analytics process. Create a data model; does it answer the question? If not, create another; and if that still does not answer the question, create another. Once it answers the question, deliver your analytics.
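The "fail fast" loop above can be sketched in a few lines of Python with SQLite. The tables, the candidate models, and the check that decides whether a model answers the business question are all hypothetical, meant only to show the iterate-until-it-answers shape of the process:

```python
import sqlite3

# Hypothetical source data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (region TEXT, amount REAL);
INSERT INTO orders VALUES ('EMEA', 100.0), ('AMER', 250.0), ('EMEA', 50.0);
""")

# Candidate data models, tried in order until one answers the question
# "how do sales break down by region?" (names are illustrative).
candidate_models = {
    # First attempt: a grand total -- cannot answer a per-region question.
    "total_sales": "SELECT SUM(amount) AS total FROM orders",
    # Second attempt: totals by region -- answers the business question.
    "sales_by_region":
        "SELECT region, SUM(amount) AS total FROM orders GROUP BY region",
}

def answers_question(rows):
    # Hypothetical check: we need a breakdown with more than one region.
    return len(rows) > 1 and len(rows[0]) == 2

chosen = None
for name, sql in candidate_models.items():
    rows = conn.execute(sql).fetchall()
    if answers_question(rows):
        chosen = name
        break  # deliver the analytics built on this model
print(chosen)
```

Each failed candidate costs only the time to write and run one query, which is the point: iteration is cheap enough that discarding a model is not a setback.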
Analysts can perform this agile, modern analytics process themselves without external dependencies, delays, or interruptions. New analytics projects are executed with much greater speed and efficiency.
When new data sources are required, data engineers can rapidly shape raw data into a usable form. They can ensure data models are deployed optimally for efficient execution within the cloud data warehouse, and they can provide oversight for data pipelines to ensure smooth execution and proper security and governance.
Finally, both teams can collaborate in multiple ways. Teams can work together on models to build them efficiently and to specifications the first time by leveraging their unique skills. They can also share their unique knowledge of the data to further promote the proper use of data and data literacy.
Where Do I Go From Here?
Data modeling and transformation tools are now the linchpins in your modern analytics process. Having the proper tools that get your entire team involved will create agility in your process.
If you seek to create an agile, modern analytics process, your next step is to explore newer data modeling and transformation tools. In each case, look to see if the tool supports the requirements detailed above so you can create an effective “fail fast” agile analytics process.
John Morrell is senior director for product evangelism at Datameer where he is responsible for technical marketing and go-to-market efforts. He brings over 25 years of experience in data management software. You can reach the author via LinkedIn.