3 Major Decisions When Choosing Your Data Platform
The many possibilities for data platforms create headaches for data architects, but this multi-faceted choice can be broken down into three decisions.
- By William McKnight
- March 31, 2017
Most data architects need to make quick database platform decisions to support business applications. The applications teams notoriously fail to include the data team until late in the game, by which time deadlines have been set. This is an organizational problem that should be fixed, but meanwhile, data architects can still prepare themselves mentally and plan their process ahead of time.
Today the database platform decision carries new consequences. Those who continually reach for the same hammer to fix every problem will soon find they are falling short of their larger role in the organization: improving its data maturity. Departments that never take the time to place a workload in the cloud, or that never consider a non-relational solution, are creating technical debt that will come back to haunt the shop and the data architect.
Decision #1: The Data Store Type
When considering the platform for a workload, there are now three major decisions (amid numerous other decisions you'll have to make). There used to be just one -- the data store itself. Everything used to go in a database, and largely the same database at that. Now even the use of a database is very much in question, with file-based scale-out systems such as Hadoop and NoSQL stores providing immense utility for big data in particular.
The largest factor in choosing between a database and a file-based scale-out system is the data profile. The latter is best for data that fits the loose label of 'unstructured' (or semi-structured) data, while more traditional data -- and smaller volumes of all data -- still belong in a relational database.
Decision #2: Data Store Placement
You must also decide where to place your data store -- on-premises or in the cloud (and which cloud). In the past, the only clear choice for most organizations was on-premises data. However, the costs of scale are gnawing away at the notion that this remains the best approach for a data platform.
Decision #3: The Workload Architecture
Finally, you must keep in mind the distinction between operational and analytical workloads. Short transactional requests and more complex (often longer) analytics requests demand different architectures. Analytics databases, though quite diverse, are the preferred platforms for the analytics workload.
So many capabilities have been added to previously transaction-focused databases (in-database analytics, in-memory capabilities, columnar orientations, new languages, new data types, and so on) that we have many more analytic databases than before. If these capabilities are not being used, however, the platform is not effectively an analytic database, and it is a mismatch for an analytical workload.
A Final Word
These three decisions are interrelated, especially the last two. If you're going cloud, or might move to the cloud in the future, you should look at database integration with the cloud. Several offerings have been able to leapfrog databases with much more history by being "born in the cloud" and tightly integrating with it.
Taken together, these three decisions can get you to a reasonable platform shortlist, tremendously increasing your odds of project success.
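As a rough illustration of how the three decisions narrow the field, the sketch below treats each decision as a filter over a list of candidate platforms. Everything in it is an assumption made for illustration: the `Platform` record, the enum names, the `shortlist` function, and the placeholder candidates (`platform-a` through `platform-d`) are hypothetical and do not describe real products or a prescribed method.

```python
from dataclasses import dataclass
from enum import Enum


class StoreType(Enum):          # Decision #1: the data store type
    RELATIONAL = "relational database"
    SCALE_OUT = "file-based scale-out system"


class Placement(Enum):          # Decision #2: data store placement
    ON_PREMISES = "on-premises"
    CLOUD = "cloud"


class Workload(Enum):           # Decision #3: the workload architecture
    OPERATIONAL = "operational"
    ANALYTICAL = "analytical"


@dataclass(frozen=True)
class Platform:
    """Hypothetical description of one candidate platform."""
    name: str
    store_type: StoreType
    placement: Placement
    workload: Workload


def shortlist(candidates, store_type, placement, workload):
    """Keep only the candidates that match all three decisions."""
    return [
        p for p in candidates
        if p.store_type is store_type
        and p.placement is placement
        and p.workload is workload
    ]


# Placeholder candidates, invented for this example.
CANDIDATES = [
    Platform("platform-a", StoreType.RELATIONAL, Placement.ON_PREMISES, Workload.OPERATIONAL),
    Platform("platform-b", StoreType.RELATIONAL, Placement.CLOUD, Workload.ANALYTICAL),
    Platform("platform-c", StoreType.SCALE_OUT, Placement.CLOUD, Workload.ANALYTICAL),
    Platform("platform-d", StoreType.SCALE_OUT, Placement.ON_PREMISES, Workload.ANALYTICAL),
]

if __name__ == "__main__":
    picks = shortlist(
        CANDIDATES, StoreType.RELATIONAL, Placement.CLOUD, Workload.ANALYTICAL
    )
    print([p.name for p in picks])
```

In practice a platform may satisfy more than one value per decision (for example, a database offered both on-premises and in the cloud), so a real evaluation would likely model each attribute as a set of options rather than a single value.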
About the Author
McKnight Consulting Group is led by William McKnight. He serves as strategist, lead enterprise information architect, and program manager for sites worldwide utilizing the disciplines of data warehousing, master data management, business intelligence, and big data. Many of his clients have gone public with their success stories. McKnight has published hundreds of articles and white papers and given hundreds of international keynotes and public seminars. His teams’ implementations from both IT and consultant positions have won awards for best practices. William is a former IT VP of a Fortune 50 company and a former engineer of DB2 at IBM, and holds an MBA. He is author of the book Information Management: Strategies for Gaining a Competitive Advantage with Data.