Analysis: Aurora Is Amazon's Answer for Forgotten DBMS Users
Amazon's Aurora may close the gap between spreadsheets and full-blown RDBMSs for large enterprises.
- By Troy Hiltbrand
- July 22, 2016
When your enterprise launches an analytics initiative, you search for a tool to help start producing insights quickly and relatively inexpensively. Spreadsheets are user-friendly and widely available, so they are often the technology of choice for nascent analytics teams.
These spreadsheets become a valuable asset because they hold critical enterprise data, sometimes as the single source of truth. They let your analysts perform complex analysis with advanced features such as pivot tables, what-if scenario planning, in-spreadsheet scripting, and a plethora of built-in statistical and graphing functions.
I have seen first-hand that as your organization grows, you reach the limits of data storage and manageability that a spreadsheet can handle, and you need centralized data management and sharing across the user base. At this stage in your analytics life cycle, the cost and complexity of implementing data governance in a spreadsheet-only model become untenable.
As the spreadsheet management burden weighs on the analytics program, your organization might scramble to find an alternative that provides enterprise features such as centralized data storage and management. In your search, you must face a key question: should you upgrade to an expensive, commercial database solution?
Pricey Alternatives
There are relatively few large-scale providers offering robust and mature database solutions. These solutions are expensive, often based on a pricey per-core model that is a huge obstacle for growing organizations.
At Kyäni, we found ourselves in just such a situation. Outside of our main transactional system, we were limited to a combination of small MySQL databases and spreadsheets for analytics. We needed an operational data store to support an enterprise-level advanced analytics program, one that included streaming analytics.
Many companies look at open source databases, which provide many of the same basic features as the traditional commercial platforms without the exorbitant price tag. Unfortunately, as enterprises continue to grow and the scale of the data increases, these open source databases start to require creative workarounds to match the enterprise-class scalability that commercial database systems provide out of the box.
At this point, an organization such as ours has to determine whether to abandon an open-source solution and migrate all data and logic to a traditional platform or continue to build an infrastructure around the open source database to keep it working under an ever-increasing load.
This is where Amazon saw an underserved niche in the market and released a solution that addresses customers in our situation. At the time, Amazon was already invested in its Relational Database Service (RDS) platform, where companies spin up instances of both commercial and open source databases and pay for them on a monthly basis. The limitation these customers faced was that the open source databases still did not scale well, while the licensing of commercial databases hindered a smooth growth curve because of the stair-step, core-based capital pricing of the underlying database technology.
For example, we had built our data and logic in an Amazon RDS MySQL database that grew to over 250 GB of data and thousands of lines of stored procedure code, but we quickly started to feel the burden of scale in the open source environment.
Affordable Speed
This is why Amazon added a service to their catalog named Amazon Aurora. Amazon is promising its subscribers the speed of a commercial database at a more affordable price.
The key to Aurora is that Amazon took the base code of MySQL and engineered around its limitations by integrating it with its existing Amazon Web Services (AWS) infrastructure. This creates a hybrid model that meets the needs of organizations not ready to buy into the core-based model of commercial database software. As a leader in cloud storage management, Amazon has taken lessons learned in that market and built many of the enhancements in Aurora at the storage layer of the database, making Aurora highly scalable and durable.
With the feature parity between MySQL and Amazon Aurora, we were able to migrate our data and embedded logic over to Aurora in less than four hours and have been running smoothly for almost one year.
Amazon Aurora is not an open source database and does not purport to be. It is built on the Open Source MySQL Edition and is compatible with the syntax and functionality of the Open Source MySQL Edition, but it is a proprietary, closed-source database.
High-End Features
Amazon Aurora offers many of the features and functionality available in high-end commercial database software, such as pushbutton scaling, automatic patching, failure detection, failover to provide for a self-healing environment, and real-time read-only instances of the data to scale reads over multiple instances.
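To illustrate how read-only instances can spread read traffic, here is a minimal client-side sketch in Python. It is not Aurora's own mechanism. The `ReadWriteRouter` class and the endpoint names are hypothetical, and the rule for spotting a read (checking the first keyword) is a deliberate simplification.

```python
from itertools import cycle

# Statements treated as reads in this sketch. A real application would need
# more care (for example, SELECT ... FOR UPDATE must go to the writer).
READ_ONLY_VERBS = {"SELECT", "SHOW"}


class ReadWriteRouter:
    """Hypothetical helper: picks an endpoint for each SQL statement."""

    def __init__(self, writer, readers):
        self.writer = writer
        # Rotate through the read-only instances in round-robin order.
        self._readers = cycle(readers) if readers else None

    def endpoint_for(self, sql):
        words = sql.split()
        verb = words[0].upper() if words else ""
        if verb in READ_ONLY_VERBS and self._readers is not None:
            return next(self._readers)
        # Writes, unknown statements, and the no-replica case use the writer.
        return self.writer


# Endpoint names below are placeholders, not real hosts.
router = ReadWriteRouter("writer.example.internal",
                         ["reader-1.example.internal",
                          "reader-2.example.internal"])
print(router.endpoint_for("SELECT * FROM orders"))
print(router.endpoint_for("UPDATE orders SET status = 'shipped'"))
```

A sketch like this ignores replica lag, so a read that must see a write made a moment earlier should still go to the writer.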
This functionality built on top of an open source base allows Amazon to provide customers with a hybrid option for enterprise databases. It allows for incremental cloud model-based pricing with enterprise-level support and features of the commercial database. At the same time, Amazon's pricing model allowed our company to scale our databases as needed without any capital-based up-front licensing by simply paying for what we consume.
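The difference between stair-step, core-based licensing and consumption pricing can be shown with a little arithmetic. The Python sketch below uses made-up prices and step sizes purely for illustration. None of the figures come from Amazon or any database vendor, and both function names are hypothetical.

```python
import math


def stair_step_cost(cores_needed, cores_per_step, price_per_step):
    """Up-front license cost when cores can only be bought in fixed blocks."""
    steps = math.ceil(cores_needed / cores_per_step)
    return steps * price_per_step


def usage_cost(instance_hours, hourly_rate):
    """Consumption cost when you pay only for the hours you run."""
    return instance_hours * hourly_rate


# Illustrative numbers only: 8-core license blocks at 1000 units each.
print(stair_step_cost(8, 8, 1000))   # one block covers 8 cores
print(stair_step_cost(9, 8, 1000))   # a ninth core forces a second block
print(usage_cost(720, 0.5))          # a month of hours at a flat hourly rate
```

Under these assumptions, moving from eight cores to nine doubles the license cost, while the consumption model grows in proportion to use.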
Amazon claims that Aurora provides up to five times the performance of open source MySQL, along with performance and availability similar to those of a commercial database at one-tenth the price.
Limitations
Are there limitations to Amazon Aurora? As with any solution, there certainly are. Because the solution is not open source, the code base is not available and cannot be optimized to meet a specific user's needs. In addition, it is only MySQL compatible, so new features released in the open source version of MySQL will not be instantly available in Amazon Aurora but will likely be added incrementally to the Amazon offering.
Because storage is the key to Amazon Aurora's platform, users are limited to the InnoDB storage engine. This means that MyISAM is not available to Aurora users.
Finally, as an enterprise solution, Aurora is not available on smaller instances. Small customers may need to use MySQL in Amazon's RDS, but later migration to Amazon Aurora is fairly simple.
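Because Aurora supports only InnoDB, a team planning a migration may want to list any tables that use another engine first. The sketch below is one possible way to do that in Python. The query reads MySQL's standard `information_schema.tables` view. The `non_innodb_tables` helper and the sample rows are hypothetical, and running the query against a live server is left to whatever MySQL client you already use.

```python
# Standard MySQL metadata query: user tables and their storage engines.
ENGINE_QUERY = """
SELECT table_schema, table_name, engine
FROM information_schema.tables
WHERE table_type = 'BASE TABLE'
  AND table_schema NOT IN
      ('mysql', 'information_schema', 'performance_schema', 'sys')
"""


def non_innodb_tables(rows):
    """Return the (schema, table, engine) rows whose engine is not InnoDB."""
    return [
        (schema, table, engine)
        for schema, table, engine in rows
        if (engine or "").lower() != "innodb"
    ]


# Sample rows standing in for the query result (invented for illustration).
sample = [
    ("shop", "orders", "InnoDB"),
    ("shop", "page_hits", "MyISAM"),
]
for schema, table, engine in non_innodb_tables(sample):
    print(f"{schema}.{table} uses {engine} and would need converting")
```

Any table the check reports would need to be converted to InnoDB, or otherwise handled, as part of the move.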
Amazon Aurora is not the end-all, be-all in database management software for every organization, but it does provide a solution for our segment of the market, which I think has been long underserved. It may be valuable for other companies that want the high-end features of enterprise database management software but don't have the means or willingness to invest up-front capital in core-based licensing.
For such organizations, Amazon Aurora will be a competitive and viable option as a cloud-based alternative for their enterprise databases.
About the Author
Troy Hiltbrand is the senior vice president of digital product management and analytics at Partner.co where he is responsible for its enterprise analytics and digital product strategy. You can reach the author via email.