CEO Perspective: Future Trends in PostgreSQL Performance
Upside spoke recently with Swarm64 CEO Thomas Richter about how organizations can take advantage of performance improvements for PostgreSQL, the open source database management system.
- By James E. Powell
- September 15, 2020
Upside: Where do you see the most potential benefit for PostgreSQL performance?
Thomas Richter: PostgreSQL has always been a solid transactional performance engine, but there is a big opportunity to speed up its query processing. With improved query performance, PostgreSQL can be used to analyze data at scale, so companies can save tons of money by using free open source PostgreSQL for large-scale reporting and analytics instead of expensive proprietary data warehouse databases.
In what ways will faster PostgreSQL impact data warehousing and BI?
BI and data warehousing are two areas that, historically, have not had many good free, open source SQL database options. We’re seeing many Fortune 1000 companies exploring open source options such as PostgreSQL as they modernize aging legacy data warehouse platforms.
How must PostgreSQL change to better support data warehousing and BI?
Some important improvements are required -- even before you start talking about scale-out and distributed database processing. Analytics workloads require a database with more parallel processing, columnar indexing to reduce I/O, and highly efficient resource utilization. Databases such as Oracle and SQL Server have an advantage over PostgreSQL in these areas, which is why they’re more commonly used in BI. Database acceleration extensions enhance PostgreSQL with these features, giving people a nice open source alternative to costly proprietary options.
PostgreSQL has been used as the basis for data warehousing databases in the past. How is database acceleration technology different?
Yes, PostgreSQL has been used as part of the foundation for many successful data warehousing solutions, including Netezza, Greenplum, and even Amazon Redshift. However, using any of these derivatives of PostgreSQL requires that people move off of free open source PostgreSQL. Database acceleration technologies like Swarm64 extend but do not replace native PostgreSQL, so companies get much faster performance without having to give up free open source PostgreSQL or move onto a niche version of it.
How big a performance difference does database acceleration make for PostgreSQL users?
It varies, of course, depending on the setup, the data, the queries, and so on. In our labs, the PostgreSQL acceleration extensions speed up PostgreSQL 12 performance on the TPC-H benchmark by 19 times overall, with some queries running as much as 65 times faster.
In user settings, we’ve seen the following:
- Turbit Systems analyzes time series data from wind turbines, and users need subsecond query responses. Accelerated PostgreSQL queries ran 5.5 times faster than MongoDB, and delivered subsecond response to 6 times more concurrent users per server, which is a large cost savings at scale.
- A large Japanese auto manufacturer saw a similar result. Accelerated PostgreSQL PostGIS (geospatial) was able to serve 6 times more users on the same server hardware.
- A large healthcare company found that accelerated PostgreSQL ran queries just as fast as its legacy data warehouse appliance but at 90 percent lower annual cost.
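To make the "cost savings at scale" point concrete, here is a minimal arithmetic sketch in Python. Only the 6x per-server capacity gain comes from the examples above. The baseline of 50 users per server and the total of 3,000 concurrent users are hypothetical figures chosen purely for illustration, and `servers_needed` is a made-up helper name.

```python
def servers_needed(concurrent_users: int, users_per_server: int) -> int:
    """Return how many servers are required, rounding up to whole servers."""
    if concurrent_users < 0 or users_per_server < 1:
        raise ValueError("need non-negative users and at least 1 user per server")
    # Ceiling division using integers only.
    return -(-concurrent_users // users_per_server)


CAPACITY_GAIN = 6                # from the customer examples above
BASELINE_USERS_PER_SERVER = 50   # hypothetical baseline, for illustration
TOTAL_CONCURRENT_USERS = 3000    # hypothetical workload, for illustration

if __name__ == "__main__":
    before = servers_needed(TOTAL_CONCURRENT_USERS, BASELINE_USERS_PER_SERVER)
    after = servers_needed(
        TOTAL_CONCURRENT_USERS, BASELINE_USERS_PER_SERVER * CAPACITY_GAIN
    )
    print(f"servers without acceleration: {before}")  # 60
    print(f"servers with 6x capacity:     {after}")   # 10
```

Under these assumed numbers, the same user population fits on one-sixth of the servers. Actual savings would depend on the real baseline and workload.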
Does acceleration technology change the skills PostgreSQL DBAs require?
Not really. SQL stays the same. The code stays the same. Really, it just requires knowing which PostgreSQL features might interfere with acceleration -- for example, knowing the ins and outs of PostgreSQL parallelism is helpful. Some extensions can make PostgreSQL more user-resilient so that PostgreSQL finds the fastest way to perform something, even if the DBA doesn’t.
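For readers less familiar with the parallelism settings referred to here, the following is a minimal sketch in Python that derives candidate values for three standard PostgreSQL parameters (`max_worker_processes`, `max_parallel_workers`, and `max_parallel_workers_per_gather`) from a server's core count. The function names and the sizing heuristic are illustrative assumptions, not recommendations from Swarm64 or the PostgreSQL project. Suitable values depend on the workload.

```python
from typing import Dict, List


def suggest_parallel_settings(cpu_cores: int) -> Dict[str, int]:
    """Return candidate values for PostgreSQL's parallel-query parameters.

    Hypothetical heuristic for illustration only: allow as many parallel
    workers as there are cores, and let a single Gather node use up to
    half of them.
    """
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    per_gather = max(1, cpu_cores // 2)
    return {
        # Pool of background workers; parallel workers are drawn from it.
        "max_worker_processes": cpu_cores,
        # Upper bound on workers used for parallel queries system-wide.
        "max_parallel_workers": cpu_cores,
        # Upper bound on workers a single Gather node may start.
        "max_parallel_workers_per_gather": per_gather,
    }


def to_alter_system(settings: Dict[str, int]) -> List[str]:
    """Render the settings as ALTER SYSTEM statements for a DBA to review."""
    return [f"ALTER SYSTEM SET {name} = {value};" for name, value in settings.items()]


if __name__ == "__main__":
    for statement in to_alter_system(suggest_parallel_settings(16)):
        print(statement)
```

The sketch only prints statements rather than executing them, so a DBA can review them first. Note that a change to `max_worker_processes` takes effect only after a server restart.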
Are there new database acceleration technologies in the near future that you find interesting?
I've always been a hardware nerd and fan, and it is great to see that there's a lot happening in hardware right now. Over the last five to 10 years, SSDs have profoundly affected databases, and therefore BI and analytics performance. In the next five years, I think we'll see persistent memory, computational storage, and CPU accelerators affect database performance just as significantly as SSDs already have, if not more.
How can people get started using database acceleration?
It’s easy for PostgreSQL users to get started; just download and install the acceleration extensions from Swarm64.com or from the AWS Marketplace. You do need to be running PostgreSQL 11 or higher and have the ability to install PostgreSQL extensions (on premises or on any cloud), but other than that there’s no special hardware required or anything. The more CPU cores that are available to your server for parallelism, the better.
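The prerequisites mentioned here (PostgreSQL 11 or higher, and plenty of CPU cores) can be checked with a short script. The sketch below, in Python, assumes the integer version has already been read from the server, for example with `SHOW server_version_num;`. The helper names `major_version` and `meets_minimum` are hypothetical, and the script does not check whether extensions can be installed.

```python
import os


def major_version(server_version_num: int) -> int:
    """Extract the major version from PostgreSQL's integer version number.

    For PostgreSQL 10 and later, server_version_num is major * 10000 + minor,
    so 120004 means version 12.4. Older releases (such as 9.6.5 = 90605)
    yield 9 here, which is still correctly below the minimum of 11.
    """
    return server_version_num // 10000


def meets_minimum(server_version_num: int, minimum_major: int = 11) -> bool:
    """Return True if the server is at or above the required major version."""
    return major_version(server_version_num) >= minimum_major


if __name__ == "__main__":
    example = 120004  # illustrative value, as returned for PostgreSQL 12.4
    print(f"major version: {major_version(example)}")
    print(f"meets PostgreSQL 11+ requirement: {meets_minimum(example)}")
    # Core count of the machine running this script, not necessarily the
    # database server; more cores leave more room for parallel workers.
    print(f"CPU cores visible here: {os.cpu_count()}")
```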
Database acceleration is an easy path to better price performance and scaling, and we’re excited to see it expanding the applicability of free open source PostgreSQL into analytics and data warehousing.
[Editor’s note: Thomas Richter is founder and CEO of Swarm64, a company specializing in high-performance PostgreSQL extensions for faster analytics and easier scaling. Richter previously was the CFO and R&D project manager of bMenu AS in Oslo and product manager at bMobilized, Inc. in New York City. He holds a bachelor’s degree from the University of Bath and an MBA from Lancaster University. Contact Richter at firstname.lastname@example.org or on LinkedIn.]
James E. Powell is the editorial director of TDWI, including research reports, the Business Intelligence Journal, and Upside newsletter. You can contact him via email here.