Citus Data Adds Fast Data to Big Data on PostgreSQL

New open-source extension increases PostgreSQL scalability for operational workloads.

Note: TDWI’s editors carefully choose vendor-issued press releases about new or upgraded products and services. We have edited and/or condensed this release to highlight key features but make no claims as to the accuracy of the vendor's statements.

Citus Data, creator of the horizontally scalable PostgreSQL database CitusDB, announced that PostgreSQL can now deliver simple, elastic scale across large datasets. The company is releasing an open-source extension, “pg_shard,” that enables PostgreSQL to scale out for large datasets and operational workloads.

“PostgreSQL is the de-facto relational database in many organizations and therefore while its strategic importance and popularity grow, so must its ability to tackle ever larger datasets,” said Umur Cubukcu, CEO and cofounder of Citus Data. “With pg_shard, CitusDB is making PostgreSQL horizontally scale in a simple way while keeping the benefits of a mature, open ecosystem.”

As data volumes keep growing, organizations are realizing that competitive advantage depends on harnessing the value of diverse data types. These organizations expect both high performance and high availability from their database so that it delivers results quickly and reliably. Users of the extensive PostgreSQL ecosystem already bring together structured and semi-structured data. The CitusDB database adds the ability to run ad hoc analytic SQL queries on very large datasets using cost-effective standard servers. An elastically scalable, open RDBMS that runs on commodity machines complements any commodity scale-out infrastructure, whether in a public, private, or hybrid cloud.

Now, through the new open-source extension pg_shard, it is possible to elastically scale PostgreSQL for low-latency writes and reads while maintaining analytic scale-out via CitusDB. Elastically scaling PostgreSQL lets users tackle heavy mixed analytic and operational workloads without sacrificing relational power or breaking the bank. The pg_shard open-source extension to PostgreSQL is easy to use; it requires no changes to the application layer, no middleware for users to manage, and no additional training.

Database sharding (the horizontal partitioning of data across database nodes) and elastic scaling for real-time workloads have long been challenges for database developers. Solutions have ranged from using middleware to shard the relational database to manually sharding the database while managing replication, high availability, and routing in a custom way. Some organizations migrate to a NoSQL solution. However, when an organization hits the wall with the scalability of its RDBMS, Citus Data believes these approaches are not ideal. They are difficult to set up and require changes at the application layer. Migrating to a different database also requires extensive remodeling of the data and new skills, at the expense of letting go of powerful relational semantics. Even after all the manual or migration work, these approaches do not provide an acceptable solution for running interactive analytic queries across shards. According to Citus Data, pg_shard and CitusDB address all of these issues.
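
For readers unfamiliar with sharding, the following minimal Python sketch illustrates the general idea of routing a row to a shard by hashing its key and placing each shard on more than one worker for availability. It is an illustration only, not pg_shard's implementation; the function names, the SHA-256 hash, the round-robin placement, and the worker names are all assumptions made for this example.

```python
import hashlib
from typing import List


def shard_for_key(key: str, shard_count: int) -> int:
    """Map a partition key to a shard id using a stable hash (hypothetical helper)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % shard_count


def placements(shard_id: int, workers: List[str], replication_factor: int) -> List[str]:
    """Pick the workers that hold copies of a shard, round-robin (assumed policy)."""
    if replication_factor > len(workers):
        raise ValueError("replication factor exceeds number of workers")
    return [workers[(shard_id + i) % len(workers)] for i in range(replication_factor)]


if __name__ == "__main__":
    workers = ["worker-a", "worker-b", "worker-c"]  # hypothetical node names
    shard = shard_for_key("customer-42", shard_count=16)
    print(shard, placements(shard, workers, replication_factor=2))
```

Because the hash is deterministic, every write and read for the same key lands on the same shard, and keeping two copies of each shard on different workers means a single node failure does not make that shard unavailable.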

“This is unlike anything else available today,” says Cubukcu. “When dealing with large data volumes -- especially machine-generated data such as clickstream data and user event logs -- customers want to maintain their relational semantics for analytics while maintaining their ability to ingest data in real time, and now they can. We are making it easy and simple to bring big data together with fast data on PostgreSQL.” 

pg_shard is built as an open-source PostgreSQL extension, allowing users to run it on existing PostgreSQL instances without being forced to switch to a new database backend or make changes on the application side. In addition, it is simple to add more machines as the user’s data grows. With built-in replication, users automatically benefit from high availability should any part of the network fail. Because the extension stays open, users can leverage the powerful open ecosystem and functionality around a major project like PostgreSQL.

For more information about Citus Data or about the pg_shard open source extension, please visit www.citusdata.com.
