Updated Dremio Data Lakehouse Engine Provides Faster Insights, Streamlined Operations

Dremio’s new functionality simplifies query construction, optimizes performance and storage use, and enhances compatibility for a unified data lake experience.

Note: TDWI’s editors carefully choose vendor-issued press releases about new or upgraded products and services. We have edited and/or condensed this release to highlight key features but make no claims as to the accuracy of the vendor's statements.

Dremio, provider of an open data lakehouse platform, has announced new features that enhance the performance and versatility of that platform. The company says the new capabilities help organizations accelerate data analytics and make faster, more efficient decisions. Dremio aims to provide easy self-service analytics that combine data warehouse functionality with data lake flexibility across customer data.

Among the key features unveiled by Dremio are querying, performance, and compatibility enhancements that include:

  • Effortless Iceberg table optimization. Data teams no longer need to be concerned about how a table is physically stored on object storage, including file counts, file sizes, statistics, repartitioning, and more. Dremio now offers SQL commands such as OPTIMIZE, ROLLBACK, and VACUUM, which optimize performance and streamline data lake management. The OPTIMIZE command improves query performance by optimizing data layout and statistics. The ROLLBACK command enables users to revert any unintended changes made to their data. The VACUUM command reclaims storage space by removing unnecessary data files.
  • Improved data compression. Dremio now supports native Zstandard (zstd) compression, offering an improvement of up to 40% on compression ratios and decompression speeds. This feature enables users to optimize storage utilization and improve query performance, all while reducing operational costs.
  • Tabular UDFs. Tabular User-Defined Functions enable users to extend the native capabilities of Dremio SQL and provide a layer of abstraction to simplify query construction. This allows users to create functions that can serve as native row and column policies, empowering data analysts and engineers to easily build complex calculations, transformations, and advanced analytics that unlock new possibilities for data-driven insights.
  • New mapping SQL functions. CARDINALITY returns the number of elements in a map or list and helps customers move array workloads from Presto and Athena. ST_GEOHASH returns the geohash for the given latitude and longitude coordinates, and FROM_GEOHASH returns the latitude and longitude coordinates of the center of the given geohash. Both geohash functions help customers move workloads from Snowflake, Amazon Redshift, Databricks, and Vertica. The longer the prefix two geohashes share, the smaller the cell that contains both locations and so the closer together they are; the converse does not hold, because two nearby points on opposite sides of a cell boundary can share little or no prefix.
  • Enhanced Delta Lake support. Dremio now supports multiple Delta Lake catalogs including Hive Metastore and AWS Glue. This allows seamless integration with existing Delta Lake-based workflows, providing a unified data lake experience across the organization.
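
To make the geohash behavior described in the list above concrete, here is a minimal Python sketch of the standard public geohash algorithm. It is an illustration only, not Dremio's implementation of ST_GEOHASH or FROM_GEOHASH; the function names geohash_encode, geohash_decode_center, and shared_prefix_len are hypothetical and were chosen for this example.

```python
# Standard geohash base-32 alphabet (digits plus letters, omitting a, i, l, o).
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash_encode(lat, lon, precision=9):
    """Encode latitude/longitude as a geohash of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def geohash_decode_center(geohash):
    """Return (lat, lon) of the center of the cell named by `geohash`."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    use_lon = True
    for ch in geohash:
        value = BASE32.index(ch)
        for shift in range(4, -1, -1):
            bit = (value >> shift) & 1
            if use_lon:
                mid = (lon_lo + lon_hi) / 2
                if bit:
                    lon_lo = mid
                else:
                    lon_hi = mid
            else:
                mid = (lat_lo + lat_hi) / 2
                if bit:
                    lat_lo = mid
                else:
                    lat_hi = mid
            use_lon = not use_lon
    return (lat_lo + lat_hi) / 2, (lon_lo + lon_hi) / 2


def shared_prefix_len(a, b):
    """Count the leading characters two geohashes have in common."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


if __name__ == "__main__":
    near_a = geohash_encode(57.64911, 10.40744)
    near_b = geohash_encode(57.64912, 10.40745)
    far = geohash_encode(-33.9, 151.2)
    print(near_a, near_b, shared_prefix_len(near_a, near_b))
    print(near_a, far, shared_prefix_len(near_a, far))
    print(geohash_decode_center(near_a))
```

In this sketch, the two nearby points share a multi-character prefix while the distant point shares none. Two points a few meters apart on opposite sides of the equator and prime meridian, such as (0.00001, 0.00001) and (-0.00001, -0.00001), also share no prefix, which is why a shared prefix is a sufficient but not a necessary sign of proximity.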

These key features further solidify Dremio's position as a leader in the data lakehouse engine space, enabling organizations to efficiently analyze, transform, and derive insights from their data at scale.
