Dremio’s Dart Initiative Enables Data Lake SQL Workloads
Release enhances performance for BI workloads on cloud data lakes.
Note: TDWI’s editors carefully choose vendor-issued press releases about new or upgraded products and services. We have edited and/or condensed this release to highlight key features but make no claims as to the accuracy of the vendor's statements.
Dremio released the first delivery in the company's Dart Initiative, which enables customers to run SQL workloads directly in a data lake.
Dremio is a service that sits between data-lake storage and end users who want to directly query that data for dashboards and interactive analytics, without copying data into data warehouses or creating aggregation tables, extracts, cubes, and other derivatives. Dremio simplifies the data architecture, accelerates query performance, and enables data democratization without the vendor lock-in of cloud data warehouses.
Although many of the world’s largest companies already use Dremio for SQL workloads, Dremio started the Dart Initiative to help companies run an even greater range of SQL workloads with enhanced performance and reduced resource consumption.
Optimal Query Planning
Database engines can choose from a wide range of strategies to plan queries, and the ability to generate an optimal plan in a given situation can have a significant impact on performance. Dremio now gathers deep statistics about the underlying data, which helps its query optimizer choose the most efficient execution path for each query.
The Dart Initiative also introduces query plan caching, which eliminates planning overhead and latency for repeated queries. This is particularly impactful for BI dashboarding use cases, where many users simultaneously fire similar queries against the SQL engine as they navigate through dashboards. In these scenarios, the planning phase often consumes a large proportion of total query runtime, so eliminating this repeated planning work improves application response time.
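The general idea behind plan caching can be illustrated with a minimal Python sketch. It is not Dremio's implementation: the `PlanCache` class, the `toy_planner` function, and the whitespace-only normalization are all hypothetical, chosen only to show how a repeated query can skip the planning step.

```python
import re


class PlanCache:
    """Hypothetical plan cache keyed by whitespace-normalized SQL text."""

    def __init__(self, planner):
        self._planner = planner  # callable that turns SQL text into a plan
        self._plans = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def normalize(sql):
        # Simplification: collapse all whitespace and drop a trailing
        # semicolon. A real engine would normalize the parsed query,
        # not the raw text (this would also alter string literals).
        collapsed = re.sub(r"\s+", " ", sql).strip()
        return collapsed.rstrip(";").strip()

    def plan(self, sql):
        key = self.normalize(sql)
        if key in self._plans:
            self.hits += 1  # planning is skipped entirely
        else:
            self.misses += 1
            self._plans[key] = self._planner(key)
        return self._plans[key]


def toy_planner(sql):
    # Stand-in for an expensive planning phase (assumption for illustration).
    return ("PLAN", sql)


cache = PlanCache(toy_planner)
first = cache.plan("SELECT region, SUM(sales) FROM orders GROUP BY region")
second = cache.plan("SELECT region,  SUM(sales)\n  FROM orders GROUP BY region ;")
```

Here the second call differs from the first only in whitespace and a trailing semicolon, so it is served from the cache and the planner runs once.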
Further, the Dart Initiative includes a high-performance compiler that enables much larger and more complex SQL statements with reduced resource requirements.
Comprehensive, ANSI-Standard SQL Coverage
The Dart Initiative enables companies to run a broader set of enterprise SQL workloads on Dremio by extending SQL coverage to include additional functions, operators, and grammar constructs, including additional window and aggregate functions, grouping sets, INTERSECT, and EXCEPT/MINUS.
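For readers unfamiliar with the set operators named above, the following sketch shows what INTERSECT and EXCEPT return. It uses Python's built-in `sqlite3` module as a stand-in engine, not Dremio, and the table names and rows are invented for illustration. (MINUS is a synonym for EXCEPT in some SQL dialects; SQLite accepts only EXCEPT.)

```python
import sqlite3

# In-memory SQLite database used purely as a stand-in SQL engine.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE web_customers (id INTEGER);
CREATE TABLE store_customers (id INTEGER);
INSERT INTO web_customers VALUES (1), (2), (3);
INSERT INTO store_customers VALUES (2), (3), (4);
""")

# INTERSECT: rows that appear in both result sets.
both = [row[0] for row in conn.execute(
    "SELECT id FROM web_customers "
    "INTERSECT SELECT id FROM store_customers ORDER BY id"
)]

# EXCEPT: rows in the first result set that are absent from the second.
web_only = [row[0] for row in conn.execute(
    "SELECT id FROM web_customers "
    "EXCEPT SELECT id FROM store_customers ORDER BY id"
)]

conn.close()
```

With these sample rows, `both` holds the customers seen in both channels and `web_only` holds those seen only on the web.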
Faster Query Execution
Dremio is an in-memory engine powered by Apache Arrow, an open source columnar standard for in-memory computing co-created by Dremio. The Dart Initiative boosts the performance of end-user queries with complex expressions by extending coverage of Gandiva, the LLVM-based expression compiler in Apache Arrow, to nearly all SQL functions, operators, and casts.
Distributed and Real-Time Metadata Management
Through the Dart Initiative, Dremio now supports unlimited table sizes with an unlimited number of partitions and files, as well as near-instantaneous availability of new data and data sets as they are persisted on the data lake. This is now possible with the introduction of manifest-based metadata and version management, supporting the largest data sets in enterprises with the most demanding data freshness SLAs.
Enhanced Acceleration Management
A key feature of the Dremio engine, which helps companies run mission-critical BI workloads directly on their cloud data lakes, is automated management of transparent query acceleration data structures (Data Reflections). With the Dart Initiative, Dremio greatly enhances the ability to support the orchestrated refresh of hundreds of these reflections within multi-tenant environments.
For details, visit www.dremio.com.