Dremio Update Provides BI Directly on Cloud Data Lakes

New low-latency query technology accelerates BI dashboard queries on Amazon S3 and Azure Data Lake Storage.

Note: TDWI’s editors carefully choose vendor-issued press releases about new or upgraded products and services. We have edited and/or condensed this release to highlight key features but make no claims as to the accuracy of the vendor’s statements.

Dremio, a Santa Clara, CA-based cloud data lake engine company, announced product enhancements designed to deliver sub-second query response times directly on cloud data lakes with support for thousands of concurrent users and queries. In addition, the update includes built-in integration with Microsoft Power BI, enabling users to launch data visualization software from Dremio and start querying data via a direct connection.

The latest Dremio product release enables companies to run production BI workloads, including interactive dashboards, directly on Amazon S3 and Azure Data Lake Storage (ADLS) -- without having to move data into data warehouses, cubes, aggregation tables, or extracts. The new capabilities deliver self-service access to data and enable analysts to see results immediately, eliminating their dependency on manual ETL processes or data engineering.

“The fact that organizations don't need to copy their data into a data warehouse for BI workloads has been unthinkable for the last 30 years,” said Tomer Shiran, Dremio co-founder and chief product officer. “Today, our users can leverage Dremio to power live dashboards and reports directly on S3 and ADLS, instead of waiting weeks to have data moved into a data warehouse. We’re removing limitations, accelerating time to insight, and empowering data teams.”

New features of Dremio’s cloud data lake engine are designed to enable high-concurrency, low-latency SQL workloads, including BI dashboards, directly on the cloud data lake. These include:

  • Apache Arrow caching: Dremio can now cache data reflections (physically optimized representations of data) in the Apache Arrow format so the data can be loaded directly into memory without additional processing. This eliminates the need to decode and decompress data at runtime, enabling sub-second query response times for BI dashboards.

  • Scale-out query planning: Dremio supports horizontal scaling for coordinator nodes, in addition to executor nodes, allowing companies to run high-concurrency workloads consisting of thousands of simultaneous users and queries.

  • Runtime filtering: By automatically leveraging runtime intelligence from dimension tables, Dremio reduces the amount of data that must be read from a fact table. The company says this yields a speedup of more than 100x for queries on star schemas, a type of workload that has traditionally been run only on data warehouses.

  • Enhanced Power BI integration: Microsoft and Dremio have partnered to develop a deeper integration that enables users to launch Power BI Desktop directly from the Dremio interface. Power BI automatically connects to Dremio using a native connector, so users can easily transition from building a dataset in Dremio to analyzing their data in Power BI.

  • External queries: Dremio enables users to incorporate explicit SQL queries on their relational databases within virtual datasets. This facilitates joining data between large datasets in a cloud data lake and smaller datasets in existing relational databases.
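
To make the runtime filtering idea above concrete, here is a minimal sketch of the general technique in plain Python. It is not Dremio's implementation: the tables, column names, and function names (`DIM_STORES`, `FACT_SALES`, `build_runtime_filter`, `scan_fact`, `west_sales_total`) are hypothetical and chosen only for illustration. The point is that the set of join keys surviving a dimension-table predicate is computed first, then used to skip fact-table rows during the scan.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical toy star schema: one small dimension table, one larger fact table.
DIM_STORES = [
    {"store_id": 1, "region": "west"},
    {"store_id": 2, "region": "east"},
    {"store_id": 3, "region": "west"},
]
FACT_SALES = [
    {"store_id": 1, "amount": 10.0},
    {"store_id": 2, "amount": 20.0},
    {"store_id": 3, "amount": 5.0},
    {"store_id": 2, "amount": 7.5},
    {"store_id": 1, "amount": 2.5},
]


def build_runtime_filter(
    dim_rows: Iterable[dict], key: str, predicate: Callable[[dict], bool]
) -> set:
    """Collect the join keys of dimension rows that pass the predicate."""
    return {row[key] for row in dim_rows if predicate(row)}


def scan_fact(fact_rows: Iterable[dict], key: str, runtime_filter: set) -> Iterator[dict]:
    """Yield only fact rows whose join key appears in the runtime filter."""
    for row in fact_rows:
        if row[key] in runtime_filter:
            yield row


def west_sales_total() -> float:
    """Sum sales for stores in the 'west' region using the filter-then-scan order."""
    keys = build_runtime_filter(
        DIM_STORES, "store_id", lambda r: r["region"] == "west"
    )
    return sum(row["amount"] for row in scan_fact(FACT_SALES, "store_id", keys))
```

In a real engine the filter would typically be pushed down into the storage scan (often as a compact structure such as a Bloom filter or min/max range) so that non-matching data is never read at all; the in-memory set here only stands in for that step.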

For details, visit www.dremio.com.