LESSON - Redefining EDW: Data Integration Requirements for Enterprise Data Warehousing

By Judy Ko, Senior Director, Product Management, Informatica Corporation

Data warehousing was once a siloed activity dedicated to business intelligence and reporting at a departmental or business unit level. Over the last five years, however, the brisk pace of change coupled with growing regulatory requirements has forced data warehousing into a mission-critical, operational role. Meanwhile, technologies have matured to the point where they can support these enterprise requirements.

Data warehousing has now evolved into a strategic, enterprisewide initiative that supports multiple business applications. An enterprise data warehouse is a common data foundation that provides any and all data for business needs across applications and divisions. Enterprise data warehousing (EDW) is the process of designing, building, and managing an enterprise data warehouse to meet the requirements of consuming applications.

Key Capabilities of Enterprise Data Warehousing

To live up to its potential, enterprise data warehousing must function atop a data integration platform that powers an ongoing lifecycle of data access, discovery, quality, integration, and delivery from virtually any system. Key capabilities that characterize an enterprise data warehouse include:

Business Relevant Data Across the Enterprise
EDW must allow organizations to access, discover, and integrate data from virtually any business system, in any format. It should deliver data throughout an enterprise via data services to support multiple applications in a flexible manner as part of a service-oriented architecture (SOA). EDW must also provide relevant business context for the data, so that the consumers of the data are able to interpret it in a meaningful way.

Trusted, Certifiably Accurate Data
EDW must equip organizations to manage data quality in a metric-driven, programmatic fashion. Maintaining data integrity and security across extended teams throughout the data integration lifecycle is required to meet regulatory compliance and governance objectives. To ensure confidence in the validity of enterprise data, the information itself, as well as data flows and relationships, must be auditable and traceable.

Enterprise-Class Deployment Readiness
EDW must enable organizations to deliver data with scalability and high performance, matched to the varying needs of each business application. Adaptability and flexibility in selecting appropriate data integration methods within the EDW are also critical to optimizing performance and costs. Real-time availability of data to front-line staff as well as executives increases operational agility and on-the-ground intelligence. Finally, gathering skilled resources and best practices within an integration competency center accelerates the time to market for new projects, at lower implementation costs and risk.
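As one illustration of what managing data quality "in a metric-driven, programmatic fashion" might look like, the following is a minimal Python sketch, not a description of any particular product. The sample records, the rule names, and the thresholds are all hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

# Hypothetical sample records; the field names are illustrative only.
CUSTOMERS = [
    {"id": 1, "email": "ana@example.com", "country": "US"},
    {"id": 2, "email": "", "country": "DE"},
    {"id": 3, "email": "li@example.com", "country": ""},
    {"id": 4, "email": "sam@example.com", "country": "FR"},
]


@dataclass(frozen=True)
class QualityRule:
    """A named check plus the minimum pass rate considered acceptable."""
    name: str
    check: Callable[[Dict], bool]
    threshold: float  # assumed target, e.g. 0.95 means 95% of rows must pass


def score(records: Iterable[Dict], rule: QualityRule) -> float:
    """Return the fraction of records that satisfy the rule."""
    rows = list(records)
    if not rows:
        return 1.0  # an empty data set violates nothing
    passed = sum(1 for row in rows if rule.check(row))
    return passed / len(rows)


def audit(records: Iterable[Dict], rules: List[QualityRule]) -> List[Dict]:
    """Produce one auditable report entry per rule."""
    rows = list(records)
    report = []
    for rule in rules:
        value = score(rows, rule)
        report.append({
            "rule": rule.name,
            "score": value,
            "threshold": rule.threshold,
            "passed": value >= rule.threshold,
        })
    return report


# Hypothetical rules: each one measures completeness of a single field.
RULES = [
    QualityRule("email_present", lambda r: bool(r.get("email")), 0.95),
    QualityRule("country_present", lambda r: bool(r.get("country")), 0.70),
]

if __name__ == "__main__":
    for entry in audit(CUSTOMERS, RULES):
        print(entry)
```

Keeping each rule as data (a name, a check, and a threshold) means the same report can be stored and compared over time, which is one simple way to make quality results traceable.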

Architectural Considerations

One of the critical considerations for deploying EDW is creating an architecture that can support the data requirements of all types of applications. To succeed in achieving enterprisewide visibility, businesses need a seamless mechanism for users to access operational data in various transactional systems and analytic data in data warehousing environments through a single abstraction layer. This mechanism is data services.

By taking a service-oriented approach, IT is able to reuse data access, transformation, and quality logic across multiple environments, reducing the time to implement new analytics functionality. IT is also able to leverage this same infrastructure to make the data in warehouses relevant and available for operational purposes, such as single-view applications for improved customer service.

Finally, the data services infrastructure can support users who require data at varying levels of latency: batch, near real time, and real time. As business processes evolve and businesses move from traditional decision-support systems toward operational decision making, which requires operational business intelligence, the ability to support these latencies through an enterprise data warehouse built on a data services infrastructure has become a critical factor for success.
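The idea of a single abstraction layer that reuses the same transformation logic across operational and warehouse data can be sketched roughly as follows. This is a simplified Python illustration under stated assumptions: the in-memory lists stand in for real systems, the `DataService` class and `standardize` function are hypothetical names, and mapping each latency level to exactly one source is a simplification made for the example.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]

# Hypothetical stand-ins for a transactional system and a warehouse table.
OPERATIONAL_ORDERS: List[Record] = [
    {"customer": " Ana ", "status": "OPEN"},
]
WAREHOUSE_ORDERS: List[Record] = [
    {"customer": "ana", "status": "closed"},
    {"customer": "LI", "status": "Closed"},
]


def standardize(record: Record) -> Record:
    """Shared cleanup logic, written once and reused for every source."""
    return {key: value.strip().lower() for key, value in record.items()}


class DataService:
    """Single entry point that hides which system actually holds the data."""

    def __init__(self, sources: Dict[str, Callable[[], List[Record]]]):
        # Maps a latency level (an assumed label) to a function that reads a source.
        self._sources = sources

    def fetch(self, latency: str) -> List[Record]:
        """Read from the source registered for this latency and standardize it."""
        if latency not in self._sources:
            raise ValueError(f"unsupported latency: {latency}")
        return [standardize(row) for row in self._sources[latency]()]

    def single_view(self, customer: str) -> List[Record]:
        """Combine every source into one view of a single customer."""
        wanted = customer.strip().lower()
        rows: List[Record] = []
        for latency in self._sources:
            rows.extend(r for r in self.fetch(latency) if r["customer"] == wanted)
        return rows


service = DataService({
    "real_time": lambda: OPERATIONAL_ORDERS,
    "batch": lambda: WAREHOUSE_ORDERS,
})

if __name__ == "__main__":
    print(service.fetch("real_time"))
    print(service.single_view("Ana"))
```

Consumers call `fetch` or `single_view` without knowing where the data lives, so a new source or latency level can be registered without changing the calling applications.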

This article originally appeared in the issue of .

About the Author

Judy Ko is the chief product officer at StreamSets, where she is responsible for the DataOps platform that delivers continuous data in a multicloud world and the user experience that delights data engineers.

