Executive Q&A: Controlling Cloud Egress Costs
As cloud adoption grows, enterprises are finding that their cloud costs grow with it, including little-understood egress costs. Adit Madan, director of products at Alluxio, sheds light on several best practices that can help reduce these costs.
- By Upside Staff
- July 7, 2023
Upside: Cloud storage has gained popularity in recent years. Managing egress costs can be a significant challenge. What constitutes egress? Do all the major cloud providers charge for the same thing?
Adit Madan: Egress refers to the cost an enterprise pays whenever data traverses regions or goes outside a specific cloud provider’s network. This is a significant challenge not only for enterprises implementing a hybrid or multicloud strategy but also for enterprises that have silos of data spread across multiple regions. In both these situations, analytical processing that needs access to all this data bears egress charges based on the volume of data crossing the boundary of a single region or cloud.
All major cloud providers charge based on the same metrics, although there is an emerging class of storage clouds attempting to challenge the status quo. These storage clouds have only seen success in scenarios where data is accessed infrequently, such as archiving.
Realizing each enterprise is different, what percentage of an enterprise’s cloud service expense might egress fees reasonably represent and why are they often such a surprise to enterprises?
For smaller enterprises, egress charges are fairly minimal because most data resides in a single cloud region and is accessed within that region. For larger enterprises, the number of scenarios that incur egress fees is higher. One such scenario is implementing a hybrid cloud for cost management, or a multicloud to take advantage of the latest optimized computing hardware that might not be available in the primary cloud.
For these scenarios, egress fees might be as high as a third of the total cloud service expense with a naive implementation. Better-optimized implementations can bring egress costs down but still fall short, because they introduce management complexity and require additional operations staff to compensate. Such fees come as a surprise because it is hard to predict how much data will be accessed across regions, and that volume usually only increases over time.
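As a rough back-of-the-envelope illustration of how cross-region traffic can grow into a sizable share of the bill, the sketch below applies a hypothetical flat per-GB egress rate. The $0.09/GB figure and the workload numbers are assumptions for illustration only, not any provider's actual pricing.

```python
# Back-of-the-envelope egress cost estimate (all rates are hypothetical).
def egress_share(monthly_bill_usd: float,
                 cross_region_gb: float,
                 egress_rate_per_gb: float = 0.09) -> float:
    """Return the fraction of total cloud spend attributable to egress."""
    egress_cost = cross_region_gb * egress_rate_per_gb
    return egress_cost / (monthly_bill_usd + egress_cost)

# Example: a $20k/month compute-and-storage bill plus 100 TB (~100,000 GB)
# of cross-region reads each month.
share = egress_share(20_000, 100_000)
print(f"egress is {share:.0%} of total spend")  # egress is 31% of total spend
```

Under these assumed numbers, egress lands right around the "third of the cloud service expense" that naive implementations can reach.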
A study by S&P Global found that 34% of enterprises are affected by data egress costs. What specific challenges do organizations face in managing these costs?
In most cases, manually copying data across regions is the technique employed to reduce egress fees, so that repeated access to the data does not cross network boundaries. This approach is brittle and often requires an entire team to manage. Further complexity comes from the redundant services and extra capacity that must be provisioned. Copying data also introduces compliance and governance risks, because frequent synchronization is needed to ensure that updates, such as access control policies, are propagated throughout the enterprise network.
What are your recommended best practices for saving egress costs as businesses scale their cloud usage and continue to evolve their data platform architecture? Are there other practices that might unknowingly cause inefficiency?
A data lake approach offers maximum flexibility without redundancy and is the right choice for managing most kinds of data used for analytics and machine learning. Initial data processing, such as data curation and tagging for security, should be performed close to the source of ingestion.
Moving raw data across network boundaries is infeasible. Building a federation layer to query across all curated data is key. This layer should be able to minimize cross-network traffic while placing data and computing resources wherever suitable based on cost and availability. Just as in software programming, abstractions are critical to managing the complexity of data platforms, so that underlying changes to the platform are shielded from data consumers. Often, a change to one part of the data stack has a domino effect, with inefficiencies manifesting throughout the organization.
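The abstraction idea can be made concrete with a minimal sketch. Everything here (the `DataSource` interface, `FederationLayer`, and the prefix-based routing) is hypothetical and illustrative, not Alluxio's actual design: consumers read logical paths, and the mapping to a physical cloud or region lives behind the interface, so relocating a data set does not break its consumers.

```python
from abc import ABC, abstractmethod

class DataSource(ABC):
    """Consumers program against logical paths; the mapping to a
    physical cloud/region lives behind this interface."""
    @abstractmethod
    def read(self, logical_path: str) -> bytes:
        ...

class InMemorySource(DataSource):
    """Stand-in backend for illustration; a real implementation would
    wrap an object store client for a specific cloud or region."""
    def __init__(self, objects: dict):
        self._objects = objects

    def read(self, logical_path: str) -> bytes:
        return self._objects[logical_path]

class FederationLayer:
    """Routes each logical prefix to whichever backend currently holds
    the data; moving a data set only means updating this routing table."""
    def __init__(self, routes: dict):
        self._routes = routes  # prefix -> DataSource

    def read(self, logical_path: str) -> bytes:
        prefix = logical_path.split("/", 1)[0]
        return self._routes[prefix].read(logical_path)

# Data consumers keep using "sales/2023.parquet" even if the data set
# later moves from one cloud region to another.
fed = FederationLayer({"sales": InMemorySource({"sales/2023.parquet": b"rows"})})
print(fed.read("sales/2023.parquet"))
```

The design point is the routing table: cost- or availability-driven placement decisions change only the table, never the consumer code.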
What savings can businesses expect if they follow each best practice?
Savings come in several forms. The first is infrastructure or cloud service spending, where the efficiencies gained are significant on their own. Another major saving is in the number of people needed to manage the platform; employing best practices can reduce this cost by as much as 75%. Finally, employees' productivity and ability to drive solutions to business problems depend on the agility of a disciplined organization employing sound practices, and no factor is more important than that.
Data caching is an effective technique to reduce egress costs. Can you explain what "cached in the cloud" means and its impact on application performance? For instance, do cloud managers have to explicitly say what is and isn't cached? What kind of savings does caching offer?
Caching is a technique that keeps a copy of data close to the consumer when the consumer is separated from the data source itself. By making data appear close to computing resources, it overcomes the effect of high network latency and low bandwidth on application performance.
For instance, caching can be employed when data residing on the East Coast of the U.S. is accessed from the West Coast. In this scenario, with caching, application performance is the same as if the data resided in the same cloud region. With specialized caching systems, cloud managers do not have to explicitly specify what to cache; policies maintain what is and isn't cached. For ad hoc analytics and model training, caching is very efficient because the same data is accessed multiple times, and it can hide as much as 80% of egress charges.
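Why repeated access makes caching pay off can be shown with a minimal read-through LRU cache. This is an illustrative sketch, not Alluxio's implementation: the cache counts how many bytes actually crossed the region boundary, so hot data incurs egress only on its first read.

```python
from collections import OrderedDict

class EgressCache:
    """Minimal LRU read-through cache (illustrative only): repeated
    reads of hot data are served locally instead of re-crossing
    the region boundary."""
    def __init__(self, fetch_remote, capacity: int):
        self.fetch_remote = fetch_remote   # function: key -> bytes
        self.capacity = capacity
        self.cache = OrderedDict()
        self.remote_bytes = 0              # egress actually incurred

    def read(self, key: str) -> bytes:
        if key in self.cache:
            self.cache.move_to_end(key)    # mark as recently used
            return self.cache[key]
        data = self.fetch_remote(key)      # cache miss: pay egress once
        self.remote_bytes += len(data)
        self.cache[key] = data
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False) # evict least recently used
        return data

# Hot data read 10 times: only the first read crosses the region boundary.
remote_store = {"block-1": b"x" * 1024}
cache = EgressCache(remote_store.__getitem__, capacity=8)
for _ in range(10):
    cache.read("block-1")
print(cache.remote_bytes)  # 1024
```

Without the cache, the same workload would have transferred 10,240 bytes; with it, 9 of 10 reads are free of egress, which is the mechanism behind savings figures like the 80% cited above.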
As organizations increasingly adopt multicloud strategies that span on-premises and public clouds, how should data egress costs impact their decision about what data should reside on which cloud platform?
Each data set should reside in only one cloud, without redundancy. The choice should be based on the service responsible for producing the data, and the cloud of choice is the one most suitable for running that particular computing service.
Let’s use this example of a model training pipeline. If cloud A is suitable for data preprocessing, then the curated training data should reside on cloud A. However, if cloud B offers the most efficient hardware for training, such as GPUs, then the trained model should reside on cloud B while being deployed for inference to the cloud most suitable for that stage. Egress charges must be considered carefully for a platform that benefits from maximum agility.
[Editor’s note: Adit Madan is the director of product management at Alluxio. Adit has extensive experience in distributed systems, storage systems, and large-scale data analytics. He holds an MS from Carnegie Mellon University and a BS from the Indian Institute of Technology Delhi. He is also a core maintainer and Project Management Committee (PMC) member of the Alluxio Open Source project.]