TDWI Articles

How Enterprises Can Democratize Mainframe Data

New options are available so IT leaders can break mainframe data silos.

For too long, the difficulties of mainframe data management and transformation have resulted in mainframe data being locked in silos, inaccessible to the cloud-based analytics, AI, and ML tools that can make the best use of it. Today, however, new technologies are moving IT leaders toward a future where this will no longer be the case.

For example, according to Gartner’s report Cloud Storage Management Is Transforming Mainframe Data by Jeff Vogel, mainframe organizations are now evolving toward more cloud-based data storage; about one-third of backup and archival data will be migrated to the cloud by 2025.

As we move through the coming year and beyond, executive leadership’s vision for a full digital transformation can not only coexist with mainframe data but also thrive alongside it. This trend will be one of the main themes of 2022 -- and it’s long overdue.

Mainframe data holds some of the most mission-critical information an organization collects, and keeping it siloed because of legacy storage hardware has always resulted in lost ROI.

This year, growing executive demand for full digital transformation will continue to increase the pressure on IT leaders to break the long-tolerated silos surrounding mainframe data.

Why Mainframe Data Democratization Is the Key to a Complete Digital Transformation

Removing mainframe data silos and migrating that data to the cloud is a crucial first step toward a cloud-first digital transformation strategy.

By making mainframe data available to all data consumers in an organization via the cloud, IT leaders can start their organizations on a path of democratizing the totality of their data -- meaning they can make it accessible to virtually any consumer who needs it.

Huge possibilities exist once mainframe data is democratized because analytics, AI, and ML applications can be used to process it for insights (the same way other core business data such as production manufacturing processes, sales data, demand forecasting, accounting data, etc. is processed).

Historically, inefficiencies in migrating mainframe data to the cloud have persisted for a number of reasons. Two of the most significant stem from the choice IT leaders have faced: either store data in mainframe format in the cloud while maintaining legacy storage systems, or transform that data into open formats on the mainframe before migrating it to the cloud.

For example, take IBM’s DFSMShsm (HSM). The problem with HSM is that it relies on three VSAM data sets called Control Data Sets (CDS) to keep track of the data it manages. Over time, these VSAM data sets can get quite large and require regular maintenance (including reorganizations, resizing, auditing/error correction, and reporting). In addition, data on tape must be “recycled” to reclaim tapes when data expires, which necessitates the maintenance of legacy data storage infrastructure. To keep HSM tuned and working properly, a highly skilled mainframe technician is required.

Another tactic is to use transparent cloud tiering (TCT) tools to put mainframe-formatted data into the cloud. However, TCT moves data into the cloud slowly, requires the maintenance of legacy storage options, and cannot transform the mainframe-formatted data once it is in the cloud.

IT leaders have found that although these migration options will move mainframe data around, the silo around it always goes with it. Either the data remains tethered to legacy mainframe storage hardware, or it becomes an island within the cloud environment.

Alternatively, IT leaders can transform their mainframe data out of proprietary formats and into open formats for use in the cloud. However, this transformation consumes mainframe processing capacity (MIPS), which makes it expensive and takes capacity away from other workloads. As a result, prioritization usually centers on a small percentage of data deemed absolutely critical while the majority of mainframe insights remain siloed.

Ultimately, organizations whose architectures maintain mainframe data silos fail to realize the full ROI they could achieve by democratizing their data and making it available to all consumers for use in cloud-based AI/BI/ML applications.

How to Break Mainframe Data Silos

Today, there are new options for IT leaders to break mainframe data silos.

First, modern cloud data management solutions for the mainframe can back up mainframe data, still in mainframe format, directly to cloud object storage (S3) without the need for tape or emulated tape. IT leaders can use these tools to capture mainframe data before it is written to tape, break large data sets into chunks that are sent simultaneously, push multiple data sets at one time, and compress the data before it leaves the mainframe -- thus using network bandwidth more efficiently. In contrast to legacy approaches, this process is managed by software alone, on the cloud side.
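The chunk, compress, and parallel-push pattern described above can be sketched in a few lines of Python. This is only an illustration, not any vendor's implementation: the in-memory `object_store` dictionary stands in for a real S3-compatible bucket, and the chunk size, key naming scheme, and function names are all assumptions made for the example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4 * 1024 * 1024  # hypothetical chunk size (4 MiB)

# Hypothetical stand-in for a cloud object store. A real deployment would
# call an S3-compatible client here instead of writing to a dict.
object_store = {}


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Yield (index, chunk) pairs that together cover the whole data set."""
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        yield index, data[offset:offset + chunk_size]


def compress_and_upload(name: str, index: int, chunk: bytes) -> str:
    """Compress one chunk before it leaves the source, then store it."""
    key = f"{name}/part-{index:05d}"
    object_store[key] = zlib.compress(chunk)
    return key


def backup_data_set(name: str, data: bytes,
                    chunk_size: int = CHUNK_SIZE, workers: int = 4):
    """Push all chunks of one data set in parallel; return keys in order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(compress_and_upload, name, index, chunk)
            for index, chunk in split_into_chunks(data, chunk_size)
        ]
        return [future.result() for future in futures]


def restore_data_set(keys):
    """Reassemble the original bytes from the stored, compressed parts."""
    return b"".join(zlib.decompress(object_store[key]) for key in keys)
```

Compressing each chunk independently is what lets the chunks travel in parallel, at the cost of a slightly lower compression ratio than compressing the data set as a whole.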

Second, the old extract, transform, and load (ETL) process is being replaced by an extract, load, and transform (ELT) process. ETL has been used by organizations that needed to get data out of a traditional application-data system in a mainframe environment (for example, to create a mainframe-based data warehouse). This process is well established but has significant limits.

One of those limits is time to value. ETL is computation-intensive in terms of mainframe CPU cycles and, therefore, billable MSUs. In short, ETL can be hard to work into mainframe operations, and it is expensive and slow. When we talk about moving mainframe data to the cloud (or anywhere off the mainframe), historically enterprises have been dependent on ETL and, therefore, have used it sparingly.

Fortunately, ELT has gained ground rapidly in recent years. By moving most of the processing to secondary computing resources that are less expensive, ELT avoids impacting the mainframe CPU and so does not interfere with crucial scheduled tasks. It is fast and efficient, and it’s one of the biggest reasons mainframe data can be democratized.
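To make the "T" in ELT concrete, here is a minimal sketch of a transform step that could run on inexpensive compute after the raw bytes have already been loaded off the mainframe. The record layout, field names, and sample use are hypothetical and chosen only for illustration; the sketch assumes EBCDIC text in code page 037 and a packed-decimal (COMP-3) amount with two implied decimal places.

```python
import json
from decimal import Decimal

# Hypothetical fixed-width record layout (an assumption, not from the article):
#   bytes 0-9   customer name, EBCDIC text (code page 037)
#   bytes 10-13 balance, packed decimal (COMP-3), 2 implied decimal places
RECORD_LENGTH = 14


def unpack_comp3(raw: bytes, scale: int = 2) -> Decimal:
    """Decode packed decimal: two digits per byte, sign in the last nibble."""
    digits = []
    for byte in raw[:-1]:
        digits.append(byte >> 4)
        digits.append(byte & 0x0F)
    digits.append(raw[-1] >> 4)
    sign_nibble = raw[-1] & 0x0F
    value = int("".join(str(digit) for digit in digits))
    if sign_nibble in (0x0B, 0x0D):  # 0xB and 0xD mark negative values
        value = -value
    return Decimal(value).scaleb(-scale)


def transform_record(record: bytes) -> dict:
    """Turn one mainframe-format record into an open, JSON-friendly dict."""
    name = record[0:10].decode("cp037").rstrip()
    balance = unpack_comp3(record[10:14])
    return {"name": name, "balance": str(balance)}


def transform(raw: bytes) -> list:
    """Transform a buffer of already-loaded raw records, off the mainframe."""
    if len(raw) % RECORD_LENGTH != 0:
        raise ValueError("buffer is not a whole number of records")
    return [
        transform_record(raw[offset:offset + RECORD_LENGTH])
        for offset in range(0, len(raw), RECORD_LENGTH)
    ]


def to_json_lines(raw: bytes) -> str:
    """Emit one JSON object per line, a common open format for analytics."""
    return "\n".join(json.dumps(row) for row in transform(raw))
```

Because the decoding happens after the load, none of this work consumes mainframe CPU cycles; the mainframe only has to ship the bytes.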

Data Democratization: The Big Trend of 2022

New cloud data management capabilities for the mainframe, combined with new transformation technologies, mean data can now be moved quickly out of a mainframe silo and relocated to a different platform. Once there, it can continue to support mainframe operations or can be readily transformed to power analytics, business intelligence, and AI/machine learning activities.

The year ahead will have challenges for input and output tasks -- but it offers big opportunities as well. The biggest is getting mainframe data out of silos and democratizing it so it can play a full role in powering enterprise success. By putting that data in the cloud, your organization can put in place modern management solutions and affordable, fast, tiered storage for archive and backup that can also revolutionize how your data is used and the value it delivers.

About the Author

Gil Peleg, founder and CEO of Model9, has more than two decades of hands-on experience in mainframe system programming and data management, as well as a deep understanding of methods of operation, components, and diagnostic tools. He is a co-author of eight IBM Redbooks on z/OS implementation. He holds a B.Sc. in computer science and mathematics.
