Enable ongoing replication from on-premises mainframe IBM DB2 z/OS databases to AWS S3.
The challenge with replicating mainframe data is that binary database log files need to be read. If we
are replicating data from one mainframe system to another mainframe system, both systems are using
big-endian byte storage
and both systems can read and interpret binary logs correctly.
When replicating mainframe data to AWS, binary logs written in big-endian need to be read and translated
into little-endian to accommodate AWS x86 architecture.
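The byte-order problem can be illustrated with a minimal sketch using Python's standard `struct` module. The four sample bytes are a hypothetical integer field, not an actual DB2 log record, and real log records contain far more than plain integers.

```python
import struct

# Hypothetical 4-byte signed integer as it might be stored by a big-endian system.
raw = bytes([0x00, 0x00, 0x01, 0x2C])

# ">i" interprets the bytes in big-endian order (the order they were written in).
as_big_endian = struct.unpack(">i", raw)[0]

# "<i" interprets the same bytes in little-endian order (x86 native order),
# which yields a different and incorrect value.
as_little_endian = struct.unpack("<i", raw)[0]

# Re-encode the correctly decoded value in little-endian order.
converted = struct.pack("<i", as_big_endian)

print(as_big_endian)     # 300
print(as_little_endian)  # 738263040
print(converted.hex())   # 2c010000
```

Reading the bytes with the wrong byte order does not raise an error; it silently produces a wrong number, which is why the translation step cannot be skipped.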
To assist with this conversion, we used the IBM IIDR replication tool to
translate mainframe binary database logs and replicate the data into an AWS Aurora PostgreSQL instance.
This tool was already being used to replicate data between on-premises systems, which made it a great
candidate for our solution: it did not add more software licensing and a majority of the setup was
already complete. The IBM IIDR version that we were using was a bit older and did not include the
feature to replicate to AWS S3. Due to this limitation, we needed another replication step from AWS
Aurora PostgreSQL to AWS S3 and selected AWS Database Migration Service (DMS) to accomplish this task.
A future enhancement to this solution was to upgrade IBM IIDR to the latest version, which
included a feature to extract data directly to AWS S3. We would then remove the AWS Aurora PostgreSQL
instance and AWS DMS components from this solution, reducing cost and increasing simplicity.
This solution required two replication steps: the first step was to replicate data from mainframe IBM
z/OS systems to AWS Aurora PostgreSQL, and the second step was to replicate data from AWS Aurora
PostgreSQL to AWS S3.
AWS Aurora PostgreSQL Configuration and Deployment
The IBM IIDR Replication target is an AWS Aurora PostgreSQL
database. AWS Aurora PostgreSQL was created using CDK as Infrastructure as Code.
High-availability/Multi-AZ was enabled for this instance. However, auto-scaling was not enabled because
resource consumption was low and steady. The CDK application
created all database user IDs and passwords in AWS Secrets Manager.
IBM IIDR Replication Configuration and Deployment
The IBM IIDR Replication solution requires installation of two pieces of software: the IIDR Replication
Agent and the IIDR Access Server.
The IIDR Replication Agent is the process responsible for the actual conversion and transmission
of data.
The IIDR Access Server is responsible for the coordination of tasks to and from replication agents.
The IIDR Replication Agent and the IIDR Access Server were installed on separate EC2 instances. The
deployment of the EC2 instances and the installation and configuration
of each IIDR software component were automated through a CDK Infrastructure as Code application.
Bootstrap scripts were created for each EC2 instance and all passwords were retrieved from AWS Secrets
Manager.
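As a rough sketch of how a bootstrap script might turn a retrieved secret into database connection settings, the function below parses a secret string. It assumes the secret is stored as JSON with `username`, `password`, `host`, `port`, and `dbname` keys, which is the layout commonly used for RDS credentials; the function name and the sample values are hypothetical. In a real script the string would typically come from the boto3 Secrets Manager client's `get_secret_value` call.

```python
import json


def connection_settings(secret_string: str) -> dict:
    """Turn a JSON secret string into database connection settings.

    Assumption: the secret uses the username/password/host/port/dbname keys.
    """
    secret = json.loads(secret_string)
    return {
        "host": secret["host"],
        # Fall back to the PostgreSQL default port if the secret omits it.
        "port": int(secret.get("port", 5432)),
        "dbname": secret.get("dbname", "postgres"),
        "user": secret["username"],
        "password": secret["password"],
    }


# Hypothetical secret payload used only for illustration.
sample_secret = json.dumps({
    "username": "iidr_user",
    "password": "example-only",
    "host": "aurora.example.internal",
    "port": 5432,
    "dbname": "replica",
})
settings = connection_settings(sample_secret)
print(settings["host"], settings["port"])
```

Keeping the password out of the bootstrap script itself means the script can live in source control while the credential stays in the secret store.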
IBM DB2 z/OS DDL Conversion
Once the AWS Aurora PostgreSQL database and IIDR EC2 instances were up and running, it
was time to configure the replication of data from on-premises systems to AWS. Our first step was to
convert the mainframe IBM DB2 z/OS table DDLs to AWS Aurora PostgreSQL. Since there are no open-source
tools to convert IBM DB2 z/OS DDLs
and since our team has done this numerous times, table DDLs were manually converted.
Once the PostgreSQL DDLs were converted and the tables created on the AWS Aurora PostgreSQL instance,
IBM IIDR was configured to replicate data from the on-premises databases to AWS Aurora PostgreSQL.
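To give a sense of what the manual conversion involves, here is a small sketch that maps a handful of DB2 column types to PostgreSQL equivalents. The mapping table and the `convert_column_type` helper are illustrative assumptions, not the rules the team applied, and the table covers only a few common types.

```python
import re

# Illustrative mapping only; a real conversion covers many more types and options.
TYPE_MAP = {
    "CHAR": "CHAR",
    "VARCHAR": "VARCHAR",
    "SMALLINT": "SMALLINT",
    "INTEGER": "INTEGER",
    "BIGINT": "BIGINT",
    "DECIMAL": "NUMERIC",
    "DOUBLE": "DOUBLE PRECISION",
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
    "BLOB": "BYTEA",
    "CLOB": "TEXT",
}


def convert_column_type(db2_type: str) -> str:
    """Map a DB2 column type such as 'DECIMAL(11,2)' to a PostgreSQL type."""
    match = re.fullmatch(r"\s*([A-Za-z]+)\s*(\(.*\))?\s*", db2_type)
    if match is None:
        raise ValueError(f"Unrecognized type syntax: {db2_type!r}")
    name = match.group(1).upper()
    args = match.group(2) or ""
    if name not in TYPE_MAP:
        raise ValueError(f"No mapping defined for DB2 type {name}")
    target = TYPE_MAP[name]
    # PostgreSQL BYTEA and TEXT do not take a length argument.
    if target in ("BYTEA", "TEXT"):
        args = ""
    return target + args


print(convert_column_type("DECIMAL(11,2)"))  # NUMERIC(11,2)
print(convert_column_type("CLOB(1M)"))       # TEXT
```

The sketch does not check precision limits, defaults, or constraints, which is part of why each converted table still needs a manual review.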
AWS Aurora PostgreSQL DDL Updates
Another large component of this solution was an automated method to apply
updates and changes to DDL and other objects in the database. We created another CDK Infrastructure as
Code application to meet this requirement. The CDK application created a Lambda function that
ran commands on a Docker container that ran Flyway, a free database
versioning tool. With this CDK application, database object updates were worked into CI/CD pipelines,
which mitigated numerous risks associated with database object changes.
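The command that such a function hands to the Flyway container could be assembled along the following lines. This is a sketch under assumptions: the `build_flyway_migrate` helper, host name, database name, and user are hypothetical, and how the command reaches the container is not shown. The `-url`, `-user`, and `-locations` options, the `migrate` command, and the `FLYWAY_PASSWORD` environment variable are standard Flyway features, and `/flyway/sql` is the default script location in the official Flyway Docker image.

```python
def build_flyway_migrate(host: str, port: int, database: str, user: str,
                         password: str,
                         locations: str = "filesystem:/flyway/sql"):
    """Return the Flyway command line and environment for a migrate run."""
    url = f"jdbc:postgresql://{host}:{port}/{database}"
    command = [
        "flyway",
        f"-url={url}",
        f"-user={user}",
        f"-locations={locations}",
        "migrate",
    ]
    # Pass the password through the environment so it stays out of the argument list.
    environment = {"FLYWAY_PASSWORD": password}
    return command, environment


# Hypothetical values used only for illustration.
command, environment = build_flyway_migrate(
    "aurora.example.internal", 5432, "replica", "flyway_user", "example-only"
)
print(" ".join(command))
```

Flyway applies versioned scripts in version order and records each one in its schema history table, which is what makes repeated pipeline runs safe.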
AWS DMS Instance and Task Creation
Our final task was to set up AWS Database Migration Service (DMS) to replicate
AWS Aurora PostgreSQL data to AWS S3. An AWS DMS instance was created, along with AWS DMS endpoints for
the AWS Aurora PostgreSQL source and the AWS S3 target bucket and folder. Finally, an AWS DMS task was
created using the full-load and CDC replication option.
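The task definition can be sketched as the set of parameters that would be passed to the DMS `create_replication_task` API. The task identifier, schema name, and ARN placeholders below are assumptions for illustration; `full-load-and-cdc` is the DMS migration type that corresponds to the full-load and CDC option, and the table mapping uses the standard selection-rule format.

```python
import json


def build_task_parameters(schema: str, source_arn: str, target_arn: str,
                          instance_arn: str) -> dict:
    """Build parameters for a DMS task that replicates every table in one schema."""
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-schema",
                # "%" is the DMS wildcard, so all tables in the schema are included.
                "object-locator": {"schema-name": schema, "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": "aurora-to-s3",  # hypothetical name
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        # Initial full load followed by ongoing change data capture.
        "MigrationType": "full-load-and-cdc",
        # DMS expects the table mappings as a JSON string.
        "TableMappings": json.dumps(table_mappings),
    }


# Placeholder ARNs; real values come from the created endpoints and instance.
params = build_task_parameters(
    "public", "source-endpoint-arn", "target-endpoint-arn", "replication-instance-arn"
)
print(params["MigrationType"])
```

With this migration type, DMS first copies the existing rows and then keeps applying changes, so the S3 target continues to receive updates after the initial load.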
The solution was a success: mainframe IBM DB2 z/OS data was continuously replicated
to AWS S3. The project took a full year to implement, primarily because of the need to work with
numerous teams and the formal processes required at each stage of the project.
Working with the Government requires a significant amount of work for formal processes and project stage
clearance. Try to line up whom you need to work with, when you need to engage with them, what
documents are required and what approvals are required. Try to streamline the process as much as
possible to minimize any potential delays.