Boosting Data Processing for a "Big Three" Credit Rating Company

Near real-time synchronization of vast datasets, ensuring quick access to the most current information.

Quick Info

Complete team autonomy

in creating a solution that meets the system's needs.

Leveraging a proof of concept

to verify the most suitable approach.

Real-time data synchronization

through a Change Data Capture (CDC) process.

Lowered operational costs

by removing duplicate S3 storage and optimizing resource usage.

Client

A global credit research and ratings leader that provides in-depth analysis and data to investors and companies worldwide.

Client Need

The client’s system manages vast economic datasets with over 60 million records in MySQL and millions more added daily. To keep searches fast and efficient, MongoDB worked alongside MySQL, demanding continuous synchronization. However, the existing daily sync job was slow, error-prone, and resource-intensive, leaving users with outdated data.

The client needed back-end web development services to provide near real-time synchronization between the databases, ensuring fast and reliable access to up-to-date information.

Solution

Recognizing the complexity of the challenge, the client entrusted our team with significant autonomy to design and implement a solution. Our specialists began with proof-of-concept development and tool evaluation to create a system tailored to the client's unique requirements.

We proposed a new approach, replacing the existing data synchronization mechanism with a real-time Change Data Capture (CDC) process. To monitor MySQL tables for changes, we employed AWS Database Migration Service (DMS). Now, every insertion, update, or deletion in MySQL produces an event that is published to a Kafka topic.
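
For illustration, a change event consumed from Kafka might be modeled as below. The field names follow the JSON envelope that AWS DMS writes to streaming targets, with row values under "data" and the change description under "metadata"; the exact shape depends on task settings, so this is a sketch rather than the client's actual schema.

```java
// Illustrative model of a DMS change event as read from a Kafka topic.
// DMS wraps each row change in a JSON envelope: the row values sit under
// "data" and the change description under "metadata". Exact fields vary
// with task configuration, so treat this as an assumption.
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.annotation.JsonProperty;
import java.util.Map;

@JsonIgnoreProperties(ignoreUnknown = true)
public record CdcEvent(
        // Column name -> value for the changed MySQL row.
        Map<String, Object> data,
        Metadata metadata) {

    @JsonIgnoreProperties(ignoreUnknown = true)
    public record Metadata(
            // "insert", "update", or "delete".
            String operation,
            @JsonProperty("schema-name") String schemaName,
            @JsonProperty("table-name") String tableName) {
    }
}
```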

To manage these events, we built a custom Back-end Consumer Service that continuously monitors the Kafka topics, processes the updates, and immediately applies them to the corresponding MongoDB collections.
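
A minimal sketch of such a consumer, using Spring Kafka and Spring Data MongoDB, could look as follows. The topic, consumer group, and collection names are illustrative placeholders, and CdcEvent is the event model sketched above.

```java
// Minimal sketch of the Back-end Consumer Service: listens on a Kafka topic
// and applies each change to MongoDB. Topic, group, and collection names
// are hypothetical placeholders, not the client's identifiers.
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.mongodb.core.query.Criteria;
import org.springframework.data.mongodb.core.query.Query;
import org.springframework.data.mongodb.core.query.Update;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.stereotype.Service;

@Service
public class CdcConsumerService {

    private final MongoTemplate mongoTemplate;

    public CdcConsumerService(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    @KafkaListener(topics = "mysql.cdc.records", groupId = "mongo-sync")
    public void onEvent(CdcEvent event) {
        // Use the MySQL primary key as the MongoDB _id.
        Query byId = new Query(Criteria.where("_id").is(event.data().get("id")));

        switch (event.metadata().operation()) {
            case "insert", "update" -> {
                Update update = new Update();
                event.data().forEach(update::set);
                // Upsert: create the document if missing, update it otherwise.
                mongoTemplate.upsert(byId, update, "records");
            }
            case "delete" -> mongoTemplate.remove(byId, "records");
            default -> { } // ignore control records such as DDL changes
        }
    }
}
```

Keying documents by the MySQL primary key makes the upsert idempotent, so replayed or out-of-order events do not corrupt the MongoDB copy.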

To further accelerate data access, we introduced Redis as a caching layer, making data fetching near-instant and improving the user experience on the web portal. And to ensure that things continue to run smoothly, we built robust error-handling mechanisms that detect and recover from failures automatically, reducing downtime and ensuring data consistency.
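
One way to wire such a cache is Spring's cache abstraction backed by Redis, sketched below. The 60-second TTL, cache name, and collection name are assumptions chosen to match the near real-time sync, not the client's actual configuration.

```java
// Sketch of Redis as a read-side cache via Spring's cache abstraction.
// TTL, cache name, and collection name are illustrative assumptions.
import java.time.Duration;
import org.bson.Document;
import org.springframework.cache.annotation.Cacheable;
import org.springframework.cache.annotation.EnableCaching;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.mongodb.core.MongoTemplate;
import org.springframework.data.redis.cache.RedisCacheConfiguration;
import org.springframework.data.redis.cache.RedisCacheManager;
import org.springframework.data.redis.connection.RedisConnectionFactory;
import org.springframework.stereotype.Service;

@Configuration
@EnableCaching
class CacheConfig {

    @Bean
    RedisCacheManager cacheManager(RedisConnectionFactory connectionFactory) {
        // A short TTL keeps cached reads close to the near real-time data in MongoDB.
        return RedisCacheManager.builder(connectionFactory)
                .cacheDefaults(RedisCacheConfiguration.defaultCacheConfig()
                        .entryTtl(Duration.ofSeconds(60)))
                .build();
    }
}

@Service
class RecordQueryService {

    private final MongoTemplate mongoTemplate;

    RecordQueryService(MongoTemplate mongoTemplate) {
        this.mongoTemplate = mongoTemplate;
    }

    // First call hits MongoDB; repeat calls within the TTL are served from Redis.
    @Cacheable(cacheNames = "records", key = "#recordId")
    public Document findRecord(String recordId) {
        return mongoTemplate.findById(recordId, Document.class, "records");
    }
}
```

For automatic failure recovery of this kind, Spring Kafka's DefaultErrorHandler with an exponential back-off and a dead-letter topic is one common pattern.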

This transformation not only optimized the synchronization process but also reduced resource consumption. By transitioning to real-time updates and eliminating the storage of extensive changelogs on Amazon S3, the system became leaner, faster, and more cost-effective.

Results

Near-real-time data availability
MySQL updates now reach MongoDB in under a minute, so the latest data is always available to users.
Cost efficiency
Cost savings were achieved by eliminating a resource-intensive batch job and redundant S3 storage.
Enhanced system performance
Quicker data queries and exports delivered a smoother, more responsive user experience.

Technologies

Spring Boot, Kafka Streams, Spring Data, Kubernetes, Redis, AWS DMS
