
Purging Big Data problems using a cloud-powered BigQuery solution

How we migrated Hadoop workloads to GCP to enhance performance and reduce costs.

Background

Hadoop has been a prominent solution to big data problems over the past decade, and its popularity has prompted cloud providers to offer modernized big data clusters. However, managing and scaling a Hadoop cluster can be complex and expensive, requiring a dedicated workforce.

Given these challenges, it is crucial for companies to consider alternative solutions that could streamline their big data processes and improve their bottom line.


72% reduction in daily costs

21% lower infrastructure maintenance costs

24% faster time-to-market

Client Situation

The client is a global leader in delivering broadband, wireless, and wireline communications in the US. They inherited a 3,500-node Hadoop cluster as part of an acquisition and were incurring significant costs to maintain the infrastructure and Hadoop ecosystem, costs they wanted to avoid.

The client was seeking a partner to assist in migrating its Hadoop data lake to Google Cloud Platform (GCP). This involved transferring huge volumes of data from Hadoop to GCP-based infrastructure, as well as optimizing the new environment for efficient data handling and storage, which added overhead.

Our proactive approach helped the client combat Big Data problems using cloud-based tools and improve performance by 86%.

Diagnosis

The client had attempted to migrate its data lake infrastructure to GCP's BigQuery platform, but had to contend with the technical limitations of Hadoop frameworks and orchestrators, resource constraints, and cost-cutting measures.

Even though the client benefited from industry-leading Data Warehouse (DW) technology, they faced a considerable daily cost increase, making migration more challenging.

Hadoop clusters were clearly not a viable solution to the client's data processing needs. Given the dynamic nature of that processing, the client needed to explore alternatives, particularly cloud-based platforms that are scalable, reliable, and cost-effective.

Solving It

The Prodapt team thoroughly reviewed the client's existing infrastructure and recommended a migration approach. To put the plan into action, we re-implemented their data pipelines cloud-natively using GCP services such as BigQuery, Cloud Storage, and Dataflow.
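
As an illustration of this cloud-native pattern, the sketch below shows a minimal Apache Beam pipeline that reads exported records from Cloud Storage and writes them to BigQuery. The project, bucket, table, and field names are hypothetical placeholders, not the client's actual pipeline.

```python
# Minimal Apache Beam pipeline (runnable on Dataflow) illustrating the
# Cloud Storage -> transform -> BigQuery pattern. Project, bucket, table,
# and field names are hypothetical placeholders.
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_record(line: str) -> dict:
    """Parse one newline-delimited JSON record exported from the data lake."""
    record = json.loads(line)
    return {"event_id": record["event_id"], "event_ts": record["event_ts"]}


def run():
    options = PipelineOptions(
        runner="DataflowRunner",          # swap for "DirectRunner" to test locally
        project="example-project",        # placeholder project ID
        region="us-central1",
        temp_location="gs://example-bucket/tmp",
    )
    with beam.Pipeline(options=options) as p:
        (
            p
            | "ReadFromGCS" >> beam.io.ReadFromText("gs://example-bucket/export/*.json")
            | "ParseJSON" >> beam.Map(parse_record)
            | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
                "example-project:analytics.events",
                schema="event_id:STRING,event_ts:TIMESTAMP",
                write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
            )
        )


if __name__ == "__main__":
    run()
```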

By moving to GCP, the client was also able to utilize built-in services such as Cloud Composer and Looker.

Cloud Composer helped us automate and streamline complex workflows during the re-architecture, while Looker dashboards gave the team the visibility to proactively manage data processing pipelines, optimize job performance, and reduce costs.
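
Since Cloud Composer is managed Apache Airflow, a re-architected batch workflow typically takes the form of a DAG like the hedged sketch below. The DAG ID, schedule, and SQL are illustrative assumptions, not the client's actual jobs.

```python
# Hypothetical Cloud Composer (Apache Airflow) DAG illustrating how a
# re-architected batch workflow can schedule a BigQuery transformation.
# All IDs, schedules, and SQL are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="daily_events_rollup",       # placeholder DAG name
    schedule_interval="0 3 * * *",      # run daily at 03:00
    start_date=datetime(2023, 1, 1),
    catchup=False,
) as dag:
    # Aggregate the raw events loaded by the ingestion pipeline into a daily rollup.
    rollup = BigQueryInsertJobOperator(
        task_id="rollup_daily_events",
        configuration={
            "query": {
                "query": """
                    SELECT DATE(event_ts) AS event_date, COUNT(*) AS events
                    FROM `example-project.analytics.events`
                    GROUP BY event_date
                """,
                "destinationTable": {
                    "projectId": "example-project",
                    "datasetId": "analytics",
                    "tableId": "daily_event_counts",
                },
                "writeDisposition": "WRITE_TRUNCATE",
                "useLegacySql": False,
            }
        },
    )
```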

We integrated Opsgenie, an incident management tool, with GCP to trigger real-time alerts on metrics and events, automatically escalate incidents for resolution, improve collaboration between teams, and orchestrate incident response processes.
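
One common way to wire up such an integration is to route Cloud Monitoring notifications through Pub/Sub to a small Cloud Function that calls Opsgenie's Alert API. The sketch below assumes that pattern; the environment variable name and incident fields are placeholders, not the client's actual integration.

```python
# Hypothetical Cloud Function that forwards a Cloud Monitoring notification
# (delivered via a Pub/Sub trigger) to Opsgenie's Alert API. The environment
# variable name and message fields are illustrative assumptions.
import base64
import json
import os

import requests

OPSGENIE_ALERTS_URL = "https://api.opsgenie.com/v2/alerts"


def forward_alert(event, context):
    """Pub/Sub-triggered entry point: decode the incident and raise an Opsgenie alert."""
    payload = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    incident = payload.get("incident", {})

    response = requests.post(
        OPSGENIE_ALERTS_URL,
        headers={"Authorization": f"GenieKey {os.environ['OPSGENIE_API_KEY']}"},
        json={
            "message": incident.get("summary", "GCP alert"),
            "alias": incident.get("incident_id", ""),   # de-duplicates repeat alerts
            "priority": "P2",
            "details": {"policy": incident.get("policy_name", "")},
        },
        timeout=10,
    )
    response.raise_for_status()
```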

Prodapt also introduced new accelerators that make it easier and faster to create, compare, migrate, and test different types of data in a database, significantly reducing overall effort and cost.
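
The accelerators themselves are proprietary, but the kind of check they automate can be sketched as follows: a hedged example that compares a migrated BigQuery table's row count against the count recorded on the source side. Table names and counts are placeholders.

```python
# Minimal sketch of the kind of post-migration check a data-validation
# accelerator automates: comparing a migrated BigQuery table's row count
# against the count captured from the source (Hadoop) side. Names are placeholders.
from google.cloud import bigquery


def row_count(client: bigquery.Client, table: str) -> int:
    """Return the row count of a fully-qualified BigQuery table."""
    query = f"SELECT COUNT(*) AS n FROM `{table}`"
    return next(iter(client.query(query).result()))["n"]


def validate_migration(table: str, source_count: int) -> bool:
    """Compare the migrated table against the count captured from the source system."""
    client = bigquery.Client()
    migrated = row_count(client, table)
    print(f"{table}: source={source_count}, migrated={migrated}")
    return migrated == source_count


if __name__ == "__main__":
    # Placeholder table name and source-side count.
    validate_migration("example-project.analytics.events", source_count=1_000_000)
```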

Within six months, daily costs went down by 72%. Moving the data pipelines from Hadoop to BigQuery reduced infrastructure maintenance costs by 21%, and our accelerators and frameworks helped the client speed up their time-to-market by 24%.
