Large Scale ETL and Lakehouse Implementation at Asurion

Technology Category
  • Platform as a Service (PaaS) - Application Development Platforms
  • Robots - Parallel Robots
Applicable Industries
  • Buildings
  • Construction & Infrastructure
Applicable Functions
  • Maintenance
  • Product Research & Development
Use Cases
  • Inventory Management
  • Time Sensitive Networking
Services
  • System Integration
About The Customer
Asurion provides insurance along with installation, repair, replacement, and 24/7 support services, helping people protect, connect, and enjoy the latest tech. Every day, its team of 10,000 experts helps nearly 300 million people around the world solve the most common and uncommon tech issues. They are just a call, tap, click, or visit away for everything from getting a same-day replacement of your smartphone to helping you stream or connect with no buffering, bumps, or bewilderment. Asurion's legacy data platform operated at massive scale, with over 8,000 tables, 10,000 views, 2,000 reports, and 2,500 dashboards.
The Challenge
Asurion's Enterprise Data Service team was tasked with gathering over 3,500 data assets from across the organization and providing a unified platform where all the data could be cleaned, joined, analyzed, enriched, and leveraged to create data products. Previous iterations of the data platform, built mostly on traditional databases and data warehouse solutions, ran into scaling and cost problems because compute and storage were not separated. Facing increasing data volumes, a wide variety of data types, and demand for lower latency and higher velocity, the platform engineering team began to consider moving the whole ecosystem to Apache Spark™ and Delta Lake, using a lakehouse architecture as the new foundation. The previous platform was based on the Lambda architecture, which introduced problems such as data duplication and synchronization, logic duplication, inconsistent handling of late data, difficult data reprocessing due to the lack of a transactional layer, and downtime for platform maintenance.
The Solution
Asurion implemented a lakehouse architecture to simplify the platform by eliminating batch and speed layers, providing near real-time latency, supporting a variety of data formats and languages, and simplifying the technology stack into one integrated ecosystem. The goal of the new architecture was to create a single job that’s flexible enough to run thousands of times with different configurations. To achieve this goal, they chose Spark Structured Streaming, along with Auto Loader, which greatly simplified state management of each job. They built a framework around Spark using Scala and the fundamentals of object-oriented programming. They created a rich set of readers, transformations, and writers, as well as Job classes accepting details through run-time dependency injection. They divided the tables based on how frequently they are updated at the source and bundled them into job groups, one assigned to each ephemeral notebook. They also used Databricks SQL for data marts, which gave them a more attractive price point and exposed an easy JDBC connection for their user-facing SQL application.
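The "single job, many configurations" idea described above can be sketched in a few lines. This is a minimal illustration only, not Asurion's framework: it is written in plain Python rather than the Scala the team used, it uses in-memory lists in place of Spark readers and writers, and every name in it (Job, build_job, TRANSFORMATIONS, drop_nulls, rename, the config keys) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]
Reader = Callable[[], Iterable[Row]]
Transformation = Callable[[Iterable[Row]], Iterable[Row]]
Writer = Callable[[Iterable[Row]], None]


@dataclass
class Job:
    """One generic job; its behaviour comes entirely from the injected parts."""
    name: str
    reader: Reader
    transformations: List[Transformation]
    writer: Writer

    def run(self) -> None:
        rows = self.reader()
        for transform in self.transformations:
            rows = transform(rows)
        self.writer(rows)


def list_reader(rows: List[Row]) -> Reader:
    # Stand-in for a real source reader (assumption: in-memory data).
    return lambda: list(rows)


def list_writer(sink: List[Row]) -> Writer:
    # Stand-in for a real table writer (assumption: appends to a list).
    return lambda rows: sink.extend(rows)


def drop_nulls(column: str) -> Transformation:
    # Keep only rows where the given column is present and not None.
    return lambda rows: [r for r in rows if r.get(column) is not None]


def rename(old: str, new: str) -> Transformation:
    # Rename one column, leaving all other columns untouched.
    def apply(rows: Iterable[Row]) -> Iterable[Row]:
        return [{(new if k == old else k): v for k, v in r.items()} for r in rows]
    return apply


# Hypothetical registry mapping config names to transformation factories.
TRANSFORMATIONS = {"drop_nulls": drop_nulls, "rename": rename}


def build_job(config: dict,
              sources: Dict[str, List[Row]],
              sinks: Dict[str, List[Row]]) -> Job:
    """Assemble a Job at run time from a plain configuration dictionary."""
    steps = [TRANSFORMATIONS[name](*args) for name, args in config["steps"]]
    return Job(
        name=config["name"],
        reader=list_reader(sources[config["source"]]),
        transformations=steps,
        writer=list_writer(sinks[config["sink"]]),
    )


if __name__ == "__main__":
    sources = {"claims": [{"id": 1, "status": "open"}, {"id": None, "status": "void"}]}
    sinks = {"claims_clean": []}
    config = {
        "name": "claims_job",
        "source": "claims",
        "steps": [("drop_nulls", ["id"]), ("rename", ["id", "claim_id"])],
        "sink": "claims_clean",
    }
    build_job(config, sources, sinks).run()
    print(sinks["claims_clean"])
```

Because the Job class never names a specific table or transformation, onboarding another table in this sketch means adding another configuration entry rather than writing another job.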
Operational Impact
  • The lakehouse architecture gave Asurion a simpler platform: it eliminated the separate batch and speed layers, delivered near real-time latency, supported a variety of data formats and languages, and consolidated the technology stack into one integrated ecosystem. A single job could run thousands of times with different configurations, and Spark Structured Streaming with Auto Loader greatly simplified state management of each job. Databricks SQL for data marts offered a more attractive price point and exposed an easy JDBC connection for their user-facing SQL application. The result was a more efficient and cost-effective data management system able to handle the scale of Asurion's operations.
Quantitative Benefit
  • The new setup allowed Asurion to run up to 1,000 slow-changing tables on one 25-node cluster.
  • They now run 600 data marts in production in their lakehouse, and the number is growing.
  • A single job was flexible enough to run thousands of times with different configurations.
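The Solution section mentions dividing tables by how often they are updated at the source and bundling them into job groups. A rough sketch of that bundling step follows, in plain Python. The tier names, the update-rate thresholds, and the per-group caps are all assumptions made for illustration; the only figure taken from the case study is that up to 1,000 slow-changing tables ran on one cluster, which is used here as the cap for the slow tier.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def frequency_tier(updates_per_day: float) -> str:
    # Hypothetical thresholds; the case study does not state actual values.
    if updates_per_day >= 1000:
        return "fast"
    if updates_per_day >= 24:
        return "medium"
    return "slow"


# Hypothetical cap on tables per job group for each tier.
MAX_TABLES_PER_GROUP = {"fast": 10, "medium": 100, "slow": 1000}


def build_job_groups(tables: Dict[str, float]) -> List[Tuple[str, List[str]]]:
    """Bucket tables by update frequency, then split each bucket into groups."""
    by_tier: Dict[str, List[str]] = defaultdict(list)
    for name in sorted(tables):
        by_tier[frequency_tier(tables[name])].append(name)

    groups: List[Tuple[str, List[str]]] = []
    for tier in ("fast", "medium", "slow"):
        names = by_tier.get(tier, [])
        size = MAX_TABLES_PER_GROUP[tier]
        # Slice the tier's tables into consecutive chunks of at most `size`.
        for start in range(0, len(names), size):
            groups.append((tier, names[start:start + size]))
    return groups
```

Each resulting group would then be handed to one job run; how groups map to notebooks and clusters is outside what this sketch covers.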
