How Explorium Serves Enriched Data in Production 3-50x Faster with Firebolt

Customer Company Size

Mid-size Company

Region

America

Country

United States

Product

Firebolt
Databricks Delta Lake
Amazon Athena

Tech Stack

SQL
Apache Spark
AWS

Implementation Scale

Enterprise-wide Deployment

Impact Metrics

Productivity Improvements
Customer Satisfaction

Technology Category

Analytics & Modeling - Predictive Analytics
Platform as a Service (PaaS) - Data Management Platforms

Applicable Industries

Software

Applicable Functions

Business Operation

Services

Cloud Planning, Design & Implementation Services
System Integration

About The Customer

Explorium provides an external data acquisition and management platform. The platform helps companies make better business decisions by automatically discovering, connecting, and matching their own data with hundreds of curated data sources and thousands of external data signals. As Explorium grew, it struggled to manage and process large volumes of data efficiently. Customers rely on Explorium to deliver enriched data quickly and accurately, so the company needed a solution that could handle increasing data volumes while keeping performance consistent and fast.

The Challenge

Explorium faced significant performance challenges as its data and customer requests grew. The existing setup, a Presto cluster on AWS used for processing time series data, could not handle high loads efficiently. Because the cluster was shared, large jobs could slow down other requests, which led to customer dissatisfaction. With data volumes and requests expected to triple, Explorium needed a new solution for serving customer requests for time series data enrichment.

The Solution

Explorium evaluated several options, including other Presto solutions and Amazon Redshift, but found them lacking in workload isolation and performance. It ultimately chose Firebolt for its ability to handle large data sets with a decoupled storage and compute architecture. Implementation took two months, with most of the work done in SQL. Explorium used Apache Spark to process raw data and load it into Delta Lake, then loaded it from Delta Lake into Firebolt using an ELT process. Firebolt's primary indexes improved the performance of live queries, while larger offline requests were handled with federated queries. Because primary indexes delivered the needed speed, Explorium was able to deploy a lower-cost three-node engine.
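
To give a rough intuition for why a primary index can speed up live lookups, the following Python sketch models a generic sparse index over sorted data. It is a conceptual illustration only and does not reflect Firebolt's actual implementation. The class name SparseIndexTable, the block size, and the (entity_id, day) key layout are all hypothetical.

```python
from bisect import bisect_left, bisect_right


class SparseIndexTable:
    """Toy model (hypothetical): rows sorted by key, plus one index entry per block."""

    def __init__(self, rows, block_size=10):
        # rows: iterable of (key, value) pairs; keys must be mutually comparable.
        self.rows = sorted(rows, key=lambda r: r[0])
        self.block_size = block_size
        # Sparse index: only the first key of every block is stored.
        self.block_starts = [
            self.rows[i][0] for i in range(0, len(self.rows), block_size)
        ]

    def lookup(self, key):
        """Return (matching_values, rows_scanned) for an exact-key lookup."""
        # Earliest block that may hold the key (duplicates can spill over
        # from the block before the first index entry >= key).
        first = max(0, bisect_left(self.block_starts, key) - 1)
        # Latest block whose first key is <= key.
        last = bisect_right(self.block_starts, key) - 1
        start = first * self.block_size
        stop = min((last + 1) * self.block_size, len(self.rows))
        scanned = max(0, stop - start)
        matches = [v for k, v in self.rows[start:stop] if k == key]
        return matches, scanned


# Hypothetical time series: 1,000 entities x 30 days = 30,000 rows.
table = SparseIndexTable(
    (((entity, day), entity * 100 + day) for entity in range(1000) for day in range(30)),
    block_size=100,
)
values, scanned = table.lookup((417, 12))
print(f"found {values} after scanning {scanned} of {len(table.rows)} rows")
```

In this toy model a point lookup reads at most two blocks instead of the whole table, which is the general idea behind pruning data with an index on sorted keys. Real engines add compression, columnar storage, and distributed execution on top.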

Operational Impact

Firebolt provided predictable performance for every query, even under high loads, eliminating the need for timeouts or throttling.
The solution was easy to manage and offered linear scalability, allowing Explorium to support increasing data volumes and loads with minimal effort.
Firebolt's architecture allowed Explorium to serve enriched data to customers consistently and reliably, with room for future growth.

Quantitative Benefit

Across the evaluated queries, Firebolt ran 17-102x faster than Redshift.
Live queries ran 15-50x faster than on Redshift, with every query completing in 2 seconds or less.
Larger offline data enrichment requests ran 3-5x faster than on the original Presto deployment.