Customer Company Size
- Mid-size Company
Region
- America
Country
- United States
Product
- Firebolt
- Databricks Delta Lake
- Amazon Athena
Tech Stack
- SQL
- Apache Spark
- AWS
Implementation Scale
- Enterprise-wide Deployment
Impact Metrics
- Productivity Improvements
- Customer Satisfaction
Technology Category
- Analytics & Modeling - Predictive Analytics
- Platform as a Service (PaaS) - Data Management Platforms
Applicable Industries
- Software
Applicable Functions
- Business Operation
Services
- Cloud Planning, Design & Implementation Services
- System Integration
About The Customer
Explorium provides an external data acquisition and management platform. The platform helps companies make better business decisions by automatically discovering, connecting, and matching their own data with hundreds of curated data sources and thousands of external data signals. As Explorium grew, it struggled to manage and process large volumes of data efficiently. Its customers rely on it to deliver enriched data quickly and accurately, which is critical for making informed business decisions. The company needed a solution that could handle increasing data volumes and provide consistent, fast performance to meet customer expectations.
The Challenge
Explorium faced significant performance challenges as its data and customer requests grew. Its existing setup, a Presto cluster on AWS used to process time series data, could not handle high loads efficiently. Because the Presto cluster was shared, large jobs could degrade the performance of other requests, leading to slowdowns and customer dissatisfaction. With data volumes and requests expected to triple, Explorium needed a new solution to handle customer requests for time series data enrichment.
The Solution
Explorium evaluated several options, including other Presto solutions and Amazon Redshift, but found them lacking in workload isolation and performance. They ultimately chose Firebolt for its ability to handle large data sets with a decoupled storage and compute architecture. The implementation took two months, with most of the work done in SQL. Explorium used Apache Spark to process raw data and load it into Delta Lake, and then loaded it into Firebolt using an ELT process. Firebolt's primary indexes improved the performance of live queries, while larger offline requests were handled using federated queries. Explorium deployed a lower-cost three-node engine, relying on primary indexes for fast performance.
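The case study does not describe how the primary indexes work internally. As a rough intuition only, the sketch below shows the general idea behind a sparse index over sorted data: when rows are stored in key order, a range query on that key can skip most blocks instead of scanning everything. This is a conceptual illustration in Python, not Firebolt's implementation; the block size, function names, and sample keys are all hypothetical.

```python
from bisect import bisect_left, bisect_right

BLOCK_SIZE = 4  # hypothetical number of rows per storage block


def build_sparse_index(sorted_keys, block_size=BLOCK_SIZE):
    """Record the first key of each block of already-sorted keys."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), block_size)]


def blocks_for_range(sparse_index, lo, hi):
    """Return the block numbers that may hold keys in the range [lo, hi]."""
    # The last block whose first key is strictly below lo may still contain lo,
    # so start there (conservative when keys repeat across block boundaries).
    start = max(bisect_left(sparse_index, lo) - 1, 0)
    # Any block whose first key is above hi cannot contain a match.
    end = bisect_right(sparse_index, hi)
    return list(range(start, end))


def range_scan(sorted_keys, sparse_index, lo, hi, block_size=BLOCK_SIZE):
    """Read only the candidate blocks and filter rows inside them."""
    hits = []
    for b in blocks_for_range(sparse_index, lo, hi):
        block = sorted_keys[b * block_size:(b + 1) * block_size]
        hits.extend(k for k in block if lo <= k <= hi)
    return hits


# Hypothetical time series keys (e.g. timestamps), stored in sorted order.
keys = list(range(0, 100, 5))
index = build_sparse_index(keys)
candidate_blocks = blocks_for_range(index, 22, 41)
matches = range_scan(keys, index, 22, 41)
```

In this toy data set, the query touches two of the five blocks rather than all of them, which is the kind of pruning that lets a small engine answer selective time series lookups quickly.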
Operational Impact
Quantitative Benefit