Developing Global Labor Market Intelligence at SkyHive Using Rockset and Databricks

Customer Company Size
Large Corporate
Country
  • United States
Product
  • SkyHive Skill Passport
  • SkyHive Enterprise
  • Rockset
  • Databricks
Tech Stack
  • Databricks
  • Rockset
  • Spark ETL
  • Delta Lake
Implementation Scale
  • Enterprise-wide Deployment
Impact Metrics
  • Productivity Improvements
  • Customer Satisfaction
  • Digital Expertise
Technology Category
  • Analytics & Modeling - Real Time Analytics
  • Platform as a Service (PaaS) - Data Management Platforms
  • Analytics & Modeling - Predictive Analytics
Applicable Industries
  • Software
  • Professional Service
Applicable Functions
  • Business Operation
  • Product Research & Development
Use Cases
  • Real-Time Location System (RTLS)
  • Remote Asset Management
Services
  • Data Science Services
  • Cloud Planning, Design & Implementation Services
  • System Integration
About The Customer
SkyHive is an end-to-end reskilling platform that automates skills assessment, identifies future talent needs, and fills skill gaps through targeted learning recommendations and job opportunities. The company collaborates with industry leaders such as Accenture and Workday and has been recognized by Gartner as a Cool Vendor in human capital management. SkyHive has built a Labor Market Intelligence database that stores profiles of 800 million anonymized workers and 40 million companies, along with 1.6 billion job descriptions from 150 countries. The platform ingests 16 TB of data daily from job postings and paid streaming data feeds, and applies complex analytics and machine learning to surface insights into global job trends. SkyHive is growing rapidly, adding 2-4 corporate customers daily on the strength of its data services and partnerships.
The Challenge
SkyHive found MongoDB too slow for analytical queries that combine data across jobs, resumes, courses, and geographies. Query latency was high, and the system struggled with multidimensional queries and joins, so it could not deliver the interactive performance users required. There were also limits on payload sizes and other hardcoded quirks, such as the inability to query certain countries like Great Britain. These issues kept SkyHive from delivering immediate results to customers, especially when searches expanded to non-English-speaking countries, where normalizing data across languages was problematic.
The Solution
To address these challenges, SkyHive moved from MongoDB to a real-time data stack built on Databricks and Rockset. Databricks was chosen for its compatibility with a wider range of tooling and its support for open data formats, which let SkyHive deploy a lakehouse architecture. In this architecture, data passes through three Delta Lake stages that progressively refine and enrich it for efficient storage and processing. Rockset was selected as the new user-facing serving database: it continuously synchronizes with the Gold layer data and builds an index for multidimensional analytics. With this setup SkyHive serves both pre-defined Query Lambdas and ad hoc free-text searches, returning real-time answers to complex queries. Together, Databricks and Rockset have significantly improved SkyHive's ability to handle large datasets, run ML models, and support complex queries with low latency, benefiting both internal operations and customer satisfaction.
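The staged refinement described above can be illustrated with a minimal sketch in plain Python. This is not SkyHive's pipeline: the case study only mentions three Delta Lake stages and a Gold layer, so the "silver" stage name, the sample records, the field names, and the functions `to_silver` and `to_gold` are all assumptions made for illustration. A production version would presumably run as Spark jobs over Delta tables rather than over in-memory lists.

```python
from collections import defaultdict

# Hypothetical raw job postings as they might land in the first (raw) stage.
RAW_POSTINGS = [
    {"id": "1", "title": " Data Engineer ", "country": "gb", "skills": "Spark; SQL"},
    {"id": "1", "title": " Data Engineer ", "country": "gb", "skills": "Spark; SQL"},  # duplicate
    {"id": "2", "title": "ML Engineer", "country": "US", "skills": "Python;Spark"},
    {"id": "3", "title": "", "country": "US", "skills": "SQL"},  # missing title
]


def to_silver(raw_rows):
    """Refine: drop duplicates and incomplete rows, normalize fields."""
    seen, silver = set(), []
    for row in raw_rows:
        title = row["title"].strip()
        if not title or row["id"] in seen:
            continue
        seen.add(row["id"])
        silver.append({
            "id": row["id"],
            "title": title,
            "country": row["country"].upper(),
            "skills": [s.strip() for s in row["skills"].split(";") if s.strip()],
        })
    return silver


def to_gold(silver_rows):
    """Enrich: aggregate skill demand per country, ready for a serving layer."""
    demand = defaultdict(int)
    for row in silver_rows:
        for skill in row["skills"]:
            demand[(row["country"], skill)] += 1
    return [
        {"country": country, "skill": skill, "postings": count}
        for (country, skill), count in sorted(demand.items())
    ]


silver = to_silver(RAW_POSTINGS)
gold = to_gold(silver)
```

In this sketch only the final, aggregated rows would be synchronized to the serving database, which keeps user-facing queries away from raw and partially cleaned data.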
Operational Impact
  • SkyHive successfully transitioned from MongoDB to a real-time data stack with Databricks and Rockset, improving query performance and handling large datasets efficiently.
  • The new architecture allows SkyHive to support complex queries on large-scale data, returning answers in milliseconds with minimal compute cost.
  • SkyHive can now provide real-time answers to customer queries, meeting sub-300 millisecond query time guarantees and enhancing customer satisfaction.
  • Rockset's SQL-to-REST API support makes it simpler to expose query results to applications, which shortens development time and benefits both internal operations and external sales.
  • SkyHive plans to expand its use of Rockset for geospatial queries and serving data to ML models, further streamlining its data architecture.
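To make the SQL-to-REST idea concrete, the sketch below builds (but does not send) an HTTP request for a saved, parameterized query. The URL path, the `ApiKey` authorization scheme, and the `parameters` body are assumptions modeled on the general shape of Rockset's Query Lambda REST endpoint; the server address, workspace, lambda name, tag, and parameters are hypothetical placeholders and are not taken from the case study.

```python
import json
import urllib.request


def build_query_lambda_request(api_server, api_key, workspace, name, tag, params):
    """Build a POST request that would execute a saved query by tag.

    `params` maps a parameter name to a (type, value) pair.
    """
    # Assumed endpoint shape; verify against the vendor's API reference.
    url = f"{api_server}/v1/orgs/self/ws/{workspace}/lambdas/{name}/tags/{tag}"
    body = {
        "parameters": [
            {"name": key, "type": type_, "value": value}
            for key, (type_, value) in params.items()
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"ApiKey {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# All values below are hypothetical placeholders.
request = build_query_lambda_request(
    api_server="https://api.example.invalid",
    api_key="PLACEHOLDER_KEY",
    workspace="labor_market",
    name="top_skills_by_country",
    tag="latest",
    params={"country": ("string", "GB"), "limit": ("int", "10")},
)
```

Because the SQL lives behind a named, versioned endpoint, an application in this design only needs to supply parameters, which is one plausible reason such an API shortens development time.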
Quantitative Benefit
  • SkyHive's database stores profiles of 800 million workers and 40 million companies.
  • The platform ingests 16 TB of data daily from various sources.
  • SkyHive adds 2-4 corporate customers every day.
  • Rockset can handle millions of queries a day, regardless of complexity.
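For a sense of scale, the daily figures above can be converted into average per-second rates. The sketch below assumes decimal terabytes (1 TB = 10^12 bytes) and, since the source says only "millions of queries a day", uses one million queries per day as an illustrative lower bound. Real traffic is unlikely to be evenly spread, so these are averages rather than peak rates.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(total_per_day):
    """Convert a daily total into an average per-second rate."""
    return total_per_day / SECONDS_PER_DAY


# 16 TB per day, assuming decimal units (1 TB = 10**12 bytes).
ingest_bytes_per_day = 16 * 10**12
ingest_mb_per_second = average_rate(ingest_bytes_per_day) / 10**6

# Illustrative lower bound for "millions of queries a day" (an assumption).
queries_per_day = 1_000_000
queries_per_second = average_rate(queries_per_day)

print(f"Average ingest: {ingest_mb_per_second:.1f} MB/s")
print(f"Average query load: {queries_per_second:.2f} queries/s")
```

Under these assumptions the pipeline sustains roughly 185 MB/s of ingest on average, and each additional million daily queries adds about 11.6 queries per second.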
