Customer Company Size
- Large Corporate
Country
- United States
Product
- SkyHive Skill Passport
- SkyHive Enterprise
- Rockset
- Databricks
Tech Stack
- Databricks
- Rockset
- Spark ETL
- Delta Lake
Implementation Scale
- Enterprise-wide Deployment
Impact Metrics
- Productivity Improvements
- Customer Satisfaction
- Digital Expertise
Technology Category
- Analytics & Modeling - Real Time Analytics
- Platform as a Service (PaaS) - Data Management Platforms
- Analytics & Modeling - Predictive Analytics
Applicable Industries
- Software
- Professional Service
Applicable Functions
- Business Operation
- Product Research & Development
Use Cases
- Real-Time Location System (RTLS)
- Remote Asset Management
Services
- Data Science Services
- Cloud Planning, Design & Implementation Services
- System Integration
About The Customer
SkyHive is an end-to-end reskilling platform that automates skills assessment, identifies future talent needs, and fills skill gaps through targeted learning recommendations and job opportunities. The company collaborates with industry leaders such as Accenture and Workday and has been recognized by Gartner as a Cool Vendor in human capital management. SkyHive has built a Labor Market Intelligence database that stores profiles of 800 million anonymized workers and 40 million companies, along with 1.6 billion job descriptions from 150 countries. The platform ingests 16 TB of data daily from job postings and paid streaming data feeds, and applies complex analytics and machine learning to provide insights into global job trends. SkyHive is growing rapidly, adding two to four corporate customers daily on the strength of its data-driven services and partnerships.
The Challenge
SkyHive struggled to run analytical queries on MongoDB. Complex analytics that combined data across jobs, resumes, courses, and different geographies were slow: query latency was high, and the system handled multidimensional queries and joins poorly, so it could not deliver the interactive performance users required. There were also limits on payload sizes and other hardcoded quirks, such as the inability to query certain countries like Great Britain. These issues hindered SkyHive's ability to return immediate results to customers, especially when searches were expanded to non-English-speaking countries, where normalizing data across different languages was problematic.
The Solution
To address the challenges with MongoDB, SkyHive moved to a real-time data stack built on Databricks and Rockset. Databricks was chosen for its compatibility with a broader range of tooling and its support for open data formats, which allowed SkyHive to deploy a lakehouse architecture. This architecture processes data through three Delta Lake stages, refining and enriching it at each stage for efficient storage and processing. Rockset was selected as the new user-facing serving database. It continuously synchronizes with the data in the Gold layer and builds an index for multidimensional analytics. With this setup, SkyHive serves both pre-defined Query Lambdas and ad hoc free-text searches, returning real-time answers to complex queries. The combination of Databricks and Rockset has significantly improved SkyHive's ability to handle large datasets, run ML models, and support complex queries with low latency, which benefits both internal operations and customer satisfaction.
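The staged refinement described above can be illustrated with a minimal sketch in plain Python. This is not SkyHive's pipeline and uses neither Spark nor Delta Lake. The stage names Bronze and Silver follow the common lakehouse naming convention and are assumptions here (the case study names only the Gold layer), and the record fields, sample data, and function names are all hypothetical.

```python
from collections import Counter

# Hypothetical raw job postings, standing in for the ingested feeds.
RAW_POSTINGS = [
    {"id": 1, "title": " Data Engineer ", "country": "us"},
    {"id": 2, "title": "data engineer", "country": "GB"},
    {"id": 2, "title": "data engineer", "country": "GB"},  # duplicate record
    {"id": 3, "title": "ML Engineer", "country": "gb"},
    {"id": 4, "title": "", "country": "US"},  # unusable: missing title
]


def to_bronze(raw):
    """Stage 1 (assumed name): keep raw records as-is, copied so later stages cannot mutate them."""
    return [dict(record) for record in raw]


def to_silver(bronze):
    """Stage 2 (assumed name): drop unusable rows, normalize fields, deduplicate by id."""
    seen, silver = set(), []
    for record in bronze:
        title = record["title"].strip().lower()
        if not title or record["id"] in seen:
            continue
        seen.add(record["id"])
        silver.append(
            {"id": record["id"], "title": title, "country": record["country"].upper()}
        )
    return silver


def to_gold(silver):
    """Stage 3: aggregate into a serving-ready count of postings per (country, title)."""
    return Counter((record["country"], record["title"]) for record in silver)


gold = to_gold(to_silver(to_bronze(RAW_POSTINGS)))
for (country, title), count in sorted(gold.items()):
    print(country, title, count)
```

In a setup like the one described, a serving database would index the output of the last stage so that user-facing queries never touch the raw data. Here the final `Counter` plays that role in miniature.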