Transforming Health Insurance: Collective Health's Data Integration with Delta Live Tables and Structured Streaming
Technology Category
- Platform as a Service (PaaS) - Application Development Platforms
Applicable Industries
- Healthcare & Hospitals
Applicable Functions
- Quality Assurance
Use Cases
- Usage-Based Insurance
Services
- System Integration
About The Customer
Collective Health is a technology company that aims to revolutionize health insurance. The company is not an insurance provider but a technology innovator that seeks to make health insurance work better for everyone, starting with the 155 million+ Americans covered by their employer. Collective Health has created a powerful, flexible infrastructure to simplify employer-led healthcare. The company's platform is data-powered and human-empowered, designed to be easy to use, provide people with an advocate in their health journey, and help employers manage costs. One of the offerings on their platform is the Premier Partner Program™, built on the Databricks Lakehouse Platform.
The Challenge
Collective Health faced a significant challenge in managing and integrating data from its many partners. Its mission to simplify employer-led healthcare and improve health outcomes required a robust, flexible infrastructure capable of handling vast amounts of data, but the existing data integration architecture could not keep up with evolving business requirements. Partner schemas changed constantly, and columns that previously contained data began arriving with null values. The company also needed to ingest files incrementally, without re-scanning every file previously ingested. The challenge was to find a solution that could handle these complexities while ensuring data quality and scalability.
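The core data-quality problem described above can be illustrated with a minimal, pure-Python sketch: records whose non-nullable columns arrive empty (or missing) must be detected and rejected. The field names here (`member_id`, `claim_amount`) are hypothetical examples, not Collective Health's actual schema.

```python
# Minimal sketch of the data-quality rule described above: records with
# null or missing values in non-nullable columns must be rejected.
# Field names are hypothetical examples.

NON_NULLABLE = {"member_id", "claim_amount"}

def is_valid(record: dict) -> bool:
    """Return True if every non-nullable field is present and not None."""
    return all(record.get(field) is not None for field in NON_NULLABLE)

records = [
    {"member_id": "M001", "claim_amount": 120.5},  # valid
    {"member_id": None, "claim_amount": 80.0},     # null in non-nullable column
    {"claim_amount": 15.0},                        # column dropped from schema
]

valid = [r for r in records if is_valid(r)]
dropped = [r for r in records if not is_valid(r)]
```

In a real pipeline this check runs at scale over partner files; the sketch only shows the validation predicate itself.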
The Solution
Collective Health turned to the Databricks Lakehouse Platform and Delta Live Tables to address these data integration challenges. The team defined a schema to set expectations with partners and used Apache Spark on Databricks to read batches of files from cloud storage, saving them into a Delta table along with the ingest date and source file name for future reference. To cope with the evolving schema and unexpected null values, they relied on Delta Live Tables, which provides validation tools, pipeline visualization, and a simple programmatic interface. They also adopted Structured Streaming with Databricks Auto Loader, which ingests new files as they arrive, much like an event-driven model, eliminating the need to keep compute resources running continuously. Delta Live Tables expectations were used to validate records and track occurrences of null values in non-nullable columns; records with insufficient data, or that could not be validated, were dropped.
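A pipeline along these lines can be sketched with Delta Live Tables and Auto Loader. This is an illustrative outline rather than Collective Health's actual code: the storage paths, file format, and column names are assumptions, and the code only runs inside a Databricks Delta Live Tables pipeline, where `spark` and the `dlt` module are provided.

```python
import dlt
from pyspark.sql.functions import current_timestamp, input_file_name

# Bronze table: Auto Loader ("cloudFiles") picks up only files that have
# not been ingested before, giving the incremental, event-driven behavior
# described above. Paths and format are hypothetical.
@dlt.table(comment="Raw partner files, ingested incrementally via Auto Loader")
def raw_partner_records():
    return (
        spark.readStream.format("cloudFiles")
        .option("cloudFiles.format", "csv")
        .option("cloudFiles.schemaLocation", "/mnt/partner/_schemas")
        .load("/mnt/partner/landing/")
        # Keep the ingest date and source file name for future reference.
        .withColumn("ingest_date", current_timestamp())
        .withColumn("source_file", input_file_name())
    )

# Silver table: expectations validate each record and drop those with null
# values in non-nullable columns; DLT tracks how often each rule fires.
# Column names are hypothetical.
@dlt.table(comment="Validated partner records")
@dlt.expect_or_drop("member_id_present", "member_id IS NOT NULL")
@dlt.expect_or_drop("claim_amount_present", "claim_amount IS NOT NULL")
def validated_partner_records():
    return dlt.read_stream("raw_partner_records")
```

Because Auto Loader checkpoints which files it has processed, no always-on compute is needed to watch the landing path, and the `expect_or_drop` rules surface null-value counts in the pipeline's quality metrics.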