Technology Category
- Application Infrastructure & Middleware - Database Management & Storage
- Infrastructure as a Service (IaaS) - Cloud Storage Services
Applicable Industries
- Equipment & Machinery
- Retail
Use Cases
- Intrusion Detection Systems
- Time Sensitive Networking
Services
- System Integration
About The Customer
Resmo is a tool that collects configuration data from cloud and SaaS tools through their APIs and lets users explore that data with SQL. It ships with thousands of pre-built SQL-based rules and questions, and it also supports visual exploration of the collected data through filters, free-text search, and graphs. Customers can create their own rules or use automation to receive notifications through various channels when the data or a rule's status changes. Resmo's data collection generates more than 300 million spans per day, and that number is rising rapidly as the customer base grows.
The Challenge
Resmo faced a significant challenge in managing the large volume of network calls that result from collecting data from thousands of APIs. Traditional logs were too verbose and difficult to query, while aggregated metrics lacked the context needed to detect and diagnose specific issues. Resmo adopted tracing, which gave a better view of the flow of requests and their responses. However, the volume of spans generated by Resmo's data collection was very large, and the usual remedy, sampling, could create blind spots that make it hard to identify issues on rarely executed non-happy paths. Furthermore, many vendors charge by the number of ingested events and by data volume (per GB), which becomes costly without sampling, and only a few vendors allow custom SQL queries on the data.
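To make the sampling blind spot concrete, here is a minimal sketch in Python. It assumes uniform, independent sampling per trace, which is a simplification and not a description of any particular vendor's sampler. The sample rates and the count of five rare occurrences are illustrative values, not figures from Resmo. Only the 300 million spans per day comes from the text.

```python
SPANS_PER_DAY = 300_000_000  # figure quoted in the case study
SECONDS_PER_DAY = 24 * 60 * 60


def average_spans_per_second(spans_per_day: int = SPANS_PER_DAY) -> float:
    """Average ingest rate implied by a daily span count."""
    return spans_per_day / SECONDS_PER_DAY


def probability_all_missed(sample_rate: float, occurrences: int) -> float:
    """Chance that none of `occurrences` rare traces is kept.

    Assumes each trace is kept independently with probability
    `sample_rate` (an illustrative model of uniform sampling).
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    if occurrences < 0:
        raise ValueError("occurrences must be non-negative")
    return (1.0 - sample_rate) ** occurrences


if __name__ == "__main__":
    print(f"average load: {average_spans_per_second():.0f} spans/s")
    # Hypothetical scenario: a failure path that is hit five times a day.
    for rate in (0.01, 0.10, 1.0):
        missed = probability_all_missed(rate, occurrences=5)
        print(f"sample rate {rate:.0%}: P(all 5 rare traces dropped) = {missed:.3f}")
```

Under this model, a low sample rate leaves a high chance that a rarely hit failure path never appears in the stored traces, while a sample rate of 100% (no sampling) drops nothing.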
The Solution
Resmo decided to use full tracing (no sampling) with OpenTelemetry and ClickHouse for cost-effective, efficient storage and querying of traces. They initially considered S3 and Athena, but Athena's fixed startup delay of 2-3 seconds was a drawback. They hosted their own ClickHouse instance, which let them store more than 4 billion spans at a 92% compression rate. To speed up common queries, they added materialized columns for the fields used most often in queries, monitors, and dashboards. They used the out-of-the-box configuration of the OpenTelemetry Collector with ClickHouse and the Java agent for distributed tracing, and added manual instrumentation in the form of context-specific tags on their spans. They connected ClickHouse to Postgres for their observability queries, joining the user and tenant IDs in their spans to the actual account names and account statuses in the Postgres database. For visualization they used Grafana, and for writing queries they used IntelliJ IDEA and DataGrip.
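The materialized columns and the Postgres join could look roughly like the SQL generated by the Python sketch below. Everything specific here is an assumption made for illustration: the table name `otel_traces`, its `SpanAttributes` map and `Timestamp` column, the attribute key `tenant.id`, the Postgres table `accounts` with `id`, `name`, and `status` columns, and the choice of ClickHouse's `postgresql` table function as the connection mechanism. The case study does not state Resmo's actual schema or how the two databases were linked.

```python
import re

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_ATTRIBUTE_KEY = re.compile(r"^[A-Za-z0-9_.]+$")


def materialized_column_ddl(table: str, column: str, attribute_key: str) -> str:
    """Build ClickHouse DDL that copies one span attribute into its own column.

    Assumes the spans table has a Map column named SpanAttributes
    (hypothetical schema for this sketch).
    """
    if not _IDENTIFIER.match(table) or not _IDENTIFIER.match(column):
        raise ValueError("table and column must be plain identifiers")
    if not _ATTRIBUTE_KEY.match(attribute_key):
        raise ValueError("unexpected characters in attribute key")
    return (
        f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} String "
        f"MATERIALIZED SpanAttributes['{attribute_key}']"
    )


def spans_per_account_query(pg_address: str, pg_database: str,
                            pg_user: str, pg_password: str) -> str:
    """Build a query joining span counts to account names held in Postgres.

    Uses the ClickHouse `postgresql` table function; the `accounts` table
    and its columns are hypothetical. Credentials are inlined only to keep
    the sketch short.
    """
    return (
        "SELECT a.name AS account_name, a.status AS account_status, "
        "count() AS span_count\n"
        "FROM otel_traces AS t\n"
        f"INNER JOIN postgresql('{pg_address}', '{pg_database}', 'accounts', "
        f"'{pg_user}', '{pg_password}') AS a\n"
        "    ON t.tenant_id = toString(a.id)\n"
        "WHERE t.Timestamp >= now() - INTERVAL 1 DAY\n"
        "GROUP BY account_name, account_status\n"
        "ORDER BY span_count DESC"
    )


if __name__ == "__main__":
    print(materialized_column_ddl("otel_traces", "tenant_id", "tenant.id"))
    print(spans_per_account_query("pg.example.internal:5432", "app", "reader", "secret"))
```

The point of the materialized column is that a value such as the tenant ID is extracted once at insert time, so dashboards and monitors filter on a plain column instead of looking up a map key on every query.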
Operational Impact
Quantitative Benefit
Related Case Studies

Case Study
Smart Water Filtration Systems
Before working with Ayla Networks, Ozner was already using cloud connectivity to identify and solve water-filtration system malfunctions as well as to monitor filter cartridges for replacement. But in June 2015, Ozner executives talked with Ayla about how the company might further improve its water systems with IoT technology. They liked what they heard from Ayla, but the executives needed to be sure that Ayla’s Agile IoT Platform provided the security and reliability Ozner required.

Case Study
IoT enabled Fleet Management with MindSphere
In view of growing competition, Gämmerler had a strong need to remain competitive through process optimization, reliability, and gentle handling of printed products, even at the highest press speeds. In addition, a digitalization initiative included developing a key differentiator through data-driven service offerings.

Case Study
Predictive Maintenance for Industrial Chillers
For global leaders in industrial chiller manufacturing, reliability of the entire production process is of the utmost importance. Chillers are refrigeration systems that produce ice water to provide cooling for a process or industrial application. One of those leaders sought a way to respond to asset performance issues even before they occur. The intelligence to guarantee maximum reliability of cooling devices is embedded (pre-alarming). A pre-alarming phase means that the cooling device still works but symptoms may appear, telling manufacturers that a failure is likely in the near future. Chillers that are not connected to the internet at that moment provide little insight into this pre-alarming phase.

Case Study
Premium Appliance Producer Innovates with Internet of Everything
Sub-Zero faced the largest product launch in the company’s history: it wanted to launch 60 new products as scheduled while simultaneously opening a new “greenfield” production facility, all while adhering to stringent quality requirements and managing issues from new supply-chain partners. At the same time, it wanted to increase staff productivity and collaboration while reducing travel and costs.

Case Study
Integration of PLC with IoT for Bosch Rexroth
The application arises from the need to monitor and anticipate problems in one or more machines managed by a PLC. These problems often result from the accumulation of small discrepancies over time and, when they occur, require after-the-fact technical maintenance.

Case Study
Data Gathering Solution for Joy Global
Joy Global's existing business processes required customers to work through an unstable legacy system to collect mass volumes of data. With inadequate processes and tools, field level analytics were not sufficient to properly inform business decisions.