Technology Category
- Sensors - Flow Meters
- Sensors - Liquid Detection Sensors
Applicable Industries
- Cement
- Education
Applicable Functions
- Product Research & Development
- Quality Assurance
Use Cases
- Chatbots
- Machine Translation
Services
- Data Science Services
- Training
About The Customer
The Center for Security and Emerging Technology (CSET) is a policy research organization within Georgetown University’s Walsh School of Foreign Service. It produces data-driven research on security and technology and provides non-partisan analysis to the policy community. CSET is committed to preparing a new generation of decision-makers to address the challenges and opportunities of emerging technologies such as artificial intelligence, advanced computing, and biotechnology. It provides unprecedented coverage of the emerging technology ecosystem and its security implications, bolstered by novel methods to classify and analyze research and technical outputs from diverse sources, including foreign-language materials.
The Challenge
The Center for Security and Emerging Technology (CSET) at Georgetown University needed to build NLP applications that classify complex research documents, with the goal of surfacing scientific articles of analytic interest to inform data-driven policy recommendations. A large-scale manual labeling effort was impractical. The team first experimented with the Snorkel Research Project, which let them programmatically label 90K data points within weeks and achieve 77% precision. However, collaboration between data scientists and subject-matter experts ran through spreadsheets, Slack channels, and Python scripts, which was time-consuming and made improving data and model quality slow. The tooling was inefficient for auto-labeling, gave limited visibility into the data, and offered little support for improving training data and model quality. Without an integrated feedback loop from model training and analysis back to labeling, data scientists and subject-matter experts spent long cycles re-labeling data to match evolving business criteria. These constraints limited the team’s capacity to deliver production-grade models, shorten project timelines, and take on more projects.
The Solution
CSET's data scientists attended Snorkel's conference, The Future of Data-centric AI, and decided to explore Snorkel Flow, a data-centric AI platform, as a potential solution. Snorkel Flow drastically reduced labeling, model training, and iteration time, and better equipped CSET’s data science team to collaborate closely with analysts to gather, process, and interpret data at scale. The team created 60+ labeling functions (LFs) to programmatically label 107K data points, using advanced features such as keyword LFs, auto-suggest LFs, and cluster LFs. They also used embedding similarity and negative sampling to improve the representation of the negative class. Snorkel Flow let the team pinpoint data slices for domain-expert spot-checks and troubleshooting to improve accuracy, powering an active learning workflow. The platform also improved collaboration between domain experts and data scientists through an easy-to-use GUI for authoring LFs and through comments and tags for discussing and resolving complex cases efficiently. Advanced LFs based on foundation-model embedding distances and clustering increased productivity, while guided error analysis and active-learning prioritization of examples for targeted manual review reduced the time needed to adapt.
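To make the idea of keyword labeling functions concrete, here is a minimal sketch in plain Python. It is not Snorkel Flow's API and not CSET's actual LFs: the function names, keywords, and the `majority_label` helper are all hypothetical, and a simple majority vote stands in for whatever label aggregation the platform performs.

```python
from collections import Counter

# Label values; ABSTAIN means a labeling function has no opinion.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def lf_keyword_on_topic(text):
    """Hypothetical keyword LF: vote POSITIVE on topic keywords."""
    keywords = ("neural network", "machine learning")
    return POSITIVE if any(k in text.lower() for k in keywords) else ABSTAIN


def lf_keyword_off_topic(text):
    """Hypothetical keyword LF: vote NEGATIVE on off-topic keywords."""
    keywords = ("recipe", "football")
    return NEGATIVE if any(k in text.lower() for k in keywords) else ABSTAIN


def lf_too_short(text):
    """Hypothetical heuristic LF: very short texts are unlikely to be articles."""
    return NEGATIVE if len(text.split()) < 4 else ABSTAIN


LFS = [lf_keyword_on_topic, lf_keyword_off_topic, lf_too_short]


def majority_label(text, lfs=LFS):
    """Combine LF votes by majority; abstain when there are no votes or a tie."""
    votes = [v for v in (lf(text) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes)
    top_label, top_count = counts.most_common(1)[0]
    # If more than one label shares the highest count, the vote is tied.
    if sum(1 for c in counts.values() if c == top_count) > 1:
        return ABSTAIN
    return top_label


if __name__ == "__main__":
    docs = [
        "A neural network model for translating policy documents",
        "Football recipe",
        "An overview of medieval trade routes",
    ]
    for doc in docs:
        print(majority_label(doc), doc)
```

Documents on which every function abstains, or on which the votes tie, stay unlabeled; those are natural candidates for the kind of targeted expert review described above.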