Enhancing Data Quality in Higher Education: AI-Driven Automated Data Cleaning

In higher education, where data drives strategic decisions and student success, the quality of that data can make or break outcomes. Yet manually entered information, especially from student surveys, is often riddled with typos, inconsistencies, and ambiguous responses that distort analysis and delay critical insights. Data professionals are often said to spend up to 80% of their time cleaning data, so institutions need smarter, faster, and more reliable solutions. This session introduces an AI-powered approach to that challenge. We will showcase a Python-based prototype that combines OpenAI’s ChatGPT large language model with the FuzzyWuzzy library for fuzzy string matching. Together, they detect and correct messy categorical data, transforming raw student feedback into clean, standardized inputs ready for immediate analysis. Attendees will see how this solution improves data accuracy and consistency, reduces cleanup time and the burden on analysts, and helps institutions act on student insights with greater speed and confidence.
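The fuzzy-matching half of such a pipeline can be sketched in a few lines. This is a minimal illustration, not the presenters' prototype: it uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for FuzzyWuzzy's ratio scoring, and the category list, the threshold of 80, and the `fallback` hook (where a language-model call could be plugged in for ambiguous responses) are all assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical canonical categories; a real survey would supply its own list.
CANONICAL_MAJORS = [
    "Computer Science",
    "Biology",
    "Psychology",
    "Mechanical Engineering",
]


def similarity(a, b):
    """Return a 0-100 similarity score, ignoring case and outer whitespace."""
    matcher = SequenceMatcher(None, a.lower().strip(), b.lower().strip())
    return round(100 * matcher.ratio())


def clean_value(raw, choices, threshold=80, fallback=None):
    """Map a free-text response to the closest canonical category.

    If the best score is below `threshold`, hand the value to `fallback`
    (for example, a function that asks a language model to classify it).
    With no fallback, return None so the row can be reviewed by hand.
    """
    best = max(choices, key=lambda choice: similarity(raw, choice))
    if similarity(raw, best) >= threshold:
        return best
    if fallback is not None:
        return fallback(raw, choices)
    return None


if __name__ == "__main__":
    for response in ["Compter Sciense", " biolgy", "asdf"]:
        print(repr(response), "->", clean_value(response, CANONICAL_MAJORS))
```

Keeping the threshold explicit means near-certain typos are fixed cheaply by string matching, and only the leftover ambiguous values need the slower, costlier model call or a human look.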
 
Date and Time: November 13, 2025, 2:00 AM – 2:45 AM
Type: Breakout Session
Cost: $299
Organizer: EDUCAUSE
To learn more and register, please visit: https://events.educause.edu/annual-conference/2025-online/agenda/enhancing-data-quality-in-higher-education-aidriven-automated-data-cleaning