University of Rochester Under-Resourced NLP Lab
We develop methods to improve NLP tools for low-resource languages—those lacking the abundant data needed to train modern machine learning models.
Most ML approaches require vast amounts of text (hundreds of gigabytes), available only for high-resource languages like English and Chinese. This leaves the majority of the world's languages behind, undermining the vital role these systems can play in tools like keyboard autocorrect, speech recognition, and machine translation—tools that help languages thrive in the digital era.
To address this gap, we focus on machine learning techniques that work with limited data:
Prof. Downey gave an invited talk at the University at Buffalo Department of Linguistics on computational tools for under-resourced and endangered languages.
Work by Fei-Yueh Chen (MS Linguistics), Lateef Adeleke (PhD Linguistics), and C.M. Downey on linguistically informed evaluation of multilingual ASR for African languages has been selected to appear at the AfricaNLP workshop.
Our project on rapid adaptation of ASR models in data-scarce scenarios received 5,000 service units on Empire AI Beta.
Welcome to Ifeoma Okoh, a new PhD student working on low-resource NLP!
MS Student
Multilingual speech technology, targeted transfer learning
A modular toolkit for adapting multilingual language models to new languages. Handles dataset mixing, vocabulary replacement, and embedding initialization so you can focus on your language, not the infrastructure.
Exploring methods for rapidly adapting speech recognition models to endangered and under-documented languages, drawing on recordings from the Endangered Languages Archive.
For a complete list of publications, see lab members' individual pages.
We're always looking for students motivated by low-resource language technology!
If you're interested in joining the lab as a PhD student, Professor Downey is currently accepting students through the PhD in Linguistics. Please mention Professor Downey and/or UR2NLP in your application.
Undergraduate and MS students interested in research opportunities are encouraged to reach out to c.m.downey@rochester.edu.