Loading…
Loading grant details…
| Funder | National Science Foundation (US) |
|---|---|
| Recipient Organization | University of Washington |
| Country | United States |
| Start Date | May 01, 2022 |
| End Date | Apr 30, 2026 |
| Duration | 1,460 days |
| Number of Grantees | 1 |
| Roles | Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2202049 |
The lack of reading proficiency seen in children of underserved school districts has lasting impacts on students’ performances in various subjects. Low literacy is an especially pressing issue for African American students. Interactive spoken language systems offer the possibility of a powerful tool for assisting in early childhood education, freeing up teachers’ time, and engaging students in repeated opportunities for learning.
These systems involve both Automatic Speech Recognition and Text-to-Speech Systems. The goal of this research is to improve the performance of such systems for young speakers of African American English (AAE) such that automated oral literacy assessment can be developed. The research has important societal and technological impacts.
It will enhance the usability of speech technology in early education for AAE speaking children, providing a model for better supporting students with diverse dialects. Many under-resourced children do not have access to adequate reading and language assessments, and the proposed work will address these issues by creating methods for adapting spoken language technology to AAE children, increasing fairness in speech technology on a broader scale.
The work has strong outreach and dissemination programs and will train undergraduate and graduate students in interdisciplinary research in Electrical and Computer Engineering, Linguistics, Education, and Psychology.
Challenges facing children’s Automatic Speech Recognition (ASR) are due to (1) lack of child speech data and, hence, current models used for recognition are trained using data collected from adult speakers, and (2) children display a wider range of intra- and inter- speaker variability than adults. ASR performance is especially poor for children who are non-native English speakers or those who at times transition into dialects such as AAE that are different from what ASR systems are typically trained on.
In addition, most dialog systems built on text-to-speech (TTS) technology are designed using General American English (GAE) voices, which minority children may not identify with. In the high-stakes area of education, these considerations impact the effectiveness of technology for different groups. The work will utilize a new and continuously developing database of AAE children's speech to research the impact of spoken language systems on children’s learning outcomes.
On the learning side, the research will highlight the impact of dialect on literacy assessment. On the technology side, the work will yield novel machine learning algorithms for low-resource tasks. Specifically, this project will develop data augmentation techniques that can increase the amount of training data available for low-resource tasks, and data normalization techniques so that ASR performance is improved for AAE child speakers.
The work on TTS will explore new methods of disentangling speaker and dialect impacts on spectral realization of phrases that model dialect density (rather than treating dialect as a categorical variable) and separately accounting for pronunciation and prosodic factors. Methods found to be effective for TTS will be leveraged in the data augmentation work for ASR and explored as a diagnostic in literacy assessment.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
University of Washington
Complete our application form to express your interest and we'll guide you through the process.
Apply for This Grant