| Field | Value |
|---|---|
| Funder | National Science Foundation (US) |
| Recipient Organization | George Mason University |
| Country | United States |
| Start Date | Sep 01, 2021 |
| End Date | Feb 28, 2026 |
| Duration | 1,641 days |
| Number of Grantees | 2 |
| Roles | Principal Investigator; Co-Principal Investigator |
| Data Source | National Science Foundation (US) |
| Grant ID | 2109578 |
Documentation of languages, especially endangered languages, is crucial for conserving humanity’s knowledge and cultural heritage, as well as for advancing the understanding of human language. Traditional documentation methods produce invaluable materials such as grammars, dictionaries, and annotated texts, but they take too long to keep pace with current rates of language extinction.
The most constructive response to this crisis is to complement documentation efforts by collecting data for as many languages as possible now and making them accessible and interpretable so that they can be studied later by both linguists and members of the language communities. Digital technologies make it practical to obtain many hours of recordings in an endangered language along with translations.
This project advances technologies for analyzing the recordings at the sub-word, word, and clause levels so that they become accessible for a wide variety of documentary purposes.
The project makes the information in digital recordings more interpretable for further linguistic analysis in three ways. First, the team is devising computational methods to automatically derive a basic phonological understanding and produce phonetic representations for languages, even if they do not have an established writing system. Second, the team is developing methods to automatically analyze the internal structure of words in languages where this structure is highly complex.
Third, the team uses knowledge of more widely spoken languages to analyze related endangered languages. The resulting tool, the AI-helper toolbox, will be packaged with software that is already widely used by linguists and language communities in the language documentation process. All tools will be accessible through a web-based interface, and the source code will be publicly available through GitHub.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.