NEH Funded Project

Grant number: HT-272570-20

Princeton University (Princeton, NJ 08540-5228)
Natalia Ermolaev (Project Director: March 2020 to present)
Andrew Janco (Co-Project Director: July 2020 to present)

Institutes for Advanced Topics in the Digital Humanities
Digital Humanities

$239,983 (approved)
$237,034 (awarded)

Grant period:
9/1/2020 – 8/31/2023

New Languages for NLP: Building Linguistic Diversity in the Digital Humanities

An institute to help humanities scholars learn how to create linguistic data and apply statistical models to new languages.

Natural Language Processing (NLP) has revolutionized our ability to interpret texts at scale and is an essential tool for scholars in the digital humanities. However, only a small percentage of the world’s languages are supported by the major NLP libraries. The New Languages for NLP Institute will help scholars with expertise in less-resourced languages to create linguistic data and train NLP models for their languages. In three workshops, held at the Center for Digital Humanities at Princeton University in 2021-2022, participants will create linguistic data and train statistical language models for new languages. They will also learn best practices in project and research data management. As an outcome of the project, participants will publish an open dataset in the standard Conference on Computational Natural Language Learning (CoNLL) format as well as a trained language model that can be used for computational text analysis.