Funded Projects Query Form

HAA-271654-20

Regents of the University of California, Berkeley (Berkeley, CA 94704-5940)
David Bamman (Project Director: January 2020 to present)
Multilingual BookNLP: Building a Literary NLP Pipeline Across Languages

The expansion of BookNLP, a platform for studying the linguistic structure of textual materials, to support the analysis of resources in Spanish, Japanese, Russian and German.

BookNLP (Bamman et al., 2014) is a natural language processing pipeline for reasoning about the linguistic structure of book-length texts, designed specifically for works of fiction. In addition to its pipeline of part-of-speech tagging, named entity recognition, and coreference resolution, BookNLP identifies the characters in a literary text and represents them through the actions they participate in, the objects they possess, their attributes, and their dialogue. The availability of this tool has driven much work in the computational humanities, especially work on character (Underwood et al., 2018; Kraicer and Piper, 2018; Dubnicek et al., 2018). BookNLP has one major limitation, however: it currently supports only texts written in English. The goal of this project is to develop a version of BookNLP that supports literature in Spanish, Japanese, Russian and German, and to create a blueprint that others can follow to extend it to additional languages in the future.

Project fields:
Interdisciplinary Studies, General

Program:
Digital Humanities Advancement Grants

Division:
Digital Humanities

Totals:
$324,874 (approved)
$292,054 (awarded)

Grant period:
9/1/2020 – 8/31/2023