Development of a text analysis tool for examining and visualizing grammatical and stylistic features to assist authorship identification.
Increasing numbers of primary and secondary source texts have been digitized in recent years. Because these collections are so large, scholars who want to study them in depth need computational assistance. The text analysis tools currently available to non-programmers operate at the word level: they show tables of counts and lists of occurrences, but rarely offer interactive visualizations. We propose to build a text analysis tool that includes visualizations and works on the grammatical structure and stylistic features of text, applying highly accurate technology from computational linguistics and authorship identification to extract this information. We will develop our tool for a collection of slave narratives whose authorship is ambiguous. In doing so, we will find out whether visualizations of grammatical and stylistic features are useful to literary scholars, and whether this information allows them to make satisfying large-scale analyses of their texts.