Research on methods to generate a dynamic lexicon for a text corpus in a digital library. Using Greek and Latin texts, the project would investigate processes to enumerate possible senses for the words being defined and provide detailed syntactic information and statistical data about their use in a corpus.
We propose to research core functions for the automatic analysis of historical languages (Greek and Latin) within an emerging cyberinfrastructure. We will research three technologies for building a dynamic lexicon, as well as the processes required to automatically create such a reference work for any textual collection. Our efforts will focus on parallel text analysis, word sense induction and disambiguation, as well as syntactic parsing. These technologies will enable us to create a reference work that lists the possible senses for a word while also providing syntactic information and statistical data about its use in a corpus. The methods we use to create this work will let users search a text not only by word form, but also by word sense, syntactic subcategorization and selectional preference. Our main contribution will be the steps that any digital library needs to take to dynamically create a reference work of its own and interface it with the texts in its collection.