Consultation with staff from the National Energy Research Scientific Computing Center to investigate the development of dynamic lexica for Latin and ancient Greek.
The Perseus Project recently received funding from the National Endowment for the Humanities to investigate how "dynamic lexica" for historical languages (specifically Latin and Greek) can be constructed automatically, as the output of processes based on both supervised and unsupervised learning techniques. We are seeking NEH/NERSC supercomputing support and training for two reasons: (1) to significantly reduce training time for two automatic processes already under development (automatic parsing and parallel text alignment), so that our future development and optimization can be more agile; and (2) to begin experimenting with approaches that are not available to us without such resources (such as a hybrid approach to word sense disambiguation involving labeled sense induction and clustering). In this way we hope not only to improve upon our existing methods but also to explore the possibility of innovative new work.