
Caroline Sporleder: Computational Linguistics and Digital Humanities: Chances and Challenges


Talk by Caroline Sporleder, given as part of the colloquium "Crossing borders: Three talks on Text Analysis and Digital Humanities" organized by the LATTICE laboratory.

Computational Linguistics and Digital Humanities: Chances and Challenges

Digital Humanities (DH) is a field that has grown immensely in recent years. It is also a very diverse field, covering (in its broadest definition) everything from corpus linguistics and computational philology to quantitative history and computational archaeology.
Because the field is rooted in corpus linguistics and computational philology, and because data in the Humanities and Social Sciences are often (but not always) textual, digital text representation, processing, and mining are a major area of attention. Computational linguistics (CL) has a lot to contribute here, both at the lower end of the scale (e.g., tools for OCR error correction and preprocessing) and at the higher end (e.g., sophisticated text mining tools). Computational linguistics can in turn benefit from evaluating its algorithms and tools on data from the Humanities, as these data are often difficult, e.g., due to non-standard language and spelling, missing sentence boundaries, noisy input, and domains that differ from those typically considered in CL. Hence, CL for DH requires the development of very robust methods that work well on noisy data and do not require large amounts of training data. In this talk, I will address some of the opportunities and challenges that arise when applying computational linguistic methods to data from the Humanities and Social Sciences.

Watch online: Link to the video