Language Technologies for Digital Humanities and Cultural Heritage

Workshop associated with the RANLP 2011 Conference

15-16 September 2011, Hissar, Bulgaria

NEW: Invited Talk: Gábor Prószéky, Morphologic / Hungary
Workshop Programme is available

Following several digitization campaigns in recent years, a large number of printed books, manuscripts and archaeological objects have become available in digital form to a broader public through web portals and associated infrastructures. These infrastructures not only enable virtual research and easier access to materials regardless of their physical location, but also play a major role in long-term preservation and exploration. Moreover, access to digital materials opens new possibilities for textual research, such as synchronous browsing of several materials, extraction of passages relevant to a certain event from different sources, rapid search through thousands of pages, categorisation of sources, and multilingual retrieval and support.

Methods from Language Technology are therefore much needed to ensure the extraction of content-related semantic metadata and the analysis of textual materials. Several initiatives in Europe, such as CLARIN and DARIAH, aim to foster the application of language technology in the humanities. Through such initiatives, as well as many other research projects, awareness of such methods in the humanities has risen considerably. However, considerable potential remains untapped on both sides:

  • on the one hand, there are research tracks in the humanities which still do not sufficiently and effectively exploit language technology solutions;
  • on the other hand, there are many languages, especially historical variants of languages, for which the available tools and resources still have to be developed or adapted to serve humanities applications.

The workshop aims to bring together researchers from the humanities and language technology and to foster progress in the above-mentioned directions.

We are particularly interested in (but not restricted to):

  • language tools and resources for analysis of old textual material or language variants
  • (semi-)automatic extraction of content-related metadata
  • semantic linkage of heterogeneous data within digital libraries
  • multilingual applications in digital libraries
  • pilot applications in humanities using language technology methods

