Unlocking language archives using search
Stehouwer, H., & Auer, E.
Unlocking language archives using search. In C. Vertan, M. Slavcheva, P. Osenova, & S. Piperidis (Eds.), Proceedings of the Workshop on Language Technologies for Digital Humanities and Cultural Heritage, Hissar, Bulgaria, 16 September 2011 (pp. 19-26). Shoumen, Bulgaria: Incoma Ltd.
The Language Archive manages one of the largest and most varied sets of natural language data. This data consists of video and audio enriched with annotations. It is available for more than 250 languages, many of which are endangered.
Researchers need convenient and efficient access to this data. To meet this need, we provide several browse and search methods, which have been developed and expanded over the years. Metadata search and content-oriented search can be combined for a more focused search.
This article aims to provide a complete overview of the available search mechanisms, with a focus on annotation content search, including a benchmark.