The Language Archive and its solutions for sustainable endangered languages corpora
Drude, S., Broeder, D., & Trilsbeek, P.
The Language Archive and its solutions for sustainable endangered languages corpora. Book 2.0, 4
, 5-20. doi:10.1386/btwo.4.1-2.5_1.
Since the late 1990s, the technical group at the Max-Planck-Institute for Psycholinguistics has worked on solutions for important challenges in building sustainable data archives, in particular, how to guarantee long-time-availability of digital research data for future research.
The support for the well-known DOBES (Documentation of Endangered Languages) programme has greatly inspired and advanced this work, and lead to the ongoing development of a whole suite of tools for annotating, cataloguing and archiving multi-media data. At the core of the LAT (Language Archiving Technology) tools is the IMDI metadata schema, now being integrated into a larger network of digital resources in the European CLARIN project. The multi-media annotator ELAN (with its web-based cousin ANNEX) is now well known not only among documentary linguists.
We aim at presenting an overview of the solutions, both achieved and in development, for creating and exploiting sustainable digital data, in particular in the area of documenting languages and cultures, and their interfaces with related other developments