Language archiving technology at the MPI

Ringersma, J., Trilsbeek, P., & Wittenburg, P. (2007). Language archiving technology at the MPI. Poster presented at 11th International Conference on Information Visualization, Zurich.
The repository of the MPI contains different types of linguistic material: the DOBES endangered languages archive, the ESF second learner corpus, the Dutch Spoken National Corpus, MPI's gesture corpora, MPI acquisition corpora and the language documentation of MPI's language and cognition research group. The archive covers more than 200,000 objects, mostly organized in sessions that are described with IMDI-based metadata. Most of these sessions contain digitized audio/video signals and layers of annotations. In general, access to these resources is restricted and is granted upon request. The Language Archiving Technology (LAT) is meant to contribute to the archive infrastructure. It focuses on open accessibility of the language resources; it supports dynamic and continuously enriched collections according to the Live Archives ideas; it stresses the need for long-term archiving of our digital collections, which cover unique material about languages that will probably be extinct in a few decades; and it follows the trend towards service-oriented architectures. LAT components consist of data management and ingestion tools (IMDI, LAMUS and AMS) and of archive enrichment and visualization tools (ELAN, ANNEX and LEXUS). The tools are developed and maintained by the Technical Group of the MPI. All LAT products are or will become available under an Open Source license and will be free of charge for academic research.