The Language Archive

The Language Archive (TLA) is an integral part of the Max Planck Institute for Psycholinguistics in Nijmegen. It was established in the late 1990s to archive language corpora collected by the MPI’s Language and Cognition and Language Acquisition researchers. In 2000, it became the central archive for the DOBES language documentation programme, funded by the Volkswagen Foundation.

What kind of data is stored in The Language Archive?

Currently, TLA contains more than 350 collections covering over 250 languages spoken around the world. These include:

  • Languages from around the world studied by MPI field linguists
  • Language Development/Language Acquisition corpora
  • Rich multimedia language documentation corpora of endangered languages
  • The CGN (Spoken Dutch) corpus
  • Several Sign Language corpora


More information about the data stored in The Language Archive, and about how to access it, can be found on The Language Archive website.


Paul Trilsbeek

Research Data Manager
Technical Group
+31 24 3521203
Paul [dot] Trilsbeek [at] mpi [dot] nl
