Data and corpora
This page gives an overview of the data archives and language corpora which can be accessed through the MPI:
Browsable corpora at the MPI
Language corpora consisting of data collected within the framework of MPI projects, as well as data collected earlier and now stored in the MPI archives. more >
DoBeS
The DoBeS (Documentation of Endangered Languages) program is financed by the German Volkswagen Stiftung. The aim of the program is twofold: (1) to document languages that are on the verge of disappearing and (2) to provide a persistent, long-lasting archive of the documentation material. Currently over 40 languages are being documented and archived. more >
Geographical Browsing of language sites
Use geographic browsing to explore a collection of places representing various research locations of the Max Planck Institute for Psycholinguistics and other linguistic archives. Download the Google Earth Overlay
The Corpus NGT
The Corpus NGT is a collection of data from deaf signers using Sign Language of the Netherlands (NGT). The data consist of recordings made with multiple synchronized video cameras, accompanied by gloss and translation annotations. All data are freely accessible to researchers and the general public. The project is carried out by Onno Crasborn, Inge Zwitserlood and Johan Ros at Radboud University. The data are stored in the MPI archive for linguistic resources. more >
Database of Dutch diphone perception
This database is described in: Smits, R., Warner, N., McQueen, J.M. & Cutler, A. (2003), Unfolding of phonetic information over time: A database of Dutch diphone perception, Journal of the Acoustical Society of America, 113, 563-574. to the database >
The Fromkin speech error database
The Fromkin Speech Error Database was collected over many years, and was converted to computer-readable form at UCLA with support from a National Science Foundation grant to Professor Victoria A. Fromkin.
At the time of Vicki Fromkin's death in January 2000, the wider availability of the database was in doubt because there was no longer support for the software format used to encode it. more >
The Stern diaries