Paul Trilsbeek

Publications

  • Klamer, M., Trilsbeek, P., Hoogervorst, T., & Haskett, C. (2017). Creating a Language Archive of Insular South East Asia and West New Guinea. In J. Odijk, & A. Van Hessen (Eds.), CLARIN in the Low Countries (pp. 113-121). London: Ubiquity Press. doi:10.5334/bbi.10.

    Abstract

    The geographical region of Insular South East Asia and New Guinea is well-known as an area of mega-biodiversity. Less well-known is the extreme linguistic diversity in this area: over a quarter of the world’s 6,000 languages are spoken here. As small minority languages, most of them will cease to be spoken in the coming few generations. The project described here ensures the preservation of unique records of languages and the cultures encapsulated by them in the region. The language resources were gathered by twenty linguists at, or in collaboration with, Dutch universities over the last 40 years, and were compiled and archived in collaboration with The Language Archive (TLA) at the Max Planck Institute in Nijmegen. The resulting archive constitutes a collection of multimedia materials and written documents from 48 languages in Insular South East Asia and West New Guinea. At TLA, the data was archived according to state-of-the-art standards (TLA holds the Data Seal of Approval): the component metadata infrastructure CMDI was used; all metadata categories as well as relevant units of annotation were linked to the ISO data category registry ISOcat. This guaranteed proper integration of the language resources into the CLARIN framework. Through the archive, future speaker communities and researchers will be able to extensively search the materials for answers to their own questions, even if they do not themselves know the language, and even if the language dies.
  • Drude, S., Trilsbeek, P., Sloetjes, H., & Broeder, D. (2014). Best practices in the creation, archiving and dissemination of speech corpora at the Language Archive. In S. Ruhi, M. Haugh, T. Schmidt, & K. Wörner (Eds.), Best Practices for Spoken Corpora in Linguistic Research (pp. 183-207). Newcastle upon Tyne: Cambridge Scholars Publishing.
  • Trilsbeek, P., & Koenig, A. (2014). Increasing the future usage of endangered language archives. In D. Nathan, & P. Austin (Eds.), Language Documentation and Description vol 12 (pp. 151-163). London: SOAS. Retrieved from http://www.elpublishing.org/PID/142.
  • Wittenburg, P., Trilsbeek, P., & Wittenburg, F. (2014). Corpus archiving and dissemination. In J. Durand, U. Gut, & G. Kristoffersen (Eds.), The Oxford Handbook of Corpus Phonology (pp. 133-149). Oxford: Oxford University Press.
  • Broeder, D., Sloetjes, H., Trilsbeek, P., Van Uytvanck, D., Windhouwer, M., & Wittenburg, P. (2011). Evolving challenges in archiving and data infrastructures. In G. L. J. Haig, N. Nau, S. Schnell, & C. Wegener (Eds.), Documenting endangered languages: Achievements and perspectives (pp. 33-54). Berlin: De Gruyter.

    Abstract

    Introduction: Increasingly often research in the humanities is based on data. This change in attitude and research practice is driven to a large extent by the availability of small and cheap yet high-quality recording equipment (video cameras, audio recorders) as well as advances in information technology (faster networks, larger data storage, larger computation power, suitable software). In some institutes such as the Max Planck Institute for Psycholinguistics, already in the 90s a clear trend towards an all-digital domain could be identified, making use of state-of-the-art technology for research purposes. This change of habits was one of the reasons for the Volkswagen Foundation to establish the DoBeS program in 2000 with a clear focus on language documentation based on recordings as primary material.
