Paul Trilsbeek

Publications

  • Drude, S., Trilsbeek, P., & Broeder, D. (2012). Language Documentation and Digital Humanities: The (DoBeS) Language Archive. In J. C. Meister (Ed.), Digital Humanities 2012 Conference Abstracts. University of Hamburg, Germany; July 16–22, 2012 (pp. 169-173).

    Abstract

    Overview: Since the early nineties, the ongoing dramatic loss of the world’s linguistic diversity has gained attention, first among linguists and increasingly also among the general public. In response, the new field of language documentation emerged from around 2000 on, starting with the funding initiative ‘Dokumentation Bedrohter Sprachen’ (DoBeS, funded by the Volkswagen Foundation, Germany), soon to be followed by others such as the ‘Endangered Languages Documentation Programme’ (ELDP, at SOAS, London), or, in the USA, ‘Electronic Meta-structure for Endangered Languages Documentation’ (EMELD, led by the LinguistList) and ‘Documenting Endangered Languages’ (DEL, by the NSF). From its very beginning, the new field focused on digital technologies not only for recording in audio and video, but also for annotation, lexical databases, corpus building and archiving, among others. This development not only coincides with but is intrinsically interconnected with the increasing focus on digital data, technology and methods in all sciences, in particular in the humanities.
  • Drude, S., Broeder, D., Trilsbeek, P., & Wittenburg, P. (2012). The Language Archive: A new hub for language resources. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 3264-3267). European Language Resources Association (ELRA).

    Abstract

    This contribution presents “The Language Archive” (TLA), a new unit at the MPI for Psycholinguistics, discussing the current developments in management of scientific data, considering the need for new data research infrastructures. Although several initiatives worldwide in the realm of language resources aim at the integration, preservation and mobilization of research data, the state of such scientific data is still often problematic. Data are often not well organized and archived and not described by metadata; even unique data such as field-work observational data on endangered languages are still mostly on perishable carriers. New data centres are needed that provide trusted, quality-reviewed, persistent services and suitable tools and that take legal and ethical issues seriously. The CLARIN initiative has established criteria for suitable centres. TLA is in a good position to be one such centre. It is based on three essential pillars: (1) a data archive; (2) management, access and annotation tools; (3) archiving and software expertise for collaborative projects. The archive hosts mostly observational data on small languages worldwide and language acquisition data, but also data resulting from experiments
  • Seifart, F., Haig, G., Himmelmann, N. P., Jung, D., Margetts, A., & Trilsbeek, P. (Eds.). (2012). Potentials of language documentation: Methods, analyses, and utilization. Honolulu: University of Hawai‘i Press.

    Abstract

    In the past 10 or so years, intensive documentation activities, i.e. compilations of large multimedia corpora of spoken endangered languages, have contributed to the documentation of important linguistic and cultural aspects of dozens of languages. As laid out in Himmelmann (1998), language documentations include as their central components a collection of spoken texts from a variety of genres, recorded on video and/or audio, with time-aligned annotations consisting of transcription, translation, and also, for some data, morphological segmentation and glossing. Text collections are often complemented by elicited data, e.g. word lists, and structural descriptions such as a grammar sketch. All data are provided with metadata which serve as cataloguing devices for their accessibility in online archives. These newly available language documentation data have enormous potential.
  • Broeder, D., Sloetjes, H., Trilsbeek, P., Van Uytvanck, D., Windhouwer, M., & Wittenburg, P. (2011). Evolving challenges in archiving and data infrastructures. In G. L. J. Haig, N. Nau, S. Schnell, & C. Wegener (Eds.), Documenting endangered languages: Achievements and perspectives (pp. 33-54). Berlin: De Gruyter.

    Abstract

    Introduction: Increasingly often, research in the humanities is based on data. This change in attitude and research practice is driven to a large extent by the availability of small and cheap yet high-quality recording equipment (video cameras, audio recorders) as well as advances in information technology (faster networks, larger data storage, greater computational power, suitable software). In some institutes such as the Max Planck Institute for Psycholinguistics, a clear trend towards an all-digital domain could already be identified in the 90s, making use of state-of-the-art technology for research purposes. This change of habits was one of the reasons for the Volkswagen Foundation to establish the DoBeS program in 2000 with a clear focus on language documentation based on recordings as primary material.
  • Trilsbeek, P., & Wittenburg, P. (2007). Los acervos lingüísticos digitales y sus desafíos. In J. Haviland, & F. Farfán (Eds.), Bases de la documentación lingüística (pp. 359-385). Mexico: Instituto Nacional de Lenguas Indígenas.

    Abstract

    This chapter describes the challenges that modern digital language archives face. One essential aspect of such an archive is a rich metadata catalog, so that the archived resources can be easily discovered. The challenge for the archive is to obtain these rich metadata descriptions from the depositors without creating too much overhead for them. The rapid changes in storage technology, file formats and encoding standards make it difficult to build a long-lasting repository; archives therefore need to be set up in such a way that a straightforward and automated migration process to newer technology is possible whenever a given technology becomes obsolete. Other problems arise from the fact that there are many different groups of users of the archive, each with its own specific expectations and demands. Conflicts often exist between the requirements for different purposes of the archive, e.g. between long-term preservation of the data and direct access to the resources via the web. The task of the archive is to come up with a technical solution that works well for most usage scenarios.
  • Broeder, D., Claus, A., Offenga, F., Skiba, R., Trilsbeek, P., & Wittenburg, P. (2006). LAMUS: The Language Archive Management and Upload System. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 2291-2294).
  • Trilsbeek, P., & Wittenburg, P. (2005). Archiving challenges. In J. Gippert, N. Himmelmann, & U. Mosel (Eds.), Essentials of language documentation (pp. 311-335). Berlin: Mouton de Gruyter.
  • Wittenburg, P., Skiba, R., & Trilsbeek, P. (2005). The language archive at the MPI: Contents, tools, and technologies. Language Archives Newsletter, 5, 7-9.
