Paul Trilsbeek

Publications

  • Wittenburg, P., Lautenschlager, M., Thiemann, H., Baldauf, C., & Trilsbeek, P. (2020). FAIR Practices in Europe. Data Intelligence, 2(1-2), 257-263. doi:10.1162/dint_a_00048.

    Abstract

    Institutions driving fundamental research at the cutting edge, such as those of the Max Planck Society (MPS), took steps to optimize data management and stewardship to be able to address new scientific questions. In this paper we selected three institutes from the MPS from the areas of humanities, environmental sciences and natural sciences as examples to indicate the efforts to integrate large amounts of data from collaborators worldwide to create a data space that is ready to be exploited to get new insights based on data intensive science methods. For this integration the typical challenges of fragmentation, bad quality and also social differences had to be overcome. In all three cases, well-managed repositories that are driven by the scientific needs and harmonization principles that have been agreed upon in the community were the core pillars. It is not surprising that these principles are very much aligned with what have now become the FAIR principles. The FAIR principles confirm the correctness of earlier decisions and their clear formulation identified the gaps which the projects need to address.
  • Seyfeddinipur, M., Ameka, F., Bolton, L., Blumtritt, J., Carpenter, B., Cruz, H., Drude, S., Epps, P. L., Ferreira, V., Galucio, A. V., Hellwig, B., Hinte, O., Holton, G., Jung, D., Buddeberg, I. K., Krifka, M., Kung, S., Monroig, M., Neba, A. N., Nordhoff, S., Pakendorf, B., Von Prince, K., Rau, F., Rice, K., Riessler, M., Szoelloesi Brenig, V., Thieberger, N., Trilsbeek, P., Van der Voort, H., & Woodbury, T. (2019). Public access to research data in language documentation: Challenges and possible strategies. Language Documentation and Conservation, 13, 545-563. Retrieved from http://hdl.handle.net/10125/24901.

    Abstract

    The Open Access Movement promotes free and unfettered access to research publications and, increasingly, to the primary data which underlie those publications. As the field of documentary linguistics seeks to record and preserve culturally and linguistically relevant materials, the question of how openly accessible these materials should be becomes increasingly important. This paper aims to guide researchers and other stakeholders in finding an appropriate balance between accessibility and confidentiality of data, addressing community questions and legal, institutional, and intellectual issues that pose challenges to accessible data.
  • Drude, S., Broeder, D., & Trilsbeek, P. (2014). The Language Archive and its solutions for sustainable endangered languages corpora. Book 2.0, 4, 5-20. doi:10.1386/btwo.4.1-2.5_1.

    Abstract

    Since the late 1990s, the technical group at the Max-Planck-Institute for Psycholinguistics has worked on solutions for important challenges in building sustainable data archives, in particular, how to guarantee long-term availability of digital research data for future research. The support for the well-known DOBES (Documentation of Endangered Languages) programme has greatly inspired and advanced this work, and led to the ongoing development of a whole suite of tools for annotating, cataloguing and archiving multi-media data. At the core of the LAT (Language Archiving Technology) tools is the IMDI metadata schema, now being integrated into a larger network of digital resources in the European CLARIN project. The multi-media annotator ELAN (with its web-based cousin ANNEX) is now well known not only among documentary linguists. We aim at presenting an overview of the solutions, both achieved and in development, for creating and exploiting sustainable digital data, in particular in the area of documenting languages and cultures, and their interfaces with other related developments.
  • Van den Heuvel, H., Sanders, E., Klatter-Folmer, J., Van Hout, R., Fikkert, P., Baker, A., De Jong, J., Wijnen, F., & Trilsbeek, P. (2014). Data curation for a VALID archive of Dutch language impairment data. Dutch journal of applied linguistics, 3(2), 127-135. doi:10.1075/dujal.3.2.02heu.

    Abstract

    The VALID Data Archive is an open multimedia data archive in which data from children and adults with language and/or communication problems are brought together. A pilot project, funded by CLARIN-NL, was carried out in which five existing data sets were curated. This pilot enabled us to build up experience in conserving different kinds of pathological language data in a searchable and persistent manner. These data sets reflect current research in language pathology rather well, both in the range of designs and the variety in pathological problems, such as Specific Language Impairment, deafness, dyslexia, and ADHD. In this paper, we present the VALID initiative, explain the curation process and discuss the materials of the data sets.

  • Seifart, F., Haig, G., Himmelmann, N. P., Jung, D., Margetts, A., & Trilsbeek, P. (Eds.). (2012). Potentials of language documentation: Methods, analyses, and utilization. Honolulu: University of Hawai‘i Press.

    Abstract

    In the past 10 or so years, intensive documentation activities, i.e. compilations of large, multimedia corpora of spoken endangered languages, have contributed to the documentation of important linguistic and cultural aspects of dozens of languages. As laid out in Himmelmann (1998), language documentations include as their central components a collection of spoken texts from a variety of genres, recorded on video and/or audio, with time-aligned annotations consisting of transcription, translation, and also, for some data, morphological segmentation and glossing. Text collections are often complemented by elicited data, e.g. word lists, and structural descriptions such as a grammar sketch. All data are provided with metadata which serve as cataloguing devices for their accessibility in online archives. These newly available language documentation data have enormous potential.
  • Trilsbeek, P., & Van Uytvanck, D. (2009). Regional archives and community portals. IASA Journal, 32, 69-73.
  • Wittenburg, P., Skiba, R., & Trilsbeek, P. (2005). The language archive at the MPI: Contents, tools, and technologies. Language Archives Newsletter, 5, 7-9.
  • Russel, A., & Trilsbeek, P. (2004). ELAN Audio Playback. Language Archive Newsletter, 1(4), 12-13.
  • Skiba, R., Wittenburg, F., & Trilsbeek, P. (2004). New DoBeS web site: Contents & functions. Language Archive Newsletter, 1(2), 4-4.
  • Trilsbeek, P. (2004). DoBeS Training Course. Language Archive Newsletter, 1(2), 6-6.
  • Trilsbeek, P. (2004). Report from DoBeS training week. Language Archive Newsletter, 1(3), 12-12.
  • Wittenburg, P., Skiba, R., & Trilsbeek, P. (2004). Technology and Tools for Language Documentation. Language Archive Newsletter, 1(4), 3-4.
