Paul Trilsbeek

Presentations

  • Trilsbeek, P. (2019). Migrating The Language Archive to a new repository solution. Talk presented at Open Repositories 2019. Hamburg, Germany. 2019-06-10 - 2019-06-13.
  • Trilsbeek, P., & Abdullah, I. (2019). Migrating The Language Archive to Islandora. Talk presented at iCampEU 2019. Zürich, Switzerland. 2019-06-17 - 2019-06-19.

    Abstract

    At the beginning of 2018, The Language Archive migrated its repository from an in-house built solution to a solution that is largely based on Islandora. We will talk about the migration trajectory and will present the new setup, which includes a custom ingest front- and back-end.
  • Trilsbeek, P., Kung, S., & Seyfeddinipur, M. (2016). Case study: Citing archived resources in a Language publication. Talk presented at the 2nd Workshop on Data Citation & Attribution in Linguistics. Austin, TX, USA. 2016-04-08 - 2016-04-10.
  • Trilsbeek, P. (2016). UNESCO memory of the world: Selected data collections of the world’s language diversity at The Language Archive. Talk presented at the 15th session of the United Nations Permanent Forum on Indigenous Issues. New York, NY, USA. 2016-05-09 - 2016-05-20.
  • Trilsbeek, P., & Windhouwer, M. (2016). FLAT: A CLARIN-compatible repository solution based on Fedora Commons. Poster presented at CLARIN Annual Conference 2016, Aix-en-Provence, France.
  • Drude, S., Stehouwer, H., Trilsbeek, P., Broeder, D., & Sloetjes, H. (2013). Language documentation and the language archive as e-humanities centrum. Poster presented at the Soeterbeeck eHumanities Workshop, Ravenstein, The Netherlands.
  • Drude, S., Broeder, D., & Trilsbeek, P. (2013). The Language Archive as a centre of the Clarin infrastructure. Talk presented at the 2nd INNET Conference on Digital Language Archiving. Gniezno, Poland. 2013-09-06 - 2013-09-07.
  • Trilsbeek, P., Koenig, A., & Drude, S. (2013). The Language Archive. Poster presented at the 3rd International Conference on Language Documentation and Conservation (ICLDC), “Sharing Worlds of Knowledge”, Honolulu, Hawaii.
  • Drude, S., Trilsbeek, P., & Broeder, D. (2012). Language documentation and digital humanities: The (DoBeS) Language Archive. Talk presented at International Digital Humanities Congress 2012. Hamburg. 2012-07-13 - 2012-07-21.
  • Drude, S., Broeder, D., & Trilsbeek, P. (2012). Sustainable solutions for endangered languages data: The Language Archive. Talk presented at Charting Vanishing Voices: A Collaborative Workshop to Map Endangered Oral Cultures. World Oral Literature Project 2012 Workshop. CRASSH, Cambridge. 2012-06-30 - 2012-06-30.

  • Drude, S., Broeder, D., Trilsbeek, P., & Wittenburg, P. (2012). The Language Archive - a new hub for language resources. Poster presented at LREC 2012: 8th International Conference on Language Resources and Evaluation, Istanbul, Turkey.
  • Drude, S., & Trilsbeek, P. (2011). The ‘Language Archiving Technology’ solutions for sustainable data from digital fieldwork research. Talk presented at the PARADISEC 2011 conference „Sustainable data from digital research: Humanities perspectives on digital scholarship“. Melbourne, Australia. 2011-11-12 - 2011-11-14.
  • Ringersma, J., & Trilsbeek, P. (2010). Metadata and language-resources. Documentation and Archival Training Workshop. Guwahati, Assam, India, 2010-02-04 - 2010-02-08.

    Abstract

    Teaching material on metadata for the Documentation and Archival Training Workshop, Guwahati, Assam, India.
  • Wittenburg, P., Trilsbeek, P., & Lenkiewicz, P. (2010). Large multimedia archive for world languages. Talk presented at the ACM Workshop on Searching Spontaneous Conversational Speech [SSCS 2010]. Firenze, Italy. 2010-10-25 - 2010-10-29. doi:10.1145/1878101.1878113.

    Abstract

    In this paper, we describe the core pillars of a large archive of language material recorded worldwide partly about languages that are highly endangered. The bases for the documentation of these languages are audio/video recordings which are then annotated at several linguistic layers. The digital age completely changed the requirements of long-term preservation and it is discussed how the archive met these new challenges. An extensive solution for data replication has been worked out to guarantee bit-stream preservation. Due to an immediate conversion of the incoming data to standards-based formats and checks at upload time, lifecycle management of all 50 Terabyte of data is widely simplified. A suitable metadata framework not only allowing users to describe and discover resources, but also allowing them to organize their resources is enabling the management of this amount of resources very efficiently. Finally, it is the Language Archiving Technology software suite which allows users to create, manipulate, access and enrich all archived resources given that they have access permissions.
  • Koenig, A., Ringersma, J., & Trilsbeek, P. (2009). The Language Archiving Technology domain. Talk presented at 4th Language & Technology Conference. Poznań. 2009-11-06 - 2009-11-08.
  • Müller, G., & Trilsbeek, P. (2009). A general portal to the DOBES-Archive. Talk presented at DOBES workshop "Language Documentation – its role in linguistics, anthropology and language maintenance". MPI Nijmegen. 2009-10-15.
  • Trilsbeek, P. (2009). Language resource archiving at the MPI for Psycholinguistics. Talk presented at Third International Symposium on Field Linguistics. Moscow. 2009-10-20.
  • Trilsbeek, P., Müller, G., & Miller, J. (2009). Creating alternative access layers to the DOBES archive from existing metadata structure. Talk presented at 1st International Conference on Language Documentation and Conservation (ICLDC). Honolulu, Hawai'i. 2009-03-12 - 2009-03-14.

    Abstract

    In many areas of the world, language archives are being created, containing information on endangered languages, adhering to sophisticated metadata schemes and archiving standards. The data deposited in these archives, however, is as of yet hard to access, especially for community members who might be easily frustrated when trying to access data. In the DoBeS archive, there are various ways of searching and browsing through the deposited data, allowing for sophisticated queries targeting information in the metadata or annotations, so that expert users can work with the language documentations. However, this user-interface is too complex for a visitor that has not been thoroughly introduced to the structures and it is difficult to find results that may satisfy typical community members’ interests. As a shortcut for users from the community, a community portal has been created which displays an array of traditionally relevant topics in a simple and attractive way and links to resources in the archive. Topics include traditional and personal stories, procedurals and traditional activities. It is suitable for school use and due to its topical structure, may also serve as a base for developing teaching materials. In the community portal, a number of pre-defined searches have been set up for certain resource categories. These categories are marked in the metadata, so whenever a metadata file is uploaded into the archive containing one of these values, it will automatically become part of the search results in the portal. The query to the metadata database is made possible through a so-called REST interface. Via this protocol, the metadata search can be accessed as a web service within any other dynamic web content management framework. This search technology could also be used to implement a portal for a broader audience, introducing the archive from various angles to different potential user groups. 
Here too, the dynamic searches guarantee a low maintenance effort once the portal has been created. And finally, we will show additional ways to represent archived data (e.g. using Google Earth layers), in order to draw a comprehensive picture of the various ways to enter the DoBeS archive and efficiently access relevant information. It is hoped that this paper will contribute to bridging the gap between the creation of comprehensive language documentation and community efforts at revitalization, and help researchers to fulfill their ethical commitment to make data as accessible as possible.
  • Müller, G., Trilsbeek, P., & Van Uytvanck, D. (2008). Metadata-driven Community Portal. Talk presented at DOBES Workshop "Language Documentation Methods in Focus". MPI Nijmegen.
  • Ringersma, J., & Trilsbeek, P. (2008). Sharing linguistic multi-media resources at different complexity levels. Talk presented at IASA. Sydney. 2008-09-19.
  • Trilsbeek, P. (2008). Language Archiving at the MPI for Psycholinguistics. Talk presented at TAPE Fachtagung "Presentation and Access of Audiovisual Collections". Deutsche Kinemathek, Berlin. 2008-01-24.
  • Trilsbeek, P. (2008). Language Archiving Technology at the MPI for Psycholinguistics. Talk presented at Saami Documentation and Revitalization Workshop. Tromsø, Norway. 2008-02-28.
  • Trilsbeek, P., Schäfer, R., Schüller, D., Pavuza, F., & Wittenburg, P. (2008). Video encoding and archiving in field linguistics. Talk presented at International Association of Sound and Audiovisual Archives Annual Conference. Sydney. 2008-09-14 - 2008-09-19.

    Abstract

    Technological innovation is continuously creating new encoding formats for video. The introduction of HDTV, the wish to move towards 3D video, etc. will increase the required bandwidths and capacities by factors. New coding standards such as H.264 and JPEG2000 have been developed to overcome the problem of increasing bit rates and new codecs such as H.265 are in the pipeline. In addition, we have seen in the recent decades that the maintenance of old formats is not guaranteed if their markets become too small.
This extreme innovation rate is problematic for all archiving intentions, since archiving means guaranteeing continuous accessibility of the archived digital resources. It is known that a continuous migration will be required to interpret stored video streams. At the bit-stream level migration to new storage technology can be organized by fully automatic procedures. At the encoding level problems are much more severe. When migrating compressed video for example we will be confronted with concatenation effects creating serious artifacts. Ideally we would like to store uncompressed or lossless compressed video so that we have a master copy from which we can generate the various presentation formats. Currently, frequently MPEG2 is used for this purpose although it does not prevent information degradation due to concatenation. We will argue for a move to lossless JPEG-2000 encoding as master format and proper process metadata description. Yet we have to solve the dilemma that field workers will deliver highly compressed formats due to the usage of consumer equipment also in future.
  • Trilsbeek, P., & Van Uytvanck, D. (2008). Regional archives and community portals. Talk presented at International Association of Sound and Audiovisual Archives Annual Conference. Sydney. 2008-09-14 - 2008-09-19.

    Abstract

    During the past 10 years, the Max Planck Institute for Psycholinguistics has developed an extensive technological framework around its digital archive for linguistic resources. About two years ago the MPI started installing archives based on this “Language Archiving Technology” (LAT) framework in various locations around the world. The idea behind this initiative is to have regional archives in the proximity of the area where the linguistic resources are collected. This will facilitate access to the resources and create more local involvement and awareness towards the preservation of endangered languages and cultures. The user interfaces of some of the LAT tools are not always very suitable for the speech community due to the language that is being used (English) and the extensive set of features of these tools, many of which are of less interest to the speech community. Therefore a framework was developed that allows the integration of archived content within a web portal that is managed using a standard Content Management System (Plone). A web service was developed that enables searching of the archive’s metadata database using the SOAP protocol. From within the CMS, the content editor can easily specify queries for specific metadata values, e.g. all songs in a particular language. These queries can be linked to buttons or images in the portal. The search results are then parsed into nicely formatted lists of resources. The facility will make the use of the local archive more efficient and user friendly. www.lat-mpi.eu www.mpi.nl/dobes www.plone.org
  • Ringersma, J., Trilsbeek, P., & Wittenburg, P. (2007). Language archiving technology at the MPI. Poster presented at 11th International Conference on Information Visualization, Zurich.

    Abstract

    The repository of the MPI contains different types of linguistic material: the DOBES endangered languages archive, the ESF second learner corpus, the Dutch Spoken National Corpus, MPI's gesture corpora, MPI acquisition corpora and MPI language documentations of the language and cognition research group. The archive covers more than 200.000 objects, mostly organized in sessions that are described with the IMDI-based metadata descriptions. Mostly, these sessions contain digitized audio/video signals and layers of annotations. In general access to these resources is limited and can be made available upon request. The Language Archiving Technology (LAT) is meant to contribute to the archive infrastructure. It focuses on open accessibility of the language resources; it supports dynamic and continuously enriched collections according to the Live Archives ideas; it stresses the need for long-term archiving of our digital collections covering unique material about languages that will probably be extinct in a few decades and it follows the trend towards service oriented architectures. LAT components consist of data management and ingestion tools (IMDI, LAMUS and AMS) and of archive enrichment and visualization tools (ELAN, ANNEX and LEXUS). The tools are being developed and maintained by the Technical Group of the MPI. All LAT products are or will become available under an Open Source license and will be available free-of-charge in academic research.

    Supplementary material

    http://corpus1.mpi.nl/
