

AVATecH - Advancing Video Audio Technology in Humanities Research

The goal of this project is to investigate and develop technology for semi-automatic annotation of audio and video recordings used in humanities research. Detectors, available both through interactive annotation tools and through batch processing, can help with tasks such as chunking, tagging, annotation and search.

 

The motivation for this project has two major aspects:

  1. The amount of AV recordings that typical humanities research institutes can manually annotate and use for theory building does not scale with the amount of recordings created, so a growing share of the data goes unused.
  2. The currently available AV recognition technologies cannot cope with the material typically created in real-world observations of the kind used for modern linguistic theory building.

Currently, even the simplest annotations of, for example, recorded dialogs take too much time and effort. New ways therefore need to be explored to overcome the barriers that hamper progress. By using automatic detectors to make the annotation process more efficient, we expect that more data can be annotated, opening new possibilities for search and corpus analysis and supporting better theory building.

 

Initial research will focus on the creation of detector components which, given media recordings, generate lists of segments and annotations. Such detectors can be invoked from within annotation tools such as the widely used and proven ELAN software and from a batch processing framework, to process a number of recordings in one effort.
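The detector idea described above can be illustrated with a minimal sketch. Everything named here is hypothetical and chosen only for illustration: the `Segment` record, the `Detector` type, the `silence_stub` placeholder and the `run_batch` helper are not part of the AVATecH interface specification, and the stub does not analyse any media.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Segment:
    """One detected stretch of a recording (hypothetical structure)."""
    start: float  # seconds from the beginning of the recording
    end: float    # seconds from the beginning of the recording
    label: str    # annotation attached to the segment


# Assumption: a detector maps the path of a media file to a list of segments.
Detector = Callable[[str], list[Segment]]


def silence_stub(media_path: str) -> list[Segment]:
    """Placeholder detector: returns one fixed segment, reads no media."""
    return [Segment(start=0.0, end=1.0, label="silence")]


def run_batch(detector: Detector, media_paths: Iterable[str]) -> dict[str, list[Segment]]:
    """Apply one detector to several recordings in a single pass."""
    return {path: detector(path) for path in media_paths}


results = run_batch(silence_stub, ["a.wav", "b.wav"])
```

The same detector function could in principle be called for a single recording from an interactive tool or for many recordings from a batch runner, which is the dual use the paragraph above describes.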

The project is organized in two major phases:

  1. First, low-hanging-fruit detectors will be identified that can operate on a selected collection of typical audio/video material. They will be integrated with ELAN, and this work will be carried out in interaction with researchers.
  2. Second, more advanced and complex detector tasks will be tackled after the results of the low-hanging-fruit detectors have been evaluated.

In this project, two Max Planck Institutes cooperate with two Fraunhofer Institutes to investigate, develop and apply advanced technology for semi-automatic annotation of the collected audio-visual material that forms the basis for humanities research. The Max Planck Institutes act as experts for the research-driven questions resulting from an analysis of the AV material and for user-friendly interaction tools. The Fraunhofer Institutes act as experts for digital sound and video processing methods.

Project partners:

Project duration:

  • 2009 to 2012

Documents:

  • Errata for AVATecH Component Interface Specification Manual, as of April 14, 2010:
    • The time, start time and end time columns in tier and timeseries files can have arbitrary names. Those columns must come first, but their names are not relevant.
    • Times are given in seconds, in 42.075 style syntax. Neither hh:mm:ss nor exponent syntax is allowed.
    • To cancel a detector run, abort the LOCAL detector process (Ctrl-C, SIGINT, Process destroy(), etc.), disconnect from the SHARED detector, or call the stop() method of the DIRECT detector.
  • The CMDI component spec and XSD file used for AVATecH metadata can be found at http://www.clarin.eu/cmd/components/avatech/ (also has a recognizer.cmdi metadata example)
  • German Quersumme 2/09 article about the start of the AVATecH MPI / Fraunhofer cooperation project
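
The time-format and column-order errata above can be sketched as a small validator. This is an illustration under assumptions, not the specified parser: the exact grammar of a valid time is not given here, so the regular expression is a guess at "42.075 style syntax", and the names `parse_seconds` and `read_tier_row` are hypothetical. How a file is split into columns is also left open, so the sketch starts from an already split row.

```python
import re

# Assumption: plain decimal seconds such as "42.075" or "42";
# no sign, no colon-separated fields, no exponent.
_SECONDS = re.compile(r"\d+(\.\d+)?")


def parse_seconds(text: str) -> float:
    """Return the time in seconds, or raise ValueError for other syntaxes."""
    cleaned = text.strip()
    if not _SECONDS.fullmatch(cleaned):
        raise ValueError(f"not plain decimal seconds: {text!r}")
    return float(cleaned)


def read_tier_row(row: list[str]) -> tuple[float, float, list[str]]:
    """Read start and end time by position (columns 0 and 1).

    Column names are never consulted, since the errata state that
    only the position of the time columns matters.
    """
    start = parse_seconds(row[0])
    end = parse_seconds(row[1])
    if end < start:
        raise ValueError(f"end time {end} precedes start time {start}")
    return start, end, row[2:]
```

For example, `parse_seconds("42.075")` is accepted, while `parse_seconds("00:00:42")` and `parse_seconds("4.2075e1")` raise `ValueError`.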

Internal documents:

  • AVATecH Developer Wiki
  • example corpora

Last checked 2010-08-10 by Eric Auer

       Max Planck Institute
       for Psycholinguistics


       Street address
       Wundtlaan 1
       6525 XD Nijmegen
       The Netherlands


       Mailing address
       P.O. Box 310
       6500 AH Nijmegen
       The Netherlands

       Phone:  +31-24-3521911
       Fax:      +31-24-3521213

 

 
