Displaying 1 - 7 of 7
-
Casillas, M., Bergelson, E., Warlaumont, A. S., Cristia, A., Soderstrom, M., VanDam, M., & Sloetjes, H. (2017). A New Workflow for Semi-automatized Annotations: Tests with Long-Form Naturalistic Recordings of Childrens Language Environments. In Proceedings of Interspeech 2017 (pp. 2098-2102). doi:10.21437/Interspeech.2017-1418.
Abstract
Interoperable annotation formats are fundamental to the utility, expansion, and sustainability of collective data repositories.In language development research, shared annotation schemes have been critical to facilitating the transition from raw acoustic data to searchable, structured corpora. Current schemes typically require comprehensive and manual annotation of utterance boundaries and orthographic speech content, with an additional, optional range of tags of interest. These schemes have been enormously successful for datasets on the scale of dozens of recording hours but are untenable for long-format recording corpora, which routinely contain hundreds to thousands of audio hours. Long-format corpora would benefit greatly from (semi-)automated analyses, both on the earliest steps of annotation—voice activity detection, utterance segmentation, and speaker diarization—as well as later steps—e.g., classification-based codes such as child-vs-adult-directed speech, and speech recognition to produce phonetic/orthographic representations. We present an annotation workflow specifically designed for long-format corpora which can be tailored by individual researchers and which interfaces with the current dominant scheme for short-format recordings. The workflow allows semi-automated annotation and analyses at higher linguistic levels. We give one example of how the workflow has been successfully implemented in a large cross-database project. -
Lausberg, H., & Sloetjes, H. (2016). The revised NEUROGES–ELAN system: An objective and reliable interdisciplinary analysis tool for nonverbal behavior and gesture. Behavior Research Methods, 48, 973-993. doi:10.3758/s13428-015-0622-z.
Abstract
As visual media spread to all domains of public and scientific life, nonverbal behavior is taking its place as an important form of communication alongside the written and spoken word. An objective and reliable method of analysis for hand movement behavior and gesture is therefore currently required in various scientific disciplines, including psychology, medicine, linguistics, anthropology, sociology, and computer science. However, no adequate common methodological standards have been developed thus far. Many behavioral gesture-coding systems lack objectivity and reliability, and automated methods that register specific movement parameters often fail to show validity with regard to psychological and social functions. To address these deficits, we have combined two methods, an elaborated behavioral coding system and an annotation tool for video and audio data. The NEUROGES–ELAN system is an effective and user-friendly research tool for the analysis of hand movement behavior, including gesture, self-touch, shifts, and actions. Since its first publication in 2009 in Behavior Research Methods, the tool has been used in interdisciplinary research projects to analyze a total of 467 individuals from different cultures, including subjects with mental disease and brain damage. Partly on the basis of new insights from these studies, the system has been revised methodologically and conceptually. The article presents the revised version of the system, including a detailed study of reliability. The improved reproducibility of the revised version makes NEUROGES–ELAN a suitable system for basic empirical research into the relation between hand movement behavior and gesture and cognitive, emotional, and interactive processes and for the development of automated movement behavior recognition methods. -
Sloetjes, H., & Seibert, O. (2016). Measuring by marking; the multimedia annotation tool ELAN. In A. Spink, G. Riedel, L. Zhou, L. Teekens, R. Albatal, & C. Gurrin (
Eds. ), Measuring Behavior 2016, 10th International Conference on Methods and Techniques in Behavioral Research (pp. 492-495).Abstract
ELAN is a multimedia annotation tool developed by the Max Planck Institute for Psycholinguistics. It is applied in a variety of research areas. This paper presents a general overview of the tool and new developments as the calculation of inter-rater reliability, a commentary framework, semi-automatic segmentation and labeling and export to Theme.Additional information
http://www.measuringbehavior.org/files/2016/MB2016_Proceedings.pdf -
Crasborn, O., & Sloetjes, H. (2014). Improving the exploitation of linguistic annotations in ELAN. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (
Eds. ), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3604-3608).Abstract
This paper discusses some improvements in recent and planned versions of the multimodal annotation tool ELAN, which are targeted at improving the usability of annotated files. Increased support for multilingual documents is provided, by allowing for multilingual vocabularies and by specifying a language per document, annotation layer (tier) or annotation. In addition, improvements in the search possibilities and the display of the results have been implemented, which are especially relevant in the interpretation of the results of complex multi-tier searches. -
Crasborn, O., Hulsbosch, M., Lampen, L., & Sloetjes, H. (2014). New multilayer concordance functions in ELAN and TROVA. In Proceedings of the Tilburg Gesture Research Meeting [TiGeR 2013].
Abstract
Collocations generated by concordancers are a standard instrument in the exploitation of text corpora for the analysis of language use. Multimodal corpora show similar types of patterns, activities that frequently occur together, but there is no tool that offers facilities for visualising such patterns. Examples include timing of eye contact with respect to speech, and the alignment of activities of the two hands in signed languages. This paper describes recent enhancements to the standard CLARIN tools ELAN and TROVA for multimodal annotation to address these needs: first of all the query and concordancing functions were improved, and secondly the tools now generate visualisations of multilayer collocations that allow for intuitive explorations and analyses of multimodal data. This will provide a boost to the linguistic fields of gesture and sign language studies, as it will improve the exploitation of multimodal corpora. -
Drude, S., Trilsbeek, P., Sloetjes, H., & Broeder, D. (2014). Best practices in the creation, archiving and dissemination of speech corpora at the Language Archive. In S. Ruhi, M. Haugh, T. Schmidt, & K. Wörner (
Eds. ), Best Practices for Spoken Corpora in Linguistic Research (pp. 183-207). Newcastle upon Tyne: Cambridge Scholars Publishing. -
Sloetjes, H. (2014). ELAN: Multimedia annotation application. In J. Durand, U. Gut, & G. Kristoffersen (
Eds. ), The Oxford Handbook of Corpus Phonology (pp. 305-320). Oxford: Oxford University Press.
Share this page