Publications

  • Alhama, R. G., Scha, R., & Zuidema, W. (2014). Rule learning in humans and animals. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The evolution of language: Proceedings of the 10th International Conference (EVOLANG 10) (pp. 371-372). Singapore: World Scientific.
  • Aristar-Dry, H., Drude, S., Windhouwer, M., Gippert, J., & Nevskaya, I. (2012). „Rendering Endangered Lexicons Interoperable through Standards Harmonization”: The RELISH Project. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 766-770). European Language Resources Association (ELRA).

    Abstract

    The RELISH project promotes language-oriented research by addressing a two-pronged problem: (1) the lack of harmonization between digital standards for lexical information in Europe and America, and (2) the lack of interoperability among existing lexicons of endangered languages, in particular those created with the Shoebox/Toolbox lexicon building software. The cooperation partners in the RELISH project are the University of Frankfurt (FRA), the Max Planck Institute for Psycholinguistics (MPI Nijmegen), and Eastern Michigan University, the host of the Linguist List (ILIT). The project aims at harmonizing key European and American digital standards whose divergence has hitherto impeded international collaboration on language technology for resource creation and analysis, as well as web services for archive access. Focusing on several lexicons of endangered languages, the project will establish a unified way of referencing lexicon structure and linguistic concepts, and develop a procedure for migrating these heterogeneous lexicons to a standards-compliant format. Once developed, the procedure will be generalizable to the large store of lexical resources involved in the LEGO and DoBeS projects.
  • Bauer, B. L. M. (2012). Functions of nominal apposition in Vulgar and Late Latin: Change in progress? In F. Biville, M.-K. Lhommé, & D. Vallat (Eds.), Latin vulgaire – latin tardif IX (pp. 207-220). Lyon: Maison de l’Orient et de la Méditerranée.

    Abstract

    Analysis of the functions of nominal apposition in a number of Latin authors representing different periods, genres, and linguistic registers shows (1) that nominal apposition in Latin had a wide variety of functions; (2) that genre had some effect on functional use; (3) that change did not affect semantic fields as such; and (4) that with time the occurrence of apposition increasingly came to depend on the semantic field and within the semantic field on the individual lexical items. The ‘per-word’ treatment, also attested for the structural development of nominal apposition, underscores the specific characteristics of nominal apposition as a phenomenon at the crossroads of syntax and derivational morphology.
  • Bauer, B. L. M. (2014). Indefinite HOMO in the Gospels of the Vulgata. In P. Molinelli, P. Cuzzolin, & C. Fedriani (Eds.), Latin vulgaire – latin tardif X (pp. 415-435). Bergamo: Bergamo University Press.
  • Benazzo, S., Flecken, M., & Soroli, E. (Eds.). (2012). Typological perspectives on language and thought: Thinking for speaking in L2. [Special Issue]. Language, Interaction and Acquisition, 3(2).
  • Bergmann, C., Boves, L., & Ten Bosch, L. (2012). A model of the Headturn Preference Procedure: Linking cognitive processes to overt behaviour. In Proceedings of the 2012 IEEE Conference on Development and Learning and Epigenetic Robotics (IEEE ICDL-EpiRob 2012), San Diego, CA.

    Abstract

    The study of first language acquisition still strongly relies on behavioural methods to measure underlying linguistic abilities. In the present paper, we closely examine and model one such method, the headturn preference procedure (HPP), which is widely used to measure infant speech segmentation and word recognition abilities. Our model takes real speech as input, and only uses basic sensory processing and cognitive capabilities to simulate observable behaviour. We show that the familiarity effect found in many HPP experiments can be simulated without using the phonetic and phonological skills necessary for segmenting test sentences into words. The explicit modelling of the process that converts the result of the cognitive processing of the test sentences into observable behaviour uncovered two issues that can lead to null-results in HPP studies. Our simulations show that caution is needed in making inferences about underlying language skills from behaviour in HPP experiments. The simulations also generated questions that must be addressed in future HPP studies.
  • Bergmann, C., Ten Bosch, L., & Boves, L. (2014). A computational model of the headturn preference procedure: Design, challenges, and insights. In J. Mayor, & P. Gomez (Eds.), Computational Models of Cognitive Processes (pp. 125-136). World Scientific. doi:10.1142/9789814458849_0010.

    Abstract

    The Headturn Preference Procedure (HPP) is a frequently used method (e.g., Jusczyk & Aslin; and subsequent studies) to investigate linguistic abilities in infants. In this paradigm infants are usually first familiarised with words and then tested for a listening preference for passages containing those words in comparison to unrelated passages. Listening preference is defined as the time an infant spends attending to those passages with his or her head turned towards a flashing light and the speech stimuli. The knowledge and abilities inferred from the results of HPP studies have been used to reason about and formally model early linguistic skills and language acquisition. However, the actual cause of infants' behaviour in HPP experiments has been subject to numerous assumptions as there are no means to directly tap into cognitive processes. To make these assumptions explicit, and more crucially, to understand how infants' behaviour emerges if only general learning mechanisms are assumed, we introduce a computational model of the HPP. Simulations with the computational HPP model show that the difference in infant behaviour between familiarised and unfamiliar words in passages can be explained by a general learning mechanism and that many assumptions underlying the HPP are not necessarily warranted. We discuss the implications for conventional interpretations of the outcomes of HPP experiments.
  • Blasi, D. E., Christiansen, M. H., Wichmann, S., Hammarström, H., & Stadler, P. F. (2014). Sound symbolism and the origins of language. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The evolution of language: Proceedings of the 10th International Conference (EVOLANG 10) (pp. 391-392). Singapore: World Scientific.
  • Bocanegra, B. R., Poletiek, F. H., & Zwaan, R. A. (2014). Asymmetrical feature binding across language and perception. In Proceedings of the 7th annual Conference on Embodied and Situated Language Processing (ESLP 2014).
  • Bohnemeyer, J. (2004). Argument and event structure in Yukatek verb classes. In J.-Y. Kim, & A. Werle (Eds.), Proceedings of The Semantics of Under-Represented Languages in the Americas. Amherst, Mass: GLSA.

    Abstract

    In Yukatek Maya, event types are lexicalized in verb roots and stems that fall into a number of different form classes on the basis of (a) patterns of aspect-mood marking and (b) privileges of undergoing valence-changing operations. Of particular interest are the intransitive classes in the light of Perlmutter’s (1978) Unaccusativity hypothesis. In the spirit of Levin & Rappaport Hovav (1995) [L&RH], Van Valin (1990), Zaenen (1993), and others, this paper investigates whether (and to what extent) the association between formal predicate classes and event types is determined by argument structure features such as ‘agentivity’ and ‘control’ or features of lexical aspect such as ‘telicity’ and ‘durativity’. It is shown that mismatches between agentivity/control and telicity/durativity are even more extensive in Yukatek than they are in English (Abusch 1985; L&RH, Van Valin & LaPolla 1997), providing new evidence against Dowty’s (1979) reconstruction of Vendler’s (1967) ‘time schemata of verbs’ in terms of argument structure configurations. Moreover, contrary to what has been claimed in earlier studies of Yukatek (Krämer & Wunderlich 1999, Lucy 1994), neither agentivity/control nor telicity/durativity turns out to be a good predictor of verb class membership. Instead, the patterns of aspect-mood marking prove to be sensitive only to the presence or absence of state change, in a way that supports the unified analysis of all verbs of gradual change proposed by Kennedy & Levin (2001). The presence or absence of ‘internal causation’ (L&RH) may motivate the semantic interpretation of transitivization operations. An explicit semantics for the valence-changing operations is proposed, based on Parsons’s (1990) Neo-Davidsonian approach.
  • Brandt, M., Nitschke, S., & Kidd, E. (2012). Experience and processing of relative clauses in German. In A. K. Biller, E. Y. Chung, & A. E. Kimball (Eds.), Proceedings of the 36th annual Boston University Conference on Language Development (BUCLD 36) (pp. 87-100). Boston, MA: Cascadilla Press.
  • Broeder, D., Declerck, T., Romary, L., Uneson, M., Strömqvist, S., & Wittenburg, P. (2004). A large metadata domain of language resources. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 369-372). Paris: European Language Resources Association.
  • Broeder, D., Van Uytvanck, D., & Senft, G. (2012). Citing on-line language resources. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 1391-1394). European Language Resources Association (ELRA).

    Abstract

    Although the possibility of referring to or citing on-line data from publications is seen, at least theoretically, as an important means to provide immediate testable proof or simple illustration of a line of reasoning, the practice is not yet widespread and no extensive experience has been gained about the possibilities and problems of referring to raw data-sets. This paper makes a case for investigating the possibility of and need for persistent data visualization services that facilitate the inspection and evaluation of the cited data.
  • Broeder, D., Schuurman, I., & Windhouwer, M. (2014). Experiences with the ISOcat Data Category Registry. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 4565-4568).
  • Broeder, D., Nava, M., & Declerck, T. (2004). INTERA - a Distributed Domain of Metadata Resources. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 369-372). Paris: European Language Resources Association.
  • Broeder, D., Van Uytvanck, D., Gavrilidou, M., Trippel, T., & Windhouwer, M. (2012). Standardizing a component metadata infrastructure. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 1387-1390). European Language Resources Association (ELRA).

    Abstract

    This paper describes the status of the standardization efforts of a Component Metadata approach for describing Language Resources with metadata. Different linguistic and Language & Technology communities such as CLARIN, META-SHARE and NaLiDa use this component approach and see its standardization as a matter for cooperation that has the possibility to create a large interoperable domain of joint metadata. Starting with an overview of the component metadata approach together with the related semantic interoperability tools and services, such as the ISOcat data category registry and the relation registry, we explain the standardization plan and efforts for component metadata within ISO TC37/SC4. Finally, we present information about uptake and plans of the use of component metadata within the three mentioned linguistic and L&T communities.
  • Broeder, D., Wittenburg, P., & Crasborn, O. (2004). Using Profiles for IMDI Metadata Creation. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 1317-1320). Paris: European Language Resources Association.
  • Broeder, D., Brugman, H., Oostdijk, N., & Wittenburg, P. (2004). Towards Dynamic Corpora: Workshop on compiling and processing spoken corpora. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 59-62). Paris: European Language Resources Association.
  • Broersma, M., & Kolkman, K. M. (2004). Lexical representation of non-native phonemes. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 1241-1244). Seoul: Sunjin Printing Co.
  • Broersma, M. (2012). Lexical representation of perceptually difficult second-language words [Abstract]. Program abstracts from the 164th Meeting of the Acoustical Society of America published in the Journal of the Acoustical Society of America, 132(3), 2053.

    Abstract

    This study investigates the lexical representation of second-language words that contain difficult to distinguish phonemes. Dutch and English listeners' perception of partially onset-overlapping word pairs like DAFFOdil-DEFIcit and minimal pairs like flash-flesh, was assessed with two cross-modal priming experiments, examining two stages of lexical processing: activation of intended and mismatching lexical representations (Exp.1) and competition between those lexical representations (Exp.2). Exp.1 shows that truncated primes like daffo- and defi- activated lexical representations of mismatching words (either deficit or daffodil) more for L2 than L1 listeners. Exp.2 shows that for minimal pairs, matching primes (prime: flash, target: FLASH) facilitated recognition of visual targets for L1 and L2 listeners alike, whereas mismatching primes (flesh, FLASH) inhibited recognition consistently for L1 listeners but only in a minority of cases for L2 listeners; in most cases, for them, primes facilitated recognition of both words equally strongly. Importantly, all listeners experienced a combination of facilitation and inhibition (and all items sometimes caused facilitation and sometimes inhibition). These results suggest that for all participants, some of the minimal pairs were represented with separate, native-like lexical representations, whereas other pairs were stored as homophones. The nature of the L2 lexical representations thus varied strongly even within listeners.
  • Brugman, H., & Russel, A. (2004). Annotating Multi-media/Multi-modal resources with ELAN. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 2065-2068). Paris: European Language Resources Association.
  • Brugman, H., Crasborn, O., & Russel, A. (2004). Collaborative annotation of sign language data with Peer-to-Peer technology. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 213-216). Paris: European Language Resources Association.
  • Burenhult, N. (2004). Spatial deixis in Jahai. In S. Burusphat (Ed.), Papers from the 11th Annual Meeting of the Southeast Asian Linguistics Society 2001 (pp. 87-100). Arizona State University: Program for Southeast Asian Studies.
  • Casillas, M., & Frank, M. C. (2012). Cues to turn boundary prediction in adults and preschoolers. In S. Brown-Schmidt, J. Ginzburg, & S. Larsson (Eds.), Proceedings of SemDial 2012 (SeineDial): The 16th Workshop on the Semantics and Pragmatics of Dialogue (pp. 61-69). Paris: Université Paris-Diderot.

    Abstract

    Conversational turns often proceed with very brief pauses between speakers. In order to maintain “no gap, no overlap” turntaking, we must be able to anticipate when an ongoing utterance will end, tracking the current speaker for upcoming points of potential floor exchange. The precise set of cues that listeners use for turn-end boundary anticipation is not yet established. We used an eyetracking paradigm to measure adults’ and children’s online turn processing as they watched videos of conversations in their native language (English) and a range of other languages they did not speak. Both adults and children anticipated speaker transitions effectively. In addition, we observed evidence of turn-boundary anticipation for questions even in languages that were unknown to participants, suggesting that listeners’ success in turn-end anticipation does not rely solely on lexical information.
  • Chen, A. (2014). Production-comprehension (A)Symmetry: Individual differences in the acquisition of prosodic focus-marking. In N. Campbell, D. Gibbon, & D. Hirst (Eds.), Proceedings of Speech Prosody 2014 (pp. 423-427).

    Abstract

    Previous work based on different groups of children has shown that four- to five-year-old children are similar to adults in both producing and comprehending the focus-to-accentuation mapping in Dutch, contra the alleged production-precedes-comprehension asymmetry in earlier studies. In the current study, we addressed the question of whether there are individual differences in the production-comprehension (a)symmetricity. To this end, we examined the use of prosody in focus marking in production and the processing of focus-related prosody in online language comprehension in the same group of 4- to 5-year-olds. We have found that the relationship between comprehension and production can be rather diverse at an individual level. This result suggests some degree of independence in learning to use prosody to mark focus in production and learning to process focus-related prosodic information in online language comprehension, and implies influences of other linguistic and non-linguistic factors on the production-comprehension (a)symmetricity.
  • Chen, A., Chen, A., Kager, R., & Wong, P. (2014). Rises and falls in Dutch and Mandarin Chinese. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 83-86).

    Abstract

    Despite the different functions of pitch in tone and non-tone languages, rises and falls are common pitch patterns across different languages. In the current study, we ask what the language-specific phonetic realization of rises and falls is. Chinese and Dutch speakers participated in a production experiment. We used contexts composed for conveying specific communicative purposes to elicit rises and falls. We measured both tonal alignment and tonal scaling for both patterns. For the alignment measurements, we found language-specific patterns for the rises, but not for the falls. For rises, both peak and valley were aligned later among Chinese speakers compared to Dutch speakers. For all the scaling measurements (maximum pitch, minimum pitch, and pitch range), no language-specific patterns were found for either the rises or the falls.
  • Cho, T., & Johnson, E. K. (2004). Acoustic correlates of phrase-internal lexical boundaries in Dutch. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 1297-1300). Seoul: Sunjin Printing Co.

    Abstract

    The aim of this study was to determine if Dutch speakers reliably signal phrase-internal lexical boundaries, and if so, how. Six speakers recorded 4 pairs of phonemically identical strong-weak-strong (SWS) strings with matching syllable boundaries but mismatching intended word boundaries (e.g. reis # pastei versus reispas # tij, or more broadly C1V1(C)#C2V2(C)C3V3(C) vs. C1V1(C)C2V2(C)#C3V3(C)). An Analysis of Variance revealed 3 acoustic parameters that were significantly greater in S#WS items (C2 DURATION, RIME1 DURATION, C3 BURST AMPLITUDE) and 5 parameters that were significantly greater in the SW#S items (C2 VOT, C3 DURATION, RIME2 DURATION, RIME3 DURATION, and V2 AMPLITUDE). Additionally, center of gravity measurements suggested that the [s] to [t] coarticulation was greater in reis # pa[st]ei versus reispa[s] # [t]ij. Finally, a Logistic Regression Analysis revealed that the 3 parameters (RIME1 DURATION, RIME2 DURATION, and C3 DURATION) contributed most reliably to a S#WS versus SW#S classification.
  • Cho, T., & McQueen, J. M. (2004). Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 1301-1304). Seoul: Sunjin Printing Co.

    Abstract

    We investigated how listeners of two unrelated languages, Dutch and Korean, process phonotactically legitimate and illegitimate sounds spoken in Dutch and American English. To Dutch listeners, unreleased word-final stops are phonotactically illegal because word-final stops in Dutch are generally released in isolation, but to Korean listeners, released final stops are illegal because word-final stops are never released in Korean. Two phoneme monitoring experiments showed a phonotactic effect: Dutch listeners detected released stops more rapidly than unreleased stops whereas the reverse was true for Korean listeners. Korean listeners with English stimuli detected released stops more accurately than unreleased stops, however, suggesting that acoustic-phonetic cues associated with released stops improve detection accuracy. We propose that in non-native speech perception, phonotactic legitimacy in the native language speeds up phoneme recognition, the richness of acoustic-phonetic cues improves listening accuracy, and familiarity with the non-native language modulates the relative influence of these two factors.
  • Chu, M., & Kita, S. (2012). The nature of the beneficial role of spontaneous gesture in spatial problem solving [Abstract]. Cognitive Processing; Special Issue "ICSC 2012, the 5th International Conference on Spatial Cognition: Space and Embodied Cognition". Oral Presentations, 13(Suppl. 1), S39.

    Abstract

    Spontaneous gestures play an important role in spatial problem solving. We investigated the functional role and underlying mechanism of spontaneous gestures in spatial problem solving. In Experiment 1, 132 participants were required to solve a mental rotation task (see Figure 1) without speaking. Participants gestured more frequently in difficult trials than in easy trials. In Experiment 2, 66 new participants were given two identical sets of mental rotation task problems, the same as those used in Experiment 1. Participants who were encouraged to gesture in the first set of mental rotation task problems solved more problems correctly than those who were allowed to gesture or those who were prohibited from gesturing, both in the first set and in the second set, in which all participants were prohibited from gesturing. The gestures produced by the gesture-encouraged group and the gesture-allowed group were not qualitatively different. In Experiment 3, 32 new participants were first given a set of mental rotation problems and then a second set of non-gesturing paper folding problems. The gesture-encouraged group solved more problems correctly in the first set of mental rotation problems and the second set of non-gesturing paper folding problems. We concluded that gesture improves spatial problem solving. Furthermore, gesture has a lasting beneficial effect even when gesture is not available and the beneficial effect is problem-general. We suggested that gesture enhances spatial problem solving by providing a rich sensori-motor representation of the physical world and picking up information that is less readily available to visuo-spatial processes.
  • Clark, N., & Perlman, M. (2014). Breath, vocal, and supralaryngeal flexibility in a human-reared gorilla. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-15).

    Abstract

    “Gesture-first” theories dismiss ancestral great apes’ vocalization as a substrate for language evolution based on the claim that extant apes exhibit minimal learning and volitional control of vocalization. Contrary to this claim, we present data on novel learned and voluntarily controlled vocal behaviors produced by a human-fostered gorilla (G. gorilla gorilla). These behaviors demonstrate varying degrees of flexibility in the vocal apparatus (including diaphragm, lungs, larynx, and supralaryngeal articulators), and are predominantly performed in coordination with manual behaviors and gestures. Instead of a gesture-first theory, we suggest that these findings support multimodal theories of language evolution in which vocal and gestural forms are coordinated and supplement one another.
  • Collins, J. (2012). The evolution of the Greenbergian word order correlations. In T. C. Scott-Phillips, M. Tamariz, E. A. Cartmill, & J. R. Hurford (Eds.), The evolution of language. Proceedings of the 9th International Conference (EVOLANG9) (pp. 72-79). Singapore: World Scientific.
  • Connell, L., Cai, Z. G., & Holler, J. (2012). Do you see what I'm singing? Visuospatial movement biases pitch perception. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 252-257). Austin, TX: Cognitive Science Society.

    Abstract

    The nature of the connection between musical and spatial processing is controversial. While pitch may be described in spatial terms such as “high” or “low”, it is unclear whether pitch and space are associated but separate dimensions or whether they share representational and processing resources. In the present study, we asked participants to judge whether a target vocal note was the same as (or different from) a preceding cue note. Importantly, target trials were presented as video clips where a singer sometimes gestured upward or downward while singing that target note, thus providing an alternative, concurrent source of spatial information. Our results show that pitch discrimination was significantly biased by the spatial movement in gesture. These effects were eliminated by spatial memory load but preserved under verbal memory load conditions. Together, our findings suggest that pitch and space have a shared representation such that the mental representation of pitch is audiospatial in nature.
  • Cooper, N., & Cutler, A. (2004). Perception of non-native phonemes in noise. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 469-472). Seoul: Sunjin Printing Co.

    Abstract

    We report an investigation of the perception of American English phonemes by Dutch listeners proficient in English. Listeners identified either the consonant or the vowel in most possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (16 dB, 8 dB, and 0 dB). Effects of signal-to-noise ratio on vowel and consonant identification are discussed as a function of syllable position and of relationship to the native phoneme inventory. Comparison of the results with previously reported data from native listeners reveals that noise affected the responding of native and non-native listeners similarly.
  • Crasborn, O., Hulsbosch, M., Lampen, L., & Sloetjes, H. (2014). New multilayer concordance functions in ELAN and TROVA. In Proceedings of the Tilburg Gesture Research Meeting [TiGeR 2013].

    Abstract

    Collocations generated by concordancers are a standard instrument in the exploitation of text corpora for the analysis of language use. Multimodal corpora show similar types of patterns, activities that frequently occur together, but there is no tool that offers facilities for visualising such patterns. Examples include timing of eye contact with respect to speech, and the alignment of activities of the two hands in signed languages. This paper describes recent enhancements to the standard CLARIN tools ELAN and TROVA for multimodal annotation to address these needs: first of all the query and concordancing functions were improved, and secondly the tools now generate visualisations of multilayer collocations that allow for intuitive explorations and analyses of multimodal data. This will provide a boost to the linguistic fields of gesture and sign language studies, as it will improve the exploitation of multimodal corpora.
  • Crasborn, O., & Sloetjes, H. (2014). Improving the exploitation of linguistic annotations in ELAN. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3604-3608).

    Abstract

    This paper discusses some improvements in recent and planned versions of the multimodal annotation tool ELAN, which are targeted at improving the usability of annotated files. Increased support for multilingual documents is provided, by allowing for multilingual vocabularies and by specifying a language per document, annotation layer (tier) or annotation. In addition, improvements in the search possibilities and the display of the results have been implemented, which are especially relevant in the interpretation of the results of complex multi-tier searches.
  • Cristia, A., & Peperkamp, S. (2012). Generalizing without encoding specifics: Infants infer phonotactic patterns on sound classes. In A. K. Biller, E. Y. Chung, & A. E. Kimball (Eds.), Proceedings of the 36th Annual Boston University Conference on Language Development (BUCLD 36) (pp. 126-138). Somerville, Mass.: Cascadilla Press.

  • Cutler, A., Norris, D., & Sebastián-Gallés, N. (2004). Phonemic repertoire and similarity within the vocabulary. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 65-68). Seoul: Sunjin Printing Co.

    Abstract

    Language-specific differences in the size and distribution of the phonemic repertoire can have implications for the task facing listeners in recognising spoken words. A language with more phonemes will allow shorter words and reduced embedding of short words within longer ones, decreasing the potential for spurious lexical competitors to be activated by speech signals. We demonstrate that this is the case via comparative analyses of the vocabularies of English and Spanish. A language which uses suprasegmental as well as segmental contrasts, however, can substantially reduce the extent of spurious embedding.
  • Dalli, A., Tablan, V., Bontcheva, K., Wilks, Y., Broeder, D., Brugman, H., & Wittenburg, P. (2004). Web services architecture for language resources. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 365-368). Paris: European Language Resources Association.
  • Dediu, D., & Levinson, S. C. (2014). Language and speech are old: A review of the evidence and consequences for modern linguistic diversity. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 421-422). Singapore: World Scientific.
  • Defina, R., & Majid, A. (2012). Conceptual event units of putting and taking in two unrelated languages. In N. Miyake, D. Peebles, & R. Cooper (Eds.), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 1470-1475). Austin, TX: Cognitive Science Society.

    Abstract

    People automatically chunk ongoing dynamic events into discrete units. This paper investigates whether linguistic structure is a factor in this process. We test the claim that describing an event with a serial verb construction will influence a speaker’s conceptual event structure. The grammar of Avatime (a Kwa language spoken in Ghana) requires its speakers to describe some, but not all, placement events using a serial verb construction which also encodes the preceding taking event. We tested Avatime and English speakers’ recognition memory for putting and taking events. Avatime speakers were more likely to falsely recognize putting and taking events from episodes associated with take-put serial verb constructions than from episodes associated with other constructions. English speakers showed no difference in false recognitions between episode types. This demonstrates that memory for episodes is related to the type of language used; and, moreover, across languages different conceptual representations are formed for the same physical episode, paralleling habitual linguistic practices.
  • Dingemanse, M., Hammond, J., Stehouwer, H., Somasundaram, A., & Drude, S. (2012). A high speed transcription interface for annotating primary linguistic data. In Proceedings of 6th Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (pp. 7-12). Stroudsburg, PA: Association for Computational Linguistics.

    Abstract

    We present a new transcription mode for the annotation tool ELAN. This mode is designed to speed up the process of creating transcriptions of primary linguistic data (video and/or audio recordings of linguistic behaviour). We survey the basic transcription workflow of some commonly used tools (Transcriber, BlitzScribe, and ELAN) and describe how the new transcription interface improves on these existing implementations. We describe the design of the transcription interface and explore some further possibilities for improvement in the areas of segmentation and computational enrichment of annotations.
  • Dingemanse, M., & Majid, A. (2012). The semantic structure of sensory vocabulary in an African language. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 300-305). Austin, TX: Cognitive Science Society.

    Abstract

    The widespread occurrence of ideophones, large classes of words specialized in evoking sensory imagery, is little known outside linguistics and anthropology. Ideophones are a common feature in many of the world’s languages but are underdeveloped in English and other Indo-European languages. Here we study the meanings of ideophones in Siwu (a Kwa language from Ghana) using a pile-sorting task. The goal was to uncover the underlying structure of the lexical space and to examine the claimed link between ideophones and perception. We found that Siwu ideophones are principally organized around fine-grained aspects of sensory perception, and map onto salient psychophysical dimensions identified in sensory science. The results ratify ideophones as dedicated sensory vocabulary and underline the relevance of ideophones for research on language and perception.
  • Dingemanse, M., Verhoef, T., & Roberts, S. G. (2014). The role of iconicity in the cultural evolution of communicative signals. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-15).
  • Dingemanse, M., Torreira, F., & Enfield, N. J. (2014). Conversational infrastructure and the convergent evolution of linguistic items. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 425-426). Singapore: World Scientific.
  • Dolscheid, S., Hunnius, S., Casasanto, D., & Majid, A. (2012). The sound of thickness: Prelinguistic infants' associations of space and pitch. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 306-311). Austin, TX: Cognitive Science Society.

    Abstract

    People often talk about musical pitch in terms of spatial metaphors. In English, for instance, pitches can be high or low, whereas in other languages pitches are described as thick or thin. According to psychophysical studies, metaphors in language can also shape people’s nonlinguistic space-pitch representations. But does language establish mappings between space and pitch in the first place or does it modify preexisting associations? Here we tested 4-month-old Dutch infants’ sensitivity to height-pitch and thickness-pitch mappings in two preferential looking tasks. Dutch infants looked significantly longer at cross-modally congruent stimuli in both experiments, indicating that infants are sensitive to space-pitch associations prior to language. This early presence of space-pitch mappings suggests that these associations do not originate from language. Rather, language may build upon pre-existing mappings and change them gradually via some form of competitive associative learning.
  • Dolscheid, S., Willems, R. M., Hagoort, P., & Casasanto, D. (2014). The relation of space and musical pitch in the brain. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 421-426). Austin, TX: Cognitive Science Society.

    Abstract

    Numerous experiments show that space and musical pitch are closely linked in people's minds. However, the exact nature of space-pitch associations and their neuronal underpinnings are not well understood. In an fMRI experiment we investigated different types of spatial representations that may underlie musical pitch. Participants judged stimuli that varied in spatial height in both the visual and tactile modalities, as well as auditory stimuli that varied in pitch height. In order to distinguish between unimodal and multimodal spatial bases of musical pitch, we examined whether pitch activations were present in modality-specific (visual or tactile) versus multimodal (visual and tactile) regions active during spatial height processing. Judgments of musical pitch were found to activate unimodal visual areas, suggesting that space-pitch associations may involve modality-specific spatial representations, supporting a key assumption of embodied theories of metaphorical mental representation.
  • Drozdova, P., Van Hout, R., & Scharenborg, O. (2014). Phoneme category retuning in a non-native language. In Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 553-557).

    Abstract

    Previous studies have demonstrated that native listeners modify their interpretation of a speech sound when a talker produces an ambiguous sound in order to quickly tune into a speaker, but there is hardly any evidence that non-native listeners employ a similar mechanism when encountering ambiguous pronunciations. So far, one study demonstrated this lexically-guided perceptual learning effect for non-native listeners, using phoneme categories similar in the native language of the listeners and the non-native language of the stimulus materials. The present study investigates the question whether phoneme category retuning is possible in a non-native language for a contrast, /l/-/r/, which is phonetically differently embedded in the native (Dutch) and non-native (English) languages involved. Listening experiments indeed showed a lexically-guided perceptual learning effect. Assuming that Dutch listeners have different phoneme categories for the native Dutch and non-native English /r/, as marked differences between the languages exist for /r/, these results, for the first time, seem to suggest that listeners are not only able to retune their native phoneme categories but also their non-native phoneme categories to include ambiguous pronunciations.
  • Drude, S., Trilsbeek, P., & Broeder, D. (2012). Language Documentation and Digital Humanities: The (DoBeS) Language Archive. In J. C. Meister (Ed.), Digital Humanities 2012 Conference Abstracts. University of Hamburg, Germany; July 16–22, 2012 (pp. 169-173).

    Abstract

    Overview: Since the early nineties, the on-going dramatic loss of the world’s linguistic diversity has gained attention, first among linguists and increasingly also among the general public. As a response, the new field of language documentation emerged from around 2000 on, starting with the funding initiative ‘Dokumentation Bedrohter Sprachen’ (DoBeS, funded by the Volkswagen foundation, Germany), soon to be followed by others such as the ‘Endangered Languages Documentation Programme’ (ELDP, at SOAS, London), or, in the USA, ‘Electronic Meta-structure for Endangered Languages Documentation’ (EMELD, led by the LinguistList) and ‘Documenting Endangered Languages’ (DEL, by the NSF). From its very beginning, the new field focused on digital technologies not only for recording in audio and video, but also for annotation, lexical databases, corpus building and archiving, among others. This development not only coincides with but is intrinsically interconnected with the increasing focus on digital data, technology and methods in all sciences, in particular in the humanities.
  • Drude, S., Broeder, D., Trilsbeek, P., & Wittenburg, P. (2012). The Language Archive: A new hub for language resources. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 3264-3267). European Language Resources Association (ELRA).

    Abstract

    This contribution presents “The Language Archive” (TLA), a new unit at the MPI for Psycholinguistics, discussing the current developments in management of scientific data, considering the need for new data research infrastructures. Although several initiatives worldwide in the realm of language resources aim at the integration, preservation and mobilization of research data, the state of such scientific data is still often problematic. Data are often not well organized and archived and not described by metadata ― even unique data such as field-work observational data on endangered languages is still mostly on perishable carriers. New data centres are needed that provide trusted, quality-reviewed, persistent services and suitable tools and that take legal and ethical issues seriously. The CLARIN initiative has established criteria for suitable centres. TLA is in a good position to be one such centre. It is based on three essential pillars: (1) A data archive; (2) management, access and annotation tools; (3) archiving and software expertise for collaborative projects. The archive hosts mostly observational data on small languages worldwide and language acquisition data, but also data resulting from experiments.
  • Eisner, F. (2012). Competition in the acoustic encoding of emotional speech. In L. McCrohon (Ed.), Five approaches to language evolution. Proceedings of the workshops of the 9th International Conference on the Evolution of Language (pp. 43-44). Tokyo: Evolang9 Organizing Committee.

    Abstract

    1. Introduction: Speech conveys not only linguistic meaning but also paralinguistic information, such as features of the speaker’s social background, physiology, and emotional state. Linguistic and paralinguistic information is encoded in speech by using largely the same vocal apparatus and both are transmitted simultaneously in the acoustic signal, drawing on a limited set of acoustic cues. How this simultaneous encoding is achieved, how the different types of information are disentangled by the listener, and how much they interfere with one another is presently not well understood. Previous research has highlighted the importance of acoustic source and filter cues for emotion and linguistic encoding respectively, which may suggest that the two types of information are encoded independently of each other. However, those lines of investigation have been almost completely disconnected (Murray & Arnott, 1993).
  • Elbers, W., Broeder, D., & Van Uytvanck, D. (2012). Proper language resource centers. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 3260-3263). European Language Resources Association (ELRA).

    Abstract

    Language resource centers allow researchers to reliably deposit their structured data together with associated meta data and run services operating on this deposited data. We are looking into possibilities to create long-term persistency of both the deposited data and the services operating on this data. Challenges, both technical and non-technical, that need to be solved are the need to replicate more than just the data, proper identification of the digital objects in a distributed environment by making use of persistent identifiers and the set-up of a proper authentication and authorization domain including the management of the authorization information on the digital objects. We acknowledge the investment that most language resource centers have made in their current infrastructure. Therefore one of the most important requirements is the loose coupling with existing infrastructures without the need to make many changes. This shift from a single language resource center into a federated environment of many language resource centers is discussed in the context of a real world center: The Language Archive supported by the Max Planck Institute for Psycholinguistics.
  • Enfield, N. J. (2004). Areal grammaticalisation of postverbal 'acquire' in mainland Southeast Asia. In S. Burusphat (Ed.), Proceedings of the 11th Southeast Asia Linguistics Society Meeting (pp. 275-296). Tempe: Arizona State University.
  • Ernestus, M., Kočková-Amortová, L., & Pollak, P. (2014). The Nijmegen corpus of casual Czech. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 365-370).

    Abstract

    This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which contains more than 30 hours of high-quality recordings of casual conversations in Common Czech, among ten groups of three male and ten groups of three female friends. All speakers were native speakers of Czech, raised in Prague or in the region of Central Bohemia, and were between 19 and 26 years old. Every group of speakers consisted of one confederate, who was instructed to keep the conversations lively, and two speakers naive to the purposes of the recordings. The naive speakers were engaged in conversations for approximately 90 minutes, while the confederate joined them for approximately the last 72 minutes. The corpus was orthographically annotated by experienced transcribers and this orthographic transcription was aligned with the speech signal. In addition, the conversations were videotaped. This corpus can form the basis for all types of research on casual conversations in Czech, including phonetic research and research on how to improve automatic speech recognition. The corpus will be freely available.
  • Filippi, P. (2014). Linguistic animals: understanding language through a comparative approach. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 74-81). doi:10.1142/9789814603638_0082.

    Abstract

    With the aim of clarifying the definition of humans as “linguistic animals”, in the present paper I functionally distinguish three types of language competences: i) language as a general biological tool for communication, ii) “perceptual syntax”, iii) propositional language. Following this terminological distinction, I review pivotal findings on animals' communication systems, which constitute useful evidence for the investigation of the nature of three core components of humans' faculty of language: semantics, syntax, and theory of mind. In fact, although the capacity to process and share utterances with an open-ended structure is uniquely human, some isolated components of our linguistic competence are shared with nonhuman animals. Therefore, as I argue in the present paper, the investigation of animals' communicative competence provides crucial insights into the range of cognitive constraints underlying humans' language ability, enabling at the same time the analysis of its phylogenetic path as well as of the selective pressures that have led to its emergence.
  • Filippi, P., Gingras, B., & Fitch, W. T. (2014). The effect of pitch enhancement on spoken language acquisition. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 437-438). doi:10.1142/9789814603638_0082.

    Abstract

    The aim of this study is to investigate the word-learning phenomenon utilizing a new model that integrates three processes: a) extracting a word out of a continuous sound sequence, b) inducing referential meanings, c) mapping a word onto its intended referent, with the possibility to extend the acquired word over a potentially infinite set of objects of the same semantic category, and over not-previously-heard utterances. Previous work has examined the role of statistical learning and/or of prosody in each of these processes separately. In order to examine the multilayered word-learning task, we integrate these two strands of investigation into a single approach. We have conducted the study on adults and included six different experimental conditions, each including specific perceptual manipulations of the signal. In condition 1, the only cue to word-meaning mapping was the co-occurrence between words and referents (“statistical cue”). This cue was present in all the conditions. In condition 2, we added infant-directed-speech (IDS) typical pitch enhancement as a marker of the target word and of the statistical cue. In condition 3 we placed IDS typical pitch enhancement on random words of the utterances, i.e. inconsistently matching the statistical cue. In conditions 4, 5 and 6 we manipulated respectively duration, a non-prosodic acoustic cue and a visual cue as markers of the target word and of the statistical cue. Systematic comparisons of learning performance in condition 1 with the other conditions revealed that the word-learning process is facilitated only when pitch prominence consistently marks the target word and the statistical cue…
  • Fitch, W. T., Friederici, A. D., & Hagoort, P. (Eds.). (2012). Pattern perception and computational complexity [Special Issue]. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 367 (1598).
  • Floyd, S. (2004). Purismo lingüístico y realidad local: ¿Quichua puro o puro quichuañol? In Proceedings of the Conference on Indigenous Languages of Latin America (CILLA)-I.
  • Francisco, A. A., Jesse, A., Groen, M. A., & McQueen, J. M. (2014). Audiovisual temporal sensitivity in typical and dyslexic adult readers. In Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014) (pp. 2575-2579).

    Abstract

    Reading is an audiovisual process that requires the learning of systematic links between graphemes and phonemes. It is thus possible that reading impairments reflect an audiovisual processing deficit. In this study, we compared audiovisual processing in adults with developmental dyslexia and adults without reading difficulties. We focused on differences in cross-modal temporal sensitivity both for speech and for non-speech events. When compared to adults without reading difficulties, adults with developmental dyslexia presented a wider temporal window in which unsynchronized speech events were perceived as synchronized. No differences were found between groups for the non-speech events. These results suggest a deficit in dyslexia in the perception of cross-modal temporal synchrony for speech events.
  • De la Fuente, J., Santiago, J., Roma, A., Dumitrache, C., & Casasanto, D. (2012). Facing the past: cognitive flexibility in the front-back mapping of time [Abstract]. Cognitive Processing; Special Issue "ICSC 2012, the 5th International Conference on Spatial Cognition: Space and Embodied Cognition". Poster Presentations, 13(Suppl. 1), S58.

    Abstract

    In many languages the future is in front and the past behind, but in some cultures (like Aymara) the past is in front. Is it possible to find this mapping as an alternative conceptualization of time in other cultures? If so, what are the factors that affect its choice out of the set of available alternatives? In a paper and pencil task, participants placed future or past events either in front or behind a character (a schematic head viewed from above). A sample of 24 Islamic participants (whose language also places the future in front and the past behind) tended to locate the past event in the front box more often than Spanish participants. This result might be due to the greater cultural value assigned to tradition in Islamic culture. The same pattern was found in a sample of Spanish elders (N = 58), which may support that conclusion. Alternatively, the crucial factor may be the amount of attention paid to the past. In a final study, young Spanish adults (N = 200) who had just answered a set of questions about their past showed the past-in-front pattern, whereas questions about their future exacerbated the future-in-front pattern. Thus, the attentional explanation was supported: attended events are mapped to front space in agreement with the experiential connection between attending and seeing. When attention is paid to the past, it tends to occupy the front location in spite of available alternative mappings in the language-culture.
  • Ganushchak, L. Y., & Acheson, D. J. (Eds.). (2014). What's to be learned from speaking aloud? - Advances in the neurophysiological measurement of overt language production. [Research topic] [Special Issue]. Frontiers in Language Sciences. Retrieved from http://www.frontiersin.org/Language_Sciences/researchtopics/What_s_to_be_Learned_from_Spea/1671.

    Abstract

    Researchers have long avoided neurophysiological experiments of overt speech production due to the suspicion that artifacts caused by muscle activity may lead to a bad signal-to-noise ratio in the measurements. However, the need to actually produce speech may influence earlier processing and qualitatively change speech production processes and what we can infer from neurophysiological measures thereof. Recently, however, overt speech has been successfully investigated using EEG, MEG, and fMRI. The aim of this Research Topic is to draw together recent research on the neurophysiological basis of language production, with the aim of developing and extending theoretical accounts of the language production process. In this Research Topic of Frontiers in Language Sciences, we invite both experimental and review papers, as well as those about the latest methods in acquisition and analysis of overt language production data. All aspects of language production are welcome: i.e., from conceptualization to articulation during native as well as multilingual language production. Focus should be placed on using the neurophysiological data to inform questions about the processing stages of language production. In addition, emphasis should be placed on the extent to which the identified components of the electrophysiological signal (e.g., ERP/ERF, neuronal oscillations, etc.), brain areas or networks are related to language comprehension and other cognitive domains. By bringing together electrophysiological and neuroimaging evidence on language production mechanisms, a more complete picture of the locus of language production processes and their temporal and neurophysiological signatures will emerge.
  • Gebre, B. G., & Wittenburg, P. (2012). Adaptive automatic gesture stroke detection. In J. C. Meister (Ed.), Digital Humanities 2012 Conference Abstracts. University of Hamburg, Germany; July 16–22, 2012 (pp. 458-461).

    Abstract

    Introduction: Many gesture and sign language researchers manually annotate video recordings to systematically categorize, analyze and explain their observations. The number and kinds of annotations are so diverse and unpredictable that any attempt at developing non-adaptive automatic annotation systems is usually less effective. The trend in the literature has been to develop models that work for average users and for average scenarios. This approach has three main disadvantages. First, it is impossible to know beforehand all the patterns that could be of interest to all researchers. Second, it is practically impossible to find enough training examples for all patterns. Third, it is currently impossible to learn a model that is robustly applicable across all video quality-recording variations.
  • Gebre, B. G., Wittenburg, P., Heskes, T., & Drude, S. (2014). Motion history images for online speaker/signer diarization. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (pp. 1537-1541). Piscataway, NJ: IEEE.

    Abstract

    We present a solution to the problem of online speaker/signer diarization - the task of determining "who spoke/signed when?". Our solution is based on the idea that gestural activity (hands and body movement) is highly correlated with uttering activity. This correlation is necessarily true for sign languages and mostly true for spoken languages. The novel part of our solution is the use of motion history images (MHI) as a likelihood measure for probabilistically detecting uttering activities. MHI is an efficient representation of where and how motion occurred for a fixed period of time. We conducted experiments on 4.9 hours of a publicly available dataset (the AMI meeting data) and 1.4 hours of sign language dataset (Kata Kolok data). The best performance obtained is 15.70% for sign language and 31.90% for spoken language (measurements are in DER). These results show that our solution is applicable in real-world applications like video conferences.

  • Gebre, B. G., Wittenburg, P., Drude, S., Huijbregts, M., & Heskes, T. (2014). Speaker diarization using gesture and speech. In H. Li, & P. Ching (Eds.), Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 582-586).

    Abstract

    We demonstrate how the problem of speaker diarization can be solved using both gesture and speaker parametric models. The novelty of our solution is that we approach the speaker diarization problem as a speaker recognition problem after learning speaker models from speech samples corresponding to gestures (the occurrence of gestures indicates the presence of speech and the location of gestures indicates the identity of the speaker). This new approach offers many advantages: comparable state-of-the-art performance, faster computation and more adaptability. In our implementation, parametric models are used to model speakers' voice and their gestures: more specifically, Gaussian mixture models are used to model the voice characteristics of each person and all persons, and gamma distributions are used to model gestural activity based on features extracted from Motion History Images. Tests on 4.24 hours of the AMI meeting data show that our solution makes DER score improvements of 19% on speech-only segments and 4% on all segments including silence (the comparison is with the AMI system).
  • Gebre, B. G., Wittenburg, P., & Lenkiewicz, P. (2012). Towards automatic gesture stroke detection. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 231-235). European Language Resources Association.

    Abstract

    Automatic annotation of gesture strokes is important for many gesture and sign language researchers. The unpredictable diversity of human gestures and video recording conditions requires that we adopt a more adaptive case-by-case annotation model. In this paper, we present a work-in-progress annotation model that allows a user to a) track hands/face, b) extract features, and c) distinguish strokes from non-strokes. The hands/face tracking is done with color matching algorithms and is initialized by the user. The initialization process is supported with immediate visual feedback. Sliders are also provided to support a user-friendly adjustment of skin color ranges. After successful initialization, features related to positions, orientations and speeds of tracked hands/face are extracted using uniquely identifiable features (corners) from a window of frames and are used for training a learning algorithm. Our preliminary results for stroke detection under non-ideal video conditions are promising and show the potential applicability of our methodology.
  • Gebre, B. G., Crasborn, O., Wittenburg, P., Drude, S., & Heskes, T. (2014). Unsupervised feature learning for visual sign language identification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Vol 2 (pp. 370-376). Red Hook, NY: Curran Proceedings.

    Abstract

    Prior research on language identification focused primarily on text and speech. In this paper, we focus on the visual modality and present a method for identifying sign languages solely from short video samples. The method is trained on unlabelled video data (unsupervised feature learning) and using these features, it is trained to discriminate between six sign languages (supervised learning). We ran experiments on video samples involving 30 signers (running for a total of 6 hours). Using leave-one-signer-out cross-validation, our evaluation on short video samples shows an average best accuracy of 84%. Given that sign languages are under-resourced, unsupervised feature learning techniques are the right tools and our results indicate that this is realistic for sign language identification.
  • Gentzsch, W., Lecarpentier, D., & Wittenburg, P. (2014). Big data in science and the EUDAT project. In Proceedings of the 2014 Annual SRII Global Conference.
  • Gisladottir, R. S., Chwilla, D., Schriefers, H., & Levinson, S. C. (2012). Speech act recognition in conversation: Experimental evidence. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 1596-1601). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2012/papers/0282/index.html.

    Abstract

    Recognizing the speech acts in our interlocutors’ utterances is a crucial prerequisite for conversation. However, it is not a trivial task given that the form and content of utterances are frequently underspecified for this level of meaning. In the present study we investigate participants’ competence in categorizing speech acts in such action-underspecific sentences and explore the time-course of speech act inferencing using a self-paced reading paradigm. The results demonstrate that participants are able to categorize the speech acts with very high accuracy, based on limited context and without any prosodic information. Furthermore, the results show that the exact same sentence is processed differently depending on the speech act it performs, with reading times starting to differ already at the first word. These results indicate that participants are very good at “getting” the speech acts, opening up a new arena for experimental research on action recognition in conversation.
  • Le Guen, O. (2012). Socializing with the supernatural: The place of supernatural entities in Yucatec Maya daily life and socialization. In P. Nondédéo, & A. Breton (Eds.), Maya daily lives: Proceedings of the 13th European Maya Conference (pp. 151-170). Markt Schwaben: Verlag Anton Saurwein.
  • Guerra, E., & Knoeferle, P. (2014). Spatial distance modulates reading times for sentences about social relations: evidence from eye tracking. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2315-2320). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/403/.

    Abstract

    Recent evidence from eye tracking during reading showed that non-referential spatial distance presented in a visual context can modulate semantic interpretation of similarity relations rapidly and incrementally. In two eye-tracking reading experiments we extended these findings in two important ways; first, we examined whether other semantic domains (social relations) could also be rapidly influenced by spatial distance during sentence comprehension. Second, we aimed to further specify how abstract language is co-indexed with spatial information by varying the syntactic structure of sentences between experiments. Spatial distance rapidly modulated reading times as a function of the social relation expressed by a sentence. Moreover, our findings suggest that abstract language can be co-indexed as soon as critical information becomes available for the reader.
  • Guerra, E., Huettig, F., & Knoeferle, P. (2014). Assessing the time course of the influence of featural, distributional and spatial representations during reading. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2309-2314). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/402/.

    Abstract

    What does semantic similarity between two concepts mean? How could we measure it? The way in which semantic similarity is calculated might differ depending on the theoretical notion of semantic representation. In an eye-tracking reading experiment, we investigated whether two widely used semantic similarity measures (based on featural or distributional representations) have distinctive effects on sentence reading times. In other words, we explored whether these measures of semantic similarity differ qualitatively. In addition, we examined whether visually perceived spatial distance interacts with either or both of these measures. Our results showed that the effect of featural and distributional representations on reading times can differ both in direction and in its time course. Moreover, both featural and distributional information interacted with spatial distance, yet in different sentence regions and reading measures. We conclude that featural and distributional representations are distinct components of semantic representation.
  • Habscheid, S., & Klein, W. (Eds.). (2012). Dinge und Maschinen in der Kommunikation [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 42(168).

    Abstract

    “The most profound technologies are those that disappear. They weave themselves into the fabric of everyday life until they are indistinguishable from it.” (Weiser 1991, p. 94). This claim comes from a much-cited text by Mark Weiser, formerly Chief Technology Officer at the famous Xerox Palo Alto Research Center (PARC), where not only did several significant innovations in computer technology originate, but fundamental anthropological insights into how people deal with technical artefacts were also gained. In a popular-science article titled “The Computer for the 21st Century”, Weiser sketched in 1991 the vision of a future in which we no longer deal with a single PC at our workplace; rather, in every room we would be surrounded by hundreds of electronic devices that are inseparably embedded in everyday objects and have therefore, as it were, “disappeared” into our material environment. Weiser was not concerned solely with the ubiquitous phenomenon known in media theory as the “transparency of media”, or in more general theories of everyday experience as a self-evident interwovenness of humans with the things that are familiar to us in their meaning and practically “ready-to-hand”. Beyond that, Weiser's vision aimed at augmenting our already existing environment with computer-readable data and at integrating everyday practices, seamlessly as it were, into the operations of such a ubiquitous network: in the world Weiser sketches, doors open for anyone wearing a particular electronic badge, rooms greet the people who enter them by name, computer terminals adapt to the preferences of individual users, and so on (Weiser 1991, p. 99).
  • Haderlein, T., Moers, C., Möbius, B., & Nöth, E. (2012). Automatic rating of hoarseness by text-based cepstral and prosodic evaluation. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Proceedings of the 15th International Conference on Text, Speech and Dialogue (TSD 2012) (pp. 573-580). Heidelberg: Springer.

    Abstract

    The standard for the analysis of distorted voices is perceptual rating of read-out texts or spontaneous speech. Automatic voice evaluation, however, is usually done on stable sections of sustained vowels. In this paper, text-based and established vowel-based analysis are compared with respect to their ability to measure hoarseness and its subclasses. 73 hoarse patients (48.3±16.8 years) uttered the vowel /e/ and read the German version of the text “The North Wind and the Sun”. Five speech therapists and physicians rated roughness, breathiness, and hoarseness according to the German RBH evaluation scheme. The best human-machine correlations were obtained for measures based on the Cepstral Peak Prominence (CPP; up to |r| = 0.73). Support Vector Regression (SVR) on CPP-based measures and prosodic features improved the results further to r ≈ 0.8 and confirmed that automatic voice evaluation should be performed on a text recording.
  • Hammarström, H., & van den Heuvel, W. (Eds.). (2012). On the history, contact & classification of Papuan languages [Special Issue]. Language & Linguistics in Melanesia, 2012. Retrieved from http://www.langlxmelanesia.com/specialissues.htm.
  • Hanique, I., & Ernestus, M. (2012). The processes underlying two frequent casual speech phenomena in Dutch: A production experiment. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 2011-2014).

    Abstract

    This study investigated whether a shadowing task can provide insights into the nature of reduction processes that are typical of casual speech. We focused on the shortening and presence versus absence of schwa and /t/ in Dutch past participles. Results showed that the absence of these segments was affected by the same variables as their shortening, suggesting that absence mostly resulted from extreme gradient shortening. This contrasts with results based on recordings of spontaneous conversations. We hypothesize that this difference is due to non-casual fast speech elicited by a shadowing task.
  • Heyselaar, E., Hagoort, P., & Segaert, K. (2014). In dialogue with an avatar, syntax production is identical compared to dialogue with a human partner. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2351-2356). Austin, TX: Cognitive Science Society.

    Abstract

    The use of virtual reality (VR) as a methodological tool is becoming increasingly popular in behavioural research due to its seemingly limitless possibilities. This new method has not been used frequently in the field of psycholinguistics, however, possibly due to the assumption that human-computer interaction does not accurately reflect human-human interaction. In the current study we compare participants’ language behaviour in a syntactic priming task with human versus avatar partners. Our study shows comparable priming effects between human and avatar partners (Human: 12.3%; Avatar: 12.6% for passive sentences) suggesting that VR is a valid platform for conducting language research and studying dialogue interactions.
  • Hoffmann, C. W. G., Sadakata, M., Chen, A., Desain, P., & McQueen, J. M. (2014). Within-category variance and lexical tone discrimination in native and non-native speakers. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 45-49). Nijmegen: Radboud University Nijmegen.

    Abstract

    In this paper, we show how acoustic variance within lexical tones in disyllabic Mandarin Chinese pseudowords affects discrimination abilities in both native and non-native speakers of Mandarin Chinese. Within-category acoustic variance did not hinder native speakers in discriminating between lexical tones, whereas it precludes Dutch native speakers from reaching native level performance. Furthermore, the influence of acoustic variance was not uniform but asymmetric, dependent on the presentation order of the lexical tones to be discriminated. An exploratory analysis using an active adaptive oddball paradigm was used to quantify the extent of the perceptual asymmetry. We discuss two possible mechanisms underlying this asymmetry and propose possible paradigms to investigate these mechanisms.
  • Holler, J., Kelly, S., Hagoort, P., & Ozyurek, A. (2012). When gestures catch the eye: The influence of gaze direction on co-speech gesture comprehension in triadic communication. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 467-472). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2012/papers/0092/index.html.

    Abstract

    Co-speech gestures are an integral part of human face-to-face communication, but little is known about how pragmatic factors influence our comprehension of those gestures. The present study investigates how different types of recipients process iconic gestures in a triadic communicative situation. Participants (N = 32) took on the role of one of two recipients in a triad and were presented with 160 video clips of an actor speaking, or speaking and gesturing. Crucially, the actor’s eye gaze was manipulated in that she alternated her gaze between the two recipients. Participants thus perceived some messages in the role of addressed recipient and some in the role of unaddressed recipient. In these roles, participants were asked to make judgements concerning the speaker’s messages. Their reaction times showed that unaddressed recipients did comprehend speaker’s gestures differently to addressees. The findings are discussed with respect to automatic and controlled processes involved in gesture comprehension.
  • Janzen, G., & Weststeijn, C. (2004). Neural representation of object location and route direction: An fMRI study. NeuroImage, 22(Supplement 1), e634-e635.
  • Janzen, G., & Van Turennout, M. (2004). Neuronale Markierung navigationsrelevanter Objekte im räumlichen Gedächtnis: Ein fMRT Experiment. In D. Kerzel (Ed.), Beiträge zur 46. Tagung experimentell arbeitender Psychologen (p. 125). Lengerich: Pabst Science Publishers.
  • Johns, T. G., Perera, R. M., Vitali, A. A., Vernes, S. C., & Scott, A. (2004). Phosphorylation of a glioma-specific mutation of the EGFR [Abstract]. Neuro-Oncology, 6, 317.

    Abstract

    Mutations of the epidermal growth factor receptor (EGFR) gene are found at a relatively high frequency in glioma, with the most common being the de2-7 EGFR (or EGFRvIII). This mutation arises from an in-frame deletion of exons 2-7, which removes 267 amino acids from the extracellular domain of the receptor. Despite being unable to bind ligand, the de2-7 EGFR is constitutively active at a low level. Transfection of human glioma cells with the de2-7 EGFR has little effect in vitro, but when grown as tumor xenografts this mutated receptor imparts a dramatic growth advantage. We mapped the phosphorylation pattern of de2-7 EGFR, both in vivo and in vitro, using a panel of antibodies specific for different phosphorylated tyrosine residues. Phosphorylation of de2-7 EGFR was detected constitutively at all tyrosine sites surveyed in vitro and in vivo, including tyrosine 845, a known target in the wild-type EGFR for src kinase. There was a substantial upregulation of phosphorylation at every tyrosine residue of the de2-7 EGFR when cells were grown in vivo compared to the receptor isolated from cells cultured in vitro. Upregulation of phosphorylation at tyrosine 845 could be stimulated in vitro by the addition of specific components of the ECM via an integrin-dependent mechanism. These observations may partially explain why the growth enhancement mediated by de2-7 EGFR is largely restricted to the in vivo environment.
  • Jung, D., Klessa, K., Duray, Z., Oszkó, B., Sipos, M., Szeverényi, S., Várnai, Z., Trilsbeek, P., & Váradi, T. (2014). Languagesindanger.eu - Including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 530-535).

    Abstract

    The present paper describes the development of the languagesindanger.eu interactive website as an example of including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. The website is a product of INNET (Innovative networking in infrastructure for endangered languages), a European FP7 project. Its main functions can be summarized as related to the following three areas: (1) raising students' awareness of language endangerment and arousing their interest in linguistic diversity, language maintenance and language documentation; (2) informing both students and teachers about these topics and showing ways in which they can enlarge their knowledge further, with a special emphasis on information about language archives; (3) helping teachers include these topics in their classes. The website has been localized into five language versions with the intention of being accessible to both scientific and non-scientific communities such as (primarily) secondary school teachers and students, beginning university students of linguistics, journalists, the interested public, and also members of speech communities who speak minority languages.
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses? A corpus study revealing unexpected rigidity. In S. Kepser, & M. Reis (Eds.), Pre-Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: Niemeyer.
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses?: A corpus study revealing unexpected rigidity. In Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: University of Tübingen.
  • Kempen, G. (2004). Interactive visualization of syntactic structure assembly for grammar-intensive first- and second-language instruction. In R. Delmonte, P. Delcloque, & S. Tonelli (Eds.), Proceedings of InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced language learning systems (pp. 183-186). Venice: University of Venice.
  • Kempen, G. (2004). Human grammatical coding: Shared structure formation resources for grammatical encoding and decoding. In Cuny 2004 - The 17th Annual CUNY Conference on Human Sentence Processing. March 25-27, 2004. University of Maryland (p. 66).
  • Klatter-Folmer, J., Van Hout, R., Van den Heuvel, H., Fikkert, P., Baker, A., De Jong, J., Wijnen, F., Sanders, E., & Trilsbeek, P. (2014). Vulnerability in acquisition, language impairments in Dutch: Creating a VALID data archive. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 357-364).

    Abstract

    The VALID Data Archive is an open multimedia data archive (under construction) with data from speakers suffering from language impairments. We report on a pilot project in the CLARIN-NL framework in which five data resources were curated. For all data sets concerned, written informed consent from the participants or their caretakers has been obtained. All materials were anonymized. The audio files were converted into wav (linear PCM) files and the transcriptions into CHAT or ELAN format. Research data that consisted of test, SPSS and Excel files were documented and converted into CSV files. All data sets obtained appropriate CMDI metadata files. A new CMDI metadata profile for this type of data resources was established and care was taken that ISOcat metadata categories were used to optimize interoperability. After curation all data are deposited at the Max Planck Institute for Psycholinguistics Nijmegen where persistent identifiers are linked to all resources. The content of the transcriptions in CHAT and plain text format can be searched with the TROVA search engine.
  • Klein, W. (Ed.). (2004). Philologie auf neuen Wegen [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 136.
  • Klein, W. (Ed.). (1979). Sprache und Kontext [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (33).
  • Klein, W. (Ed.). (2004). Universitas [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik (LiLi), 134.
  • Latrouite, A., & Van Valin Jr., R. D. (2014). Event existentials in Tagalog: A Role and Reference Grammar account. In W. Arka, & N. L. K. Mas Indrawati (Eds.), Argument realisations and related constructions in Austronesian languages: papers from 12-ICAL (pp. 161-174). Canberra: Pacific Linguistics.
  • Lenkiewicz, P., Auer, E., Schreer, O., Masneri, S., Schneider, D., & Tschöpe, S. (2012). AVATecH ― automated annotation through audio and video analysis. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 209-214). European Language Resources Association.

    Abstract

    In different fields of the humanities annotations of multimodal resources are a necessary component of the research workflow. Examples include linguistics, psychology, anthropology, etc. However, creation of those annotations is a very laborious task, which can take 50 to 100 times the length of the annotated media, or more. This can be significantly improved by applying innovative audio and video processing algorithms, which analyze the recordings and provide automated annotations. This is the aim of the AVATecH project, which is a collaboration of the Max Planck Institute for Psycholinguistics (MPI) and the Fraunhofer institutes HHI and IAIS. In this paper we present a set of results of automated annotation together with an evaluation of their quality.
  • Lenkiewicz, P., Drude, S., Lenkiewicz, A., Gebre, B. G., Masneri, S., Schreer, O., Schwenninger, J., & Bardeli, R. (2014). Application of audio and video processing methods for language research and documentation: The AVATecH Project. In Z. Vetulani, & J. Mariani (Eds.), 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers (pp. 288-299). Berlin: Springer.

    Abstract

    That all modern languages evolve and change is a well-known fact. Recently, however, this change has reached dynamics never seen before, which results in the loss of the vast amount of information encoded in every language. In order to preserve such rich heritage, and to carry out linguistic research, properly annotated recordings of world languages are necessary. Since creating those annotations is a very laborious task, taking up to 100 times longer than the length of the annotated media, innovative video processing algorithms are needed in order to improve the efficiency and quality of the annotation process. This is the scope of the AVATecH project presented in this article.
  • Lenkiewicz, A., Lis, M., & Lenkiewicz, P. (2012). Linguistic concepts described with Media Query Language for automated annotation. In J. C. Meiser (Ed.), Digital Humanities 2012 Conference Abstracts. University of Hamburg, Germany; July 16–22, 2012 (pp. 477-479).

    Abstract

    Human spoken communication is multimodal, i.e. it encompasses both speech and gesture. Acoustic properties of voice, body movements, facial expression, etc. are an inherent and meaningful part of spoken interaction; they can provide attitudinal, grammatical and semantic information. In recent years interest in audio-visual corpora has been rising rapidly as they enable investigation of different communicative modalities and provide a more holistic view on communication (Kipp et al. 2009). Moreover, for some languages such corpora are the only available resource, as is the case for endangered languages for which no written resources exist.
  • Lenkiewicz, P., Shkaravska, O., Goosen, T., Windhouwer, M., Broeder, D., Roth, S., & Olsson, O. (2014). The DWAN framework: Application of a web annotation framework for the general humanities to the domain of language resources. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3644-3649).
  • Lenkiewicz, P., Van Uytvanck, D., Wittenburg, P., & Drude, S. (2012). Towards automated annotation of audio and video recordings by application of advanced web-services. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 1880-1883).

    Abstract

    In this paper we describe audio and video processing algorithms that are developed in the scope of AVATecH project. The purpose of these algorithms is to shorten the time taken by manual annotation of audio and video recordings by extracting features from media files and creating semi-automated annotations. We show that the use of such supporting algorithms can shorten the annotation time to 30-50% of the time necessary to perform a fully manual annotation of the same kind.
  • Lev-Ari, S., & Peperkamp, S. (2014). Do people converge to the linguistic patterns of non-reliable speakers? Perceptual learning from non-native speakers. In S. Fuchs, M. Grice, A. Hermes, L. Lancia, & D. Mücke (Eds.), Proceedings of the 10th International Seminar on Speech Production (ISSP) (pp. 261-264).

    Abstract

    People's language is shaped by the input from the environment. The environment, however, offers a range of linguistic inputs that differ in their reliability. We test whether listeners accordingly weigh input from sources that differ in reliability differently. Using a perceptual learning paradigm, we show that listeners adjust their representations according to linguistic input provided by native but not by non-native speakers. This is despite the fact that listeners are able to learn the characteristics of the speech of both speakers. These results provide evidence for a dissociation between adaptation to the characteristics of specific speakers and adjustment of linguistic representations in general based on these learned characteristics. This study also has implications for theories of language change. In particular, it casts doubt on the hypothesis that a large proportion of non-native speakers in a community can bring about linguistic changes.
  • Levinson, S. C. (1979). Pragmatics and social deixis: Reclaiming the notion of conventional implicature. In C. Chiarello (Ed.), Proceedings of the Fifth Annual Meeting of the Berkeley Linguistics Society (pp. 206-223).
  • Lew, A. A., Hall-Lew, L., & Fairs, A. (2014). Language and Tourism in Sabah, Malaysia and Edinburgh, Scotland. In B. O'Rourke, N. Bermingham, & S. Brennan (Eds.), Opening New Lines of Communication in Applied Linguistics: Proceedings of the 46th Annual Meeting of the British Association for Applied Linguistics (pp. 253-259). London, UK: Scitsiugnil Press.
  • Little, H., & Silvey, C. (2014). Interpreting emerging structures: The interdependence of combinatoriality and compositionality. In Proceedings of the First Conference of the International Association for Cognitive Semiotics (IACS 2014) (pp. 113-114).
  • Little, H., & Eryilmaz, K. (2014). The effect of physical articulation constraints on the emergence of combinatorial structure. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-17).
