Publications

  • Alhama, R. G., & Zuidema, W. (2017). Segmentation as Retention and Recognition: the R&R model. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 1531-1536). Austin, TX: Cognitive Science Society.

    Abstract

    We present the Retention and Recognition model (R&R), a probabilistic exemplar model that accounts for segmentation in Artificial Language Learning experiments. We show that R&R provides an excellent fit to human responses in three segmentation experiments with adults (Frank et al., 2010), outperforming existing models. Additionally, we analyze the results of the simulations and propose alternative explanations for the experimental findings.
  • Alhama, R. G., Scha, R., & Zuidema, W. (2014). Rule learning in humans and animals. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The evolution of language: Proceedings of the 10th International Conference (EVOLANG 10) (pp. 371-372). Singapore: World Scientific.
  • Azar, Z., Backus, A., & Ozyurek, A. (2017). Highly proficient bilinguals maintain language-specific pragmatic constraints on pronouns: Evidence from speech and gesture. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 81-86). Austin, TX: Cognitive Science Society.

    Abstract

    The use of subject pronouns by bilingual speakers using both a pro-drop and a non-pro-drop language (e.g. Spanish heritage speakers in the USA) is a well-studied topic in research on cross-linguistic influence in language contact situations. Previous studies looking at bilinguals with different proficiency levels have yielded conflicting results on whether there is transfer from the non-pro-drop patterns to the pro-drop language. Additionally, previous research has focused on speech patterns only. In this paper, we study the two modalities of language, speech and gesture, and ask whether and how they reveal cross-linguistic influence on the use of subject pronouns in discourse. We focus on elicited narratives from heritage speakers of Turkish in the Netherlands, in both Turkish (pro-drop) and Dutch (non-pro-drop), as well as from monolingual control groups. The use of pronouns was not very common in monolingual Turkish narratives and was constrained by the pragmatic contexts, unlike in Dutch. Furthermore, Turkish pronouns were more likely to be accompanied by localized gestures than Dutch pronouns, presumably because pronouns in Turkish are pragmatically marked forms. We did not find any cross-linguistic influence in bilingual speech or gesture patterns, in line with studies (speech only) of highly proficient bilinguals. We therefore suggest that speech and gesture parallel each other not only in monolingual but also in bilingual production. Highly proficient heritage speakers who have been exposed to diverse linguistic and gestural patterns of each language from early on maintain monolingual patterns of pragmatic constraints on the use of pronouns multimodally.
  • Bauer, B. L. M. (2014). Indefinite HOMO in the Gospels of the Vulgata. In P. Molinelli, P. Cuzzolin, & C. Fedriani (Eds.), Latin vulgaire – latin tardif X (pp. 415-435). Bergamo: Bergamo University Press.
  • Bergmann, C., Ten Bosch, L., & Boves, L. (2014). A computational model of the headturn preference procedure: Design, challenges, and insights. In J. Mayor, & P. Gomez (Eds.), Computational Models of Cognitive Processes (pp. 125-136). World Scientific. doi:10.1142/9789814458849_0010.

    Abstract

    The Headturn Preference Procedure (HPP) is a frequently used method (e.g., Jusczyk & Aslin; and subsequent studies) to investigate linguistic abilities in infants. In this paradigm infants are usually first familiarised with words and then tested for a listening preference for passages containing those words in comparison to unrelated passages. Listening preference is defined as the time an infant spends attending to those passages with his or her head turned towards a flashing light and the speech stimuli. The knowledge and abilities inferred from the results of HPP studies have been used to reason about and formally model early linguistic skills and language acquisition. However, the actual cause of infants' behaviour in HPP experiments has been subject to numerous assumptions as there are no means to directly tap into cognitive processes. To make these assumptions explicit, and more crucially, to understand how infants' behaviour emerges if only general learning mechanisms are assumed, we introduce a computational model of the HPP. Simulations with the computational HPP model show that the difference in infant behaviour between familiarised and unfamiliar words in passages can be explained by a general learning mechanism and that many assumptions underlying the HPP are not necessarily warranted. We discuss the implications for conventional interpretations of the outcomes of HPP experiments.
  • Bergmann, C., Tsuji, S., & Cristia, A. (2017). Top-down versus bottom-up theories of phonological acquisition: A big data approach. In Proceedings of Interspeech 2017 (pp. 2103-2107).

    Abstract

    Recent work has made available a number of standardized meta-analyses bearing on various aspects of infant language processing. We utilize data from two such meta-analyses (discrimination of vowel contrasts and word segmentation, i.e., recognition of word forms extracted from running speech) to assess whether the published body of empirical evidence supports a bottom-up versus a top-down theory of early phonological development by leveraging the power of results from thousands of infants. We predicted that if infants can rely purely on auditory experience to develop their phonological categories, then vowel discrimination and word segmentation should develop in parallel, with the latter being potentially lagged compared to the former. However, if infants crucially rely on word form information to build their phonological categories, then development at the word level must precede the acquisition of native sound categories. Our results do not support the latter prediction. We discuss potential implications and limitations, most saliently that word forms are only one top-down level proposed to affect phonological development, with other proposals suggesting that top-down pressures emerge from lexical (i.e., word-meaning pairs) development. This investigation also highlights general procedures by which standardized meta-analyses may be reused to answer theoretical questions spanning across phenomena.

  • Black, A., & Bergmann, C. (2017). Quantifying infants' statistical word segmentation: A meta-analysis. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Meeting of the Cognitive Science Society (pp. 124-129). Austin, TX: Cognitive Science Society.

    Abstract

    Theories of language acquisition and perceptual learning increasingly rely on statistical learning mechanisms. The current meta-analysis aims to clarify the robustness of this capacity in infancy within the word segmentation literature. Our analysis reveals a significant, small effect size for conceptual replications of Saffran, Aslin, & Newport (1996), and a nonsignificant effect across all studies that incorporate transitional probabilities to segment words. In both conceptual replications and the broader literature, however, statistical learning is moderated by whether stimuli are naturally produced or synthesized. These findings invite deeper questions about the complex factors that influence statistical learning, and the role of statistical learning in language acquisition.
  • Blasi, D. E., Christiansen, M. H., Wichmann, S., Hammarström, H., & Stadler, P. F. (2014). Sound symbolism and the origins of language. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The evolution of language: Proceedings of the 10th International Conference (EVOLANG 10) (pp. 391-392). Singapore: World Scientific.
  • Bocanegra, B. R., Poletiek, F. H., & Zwaan, R. A. (2014). Asymmetrical feature binding across language and perception. In Proceedings of the 7th annual Conference on Embodied and Situated Language Processing (ESLP 2014).
  • Bosker, H. R., & Kösem, A. (2017). An entrained rhythm's frequency, not phase, influences temporal sampling of speech. In Proceedings of Interspeech 2017 (pp. 2416-2420). doi:10.21437/Interspeech.2017-73.

    Abstract

    Brain oscillations have been shown to track the slow amplitude fluctuations in speech during comprehension. Moreover, there is evidence that these stimulus-induced cortical rhythms may persist even after the driving stimulus has ceased. However, how exactly this neural entrainment shapes speech perception remains debated. This behavioral study investigated whether and how the frequency and phase of an entrained rhythm would influence the temporal sampling of subsequent speech. In two behavioral experiments, participants were presented with slow and fast isochronous tone sequences, followed by Dutch target words ambiguous between as /ɑs/ “ash” (with a short vowel) and aas /a:s/ “bait” (with a long vowel). Target words were presented at various phases of the entrained rhythm. Both experiments revealed effects of the frequency of the tone sequence on target word perception: fast sequences biased listeners to more long /a:s/ responses. However, no evidence for phase effects could be discerned. These findings show that an entrained rhythm’s frequency, but not phase, influences the temporal sampling of subsequent speech. These outcomes are compatible with theories suggesting that sensory timing is evaluated relative to entrained frequency. Furthermore, they suggest that phase tracking of (syllabic) rhythms by theta oscillations plays a limited role in speech parsing.
  • Bosker, H. R. (2017). The role of temporal amplitude modulations in the political arena: Hillary Clinton vs. Donald Trump. In Proceedings of Interspeech 2017 (pp. 2228-2232). doi:10.21437/Interspeech.2017-142.

    Abstract

    Speech is an acoustic signal with inherent amplitude modulations in the 1-9 Hz range. Recent models of speech perception propose that this rhythmic nature of speech is central to speech recognition. Moreover, rhythmic amplitude modulations have been shown to have beneficial effects on language processing and the subjective impression listeners have of the speaker. This study investigated the role of amplitude modulations in the political arena by comparing the speech produced by Hillary Clinton and Donald Trump in the three presidential debates of 2016. Inspection of the modulation spectra, revealing the spectral content of the two speakers’ amplitude envelopes after matching for overall intensity, showed considerably greater power in Clinton’s modulation spectra (compared to Trump’s) across the three debates, particularly in the 1-9 Hz range. The findings suggest that Clinton’s speech had a more pronounced temporal envelope with rhythmic amplitude modulations below 9 Hz, with a preference for modulations around 3 Hz. This may be taken as evidence for a more structured temporal organization of syllables in Clinton’s speech, potentially due to more frequent use of preplanned utterances. Outcomes are interpreted in light of the potential beneficial effects of a rhythmic temporal envelope on intelligibility and speaker perception.
  • Bowerman, M., de León, L., & Choi, S. (1995). Verbs, particles, and spatial semantics: Learning to talk about spatial actions in typologically different languages. In E. V. Clark (Ed.), Proceedings of the Twenty-seventh Annual Child Language Research Forum (pp. 101-110). Stanford, CA: Center for the Study of Language and Information.
  • Broeder, D., Schuurman, I., & Windhouwer, M. (2014). Experiences with the ISOcat Data Category Registry. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 4565-4568).
  • Burchfield, L. A., Luk, S.-H.-K., Antoniou, M., & Cutler, A. (2017). Lexically guided perceptual learning in Mandarin Chinese. In Proceedings of Interspeech 2017 (pp. 576-580). doi:10.21437/Interspeech.2017-618.

    Abstract

    Lexically guided perceptual learning refers to the use of lexical knowledge to retune speech categories and thereby adapt to a novel talker’s pronunciation. This adaptation has been extensively documented, but primarily for segmental-based learning in English and Dutch. In languages with lexical tone, such as Mandarin Chinese, tonal categories can also be retuned in this way, but segmental category retuning had not been studied. We report two experiments in which Mandarin Chinese listeners were exposed to an ambiguous mixture of [f] and [s] in lexical contexts favoring an interpretation as either [f] or [s]. Listeners were subsequently more likely to identify sounds along a continuum between [f] and [s], and to interpret minimal word pairs, in a manner consistent with this exposure. Thus lexically guided perceptual learning of segmental categories had indeed taken place, consistent with suggestions that such learning may be a universally available adaptation process.
  • Casillas, M., Bergelson, E., Warlaumont, A. S., Cristia, A., Soderstrom, M., VanDam, M., & Sloetjes, H. (2017). A New Workflow for Semi-automatized Annotations: Tests with Long-Form Naturalistic Recordings of Children's Language Environments. In Proceedings of Interspeech 2017 (pp. 2098-2102). doi:10.21437/Interspeech.2017-1418.

    Abstract

    Interoperable annotation formats are fundamental to the utility, expansion, and sustainability of collective data repositories. In language development research, shared annotation schemes have been critical to facilitating the transition from raw acoustic data to searchable, structured corpora. Current schemes typically require comprehensive and manual annotation of utterance boundaries and orthographic speech content, with an additional, optional range of tags of interest. These schemes have been enormously successful for datasets on the scale of dozens of recording hours but are untenable for long-format recording corpora, which routinely contain hundreds to thousands of audio hours. Long-format corpora would benefit greatly from (semi-)automated analyses, both on the earliest steps of annotation—voice activity detection, utterance segmentation, and speaker diarization—as well as later steps—e.g., classification-based codes such as child-vs-adult-directed speech, and speech recognition to produce phonetic/orthographic representations. We present an annotation workflow specifically designed for long-format corpora which can be tailored by individual researchers and which interfaces with the current dominant scheme for short-format recordings. The workflow allows semi-automated annotation and analyses at higher linguistic levels. We give one example of how the workflow has been successfully implemented in a large cross-database project.
  • Casillas, M., Amatuni, A., Seidl, A., Soderstrom, M., Warlaumont, A., & Bergelson, E. (2017). What do Babies hear? Analyses of Child- and Adult-Directed Speech. In Proceedings of Interspeech 2017 (pp. 2093-2097). doi:10.21437/Interspeech.2017-1409.

    Abstract

    Child-directed speech is argued to facilitate language development, and is found cross-linguistically and cross-culturally to varying degrees. However, previous research has generally focused on short samples of child-caregiver interaction, often in the lab or with experimenters present. We test the generalizability of this phenomenon with an initial descriptive analysis of the speech heard by young children in a large, unique collection of naturalistic, daylong home recordings. Trained annotators coded automatically-detected adult speech 'utterances' from 61 homes across 4 North American cities, gathered from children (age 2-24 months) wearing audio recorders during a typical day. Coders marked the speaker gender (male/female) and intended addressee (child/adult), yielding 10,886 addressee and gender tags from 2,523 minutes of audio (cf. HB-CHAAC Interspeech ComParE challenge; Schuller et al., in press). Automated speaker-diarization (LENA) incorrectly gender-tagged 30% of male adult utterances, compared to manually-coded consensus. Furthermore, we find effects of SES and gender on child-directed and overall speech, increasing child-directed speech with child age, and interactions of speaker gender, child gender, and child age: female caretakers increased their child-directed speech more with age than male caretakers did, but only for male infants. Implications for language acquisition and existing classification algorithms are discussed.
  • Chen, A. (2014). Production-comprehension (A)Symmetry: Individual differences in the acquisition of prosodic focus-marking. In N. Campbell, D. Gibbon, & D. Hirst (Eds.), Proceedings of Speech Prosody 2014 (pp. 423-427).

    Abstract

    Previous work based on different groups of children has shown that four- to five-year-old children are similar to adults in both producing and comprehending the focus-to-accentuation mapping in Dutch, contra the alleged production-precedes-comprehension asymmetry in earlier studies. In the current study, we addressed the question of whether there are individual differences in the production-comprehension (a)symmetricity. To this end, we examined the use of prosody in focus marking in production and the processing of focus-related prosody in online language comprehension in the same group of 4- to 5-year-olds. We have found that the relationship between comprehension and production can be rather diverse at an individual level. This result suggests some degree of independence in learning to use prosody to mark focus in production and learning to process focus-related prosodic information in online language comprehension, and implies influences of other linguistic and non-linguistic factors on the production-comprehension (a)symmetricity.
  • Chen, A., Chen, A., Kager, R., & Wong, P. (2014). Rises and falls in Dutch and Mandarin Chinese. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 83-86).

    Abstract

    Despite the different functions of pitch in tone and non-tone languages, rises and falls are common pitch patterns across different languages. In the current study, we ask what the language-specific phonetic realization of rises and falls is. Chinese and Dutch speakers participated in a production experiment. We used contexts composed to convey specific communicative purposes to elicit rises and falls. We measured both tonal alignment and tonal scaling for both patterns. For the alignment measurements, we found language-specific patterns for the rises, but not for the falls. For rises, both peak and valley were aligned later among Chinese speakers than among Dutch speakers. For all the scaling measurements (maximum pitch, minimum pitch, and pitch range), no language-specific patterns were found for either the rises or the falls.
  • Clark, N., & Perlman, M. (2014). Breath, vocal, and supralaryngeal flexibility in a human-reared gorilla. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-15).

    Abstract

    “Gesture-first” theories dismiss ancestral great apes’ vocalization as a substrate for language evolution based on the claim that extant apes exhibit minimal learning and volitional control of vocalization. Contrary to this claim, we present data of novel learned and voluntarily controlled vocal behaviors produced by a human-fostered gorilla (G. gorilla gorilla). These behaviors demonstrate varying degrees of flexibility in the vocal apparatus (including diaphragm, lungs, larynx, and supralaryngeal articulators), and are predominantly performed in coordination with manual behaviors and gestures. Instead of a gesture-first theory, we suggest that these findings support multimodal theories of language evolution in which vocal and gestural forms are coordinated and supplement one another.
  • Crasborn, O., Hulsbosch, M., Lampen, L., & Sloetjes, H. (2014). New multilayer concordance functions in ELAN and TROVA. In Proceedings of the Tilburg Gesture Research Meeting [TiGeR 2013].

    Abstract

    Collocations generated by concordancers are a standard instrument in the exploitation of text corpora for the analysis of language use. Multimodal corpora show similar types of patterns, activities that frequently occur together, but there is no tool that offers facilities for visualising such patterns. Examples include timing of eye contact with respect to speech, and the alignment of activities of the two hands in signed languages. This paper describes recent enhancements to the standard CLARIN tools ELAN and TROVA for multimodal annotation to address these needs: first of all the query and concordancing functions were improved, and secondly the tools now generate visualisations of multilayer collocations that allow for intuitive explorations and analyses of multimodal data. This will provide a boost to the linguistic fields of gesture and sign language studies, as it will improve the exploitation of multimodal corpora.
  • Crasborn, O., & Sloetjes, H. (2014). Improving the exploitation of linguistic annotations in ELAN. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3604-3608).

    Abstract

    This paper discusses some improvements in recent and planned versions of the multimodal annotation tool ELAN, which are targeted at improving the usability of annotated files. Increased support for multilingual documents is provided, by allowing for multilingual vocabularies and by specifying a language per document, annotation layer (tier) or annotation. In addition, improvements in the search possibilities and the display of the results have been implemented, which are especially relevant in the interpretation of the results of complex multi-tier searches.
  • Cutler, A. (2017). Converging evidence for abstract phonological knowledge in speech processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 1447-1448). Austin, TX: Cognitive Science Society.

    Abstract

    The perceptual processing of speech is a constant interplay of multiple competing albeit convergent processes: acoustic input vs. higher-level representations, universal mechanisms vs. language-specific, veridical traces of speech experience vs. construction and activation of abstract representations. The present summary concerns the third of these issues. The ability to generalise across experience and to deal with resulting abstractions is the hallmark of human cognition, visible even in early infancy. In speech processing, abstract representations play a necessary role in both production and perception. New sorts of evidence are now informing our understanding of the breadth of this role.
  • Cutler, A., & Fear, B. D. (1991). Categoricality in acceptability judgements for strong versus weak vowels. In J. Llisterri (Ed.), Proceedings of the ESCA Workshop on Phonetics and Phonology of Speaking Styles (pp. 18.1-18.5). Barcelona, Catalonia: Universitat Autonoma de Barcelona.

    Abstract

    A distinction between strong and weak vowels can be drawn on the basis of vowel quality, of stress, or of both factors. An experiment was conducted in which sets of contextually matched word-initial vowels ranging from clearly strong to clearly weak were cross-spliced, and the naturalness of the resulting words was rated by listeners. The ratings showed that in general cross-spliced words were only significantly less acceptable than unspliced words when schwa was not involved; this supports a categorical distinction based on vowel quality.
  • Ip, M. H. K., & Cutler, A. (2017). Intonation facilitates prediction of focus even in the presence of lexical tones. In Proceedings of Interspeech 2017 (pp. 1218-1222). doi:10.21437/Interspeech.2017-264.

    Abstract

    In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. However, is this strategy universally available, even in languages with different phonological systems? In a phoneme detection experiment, we examined whether prosodic entrainment is also found in Mandarin Chinese, a tone language, where in principle the use of pitch for lexical identity may take precedence over the use of pitch cues to salience. Consistent with the results from Germanic languages, response times were facilitated when preceding intonation predicted accent on the target-bearing word. Acoustic analyses revealed greater F0 range in the preceding intonation of the predicted-accent sentences. These findings have implications for how universal and language-specific mechanisms interact in the processing of salience.
  • Cutler, A., & Chen, H.-C. (1995). Phonological similarity effects in Cantonese word recognition. In K. Elenius, & P. Branderud (Eds.), Proceedings of the Thirteenth International Congress of Phonetic Sciences: Vol. 1 (pp. 106-109). Stockholm: Stockholm University.

    Abstract

    Two lexical decision experiments in Cantonese are described in which the recognition of spoken target words as a function of phonological similarity to a preceding prime is investigated. Phonological similarity in first syllables produced inhibition, while similarity in second syllables led to facilitation. Differences between syllables in tonal and segmental structure had generally similar effects.
  • Cutler, A. (1991). Prosody in situations of communication: Salience and segmentation. In Proceedings of the Twelfth International Congress of Phonetic Sciences: Vol. 1 (pp. 264-270). Aix-en-Provence: Université de Provence, Service des publications.

    Abstract

    Speakers and listeners have a shared goal: to communicate. The processes of speech perception and of speech production interact in many ways under the constraints of this communicative goal; such interaction is as characteristic of prosodic processing as of the processing of other aspects of linguistic structure. Two of the major uses of prosodic information in situations of communication are to encode salience and segmentation, and these themes unite the contributions to the symposium introduced by the present review.
  • Cutler, A., & Butterfield, S. (1986). The perceptual integrity of initial consonant clusters. In R. Lawrence (Ed.), Speech and Hearing: Proceedings of the Institute of Acoustics (pp. 31-36). Edinburgh: Institute of Acoustics.
  • Cutler, A. (1995). Universal and Language-Specific in the Development of Speech. Biology International, (Special Issue 33).
  • Dediu, D., & Levinson, S. C. (2014). Language and speech are old: A review of the evidence and consequences for modern linguistic diversity. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 421-422). Singapore: World Scientific.
  • Dingemanse, M., Verhoef, T., & Roberts, S. G. (2014). The role of iconicity in the cultural evolution of communicative signals. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-15).
  • Dingemanse, M., Torreira, F., & Enfield, N. J. (2014). Conversational infrastructure and the convergent evolution of linguistic items. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 425-426). Singapore: World Scientific.
  • Doherty, M., & Klein, W. (Eds.). (1991). Übersetzung [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (84).
  • Dolscheid, S., Willems, R. M., Hagoort, P., & Casasanto, D. (2014). The relation of space and musical pitch in the brain. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 421-426). Austin, TX: Cognitive Science Society.

    Abstract

    Numerous experiments show that space and musical pitch are closely linked in people's minds. However, the exact nature of space-pitch associations and their neuronal underpinnings are not well understood. In an fMRI experiment we investigated different types of spatial representations that may underlie musical pitch. Participants judged stimuli that varied in spatial height in both the visual and tactile modalities, as well as auditory stimuli that varied in pitch height. In order to distinguish between unimodal and multimodal spatial bases of musical pitch, we examined whether pitch activations were present in modality-specific (visual or tactile) versus multimodal (visual and tactile) regions active during spatial height processing. Judgments of musical pitch were found to activate unimodal visual areas, suggesting that space-pitch associations may involve modality-specific spatial representations, supporting a key assumption of embodied theories of metaphorical mental representation.
  • Doumas, L. A. A., Hamer, A., Puebla, G., & Martin, A. E. (2017). A theory of the detection and learning of structured representations of similarity and relative magnitude. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 1955-1960). Austin, TX: Cognitive Science Society.

    Abstract

    Responding to similarity, difference, and relative magnitude (SDM) is ubiquitous in the animal kingdom. However, humans seem unique in the ability to represent relative magnitude (‘more’/‘less’) and similarity (‘same’/‘different’) as abstract relations that take arguments (e.g., greater-than (x,y)). While many models use structured relational representations of magnitude and similarity, little progress has been made on how these representations arise. Models that use these representations assume access to computations of similarity and magnitude a priori, either encoded as features or as output of evaluation operators. We detail a mechanism for producing invariant responses to “same”, “different”, “more”, and “less” which can be exploited to compute similarity and magnitude as an evaluation operator. Using DORA (Doumas, Hummel, & Sandhofer, 2008), these invariant responses can be used to learn structured relational representations of relative magnitude and similarity from pixel images of simple shapes.
  • Drozdova, P., Van Hout, R., & Scharenborg, O. (2014). Phoneme category retuning in a non-native language. In Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 553-557).

    Abstract

    Previous studies have demonstrated that native listeners modify their interpretation of a speech sound when a talker produces an ambiguous sound in order to quickly tune into a speaker, but there is hardly any evidence that non-native listeners employ a similar mechanism when encountering ambiguous pronunciations. So far, one study demonstrated this lexically-guided perceptual learning effect for non-natives, using phoneme categories similar in the native language of the listeners and the non-native language of the stimulus materials. The present study investigates whether phoneme category retuning is possible in a non-native language for a contrast, /l/-/r/, which is phonetically differently embedded in the native (Dutch) and non-native (English) languages involved. Listening experiments indeed showed a lexically-guided perceptual learning effect. Assuming that Dutch listeners have different phoneme categories for the native Dutch and non-native English /r/, as marked differences between the languages exist for /r/, these results, for the first time, seem to suggest that listeners are not only able to retune their native phoneme categories but also their non-native phoneme categories to include ambiguous pronunciations.
  • Edmiston, P., Perlman, M., & Lupyan, G. (2017). Creating words from iterated vocal imitation. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 331-336). Austin, TX: Cognitive Science Society.

    Abstract

    We report the results of a large-scale (N=1571) experiment to investigate whether spoken words can emerge from the process of repeated imitation. Participants played a version of the children’s game “Telephone”. The first generation was asked to imitate recognizable environmental sounds (e.g., glass breaking, water splashing); subsequent generations imitated the imitators for a total of 8 generations. We then examined whether the vocal imitations became more stable and word-like, retained a resemblance to the original sound, and became more suitable as learned category labels. The results showed (1) the imitations became progressively more word-like, (2) even after 8 generations, they could be matched above chance to the environmental sound that motivated them, and (3) imitations from later generations were more effective as learned category labels. These results show how repeated imitation can create progressively more word-like forms while retaining a semblance of iconicity.
  • Ernestus, M., Kočková-Amortová, L., & Pollak, P. (2014). The Nijmegen corpus of casual Czech. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 365-370).

    Abstract

    This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which contains more than 30 hours of high-quality recordings of casual conversations in Common Czech, among ten groups of three male and ten groups of three female friends. All speakers were native speakers of Czech, raised in Prague or in the region of Central Bohemia, and were between 19 and 26 years old. Every group of speakers consisted of one confederate, who was instructed to keep the conversations lively, and two speakers naive to the purposes of the recordings. The naive speakers were engaged in conversations for approximately 90 minutes, while the confederate joined them for approximately the last 72 minutes. The corpus was orthographically annotated by experienced transcribers and this orthographic transcription was aligned with the speech signal. In addition, the conversations were videotaped. This corpus can form the basis for all types of research on casual conversations in Czech, including phonetic research and research on how to improve automatic speech recognition. The corpus will be freely available.
  • Filippi, P. (2014). Linguistic animals: understanding language through a comparative approach. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 74-81). doi:10.1142/9789814603638_0082.

    Abstract

    With the aim of clarifying the definition of humans as “linguistic animals”, in the present paper I functionally distinguish three types of language competences: i) language as a general biological tool for communication, ii) “perceptual syntax”, iii) propositional language. Following this terminological distinction, I review pivotal findings on animals' communication systems, which constitute useful evidence for the investigation of the nature of three core components of humans' faculty of language: semantics, syntax, and theory of mind. In fact, although the capacity to process and share utterances with an open-ended structure is uniquely human, some isolated components of our linguistic competence are shared with nonhuman animals. Therefore, as I argue in the present paper, the investigation of animals' communicative competence provides crucial insights into the range of cognitive constraints underlying humans' ability of language, enabling at the same time the analysis of its phylogenetic path as well as of the selective pressures that have led to its emergence.
  • Filippi, P., Gingras, B., & Fitch, W. T. (2014). The effect of pitch enhancement on spoken language acquisition. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 437-438). doi:10.1142/9789814603638_0082.

    Abstract

    The aim of this study is to investigate the word-learning phenomenon utilizing a new model that integrates three processes: a) extracting a word out of a continuous sound sequence, b) inducing referential meanings, c) mapping a word onto its intended referent, with the possibility to extend the acquired word over a potentially infinite set of objects of the same semantic category, and over not-previously-heard utterances. Previous work has examined the role of statistical learning and/or of prosody in each of these processes separately. In order to examine the multilayered word-learning task, we integrate these two strands of investigation into a single approach. We have conducted the study on adults and included six different experimental conditions, each including specific perceptual manipulations of the signal. In condition 1, the only cue to word-meaning mapping was the co-occurrence between words and referents (“statistical cue”). This cue was present in all the conditions. In condition 2, we added infant-directed-speech (IDS) typical pitch enhancement as a marker of the target word and of the statistical cue. In condition 3 we placed IDS typical pitch enhancement on random words of the utterances, i.e. inconsistently matching the statistical cue. In conditions 4, 5 and 6 we manipulated respectively duration, a non-prosodic acoustic cue and a visual cue as markers of the target word and of the statistical cue. Systematic comparisons between learning performance in condition 1 and the other conditions revealed that the word-learning process is facilitated only when pitch prominence consistently marks the target word and the statistical cue…
  • Francisco, A. A., Jesse, A., Groen, M. A., & McQueen, J. M. (2014). Audiovisual temporal sensitivity in typical and dyslexic adult readers. In Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014) (pp. 2575-2579).

    Abstract

    Reading is an audiovisual process that requires the learning of systematic links between graphemes and phonemes. It is thus possible that reading impairments reflect an audiovisual processing deficit. In this study, we compared audiovisual processing in adults with developmental dyslexia and adults without reading difficulties. We focused on differences in cross-modal temporal sensitivity both for speech and for non-speech events. When compared to adults without reading difficulties, adults with developmental dyslexia presented a wider temporal window in which unsynchronized speech events were perceived as synchronized. No differences were found between groups for the non-speech events. These results suggest a deficit in dyslexia in the perception of cross-modal temporal synchrony for speech events.
  • Franken, M. K., Eisner, F., Schoffelen, J.-M., Acheson, D. J., Hagoort, P., & McQueen, J. M. (2017). Audiovisual recalibration of vowel categories. In Proceedings of Interspeech 2017 (pp. 655-658). doi:10.21437/Interspeech.2017-122.

    Abstract

    One of the most daunting tasks of a listener is to map a continuous auditory stream onto known speech sound categories and lexical items. A major issue with this mapping problem is the variability in the acoustic realizations of sound categories, both within and across speakers. Past research has suggested listeners may use visual information (e.g., lipreading) to calibrate these speech categories to the current speaker. Previous studies have focused on audiovisual recalibration of consonant categories. The present study explores whether vowel categorization, which is known to show less sharply defined category boundaries, also benefits from visual cues. Participants were exposed to videos of a speaker pronouncing one out of two vowels, paired with audio that was ambiguous between the two vowels. After exposure, it was found that participants had recalibrated their vowel categories. In addition, individual variability in audiovisual recalibration is discussed. It is suggested that listeners’ category sharpness may be related to the weight they assign to visual information in audiovisual speech perception. Specifically, listeners with less sharp categories assign more weight to visual information during audiovisual speech recognition.
  • Fusaroli, R., Tylén, K., Garly, K., Steensig, J., Christiansen, M. H., & Dingemanse, M. (2017). Measures and mechanisms of common ground: Backchannels, conversational repair, and interactive alignment in free and task-oriented social interactions. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 2055-2060). Austin, TX: Cognitive Science Society.

    Abstract

    A crucial aspect of everyday conversational interactions is our ability to establish and maintain common ground. Understanding the relevant mechanisms involved in such social coordination remains an important challenge for cognitive science. While common ground is often discussed in very general terms, different contexts of interaction are likely to afford different coordination mechanisms. In this paper, we investigate the presence and relation of three mechanisms of social coordination – backchannels, interactive alignment and conversational repair – across free and task-oriented conversations. We find significant differences: task-oriented conversations involve higher presence of repair – restricted offers in particular – and backchannel, as well as a reduced level of lexical and syntactic alignment. We find that restricted repair is associated with lexical alignment and open repair with backchannels. Our findings highlight the need to explicitly assess several mechanisms at once and to investigate diverse activities to understand their role and relations.
  • Ganushchak, L. Y., & Acheson, D. J. (Eds.). (2014). What's to be learned from speaking aloud? - Advances in the neurophysiological measurement of overt language production. [Research topic] [Special Issue]. Frontiers in Language Sciences. Retrieved from http://www.frontiersin.org/Language_Sciences/researchtopics/What_s_to_be_Learned_from_Spea/1671.

    Abstract

    Researchers have long avoided neurophysiological experiments of overt speech production due to the suspicion that artifacts caused by muscle activity may lead to a bad signal-to-noise ratio in the measurements. However, the need to actually produce speech may influence earlier processing and qualitatively change speech production processes and what we can infer from neurophysiological measures thereof. Recently, however, overt speech has been successfully investigated using EEG, MEG, and fMRI. The aim of this Research Topic is to draw together recent research on the neurophysiological basis of language production, with the aim of developing and extending theoretical accounts of the language production process. In this Research Topic of Frontiers in Language Sciences, we invite both experimental and review papers, as well as those about the latest methods in acquisition and analysis of overt language production data. All aspects of language production are welcome: i.e., from conceptualization to articulation during native as well as multilingual language production. Focus should be placed on using the neurophysiological data to inform questions about the processing stages of language production. In addition, emphasis should be placed on the extent to which the identified components of the electrophysiological signal (e.g., ERP/ERF, neuronal oscillations, etc.), brain areas or networks are related to language comprehension and other cognitive domains. By bringing together electrophysiological and neuroimaging evidence on language production mechanisms, a more complete picture of the locus of language production processes and their temporal and neurophysiological signatures will emerge.
  • Gebre, B. G., Wittenburg, P., Heskes, T., & Drude, S. (2014). Motion history images for online speaker/signer diarization. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (pp. 1537-1541). Piscataway, NJ: IEEE.

    Abstract

    We present a solution to the problem of online speaker/signer diarization - the task of determining "who spoke/signed when?". Our solution is based on the idea that gestural activity (hands and body movement) is highly correlated with uttering activity. This correlation is necessarily true for sign languages and mostly true for spoken languages. The novel part of our solution is the use of motion history images (MHI) as a likelihood measure for probabilistically detecting uttering activities. MHI is an efficient representation of where and how motion occurred for a fixed period of time. We conducted experiments on 4.9 hours of a publicly available dataset (the AMI meeting data) and 1.4 hours of sign language dataset (Kata Kolok data). The best performance obtained is 15.70% for sign language and 31.90% for spoken language (measurements are in DER). These results show that our solution is applicable in real-world applications like video conferences.

  • Gebre, B. G., Wittenburg, P., Drude, S., Huijbregts, M., & Heskes, T. (2014). Speaker diarization using gesture and speech. In H. Li, & P. Ching (Eds.), Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 582-586).

    Abstract

    We demonstrate how the problem of speaker diarization can be solved using both gesture and speaker parametric models. The novelty of our solution is that we approach the speaker diarization problem as a speaker recognition problem after learning speaker models from speech samples corresponding to gestures (the occurrence of gestures indicates the presence of speech and the location of gestures indicates the identity of the speaker). This new approach offers many advantages: comparable state-of-the-art performance, faster computation and more adaptability. In our implementation, parametric models are used to model speakers' voice and their gestures: more specifically, Gaussian mixture models are used to model the voice characteristics of each person and all persons, and gamma distributions are used to model gestural activity based on features extracted from Motion History Images. Tests on 4.24 hours of the AMI meeting data show that our solution makes DER score improvements of 19% on speech-only segments and 4% on all segments including silence (the comparison is with the AMI system).
  • Gebre, B. G., Crasborn, O., Wittenburg, P., Drude, S., & Heskes, T. (2014). Unsupervised feature learning for visual sign language identification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Vol 2 (pp. 370-376). Red Hook, NY: Curran Proceedings.

    Abstract

    Prior research on language identification focused primarily on text and speech. In this paper, we focus on the visual modality and present a method for identifying sign languages solely from short video samples. The method is trained on unlabelled video data (unsupervised feature learning) and using these features, it is trained to discriminate between six sign languages (supervised learning). We ran experiments on video samples involving 30 signers (running for a total of 6 hours). Using leave-one-signer-out cross-validation, our evaluation on short video samples shows an average best accuracy of 84%. Given that sign languages are under-resourced, unsupervised feature learning techniques are the right tools and our results indicate that this is realistic for sign language identification.
  • Gentzsch, W., Lecarpentier, D., & Wittenburg, P. (2014). Big data in science and the EUDAT project. In Proceedings of the 2014 Annual SRII Global Conference.
  • Guerra, E., & Knoeferle, P. (2014). Spatial distance modulates reading times for sentences about social relations: evidence from eye tracking. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2315-2320). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/403/.

    Abstract

    Recent evidence from eye tracking during reading showed that non-referential spatial distance presented in a visual context can modulate semantic interpretation of similarity relations rapidly and incrementally. In two eye-tracking reading experiments we extended these findings in two important ways; first, we examined whether other semantic domains (social relations) could also be rapidly influenced by spatial distance during sentence comprehension. Second, we aimed to further specify how abstract language is co-indexed with spatial information by varying the syntactic structure of sentences between experiments. Spatial distance rapidly modulated reading times as a function of the social relation expressed by a sentence. Moreover, our findings suggest that abstract language can be co-indexed as soon as critical information becomes available for the reader.
  • Guerra, E., Huettig, F., & Knoeferle, P. (2014). Assessing the time course of the influence of featural, distributional and spatial representations during reading. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2309-2314). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/402/.

    Abstract

    What does semantic similarity between two concepts mean? How could we measure it? The way in which semantic similarity is calculated might differ depending on the theoretical notion of semantic representation. In an eye-tracking reading experiment, we investigated whether two widely used semantic similarity measures (based on featural or distributional representations) have distinctive effects on sentence reading times. In other words, we explored whether these measures of semantic similarity differ qualitatively. In addition, we examined whether visually perceived spatial distance interacts with either or both of these measures. Our results showed that the effect of featural and distributional representations on reading times can differ both in direction and in its time course. Moreover, both featural and distributional information interacted with spatial distance, yet in different sentence regions and reading measures. We conclude that featural and distributional representations are distinct components of semantic representation.
  • Heyselaar, E., Hagoort, P., & Segaert, K. (2014). In dialogue with an avatar, syntax production is identical compared to dialogue with a human partner. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2351-2356). Austin, TX: Cognitive Science Society.

    Abstract

    The use of virtual reality (VR) as a methodological tool is becoming increasingly popular in behavioural research due to its seemingly limitless possibilities. This new method has not been used frequently in the field of psycholinguistics, however, possibly due to the assumption that human-computer interaction does not accurately reflect human-human interaction. In the current study we compare participants’ language behaviour in a syntactic priming task with human versus avatar partners. Our study shows comparable priming effects between human and avatar partners (Human: 12.3%; Avatar: 12.6% for passive sentences) suggesting that VR is a valid platform for conducting language research and studying dialogue interactions.
  • Hoffmann, C. W. G., Sadakata, M., Chen, A., Desain, P., & McQueen, J. M. (2014). Within-category variance and lexical tone discrimination in native and non-native speakers. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 45-49). Nijmegen: Radboud University Nijmegen.

    Abstract

    In this paper, we show how acoustic variance within lexical tones in disyllabic Mandarin Chinese pseudowords affects discrimination abilities in both native and non-native speakers of Mandarin Chinese. Within-category acoustic variance did not hinder native speakers in discriminating between lexical tones, whereas it precluded Dutch native speakers from reaching native level performance. Furthermore, the influence of acoustic variance was not uniform but asymmetric, dependent on the presentation order of the lexical tones to be discriminated. An exploratory analysis using an active adaptive oddball paradigm was used to quantify the extent of the perceptual asymmetry. We discuss two possible mechanisms underlying this asymmetry and propose possible paradigms to investigate these mechanisms.
  • Isbilen, E. S., McCauley, S. M., Kidd, E., & Christiansen, M. H. (2017). Testing statistical learning implicitly: A novel chunk-based measure of statistical learning. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 564-569). Austin, TX: Cognitive Science Society.

    Abstract

    Attempts to connect individual differences in statistical learning with broader aspects of cognition have received considerable attention, but have yielded mixed results. A possible explanation is that statistical learning is typically tested using the two-alternative forced choice (2AFC) task. As a meta-cognitive task relying on explicit familiarity judgments, 2AFC may not accurately capture implicitly formed statistical computations. In this paper, we adapt the classic serial-recall memory paradigm to implicitly test statistical learning in a statistically-induced chunking recall (SICR) task. We hypothesized that artificial language exposure would lead subjects to chunk recurring statistical patterns, facilitating recall of words from the input. Experiment 1 demonstrates that SICR offers more fine-grained insights into individual differences in statistical learning than 2AFC. Experiment 2 shows that SICR has higher test-retest reliability than that reported for 2AFC. Thus, SICR offers a more sensitive measure of individual differences, suggesting that basic chunking abilities may explain statistical learning.
  • Jung, D., Klessa, K., Duray, Z., Oszkó, B., Sipos, M., Szeverényi, S., Várnai, Z., Trilsbeek, P., & Váradi, T. (2014). Languagesindanger.eu - Including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 530-535).

    Abstract

    The present paper describes the development of the languagesindanger.eu interactive website as an example of including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. The website is a product of INNET (Innovative networking in infrastructure for endangered languages), a European FP7 project. Its main functions can be summarized as related to the three following areas: (1) raising students' awareness of language endangerment and arousing their interest in linguistic diversity, language maintenance and language documentation; (2) informing both students and teachers about these topics and showing how they can enlarge their knowledge further, with a special emphasis on information about language archives; (3) helping teachers include these topics in their classes. The website has been localized into five language versions with the intention to be accessible to both scientific and non-scientific communities such as (primarily) secondary school teachers and students, beginning university students of linguistics, journalists, the interested public, and also members of speech communities who speak minority languages.
  • Karadöller, D. Z., Sumer, B., & Ozyurek, A. (2017). Effects of delayed language exposure on spatial language acquisition by signing children and adults. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 2372-2376). Austin, TX: Cognitive Science Society.

    Abstract

    Deaf children born to hearing parents are exposed to language input quite late, which has long-lasting effects on language production. Previous studies with deaf individuals mostly focused on linguistic expressions of motion events, which have several event components. We do not know if similar effects emerge in simple events such as descriptions of spatial configurations of objects. Moreover, previous data mainly come from late adult signers. There is not much known about language development of late signing children soon after learning sign language. We compared simple event descriptions of late signers of Turkish Sign Language (adults, children) to age-matched native signers. Our results indicate that while late signers in both age groups are native-like in frequency of expressing a relational encoding, they lag behind native signers in using morphologically complex linguistic forms compared to other simple forms. Late signing children performed similarly to adults and thus showed no development over time.
  • Kember, H., Grohe, A.-K., Zahner, K., Braun, B., Weber, A., & Cutler, A. (2017). Similar prosodic structure perceived differently in German and English. In Proceedings of Interspeech 2017 (pp. 1388-1392).

    Abstract

    English and German have similar prosody, but their speakers realize some pitch falls (not rises) in subtly different ways. We here test for asymmetry in perception. An ABX discrimination task requiring F0 slope or duration judgements on isolated vowels revealed no cross-language difference in duration or F0 fall discrimination, but discrimination of rises (realized similarly in each language) was less accurate for English than for German listeners. This unexpected finding may reflect greater sensitivity to rising patterns by German listeners, or reduced sensitivity by English listeners as a result of extensive exposure to phrase-final rises (“uptalk”) in their language.
  • Klatter-Folmer, J., Van Hout, R., Van den Heuvel, H., Fikkert, P., Baker, A., De Jong, J., Wijnen, F., Sanders, E., & Trilsbeek, P. (2014). Vulnerability in acquisition, language impairments in Dutch: Creating a VALID data archive. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 357-364).

    Abstract

    The VALID Data Archive is an open multimedia data archive (under construction) with data from speakers suffering from language impairments. We report on a pilot project in the CLARIN-NL framework in which five data resources were curated. For all data sets concerned, written informed consent from the participants or their caretakers has been obtained. All materials were anonymized. The audio files were converted into wav (linear PCM) files and the transcriptions into CHAT or ELAN format. Research data that consisted of test, SPSS and Excel files were documented and converted into CSV files. All data sets obtained appropriate CMDI metadata files. A new CMDI metadata profile for this type of data resources was established and care was taken that ISOcat metadata categories were used to optimize interoperability. After curation all data are deposited at the Max Planck Institute for Psycholinguistics Nijmegen where persistent identifiers are linked to all resources. The content of the transcriptions in CHAT and plain text format can be searched with the TROVA search engine.
  • Klein, W. (1995). A simplest analysis of the English tense-aspect system. In W. Riehle, & H. Keiper (Eds.), Proceedings of the Anglistentag 1994 (pp. 139-151). Tübingen: Niemeyer.
  • Klein, W. (Ed.). (1995). Epoche [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (100).
  • Klein, W. (Ed.). (1985). Schriftlichkeit [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (59).
  • Klein, W. (Ed.). (1986). Sprachverfall [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (62).
  • Latrouite, A., & Van Valin Jr., R. D. (2014). Event existentials in Tagalog: A Role and Reference Grammar account. In W. Arka, & N. L. K. Mas Indrawati (Eds.), Argument realisations and related constructions in Austronesian languages: papers from 12-ICAL (pp. 161-174). Canberra: Pacific Linguistics.
  • Lee, R., Chambers, C. G., Huettig, F., & Ganea, P. A. (2017). Children’s semantic and world knowledge overrides fictional information during anticipatory linguistic processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Meeting of the Cognitive Science Society (CogSci 2017) (pp. 730-735). Austin, TX: Cognitive Science Society.

    Abstract

    Using real-time eye-movement measures, we asked how a fantastical discourse context competes with stored representations of semantic and world knowledge to influence children's and adults' moment-by-moment interpretation of a story. Seven-year-olds were less effective at bypassing stored semantic and world knowledge during real-time interpretation than adults. Nevertheless, an effect of discourse context on comprehension was still apparent.
  • Lenkiewicz, P., Drude, S., Lenkiewicz, A., Gebre, B. G., Masneri, S., Schreer, O., Schwenninger, J., & Bardeli, R. (2014). Application of audio and video processing methods for language research and documentation: The AVATecH Project. In Z. Vetulani, & J. Mariani (Eds.), 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers (pp. 288-299). Berlin: Springer.

    Abstract

    Evolution and changes of all modern languages is a well-known fact. However, recently it is reaching dynamics never seen before, which results in loss of the vast amount of information encoded in every language. In order to preserve such rich heritage, and to carry out linguistic research, properly annotated recordings of world languages are necessary. Since creating those annotations is a very laborious task, taking up to 100 times longer than the length of the annotated media, innovative video processing algorithms are needed, in order to improve the efficiency and quality of annotation process. This is the scope of the AVATecH project presented in this article.
  • Lenkiewicz, P., Shkaravska, O., Goosen, T., Windhouwer, M., Broeder, D., Roth, S., & Olsson, O. (2014). The DWAN framework: Application of a web annotation framework for the general humanities to the domain of language resources. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3644-3649).
  • Lev-Ari, S., & Peperkamp, S. (2014). Do people converge to the linguistic patterns of non-reliable speakers? Perceptual learning from non-native speakers. In S. Fuchs, M. Grice, A. Hermes, L. Lancia, & D. Mücke (Eds.), Proceedings of the 10th International Seminar on Speech Production (ISSP) (pp. 261-264).

    Abstract

    People's language is shaped by the input from the environment. The environment, however, offers a range of linguistic inputs that differ in their reliability. We test whether listeners accordingly weigh input from sources that differ in reliability differently. Using a perceptual learning paradigm, we show that listeners adjust their representations according to linguistic input provided by native but not by non-native speakers. This is despite the fact that listeners are able to learn the characteristics of the speech of both speakers. These results provide evidence for a disassociation between adaptation to the characteristics of specific speakers and adjustment of linguistic representations in general based on these learned characteristics. This study also has implications for theories of language change. In particular, it casts doubt on the hypothesis that a large proportion of non-native speakers in a community can bring about linguistic changes.
  • Levelt, W. J. M. (1991). Lexical access in speech production: Stages versus cascading. In H. Peters, W. Hulstijn, & C. Starkweather (Eds.), Speech motor control and stuttering (pp. 3-10). Amsterdam: Excerpta Medica.
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Lew, A. A., Hall-Lew, L., & Fairs, A. (2014). Language and Tourism in Sabah, Malaysia and Edinburgh, Scotland. In B. O'Rourke, N. Bermingham, & S. Brennan (Eds.), Opening New Lines of Communication in Applied Linguistics: Proceedings of the 46th Annual Meeting of the British Association for Applied Linguistics (pp. 253-259). London, UK: Scitsiugnil Press.
  • Little, H., & Silvey, C. (2014). Interpreting emerging structures: The interdependence of combinatoriality and compositionality. In Proceedings of the First Conference of the International Association for Cognitive Semiotics (IACS 2014) (pp. 113-114).
  • Little, H., Perlman, M., & Eryilmaz, K. (2017). Repeated interactions can lead to more iconic signals. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 760-765). Austin, TX: Cognitive Science Society.

    Abstract

    Previous research has shown that repeated interactions can cause iconicity in signals to reduce. However, data from several recent studies has shown the opposite trend: an increase in iconicity as the result of repeated interactions. Here, we discuss whether signals may become less or more iconic as a result of the modality used to produce them. We review several recent experimental results before presenting new data from multi-modal signals, where visual input creates audio feedback. Our results show that the growth in iconicity present in the audio information may come at a cost to iconicity in the visual information. Our results have implications for how we think about and measure iconicity in artificial signalling experiments. Further, we discuss how iconicity in real world speech may stem from auditory, kinetic or visual information, but iconicity in these different modalities may conflict.
  • Little, H. (Ed.). (2017). Special Issue on the Emergence of Sound Systems [Special Issue]. The Journal of Language Evolution, 2(1).
  • Little, H., & Eryilmaz, K. (2014). The effect of physical articulation constraints on the emergence of combinatorial structure. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-17).
  • Little, H., & De Boer, B. (2014). The effect of size of articulation space on the emergence of combinatorial structure. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th international conference (EvoLangX) (pp. 479-481). Singapore: World Scientific.
  • Liu, Z., Chen, A., & Van de Velde, H. (2014). Prosodic focus marking in Bai. In N. Campbell, D. Gibbon, & D. Hirst (Eds.), Proceedings of Speech Prosody 2014 (pp. 628-631).

    Abstract

    This study investigates prosodic marking of focus in Bai, a Sino-Tibetan language spoken in the Southwest of China, by adopting a semi-spontaneous experimental approach. Our data show that Bai speakers increase the duration of the focused constituent and reduce the duration of the post-focus constituent to encode focus. However, duration is not used in Bai to distinguish focus types differing in size and contrastivity. Further, pitch plays no role in signaling focus and differentiating focus types. The results thus suggest that Bai uses prosody to mark focus, but to a lesser extent, compared to Mandarin Chinese, with which Bai has been in close contact for centuries, and Cantonese, to which Bai is similar in the tonal system, although Bai is similar to Cantonese in its reliance on duration in prosodic focus marking.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech rate affects perception depends on who is talking. In Proceedings of Interspeech 2017 (pp. 586-590). doi:10.21437/Interspeech.2017-1517.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /ɑ/-/a:/. In Experiment 1, one low-rate group listened to 'neutral' rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the 'neutral' trials revealed that the low-rate group reported a higher proportion of /a:/ in A's 'neutral' speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one's own speech rate also contributes to effects of long-term tracking of rate. Here, talker B's speech was replaced by playback of participants' own fast or slow speech. No evidence was found that one's own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • Matic, D., & Nikolaeva, I. (2014). Focus feature percolation: Evidence from Tundra Nenets and Tundra Yukaghir. In S. Müller (Ed.), Proceedings of the 21st International Conference on Head-Driven Phrase Structure Grammar (HPSG 2014) (pp. 299-317). Stanford, CA: CSLI Publications.

    Abstract

    Two Siberian languages, Tundra Nenets and Tundra Yukaghir, do not obey strong island constraints in questioning: any sub-constituent of a relative or adverbial clause can be questioned. We argue that this has to do with how focusing works in these languages. The focused sub-constituent remains in situ, but there is abundant morphosyntactic evidence that the focus feature is passed up to the head of the clause. The result is the formation of a complex focus structure in which both the head and non-head daughter are overtly marked as focus, and they are interpreted as a pairwise list such that the focus background is applicable to this list, but not to other alternative lists.
  • Micklos, A. (2014). The nature of language in interaction. In E. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference.
  • Mizera, P., Pollak, P., Kolman, A., & Ernestus, M. (2014). Impact of irregular pronunciation on phonetic segmentation of Nijmegen corpus of Casual Czech. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Text, Speech and Dialogue: 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings (pp. 499-506). Heidelberg: Springer.

    Abstract

    This paper describes a pilot study of phonetic segmentation applied to the Nijmegen Corpus of Casual Czech (NCCCz). This corpus contains informal speech of a strongly spontaneous nature, which influences the character of the produced speech at various levels. This work is part of wider research related to the analysis of pronunciation reduction in such informal speech. We present an analysis of the accuracy of phonetic segmentation when canonical or reduced pronunciation is used. The achieved accuracy of realized phonetic segmentation provides information about the general accuracy of proper acoustic modelling, which is supposed to be applied in spontaneous speech recognition. As a byproduct of the presented spontaneous speech segmentation, this paper also describes the created lexicon with canonical pronunciations of words in NCCCz, a tool supporting pronunciation check of lexicon items, and finally also a minidatabase of selected utterances from NCCCz manually labelled on the phonetic level, suitable for evaluation purposes.
  • Monaghan, P., Brand, J., Frost, R. L. A., & Taylor, G. (2017). Multiple variable cues in the environment promote accurate and robust word learning. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 817-822). Retrieved from https://mindmodeling.org/cogsci2017/papers/0164/index.html.

    Abstract

    Learning how words refer to aspects of the environment is a complex task, but one that is supported by numerous cues within the environment which constrain the possibilities for matching words to their intended referents. In this paper we tested the predictions of a computational model of multiple cue integration for word learning, that predicted variation in the presence of cues provides an optimal learning situation. In a cross-situational learning task with adult participants, we varied the reliability of presence of distributional, prosodic, and gestural cues. We found that the best learning occurred when cues were often present, but not always. The effect of variability increased the salience of individual cues for the learner, but resulted in robust learning that was not vulnerable to individual cues’ presence or absence. Thus, variability of multiple cues in the language-learning environment provided the optimal circumstances for word learning.
  • Ortega, G., Schiefner, A., & Ozyurek, A. (2017). Speakers’ gestures predict the meaning and perception of iconicity in signs. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 889-894). Austin, TX: Cognitive Science Society.

    Abstract

    Sign languages stand out in that there is high prevalence of conventionalised linguistic forms that map directly to their referent (i.e., iconic). Hearing adults show low performance when asked to guess the meaning of iconic signs suggesting that their iconic features are largely inaccessible to them. However, it has not been investigated whether speakers’ gestures, which also share the property of iconicity, may assist non-signers in guessing the meaning of signs. Results from a pantomime generation task (Study 1) show that speakers’ gestures exhibit a high degree of systematicity, and share different degrees of form overlap with signs (full, partial, and no overlap). Study 2 shows that signs with full and partial overlap are more accurately guessed and are assigned higher iconicity ratings than signs with no overlap. Deaf and hearing adults converge in their iconic depictions for some concepts due to the shared conceptual knowledge and manual-visual modality.
  • Ortega, G., Sumer, B., & Ozyurek, A. (2014). Type of iconicity matters: Bias for action-based signs in sign language acquisition. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 1114-1119). Austin, TX: Cognitive Science Society.

    Abstract

    Early studies investigating sign language acquisition claimed that signs whose structures are motivated by the form of their referent (iconic) are not favoured in language development. However, recent work has shown that the first signs in deaf children’s lexicon are iconic. In this paper we go a step further and ask whether different types of iconicity modulate learning sign-referent links. Results from a picture description task indicate that children and adults used signs with two possible variants differentially. While children signing to adults favoured variants that map onto actions associated with a referent (action signs), adults signing to another adult produced variants that map onto objects’ perceptual features (perceptual signs). Parents interacting with children used more action variants than signers in adult-adult interactions. These results are in line with claims that language development is tightly linked to motor experience and that iconicity can be a communicative strategy in parental input.
  • Otake, T., Davis, S. M., & Cutler, A. (1995). Listeners’ representations of within-word structure: A cross-linguistic and cross-dialectal investigation. In J. Pardo (Ed.), Proceedings of EUROSPEECH 95: Vol. 3 (pp. 1703-1706). Madrid: European Speech Communication Association.

    Abstract

    Japanese, British English and American English listeners were presented with spoken words in their native language, and asked to mark on a written transcript of each word the first natural division point in the word. The results showed clear and strong patterns of consensus, indicating that listeners have available to them conscious representations of within-word structure. Orthography did not play a strongly deciding role in the results. The patterns of response were at variance with results from on-line studies of speech segmentation, suggesting that the present task taps not those representations used in on-line listening, but levels of representation which may involve much richer knowledge of word-internal structure.
  • Peeters, D., Azar, Z., & Ozyurek, A. (2014). The interplay between joint attention, physical proximity, and pointing gesture in demonstrative choice. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 1144-1149). Austin, TX: Cognitive Science Society.
  • Perlman, M., Clark, N., & Tanner, J. (2014). Iconicity and ape gesture. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 236-243). New Jersey: World Scientific.

    Abstract

    Iconic gestures are hypothesized to be crucial to the evolution of language. Yet the important question of whether apes produce iconic gestures is the subject of considerable debate. This paper presents the current state of research on iconicity in ape gesture. In particular, it describes some of the empirical evidence suggesting that apes produce three different kinds of iconic gestures; it compares the iconicity hypothesis to other major hypotheses of ape gesture; and finally, it offers some directions for future ape gesture research.
  • Perlman, M., Fusaroli, R., Fein, D., & Naigles, L. (2017). The use of iconic words in early child-parent interactions. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 913-918). Austin, TX: Cognitive Science Society.

    Abstract

    This paper examines the use of iconic words in early conversations between children and caregivers. The longitudinal data include a span of six observations of 35 children-parent dyads in the same semi-structured activity. Our findings show that children’s speech initially has a high proportion of iconic words, and over time, these words become diluted by an increase of arbitrary words. Parents’ speech is also initially high in iconic words, with a decrease in the proportion of iconic words over time – in this case driven by the use of fewer iconic words. The level and development of iconicity are related to individual differences in the children’s cognitive skills. Our findings fit with the hypothesis that iconicity facilitates early word learning and may play an important role in learning to produce new words.
  • Popov, V., Ostarek, M., & Tenison, C. (2017). Inferential Pitfalls in Decoding Neural Representations. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 961-966). Austin, TX: Cognitive Science Society.

    Abstract

    A key challenge for cognitive neuroscience is to decipher the representational schemes of the brain. A recent class of decoding algorithms for fMRI data, stimulus-feature-based encoding models, is becoming increasingly popular for inferring the dimensions of neural representational spaces from stimulus-feature spaces. We argue that such inferences are not always valid, because decoding can occur even if the neural representational space and the stimulus-feature space use different representational schemes. This can happen when there is a systematic mapping between them. In a simulation, we successfully decoded the binary representation of numbers from their decimal features. Since binary and decimal number systems use different representations, we cannot conclude that the binary representation encodes decimal features. The same argument applies to the decoding of neural patterns from stimulus-feature spaces and we urge caution in inferring the nature of the neural code from such methods. We discuss ways to overcome these inferential limitations.
  • Pouw, W., Aslanidou, A., Kamermans, K. L., & Paas, F. (2017). Is ambiguity detection in haptic imagery possible? Evidence for Enactive imaginings. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 2925-2930). Austin, TX: Cognitive Science Society.

    Abstract

    A classic discussion about visual imagery is whether it affords reinterpretation, like discovering two interpretations in the duck/rabbit illustration. Recent findings converge on reinterpretation being possible in visual imagery, suggesting functional equivalence with pictorial representations. However, it is unclear whether such reinterpretations are necessarily a visual-pictorial achievement. To assess this, 68 participants were briefly presented 2-d ambiguous figures. One figure was presented visually, the other via manual touch alone. Afterwards participants mentally rotated the memorized figures so as to discover a novel interpretation. A portion (20.6%) of the participants detected a novel interpretation in visual imagery, replicating previous research. Strikingly, 23.6% of participants were able to reinterpret figures they had only felt. That reinterpretation truly involved haptic processes was further supported, as some participants performed co-thought gestures on an imagined figure during retrieval. These results are promising for further development of an Enactivist approach to imagination.
  • Ravignani, A., Bowling, D., & Kirby, S. (2014). The psychology of biological clocks: A new framework for the evolution of rhythm. In E. A. Cartmill, S. G. Roberts, & H. Lyn (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 262-269). Singapore: World Scientific.
  • Roberts, S. G., Dediu, D., & Levinson, S. C. (2014). Detecting differences between the languages of Neandertals and modern humans. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 501-502). Singapore: World Scientific.

    Abstract

    Dediu and Levinson (2013) argue that Neandertals had essentially modern language and speech, and that they were in genetic contact with the ancestors of modern humans during our dispersal out of Africa. This raises the possibility of cultural and linguistic contact between the two human lineages. If such contact did occur, then it might have influenced the cultural evolution of the languages. Since the genetic traces of contact with Neandertals are limited to the populations outside of Africa, Dediu & Levinson predict that there may be structural differences between the present-day languages derived from languages in contact with Neandertals, and those derived from languages that were not influenced by such contact. Since the signature of such deep contact might reside in patterns of features, they suggested that machine learning methods may be able to detect these differences. This paper attempts to test this hypothesis and to estimate particular linguistic features that are potential candidates for carrying a signature of Neandertal languages.
  • Roberts, S. G., & De Vos, C. (2014). Gene-culture coevolution of a linguistic system in two modalities. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 23-27).

    Abstract

    Complex communication can take place in a range of modalities such as auditory, visual, and tactile modalities. In a very general way, the modality that individuals use is constrained by their biological biases (humans cannot use magnetic fields directly to communicate to each other). The majority of natural languages have a large audible component. However, since humans can learn sign languages just as easily, it’s not clear to what extent the prevalence of spoken languages is due to biological biases, the social environment or cultural inheritance. This paper suggests that we can explore the relative contribution of these factors by modelling the spontaneous emergence of sign languages that are shared by the deaf and hearing members of relatively isolated communities. Such shared signing communities have arisen in enclaves around the world and may provide useful insights by demonstrating how languages evolve as the deaf proportion of its members has strong biases towards the visual language modality. In this paper we describe a model of cultural evolution in two modalities, combining aspects that are thought to impact the emergence of sign languages in a more general evolutionary framework. The model can be used to explore hypotheses about how sign languages emerge.
  • Roberts, S. G., Thompson, B., & Smith, K. (2014). Social interaction influences the evolution of cognitive biases for language. In E. A. Cartmill, S. G. Roberts, & H. Lyn (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 278-285). Singapore: World Scientific. doi:10.1142/9789814603638_0036.

    Abstract

    Models of cultural evolution demonstrate that the link between individual biases and population-level phenomena can be obscured by the process of cultural transmission (Kirby, Dowman, & Griffiths, 2007). However, recent extensions to these models predict that linguistic diversity will not emerge and that learners should evolve to expect little linguistic variation in their input (Smith & Thompson, 2012). We demonstrate that this result derives from assumptions that privilege certain kinds of social interaction by exploring a range of alternative social models. We find several evolutionary routes to linguistic diversity, and show that social interaction not only influences the kinds of biases which could evolve to support language, but also the effects those biases have on a linguistic system. Given the same starting situation, the evolution of biases for language learning and the distribution of linguistic variation are affected by the kinds of social interaction that a population privileges.
  • Schmidt, J., Janse, E., & Scharenborg, O. (2014). Age, hearing loss and the perception of affective utterances in conversational speech. In Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 1929-1933).

    Abstract

    This study investigates whether age and/or hearing loss influence the perception of the emotion dimensions arousal (calm vs. aroused) and valence (positive vs. negative attitude) in conversational speech fragments. Specifically, this study focuses on the relationship between participants' ratings of affective speech and acoustic parameters known to be associated with arousal and valence (mean F0, intensity, and articulation rate). Ten normal-hearing younger and ten older adults with varying hearing loss were tested on two rating tasks. Stimuli consisted of short sentences taken from a corpus of conversational affective speech. In both rating tasks, participants estimated the value of the emotion dimension at hand using a 5-point scale. For arousal, higher intensity was generally associated with higher arousal in both age groups. Compared to younger participants, older participants rated the utterances as less aroused, and showed a smaller effect of intensity on their arousal ratings. For valence, higher mean F0 was associated with more negative ratings in both age groups. Generally, age group differences in rating affective utterances may not relate to age group differences in hearing loss, but rather to other differences between the age groups, as older participants' rating patterns were not associated with their individual hearing loss.
  • Schuller, B., Steidl, S., Batliner, A., Bergelson, E., Krajewski, J., Janott, C., Amatuni, A., Casillas, M., Seidl, A., Soderstrom, M., Warlaumont, A. S., Hidalgo, G., Schnieder, S., Heiser, C., Hohenhorst, W., Herzog, M., Schmitt, M., Qian, K., Zhang, Y., Trigeorgis, G., Tzirakis, P., & Zafeiriou, S. (2017). The INTERSPEECH 2017 computational paralinguistics challenge: Addressee, cold & snoring. In Proceedings of Interspeech 2017 (pp. 3442-3446). doi:10.21437/Interspeech.2017-43.

    Abstract

    The INTERSPEECH 2017 Computational Paralinguistics Challenge addresses three different problems for the first time in research competition under well-defined conditions: In the Addressee sub-challenge, it has to be determined whether speech produced by an adult is directed towards another adult or towards a child; in the Cold sub-challenge, speech under cold has to be told apart from ‘healthy’ speech; and in the Snoring sub-challenge, four different types of snoring have to be classified. In this paper, we describe these sub-challenges, their conditions, and the baseline feature extraction and classifiers, which include data-learnt feature representations by end-to-end learning with convolutional and recurrent neural networks, and bag-of-audiowords for the first time in the challenge series.
  • Sekine, K. (2017). Gestural hesitation reveals children’s competence on multimodal communication: Emergence of disguised adaptor. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 3113-3118). Austin, TX: Cognitive Science Society.

    Abstract

    Speakers sometimes modify their gestures during the process of production into adaptors such as hair touching or eye scratching. Such disguised adaptors are evidence that the speaker can monitor their gestures. In this study, we investigated when and how disguised adaptors are first produced by children. Sixty elementary school children participated in this study (ten children in each age group; from 7 to 12 years old). They were instructed to watch a cartoon and retell it to their parents. The results showed that children did not produce disguised adaptors until the age of 8. The disguised adaptors accompany fluent speech until the children are 10 years old and accompany dysfluent speech until they reach 11 or 12 years of age. These results suggest that children start to monitor their gestures when they are 9 or 10 years old. Cognitive changes were considered as factors to influence emergence of disguised adaptors.
  • Senft, G. (1991). Bakavilisi Biga - we can 'turn' the language - or: What happens to English words in Kilivila language? In W. Bahner, J. Schildt, & D. Viehwegger (Eds.), Proceedings of the XIVth International Congress of Linguists (pp. 1743-1746). Berlin: Akademie Verlag.
  • Seuren, P. A. M. (1991). Notes on noun phrases and quantification. In Proceedings of the International Conference on Current Issues in Computational Linguistics (pp. 19-44). Penang, Malaysia: Universiti Sains Malaysia.
  • Seuren, P. A. M. (1985). Predicate raising and semantic transparency in Mauritian Creole. In N. Boretzky, W. Enninger, & T. Stolz (Eds.), Akten des 2. Essener Kolloquiums über "Kreolsprachen und Sprachkontakte", 29-30 Nov. 1985 (pp. 203-229). Bochum: Brockmeyer.
  • Seuren, P. A. M. (2014). Scope and external datives. In B. Cornillie, C. Hamans, & D. Jaspers (Eds.), Proceedings of a mini-symposium on Pieter Seuren's 80th birthday organised at the 47th Annual Meeting of the Societas Linguistica Europaea.

    Abstract

    In this study it is argued that scope, as a property of scope‐creating operators, is a real and important element in the semantico‐grammatical description of languages. The notion of scope is illustrated and, as far as possible, defined. A first idea is given of the ‘grammar of scope’, which defines the relation between scope in the logically structured semantic analysis (SA) of sentences on the one hand and surface structure on the other. Evidence is adduced showing that peripheral preposition phrases (PPPs) in the surface structure of sentences represent scope‐creating operators in SA, and that external datives fall into this category: they are scope‐creating PPPs. It follows that, in English and Dutch, the internal dative (I gave John a book) and the external dative (I gave a book to John) are not simple syntactic variants expressing the same meaning. Instead, internal datives are an integral part of the argument structure of the matrix predicate, whereas external datives represent scope‐creating operators in SA. In the Romance languages, the (non‐pronominal) external dative has been re‐analysed as an argument type dative, but this has not happened in English and Dutch, which have many verbs that only allow for an external dative (e.g. donate, reveal). When both datives are allowed, there are systematic semantic differences, including scope differences.
  • Seuren, P. A. M. (1991). What makes a text untranslatable? In H. M. N. Noor Ein, & H. S. Atiah (Eds.), Pragmatik Penterjemahan: Prinsip, Amalan dan Penilaian Menuju ke Abad 21 ("The Pragmatics of Translation: Principles, Practice and Evaluation Moving towards the 21st Century") (pp. 19-27). Kuala Lumpur: Dewan Bahasa dan Pustaka.
  • Shkaravska, O., Van Eekelen, M., & Tamalet, A. (2014). Collected size semantics for strict functional programs over general polymorphic lists. In U. Dal Lago, & R. Pena (Eds.), Foundational and Practical Aspects of Resource Analysis: Third International Workshop, FOPARA 2013, Bertinoro, Italy, August 29-31, 2013, Revised Selected Papers (pp. 143-159). Berlin: Springer.

    Abstract

    Size analysis can be an important part of heap consumption analysis. This paper is a part of ongoing work about typing support for checking output-on-input size dependencies for function definitions in a strict functional language. A significant restriction for our earlier results is that inner data structures (e.g. in a list of lists) all must have the same size. Here, we make a big step forwards by overcoming this limitation via the introduction of higher-order size annotations such that variate sizes of inner data structures can be expressed. In this way the analysis becomes applicable for general, polymorphic nested lists.