Publications

Displaying 101 - 168 of 168
  • Yu, J., Mailhammer, R., & Cutler, A. (2020). Vocabulary structure affects word recognition: Evidence from German listeners. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 474-478). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-97.

    Abstract

    Lexical stress is realised similarly in English, German, and
    Dutch. On a suprasegmental level, stressed syllables tend to be
    longer and more acoustically salient than unstressed syllables;
    segmentally, vowels in unstressed syllables are often reduced.
    The frequency of unreduced unstressed syllables (where only
    the suprasegmental cues indicate lack of stress) however,
    differs across the languages. The present studies test whether
    listener behaviour is affected by these vocabulary differences,
    by investigating German listeners’ use of suprasegmental cues
    to lexical stress in German and English word recognition. In a
    forced-choice identification task, German listeners correctly
    assigned single-syllable fragments (e.g., Kon-) to one of two
    words differing in stress (KONto, konZEPT). Thus, German
    listeners can exploit suprasegmental information for
    identifying words. German listeners also performed above
    chance in a similar task in English (with, e.g., DIver, diVERT),
    i.e., their sensitivity to these cues also transferred to a nonnative
    language. An English listener group, in contrast, failed
    in the English fragment task. These findings mirror vocabulary
    patterns: German has more words with unreduced unstressed
    syllables than English does.
  • Majid, A., Enfield, N. J., & Van Staden, M. (Eds.). (2006). Parts of the body: Cross-linguistic categorisation [Special Issue]. Language Sciences, 28(2-3).
  • Majid, A., & Bowerman, M. (Eds.). (2007). Cutting and breaking events: A crosslinguistic perspective [Special Issue]. Cognitive Linguistics, 18(2).

    Abstract

    This special issue of Cognitive Linguistics explores the linguistic encoding of events of cutting and breaking. In this article we first introduce the project on which it is based by motivating the selection of this conceptual domain, presenting the methods of data collection used by all the investigators, and characterizing the language sample. We then present a new approach to examining crosslinguistic similarities and differences in semantic categorization. Applying statistical modeling to the descriptions of cutting and breaking events elicited from speakers of all the languages, we show that although there is crosslinguistic variation in the number of distinctions made and in the placement of category boundaries, these differences take place within a strongly constrained semantic space: across languages, there is a surprising degree of consensus on the partitioning of events in this domain. In closing, we compare our statistical approach with more conventional semantic analyses, and show how an extensional semantic typological approach like the one illustrated here can help illuminate the intensional distinctions made by languages.
  • Malaisé, V., Gazendam, L., & Brugman, H. (2007). Disambiguating automatic semantic annotation based on a thesaurus structure. In Proceedings of TALN 2007.
  • Malaisé, V., Aroyo, L., Brugman, H., Gazendam, L., De Jong, A., Negru, C., & Schreiber, G. (2006). Evaluating a thesaurus browser for an audio-visual archive. In S. Staab, & V. Svatek (Eds.), Managing knowledge in a world of networks (pp. 272-286). Berlin: Springer.
  • McQueen, J. M., & Cutler, A. (1998). Spotting (different kinds of) words in (different kinds of) context. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2791-2794). Sydney: ICSLP.

    Abstract

    The results of a word-spotting experiment are presented in which Dutch listeners tried to spot different types of bisyllabic Dutch words embedded in different types of nonsense contexts. Embedded verbs were not reliably harder to spot than embedded nouns; this suggests that nouns and verbs are recognised via the same basic processes. Iambic words were no harder to spot than trochaic words, suggesting that trochaic words are not in principle easier to recognise than iambic words. Words were harder to spot in consonantal contexts (i.e., contexts which themselves could not be words) than in longer contexts which contained at least one vowel (i.e., contexts which, though not words, were possible words of Dutch). A control experiment showed that this difference was not due to acoustic differences between the words in each context. The results support the claim that spoken-word recognition is sensitive to the viability of sound sequences as possible words.
  • McQueen, J. M., & Cutler, A. (1992). Words within words: Lexical statistics and lexical access. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing: Vol. 1 (pp. 221-224). Alberta: University of Alberta.

    Abstract

    This paper presents lexical statistics on the pattern of occurrence of words embedded in other words. We report the results of an analysis of 25000 words, varying in length from two to six syllables, extracted from a phonetically-coded English dictionary (The Longman Dictionary of Contemporary English). Each syllable, and each string of syllables within each word was checked against the dictionary. Two analyses are presented: the first used a complete list of polysyllables, with look-up on the entire dictionary; the second used a sublist of content words, counting only embedded words which were themselves content words. The results have important implications for models of human speech recognition. The efficiency of these models depends, in different ways, on the number and location of words within words.
  • Melinger, A., Schulte im Walde, S., & Weber, A. (2006). Characterizing response types and revealing noun ambiguity in German association norms. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics. Trento: Association for Computational Linguistics.

    Abstract

    This paper presents an analysis of semantic association norms for German nouns. In contrast to prior studies, we not only collected associations elicited by written representations of target objects but also by their pictorial representations. In a first analysis, we identified systematic differences in the type and distribution of associate responses for the two presentation forms. In a second analysis, we applied a soft cluster analysis to the collected target-response pairs. We subsequently used the clustering to predict noun ambiguity and to discriminate senses in our target nouns.
  • Mengede, J., Devanna, P., Hörpel, S. G., Firzla, U., & Vernes, S. C. (2020). Studying the genetic bases of vocal learning in bats. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 280-282). Nijmegen: The Evolution of Language Conferences.
  • Meyer, A. S., & Wheeldon, L. (Eds.). (2006). Language production across the life span [Special Issue]. Language and Cognitive Processes, 21(1-3).
  • Mitterer, H. (2007). Top-down effects on compensation for coarticulation are not replicable. In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1601-1604). Adelaide: Causal Productions.

    Abstract

    Listeners use lexical knowledge to judge what speech sounds they heard. I investigated whether such lexical influences are truly top-down or just reflect a merging of perceptual and lexical constraints. This is achieved by testing whether the lexically determined identity of a phone exerts the appropriate context effects on surrounding phones. The current investigations focuses on compensation for coarticulation in vowel-fricative sequences, where the presence of a rounded vowel (/y/ rather than /i/) leads fricatives to be perceived as /s/ rather than //. This results was consistently found in all three experiments. A vowel was also more likely to be perceived as rounded /y/ if that lead listeners to be perceive words rather than nonwords (Dutch: meny, English id. vs. meni nonword). This lexical influence on the perception of the vowel had, however, no consistent influence on the perception of following fricative.
  • Mitterer, H., & McQueen, J. M. (2007). Tracking perception of pronunciation variation by tracking looks to printed words: The case of word-final /t/. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1929-1932). Dudweiler: Pirrot.

    Abstract

    We investigated perception of words with reduced word-final /t/ using an adapted eyetracking paradigm. Dutch listeners followed spoken instructions to click on printed words which were accompanied on a computer screen by simple shapes (e.g., a circle). Targets were either above or next to their shapes, and the shapes uniquely identified the targets when the spoken forms were ambiguous between words with or without final /t/ (e.g., bult, bump, vs. bul, diploma). Analysis of listeners’ eye-movements revealed, in contrast to earlier results, that listeners use the following segmental context when compensating for /t/-reduction. Reflecting that /t/-reduction is more likely to occur before bilabials, listeners were more likely to look at the /t/-final words if the next word’s first segment was bilabial. This result supports models of speech perception in which prelexical phonological processes use segmental context to modulate word recognition.
  • Mitterer, H. (2007). Behavior reflects the (degree of) reality of phonological features in the brain as well. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 127-130). Dudweiler: Pirrot.

    Abstract

    To assess the reality of phonological features in language processing (vs. language description), one needs to specify the distinctive claims of distinctive-feature theory. Two of the more farreaching claims are compositionality and generalizability. I will argue that there is some evidence for the first and evidence against the second claim from a recent behavioral paradigm. Highlighting the contribution of a behavioral paradigm also counterpoints the use of brain measures as the only way to elucidate what is "real for the brain". The contributions of the speakers exemplify how brain measures can help us to understand the reality of phonological features in language processing. The evidence is, however, not convincing for a) the claim for underspecification of phonological features—which has to deal with counterevidence from behavioral as well as brain measures—, and b) the claim of position independence of phonological features.
  • Mudd, K., Lutzenberger, H., De Vos, C., Fikkert, P., Crasborn, O., & De Boer, B. (2020). How does social structure shape language variation? A case study of the Kata Kolok lexicon. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 302-304). Nijmegen: The Evolution of Language Conferences.
  • Narasimhan, B., Eisenbeiss, S., & Brown, P. (Eds.). (2007). The linguistic encoding of multiple-participant events [Special Issue]. Linguistics, 45(3).

    Abstract

    This issue investigates the linguistic encoding of events with three or more participants from the perspectives of language typology and acquisition. Such “multiple-participant events” include (but are not limited to) any scenario involving at least three participants, typically encoded using transactional verbs like 'give' and 'show', placement verbs like 'put', and benefactive and applicative constructions like 'do (something for someone)', among others. There is considerable crosslinguistic and withinlanguage variation in how the participants (the Agent, Causer, Theme, Goal, Recipient, or Experiencer) and the subevents involved in multipleparticipant situations are encoded, both at the lexical and the constructional levels
  • Norris, D., Van Ooijen, B., & Cutler, A. (1992). Speeded detection of vowels and steady-state consonants. In J. Ohala, T. Neary, & B. Derwing (Eds.), Proceedings of the Second International Conference on Spoken Language Processing; Vol. 2 (pp. 1055-1058). Alberta: University of Alberta.

    Abstract

    We report two experiments in which vowels and steady-state consonants served as targets in a speeded detection task. In the first experiment, two vowels were compared with one voiced and once unvoiced fricative. Response times (RTs) to the vowels were longer than to the fricatives. The error rate was higher for the consonants. Consonants in word-final position produced the shortest RTs, For the vowels, RT correlated negatively with target duration. In the second experiment, the same two vowel targets were compared with two nasals. This time there was no significant difference in RTs, but the error rate was still significantly higher for the consonants. Error rate and length correlated negatively for the vowels only. We conclude that RT differences between phonemes are independent of vocalic or consonantal status. Instead, we argue that the process of phoneme detection reflects more finely grained differences in acoustic/articulatory structure within the phonemic repertoire.
  • Offenga, F., Broeder, D., Wittenburg, P., Ducret, J., & Romary, L. (2006). Metadata profile in the ISO data category registry. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1866-1869).
  • Omar, R., Henley, S. M., Hailstone, J. C., Sauter, D., Scott, S. K., Fox, N. C., Rossor, M. N., & Warren, J. D. (2007). Recognition of emotions in faces, voices and music in frontotemporal lobar regeneration [Abstract]. Journal of Neurology, Neurosurgery & Psychiatry, 78(9), 1014.

    Abstract

    Frontotemporal lobar degeneration (FTLD) is a group of neurodegenerative conditions characterised by focal frontal and/or temporal lobe atrophy. Patients develop a range of cognitive and behavioural abnormalities, including prominent difficulties in comprehending and expressing emotions, with significant clinical and social consequences. Here we report a systematic prospective analysis of emotion processing in different input modalities in patients with FTLD. We examined recognition of happiness, sadness, fear and anger in facial expressions, non-verbal vocalisations and music in patients with FTLD and in healthy age matched controls. The FTLD group was significantly impaired in all modalities compared with controls, and this effect was most marked for music. Analysing each emotion separately, recognition of negative emotions was impaired in all three modalities in FTLD, and this effect was most marked for fear and anger. Recognition of happiness was deficient only with music. Our findings support the idea that FTLD causes impaired recognition of emotions across input channels, consistent with a common central representation of emotion concepts. Music may be a sensitive probe of emotional deficits in FTLD, perhaps because it requires a more abstract representation of emotion than do animate stimuli such as faces and voices.
  • Ozyurek, A. (1998). An analysis of the basic meaning of Turkish demonstratives in face-to-face conversational interaction. In S. Santi, I. Guaitella, C. Cave, & G. Konopczynski (Eds.), Oralite et gestualite: Communication multimodale, interaction: actes du colloque ORAGE 98 (pp. 609-614). Paris: L'Harmattan.
  • Ozyurek, A. (2020). From hands to brains: How does human body talk, think and interact in face-to-face language use? In K. Truong, D. Heylen, & M. Czerwinski (Eds.), ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 1-2). New York, NY, USA: Association for Computing Machinery. doi:10.1145/3382507.3419442.
  • Papafragou, A., & Ozturk, O. (2007). Children's acquisition of modality. In Proceedings of the 2nd Conference on Generative Approaches to Language Acquisition North America (GALANA 2) (pp. 320-327). Somerville, Mass.: Cascadilla Press.
  • Papafragou, A. (2007). On the acquisition of modality. In T. Scheffler, & L. Mayol (Eds.), Penn Working Papers in Linguistics. Proceedings of the 30th Annual Penn Linguistics Colloquium (pp. 281-293). Department of Linguistics, University of Pennsylvania.
  • Papafragou, A., & Ozturk, O. (2006). The acquisition of epistemic modality. In A. Botinis (Ed.), Proceedings of ITRW on Experimental Linguistics in ExLing-2006 (pp. 201-204). ISCA Archive.

    Abstract

    In this paper we try to contribute to the body of knowledge about the acquisition of English epistemic modal verbs (e.g. Mary may/has to be at school). Semantically, these verbs encode possibility or necessity with respect to available evidence. Pragmatically, the use of epistemic modals often gives rise to scalar conversational inferences (Mary may be at school -> Mary doesn’t have to be at school). The acquisition of epistemic modals is challenging for children on both these levels. In this paper, we present findings from two studies which were conducted with 5-year-old children and adults. Our findings, unlike previous work, show that 5-yr-olds have mastered epistemic modal semantics, including the notions of necessity and possibility. However, they are still in the process of acquiring epistemic modal pragmatics.
  • Paplu, S. H., Mishra, C., & Berns, K. (2020). Pseudo-randomization in automating robot behaviour during human-robot interaction. In 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp. 1-6). Institute of Electrical and Electronics Engineers. doi:10.1109/ICDL-EpiRob48136.2020.9278115.

    Abstract

    Automating robot behavior in a specific situation is an active area of research. There are several approaches available in the literature of robotics to cater for the automatic behavior of a robot. However, when it comes to humanoids or human-robot interaction in general, the area has been less explored. In this paper, a pseudo-randomization approach has been introduced to automatize the gestures and facial expressions of an interactive humanoid robot called ROBIN based on its mental state. A significant number of gestures and facial expressions have been implemented to allow the robot more options to perform a relevant action or reaction based on visual stimuli. There is a display of noticeable differences in the behaviour of the robot for the same stimuli perceived from an interaction partner. This slight autonomous behavioural change in the robot clearly shows a notion of automation in behaviour. The results from experimental scenarios and human-centered evaluation of the system help validate the approach.

    Files private

    Request files
  • Pereiro Estevan, Y., Wan, V., Scharenborg, O., & Gallardo Antolín, A. (2006). Segmentación de fonemas no supervisada basada en métodos kernel de máximo margen. In Proceedings of IV Jornadas en Tecnología del Habla.

    Abstract

    En este artículo se desarrolla un método automático de segmentación de fonemas no supervisado. Este método utiliza el algoritmo de agrupación de máximo margen [1] para realizar segmentación de fonemas sobre habla continua sin necesidad de información a priori para el entrenamiento del sistema.
  • Pluymaekers, M., Ernestus, M., Baayen, R. H., & Booij, G. (2006). The role of morphology in fine phonetic detail: The case of Dutch -igheid. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 53-54).
  • Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2006). Effects of word frequency on the acoustic durations of affixes. In Proceedings of Interspeech 2006 (pp. 953-956). Pittsburgh: ICSLP.

    Abstract

    This study investigates whether the acoustic durations of derivational affixes in Dutch are affected by the frequency of the word they occur in. In a word naming experiment, subjects were presented with a large number of words containing one of the affixes ge-, ver-, ont, or -lijk. Their responses were recorded on DAT tapes, and the durations of the affixes were measured using Automatic Speech Recognition technology. To investigate whether frequency also affected durations when speech rate was high, the presentation rate of the stimuli was varied. The results show that a higher frequency of the word as a whole led to shorter acoustic realizations for all affixes. Furthermore, affixes became shorter as the presentation rate of the stimuli increased. There was no interaction between word frequency and presentation rate, suggesting that the frequency effect also applies in situations in which the speed of articulation is very high.
  • Poletiek, F. H., & Chater, N. (2006). Grammar induction profits from representative stimulus sampling. In R. Sun (Ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006) (pp. 1968-1973). Austin, TX, USA: Cognitive Science Society.
  • Rapold, C. J. (2007). From demonstratives to verb agreement in Benchnon: A diachronic perspective. In A. Amha, M. Mous, & G. Savà (Eds.), Omotic and Cushitic studies: Papers from the Fourth Cushitic Omotic Conference, Leiden, 10-12 April 2003 (pp. 69-88). Cologne: Rüdiger Köppe.
  • Rasenberg, M., Dingemanse, M., & Ozyurek, A. (2020). Lexical and gestural alignment in interaction and the emergence of novel shared symbols. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 356-358). Nijmegen: The Evolution of Language Conferences.
  • Raviv, L., Meyer, A. S., & Lev-Ari, S. (2020). Network structure and the cultural evolution of linguistic structure: A group communication experiment. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 359-361). Nijmegen: The Evolution of Language Conferences.
  • de Reus, K., Carlson, D., Jadoul, Y., Lowry, A., Gross, S., Garcia, M., Salazar-Casals, A., Rubio-García, A., Haas, C. E., De Boer, B., & Ravignani, A. (2020). Relationships between vocal ontogeny and vocal tract anatomy in harbour seals (Phoca vitulina). In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 63-66). Nijmegen: The Evolution of Language Conferences.
  • Ringersma, J., & Kemps-Snijders, M. (2007). Creating multimedia dictionaries of endangered languages using LEXUS. In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 65-68). Baixas, France: ISCA-Int.Speech Communication Assoc.

    Abstract

    This paper reports on the development of a flexible web based lexicon tool, LEXUS. LEXUS is targeted at linguists involved in language documentation (of endangered languages). It allows the creation of lexica within the structure of the proposed ISO LMF standard and uses the proposed concept naming conventions from the ISO data categories, thus enabling interoperability, search and merging. LEXUS also offers the possibility to visualize language, since it provides functionalities to include audio, video and still images to the lexicon. With LEXUS it is possible to create semantic network knowledge bases, using typed relations. The LEXUS tool is free for use. Index Terms: lexicon, web based application, endangered languages, language documentation.
  • De Ruiter, J. P. (2007). Some multimodal signals in humans. In I. Van de Sluis, M. Theune, E. Reiter, & E. Krahmer (Eds.), Proceedings of the Workshop on Multimodal Output Generation (MOG 2007) (pp. 141-148).

    Abstract

    In this paper, I will give an overview of some well-studied multimodal signals that humans produce while they communicate with other humans, and discuss the implications of those studies for HCI. I will first discuss a conceptual framework that allows us to distinguish between functional and sensory modalities. This distinction is important, as there are multiple functional modalities using the same sensory modality (e.g., facial expression and eye-gaze in the visual modality). A second theoretically important issue is redundancy. Some signals appear to be redundant with a signal in another modality, whereas others give new information or even appear to give conflicting information (see e.g., the work of Susan Goldin-Meadows on speech accompanying gestures). I will argue that multimodal signals are never truly redundant. First, many gestures that appear at first sight to express the same meaning as the accompanying speech generally provide extra (analog) information about manner, path, etc. Second, the simple fact that the same information is expressed in more than one modality is itself a communicative signal. Armed with this conceptual background, I will then proceed to give an overview of some multimodalsignals that have been investigated in human-human research, and the level of understanding we have of the meaning of those signals. The latter issue is especially important for potential implementations of these signals in artificial agents. First, I will discuss pointing gestures. I will address the issue of the timing of pointing gestures relative to the speech it is supposed to support, the mutual dependency between pointing gestures and speech, and discuss the existence of alternative ways of pointing from other cultures. The most frequent form of pointing that does not involve the index finger is a cultural practice called lip-pointing which employs two visual functional modalities, mouth-shape and eye-gaze, simultaneously for pointing. Next, I will address the issue of eye-gaze. A classical study by Kendon (1967) claims that there is a systematic relationship between eye-gaze (at the interlocutor) and turn-taking states. Research at our institute has shown that this relationship is weaker than has often been assumed. If the dialogue setting contains a visible object that is relevant to the dialogue (e.g., a map), the rate of eye-gaze-at-other drops dramatically and its relationship to turn taking disappears completely. The implications for machine generated eye-gaze are discussed. Finally, I will explore a theoretical debate regarding spontaneous gestures. It has often been claimed that the class of gestures that is called iconic by McNeill (1992) are a “window into the mind”. That is, they are claimed to give the researcher (or even the interlocutor) a direct view into the speaker’s thought, without being obscured by the complex transformation that take place when transforming a thought into a verbal utterance. I will argue that this is an illusion. Gestures can be shown to be specifically designed such that the listener can be expected to interpret them. Although the transformations carried out to express a thought in gesture are indeed (partly) different from the corresponding transformations for speech, they are a) complex, and b) severely understudied. This obviously has consequences both for the gesture research agenda, and for the generation of iconic gestures by machines.
  • De Ruiter, J. P., & Enfield, N. J. (2007). The BIC model: A blueprint for the communicator. In C. Stephanidis (Ed.), Universal access in Human-Computer Interaction: Applications and services (pp. 251-258). Berlin: Springer.
  • Scharenborg, O., Ernestus, M., & Wan, V. (2007). Segmentation of speech: Child's play? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1953-1956). Adelaide: Causal Productions.

    Abstract

    The difficulty of the task of segmenting a speech signal into its words is immediately clear when listening to a foreign language; it is much harder to segment the signal into its words, since the words of the language are unknown. Infants are faced with the same task when learning their first language. This study provides a better understanding of the task that infants face while learning their native language. We employed an automatic algorithm on the task of speech segmentation without prior knowledge of the labels of the phonemes. An analysis of the boundaries erroneously placed inside a phoneme showed that the algorithm consistently placed additional boundaries in phonemes in which acoustic changes occur. These acoustic changes may be as great as the transition from the closure to the burst of a plosive or as subtle as the formant transitions in low or back vowels. Moreover, we found that glottal vibration may attenuate the relevance of acoustic changes within obstruents. An interesting question for further research is how infants learn to overcome the natural tendency to segment these ‘dynamic’ phonemes.
  • Scharenborg, O., & Wan, V. (2007). Can unquantised articulatory feature continuums be modelled? In INTERSPEECH 2007 - 8th Annual Conference of the International Speech Communication Association (pp. 2473-2476). ISCA Archive.

    Abstract

    Articulatory feature (AF) modelling of speech has received a considerable amount of attention in automatic speech recognition research. Although termed ‘articulatory’, previous definitions make certain assumptions that are invalid, for instance, that articulators ‘hop’ from one fixed position to the next. In this paper, we studied two methods, based on support vector classification (SVC) and regression (SVR), in which the articulation continuum is modelled without being restricted to using discrete AF value classes. A comparison with a baseline system trained on quantised values of the articulation continuum showed that both SVC and SVR outperform the baseline for two of the three investigated AFs, with improvements up to 5.6% absolute.
  • Scharenborg, O., Wan, V., & Moore, R. K. (2006). Capturing fine-phonetic variation in speech through automatic classification of articulatory features. In Speech Recognition and Intrinsic Variation Workshop [SRIV2006] (pp. 77-82). ISCA Archive.

    Abstract

    The ultimate goal of our research is to develop a computational model of human speech recognition that is able to capture the effects of fine-grained acoustic variation on speech recognition behaviour. As part of this work we are investigating automatic feature classifiers that are able to create reliable and accurate transcriptions of the articulatory behaviour encoded in the acoustic speech signal. In the experiments reported here, we compared support vector machines (SVMs) with multilayer perceptrons (MLPs). MLPs have been widely (and rather successfully) used for the task of multi-value articulatory feature classification, while (to the best of our knowledge) SVMs have not. This paper compares the performances of the two classifiers and analyses the results in order to better understand the articulatory representations. It was found that the MLPs outperformed the SVMs, but it is concluded that both classifiers exhibit similar behaviour in terms of patterns of errors.
  • Scheu, O., & Zinn, C. (2007). How did the e-learning session go? The student inspector. In Proceedings of the 13th International Conference on Artificial Intelligence and Education (AIED 2007). Amsterdam: IOS Press.

    Abstract

    Good teachers know their students, and exploit this knowledge to adapt or optimise their instruction. Traditional teachers know their students because they interact with them face-to-face in classroom or one-to-one tutoring sessions. In these settings, they can build student models, i.e., by exploiting the multi-faceted nature of human-human communication. In distance-learning contexts, teacher and student have to cope with the lack of such direct interaction, and this must have detrimental effects for both teacher and student. In a past study we have analysed teacher requirements for tracking student actions in computer-mediated settings. Given the results of this study, we have devised and implemented a tool that allows teachers to keep track of their learners'interaction in e-learning systems. We present the tool's functionality and user interfaces, and an evaluation of its usability.
  • Schulte im Walde, S., Melinger, A., Roth, M., & Weber, A. (2007). An empirical characterization of response types in German association norms. In Proceedings of the GLDV workshop on lexical-semantic and ontological resources.
  • Scott, S., & Sauter, D. (2006). Non-verbal expressions of emotion - acoustics, valence, and cross cultural factors. In Third International Conference on Speech Prosody 2006. ISCA.

    Abstract

    This presentation will address aspects of the expression of emotion in non-verbal vocal behaviour, specifically attempting to determine the roles of both positive and negative emotions, their acoustic bases, and the extent to which these are recognized in non-Western cultures.
  • Scott, D. R., & Cutler, A. (1982). Segmental cues to syntactic structure. In Proceedings of the Institute of Acoustics 'Spectral Analysis and its Use in Underwater Acoustics' (pp. E3.1-E3.4). London: Institute of Acoustics.
  • Seidlmayer, E., Voß, J., Melnychuk, T., Galke, L., Tochtermann, K., Schultz, C., & Förstner, K. U. (2020). ORCID for Wikidata. Data enrichment for scientometric applications. In L.-A. Kaffee, O. Tifrea-Marciuska, E. Simperl, & D. Vrandečić (Eds.), Proceedings of the 1st Wikidata Workshop (Wikidata 2020). Aachen, Germany: CEUR Workshop Proceedings.

    Abstract

    Due to its numerous bibliometric entries of scholarly articles and connected information Wikidata can serve as an open and rich
    source for deep scientometrical analyses. However, there are currently certain limitations: While 31.5% of all Wikidata entries represent scientific articles, only 8.9% are entries describing a person and the number
    of entries researcher is accordingly even lower. Another issue is the frequent absence of established relations between the scholarly article item and the author item although the author is already listed in Wikidata.
    To fill this gap and to improve the content of Wikidata in general, we established a workflow for matching authors and scholarly publications by integrating data from the ORCID (Open Researcher and Contributor ID) database. By this approach we were able to extend Wikidata by more than 12k author-publication relations and the method can be
    transferred to other enrichments based on ORCID data. This is extension is beneficial for Wikidata users performing bibliometrical analyses or using such metadata for other purposes.
  • Senft, G. (2007). Language, culture and cognition: Frames of spatial reference and why we need ontologies of space [Abstract]. In A. G. Cohn, C. Freksa, & B. Bebel (Eds.), Spatial cognition: Specialization and integration (pp. 12).

    Abstract

    One of the many results of the "Space" research project conducted at the MPI for Psycholinguistics is that there are three "Frames of spatial Reference" (FoRs), the relative, the intrinsic and the absolute FoR. Cross-linguistic research showed that speakers who prefer one FoR in verbal spatial references rely on a comparable coding system for memorizing spatial configurations and for making inferences with respect to these spatial configurations in non-verbal problem solving. Moreover, research results also revealed that in some languages these verbal FoRs also influence gestural behavior. These results document the close interrelationship between language, culture and cognition in the domain "Space". The proper description of these interrelationships in the spatial domain requires language and culture specific ontologies.
  • Seuren, P. A. M. (1982). Riorientamenti metodologici nello studio della variabilità linguistica. In D. Gambarara, & A. D'Atri (Eds.), Ideologia, filosofia e linguistica: Atti del Convegno Internazionale di Studi, Rende (CS) 15-17 Settembre 1978 ( (pp. 499-515). Roma: Bulzoni.
  • Stevens, M. A., McQueen, J. M., & Hartsuiker, R. J. (2007). No lexically-driven perceptual adjustments of the [x]-[h] boundary. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1897-1900). Dudweiler: Pirrot.

    Abstract

    Listeners can make perceptual adjustments to phoneme categories in response to a talker who consistently produces a specific phoneme ambiguously. We investigate here whether this type of perceptual learning is also used to adapt to regional accent differences. Listeners were exposed to words produced by a Flemish talker whose realization of [x℄or [h℄ was ambiguous (producing [x℄like [h℄is a property of the West-Flanders regional accent). Before and after exposure they categorized a [x℄-[h℄continuum. For both Dutch and Flemish listeners there was no shift of the categorization boundary after exposure to ambiguous sounds in [x℄- or [h℄-biasing contexts. The absence of a lexically-driven learning effect for this contrast may be because [h℄is strongly influenced by coarticulation. As is not stable across contexts, it may be futile to adapt its representation when new realizations are heard
  • Ten Bosch, L., Baayen, R. H., & Ernestus, M. (2006). On speech variation and word type differentiation by articulatory feature representations. In Proceedings of Interspeech 2006 (pp. 2230-2233).

    Abstract

    This paper describes ongoing research aiming at the description of variation in speech as represented by asynchronous articulatory features. We will first illustrate how distances in the articulatory feature space can be used for event detection along speech trajectories in this space. The temporal structure imposed by the cosine distance in articulatory feature space coincides to a large extent with the manual segmentation on phone level. The analysis also indicates that the articulatory feature representation provides better such alignments than the MFCC representation does. Secondly, we will present first results that indicate that articulatory features can be used to probe for acoustic differences in the onsets of Dutch singulars and plurals.
  • ten Bosch, L., Hämäläinen, A., Scharenborg, O., & Boves, L. (2006). Acoustic scores and symbolic mismatch penalties in phone lattices. In Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing [ICASSP 2006]. IEEE.

    Abstract

    This paper builds on previous work that aims at unraveling the structure of the speech signal by means of using probabilistic representations. The context of this work is a multi-pass speech recognition system in which a phone lattice is created and used as a basis for a lexical search in which symbolic mismatches are allowed at certain costs. The focus is on the optimization of the costs of phone insertions, deletions and substitutions that are used in the lexical decoding pass. Two optimization approaches are presented, one related to a multi-pass computational model for human speech recognition, the other based on a decoding in which Bayes’ risks are minimized. In the final section, the advantages of these optimization methods are discussed and compared.
  • Ter Bekke, M., Drijvers, L., & Holler, J. (2020). The predictive potential of hand gestures during conversation: An investigation of the timing of gestures in relation to speech. In Proceedings of the 7th GESPIN - Gesture and Speech in Interaction Conference. Stockholm: KTH Royal Institute of Technology.

    Abstract

    In face-to-face conversation, recipients might use the bodily movements of the speaker (e.g. gestures) to facilitate language processing. It has been suggested that one way through which this facilitation may happen is prediction. However, for this to be possible, gestures would need to precede speech, and it is unclear whether this is true during natural conversation.
    In a corpus of Dutch conversations, we annotated hand gestures that represent semantic information and occurred during questions, and the word(s) which corresponded most closely to the gesturally depicted meaning. Thus, we tested whether representational gestures temporally precede their lexical affiliates. Further, to see whether preceding gestures may indeed facilitate language processing, we asked whether the gesture-speech asynchrony predicts the response time to the question the gesture is part of.
    Gestures and their strokes (most meaningful movement component) indeed preceded the corresponding lexical information, thus demonstrating their predictive potential. However, while questions with gestures got faster responses than questions without, there was no evidence that questions with larger gesture-speech asynchronies get faster responses. These results suggest that gestures indeed have the potential to facilitate predictive language processing, but further analyses on larger datasets are needed to test for links between asynchrony and processing advantages.
  • Thompson, B., Raviv, L., & Kirby, S. (2020). Complexity can be maintained in small populations: A model of lexical variability in emerging sign languages. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 440-442). Nijmegen: The Evolution of Language Conferences.
  • Tsoukala, C., Frank, S. L., Van den Bosch, A., Kroff, J. V., & Broersma, M. (2020). Simulating Spanish-English code-switching: El modelo está generating code-switches. In E. Chersoni, C. Jacobs, Y. Oseki, L. Prévot, & E. Santus (Eds.), Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (pp. 20-29). Stroudsburg, PA, USA: Association for Computational Linguistics (ACL).

    Abstract

    Multilingual speakers are able to switch from
    one language to the other (“code-switch”) be-
    tween or within sentences. Because the under-
    lying cognitive mechanisms are not well un-
    derstood, in this study we use computational
    cognitive modeling to shed light on the pro-
    cess of code-switching. We employed the
    Bilingual Dual-path model, a Recurrent Neu-
    ral Network of bilingual sentence production
    (Tsoukala et al., 2017) and simulated sentence
    production in simultaneous Spanish-English
    bilinguals. Our first goal was to investigate
    whether the model would code-switch with-
    out being exposed to code-switched training
    input. The model indeed produced code-
    switches even without any exposure to such
    input and the patterns of code-switches are
    in line with earlier linguistic work (Poplack,
    1980). The second goal of this study was to
    investigate an auxiliary phrase asymmetry that
    exists in Spanish-English code-switched pro-
    duction. Using this cognitive model, we ex-
    amined a possible cause for this asymmetry.
    To our knowledge, this is the first computa-
    tional cognitive model that aims to simulate
    code-switched sentence production.
  • Tuinman, A. (2006). Overcompensation of /t/ reduction in Dutch by German/Dutch bilinguals. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 101-102).
  • Tuinman, A., Mitterer, H., & Cutler, A. (2007). Speakers differentiate English intrusive and onset /r/, but L2 listeners do not. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1905-1908). Dudweiler: Pirrot.

    Abstract

    We investigated whether non-native listeners can exploit phonetic detail in recognizing potentially ambiguous utterances, as native listeners can [6, 7, 8, 9, 10]. Due to the phenomenon of intrusive /r/, the English phrase extra ice may sound like extra rice. A production study indicates that the intrusive /r/ can be distinguished from the onset /r/ in rice, as it is phonetically weaker. In two cross-modal identity priming studies, however, we found no conclusive evidence that Dutch learners of English are able to make use of this difference. Instead, auditory primes such as extra rice and extra ice with onset and intrusive /r/s activate both types of targets such as ice and rice. This supports the notion of spurious lexical activation in L2 perception.
  • Van Alphen, P. M., De Bree, E., Fikkert, P., & Wijnen, F. (2007). The role of metrical stress in comprehension and production of Dutch children at risk of dyslexia. In Proceedings of Interspeech 2007 (pp. 2313-2316). Adelaide: Causal Productions.

    Abstract

    The present study compared the role of metrical stress in comprehension and production of three-year-old children with a familial risk of dyslexia with that of normally developing children. A visual fixation task with stress (mis-)matches in bisyllabic words, as well as a non-word repetition task with bisyllabic targets were presented to the control and at-risk children. Results show that the at-risk group is less sensitive to stress mismatches in word recognition than the control group. Correct production of metrical stress patterns did not differ significantly between the groups, but the percentages of phonemes produced correctly were lower for the at-risk than the control group. The findings indicate that processing of metrical stress patterns is not impaired in at-risk children, but that the at-risk group cannot exploit metrical stress in word recognition
  • Van den Heuvel, H., Oostdijk, N., Rowland, C. F., & Trilsbeek, P. (2020). The CLARIN Knowledge Centre for Atypical Communication Expertise. In N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. Declerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of the 12th Language Resources and Evaluation Conference (LREC 2020) (pp. 3312-3316). Marseille, France: European Language Resources Association.

    Abstract

    This paper introduces a new CLARIN Knowledge Center which is the K-Centre for Atypical Communication Expertise (ACE for short) which has been established at the Centre for Language and Speech Technology (CLST) at Radboud University. Atypical communication is an umbrella term used here to denote language use by second language learners, people with language disorders or those suffering from language disabilities, but also more broadly by bilinguals and users of sign languages. It involves multiple modalities (text, speech, sign, gesture) and encompasses different developmental stages. ACE closely collaborates with The Language Archive (TLA) at the Max Planck Institute for Psycholinguistics in order to safeguard GDPR-compliant data storage and access. We explain the mission of ACE and show its potential on a number of showcases and a use case.
  • Van Dooren, A. (2020). The temporal perspective of epistemics in Dutch. In M. Franke, N. Kompa, M. Liu, J. L. Mueller, & J. Schwab (Eds.), Proceedings of Sinn Und Bedeutung 24 (pp. 143-160). Osnabrück: Osnabrück University.

    Abstract

    A series of experiments is conducted on naïve native speakers of Dutch and English to study the scope relation between tense and epistemic modality. The results are consistent with the claim that epistemics scope over tense (Stowell 2004, Hacquard 2006, a.o.), and challenge recent research that states that epistemics can, or must, scope under tense (von Fintel and Gillies 2007, Rullmann & Matthewson 2018): Dutch and English participants in a Truth Value Judgment Task judge sentences to be false when the past tense forms of the modals have to and moeten 'have to' are used to make an epistemic claim that held at a time before speech time, and true when they are used to make an epistemic claim that holds at speech time. Moreover, English participants in an Acceptability Judgment Task judge sentences to be infelicitous when the same past tense form of have to is used to make an epistemic claim that held at a time before speech time. Besides these general patterns, the results show variation within and across the two languages, which leads to interesting new questions about the interaction between tense and (epistemic) modality.
  • Van Arkel, J., Woensdregt, M., Dingemanse, M., & Blokpoel, M. (2020). A simple repair mechanism can alleviate computational demands of pragmatic reasoning: simulations and complexity analysis. In R. Fernández, & T. Linzen (Eds.), Proceedings of the 24th Conference on Computational Natural Language Learning (CoNLL 2020) (pp. 177-194). Stroudsburg, PA, USA: The Association for Computational Linguistics. doi:10.18653/v1/2020.conll-1.14.

    Abstract

    How can people communicate successfully while keeping resource costs low in the face of ambiguity? We present a principled theoretical analysis comparing two strategies for disambiguation in communication: (i) pragmatic reasoning, where communicators reason about each other, and (ii) other-initiated repair, where communicators signal and resolve trouble interactively. Using agent-based simulations and computational complexity analyses, we compare the efficiency of these strategies in terms of communicative success, computation cost and interaction cost. We show that agents with a simple repair mechanism can increase efficiency, compared to pragmatic agents, by reducing their computational burden at the cost of longer interactions. We also find that efficiency is highly contingent on the mechanism, highlighting the importance of explicit formalisation and computational rigour.
  • Van den Bos, E. J., & Poletiek, F. H. (2006). Implicit artificial grammar learning in adults and children. In R. Sun (Ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006) (pp. 2619). Austin, TX, USA: Cognitive Science Society.
  • Vernes, S. C. (2020). Understanding bat vocal learning to gain insight into speech and language. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 6). Nijmegen: The Evolution of Language Conferences.
  • Weber, A. (1998). Listening to nonnative language which violates native assimilation rules. In D. Duez (Ed.), Proceedings of the European Scientific Communication Association workshop: Sound patterns of Spontaneous Speech (pp. 101-104).

    Abstract

    Recent studies using phoneme detection tasks have shown that spoken-language processing is neither facilitated nor interfered with by optional assimilation, but is inhibited by violation of obligatory assimilation. Interpretation of these results depends on an assessment of their generality, specifically, whether they also obtain when listeners are processing nonnative language. Two separate experiments are presented in which native listeners of German and native listeners of Dutch had to detect a target fricative in legal monosyllabic Dutch nonwords. All of the nonwords were correct realisations in standard Dutch. For German listeners, however, half of the nonwords contained phoneme strings which violate the German fricative assimilation rule. Whereas the Dutch listeners showed no significant effects, German listeners detected the target fricative faster when the German fricative assimilation was violated than when no violation occurred. The results might suggest that violation of assimilation rules does not have to make processing more difficult per se.
  • Weber, A., Melinger, A., & Lara Tapia, L. (2007). The mapping of phonetic information to lexical presentations in Spanish: Evidence from eye movements. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1941-1944). Dudweiler: Pirrot.

    Abstract

    In a visual-world study, we examined spoken-wordrecognition in Spanish. Spanish listeners followed spoken instructions to click on pictures while their eye movements were monitored. When instructed to click on the picture of a door (puerta), they experienced interference from the picture of a pig (p u e r c o ). The same interference from phonologically related items was observed when the displays contained printed names or a combination of pictures with their names printed underneath, although the effect was strongest for displays with printed names. Implications of the finding that the interference effect can be induced with standard pictorial displays as well as with orthographic displays are discussed.
  • Widlok, T. (2006). Two ways of looking at a Mangetti grove. In A. Takada (Ed.), Proceedings of the workshop: Landscape and society (pp. 11-16). Kyoto: 21st Century Center of Excellence Program.
  • Wittek, A. (1998). Learning verb meaning via adverbial modification: Change-of-state verbs in German and the adverb "wieder" again. In A. Greenhill, M. Hughes, H. Littlefield, & H. Walsh (Eds.), Proceedings of the 22nd Annual Boston University Conference on Language Development (pp. 779-790). Somerville, MA: Cascadilla Press.
  • Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: a professional framework for multimodality research. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1556-1559).

    Abstract

    Utilization of computer tools in linguistic research has gained importance with the maturation of media frameworks for the handling of digital audio and video. The increased use of these tools in gesture, sign language and multimodal interaction studies has led to stronger requirements on the flexibility, the efficiency and in particular the time accuracy of annotation tools. This paper describes the efforts made to make ELAN a tool that meets these requirements, with special attention to the developments in the area of time accuracy. In subsequent sections an overview will be given of other enhancements in the latest versions of ELAN, that make it a useful tool in multimodality research.
  • Wittenburg, P., Broeder, D., Klein, W., Levinson, S. C., & Romary, L. (2006). Foundations of modern language resource archives. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 625-628).

    Abstract

    A number of serious reasons will convince an increasing amount of researchers to store their relevant material in centers which we will call "language resource archives". They combine the duty of taking care of long-term preservation as well as the task to give access to their material to different user groups. Access here is meant in the sense that an active interaction with the data will be made possible to support the integration of new data, new versions or commentaries of all sort. Modern Language Resource Archives will have to adhere to a number of basic principles to fulfill all requirements and they will have to be involved in federations to create joint language resource domains making it even more simple for the researchers to access the data. This paper makes an attempt to formulate the essential pillars language resource archives have to adhere to.
  • Woensdregt, M., & Dingemanse, M. (2020). Other-initiated repair can facilitate the emergence of compositional language. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 474-476). Nijmegen: The Evolution of Language Conferences.
  • Yang, J., Van den Bosch, A., & Frank, S. L. (2020). Less is Better: A cognitively inspired unsupervised model for language segmentation. In M. Zock, E. Chersoni, A. Lenci, & E. Santus (Eds.), Proceedings of the Workshop on the Cognitive Aspects of the Lexicon ( 28th International Conference on Computational Linguistics) (pp. 33-45). Stroudsburg: Association for Computational Linguistics.

    Abstract

    Language users process utterances by segmenting them into many cognitive units, which vary in their sizes and linguistic levels. Although we can do such unitization/segmentation easily, its cognitive mechanism is still not clear. This paper proposes an unsupervised model, Less-is-Better (LiB), to simulate the human cognitive process with respect to language unitization/segmentation. LiB follows the principle of least effort and aims to build a lexicon which minimizes the number of unit tokens (alleviating the effort of analysis) and number of unit types (alleviating the effort of storage) at the same time on any given corpus. LiB’s workflow is inspired by empirical cognitive phenomena. The design makes the mechanism of LiB cognitively plausible and the computational requirement light-weight. The lexicon generated by LiB performs the best among different types of lexicons (e.g. ground-truth words) both from an information-theoretical view and a cognitive view, which suggests that the LiB lexicon may be a plausible proxy of the mental lexicon.

    Additional information

    full text via ACL website
  • Zhang, Y., Amatuni, A., Crain, E., & Yu, C. (2020). Seeking meaning: Examining a cross-situational solution to learn action verbs using human simulation paradigm. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 2854-2860). Montreal, QB: Cognitive Science Society.

    Abstract

    To acquire the meaning of a verb, language learners not only need to find the correct mapping between a specific verb and an action or event in the world, but also infer the underlying relational meaning that the verb encodes. Most verb naming instances in naturalistic contexts are highly ambiguous as many possible actions can be embedded in the same scenario and many possible verbs can be used to describe those actions. To understand whether learners can find the correct verb meaning from referentially ambiguous learning situations, we conducted three experiments using the Human Simulation Paradigm with adult learners. Our results suggest that although finding the right verb meaning from one learning instance is hard, there is a statistical solution to this problem. When provided with multiple verb learning instances all referring to the same verb, learners are able to aggregate information across situations and gradually converge to the correct semantic space. Even in cases where they may not guess the exact target verb, they can still discover the right meaning by guessing a similar verb that is semantically close to the ground truth.

Share this page