Publications

  • Akamine, S., Ghaleb, E., Rasenberg, M., Fernandez, R., Meyer, A. S., & Özyürek, A. (2024). Speakers align both their gestures and words not only to establish but also to maintain reference to create shared labels for novel objects in interaction. In L. K. Samuelson, S. L. Frank, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 2435-2442).

    Abstract

    When we communicate with others, we often repeat aspects of each other's communicative behavior such as sentence structures and words. Such behavioral alignment has been mostly studied for speech or text. Yet, language use is mostly multimodal, flexibly using speech and gestures to convey messages. Here, we explore the use of alignment in speech (words) and co-speech gestures (iconic gestures) in a referential communication task aimed at finding labels for novel objects in interaction. In particular, we investigate how people flexibly use lexical and gestural alignment to create shared labels for novel objects and whether alignment in speech and gesture are related over time. The present study shows that interlocutors establish shared labels multimodally, and alignment in words and iconic gestures are used throughout the interaction. We also show that the amount of lexical alignment positively associates with the amount of gestural alignment over time, suggesting a close relationship between alignment in the vocal and manual modalities.

  • Amatuni, A., Schroer, S. E., Zhang, Y., Peters, R. E., Reza, M. A., Crandall, D., & Yu, C. (2021). In-the-moment visual information from the infant's egocentric view determines the success of infant word learning: A computational study. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 265-271). Vienna: Cognitive Science Society.

    Abstract

    Infants learn the meaning of words from accumulated experiences of real-time interactions with their caregivers. To study the effects of visual sensory input on word learning, we recorded infants' view of the world using head-mounted eye trackers during free-flowing play with a caregiver. While playing, infants were exposed to novel label-object mappings and later learning outcomes for these items were tested after the play session. In this study we use a classification-based approach to link properties of infants' visual scenes during naturalistic labeling moments to their word learning outcomes. We find that a model which integrates both highly informative and ambiguous sensory evidence is a better fit to infants' individual learning outcomes than models where either type of evidence is taken alone, and that raw labeling frequency is unable to account for the word learning differences we observe. Here we demonstrate how a computational model, using only raw pixels taken from the egocentric scene image, can derive insights on human language learning.
  • Bauer, B. L. M. (2021). Formation of numerals in the Romance languages. In Oxford Research Encyclopedia of Linguistics. Oxford: Oxford University Press. doi:10.1093/acrefore/9780199384655.013.685.

    Abstract

    The Romance languages have a rich numeral system that includes cardinals—providing the bases on which the other types of numeral series are built—ordinals, fractions, collectives, approximatives, distributives, and multiplicatives. Latin plays a decisive and continued role in their formation, both as the language to which many numerals go back directly and as an ongoing source for lexemes and formatives. While the Latin numeral system was synthetic, with a distinct ending for each type of numeral, the Romance numerals often feature more than one (unevenly distributed) marker or structure per series, which feature varying degrees of inherited, borrowed, or innovative elements. Formal consistency is strongest in cardinals, followed by ordinals and then the other types of numeral, which also tend to be more analytic or periphrastic. From a morphological perspective, Romance numerals overall have moved away from the inherited syntheticity, but several series continue to be synthetic formations—at least in part—with morphological markers drawn from Latin that may have undergone functional change (e.g. distributive > ordinal > collective). The underlying syntax of Romance numerals is in line with the overall grammatical patterns of Romance languages, as reflected in the prevalence of word order (with arithmetical correlates), connectors, (partial) loss of agreement, and analyticity. Innovation is prominent in the formation of higher numerals with bases beyond ‘thousand’, of teens and decads in Romanian, and of vigesimals in numerous Romance varieties.
  • Ben-Ami, S., Shukla, Vishakha, V., Gupta, P., Shah, P., Ralekar, C., Ganesh, S., Gilad-Gutnick, S., Rubio-Fernández, P., & Sinha, P. (2024). Form perception as a bridge to real-world functional proficiency. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 6094-6102).

    Abstract

    Recognizing the limitations of standard vision assessments in capturing the real-world capabilities of individuals with low vision, we investigated the potential of the Seguin Form Board Test (SFBT), a widely-used intelligence assessment employing a visuo-haptic shape-fitting task, as an estimator of vision's practical utility. We present findings from 23 children from India, who underwent treatment for congenital bilateral dense cataracts, and 21 control participants. To assess the development of functional visual ability, we conducted the SFBT and the standard measure of visual acuity, before and longitudinally after treatment. We observed a dissociation in the development of shape-fitting and visual acuity. Improvements of patients' shape-fitting preceded enhancements in their visual acuity after surgery and emerged even with acuity worse than that of control participants. Our findings highlight the importance of incorporating multi-modal and cognitive aspects into evaluations of visual proficiency in low-vision conditions, to better reflect vision's impact on daily activities.

  • Bock, K., & Levelt, W. J. M. (1994). Language production: Grammatical encoding. In M. A. Gernsbacher (Ed.), Handbook of Psycholinguistics (pp. 945-984). San Diego: Academic Press.
  • Bodur, K., Branje, S., Peirolo, M., Tiscareno, I., & German, J. S. (2021). Domain-initial strengthening in Turkish: Acoustic cues to prosodic hierarchy in stop consonants. In Proceedings of Interspeech 2021 (pp. 1459-1463). doi:10.21437/Interspeech.2021-2230.

    Abstract

    Studies have shown that cross-linguistically, consonants at the left edge of higher-level prosodic boundaries tend to be more forcefully articulated than those at lower-level boundaries, a phenomenon known as domain-initial strengthening. This study tests whether similar effects occur in Turkish, using the Autosegmental-Metrical model proposed by Ipek & Jun [1, 2] as the basis for assessing boundary strength. Productions of /t/ and /d/ were elicited in four domain-initial prosodic positions corresponding to progressively higher-level boundaries: syllable, word, intermediate phrase, and Intonational Phrase. A fifth position, nuclear word, was included in order to better situate it within the prosodic hierarchy. Acoustic correlates of articulatory strength were measured, including closure duration for /d/ and /t/, as well as voice onset time and burst energy for /t/. Our results show that closure duration increases cumulatively from syllable to intermediate phrase, while voice onset time and burst energy are not influenced by boundary strength. These findings provide corroborating evidence for Ipek & Jun’s model, particularly for the distinction between word and intermediate phrase boundaries. Additionally, articulatory strength at the left edge of the nuclear word patterned closely with word-initial position, supporting the view that the nuclear word is not associated with a distinct phrasing domain.
  • Bosker, H. R. (2021). The contribution of amplitude modulations in speech to perceived charisma. In B. Weiss, J. Trouvain, M. Barkat-Defradas, & J. J. Ohala (Eds.), Voice attractiveness: Prosody, phonology and phonetics (pp. 165-181). Singapore: Springer. doi:10.1007/978-981-15-6627-1_10.

    Abstract

    Speech contains pronounced amplitude modulations in the 1–9 Hz range, correlating with the syllabic rate of speech. Recent models of speech perception propose that this rhythmic nature of speech is central to speech recognition and has beneficial effects on language processing. Here, we investigated the contribution of amplitude modulations to the subjective impression listeners have of public speakers. The speech from US presidential candidates Hillary Clinton and Donald Trump in the three TV debates of 2016 was acoustically analyzed by means of modulation spectra. These indicated that Clinton’s speech had more pronounced amplitude modulations than Trump’s speech, particularly in the 1–9 Hz range. A subsequent perception experiment, with listeners rating the perceived charisma of (low-pass filtered versions of) Clinton’s and Trump’s speech, showed that more pronounced amplitude modulations (i.e., more ‘rhythmic’ speech) increased perceived charisma ratings. These outcomes highlight the important contribution of speech rhythm to charisma perception.
  • Bouman, M. A., & Levelt, W. J. M. (1994). Werner E. Reichardt: Levensbericht. In H. W. Pleket (Ed.), Levensberichten en herdenkingen 1993 (pp. 75-80). Amsterdam: Koninklijke Nederlandse Akademie van Wetenschappen.
  • Bowerman, M. (1994). Learning a semantic system: What role do cognitive predispositions play? [Reprint]. In P. Bloom (Ed.), Language acquisition: Core readings (pp. 329-363). Cambridge, MA: MIT Press.

    Abstract

    Reprint from: Bowerman, M. (1989). Learning a semantic system: What role do cognitive predispositions play? In M. L. Rice & R. L. Schiefelbusch (Eds.), The teachability of language (pp. 133-169). Baltimore: Paul H. Brookes.
  • Bowerman, M. (1979). The acquisition of complex sentences. In M. Garman, & P. Fletcher (Eds.), Studies in language acquisition (pp. 285-305). Cambridge: Cambridge University Press.
  • Bowerman, M. (1973). Structural relationships in children's utterances: Semantic or syntactic? In T. Moore (Ed.), Cognitive development and the acquisition of language (pp. 197-213). New York: Academic Press.
  • Bowerman, M. (1980). The structure and origin of semantic categories in the language learning child. In M. Foster, & S. Brandes (Eds.), Symbol as sense (pp. 277-299). New York: Academic Press.
  • Brown, P. (1980). How and why are women more polite: Some evidence from a Mayan community. In S. McConnell-Ginet, R. Borker, & N. Furman (Eds.), Women and language in literature and society (pp. 111-136). New York: Praeger.
  • Brown, P., & Levinson, S. C. (1979). Social structure, groups and interaction. In H. Giles, & K. R. Scherer (Eds.), Social markers in speech (pp. 291-341). Cambridge: Cambridge University Press.
  • Brown, P., & Fraser, C. (1979). Speech as a marker of situation. In H. Giles, & K. Scherer (Eds.), Social markers in speech (pp. 33-62). Cambridge: Cambridge University Press.
  • Cheung, C.-Y., Kirby, S., & Raviv, L. (2024). The role of gender, social bias and personality traits in shaping linguistic accommodation: An experimental approach. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 80-82). Nijmegen: The Evolution of Language Conferences. doi:10.17617/2.3587960.
  • Coopmans, C. W., De Hoop, H., Kaushik, K., Hagoort, P., & Martin, A. E. (2021). Structure-(in)dependent interpretation of phrases in humans and LSTMs. In Proceedings of the Society for Computation in Linguistics (SCiL 2021) (pp. 459-463).

    Abstract

    In this study, we compared the performance of a long short-term memory (LSTM) neural network to the behavior of human participants on a language task that requires hierarchically structured knowledge. We show that humans interpret ambiguous noun phrases, such as second blue ball, in line with their hierarchical constituent structure. LSTMs, instead, only do so after unambiguous training, and they do not systematically generalize to novel items. Overall, the results of our simulations indicate that a model can behave hierarchically without relying on hierarchical constituent structure.
  • Cos, F., Bujok, R., & Bosker, H. R. (2024). Test-retest reliability of audiovisual lexical stress perception after >1.5 years. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 871-875). doi:10.21437/SpeechProsody.2024-176.

    Abstract

    In natural communication, we typically both see and hear our conversation partner. Speech comprehension thus requires the integration of auditory and visual information from the speech signal. This is for instance evidenced by the Manual McGurk effect, where the perception of lexical stress is biased towards the syllable that has a beat gesture aligned to it. However, there is considerable individual variation in how heavily gestural timing is weighed as a cue to stress. To assess within-individual consistency, this study investigated the test-retest reliability of the Manual McGurk effect. We reran an earlier Manual McGurk experiment with the same participants, over 1.5 years later. At the group level, we successfully replicated the Manual McGurk effect with a similar effect size. However, a correlation of the by-participant effect sizes in the two identical experiments indicated that there was only a weak correlation between both tests, suggesting that the weighing of gestural information in the perception of lexical stress is stable at the group level, but less so in individuals. Findings are discussed in comparison to other measures of audiovisual integration in speech perception. Index Terms: Audiovisual integration, beat gestures, lexical stress, test-retest reliability
  • Cutler, A., & Jesse, A. (2021). Word stress in speech perception. In J. S. Pardo, L. C. Nygaard, & D. B. Pisoni (Eds.), The handbook of speech perception (2nd ed., pp. 239-265). Chichester: Wiley.
  • Cutler, A. (1979). Beyond parsing and lexical look-up. In R. J. Wales, & E. C. T. Walker (Eds.), New approaches to language mechanisms: a collection of psycholinguistic studies (pp. 133-149). Amsterdam: North-Holland.
  • Cutler, A. (1980). Errors of stress and intonation. In V. A. Fromkin (Ed.), Errors in linguistic performance: Slips of the tongue, ear, pen and hand (pp. 67-80). New York: Academic Press.
  • Cutler, A. (1994). How human speech recognition is affected by phonological diversity among languages. In R. Togneri (Ed.), Proceedings of the fifth Australian International Conference on Speech Science and Technology: Vol. 1 (pp. 285-288). Canberra: Australian Speech Science and Technology Association.

    Abstract

    Listeners process spoken language in ways which are adapted to the phonological structure of their native language. As a consequence, non-native speakers do not listen to a language in the same way as native speakers; moreover, listeners may use their native language listening procedures inappropriately with foreign input. With sufficient experience, however, it may be possible to inhibit this latter (counter-productive) behavior.
  • Cutler, A., & Norris, D. (1979). Monitoring sentence comprehension. In W. E. Cooper, & E. C. T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 113-134). Hillsdale: Erlbaum.
  • Cutler, A. (1980). Productivity in word formation. In J. Kreiman, & A. E. Ojeda (Eds.), Papers from the Sixteenth Regional Meeting, Chicago Linguistic Society (pp. 45-51). Chicago, Ill.: CLS.
  • Cutler, A. (1980). Syllable omission errors and isochrony. In H. W. Dechert, & M. Raupach (Eds.), Temporal variables in speech: studies in honour of Frieda Goldman-Eisler (pp. 183-190). The Hague: Mouton.
  • Cutler, A., & Young, D. (1994). Rhythmic structure of word blends in English. In Proceedings of the Third International Conference on Spoken Language Processing (pp. 1407-1410). Kobe: Acoustical Society of Japan.

    Abstract

    Word blends combine fragments from two words, either in speech errors or when a new word is created. Previous work has demonstrated that in Japanese, such blends preserve moraic structure; in English they do not. A similar effect of moraic structure is observed in perceptual research on segmentation of continuous speech in Japanese; English listeners, by contrast, exploit stress units in segmentation, suggesting that a general rhythmic constraint may underlie both findings. The present study examined whether this parallel would also hold for word blends. In spontaneous English polysyllabic blends, the source words were significantly more likely to be split before a strong than before a weak (unstressed) syllable, i.e. to be split at a stress unit boundary. In an experiment in which listeners were asked to identify the source words of blends, significantly more correct detections resulted when splits had been made before strong syllables. Word blending, like speech segmentation, appears to be constrained by language rhythm.
  • Cutler, A., & Isard, S. D. (1980). The production of prosody. In B. Butterworth (Ed.), Language production (pp. 245-269). London: Academic Press.
  • Cutler, A., McQueen, J. M., Baayen, R. H., & Drexler, H. (1994). Words within words in a real-speech corpus. In R. Togneri (Ed.), Proceedings of the 5th Australian International Conference on Speech Science and Technology: Vol. 1 (pp. 362-367). Canberra: Australian Speech Science and Technology Association.

    Abstract

    In a 50,000-word corpus of spoken British English the occurrence of words embedded within other words is reported. Within-word embedding in this real speech sample is common, and analogous to the extent of embedding observed in the vocabulary. Imposition of a syllable boundary matching constraint reduces but by no means eliminates spurious embedding. Embedded words are most likely to overlap with the beginning of matrix words, and thus may pose serious problems for speech recognisers.
  • Dang, A., Raviv, L., & Galke, L. (2024). Testing the linguistic niche hypothesis in large language models with a multilingual Wug test. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 91-93). Nijmegen: The Evolution of Language Conferences.
  • D'Avis, F.-J., & Gretsch, P. (1994). Variations on "Variation": On the Acquisition of Complementizers in German. In R. Tracy, & E. Lattey (Eds.), How Tolerant is Universal Grammar? (pp. 59-109). Tübingen, Germany: Max-Niemeyer-Verlag.
  • Defina, R., Dingemanse, M., & Van Putten, S. (2024). Linguistic fieldwork as team science. In E. Aboh (Ed.), Predication in African Languages (pp. 20-42). Amsterdam: John Benjamins. doi:10.1075/slcs.235.01def.

    Abstract


    Linguistic fieldwork is increasingly moving forward from the traditional model of lone fieldworker with a notebook to collaborative projects with key roles for native speakers and other experts and involving the use of different kinds of stimulus-based elicitation methods as well as extensive video documentation. Several cohorts of colleagues and students have been influenced by this inclusive and interdisciplinary view of linguistic fieldwork. We describe the challenges and benefits of doing multi-methods collaborative fieldwork. As linguistics inevitably moves into the direction of multiple methods, interdisciplinarity and team science, now is the time to reflect critically on how best to contribute to a cumulative science of language.
  • Dona, L., & Schouwstra, M. (2024). Balancing regularization and variation: The roles of priming and motivatedness. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 130-133). Nijmegen: The Evolution of Language Conferences.
  • Falk, J. J., Zhang, Y., Scheutz, M., & Yu, C. (2021). Parents adaptively use anaphora during parent-child social interaction. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 1472-1478). Vienna: Cognitive Science Society.

    Abstract

    Anaphora, a ubiquitous feature of natural language, poses a particular challenge to young children as they first learn language due to its referential ambiguity. In spite of this, parents and caregivers use anaphora frequently in child-directed speech, potentially presenting a risk to effective communication if children do not yet have the linguistic capabilities of resolving anaphora successfully. Through an eye-tracking study in a naturalistic free-play context, we examine the strategies that parents employ to calibrate their use of anaphora to their child's linguistic development level. We show that, in this way, parents are able to intuitively scaffold the complexity of their speech such that greater referential ambiguity does not hurt overall communication success.
  • Frost, R. L. A., & Casillas, M. (2021). Investigating statistical learning of nonadjacent dependencies: Running statistical learning tasks in non-WEIRD populations. In SAGE Research Methods Cases. doi:10.4135/9781529759181.

    Abstract

    Language acquisition is complex. However, one thing that has been suggested to help learning is the way that information is distributed throughout language; co-occurrences among particular items (e.g., syllables and words) have been shown to help learners discover the words that a language contains and figure out how those words are used. Humans’ ability to draw on this information—“statistical learning”—has been demonstrated across a broad range of studies. However, evidence from non-WEIRD (Western, Educated, Industrialized, Rich, and Democratic) societies is critically lacking, which limits theorizing on the universality of this skill. We extended work on statistical language learning to a new, non-WEIRD linguistic population: speakers of Yélî Dnye, who live on a remote island off mainland Papua New Guinea (Rossel Island). We performed a replication of an existing statistical learning study, training adults on an artificial language with statistically defined words, then examining what they had learnt using a two-alternative forced-choice test. Crucially, we implemented several key amendments to the original study to ensure the replication was suitable for remote field-site testing with speakers of Yélî Dnye. We made critical changes to the stimuli and materials (to test speakers of Yélî Dnye, rather than English), the instructions (we re-worked these significantly, and added practice tasks to optimize participants’ understanding), and the study format (shifting from a lab-based to a portable tablet-based setup). We discuss the requirement for acute sensitivity to linguistic, cultural, and environmental factors when adapting studies to test new populations.

  • Galke, L., Franke, B., Zielke, T., & Scherp, A. (2021). Lifelong learning of graph neural networks for open-world node classification. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN). Piscataway, NJ: IEEE. doi:10.1109/IJCNN52387.2021.9533412.

    Abstract

    Graph neural networks (GNNs) have emerged as the standard method for numerous tasks on graph-structured data such as node classification. However, real-world graphs are often evolving over time and even new classes may arise. We model these challenges as an instance of lifelong learning, in which a learner faces a sequence of tasks and may take over knowledge acquired in past tasks. Such knowledge may be stored explicitly as historic data or implicitly within model parameters. In this work, we systematically analyze the influence of implicit and explicit knowledge. Therefore, we present an incremental training method for lifelong learning on graphs and introduce a new measure based on k-neighborhood time differences to address variances in the historic data. We apply our training method to five representative GNN architectures and evaluate them on three new lifelong node classification datasets. Our results show that no more than 50% of the GNN's receptive field is necessary to retain at least 95% accuracy compared to training over the complete history of the graph data. Furthermore, our experiments confirm that implicit knowledge becomes more important when less explicit knowledge is available.
  • Galke, L., Seidlmayer, E., Lüdemann, G., Langnickel, L., Melnychuk, T., Förstner, K. U., Tochtermann, K., & Schultz, C. (2021). COVID-19++: A citation-aware Covid-19 dataset for the analysis of research dynamics. In Y. Chen, H. Ludwig, Y. Tu, U. Fayyad, X. Zhu, X. Hu, S. Byna, X. Liu, J. Zhang, S. Pan, V. Papalexakis, J. Wang, A. Cuzzocrea, & C. Ordonez (Eds.), Proceedings of the 2021 IEEE International Conference on Big Data (pp. 4350-4355). Piscataway, NJ: IEEE.

    Abstract

    COVID-19 research datasets are crucial for analyzing research dynamics. Most collections of COVID-19 research items do not include cited works and do not have annotations from a controlled vocabulary. Starting with ZB MED KE data on COVID-19, which comprises CORD-19, we assemble a new dataset that includes cited work and MeSH annotations for all records. Furthermore, we conduct experiments on the analysis of research dynamics, in which we investigate predicting links in a co-annotation graph created on the basis of the new dataset. Surprisingly, we find that simple heuristic methods are better at predicting future links than more sophisticated approaches such as graph neural networks.
  • Galke, L., Ram, Y., & Raviv, L. (2024). Learning pressures and inductive biases in emergent communication: Parallels between humans and deep neural networks. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 197-201). Nijmegen: The Evolution of Language Conferences.
  • Ghaleb, E., Rasenberg, M., Pouw, W., Toni, I., Holler, J., Özyürek, A., & Fernandez, R. (2024). Analysing cross-speaker convergence through the lens of automatically detected shared linguistic constructions. In L. K. Samuelson, S. L. Frank, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 1717-1723).

    Abstract

    Conversation requires a substantial amount of coordination between dialogue participants, from managing turn taking to negotiating mutual understanding. Part of this coordination effort surfaces as the reuse of linguistic behaviour across speakers, a process often referred to as alignment. While the presence of linguistic alignment is well documented in the literature, several questions remain open, including the extent to which patterns of reuse across speakers have an impact on the emergence of labelling conventions for novel referents. In this study, we put forward a methodology for automatically detecting shared lemmatised constructions---expressions with a common lexical core used by both speakers within a dialogue---and apply it to a referential communication corpus where participants aim to identify novel objects for which no established labels exist. Our analyses uncover the usage patterns of shared constructions in interaction and reveal that features such as their frequency and the amount of different constructions used for a referent are associated with the degree of object labelling convergence the participants exhibit after social interaction. More generally, the present study shows that automatically detected shared constructions offer a useful level of analysis to investigate the dynamics of reference negotiation in dialogue.

  • Ghaleb, E., Burenko, I., Rasenberg, M., Pouw, W., Uhrig, P., Holler, J., Toni, I., Özyürek, A., & Fernandez, R. (2024). Cospeech gesture detection through multi-phase sequence labeling. In Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024) (pp. 4007-4015).

    Abstract

    Gestures are integral components of face-to-face communication. They unfold over time, often following predictable movement phases of preparation, stroke, and retraction. Yet, the prevalent approach to automatic gesture detection treats the problem as binary classification, classifying a segment as either containing a gesture or not, thus failing to capture its inherently sequential and contextual nature. To address this, we introduce a novel framework that reframes the task as a multi-phase sequence labeling problem rather than binary classification. Our model processes sequences of skeletal movements over time windows, uses Transformer encoders to learn contextual embeddings, and leverages Conditional Random Fields to perform sequence labeling. We evaluate our proposal on a large dataset of diverse co-speech gestures in task-oriented face-to-face dialogues. The results consistently demonstrate that our method significantly outperforms strong baseline models in detecting gesture strokes. Furthermore, applying Transformer encoders to learn contextual embeddings from movement sequences substantially improves gesture unit detection. These results highlight our framework’s capacity to capture the fine-grained dynamics of co-speech gesture phases, paving the way for more nuanced and accurate gesture detection and analysis.
  • Grosseck, O., Perlman, M., Ortega, G., & Raviv, L. (2024). The iconic affordances of gesture and vocalization in emerging languages in the lab. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 223-225). Nijmegen: The Evolution of Language Conferences.
  • Hagoort, P., & Brown, C. M. (1994). Brain responses to lexical ambiguity resolution and parsing. In C. Clifton Jr, L. Frazier, & K. Rayner (Eds.), Perspectives on sentence processing (pp. 45-81). Hillsdale, NJ: Lawrence Erlbaum Associates.
  • Harmon, Z., Barak, L., Shafto, P., Edwards, J., & Feldman, N. H. (2021). Making heads or tails of it: A competition–compensation account of morphological deficits in language impairment. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 1872-1878). Vienna: Cognitive Science Society.

    Abstract

    Children with developmental language disorder (DLD) regularly use the base form of verbs (e.g., dance) instead of inflected forms (e.g., danced). We propose an account of this behavior in which children with DLD have difficulty processing novel inflected verbs in their input. This leads the inflected form to face stronger competition from alternatives. Competition is resolved by the production of a more accessible alternative with high semantic overlap with the inflected form: in English, the bare form. We test our account computationally by training a nonparametric Bayesian model that infers the productivity of the inflectional suffix (-ed). We systematically vary the number of novel types of inflected verbs in the input to simulate the input as processed by children with and without DLD. Modeling results are consistent with our hypothesis, suggesting that children’s inconsistent use of inflectional morphemes could stem from inferences they make on the basis of impoverished data.
  • Hellwig, B., Defina, R., Kidd, E., Allen, S. E. M., Davidson, L., & Kelly, B. F. (2021). Child language documentation: The sketch acquisition project. In G. Haig, S. Schnell, & F. Seifart (Eds.), Doing corpus-based typology with spoken language data: State of the art (pp. 29-58). Honolulu, HI: University of Hawai'i Press.

    Abstract

    This paper reports on an on-going project designed to collect comparable corpus data on child language and child-directed language in under-researched languages. Despite a long history of cross-linguistic research, there is a severe empirical bias within language acquisition research: Data is available for less than 2% of the world's languages, heavily skewed towards the larger and better-described languages. As a result, theories of language development tend to be grounded in a non-representative sample, and we know little about the acquisition of typologically-diverse languages from different families, regions, or sociocultural contexts. It is very likely that the reasons are to be found in the forbidding methodological challenges of constructing child language corpora under fieldwork conditions with their strict requirements on participant selection, sampling intervals, and amounts of data. There is thus an urgent need for proposals that facilitate and encourage language acquisition research across a wide variety of languages. Adopting a language documentation perspective, we illustrate an approach that combines the construction of manageable corpora of natural interaction with and between children with a sketch description of the corpus data – resulting in a set of comparable corpora and comparable sketches that form the basis for cross-linguistic comparisons.
  • Hintz, F., Voeten, C. C., McQueen, J. M., & Scharenborg, O. (2021). The effects of onset and offset masking on the time course of non-native spoken-word recognition in noise. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 133-139). Vienna: Cognitive Science Society.

    Abstract

    Using the visual-world paradigm, the present study investigated the effects of word onset and offset masking on the time course of non-native spoken-word recognition in the presence of background noise. In two experiments, Dutch non-native listeners heard English target words, preceded by carrier sentences that were noise-free (Experiment 1) or contained intermittent noise (Experiment 2). Target words were either onset- or offset-masked or not masked at all. Results showed that onset masking delayed target word recognition more than offset masking did, suggesting that – similar to natives – non-native listeners strongly rely on word onset information during word recognition in noise.

    Additional information

    Link to Preprint on BioRxiv
  • Joshi, A., Mohanty, R., Kanakanti, M., Mangla, A., Choudhary, S., Barbate, M., & Modi, A. (2024). iSign: A benchmark for Indian Sign Language processing. In L.-W. Ku, A. Martins, & V. Srikumar (Eds.), Findings of the Association for Computational Linguistics ACL 2024 (pp. 10827-10844). Bangkok, Thailand: Association for Computational Linguistics.

    Abstract

    Indian Sign Language has limited resources for developing machine learning and data-driven approaches for automated language processing. Though text/audio-based language processing techniques have shown colossal research interest and tremendous improvements in the last few years, Sign Languages still need to catch up due to the need for more resources. To bridge this gap, in this work, we propose iSign: a benchmark for Indian Sign Language (ISL) Processing. We make three primary contributions to this work. First, we release one of the largest ISL-English datasets with more than video-sentence/phrase pairs. To the best of our knowledge, it is the largest sign language dataset available for ISL. Second, we propose multiple NLP-specific tasks (including SignVideo2Text, SignPose2Text, Text2Pose, Word Prediction, and Sign Semantics) and benchmark them with the baseline models for easier access to the research community. Third, we provide detailed insights into the proposed benchmarks with a few linguistic insights into the working of ISL. We streamline the evaluation of Sign Language processing, addressing the gaps in the NLP research community for Sign Languages. We release the dataset, tasks and models via the following website: https://exploration-lab.github.io/iSign/

    Additional information

    dataset, tasks, models
  • Josserand, M., Pellegrino, F., Grosseck, O., Dediu, D., & Raviv, L. (2024). Adapting to individual differences: An experimental study of variation in language evolution. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 286-289). Nijmegen: The Evolution of Language Conferences.
  • Karaca, F., Brouwer, S., Unsworth, S., & Huettig, F. (2021). Prediction in bilingual children: The missing piece of the puzzle. In E. Kaan, & T. Grüter (Eds.), Prediction in Second Language Processing and Learning (pp. 116-137). Amsterdam: Benjamins.

    Abstract

    A wealth of studies has shown that more proficient monolingual speakers are better at predicting upcoming information during language comprehension. Similarly, prediction skills of adult second language (L2) speakers in their L2 have also been argued to be modulated by their L2 proficiency. How exactly language proficiency and prediction are linked, however, is yet to be systematically investigated. One group of language users which has the potential to provide invaluable insights into this link is bilingual children. In this paper, we compare bilingual children’s prediction skills with those of monolingual children and adult L2 speakers, and show how investigating bilingual children’s prediction skills may contribute to our understanding of how predictive processing works.
  • Karadöller, D. Z., Sumer, B., Ünal, E., & Ozyurek, A. (2021). Spatial language use predicts spatial memory of children: Evidence from sign, speech, and speech-plus-gesture. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 672-678). Vienna: Cognitive Science Society.

    Abstract

    There is a strong relation between children’s exposure to spatial terms and their later memory accuracy. In the current study, we tested whether the production of spatial terms by children themselves predicts memory accuracy and whether and how language modality of these encodings modulates memory accuracy differently. Hearing child speakers of Turkish and deaf child signers of Turkish Sign Language described pictures of objects in various spatial relations to each other and were later tested for their memory accuracy of these pictures in a surprise memory task. We found that having described the spatial relation between the objects predicted better memory accuracy. However, the modality of these descriptions in sign, speech, or speech-plus-gesture did not reveal differences in memory accuracy. We discuss the implications of these findings for the relation between spatial language, memory, and the modality of encoding.
  • Kempen, G. (1979). A study of syntactic bookkeeping during sentence production. In H. Ueckert, & D. Rhenius (Eds.), Komplexe menschliche Informationsverarbeitung (pp. 361-368). Bern: Hans Huber.

    Abstract

    It is an important feature of the human sentence production system that semantic and syntactic processes may overlap in time and do not proceed strictly serially. That is, the process of building the syntactic form of an utterance does not always wait until the complete semantic content for that utterance has been decided upon. On the contrary, speakers will often start pronouncing the first words of a sentence while still working on further details of its semantic content. An important advantage is memory economy. Semantic and syntactic fragments do not have to occupy working memory until complete semantic and syntactic structures for an utterance have been computed. Instead, each semantic and syntactic fragment is processed as soon as possible and is kept in working memory for a minimum period of time. This raises the question of how the sentence production system can maintain syntactic coherence across syntactic fragments. Presumably there are processes of "syntactic bookkeeping" which (1) store in working memory those syntactic properties of a fragmentary sentence which are needed to eliminate ungrammatical continuations, and (2) check whether a prospective continuation is indeed compatible with the sentence constructed so far. In reaction time experiments where subjects described, under time pressure, simple static pictures of an action performed by an actor, the second aspect of syntactic bookkeeping could be demonstrated. This evidence is used for modelling bookkeeping processes as part of a computational sentence generator which aims at simulating the syntactic operations people carry out during spontaneous speech.
  • Kempen, G. (1994). Innovative language checking software for Dutch. In J. Van Gent, & E. Peeters (Eds.), Proceedings of the 2e Dag van het Document (pp. 99-100). Delft: TNO Technisch Physische Dienst.
  • Kempen, G. (1994). The unification space: A hybrid model of human syntactic processing [Abstract]. In Cuny 1994 - The 7th Annual CUNY Conference on Human Sentence Processing. March 17-19, 1994. CUNY Graduate Center, New York.
  • Kempen, G., & Dijkstra, A. (1994). Toward an integrated system for grammar, writing and spelling instruction. In L. Appelo, & F. De Jong (Eds.), Computer-Assisted Language Learning: Proceedings of the Seventh Twente Workshop on Language Technology (pp. 41-46). Enschede: University of Twente.
  • Klein, W. (2021). Das „Heidelberger Forschungsprojekt Pidgin-Deutsch“ und die Folgen. In B. Ahrenholz, & M. Rost-Roth (Eds.), Ein Blick zurück nach vorn: Frühe deutsche Forschung zu Zweitspracherwerb, Migration, Mehrsprachigkeit und zweitsprachbezogener Sprachdidaktik sowie ihre Bedeutung heute (pp. 50-95). Berlin: De Gruyter.
  • Klein, W. (1973). Eine Analyse der Kerne in Schillers "Räuber". In S. Marcus (Ed.), Mathematische Poetik (pp. 326-333). Frankfurt am Main: Athenäum.
  • Klein, W. (1979). Die Geschichte eines Tores. In R. Baum, F. J. Hausmann, & I. Monreal-Wickert (Eds.), Sprache in Unterricht und Forschung: Schwerpunkt Romanistik (pp. 175-194). Tübingen: Narr.
  • Klein, W. (1973). Dialekt und Einheitssprache im Fremdsprachenunterricht. In Beiträge zu den Sommerkursen des Goethe-Instituts München (pp. 53-60).
  • Klein, W. (1967). Einführende Bibliographie zu "Mathematik und Dichtung". In H. Kreuzer, & R. Gunzenhäuser (Eds.), Mathematik und Dichtung (pp. 347-359). München: Nymphenburger.
  • Klein, W. (1994). Für eine rein zeitliche Deutung von Tempus und Aspekt. In R. Baum (Ed.), Lingua et Traditio: Festschrift für Hans Helmut Christmann zum 65. Geburtstag (pp. 409-422). Tübingen: Narr.
  • Klein, W. (1994). Keine Känguruhs zur Linken: Über die Variabilität von Raumvorstellungen und ihren Ausdruck in der Sprache. In H.-J. Kornadt, J. Grabowski, & R. Mangold-Allwinn (Eds.), Sprache und Kognition (pp. 163-182). Heidelberg, Berlin, Oxford: Spektrum.
  • Klein, W. (1994). Learning how to express temporality in a second language. In A. G. Ramat, & M. Vedovelli (Eds.), Società di linguistica Italiana, SLI 34: Italiano - lingua seconda/lingua straniera: Atti del XXVI Congresso (pp. 227-248). Roma: Bulzoni.
  • Klein, W. (1980). Verbal planning in route directions. In H. Dechert, & M. Raupach (Eds.), Temporal variables in speech (pp. 159-168). Den Haag: Mouton.
  • Koutamanis, E., Kootstra, G. J., Dijkstra, T., & Unsworth, S. (2021). Lexical priming as evidence for language-nonselective access in the simultaneous bilingual child's lexicon. In D. Dionne, & L.-A. Vidal Covas (Eds.), BUCLD 45: Proceedings of the 45th annual Boston University Conference on Language Development (pp. 413-430). Somerville, MA: Cascadilla Press.
  • Kupisch, T., Pereira Soares, S. M., Puig-Mayenco, E., & Rothman, J. (2021). Multilingualism and Chomsky's Generative Grammar. In N. Allott (Ed.), A companion to Chomsky (pp. 232-242). doi:10.1002/9781119598732.ch15.

    Abstract

    Like Einstein's general theory of relativity is concerned with explaining the basics of an observable experience – i.e., gravity – most people take for granted that Chomsky's theory of generative grammar (GG) is concerned with the basic nature of language. This chapter highlights a mere subset of central constructs in GG, showing how they have featured prominently and thus shaped formal linguistic studies in multilingualism. Because multilingualism includes a wide range of nonmonolingual populations, the constructs are divided across child bilingualism and adult third language for greater coverage. In the case of the former, the chapter examines how poverty of the stimulus has been investigated. Using the nascent field of L3/Ln acquisition as the backdrop, it discusses how the GG constructs of I-language versus E-language sit at the core of debates regarding the very notion of what linguistic transfer and mental representations should be taken to be.
  • Lammertink, I., De Heer Kloots, M., Bazioni, M., & Raviv, L. (2024). Learnability effects in children: Are more structured languages easier to learn? In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 320-323). Nijmegen: The Evolution of Language Conferences.
  • Levelt, W. J. M. (1994). Psycholinguistics. In A. M. Colman (Ed.), Companion Encyclopedia of Psychology: Vol. 1 (pp. 319-337). London: Routledge.

    Abstract

    Linguistic skills are primarily tuned to the proper conduct of conversation. The innate ability to converse has provided our species with a capacity to share moods, attitudes, and information of almost any kind, to assemble knowledge and skills, to plan coordinated action, to educate its offspring, in short, to create and transmit culture. In conversation the interlocutors are involved in negotiating meaning. Speaking is a most complex cognitive-motor skill. It involves the conception of an intention, the selection of information whose expression will make that intention recognizable, the selection of appropriate words, the construction of a syntactic framework, the retrieval of the words’ sound forms, and the computation of an articulatory plan for each word and for the utterance as a whole. The question where communicative intentions come from is a psychodynamic question rather than a psycholinguistic one. Speaking is a form of social action, and it is in the context of action that intentions, goals, and subgoals develop.
  • Levelt, W. J. M. (1962). Motion breaking and the perception of causality. In A. Michotte (Ed.), Causalité, permanence et réalité phénoménales: Etudes de psychologie expérimentale (pp. 244-258). Louvain: Publications Universitaires.
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Levelt, W. J. M., & Kempen, G. (1979). Language. In J. A. Michon, E. G. J. Eijkman, & L. F. W. De Klerk (Eds.), Handbook of psychonomics (Vol. 2) (pp. 347-407). Amsterdam: North Holland.
  • Levelt, W. J. M. (1994). On the skill of speaking: How do we access words? In Proceedings ICSLP 94 (pp. 2253-2258). Yokohama: The Acoustical Society of Japan.
  • Levelt, W. J. M. (1980). On-line processing constraints on the properties of signed and spoken language. In U. Bellugi, & M. Studdert-Kennedy (Eds.), Signed and spoken language: Biological constraints on linguistic form (pp. 141-160). Weinheim: Verlag Chemie.

    Abstract

    It is argued that the dominantly successive nature of language is largely mode-independent and holds equally for sign and for spoken language. A preliminary distinction is made between what is simultaneous or successive in the signal, and what is in the process; these need not coincide, and it is the successiveness of the process that is at stake. It is then discussed extensively for the word/sign level, and in a more preliminary fashion for the clause and discourse level that online processes are parallel in that they can simultaneously draw on various sources of knowledge (syntactic, semantic, pragmatic), but successive in that they can work at the interpretation of only one unit at a time. This seems to hold for both sign and spoken language. In the final section, conjectures are made about possible evolutionary explanations for these properties of language processing.
  • Levelt, W. J. M. (1994). Onder woorden brengen: Beschouwingen over het spreekproces. In Haarlemse voordrachten: voordrachten gehouden in de Hollandsche Maatschappij der Wetenschappen te Haarlem. Haarlem: Hollandsche maatschappij der wetenschappen.
  • Levelt, W. J. M., & Plomp, K. (1968). The appreciation of musical intervals. In J. M. M. Aler (Ed.), Proceedings of the fifth International Congress of Aesthetics, Amsterdam 1964 (pp. 901-904). The Hague: Mouton.
  • Levelt, W. J. M. (1979). The origins of language and language awareness. In M. Von Cranach, K. Foppa, W. Lepenies, & D. Ploog (Eds.), Human ethology (pp. 739-745). Cambridge: Cambridge University Press.
  • Levelt, W. J. M. (1994). The skill of speaking. In P. Bertelson, P. Eelen, & G. d'Ydewalle (Eds.), International perspectives on psychological science: Vol. 1. Leading themes (pp. 89-103). Hove: Erlbaum.
  • Levelt, W. J. M. (1994). What can a theory of normal speaking contribute to AAC? In ISAAC '94 Conference Book and Proceedings. Hoensbroek: IRV.
  • Levelt, W. J. M. (1980). Toegepaste aspecten van het taal-psychologisch onderzoek: Enkele inleidende overwegingen. In J. Matter (Ed.), Toegepaste aspekten van de taalpsychologie (pp. 3-11). Amsterdam: VU Boekhandel.
  • Levinson, S. C. (1994). Deixis. In R. E. Asher (Ed.), Encyclopedia of language and linguistics (pp. 853-857). Oxford: Pergamon Press.
  • Levinson, S. C. (1979). Pragmatics and social deixis: Reclaiming the notion of conventional implicature. In C. Chiarello (Ed.), Proceedings of the Fifth Annual Meeting of the Berkeley Linguistics Society (pp. 206-223).
  • Levinson, S. C., & Senft, G. (1994). Wie lösen Sprecher von Sprachen mit absoluten und relativen Systemen des räumlichen Verweisens nicht-sprachliche räumliche Aufgaben? In Jahrbuch der Max-Planck-Gesellschaft 1994 (pp. 295-299). München: Generalverwaltung der Max-Planck-Gesellschaft München.
  • Levinson, S. C. (2024). Culture as cognitive technology: An evolutionary perspective. In G. Bennardo, V. C. De Munck, & S. Chrisomalis (Eds.), Cognition in and out of the mind: Advances in cultural model theory (pp. 241-265). London: Palgrave Macmillan.

    Abstract

    Cognitive anthropology is in need of a theory that extends beyond cultural model theory and explains both how culture has transformed human cognition and the curious ontology of culture itself, for, as Durkheim insisted, culture cannot be reduced to psychology. This chapter promotes a framework that deals with both the evolutionary question and the ontological problem. It is argued that at least a central part of culture should be conceived of in terms of cognitive technology. Beginning with obvious examples of cognitive artifacts, like those used in measurement, way-finding, time-reckoning and numerical calculation, the chapter goes on to consider extensions to our communication systems, emotion-modulating systems and the cognitive division of labor. Cognitive artifacts form ‘coupled systems’ that amplify individual psychology, lying partly outside the head, and are honed by cultural evolution. They make clear how culture gave human cognition an evolutionary edge.
  • Levshina, N. (2021). Conditional inference trees and random forests. In M. Paquot, & T. Gries (Eds.), Practical Handbook of Corpus Linguistics (pp. 611-643). New York: Springer.
  • Liesenfeld, A., & Dingemanse, M. (2024). Rethinking open source generative AI: open-washing and the EU AI Act. In The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24) (pp. 1774-1784). ACM.

    Abstract

    The past year has seen a steep rise in generative AI systems that claim to be open. But how open are they really? The question of what counts as open source in generative AI is poised to take on particular importance in light of the upcoming EU AI Act that regulates open source systems differently, creating an urgent need for practical openness assessment. Here we use an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods. Surveying over 45 generative AI systems (both text and text-to-image), we find that while the term open source is widely used, many models are ‘open weight’ at best and many providers seek to evade scientific, legal and regulatory scrutiny by withholding information on training and fine-tuning data. We argue that openness in generative AI is necessarily composite (consisting of multiple elements) and gradient (coming in degrees), and point out the risk of relying on single features like access or licensing to declare models open or not. Evidence-based openness assessment can help foster a generative AI landscape in which models can be effectively regulated, model providers can be held accountable, scientists can scrutinise generative AI, and end users can make informed decisions.
  • Long, M., & Rubio-Fernandez, P. (2024). Beyond typicality: Lexical category affects the use and processing of color words. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 4925-4930).

    Abstract

    Speakers and listeners show an informativity bias in the use and interpretation of color modifiers. For example, speakers use color more often when referring to objects that vary in color than to objects with a prototypical color. Likewise, listeners look away from objects with prototypical colors upon hearing that color mentioned. Here we test whether speakers and listeners account for another factor related to informativity: the strength of the association between lexical categories and color. Our results demonstrate that speakers and listeners' choices are indeed influenced by this factor; as such, it should be integrated into current pragmatic theories of informativity and computational models of color reference.

    Additional information

    link to eScholarship
  • Lupyan, G., & Raviv, L. (2024). A cautionary note on sociodemographic predictors of linguistic complexity: Different measures and different analyses lead to different conclusions. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 345-348). Nijmegen: The Evolution of Language Conferences.
  • Mak, M., & Willems, R. M. (2021). Mental simulation during literary reading. In D. Kuiken, & A. M. Jacobs (Eds.), Handbook of empirical literary studies (pp. 63-84). Berlin: De Gruyter.

    Abstract

    Readers experience a number of sensations during reading. They do not – or do not only – process words and sentences in a detached, abstract manner. Instead they “perceive” what they read about. They see descriptions of scenery, feel what characters feel, and hear the sounds in a story. These sensations tend to be grouped under the umbrella terms “mental simulation” and “mental imagery.” This chapter provides an overview of empirical research on the role of mental simulation during literary reading. Our chapter also discusses what mental simulation is and how it relates to mental imagery. Moreover, it explores how mental simulation plays a role in leading models of literary reading and investigates under what circumstances mental simulation occurs during literature reading. Finally, the effect of mental simulation on the literary reader’s experience is discussed, and suggestions and unresolved issues in this field are formulated.
  • Mamus, E., Speed, L. J., Ozyurek, A., & Majid, A. (2021). Sensory modality of input influences encoding of motion events in speech but not co-speech gestures. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 376-382). Vienna: Cognitive Science Society.

    Abstract

    Visual and auditory channels have different affordances and this is mirrored in what information is available for linguistic encoding. The visual channel has high spatial acuity, whereas the auditory channel has better temporal acuity. These differences may lead to different conceptualizations of events and affect multimodal language production. Previous studies of motion events typically present visual input to elicit speech and gesture. The present study compared events presented as audio-only, visual-only, or multimodal (visual+audio) input and assessed speech and co-speech gesture for path and manner of motion in Turkish. Speakers with audio-only input mentioned path more and manner less in verbal descriptions, compared to speakers who had visual input. There was no difference in the type or frequency of gestures across conditions, and gestures were dominated by path-only gestures. This suggests that input modality influences speakers’ encoding of path and manner of motion events in speech, but not in co-speech gestures.
  • Matteo, M., & Bosker, H. R. (2024). How to test gesture-speech integration in ten minutes. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 737-741). doi:10.21437/SpeechProsody.2024-149.

    Abstract

    Human conversations are inherently multimodal, including auditory speech, visual articulatory cues, and hand gestures. Recent studies demonstrated that the timing of a simple up-and-down hand movement, known as a beat gesture, can affect speech perception. A beat gesture falling on the first syllable of a disyllabic word induces a bias to perceive a strong-weak stress pattern (i.e., “CONtent”), while a beat gesture falling on the second syllable combined with the same acoustics biases towards a weak-strong stress pattern (“conTENT”). This effect, termed the “manual McGurk effect”, has been studied in both in-lab and online studies, employing standard experimental sessions lasting approximately forty minutes. The present work tests whether the manual McGurk effect can be observed in an online short version (“mini-test”) of the original paradigm, lasting only ten minutes. Additionally, we employ two different response modalities, namely a two-alternative forced choice and a visual analog scale. A significant manual McGurk effect was observed with both response modalities. Overall, the present study demonstrates the feasibility of employing a ten-minute manual McGurk mini-test to obtain a measure of gesture-speech integration. As such, it may lend itself for inclusion in large-scale test batteries that aim to quantify individual variation in language processing.
  • Merkx, D., & Frank, S. L. (2021). Human sentence processing: Recurrence or attention? In E. Chersoni, N. Hollenstein, C. Jacobs, Y. Oseki, L. Prévot, & E. Santus (Eds.), Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2021) (pp. 12-22). Stroudsburg, PA, USA: Association for Computational Linguistics (ACL). doi:10.18653/v1/2021.cmcl-1.2.

    Abstract

    Recurrent neural networks (RNNs) have long been an architecture of interest for computational models of human sentence processing. The recently introduced Transformer architecture outperforms RNNs on many natural language processing tasks but little is known about its ability to model human language processing. We compare Transformer- and RNN-based language models’ ability to account for measures of human reading effort. Our analysis shows Transformers to outperform RNNs in explaining self-paced reading times and neural activity during reading English sentences, challenging the widely held idea that human sentence processing involves recurrent and immediate processing and providing evidence for cue-based retrieval.
  • Merkx, D., Frank, S. L., & Ernestus, M. (2021). Semantic sentence similarity: Size does not always matter. In Proceedings of Interspeech 2021 (pp. 4393-4397). doi:10.21437/Interspeech.2021-1464.

    Abstract

    This study addresses the question whether visually grounded speech recognition (VGS) models learn to capture sentence semantics without access to any prior linguistic knowledge. We produce synthetic and natural spoken versions of a well known semantic textual similarity database and show that our VGS model produces embeddings that correlate well with human semantic similarity judgements. Our results show that a model trained on a small image-caption database outperforms two models trained on much larger databases, indicating that database size is not all that matters. We also investigate the importance of having multiple captions per image and find that this is indeed helpful even if the total number of images is lower, suggesting that paraphrasing is a valuable learning signal. While the general trend in the field is to create ever larger datasets to train models on, our findings indicate other characteristics of the database can be just as important.
  • Mishra, C., Nandanwar, A., & Mishra, S. (2024). HRI in Indian education: Challenges & opportunities. In H. Admoni, D. Szafir, W. Johal, & A. Sandygulova (Eds.), Designing an introductory HRI course (workshop at HRI 2024). ArXiv. doi:10.48550/arXiv.2403.12223.

    Abstract

    With the recent advancements in the field of robotics and the increased focus on having general-purpose robots widely available to the general public, it has become increasingly necessary to pursue research into Human-robot interaction (HRI). While there have been a lot of works discussing frameworks for teaching HRI in educational institutions with a few institutions already offering courses to students, a consensus on the course content still eludes the field. In this work, we highlight a few challenges and opportunities while designing an HRI course from an Indian perspective. These topics warrant further deliberations as they have a direct impact on the design of HRI courses and wider implications for the entire field.
  • Motiekaitytė, K., Grosseck, O., Wolf, L., Bosker, H. R., Peeters, D., Perlman, M., Ortega, G., & Raviv, L. (2024). Iconicity and compositionality in emerging vocal communication systems: a Virtual Reality approach. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 387-389). Nijmegen: The Evolution of Language Conferences.
  • Mudd, K., Lutzenberger, H., De Vos, C., & De Boer, B. (2021). Social structure and lexical uniformity: A case study of gender differences in the Kata Kolok community. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 2692-2698). Vienna: Cognitive Science Society.

    Abstract

    Language emergence is characterized by a high degree of lexical variation. It has been suggested that the speed at which lexical conventionalization occurs depends partially on social structure. In large communities, individuals receive input from many sources, creating a pressure for lexical convergence. In small, insular communities, individuals can remember idiolects and share common ground with interlocutors, allowing these communities to retain a high degree of lexical variation. We look at lexical variation in Kata Kolok, a sign language which emerged six generations ago in a Balinese village, where women tend to have more tightly-knit social networks than men. We test if there are differing degrees of lexical uniformity between women and men by reanalyzing a picture description task in Kata Kolok. We find that women’s productions exhibit less lexical uniformity than men’s. One possible explanation of this finding is that women’s more tightly-knit social networks allow for remembering idiolects, alleviating the pressure for lexical convergence, but social network data from the Kata Kolok community is needed to support this explanation.
  • Norris, D., McQueen, J. M., & Cutler, A. (1994). Competition and segmentation in spoken word recognition. In Proceedings of the Third International Conference on Spoken Language Processing: Vol. 1 (pp. 401-404). Yokohama: PACIFICO.

    Abstract

    This paper describes recent experimental evidence which shows that models of spoken word recognition must incorporate both inhibition between competing lexical candidates and a sensitivity to metrical cues to lexical segmentation. A new version of the Shortlist [1][2] model incorporating the Metrical Segmentation Strategy [3] provides a detailed simulation of the data.
  • Ozyurek, A. (1994). How children talk about a conversation. In K. Beals, J. Denton, R. Knippen, L. Melnar, H. Suzuki, & E. Zeinfeld (Eds.), Papers from the Thirtieth Regional Meeting of the Chicago Linguistic Society: Main Session (pp. 309-319). Chicago, Ill: Chicago Linguistic Society.
  • Ozyurek, A. (1994). How children talk about conversations: Development of roles and voices. In E. V. Clark (Ed.), Proceedings of the Twenty-Sixth Annual Child Language Research Forum (pp. 197-206). Stanford: CSLI Publications.
  • Peirolo, M., Meyer, A. S., & Frances, C. (2024). Investigating the causes of prosodic marking in self-repairs: An automatic process? In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 1080-1084). doi:10.21437/SpeechProsody.2024-218.

    Abstract

    Natural speech involves repair. These repairs are often highlighted through prosodic marking (Levelt & Cutler, 1983). Prosodic marking usually entails an increase in pitch, loudness, and/or duration that draws attention to the corrected word. While it is established that natural self-repairs typically elicit prosodic marking, the exact cause of this is unclear. This study investigates whether producing a prosodic marking emerges from an automatic correction process or has a communicative purpose. In the current study, we elicit corrections to test whether all self-corrections elicit prosodic marking. Participants carried out a picture-naming task in which they described two images presented on-screen. To prompt self-correction, the second image was altered in some cases, requiring participants to abandon their initial utterance and correct their description to match the new image. This manipulation was compared to a control condition in which only the orientation of the object would change, eliciting no self-correction while still presenting a visual change. We found that the replacement of the item did not elicit a prosodic marking, regardless of the type of change. Theoretical implications and research directions are discussed, in particular theories of prosodic planning.
  • Plate, L., Fisher, V. J., Nabibaks, F., & Feenstra, M. (2024). Feeling the traces of the Dutch colonial past: Dance as an affective methodology in Farida Nabibaks’s radiant shadow. In E. Van Bijnen, P. Brandon, K. Fatah-Black, I. Limon, W. Modest, & M. Schavemaker (Eds.), The future of the Dutch colonial past: From dialogues to new narratives (pp. 126-139). Amsterdam: Amsterdam University Press.
  • Pouw, W., Wit, J., Bögels, S., Rasenberg, M., Milivojevic, B., & Ozyurek, A. (2021). Semantically related gestures move alike: Towards a distributional semantics of gesture kinematics. In V. G. Duffy (Ed.), Digital human modeling and applications in health, safety, ergonomics and risk management. human body, motion and behavior: 12th International Conference, DHM 2021, Held as Part of the 23rd HCI International Conference, HCII 2021 (pp. 269-287). Berlin: Springer. doi:10.1007/978-3-030-77817-0_20.
  • de Reus, K., Benítez-Burraco, A., Hersh, T. A., Groot, N., Lambert, M. L., Slocombe, K. E., Vernes, S. C., & Raviv, L. (2024). Self-domestication traits in vocal learning mammals. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 105-108). Nijmegen: The Evolution of Language Conferences.
  • Rohrer, P. L., Bujok, R., Van Maastricht, L., & Bosker, H. R. (2024). The timing of beat gestures affects lexical stress perception in Spanish. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 702-706). doi:10.21437/SpeechProsody.2024-142.

    Abstract

    It has been shown that when speakers produce hand gestures, addressees are attentive towards these gestures, using them to facilitate speech processing. Even relatively simple “beat” gestures are taken into account to help process aspects of speech such as prosodic prominence. In fact, recent evidence suggests that the timing of a beat gesture can influence spoken word recognition. Termed the manual McGurk Effect, Dutch participants, when presented with lexical stress minimal pair continua in Dutch, were biased to hear lexical stress on the syllable that coincided with a beat gesture. However, little is known about how this manual McGurk effect would surface in languages other than Dutch, with different acoustic cues to prominence, and variable gestures. Therefore, this study tests the effect in Spanish where lexical stress is arguably even more important, being a contrastive cue in the regular verb conjugation system. Results from 24 participants corroborate the effect in Spanish, namely that when given the same auditory stimulus, participants were biased to perceive lexical stress on the syllable that visually co-occurred with a beat gesture. These findings extend the manual McGurk effect to a different language, emphasizing the impact of gestures' timing on prosody perception and spoken word recognition.
