Publications

  • Lai, C. S. L., Fisher, S. E., Hurst, J. A., Levy, E. R., Hodgson, S., Fox, M., Jeremiah, S., Povey, S., Jamison, D. C., Green, E. D., Vargha-Khadem, F., & Monaco, A. P. (2000). The SPCH1 region on human 7q31: Genomic characterization of the critical interval and localization of translocations associated with speech and language disorder. American Journal of Human Genetics, 67(2), 357-368. doi:10.1086/303011.

    Abstract

    The KE family is a large three-generation pedigree in which half the members are affected with a severe speech and language disorder that is transmitted as an autosomal dominant monogenic trait. In previously published work, we localized the gene responsible (SPCH1) to a 5.6-cM region of 7q31 between D7S2459 and D7S643. In the present study, we have employed bioinformatic analyses to assemble a detailed BAC-/PAC-based sequence map of this interval, containing 152 sequence tagged sites (STSs), 20 known genes, and >7.75 Mb of completed genomic sequence. We screened the affected chromosome 7 from the KE family with 120 of these STSs (average spacing <100 kb), but we did not detect any evidence of a microdeletion. Novel polymorphic markers were generated from the sequence and were used to further localize critical recombination breakpoints in the KE family. This allowed refinement of the SPCH1 interval to a region between new markers 013A and 330B, containing ∼6.1 Mb of completed sequence. In addition, we have studied two unrelated patients with a similar speech and language disorder, who have de novo translocations involving 7q31. Fluorescence in situ hybridization analyses with BACs/PACs from the sequence map localized the t(5;7)(q22;q31.2) breakpoint in the first patient (CS) to a single clone within the newly refined SPCH1 interval. This clone contains the CAGH44 gene, which encodes a brain-expressed protein containing a large polyglutamine stretch. However, we found that the t(2;7)(p23;q31.3) breakpoint in the second patient (BRD) resides within a BAC clone mapping >3.7 Mb distal to this, outside the current SPCH1 critical interval. Finally, we investigated the CAGH44 gene in affected individuals of the KE family, but we found no mutations in the currently known coding sequence. These studies represent further steps toward the isolation of the first gene to be implicated in the development of speech and language.
  • Lai, J., Chan, A., & Kidd, E. (2023). Relative clause comprehension in Cantonese-speaking children with and without developmental language disorder. PLoS One, 18: e0288021. doi:10.1371/journal.pone.0288021.

    Abstract

    Developmental Language Disorder (DLD), present in 2 out of every 30 children, affects primarily oral language abilities and development in the absence of associated biomedical conditions. We report the first experimental study that examines relative clause (RC) comprehension accuracy and processing (via looking preference) in Cantonese-speaking children with and without DLD, testing the predictions from competing domain-specific versus domain-general theoretical accounts. We compared children with DLD (N = 22) with their age-matched typically-developing (TD) children (AM-TD, N = 23) aged 6;6–9;7 and language-matched (and younger) TD children (YTD, N = 21) aged 4;7–7;6, using a referent selection task. Within-subject factors were: RC type (subject-RCs (SRCs) versus object-RCs (ORCs)) and relativizer (classifier (CL) versus relative marker ge3 RCs). Accuracy measures and looking preference to the target were analyzed using generalized linear mixed effects models. Results indicated Cantonese children with DLD scored significantly lower than their AM-TD peers in accuracy and processed RCs significantly slower than AM-TDs, but did not differ from the YTDs on either measure. Overall, while the results revealed evidence of a SRC advantage in the accuracy data, there was no indication of additional difficulty associated with ORCs in the eye-tracking data. All children showed a processing advantage for the frequent CL relativizer over the less frequent ge3 relativizer. These findings pose challenges to domain-specific representational deficit accounts of DLD, which primarily explain the disorder as a syntactic deficit, and are better explained by domain-general accounts that explain acquisition and processing as emergent properties of multiple converging linguistic and non-linguistic processes.

    Additional information

    S1 appendix
  • Lansner, A., Sandberg, A., Petersson, K. M., & Ingvar, M. (2000). On forgetful attractor network memories. In H. Malmgren, M. Borga, & L. Niklasson (Eds.), Artificial neural networks in medicine and biology: Proceedings of the ANNIMAB-1 Conference, Göteborg, Sweden, 13-16 May 2000 (pp. 54-62). Heidelberg: Springer Verlag.

    Abstract

    A recurrently connected attractor neural network with a Hebbian learning rule is currently our best ANN analogy for a piece of cortex. Functionally, biological memory operates on a spectrum of time scales with regard to induction and retention, and it is modulated in complex ways by sub-cortical neuromodulatory systems. Moreover, biological memory networks are commonly believed to be highly distributed and engage many co-operating cortical areas. Here we focus on the temporal aspects of induction and retention of memory in a connectionist type attractor memory model of a piece of cortex. A continuous time, forgetful Bayesian-Hebbian learning rule is described and compared to the characteristics of LTP and LTD seen experimentally. More generally, an attractor network implementing this learning rule can operate as a long-term, intermediate-term, or short-term memory. Modulation of the print-now signal of the learning rule replicates some experimental memory phenomena, e.g. the von Restorff effect.
  • Laparle, S. (2023). Moving past the lexical affiliate with a frame-based analysis of gesture meaning. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527218.

    Abstract

    Interpreting the meaning of co-speech gesture often involves identifying a gesture’s ‘lexical affiliate’, the word or phrase to which it most closely relates (Schegloff 1984). Though there is work within gesture studies that resists this simplex mapping of meaning from speech to gesture (e.g. de Ruiter 2000; Kendon 2014; Parrill 2008), including an evolving body of literature on recurrent gesture and gesture families (e.g. Fricke et al. 2014; Müller 2017), it is still the lexical affiliate model that is most apparent in formal linguistic models of multimodal meaning (e.g. Alahverdzhieva et al. 2017; Lascarides and Stone 2009; Pustejovsky and Krishnaswamy 2021; Schlenker 2020). In this work, I argue that the lexical affiliate should be carefully reconsidered in the further development of such models.
    In place of the lexical affiliate, I suggest a further shift toward a frame-based, action schematic approach to gestural meaning in line with that proposed in, for example, Parrill and Sweetser (2004) and Müller (2017). To demonstrate the utility of this approach I present three types of compositional gesture sequences which I call spatial contrast, spatial embedding, and cooperative abstract deixis. All three rely on gestural context, rather than gesture-speech alignment, to convey interactive (i.e. pragmatic) meaning. The centrality of gestural context to gesture meaning in these examples demonstrates the necessity of developing a model of gestural meaning independent of its integration with speech.
  • Lee, C., Jessop, A., Bidgood, A., Peter, M. S., Pine, J. M., Rowland, C. F., & Durrant, S. (2023). How executive functioning, sentence processing, and vocabulary are related at 3 years of age. Journal of Experimental Child Psychology, 233: 105693. doi:10.1016/j.jecp.2023.105693.

    Abstract

    There is a wealth of evidence demonstrating that executive function (EF) abilities are positively associated with language development during the preschool years, such that children with good executive functions also have larger vocabularies. However, why this is the case remains to be discovered. In this study, we focused on the hypothesis that sentence processing abilities mediate the association between EF skills and receptive vocabulary knowledge, in that the speed of language acquisition is at least partially dependent on a child’s processing ability, which is itself dependent on executive control. We tested this hypothesis in longitudinal data from a cohort of 3- and 4-year-old children at three age points (37, 43, and 49 months). We found evidence, consistent with previous research, for a significant association between three EF skills (cognitive flexibility, working memory [as measured by the Backward Digit Span], and inhibition) and receptive vocabulary knowledge across this age range. However, only one of the tested sentence processing abilities (the ability to maintain multiple possible referents in mind) significantly mediated this relationship and only for one of the tested EFs (inhibition). The results suggest that children who are better able to inhibit incorrect responses are also better able to maintain multiple possible referents in mind while a sentence unfolds, a sophisticated sentence processing ability that may facilitate vocabulary learning from complex input.

    Additional information

    table S1 code and data
  • Lehecka, T. (2023). Normative ratings for 111 Swedish nouns and corresponding picture stimuli. Nordic Journal of Linguistics, 46(1), 20-45. doi:10.1017/S0332586521000123.

    Abstract

    Normative ratings are a means to control for the effects of confounding variables in psycholinguistic experiments. This paper introduces a new dataset of normative ratings for Swedish encompassing 111 concrete nouns and the corresponding picture stimuli in the MultiPic database (Duñabeitia et al. 2017). The norms for name agreement, category typicality, age of acquisition and subjective frequency were collected using online surveys among native speakers of the Finland-Swedish variety of Swedish. The paper discusses the inter-correlations between these variables and compares them against available ratings for other languages. In doing so, the paper argues that ratings for age of acquisition and subjective frequency collected for other languages may be applied to psycholinguistic studies on Finland-Swedish, at least with respect to concrete and highly imageable nouns. In contrast, norms for name agreement should be collected from speakers of the same language variety as represented by the subjects in the actual experiments.
  • Lei, A., Willems, R. M., & Eekhof, L. S. (2023). Emotions, fast and slow: Processing of emotion words is affected by individual differences in need for affect and narrative absorption. Cognition and Emotion, 37(5), 997-1005. doi:10.1080/02699931.2023.2216445.

    Abstract

    Emotional words have consistently been shown to be processed differently than neutral words. However, few studies have examined individual variability in emotion word processing with longer, ecologically valid stimuli (beyond isolated words, sentences, or paragraphs). In the current study, we re-analysed eye-tracking data collected during story reading to reveal how individual differences in need for affect and narrative absorption impact the speed of emotion word reading. Word emotionality was indexed by affective-aesthetic potentials (AAP) calculated by a sentiment analysis tool. We found that individuals with higher levels of need for affect and narrative absorption read positive words more slowly. On the other hand, these individual differences did not influence the reading time of more negative words, suggesting that high need for affect and narrative absorption are characterised by a positivity bias only. In general, unlike most previous studies using more isolated emotion word stimuli, we observed a quadratic (U-shaped) effect of word emotionality on reading speed, such that both positive and negative words were processed more slowly than neutral words. Taken together, this study emphasises the importance of taking into account individual differences and task context when studying emotion word processing.
  • Lemaitre, H., Le Guen, Y., Tilot, A. K., Stein, J. L., Philippe, C., Mangin, J.-F., Fisher, S. E., & Frouin, V. (2023). Genetic variations within human gained enhancer elements affect human brain sulcal morphology. NeuroImage, 265: 119773. doi:10.1016/j.neuroimage.2022.119773.

    Abstract

    The expansion of the cerebral cortex is one of the most distinctive changes in the evolution of the human brain. Cortical expansion and related increases in cortical folding may have contributed to emergence of our capacities for high-order cognitive abilities. Molecular analysis of humans, archaic hominins, and non-human primates has allowed identification of chromosomal regions showing evolutionary changes at different points of our phylogenetic history. In this study, we assessed the contributions of genomic annotations spanning 30 million years to human sulcal morphology measured via MRI in more than 18,000 participants from the UK Biobank. We found that variation within brain-expressed human gained enhancers, regulatory genetic elements that emerged since our last common ancestor with Old World monkeys, explained more trait heritability than expected for the left and right calloso-marginal posterior fissures and the right central sulcus. Intriguingly, these are sulci that have been previously linked to the evolution of locomotion in primates and later on bipedalism in our hominin ancestors.

    Additional information

    tables
  • Levelt, W. J. M. (2000). Uit talloos veel miljoenen. Natuur & Techniek, 68(11), 90.
  • Levelt, W. J. M., & Ruijssenaars, A. (1995). Levensbericht Johan Joseph Dumont. In Jaarboek Koninklijke Nederlandse Akademie van Wetenschappen (pp. 31-36).
  • Levelt, W. J. M. (1995). Chapters of psychology: An interview with Wilhelm Wundt. In R. L. Solso, & D. W. Massaro (Eds.), The science of mind: 2001 and beyond (pp. 184-202). Oxford University Press.
  • Levelt, W. J. M. (2000). Dyslexie. Natuur & Techniek, 68(4), 64.
  • Levelt, W. J. M. (1982). Cognitive styles in the use of spatial direction terms. In R. Jarvella, & W. Klein (Eds.), Speech, place, and action: Studies in deixis and related topics (pp. 251-268). Chichester: Wiley.
  • Levelt, W. J. M. (2000). Met twee woorden spreken [Simon Dik Lezing 2000]. Amsterdam: Vossiuspers AUP.
  • Levelt, W. J. M. (1961). Michotte's theorie van de causaliteitswaarneming en de waarneming van remmingen. Hypothese: orgaan van de Psychologische Faculteit der Leidse Studenten, 5(4), 1-21.
  • Levelt, W. J. M. (1986). Herdenking van Joseph Maria Franciscus Jaspars (16 maart 1934 - 31 juli 1985). In Jaarboek 1986 Koninklijke Nederlandse Akademie van Wetenschappen (pp. 187-189). Amsterdam: North Holland.
  • Levelt, W. J. M. (1982). Het lineariseringsprobleem van de spreker. Tijdschrift voor Taal- en Tekstwetenschap (TTT), 2(1), 1-15.
  • Levelt, W. J. M. (1995). Hoezo 'neuro'? Hoezo 'linguïstisch'? Intermediair, 31(46), 32-37.
  • Levelt, W. J. M. (1982). Linearization in describing spatial networks. In S. Peters, & E. Saarinen (Eds.), Processes, beliefs, and questions (pp. 199-220). Dordrecht - Holland: D. Reidel.

    Abstract

    The topic of this paper is the way in which speakers order information in discourse. I will refer to this issue with the term "linearization", and will begin with two types of general remarks. The first one concerns the scope and relevance of the problem with reference to some existing literature. The second set of general remarks will be about the place of linearization in a theory of the speaker. The following, and main part of this paper, will be a summary report of research of linearization in a limited, but well-defined domain of discourse, namely the description of spatial networks.
  • Levelt, W. J. M. (2000). Links en rechts: Waarom hebben we zo vaak problemen met die woorden? Natuur & Techniek, 68(7/8), 90.
  • Levelt, W. J. M. (2000). Introduction Section VII: Language. In M. S. Gazzaniga (Ed.), The new cognitive neurosciences; 2nd ed. (pp. 843-844). Cambridge: MIT Press.
  • Levelt, W. J. M. (1980). On-line processing constraints on the properties of signed and spoken language. In U. Bellugi, & M. Studdert-Kennedy (Eds.), Signed and spoken language: Biological constraints on linguistic form (pp. 141-160). Weinheim: Verlag Chemie.

    Abstract

    It is argued that the dominantly successive nature of language is largely mode-independent and holds equally for sign and for spoken language. A preliminary distinction is made between what is simultaneous or successive in the signal, and what is in the process; these need not coincide, and it is the successiveness of the process that is at stake. It is then discussed extensively for the word/sign level, and in a more preliminary fashion for the clause and discourse level that online processes are parallel in that they can simultaneously draw on various sources of knowledge (syntactic, semantic, pragmatic), but successive in that they can work at the interpretation of only one unit at a time. This seems to hold for both sign and spoken language. In the final section, conjectures are made about possible evolutionary explanations for these properties of language processing.
  • Levelt, W. J. M. (1995). Psycholinguistics. In C. C. French, & A. M. Colman (Eds.), Cognitive psychology (reprint, pp. 39-57). London: Longman.
  • Levelt, W. J. M. (2000). Psychology of language. In K. Pawlik, & M. R. Rosenzweig (Eds.), International handbook of psychology (pp. 151-167). London: SAGE publications.
  • Levelt, W. J. M. (1995). The ability to speak: From intentions to spoken words. European Review, 3(1), 13-23. doi:10.1017/S1062798700001290.

    Abstract

    In recent decades, psychologists have become increasingly interested in our ability to speak. This paper sketches the present theoretical perspective on this most complex skill of homo sapiens. The generation of fluent speech is based on the interaction of various processing components. These mechanisms are highly specialized, dedicated to performing specific subroutines, such as retrieving appropriate words, generating morpho-syntactic structure, computing the phonological target shape of syllables, words, phrases and whole utterances, and creating and executing articulatory programmes. As in any complex skill, there is a self-monitoring mechanism that checks the output. These component processes are targets of increasingly sophisticated experimental research, of which this paper presents a few salient examples.
  • Levelt, C. C., Schiller, N. O., & Levelt, W. J. M. (2000). The acquisition of syllable types. Language Acquisition, 8(3), 237-263. doi:10.1207/S15327817LA0803_2.

    Abstract

    In this article, we present an account of developmental data regarding the acquisition of syllable types. The data come from a longitudinal corpus of phonetically transcribed speech of 12 children acquiring Dutch as their first language. A developmental order of acquisition of syllable types was deduced by aligning the syllabified data on a Guttman scale. This order could be analyzed as following from an initial ranking and subsequent rerankings in the grammar of the structural constraints ONSET, NO-CODA, *COMPLEX-O, and *COMPLEX-C; some local conjunctions of these constraints; and a faithfulness constraint FAITH. The syllable type frequencies in the speech surrounding the language learner are also considered. An interesting correlation is found between the frequencies and the order of development of the different syllable types.
  • Levelt, W. J. M. (2000). The brain does not serve linguistic theory so easily [Commentary to target article by Grodzinsky]. Behavioral and Brain Sciences, 23(1), 40-41.
  • Levelt, W. J. M., & Flores d'Arcais, G. B. (1975). Some psychologists' reactions to the Symposium of Dynamic Aspects of Speech Perception. In A. Cohen, & S. Nooteboom (Eds.), Structure and process in speech perception (pp. 345-351). Berlin: Springer.
  • Levelt, W. J. M. (2000). Speech production. In A. E. Kazdin (Ed.), Encyclopedia of psychology (pp. 432-433). Oxford University Press.
  • Levelt, W. J. M., & Kelter, S. (1982). Surface form and memory in question answering. Cognitive Psychology, 14, 78-106. doi:10.1016/0010-0285(82)90005-6.

    Abstract

    Speakers tend to repeat materials from previous talk. This tendency is experimentally established and manipulated in various question-answering situations. It is shown that a question's surface form can affect the format of the answer given, even if this form has little semantic or conversational consequence, as in the pair Q: “(At) what time do you close?” A: “(At) five o'clock.” Answerers tend to match the utterance to the prepositional (nonprepositional) form of the question. This “correspondence effect” may diminish or disappear when, following the question, additional verbal material is presented to the answerer. The experiments show that neither the articulatory buffer nor long-term memory is normally involved in this retention of recent speech. Retaining recent speech in working memory may fulfill a variety of functions for speaker and listener, among them the correct production and interpretation of surface anaphora. Reusing recent materials may, moreover, be more economical than regenerating speech anew from a semantic base, and thus contribute to fluency. But the realization of this strategy requires a production system in which linguistic formulation can take place relatively independent of, and parallel to, conceptual planning.
  • Levelt, W. J. M. (1975). Systems, skills and language learning. In A. Van Essen, & J. Menting (Eds.), The context of foreign language learning (pp. 83-99). Assen: Van Gorcum.
  • Levelt, W. J. M. (1982). Science policy: Three recent idols, and a goddess. IPO Annual Progress Report, 17, 32-35.
  • Levelt, W. J. M., & Kempen, G. (1975). Semantic and syntactic aspects of remembering sentences: A review of some recent continental research. In A. Kennedy, & W. Wilkes (Eds.), Studies in long term memory (pp. 201-216). New York: Wiley.
  • Levelt, W. J. M., & Indefrey, P. (2000). The speaking mind/brain: Where do spoken words come from? In A. Marantz, Y. Miyashita, & W. O'Neil (Eds.), Image, language, brain: Papers from the First Mind Articulation Project Symposium (pp. 77-94). Cambridge, Mass.: MIT Press.
  • Levelt, W. J. M. (1975). What became of LAD? [Essay]. Lisse: Peter de Ridder Press.

    Abstract

    PdR Press publications in cognition ; 1
  • Levelt, W. J. M. (1980). Toegepaste aspecten van het taal-psychologisch onderzoek: Enkele inleidende overwegingen. In J. Matter (Ed.), Toegepaste aspekten van de taalpsychologie (pp. 3-11). Amsterdam: VU Boekhandel.
  • Levelt, W. J. M., & Meyer, A. S. (2000). Word for word: Multiple lexical access in speech production. European Journal of Cognitive Psychology, 12(4), 433-452. doi:10.1080/095414400750050178.

    Abstract

    It is quite normal for us to produce one or two million word tokens every year. Speaking is a dear occupation and producing words is at the core of it. Still, producing even a single word is a highly complex affair. Recently, Levelt, Roelofs, and Meyer (1999) reviewed their theory of lexical access in speech production, which dissects the word-producing mechanism as a staged application of various dedicated operations. The present paper begins by presenting a bird's-eye view of this mechanism. We then square the complexity by asking how speakers control multiple access in generating simple utterances such as “a table and a chair”. In particular, we address two issues. The first one concerns dependency: Do temporally contiguous access procedures interact in any way, or do they run in modular fashion? The second issue concerns temporal alignment: How much temporal overlap of processing does the system tolerate in accessing multiple content words, such as “table” and “chair”? Results from picture-word interference and eye tracking experiments provide evidence for restricted cases of dependency as well as for constraints on the temporal alignment of access procedures.
  • Levelt, W. J. M. (1982). Zelfcorrecties in het spreekproces. KNAW: Mededelingen van de afdeling letterkunde, nieuwe reeks, 45(8), 215-228.
  • Levelt, W. J. M. (1986). Zur sprachlichen Abbildung des Raumes: Deiktische und intrinsische Perspektive. In H. Bosshardt (Ed.), Perspektiven auf Sprache. Interdisziplinäre Beiträge zum Gedenken an Hans Hörmann (pp. 187-211). Berlin: De Gruyter.
  • Levinson, S. C. (1995). 'Logical' Connectives in Natural Language: A First Questionnaire. In D. Wilkins (Ed.), Extensions of space and beyond: manual for field elicitation for the 1995 field season (pp. 61-69). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.3513476.

    Abstract

    It has been hypothesised that human reasoning has a non-linguistic foundation, but is nevertheless influenced by the formal means available in a language. For example, Western logic is transparently related to European sentential connectives (e.g., and, if … then, or, not), some of which cannot be unambiguously expressed in other languages. The questionnaire explores reasoning tools and practices through investigating translation equivalents of English sentential connectives and collecting examples of “reasoned arguments”.
  • Levinson, S. C. (1982). Caste rank and verbal interaction in Western Tamilnadu. In D. B. McGilvray (Ed.), Caste ideology and interaction (pp. 98-203). Cambridge University Press.
  • Levinson, S. C. (2000). Language as nature and language as art. In J. Mittelstrass, & W. Singer (Eds.), Proceedings of the Symposium on ‘Changing concepts of nature and the turn of the Millennium’ (pp. 257-287). Vatican City: Pontificae Academiae Scientiarium Scripta Varia.
  • Levinson, S. C. (2000). H.P. Grice on location on Rossel Island. In S. S. Chang, L. Liaw, & J. Ruppenhofer (Eds.), Proceedings of the 25th Annual Meeting of the Berkeley Linguistic Society (pp. 210-224). Berkeley: Berkeley Linguistic Society.
  • Levinson, S. C. (1995). Interactional biases in human thinking. In E. N. Goody (Ed.), Social intelligence and interaction (pp. 221-260). Cambridge: Cambridge University Press.
  • Levinson, S. C. (2000). Presumptive meanings: The theory of generalized conversational implicature. Cambridge: MIT press.
  • Levinson, S. C. (1982). Speech act theory: The state of the art. In V. Kinsella (Ed.), Surveys 2. Eight state-of-the-art articles on key areas in language teaching. Cambridge University Press.
  • Levinson, S. C. (1980). Speech act theory: The state of the art. Language teaching and linguistics: Abstracts, 5-24.

    Abstract

    Survey article
  • Levinson, S. C. (1995). Three levels of meaning. In F. Palmer (Ed.), Grammar and meaning: Essays in honour of Sir John Lyons (pp. 90-115). Cambridge University Press.
  • Levinson, S. C. (2000). Yélî Dnye and the theory of basic color terms. Journal of Linguistic Anthropology, 10(1), 3-55. doi:10.1525/jlin.2000.10.1.3.

    Abstract

    The theory of basic color terms was a crucial factor in the demise of linguistic relativity. The theory is now once again under scrutiny and fundamental revision. This article details a case study that undermines one of the central claims of the classical theory, namely that languages universally treat color as a unitary domain, to be exhaustively named. Taken together with other cases, the study suggests that a number of languages have only an incipient color terminology, raising doubts about the linguistic universality of such terminology.
  • Levinson, S. C. (2023). On cognitive artifacts. In R. Feldhay (Ed.), The evolution of knowledge: A scientific meeting in honor of Jürgen Renn (pp. 59-78). Berlin: Max Planck Institute for the History of Science.

    Abstract

    Wearing the hat of a cognitive anthropologist rather than an historian, I will try to amplify the ideas of Renn’s cited above. I argue that a particular subclass of material objects, namely “cognitive artifacts,” involves a close coupling of mind and artifact that acts like a brain prosthesis. Simple cognitive artifacts are external objects that act as aids to internal computation, and not all cultures have extended inventories of these. Cognitive artifacts in this sense (e.g., calculating or measuring devices) have clearly played a central role in the history of science. But the notion can be widened to take in less material externalizations of cognition, like writing and language itself. A critical question here is how and why this close coupling of internal computation and external device actually works, a rather neglected question to which I’ll suggest some answers.

    Additional information

    link to book
  • Levinson, S. C. (2023). Gesture, spatial cognition and the evolution of language. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 378(1875): 20210481. doi:10.1098/rstb.2021.0481.

    Abstract

    Human communication displays a striking contrast between the diversity of languages and the universality of the principles underlying their use in conversation. Despite the importance of this interactional base, it is not obvious that it heavily imprints the structure of languages. However, a deep-time perspective suggests that early hominin communication was gestural, in line with all the other Hominidae. This gestural phase of early language development seems to have left its traces in the way in which spatial concepts, implemented in the hippocampus, provide organizing principles at the heart of grammar.
  • Levshina, N. (2023). Communicative efficiency: Language structure and use. Cambridge: Cambridge University Press.

    Abstract

    All living beings try to save effort, and humans are no exception. This groundbreaking book shows how we save time and energy during communication by unconsciously making efficient choices in grammar, lexicon and phonology. It presents a new theory of 'communicative efficiency', the idea that language is designed to be as efficient as possible, as a system of communication. The new framework accounts for the diverse manifestations of communicative efficiency across a typologically broad range of languages, using various corpus-based and statistical approaches to explain speakers' bias towards efficiency. The author's unique interdisciplinary expertise allows her to provide rich evidence from a broad range of language sciences. She integrates diverse insights from over a hundred years of research into this comprehensible new theory, which she presents step-by-step in clear and accessible language. It is essential reading for language scientists, cognitive scientists and anyone interested in language use and communication.
  • Levshina, N., Namboodiripad, S., Allassonnière-Tang, M., Kramer, M., Talamo, L., Verkerk, A., Wilmoth, S., Garrido Rodriguez, G., Gupton, T. M., Kidd, E., Liu, Z., Naccarato, C., Nordlinger, R., Panova, A., & Stoynova, N. (2023). Why we need a gradient approach to word order. Linguistics, 61(4), 825-883. doi:10.1515/ling-2021-0098.

    Abstract

    This article argues for a gradient approach to word order, which treats word order preferences, both within and across languages, as a continuous variable. Word order variability should be regarded as a basic assumption, rather than as something exceptional. Although this approach follows naturally from the emergentist usage-based view of language, we argue that it can be beneficial for all frameworks and linguistic domains, including language acquisition, processing, typology, language contact, language evolution and change, and formal approaches. Gradient approaches have been very fruitful in some domains, such as language processing, but their potential is not fully realized yet. This may be due to practical reasons. We discuss the most pressing methodological challenges in corpus-based and experimental research of word order and propose some practical solutions.
  • Levshina, N. (2023). Testing communicative and learning biases in a causal model of language evolution: A study of cues to Subject and Object. In M. Degano, T. Roberts, G. Sbardolini, & M. Schouwstra (Eds.), The Proceedings of the 23rd Amsterdam Colloquium (pp. 383-387). Amsterdam: University of Amsterdam.
  • Levshina, N. (2023). Word classes in corpus linguistics. In E. Van Lier (Ed.), The Oxford handbook of word classes (pp. 833-850). Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780198852889.013.34.

    Abstract

    Word classes play a central role in corpus linguistics under the name of parts of speech (POS). Many popular corpora are provided with POS tags. This chapter gives examples of popular tagsets and discusses the methods of automatic tagging. It also considers bottom-up approaches to POS induction, which are particularly important for the ‘poverty of stimulus’ debate in language acquisition research. The choice of optimal POS tagging involves many difficult decisions, which are related to the level of granularity, redundancy at different levels of corpus annotation, cross-linguistic applicability, language-specific descriptive adequacy, and dealing with fuzzy boundaries between POS. The chapter also discusses the problem of flexible word classes and demonstrates how corpus data with POS tags and syntactic dependencies can be used to quantify the level of flexibility in a language.
  • Lewis, A. G., Schoffelen, J.-M., Bastiaansen, M., & Schriefers, H. (2023). Is beta in agreement with the relatives? Using relative clause sentences to investigate MEG beta power dynamics during sentence comprehension. Psychophysiology, 60(10): e14332. doi:10.1111/psyp.14332.

    Abstract

    There remains some debate about whether beta power effects observed during sentence comprehension reflect ongoing syntactic unification operations (beta-syntax hypothesis), or instead reflect maintenance or updating of the sentence-level representation (beta-maintenance hypothesis). In this study, we used magnetoencephalography to investigate beta power neural dynamics while participants read relative clause sentences that were initially ambiguous between a subject- or an object-relative reading. An additional condition included a grammatical violation at the disambiguation point in the relative clause sentences. The beta-maintenance hypothesis predicts a decrease in beta power at the disambiguation point for unexpected (and less preferred) object-relative clause sentences and grammatical violations, as both signal a need to update the sentence-level representation. While the beta-syntax hypothesis also predicts a beta power decrease for grammatical violations due to a disruption of syntactic unification operations, it instead predicts an increase in beta power for the object-relative clause condition because syntactic unification at the point of disambiguation becomes more demanding. We observed decreased beta power for both the agreement violation and object-relative clause conditions in typical left hemisphere language regions, which provides compelling support for the beta-maintenance hypothesis. Mid-frontal theta power effects were also present for grammatical violations and object-relative clause sentences, suggesting that violations and unexpected sentence interpretations are registered as conflicts by the brain's domain-general error detection system.

    Additional information

    data
  • Liesenfeld, A., Lopez, A., & Dingemanse, M. (2023). Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators. In CUI '23: Proceedings of the 5th International Conference on Conversational User Interfaces. doi:10.1145/3571884.3604316.

    Abstract

    Large language models that exhibit instruction-following behaviour represent one of the biggest recent upheavals in conversational interfaces, a trend in large part fuelled by the release of OpenAI's ChatGPT, a proprietary large language model for text generation fine-tuned through reinforcement learning from human feedback (LLM+RLHF). We review the risks of relying on proprietary software and survey the first crop of open-source projects of comparable architecture and functionality. The main contribution of this paper is to show that openness is differentiated, and to offer scientific documentation of degrees of openness in this fast-moving field. We evaluate projects in terms of openness of code, training data, model weights, RLHF data, licensing, scientific documentation, and access methods. We find that while there is a fast-growing list of projects billing themselves as 'open source', many inherit undocumented data of dubious legality, few share the all-important instruction-tuning (a key site where human labour is involved), and careful scientific documentation is exceedingly rare. Degrees of openness are relevant to fairness and accountability at all points, from data collection and curation to model architecture, and from training and fine-tuning to release and deployment.
  • Liesenfeld, A., Lopez, A., & Dingemanse, M. (2023). The timing bottleneck: Why timing and overlap are mission-critical for conversational user interfaces, speech recognition and dialogue systems. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDial 2023). doi:10.18653/v1/2023.sigdial-1.45.

    Abstract

    Speech recognition systems are a key intermediary in voice-driven human-computer interaction. Although speech recognition works well for pristine monologic audio, real-life use cases in open-ended interactive settings still present many challenges. We argue that timing is mission-critical for dialogue systems, and evaluate 5 major commercial ASR systems for their conversational and multilingual support. We find that word error rates for natural conversational data in 6 languages remain abysmal, and that overlap remains a key challenge (study 1). This impacts especially the recognition of conversational words (study 2), and in turn has dire consequences for downstream intent recognition (study 3). Our findings help to evaluate the current state of conversational ASR, contribute towards multidimensional error analysis and evaluation, and identify phenomena that need most attention on the way to build robust interactive speech technologies.
  • Lingwood, J., Lampropoulou, S., De Bezena, C., Billington, J., & Rowland, C. F. (2023). Children’s engagement and caregivers’ use of language-boosting strategies during shared book reading: A mixed methods approach. Journal of Child Language, 50(6), 1436-1458. doi:10.1017/S0305000922000290.

    Abstract

    For shared book reading to be effective for language development, the adult and child need to be highly engaged. The current paper adopted a mixed-methods approach to investigate caregivers’ language-boosting behaviours and children’s engagement during shared book reading. The results revealed there were more instances of joint attention and caregivers’ use of prompts during moments of higher engagement. However, instances of most language-boosting behaviours were similar across episodes of higher and lower engagement. Qualitative analysis assessing the link between children’s engagement and caregivers’ use of speech acts revealed that speech acts do seem to contribute to high engagement, in combination with other aspects of the interaction.
  • Liszkowski, U. (2000). A belief about theory of mind: The relation between children's inhibitory control and their common sense psychological knowledge. Master Thesis, University of Essex.
  • Lumaca, M., Bonetti, L., Brattico, E., Baggio, G., Ravignani, A., & Vuust, P. (2023). High-fidelity transmission of auditory symbolic material is associated with reduced right–left neuroanatomical asymmetry between primary auditory regions. Cerebral Cortex, 33(11), 6902-6919. doi:10.1093/cercor/bhad009.

    Abstract

    The intergenerational stability of auditory symbolic systems, such as music, is thought to rely on brain processes that allow the faithful transmission of complex sounds. Little is known about the functional and structural aspects of the human brain which support this ability, with a few studies pointing to the bilateral organization of auditory networks as a putative neural substrate. Here, we further tested this hypothesis by examining the role of left–right neuroanatomical asymmetries between auditory cortices. We collected neuroanatomical images from a large sample of participants (nonmusicians) and analyzed them with Freesurfer’s surface-based morphometry method. Weeks after scanning, the same individuals participated in a laboratory experiment that simulated music transmission: the signaling games. We found that high accuracy in the intergenerational transmission of an artificial tone system was associated with reduced rightward asymmetry of cortical thickness in Heschl’s sulcus. Our study suggests that the high-fidelity copying of melodic material may rely on the extent to which computational neuronal resources are distributed across hemispheres. Our data further support the role of interhemispheric brain organization in the cultural transmission and evolution of auditory symbolic systems.
  • Mak, M., Faber, M., & Willems, R. M. (2023). Different kinds of simulation during literary reading: Insights from a combined fMRI and eye-tracking study. Cortex, 162, 115-135. doi:10.1016/j.cortex.2023.01.014.

    Abstract

    Mental simulation is an important aspect of narrative reading. In a previous study, we found that gaze durations are differentially impacted by different kinds of mental simulation. Motor simulation, perceptual simulation, and mentalizing as elicited by literary short stories influenced eye movements in distinguishable ways (Mak & Willems, 2019). In the current study, we investigated the existence of a common neural locus for these different kinds of simulation. We additionally investigated whether individual differences during reading, as indexed by the eye movements, are reflected in domain-specific activations in the brain. We found a variety of brain areas activated by simulation-eliciting content, both modality-specific brain areas and a general simulation area. Individual variation in percent signal change in activated areas was related to measures of story appreciation as well as personal characteristics (i.e., transportability, perspective taking). Taken together, these findings suggest that mental simulation is supported by both domain-specific processes grounded in previous experiences, and by the neural mechanisms that underlie higher-order language processing (e.g., situation model building, event indexing, integration).

    Additional information

    figures localizer tasks appendix C1
  • Mamus, E., Speed, L. J., Rissman, L., Majid, A., & Özyürek, A. (2023). Lack of visual experience affects multimodal language production: Evidence from congenitally blind and sighted people. Cognitive Science, 47(1): e13228. doi:10.1111/cogs.13228.

    Abstract

    The human experience is shaped by information from different perceptual channels, but it is still debated whether and how differential experience influences language use. To address this, we compared congenitally blind, blindfolded, and sighted people's descriptions of the same motion events experienced auditorily by all participants (i.e., via sound alone) and conveyed in speech and gesture. Comparison of blind and sighted participants to blindfolded participants helped us disentangle the effects of a lifetime experience of being blind versus the task-specific effects of experiencing a motion event by sound alone. Compared to sighted people, blind people's speech focused more on path and less on manner of motion, and encoded paths in a more segmented fashion using more landmarks and path verbs. Gestures followed the speech, such that blind people pointed to landmarks more and depicted manner less than sighted people. This suggests that visual experience affects how people express spatial events in the multimodal language and that blindness may enhance sensitivity to paths of motion due to changes in event construal. These findings have implications for the claims that language processes are deeply rooted in our sensory experiences.
  • Mamus, E., Speed, L., Özyürek, A., & Majid, A. (2023). The effect of input sensory modality on the multimodal encoding of motion events. Language, Cognition and Neuroscience, 38(5), 711-723. doi:10.1080/23273798.2022.2141282.

    Abstract

    Each sensory modality has different affordances: vision has higher spatial acuity than audition, whereas audition has better temporal acuity. This may have consequences for the encoding of events and its subsequent multimodal language production—an issue that has received relatively little attention to date. In this study, we compared motion events presented as audio-only, visual-only, or multimodal (visual + audio) input and measured speech and co-speech gesture depicting path and manner of motion in Turkish. Input modality affected speech production. Speakers with audio-only input produced more path descriptions and fewer manner descriptions in speech compared to speakers who received visual input. In contrast, the type and frequency of gestures did not change across conditions. Path-only gestures dominated throughout. Our results suggest that while speech is more susceptible to auditory vs. visual input in encoding aspects of motion events, gesture is less sensitive to such differences.

    Additional information

    Supplemental material
  • Manhardt, F., Brouwer, S., Van Wijk, E., & Özyürek, A. (2023). Word order preference in sign influences speech in hearing bimodal bilinguals but not vice versa: Evidence from behavior and eye-gaze. Bilingualism: Language and Cognition, 26(1), 48-61. doi:10.1017/S1366728922000311.

    Abstract

    We investigated cross-modal influences between speech and sign in hearing bimodal bilinguals, proficient in a spoken and a sign language, and their consequences for visual attention during message preparation using eye-tracking. We focused on spatial expressions in which sign languages, unlike spoken languages, have a modality-driven preference to mention grounds (big objects) prior to figures (smaller objects). We compared hearing bimodal bilinguals’ spatial expressions and visual attention in Dutch and Dutch Sign Language (N = 18) to those of their hearing non-signing (N = 20) and deaf signing peers (N = 18). In speech, hearing bimodal bilinguals expressed more ground-first descriptions and fixated grounds more than hearing non-signers, showing influence from sign. In sign, they used as many ground-first descriptions as deaf signers and fixated grounds equally often, demonstrating no influence from speech. Cross-linguistic influence of word order preference and visual attention in hearing bimodal bilinguals appears to be one-directional, modulated by modality-driven differences.
  • Marslen-Wilson, W., & Tyler, L. K. (Eds.). (1980). Max-Planck-Institute for Psycholinguistics: Annual Report Nr.1 1980. Nijmegen: MPI for Psycholinguistics.
  • Maskalenka, K., Alagöz, G., Krueger, F., Wright, J., Rostovskaya, M., Nakhuda, A., Bendall, A., Krueger, C., Walker, S., Scally, A., & Rugg-Gunn, P. J. (2023). NANOGP1, a tandem duplicate of NANOG, exhibits partial functional conservation in human naïve pluripotent stem cells. Development, 150(2): dev201155. doi:10.1242/dev.201155.

    Abstract

    Gene duplication events can drive evolution by providing genetic material for new gene functions, and they create opportunities for diverse developmental strategies to emerge between species. To study the contribution of duplicated genes to human early development, we examined the evolution and function of NANOGP1, a tandem duplicate of the transcription factor NANOG. We found that NANOGP1 and NANOG have overlapping but distinct expression profiles, with high NANOGP1 expression restricted to early epiblast cells and naïve-state pluripotent stem cells. Sequence analysis and epitope-tagging revealed that NANOGP1 is protein coding with an intact homeobox domain. The duplication that created NANOGP1 occurred earlier in primate evolution than previously thought and has been retained only in great apes, whereas Old World monkeys have disabled the gene in different ways, including homeodomain point mutations. NANOGP1 is a strong inducer of naïve pluripotency; however, unlike NANOG, it is not required to maintain the undifferentiated status of human naïve pluripotent cells. By retaining expression, sequence and partial functional conservation with its ancestral copy, NANOGP1 exemplifies how gene duplication and subfunctionalisation can contribute to transcription factor activity in human pluripotency and development.
  • Mazzini, S., Holler, J., & Drijvers, L. (2023). Studying naturalistic human communication using dual-EEG and audio-visual recordings. STAR Protocols, 4(3): 102370. doi:10.1016/j.xpro.2023.102370.

    Abstract

    We present a protocol to study naturalistic human communication using dual-EEG and audio-visual recordings. We describe preparatory steps for data collection including setup preparation, experiment design, and piloting. We then describe the data collection process in detail which consists of participant recruitment, experiment room preparation, and data collection. We also outline the kinds of research questions that can be addressed with the current protocol, including several analysis possibilities, from conversational to advanced time-frequency analyses.
    For complete details on the use and execution of this protocol, please refer to Drijvers and Holler (2022).
  • McConnell, K. (2023). Individual Differences in Holistic and Compositional Language Processing. Journal of Cognition, 6. doi:10.5334/joc.283.

    Abstract

    Individual differences in cognitive abilities are ubiquitous across the spectrum of proficient language users. Although speakers differ with regard to their memory capacity, ability for inhibiting distraction, and ability to shift between different processing levels, comprehension is generally successful. However, this does not mean it is identical across individuals; listeners and readers may rely on different processing strategies to exploit distributional information in the service of efficient understanding. In the following psycholinguistic reading experiment, we investigate potential sources of individual differences in the processing of co-occurring words. Participants read modifier-noun bigrams like absolute silence in a self-paced reading task. Backward transition probability (BTP) between the two lexemes was used to quantify the prominence of the bigram as a whole in comparison to the frequency of its parts. Of five individual difference measures (processing speed, verbal working memory, cognitive inhibition, global-local scope shifting, and personality), two proved to be significantly associated with the effect of BTP on reading times. Participants who could inhibit a distracting global environment in order to more efficiently retrieve a single part and those that preferred the local level in the shifting task showed greater effects of the co-occurrence probability of the parts. We conclude that some participants are more likely to retrieve bigrams via their parts and their co-occurrence statistics whereas others more readily retrieve the two words together as a single chunked unit.
  • McLean, B., Dunn, M., & Dingemanse, M. (2023). Two measures are better than one: Combining iconicity ratings and guessing experiments for a more nuanced picture of iconicity in the lexicon. Language and Cognition, 15(4), 719-739. doi:10.1017/langcog.2023.9.

    Abstract

    Iconicity in language is receiving increased attention from many fields, but our understanding of iconicity is only as good as the measures we use to quantify it. We collected iconicity measures for 304 Japanese words from English-speaking participants, using rating and guessing tasks. The words included ideophones (structurally marked depictive words) along with regular lexical items from similar semantic domains (e.g., fuwafuwa ‘fluffy’, jawarakai ‘soft’). The two measures correlated, speaking to their validity. However, ideophones received consistently higher iconicity ratings than other items, even when guessed at the same accuracies, suggesting the rating task is more sensitive to cues like structural markedness that frame words as iconic. These cues did not always guide participants to the meanings of ideophones in the guessing task, but they did make them more confident in their guesses, even when they were wrong. Consistently poor guessing results reflect the role different experiences play in shaping construals of iconicity. Using multiple measures in tandem allows us to explore the interplay between iconicity and these external factors. To facilitate this, we introduce a reproducible workflow for creating rating and guessing tasks from standardised wordlists, while also making improvements to the robustness, sensitivity and discriminability of previous approaches.
  • McQueen, J. M., Cutler, A., Briscoe, T., & Norris, D. (1995). Models of continuous speech recognition and the contents of the vocabulary. Language and Cognitive Processes, 10, 309-331. doi:10.1080/01690969508407098.

    Abstract

    Several models of spoken word recognition postulate that recognition is achieved via a process of competition between lexical hypotheses. Competition not only provides a mechanism for isolated word recognition, it also assists in continuous speech recognition, since it offers a means of segmenting continuous input into individual words. We present statistics on the pattern of occurrence of words embedded in the polysyllabic words of the English vocabulary, showing that an overwhelming majority (84%) of polysyllables have shorter words embedded within them. Positional analyses show that these embeddings are most common at the onsets of the longer word. Although both phonological and syntactic constraints could rule out some embedded words, they do not remove the problem. Lexical competition provides a means of dealing with lexical embedding. It is also supported by a growing body of experimental evidence. We present results which indicate that competition operates both between word candidates that begin at the same point in the input and candidates that begin at different points (McQueen, Norris, & Cutler, 1994; Norris, McQueen, & Cutler, in press). We conclude that lexical competition is an essential component in models of continuous speech recognition.
  • McQueen, J. M., Cutler, A., & Norris, D. (2000). Positive and negative influences of the lexicon on phonemic decision-making. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 3 (pp. 778-781). Beijing: China Military Friendship Publish.

    Abstract

    Lexical knowledge influences how human listeners make decisions about speech sounds. Positive lexical effects (faster responses to target sounds in words than in nonwords) are robust across several laboratory tasks, while negative effects (slower responses to targets in more word-like nonwords than in less word-like nonwords) have been found in phonetic decision tasks but not phoneme monitoring tasks. The present experiments tested whether negative lexical effects are therefore a task-specific consequence of the forced choice required in phonetic decision. We compared phoneme monitoring and phonetic decision performance using the same Dutch materials in each task. In both experiments there were positive lexical effects, but no negative lexical effects. We observe that in all studies showing negative lexical effects, the materials were made by cross-splicing, which meant that they contained perceptual evidence supporting the lexically-consistent phonemes. Lexical knowledge seems to influence phonemic decision-making only when there is evidence for the lexically-consistent phoneme in the speech signal.
  • McQueen, J. M., Cutler, A., & Norris, D. (2000). Why Merge really is autonomous and parsimonious. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 47-50). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    We briefly describe the Merge model of phonemic decision-making, and, in the light of general arguments about the possible role of feedback in spoken-word recognition, defend Merge's feedforward structure. Merge not only accounts adequately for the data, without invoking feedback connections, but does so in a parsimonious manner.
  • McQueen, J. M., Jesse, A., & Mitterer, H. (2023). Lexically mediated compensation for coarticulation still as elusive as a white christmash. Cognitive Science: a multidisciplinary journal, 47(9): e13342. doi:10.1111/cogs.13342.

    Abstract

    Luthra, Peraza-Santiago, Beeson, Saltzman, Crinnion, and Magnuson (2021) present data from the lexically mediated compensation for coarticulation paradigm that they claim provides conclusive evidence in favor of top-down processing in speech perception. We argue here that this evidence does not support that conclusion. The findings are open to alternative explanations, and we give data in support of one of them (that there is an acoustic confound in the materials). Lexically mediated compensation for coarticulation thus remains elusive, while prior data from the paradigm instead challenge the idea that there is top-down processing in online speech recognition.

    Additional information

    supplementary materials
  • Meyer, A. S., & Levelt, W. J. M. (2000). Merging speech perception and production [Comment on Norris, McQueen and Cutler]. Behavioral and Brain Sciences, 23(3), 339-340. doi:10.1017/S0140525X00373241.

    Abstract

    A comparison of Merge, a model of comprehension, and WEAVER, a model of production, raises five issues: (1) merging models of comprehension and production necessarily creates feedback; (2) neither model is a comprehensive account of word processing; (3) the models are incomplete in different ways; (4) the models differ in their handling of competition; (5) as opposed to WEAVER, Merge is a model of metalinguistic behavior.
  • Meyer, A. S., & Van der Meulen, F. (2000). Phonological priming effects on speech onset latencies and viewing times in object naming. Psychonomic Bulletin & Review, 7, 314-319.
  • Meyer, A. S. (2023). Timing in conversation. Journal of Cognition, 6(1), 1-17. doi:10.5334/joc.268.

    Abstract

    Turn-taking in everyday conversation is fast, with median latencies in corpora of conversational speech often reported to be under 300 ms. This seems like magic, given that experimental research on speech planning has shown that speakers need much more time to plan and produce even the shortest of utterances. This paper reviews how language scientists have combined linguistic analyses of conversations and experimental work to understand the skill of swift turn-taking and proposes a tentative solution to the riddle of fast turn-taking.
  • Mickan, A., McQueen, J. M., Brehm, L., & Lemhöfer, K. (2023). Individual differences in foreign language attrition: A 6-month longitudinal investigation after a study abroad. Language, Cognition and Neuroscience, 38(1), 11-39. doi:10.1080/23273798.2022.2074479.

    Abstract

    While recent laboratory studies suggest that the use of competing languages is a driving force in foreign language (FL) attrition (i.e. forgetting), research on “real” attriters has failed to demonstrate such a relationship. We addressed this issue in a large-scale longitudinal study, following German students throughout a study abroad in Spain and their first six months back in Germany. Monthly, percentage-based frequency of use measures enabled a fine-grained description of language use. L3 Spanish forgetting rates were indeed predicted by the quantity and quality of Spanish use, and correlated negatively with L1 German and positively with L2 English letter fluency. Attrition rates were furthermore influenced by prior Spanish proficiency, but not by motivation to maintain Spanish or non-verbal long-term memory capacity. Overall, this study highlights the importance of language use for FL retention and sheds light on the complex interplay between language use and other determinants of attrition.
  • Mishra, C., Offrede, T., Fuchs, S., Mooshammer, C., & Skantze, G. (2023). Does a robot’s gaze aversion affect human gaze aversion? Frontiers in Robotics and AI, 10: 1127626. doi:10.3389/frobt.2023.1127626.

    Abstract

    Gaze cues serve an important role in facilitating human conversations and are generally considered to be one of the most important non-verbal cues. Gaze cues are used to manage turn-taking, coordinate joint attention, regulate intimacy, and signal cognitive effort. In particular, it is well established that gaze aversion is used in conversations to avoid prolonged periods of mutual gaze. Given the numerous functions of gaze cues, there has been extensive work on modelling these cues in social robots. Researchers have also tried to identify the impact of robot gaze on human participants. However, the influence of robot gaze behavior on human gaze behavior has been less explored. We conducted a within-subjects user study (N = 33) to verify if a robot’s gaze aversion influenced human gaze aversion behavior. Our results show that participants tend to avert their gaze more when the robot keeps staring at them as compared to when the robot exhibits well-timed gaze aversions. We interpret our findings in terms of intimacy regulation: humans try to compensate for the robot’s lack of gaze aversion.
  • Mishra, C., Verdonschot, R. G., Hagoort, P., & Skantze, G. (2023). Real-time emotion generation in human-robot dialogue using large language models. Frontiers in Robotics and AI, 10: 1271610. doi:10.3389/frobt.2023.1271610.

    Abstract

    Affective behaviors enable social robots to not only establish better connections with humans but also serve as a tool for the robots to express their internal states. It has been well established that emotions are important to signal understanding in Human-Robot Interaction (HRI). This work aims to harness the power of Large Language Models (LLM) and proposes an approach to control the affective behavior of robots. By interpreting emotion appraisal as an Emotion Recognition in Conversation (ERC) task, we used GPT-3.5 to predict the emotion of a robot’s turn in real-time, using the dialogue history of the ongoing conversation. The robot signaled the predicted emotion using facial expressions. The model was evaluated in a within-subjects user study (N = 47) where the model-driven emotion generation was compared against conditions where the robot did not display any emotions and where it displayed incongruent emotions. The participants interacted with the robot by playing a card sorting game that was specifically designed to evoke emotions. The results indicated that the emotions were reliably generated by the LLM and the participants were able to perceive the robot’s emotions. A robot displaying congruent, model-driven facial emotion expressions was perceived as significantly more human-like and emotionally appropriate, and elicited a more positive impression. Participants also scored significantly better in the card sorting game when the robot displayed congruent facial expressions. From a technical perspective, the study shows that LLMs can be used to control the affective behavior of robots reliably in real-time. Additionally, our results could be used in devising novel human-robot interactions, making robots more effective in roles where emotional interaction is important, such as therapy, companionship, or customer service.
  • Monaghan, P., Donnelly, S., Alcock, K., Bidgood, A., Cain, K., Durrant, S., Frost, R. L. A., Jago, L. S., Peter, M. S., Pine, J. M., Turnbull, H., & Rowland, C. F. (2023). Learning to generalise but not segment an artificial language at 17 months predicts children’s language skills 3 years later. Cognitive Psychology, 147: 101607. doi:10.1016/j.cogpsych.2023.101607.

    Abstract

    We investigated whether learning an artificial language at 17 months was predictive of children’s natural language vocabulary and grammar skills at 54 months. Children at 17 months listened to an artificial language containing non-adjacent dependencies, and were then tested on their learning to segment and to generalise the structure of the language. At 54 months, children were then tested on a range of standardised natural language tasks that assessed receptive and expressive vocabulary and grammar. A structural equation model demonstrated that learning the artificial language generalisation at 17 months predicted language abilities – a composite of vocabulary and grammar skills – at 54 months, whereas artificial language segmentation at 17 months did not predict language abilities at this age. Artificial language learning tasks – especially those that probe grammar learning – provide a valuable tool for uncovering the mechanisms driving children’s early language development.

    Additional information

    supplementary data
  • Mooijman, S., Schoonen, R., Ruiter, M. B., & Roelofs, A. (2023). Voluntary and cued language switching in late bilingual speakers. Bilingualism: Language and Cognition. Advance online publication. doi:10.1017/S1366728923000755.

    Abstract

    Previous research examining the factors that determine language choice and voluntary switching mainly involved early bilinguals. Here, using picture naming, we investigated language choice and switching in late Dutch–English bilinguals. We found that naming was overall slower in cued than in voluntary switching, but switch costs occurred in both types of switching. The magnitude of switch costs differed depending on the task and language, and was moderated by L2 proficiency. Self-rated rather than objectively assessed proficiency predicted voluntary switching and ease of lexical access was associated with language choice. Between-language and within-language switch costs were not correlated. These results highlight self-rated proficiency as a reliable predictor of voluntary switching, with language modulating switch costs. As in early bilinguals, ease of lexical access was related to word-level language choice of late bilinguals.
  • Morison, L., Meffert, E., Stampfer, M., Steiner-Wilke, I., Vollmer, B., Schulze, K., Briggs, T., Braden, R., Vogel, A. P., Thompson-Lake, D., Patel, C., Blair, E., Goel, H., Turner, S., Moog, U., Riess, A., Liegeois, F., Koolen, D. A., Amor, D. J., Kleefstra, T., Fisher, S. E., Zweier, C., & Morgan, A. T. (2023). In-depth characterisation of a cohort of individuals with missense and loss-of-function variants disrupting FOXP2. Journal of Medical Genetics, 60(6), 597-607. doi:10.1136/jmg-2022-108734.

    Abstract

    Background
    Heterozygous disruptions of FOXP2 were the first identified molecular cause for severe speech disorder: childhood apraxia of speech (CAS). Yet few cases have been reported, limiting knowledge of the condition.

    Methods
    Here we phenotyped 29 individuals from 18 families with pathogenic FOXP2-only variants (13 loss-of-function, 5 missense variants; 14 males; aged 2 years to 62 years). Health and development (cognitive, motor, social domains) were examined, including speech and language outcomes with the first cross-linguistic analysis of English and German.

    Results
    Speech disorders were prevalent (24/26, 92%) and CAS was most common (23/26, 89%), with similar speech presentations across English and German. Speech was still impaired in adulthood and some speech sounds (e.g. ‘th’, ‘r’, ‘ch’, ‘j’) were never acquired. Language impairments (22/26, 85%) ranged from mild to severe. Comorbidities included feeding difficulties in infancy (10/27, 37%), fine (14/27, 52%) and gross (14/27, 52%) motor impairment, anxiety (6/28, 21%), depression (7/28, 25%), and sleep disturbance (11/15, 44%). Physical features were common (23/28, 82%) but with no consistent pattern. Cognition ranged from average to mildly impaired, and was incongruent with language ability; for example, seven participants with severe language disorder had average non-verbal cognition.

    Conclusions
    Although we identify increased prevalence of conditions like anxiety, depression and sleep disturbance, we confirm that the consequences of FOXP2 dysfunction remain relatively specific to speech disorder, as compared to other recently identified monogenic conditions associated with CAS. Thus, our findings reinforce that FOXP2 provides a valuable entrypoint for examining the neurobiological bases of speech disorder.
  • Muhinyi, A., & Rowland, C. F. (2023). Contributions of abstract extratextual talk and interactive style to preschoolers’ vocabulary development. Journal of Child Language, 50(1), 198-213. doi:10.1017/S0305000921000696.

    Abstract

    Caregiver abstract talk during shared reading predicts preschool-age children’s vocabulary development. However, previous research has focused on level of abstraction with less consideration of the style of extratextual talk. Here, we investigated the relation between these two dimensions of extratextual talk, and their contributions to variance in children’s vocabulary skills. Caregiver level of abstraction was associated with an interactive reading style. Controlling for socioeconomic status and child age, high interactivity predicted children’s concurrent vocabulary skills whereas abstraction did not. Controlling for earlier vocabulary skills, neither dimension of the extratextual talk predicted later vocabulary. Theoretical and practical relevance are discussed.
  • Nabrotzky, J., Ambrazaitis, G., Zellers, M., & House, D. (2023). Temporal alignment of manual gestures’ phase transitions with lexical and post-lexical accentual F0 peaks in spontaneous Swedish interaction. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527194.

    Abstract

    Many studies investigating the temporal alignment of co-speech gestures to acoustic units in the speech signal find a close coupling of the gestural landmarks and pitch accents or the stressed syllable of pitch-accented words. In English, a pitch accent is anchored in the lexically stressed syllable. Hence, it is unclear whether it is the lexical phonological dimension of stress, or the phrase-level prominence that determines the details of speech-gesture synchronization. This paper explores the relation between gestural phase transitions and accentual F0 peaks in Stockholm Swedish, which exhibits a lexical pitch accent distinction. When produced with phrase-level prominence, there are three different configurations of lexicality of F0 peaks and the status of the syllable it is aligned with. Through analyzing the alignment of the different F0 peaks with gestural onsets in spontaneous dyadic conversations, we aim to contribute to our understanding of the role of lexical prosodic phonology in the co-production of speech and gesture. The results, though limited by a small dataset, still suggest differences between the three types of peaks concerning which types of gesture phase onsets they tend to align with, and how well these landmarks align with each other, although these differences did not reach significance.
  • Norris, D., McQueen, J. M., & Cutler, A. (1995). Competition and segmentation in spoken word recognition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1209-1228.

    Abstract

    Spoken utterances contain few reliable cues to word boundaries, but listeners nonetheless experience little difficulty identifying words in continuous speech. The authors present data and simulations that suggest that this ability is best accounted for by a model of spoken-word recognition combining competition between alternative lexical candidates and sensitivity to prosodic structure. In a word-spotting experiment, stress pattern effects emerged most clearly when there were many competing lexical candidates for part of the input. Thus, competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition. A version of the Shortlist model (D. G. Norris, 1994b) incorporating the Metrical Segmentation Strategy (A. Cutler & D. Norris, 1988) accurately simulates the results using a lexicon of more than 25,000 words.
  • Norris, D., McQueen, J. M., & Cutler, A. (2000). Feedback on feedback on feedback: It’s feedforward. (Response to commentators). Behavioral and Brain Sciences, 23, 352-370.

    Abstract

    The central thesis of the target article was that feedback is never necessary in spoken word recognition. The commentaries present no new data and no new theoretical arguments which lead us to revise this position. In this response we begin by clarifying some terminological issues which have led to a number of significant misunderstandings. We provide some new arguments to support our case that the feedforward model Merge is indeed more parsimonious than the interactive alternatives, and that it provides a more convincing account of the data than alternative models. Finally, we extend the arguments to deal with new issues raised by the commentators such as infant speech perception and neural architecture.
  • Norris, D., McQueen, J. M., & Cutler, A. (2000). Merging information in speech recognition: Feedback is never necessary. Behavioral and Brain Sciences, 23, 299-325.

    Abstract

    Top-down feedback does not benefit speech recognition; on the contrary, it can hinder it. No experimental data imply that feedback loops are required for speech recognition. Feedback is accordingly unnecessary and spoken word recognition is modular. To defend this thesis, we analyse lexical involvement in phonemic decision making. TRACE (McClelland & Elman 1986), a model with feedback from the lexicon to prelexical processes, is unable to account for all the available data on phonemic decision making. The modular Race model (Cutler & Norris 1979) is likewise challenged by some recent results, however. We therefore present a new modular model of phonemic decision making, the Merge model. In Merge, information flows from prelexical processes to the lexicon without feedback. Because phonemic decisions are based on the merging of prelexical and lexical information, Merge correctly predicts lexical involvement in phonemic decisions in both words and nonwords. Computer simulations show how Merge is able to account for the data through a process of competition between lexical hypotheses. We discuss the issue of feedback in other areas of language processing and conclude that modular models are particularly well suited to the problems and constraints of speech recognition.
  • Norris, D., Cutler, A., McQueen, J. M., Butterfield, S., & Kearns, R. K. (2000). Language-universal constraints on the segmentation of English. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 43-46). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    Two word-spotting experiments are reported that examine whether the Possible-Word Constraint (PWC) [1] is a language-specific or language-universal strategy for the segmentation of continuous speech. The PWC disfavours parses which leave an impossible residue between the end of a candidate word and a known boundary. The experiments examined cases where the residue was either a CV syllable with a lax vowel, or a CVC syllable with a schwa. Although neither syllable context is a possible word in English, word-spotting in both contexts was easier than with a context consisting of a single consonant. The PWC appears to be language-universal rather than language-specific.
  • Norris, D., Cutler, A., & McQueen, J. M. (2000). The optimal architecture for simulating spoken-word recognition. In C. Davis, T. Van Gelder, & R. Wales (Eds.), Cognitive Science in Australia, 2000: Proceedings of the Fifth Biennial Conference of the Australasian Cognitive Science Society. Adelaide: Causal Productions.

    Abstract

    Simulations explored the inability of the TRACE model of spoken-word recognition to model the effects on human listening of subcategorical mismatch in word forms. The source of TRACE's failure lay not in interactive connectivity, not in the presence of inter-word competition, and not in the use of phonemic representations, but in the need for continuously optimised interpretation of the input. When an analogue of TRACE was allowed to cycle to asymptote on every slice of input, an acceptable simulation of the subcategorical mismatch data was achieved. Even then, however, the simulation was not as close as that produced by the Merge model, which has inter-word competition, phonemic representations and continuous optimisation (but no interactive connectivity).
  • Nota, N., Trujillo, J. P., & Holler, J. (2023). Specific facial signals associate with categories of social actions conveyed through questions. PLoS One, 18(7): e0288104. doi:10.1371/journal.pone.0288104.

    Abstract

    The early recognition of fundamental social actions, like questions, is crucial for understanding the speaker’s intended message and planning a timely response in conversation. Questions themselves may express more than one social action category (e.g., an information request “What time is it?”, an invitation “Will you come to my party?” or a criticism “Are you crazy?”). Although human language use occurs predominantly in a multimodal context, prior research on social actions has mainly focused on the verbal modality. This study breaks new ground by investigating how conversational facial signals may map onto the expression of different types of social actions conveyed through questions. The distribution, timing, and temporal organization of facial signals across social actions were analysed in a rich corpus of naturalistic, dyadic face-to-face Dutch conversations. These social actions were: Information Requests, Understanding Checks, Self-Directed questions, Stance or Sentiment questions, Other-Initiated Repairs, Active Participation questions, questions for Structuring, Initiating or Maintaining Conversation, and Plans and Actions questions. This is the first study to reveal differences in distribution and timing of facial signals across different types of social actions. The findings raise the possibility that facial signals may facilitate social action recognition during language processing in multimodal face-to-face interaction.

    Additional information

    supporting information
  • Nota, N., Trujillo, J. P., Jacobs, V., & Holler, J. (2023). Facilitating question identification through natural intensity eyebrow movements in virtual avatars. Scientific Reports, 13: 21295. doi:10.1038/s41598-023-48586-4.

    Abstract

    In conversation, recognizing social actions (similar to ‘speech acts’) early is important to quickly understand the speaker’s intended message and to provide a fast response. Fast turns are typical for fundamental social actions like questions, since a long gap can indicate a dispreferred response. In multimodal face-to-face interaction, visual signals may contribute to this fast dynamic. The face is an important source of visual signalling, and previous research found that prevalent facial signals such as eyebrow movements facilitate the rapid recognition of questions. We aimed to investigate whether early eyebrow movements with natural movement intensities facilitate question identification, and whether specific intensities are more helpful in detecting questions. Participants were instructed to view videos of avatars where the presence of eyebrow movements (eyebrow frown or raise vs. no eyebrow movement) was manipulated, and to indicate whether the utterance in the video was a question or statement. Results showed higher accuracies for questions with eyebrow frowns, and faster response times for questions with eyebrow frowns and eyebrow raises. No additional effect was observed for the specific movement intensity. This suggests that eyebrow movements that are representative of naturalistic multimodal behaviour facilitate question recognition.
  • Nota, N., Trujillo, J. P., & Holler, J. (2023). Conversational eyebrow frowns facilitate question identification: An online study using virtual avatars. Cognitive Science, 47(12): e13392. doi:10.1111/cogs.13392.

    Abstract

    Conversation is a time-pressured environment. Recognizing a social action (the “speech act,” such as a question requesting information) early is crucial in conversation to quickly understand the intended message and plan a timely response. Fast turns between interlocutors are especially relevant for responses to questions since a long gap may be meaningful by itself. Human language is multimodal, involving speech as well as visual signals from the body, including the face. But little is known about how conversational facial signals contribute to the communication of social actions. Some of the most prominent facial signals in conversation are eyebrow movements. Previous studies found links between eyebrow movements and questions, suggesting that these facial signals could contribute to the rapid recognition of questions. Therefore, we aimed to investigate whether early eyebrow movements (eyebrow frown or raise vs. no eyebrow movement) facilitate question identification. Participants were instructed to view videos of avatars where the presence of eyebrow movements accompanying questions was manipulated. Their task was to indicate whether the utterance was a question or a statement as accurately and quickly as possible. Data were collected using the online testing platform Gorilla. Results showed higher accuracies and faster response times for questions with eyebrow frowns, suggesting a facilitative role of eyebrow frowns for question identification. This means that facial signals can critically contribute to the communication of social actions in conversation by signaling social action-specific visual information and providing visual cues to speakers’ intentions.

    Additional information

    link to preprint
  • Nota, N. (2023). Talking faces: The contribution of conversational facial signals to language use and processing. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Nozais, V., Forkel, S. J., Petit, L., Talozzi, L., Corbetta, M., Thiebaut de Schotten, M., & Joliot, M. (2023). Atlasing white matter and grey matter joint contributions to resting-state networks in the human brain. Communications Biology, 6: 726. doi:10.1038/s42003-023-05107-3.

    Abstract

    Over the past two decades, the study of resting-state functional magnetic resonance imaging has revealed that functional connectivity within and between networks is linked to cognitive states and pathologies. However, the white matter connections supporting this connectivity remain only partially described. We developed a method to jointly map the white and grey matter contributing to each resting-state network (RSN). Using the Human Connectome Project, we generated an atlas of 30 RSNs. The method also highlighted the overlap between networks, which revealed that most of the brain’s white matter (89%) is shared between multiple RSNs, with 16% shared by at least 7 RSNs. These overlaps, especially the existence of regions shared by numerous networks, suggest that white matter lesions in these areas might strongly impact the communication within networks. We provide an atlas and an open-source software to explore the joint contribution of white and grey matter to RSNs and facilitate the study of the impact of white matter damage to these networks. In a first application of the software with clinical data, we were able to link stroke patients and impacted RSNs, showing that their symptoms aligned well with the estimated functions of the networks.
  • Numssen, O., van der Burght, C. L., & Hartwigsen, G. (2023). Revisiting the focality of non-invasive brain stimulation - implications for studies of human cognition. Neuroscience and Biobehavioral Reviews, 149: 105154. doi:10.1016/j.neubiorev.2023.105154.

    Abstract

    Non-invasive brain stimulation techniques are popular tools to investigate brain function in health and disease. Although transcranial magnetic stimulation (TMS) is widely used in cognitive neuroscience research to probe causal structure-function relationships, studies often yield inconclusive results. To improve the effectiveness of TMS studies, we argue that the cognitive neuroscience community needs to revise the stimulation focality principle – the spatial resolution with which TMS can differentially stimulate cortical regions. In the motor domain, TMS can differentiate between cortical muscle representations of adjacent fingers. However, this high degree of spatial specificity cannot be obtained in all cortical regions due to the influences of cortical folding patterns on the TMS-induced electric field. The region-dependent focality of TMS should be assessed a priori to estimate the experimental feasibility. Post-hoc simulations allow modeling of the relationship between cortical stimulation exposure and behavioral modulation by integrating data across stimulation sites or subjects.

  • Offrede, T., Mishra, C., Skantze, G., Fuchs, S., & Mooshammer, C. (2023). Do Humans Converge Phonetically When Talking to a Robot? In R. Skarnitzl, & J. Volin (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (pp. 3507-3511). Prague: GUARANT International.

    Abstract

    Phonetic convergence, i.e., adapting one’s speech towards that of an interlocutor, has been shown to occur in human-human conversations as well as human-machine interactions. Here, we investigate the hypothesis that human-to-robot convergence is influenced by the human’s perception of the robot and by the conversation’s topic. We conducted a within-subjects experiment in which 33 participants interacted with two robots differing in their eye gaze behavior: one looked constantly at the participant; the other produced gaze aversions, similarly to a human’s behavior. Additionally, the robot asked questions with increasing intimacy levels. We observed that the speakers tended to converge on F0 to the robots. However, this convergence to the robots was not modulated by how the speakers perceived them or by the topic’s intimacy. Interestingly, speakers produced lower F0 means when talking about more intimate topics. We discuss these findings in terms of current theories of conversational convergence.
  • Oliveira‑Stahl, G., Farboud, S., Sterling, M. L., Heckman, J. J., Van Raalte, B., Lenferink, D., Van der Stam, A., Smeets, C. J. L. M., Fisher, S. E., & Englitz, B. (2023). High-precision spatial analysis of mouse courtship vocalization behavior reveals sex and strain differences. Scientific Reports, 13: 5219. doi:10.1038/s41598-023-31554-3.

    Abstract

    Mice display a wide repertoire of vocalizations that varies with sex, strain, and context. Especially during social interaction, including sexually motivated dyadic interaction, mice emit sequences of ultrasonic vocalizations (USVs) of high complexity. As animals of both sexes vocalize, a reliable attribution of USVs to their emitter is essential. The state-of-the-art in sound localization for USVs in 2D allows spatial localization at a resolution of multiple centimeters. However, animals interact at closer ranges, e.g. snout-to-snout. Hence, improved algorithms are required to reliably assign USVs. We present a novel algorithm, SLIM (Sound Localization via Intersecting Manifolds), that achieves a 2–3-fold improvement in accuracy (13.1–14.3 mm) using only 4 microphones and extends to many microphones and localization in 3D. This accuracy allows reliable assignment of 84.3% of all USVs in our dataset. We apply SLIM to courtship interactions between adult C57Bl/6J wildtype mice and those carrying a heterozygous Foxp2 variant (R552H). The improved spatial accuracy reveals that vocalization behavior is dependent on the spatial relation between the interacting mice. Female mice vocalized more in close snout-to-snout interaction while male mice vocalized more when the male snout was in close proximity to the female's ano-genital region. Further, we find that the acoustic properties of the ultrasonic vocalizations (duration, Wiener Entropy, and sound level) are dependent on the spatial relation between the interacting mice as well as on the genotype. In conclusion, the improved attribution of vocalizations to their emitters provides a foundation for better understanding social vocal behaviors.

    Additional information

    supplementary movies and figures
  • Otake, T., & Cutler, A. (2000). A set of Japanese word cohorts rated for relative familiarity. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 3 (pp. 766-769). Beijing: China Military Friendship Publish.

    Abstract

    A database is presented of relative familiarity ratings for 24 sets of Japanese words, each set comprising words overlapping in the initial portions. These ratings are useful for the generation of material sets for research in the recognition of spoken words.
  • Otake, T., Davis, S. M., & Cutler, A. (1995). Listeners’ representations of within-word structure: A cross-linguistic and cross-dialectal investigation. In J. Pardo (Ed.), Proceedings of EUROSPEECH 95: Vol. 3 (pp. 1703-1706). Madrid: European Speech Communication Association.

    Abstract

    Japanese, British English and American English listeners were presented with spoken words in their native language, and asked to mark on a written transcript of each word the first natural division point in the word. The results showed clear and strong patterns of consensus, indicating that listeners have available to them conscious representations of within-word structure. Orthography did not play a strongly deciding role in the results. The patterns of response were at variance with results from on-line studies of speech segmentation, suggesting that the present task taps not those representations used in on-line listening, but levels of representation which may involve much richer knowledge of word-internal structure.
