Publications

Displaying 201 - 300 of 378
  • Lai, V. T., Hagoort, P., & Casasanto, D. (2011). Affective and non-affective meaning in words and pictures. In L. Carlson, C. Holscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (pp. 390-395). Austin, TX: Cognitive Science Society.
  • Lammertink, I., De Heer Kloots, M., Bazioni, M., & Raviv, L. (2024). Learnability effects in children: Are more structured languages easier to learn? In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 320-323). Nijmegen: The Evolution of Language Conferences.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2008). Accelerating 3D medical image segmentation with high performance computing. In Proceedings of the IEEE International Workshops on Image Processing Theory, Tools and Applications - IPT (pp. 1-8).

    Abstract

    Digital processing of medical images has helped physicians and patients during past years by allowing examination and diagnosis on a very precise level. Nowadays possibly the biggest deal of support it can offer for modern healthcare is the use of high performance computing architectures to treat the huge amounts of data that can be collected by modern acquisition devices. This paper presents a parallel processing implementation of an image segmentation algorithm that operates on a computer cluster equipped with 10 processing units. Thanks to well-organized distribution of the workload we manage to significantly shorten the execution time of the developed algorithm and reach a performance gain very close to linear.
  • Lenkiewicz, P., Wittenburg, P., Schreer, O., Masneri, S., Schneider, D., & Tschöpel, S. (2011). Application of audio and video processing methods for language research. In Proceedings of the conference Supporting Digital Humanities 2011 [SDH 2011], Copenhagen, Denmark, November 17-18, 2011.

    Abstract

    Annotations of media recordings are the grounds for linguistic research. Since creating those annotations is a very laborious task, reaching 100 times longer than the length of the annotated media, innovative audio and video processing algorithms are needed, in order to improve the efficiency and quality of annotation process. The AVATecH project, started by the Max-Planck Institute for Psycholinguistics (MPI) and the Fraunhofer institutes HHI and IAIS, aims at significantly speeding up the process of creating annotations of audio-visual data for humanities research. In order for this to be achieved a range of state-of-the-art audio and video pattern recognition algorithms have been developed and integrated into widely used ELAN annotation tool. To address the problem of heterogeneous annotation tasks and recordings we provide modular components extended by adaptation and feedback mechanisms to achieve competitive annotation quality within significantly less annotation time.
  • Lenkiewicz, P., Wittenburg, P., Gebre, B. G., Lenkiewicz, A., Schreer, O., & Masneri, S. (2011). Application of video processing methods for linguistic research. In Z. Vetulani (Ed.), Human language technologies as a challenge for computer science and linguistics. Proceedings of the 5th Language and Technology Conference (LTC 2011), November 25-27, 2011, Poznań, Poland (pp. 561-564).

    Abstract

    Evolution and changes of all modern languages is a well-known fact. However, recently it is reaching dynamics never seen before, which results in loss of the vast amount of information encoded in every language. In order to preserve such heritage, properly annotated recordings of world languages are necessary. Since creating those annotations is a very laborious task, reaching times 100 longer than the length of the annotated media, innovative video processing algorithms are needed, in order to improve the efficiency and quality of annotation process.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2011). Extended whole mesh deformation model: Full 3D processing. In Proceedings of the 2011 IEEE International Conference on Image Processing (pp. 1633-1636).

    Abstract

    Processing medical data has always been an interesting field that has shown the need for effective image segmentation methods. Modern medical image segmentation solutions are focused on 3D image volumes, which originate at advanced acquisition devices. Operating on such data in a 3D envi- ronment is essential in order to take the full advantage of the available information. In this paper we present an extended version of our 3D image segmentation and reconstruction model that belongs to the family of Deformable Models and is capable of processing large image volumes in competitive times and in fully 3D environment, offering a big level of automation of the process and a high precision of results. It is also capable of handling topology changes and offers a very good scalability on multi-processing unit architectures. We present a description of the model and show its capabilities in the field of medical image processing.
  • Levelt, W. J. M. (1996). Preface. In W. J. M. Levelt (Ed.), Advanced psycholinguistics: A bressanone perspective for Giovanni B. Flores d'Arcais (pp. VII-IX). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Levelt, W. J. M. (1996). Foreword. In T. Dijkstra, & K. De Smedt (Eds.), Computational psycholinguistics (pp. ix-xi). London: Taylor & Francis.
  • Levelt, W. J. M. (1962). Motion breaking and the perception of causality. In A. Michotte (Ed.), Causalité, permanence et réalité phénoménales: Etudes de psychologie expérimentale (pp. 244-258). Louvain: Publications Universitaires.
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Levelt, W. J. M., & Kempen, G. (1979). Language. In J. A. Michon, E. G. J. Eijkman, & L. F. W. De Klerk (Eds.), Handbook of psychonomics (Vol. 2) (pp. 347-407). Amsterdam: North Holland.
  • Levelt, W. J. M. (1996). Linguistic intuitions and beyond. In W. J. M. Levelt (Ed.), Advanced psycholinguistics: A Bressanone retrospective for Giovanni B. Floris d'Arcais (pp. 31-35). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Levelt, W. J. M. (1996). Perspective taking and ellipsis in spatial descriptions. In P. Bloom, M. A. Peterson, L. Nadel, & M. F. Garrett (Eds.), Language and space (pp. 77-107). Cambridge, MA: MIT Press.
  • Levelt, W. J. M. (1979). The origins of language and language awareness. In M. Von Cranach, K. Foppa, W. Lepenies, & D. Ploog (Eds.), Human ethology (pp. 739-745). Cambridge: Cambridge University Press.
  • Levelt, W. J. M. (1966). The perceptual conflict in binocular rivalry. In M. A. Bouman (Ed.), Studies in perception: Dedicated to M.A. Bouman (pp. 47-60). Soesterberg: Institute for Perception RVO-TNO.
  • Levelt, W. J. M. (2008). What has become of formal grammars in linguistics and psycholinguistics? [Postscript]. In Formal Grammars in linguistics and psycholinguistics (pp. 1-17). Amsterdam: John Benjamins.
  • Levinson, S. C. (2011). Deixis [Reprint]. In D. Archer, & P. Grundy (Eds.), The pragmatics reader (pp. 163-185). London: Routledge.

    Abstract

    Reproduced with permission of Blackwell Publishing from: Levinson, S. C. (2004) 'Deixis'. In: Horn, L.R. and Ward, G. (Eds.) The Handbook of Pragmatics. Oxford: Blackwell Publishing, pp. 100-121
  • Levinson, S. C. (2011). Foreword. In D. M. Mark, A. G. Turk, N. Burenhult, & D. Stea (Eds.), Landscape in language: Transdisciplinary perspectives (pp. ix-x). Amsterdam: John Benjamins.
  • Levinson, S. C. (1996). Frames of reference and Molyneux's question: Cross-linguistic evidence. In P. Bloom, M. Peterson, L. Nadel, & M. Garrett (Eds.), Language and space (pp. 109-169). Cambridge, MA: MIT press.
  • Levinson, S. C. (1996). Introduction to part II. In J. J. Gumperz, & S. C. Levinson (Eds.), Rethinking linguistic relativity (pp. 133-144). Cambridge: Cambridge University Press.
  • Levinson, S. C. (2011). Presumptive meanings [Reprint]. In D. Archer, & P. Grundy (Eds.), The pragmatics reader (pp. 86-98). London: Routledge.

    Abstract

    Reprinted with permission of The MIT Press from Levinson (2000) Presumptive meanings: The theory of generalized conversational implicature, pp. 112-118, 116-167, 170-173, 177-180. MIT Press
  • Levinson, S. C. (2011). Reciprocals in Yélî Dnye, the Papuan language of Rossel Island. In N. Evans, A. Gaby, S. C. Levinson, & A. Majid (Eds.), Reciprocals and semantic typology (pp. 177-194). Amsterdam: Benjamins.

    Abstract

    Yélî Dnye has two discernable dedicated constructions for reciprocal marking. The first and main construction uses a dedicated reciprocal pronoun numo, somewhat like English each other. We can recognise two subconstructions. First, the ‘numo-construction’, where the reciprocal pronoun is a patient of the verb, and where the invariant pronoun numo is obligatorily incorporated, triggering intransitivisation (e.g. A-NPs become absolutive). This subconstruction has complexities, for example in the punctual aspect only, the verb is inflected like a transitive, but with enclitics mismatching actual person/number. In the second variant or subconstruction, the ‘noko-construction’, the same reciprocal pronoun (sometimes case-marked as noko) occurs but now in oblique positions with either transitive or intransitive verbs. The reciprocal element here has some peculiar binding properties. Finally, the second independent construction is a dedicated periphrastic (or woni…woni) construction, glossing ‘the one did X to the other, and the other did X to the one’. It is one of the rare cross-serial dependencies that show that natural languages cannot be modelled by context-free phrase-structure grammars. Finally, the usage of these two distinct constructions is discussed.
  • Levinson, S. C. (1996). Relativity in spatial conception and description. In J. J. Gumperz, & S. C. Levinson (Eds.), Rethinking linguistic relativity (pp. 177-202). Cambridge University Press.
  • Levinson, S. C. (1979). Pragmatics and social deixis: Reclaiming the notion of conventional implicature. In C. Chiarello (Ed.), Proceedings of the Fifth Annual Meeting of the Berkeley Linguistics Society (pp. 206-223).
  • Levinson, S. C., & Majid, A. (2008). Preface and priorities. In A. Majid (Ed.), Field manual volume 11 (pp. iii-iv). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Levinson, S. C. (2011). Three levels of meaning: Essays in honor of Sir John Lyons [Reprint]. In A. Kasher (Ed.), Pragmatics II. London: Routledge.

    Abstract

    Reprint from Stephen C. Levinson, ‘Three Levels of Meaning’, in Frank Palmer (ed.), Grammar and Meaning: Essays in Honor of Sir John Lyons (Cambridge University Press, 1995), pp. 90–115
  • Levinson, S. C., Bohnemeyer, J., & Enfield, N. J. (2008). Time and space questionnaire. In A. Majid (Ed.), Field Manual Volume 11 (pp. 42-49). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.492955.

    Abstract

    This entry contains: 1. An invitation to think about to what extent the grammar of space and time share lexical and morphosyntactic resources − the suggestions here are only prompts, since it would take a long questionnaire to fully explore this; 2. A suggestion about how to collect gestural data that might show us to what extent the spatial and temporal domains, have a psychological continuity. This is really the goal − but you need to do the linguistic work first or in addition. The goal of this task is to explore the extent to which time is conceptualised on a spatial basis.
  • Levinson, S. C., & Senft, G. (1996). Zur Semantik der Verben INTRARE und EXIRE in verschieden Sprachen. In Jahrbuch der Max-Planck-Gesellschaft 1996 (pp. 340-344). München: Generalverwaltung der Max-Planck-Gesellschaft München.
  • Levinson, S. C. (2011). Universals in pragmatics. In P. C. Hogan (Ed.), The Cambridge encyclopedia of the language sciences (pp. 654-657). New York: Cambridge University Press.

    Abstract

    Changing Prospects for Universals in Pragmatics
    The term PRAGMATICS has come to denote the study of general principles of language use. It is usually understood to contrast with SEMANTICS, the study of encoded meaning, and also, by some authors, to contrast with SOCIOLINGUISTICS
    and the ethnography of speaking, which are more concerned with local sociocultural practices. Given that pragmaticists come from disciplines as varied as philosophy, sociology,
    linguistics, communication studies, psychology, and anthropology, it is not surprising that definitions of pragmatics vary. Nevertheless, most authors agree on a list of topics
    that come under the rubric, including DEIXIS, PRESUPPOSITION,
    implicature (see CONVERSATIONAL IMPLICATURE), SPEECH-ACTS, and conversational organization (see CONVERSATIONAL ANALYSIS). Here, we can use this extensional definition as a starting point (Levinson 1988; Huang 2007).
  • Levinson, S. C. (2024). Culture as cognitive technology: An evolutionary perspective. In G. Bennardo, V. C. De Munck, & S. Chrisomalis (Eds.), Cognition in and out of the mind: Advances in cultural model theory (pp. 241-265). London: Palgrave Macmillan.

    Abstract

    Cognitive anthropology is in need of a theory that extends beyond cultural model theory and explains both how culture has transformed human cognition and the curious ontology of culture itself, for, as Durkheim insisted, culture cannot be reduced to psychology. This chapter promotes a framework that deals with both the evolutionary question and the ontological problem. It is argued that at least a central part of culture should be conceived of in terms of cognitive technology. Beginning with obvious examples of cognitive artifacts, like those used in measurement, way-finding, time-reckoning and numerical calculation, the chapter goes on to consider extensions to our communication systems, emotion-modulating systems and the cognitive division of labor. Cognitive artifacts form ‘coupled systems’ that amplify individual psychology, lying partly outside the head, and are honed by cultural evolution. They make clear how culture gave human cognition an evolutionary edge.
  • Liesenfeld, A., & Dingemanse, M. (2024). Rethinking open source generative AI: open-washing and the EU AI Act. In The 2024 ACM Conference on Fairness, Accountability, and Transparency (FAccT ’24) (pp. 1774-1784). ACM.

    Abstract

    The past year has seen a steep rise in generative AI systems that claim to be open. But how open are they really? The question of what counts as open source in generative AI is poised to take on particular importance in light of the upcoming EU AI Act that regulates open source systems differently, creating an urgent need for practical openness assessment. Here we use an evidence-based framework that distinguishes 14 dimensions of openness, from training datasets to scientific and technical documentation and from licensing to access methods. Surveying over 45 generative AI systems (both text and text-to-image), we find that while the term open source is widely used, many models are `open weight' at best and many providers seek to evade scientific, legal and regulatory scrutiny by withholding information on training and fine-tuning data. We argue that openness in generative AI is necessarily composite (consisting of multiple elements) and gradient (coming in degrees), and point out the risk of relying on single features like access or licensing to declare models open or not. Evidence-based openness assessment can help foster a generative AI landscape in which models can be effectively regulated, model providers can be held accountable, scientists can scrutinise generative AI, and end users can make informed decisions.
  • Long, M., & Rubio-Fernandez, P. (2024). Beyond typicality: Lexical category affects the use and processing of color words. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 4925-4930).

    Abstract

    Speakers and listeners show an informativity bias in the use and interpretation of color modifiers. For example, speakers use color more often when referring to objects that vary in color than to objects with a prototypical color. Likewise, listeners look away from objects with prototypical colors upon hearing that color mentioned. Here we test whether speakers and listeners account for another factor related to informativity: the strength of the association between lexical categories and color. Our results demonstrate that speakers and listeners' choices are indeed influenced by this factor; as such, it should be integrated into current pragmatic theories of informativity and computational models of color reference.

    Additional information

    link to eScholarship
  • Lucas, C., Griffiths, T., Xu, F., & Fawcett, C. (2008). A rational model of preference learning and choice prediction by children. In D. Koller, Y. Bengio, D. Schuurmans, L. Bottou, & A. Culotta (Eds.), Advances in Neural Information Processing Systems.

    Abstract

    Young children demonstrate the ability to make inferences about the preferences of other agents based on their choices. However, there exists no overarching account of what children are doing when they learn about preferences or how they use that knowledge. We use a rational model of preference learning, drawing on ideas from economics and computer science, to explain the behavior of children in several recent experiments. Specifically, we show how a simple econometric model can be extended to capture two- to four-year-olds’ use of statistical information in inferring preferences, and their generalization of these preferences.
  • Lupyan, G., & Raviv, L. (2024). A cautionary note on sociodemographic predictors of linguistic complexity: Different measures and different analyses lead to different conclusions. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 345-348). Nijmegen: The Evolution of Language Conferences.
  • Magyari, L., & De Ruiter, J. P. (2008). Timing in conversation: The anticipation of turn endings. In J. Ginzburg, P. Healey, & Y. Sato (Eds.), Proceedings of the 12th Workshop on the Semantics and Pragmatics Dialogue (pp. 139-146). London: King's college.

    Abstract

    We examined how communicators can switch between speaker and listener role with such accurate timing. During conversations, the majority of role transitions happens with a gap or overlap of only a few hundred milliseconds. This suggests that listeners can predict when the turn of the current speaker is going to end. Our hypothesis is that listeners know when a turn ends because they know how it ends. Anticipating the last words of a turn can help the next speaker in predicting when the turn will end, and also in anticipating the content of the turn, so that an appropriate response can be prepared in advance. We used the stimuli material of an earlier experiment (De Ruiter, Mitterer & Enfield, 2006), in which subjects were listening to turns from natural conversations and had to press a button exactly when the turn they were listening to ended. In the present experiment, we investigated if the subjects can complete those turns when only an initial fragment of the turn is presented to them. We found that the subjects made better predictions about the last words of those turns that had more accurate responses in the earlier button press experiment.
  • Magyari, L. (2008). A mentális lexikon modelljei és a magyar nyelv (Models of mental lexicon and the Hungarian language). In J. Gervain, & C. Pléh (Eds.), A láthatatlan nyelv (Invisible Language). Budapest: Gondolat Kiadó.
  • Majid, A., van Leeuwen, T., & Dingemanse, M. (2008). Synaesthesia: A cross-cultural pilot. In A. Majid (Ed.), Field manual volume 11 (pp. 37-41). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.492960.

    Abstract

    This Field Manual entry has been superceded by the 2009 version:
    https://doi.org/10.17617/2.883570

    Files private

    Request files
  • Majid, A. (2008). Focal colours. In A. Majid (Ed.), Field Manual Volume 11 (pp. 8-10). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.492958.

    Abstract

    In this task we aim to find what the best exemplars or “focal colours” of each basic colour term is in our field languages. This is an important part of the evidence we need in order to understand the colour data collected using 'The Language of Vision I: Colour'. This task consists of an experiment where participants pick out the best exemplar for the colour terms in their language. The goal is to establish language specific focal colours.
  • Majid, A., Evans, N., Gaby, A., & Levinson, S. C. (2011). The semantics of reciprocal constructions across languages: An extensional approach. In N. Evans, A. Gaby, S. C. Levinson, & A. Majid (Eds.), Reciprocals and semantic typology (pp. 29-60). Amsterdam: Benjamins.

    Abstract

    How similar are reciprocal constructions in the semantic parameters they encode? We investigate this question by using an extensional approach, which examines similarity of meaning by examining how constructions are applied over a set of 64 videoclips depicting reciprocal events (Evans et al. 2004). We apply statistical modelling to descriptions from speakers of 20 languages elicited using the videoclips. We show that there are substantial differences in meaning between constructions of different languages.

    Files private

    Request files
  • Majid, A., & Levinson, S. C. (2011). The language of perception across cultures [Abstract]. Abstracts of the XXth Congress of European Chemoreception Research Organization, ECRO-2010. Publ. in Chemical Senses, 36(1), E7-E8.

    Abstract

    How are the senses structured by the languages we speak, the cultures we inhabit? To what extent is the encoding of perceptual experiences in languages a matter of how the mind/brain is ―wired-up‖ and to what extent is it a question of local cultural preoccupation? The ―Language of Perception‖ project tests the hypothesis that some perceptual domains may be more ―ineffable‖ – i.e. difficult or impossible to put into words – than others. While cognitive scientists have assumed that proximate senses (olfaction, taste, touch) are more ineffable than distal senses (vision, hearing), anthropologists have illustrated the exquisite variation and elaboration the senses achieve in different cultural milieus. The project is designed to test whether the proximate senses are universally ineffable – suggesting an architectural constraint on cognition – or whether they are just accidentally so in Indo-European languages, so expanding the role of cultural interests and preoccupations. To address this question, a standardized set of stimuli of color patches, geometric shapes, simple sounds, tactile textures, smells and tastes have been used to elicit descriptions from speakers of more than twenty languages—including three sign languages. The languages are typologically, genetically and geographically diverse, representing a wide-range of cultures. The communities sampled vary in subsistence modes (hunter-gatherer to industrial), ecological zones (rainforest jungle to desert), dwelling types (rural and urban), and various other parameters. We examine how codable the different sensory modalities are by comparing how consistent speakers are in how they describe the materials in each modality. Our current analyses suggest that taste may, in fact, be the most codable sensorial domain across languages. Moreover, we have identified exquisite elaboration in the olfactory domains in some cultural settings, contrary to some contemporary predictions within the cognitive sciences. These results suggest that differential codability may be at least partly the result of cultural preoccupation. This shows that the senses are not just physiological phenomena but are constructed through linguistic, cultural and social practices.
  • Malt, B. C., Ameel, E., Gennari, S., Imai, M., Saji, N., & Majid, A. (2011). Do words reveal concepts? In L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 519-524). Austin, TX: Cognitive Science Society.

    Abstract

    To study concepts, cognitive scientists must first identify some. The prevailing assumption is that they are revealed by words such as triangle, table, and robin. But languages vary dramatically in how they carve up the world by name. Either ordinary concepts must be heavily language-dependent or names cannot be a direct route to concepts. We asked English, Dutch, Spanish, and Japanese speakers to name videos of human locomotion and judge their similarities. We investigated what name inventories and scaling solutions on name similarity and on physical similarity for the groups individually and together suggest about the underlying concepts. Aggregated naming and similarity solutions converged on results distinct from the answers suggested by the word inventories and scaling solutions of any single language. Words such as triangle, table, and robin can help identify the conceptual space of a domain, but they do not directly reveal units of knowledge usefully considered 'concepts'.
  • Marcus, G., & Fisher, S. E. (2011). Genes and language. In P. Hogan (Ed.), The Cambridge encyclopedia of the language sciences (pp. 341-344). New York: Cambridge University Press.
  • Mark, D. M., Turk, A., Burenhult, N., & Stea, D. (2011). Landscape in language: An introduction. In D. M. Mark, A. G. Turk, N. Burenhult, & D. Stea (Eds.), Landscape in language: Transdisciplinary perspectives (pp. 1-24). Amsterdam: John Benjamins.
  • de Marneffe, M.-C., Tomlinson, J. J., Tice, M., & Sumner, M. (2011). The interaction of lexical frequency and phonetic variation in the perception of accented speech. In L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 3575-3580). Austin, TX: Cognitive Science Society.

    Abstract

    How listeners understand spoken words despite massive variation in the speech signal is a central issue for linguistic theory. A recent focus on lexical frequency and specificity has proved fruitful in accounting for this phenomenon. Speech perception, though, is a multi-faceted process and likely incorporates a number of mechanisms to map a variable signal to meaning. We examine a well-established language use factor — lexical frequency — and how this factor is integrated with phonetic variability during the perception of accented speech. We show that an integrated perspective highlights a low-level perceptual mechanism that accounts for the perception of accented speech absent native contrasts, while shedding light on the use of interactive language factors in the perception of spoken words.
  • Matteo, M., & Bosker, H. R. (2024). How to test gesture-speech integration in ten minutes. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 737-741). doi:10.21437/SpeechProsody.2024-149.

    Abstract

    Human conversations are inherently multimodal, including auditory speech, visual articulatory cues, and hand gestures. Recent studies demonstrated that the timing of a simple up-and-down hand movement, known as a beat gesture, can affect speech perception. A beat gesture falling on the first syllable of a disyllabic word induces a bias to perceive a strong-weak stress pattern (i.e., “CONtent”), while a beat gesture falling on the second syllable combined with the same acoustics biases towards a weak-strong stress pattern (“conTENT”). This effect, termed the “manual McGurk effect”, has been studied in both in-lab and online studies, employing standard experimental sessions lasting approximately forty minutes. The present work tests whether the manual McGurk effect can be observed in an online short version (“mini-test”) of the original paradigm, lasting only ten minutes. Additionally, we employ two different response modalities, namely a two-alternative forced choice and a visual analog scale. A significant manual McGurk effect was observed with both response modalities. Overall, the present study demonstrates the feasibility of employing a ten-minute manual McGurk mini-test to obtain a measure of gesture-speech integration. As such, it may lend itself for inclusion in large-scale test batteries that aim to quantify individual variation in language processing.
  • Mishra, C., Nandanwar, A., & Mishra, S. (2024). HRI in Indian education: Challenges opportunities. In H. Admoni, D. Szafir, W. Johal, & A. Sandygulova (Eds.), Designing an introductory HRI course (workshop at HRI 2024). ArXiv. doi:10.48550/arXiv.2403.12223.

    Abstract

    With the recent advancements in the field of robotics and the increased focus on having general-purpose robots widely available to the general public, it has become increasingly necessary to pursue research into Human-robot interaction (HRI). While there have been a lot of works discussing frameworks for teaching HRI in educational institutions with a few institutions already offering courses to students, a consensus on the course content still eludes the field. In this work, we highlight a few challenges and opportunities while designing an HRI course from an Indian perspective. These topics warrant further deliberations as they have a direct impact on the design of HRI courses and wider implications for the entire field.
  • Mitterer, H. (2008). How are words reduced in spontaneous speech? In A. Botonis (Ed.), Proceedings of ISCA Tutorial and Research Workshop On Experimental Linguistics (pp. 165-168). Athens: University of Athens.

    Abstract

    Words are reduced in spontaneous speech. If reductions are constrained by functional (i.e., perception and production) constraints, they should not be arbitrary. This hypothesis was tested by examing the pronunciations of high- to mid-frequency words in a Dutch and a German spontaneous speech corpus. In logistic-regression models the "reduction likelihood" of a phoneme was predicted by fixed-effect predictors such as position within the word, word length, word frequency, and stress, as well as random effects such as phoneme identity and word. The models for Dutch and German show many communalities. This is in line with the assumption that similar functional constraints influence reductions in both languages.
  • Mitterer, H. (2011). Social accountability influences phonetic alignment. Journal of the Acoustical Society of America. Program abstracts of the 162nd Meeting of the Acoustical Society of America, 130(4), 2442.

    Abstract

    Speakers tend to take over the articulatory habits of their interlocutors [e.g., Pardo, JASA (2006)]. This phonetic alignment could be the consequence of either a social mechanism or a direct and automatic link between speech perception and production. The latter assumes that social variables should have little influence on phonetic alignment. To test that participants were engaged in a "cloze task" (i.e., Stimulus: "In fantasy movies, silver bullets are used to kill ..." Response: "werewolves") with either one or four interlocutors. Given findings with the Asch-conformity paradigm in social psychology, multiple consistent speakers should exert a stronger force on the participant to align. To control the speech style of the interlocutors, their questions and answers were pre-recorded in either a formal or a casual speech style. The stimuli's speech style was then manipulated between participants and was consistent throughout the experiment for a given participant. Surprisingly, participants aligned less with the speech style if there were multiple interlocutors. This may reflect a "diffusion of responsibility:" Participants may find it more important to align when they interact with only one person than with a larger group.
  • Motiekaitytė, K., Grosseck, O., Wolf, L., Bosker, H. R., Peeters, D., Perlman, M., Ortega, G., & Raviv, L. (2024). Iconicity and compositionality in emerging vocal communication systems: a Virtual Reality approach. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 387-389). Nijmegen: The Evolution of Language Conferences.
  • Norcliffe, E., Enfield, N. J., Majid, A., & Levinson, S. C. (2011). The grammar of perception. In K. Kendrick, & A. Majid (Eds.), Field manual volume 14 (pp. 1-10). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Nordhoff, S., & Hammarström, H. (2011). Glottolog/Langdoc: Defining dialects, languages, and language families as collections of resources. Proceedings of the First International Workshop on Linked Science 2011 (LISC2011), Bonn, Germany, October 24, 2011.

    Abstract

    This paper describes the Glottolog/Langdoc project, an at- tempt to provide near-total bibliographical coverage of descriptive re- sources to the world's languages. Every reference is treated as a resource, as is every \languoid"[1]. References are linked to the languoids which they describe, and languoids are linked to the references described by them. Family relations between languoids are modeled in SKOS, as are relations across dierent classications of the same languages. This setup allows the representation of languoids as collections of references, render- ing the question of the denition of entities like `Scots', `West-Germanic' or `Indo-European' more empirical.
  • Ozturk, O., & Papafragou, A. (2008). Acquisition of evidentiality and source monitoring. In H. Chan, H. Jacob, & E. Kapia (Eds.), Proceedings from the 32nd Annual Boston University Conference on Language Development [BUCLD 32] (pp. 368-377). Somerville, Mass.: Cascadilla Press.
  • Ozyurek, A., & Perniss, P. M. (2011). Event representations in signed languages. In J. Bohnemeyer, & E. Pederson (Eds.), Event representations in language and cognition (pp. 84-107). New York: Cambridge University Press.
  • Pederson, E., & Wilkins, D. (1996). A cross-linguistic questionnaire on 'demonstratives'. In S. C. Levinson (Ed.), Manual for the 1996 Field Season (pp. 1-11). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.3003259.

    Abstract

    Demonstrative terms (e.g., this and that) are key items in understanding how a language constructs and interprets spatial relationships. This in-depth questionnaire explores how demonstratives (and similar spatial deixis forms) function in the research language, covering such topics as their morphology and syntax, semantic dimensions, and co-occurring gesture practices. Questionnaire responses should ideally be based on natural, situated discourse as well as elicitation with consultants.
  • Pederson, E., & Senft, G. (1996). Route descriptions: interactive games with Eric's maze task. In S. C. Levinson (Ed.), Manual for the 1996 Field Season (pp. 15-17). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.3003287.

    Abstract

    What are the preferred ways to describe spatial relationships in different linguistic and cultural groups, and how does this interact with non-linguistic spatial awareness? This game was devised as an interactive supplement to several items that collect information on the encoding and understanding of spatial relationships, especially as relevant to “route descriptions”. This is a director-matcher task, where one consultant has access to stimulus materials that shows a “target” situation, and directs another consultant (who cannot see the target) to recreate this arrangement.
  • Peirolo, M., Meyer, A. S., & Frances, C. (2024). Investigating the causes of prosodic marking in self-repairs: An automatic process? In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 1080-1084). doi:10.21437/SpeechProsody.2024-218.

    Abstract

    Natural speech involves repair. These repairs are often highlighted through prosodic marking (Levelt & Cutler, 1983). Prosodic marking usually entails an increase in pitch, loudness, and/or duration that draws attention to the corrected word. While it is established that natural self-repairs typically elicit prosodic marking, the exact cause of this is unclear. This study investigates whether producing a prosodic marking emerges from an automatic correction process or has a communicative purpose. In the current study, we elicit corrections to test whether all self-corrections elicit prosodic marking. Participants carried out a picture-naming task in which they described two images presented on-screen. To prompt self-correction, the second image was altered in some cases, requiring participants to abandon their initial utterance and correct their description to match the new image. This manipulation was compared to a control condition in which only the orientation of the object would change, eliciting no self-correction while still presenting a visual change. We found that the replacement of the item did not elicit a prosodic marking, regardless of the type of change. Theoretical implications and research directions are discussed, in particular theories of prosodic planning.
  • Perniss, P. M., & Ozyurek, A. (2008). Representations of action, motion and location in sign space: A comparison of German (DGS) and Turkish (TID) sign language narratives. In J. Quer (Ed.), Signs of the time: Selected papers from TISLR 8 (pp. 353-376). Seedorf: Signum Press.
  • Perniss, P. M., Zwitserlood, I., & Ozyurek, A. (2011). Does space structure spatial language? Linguistic encoding of space in sign languages. In L. Carlson, C. Holscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (pp. 1595-1600). Austin, TX: Cognitive Science Society.
  • Perniss, P. M., & Zeshan, U. (2008). Possessive and existential constructions in Kata Kolok (Bali). In Possessive and existential constructions in sign languages. Nijmegen: Ishara Press.
  • Perniss, P. M., & Zeshan, U. (2008). Possessive and existential constructions: Introduction and overview. In Possessive and existential constructions in sign languages (pp. 1-31). Nijmegen: Ishara Press.
  • Petersson, K. M. (2008). On cognition, structured sequence processing, and adaptive dynamical systems. American Institute of Physics Conference Proceedings, 1060(1), 195-200.

    Abstract

    Cognitive neuroscience approaches the brain as a cognitive system: a system that functionally is conceptualized in terms of information processing. We outline some aspects of this concept and consider a physical system to be an information processing device when a subclass of its physical states can be viewed as representational/cognitive and transitions between these can be conceptualized as a process operating on these states by implementing operations on the corresponding representational structures. We identify a generic and fundamental problem in cognition: sequentially organized structured processing. Structured sequence processing provides the brain, in an essential sense, with its processing logic. In an approach addressing this problem, we illustrate how to integrate levels of analysis within a framework of adaptive dynamical systems. We note that the dynamical system framework lends itself to a description of asynchronous event-driven devices, which is likely to be important in cognition because the brain appears to be an asynchronous processing system. We use the human language faculty and natural language processing as a concrete example through out.
  • Petersson, K. M., Forkstam, C., Inácio, F., Bramão, I., Araújo, S., Souza, A. C., Silva, S., & Castro, S. L. (2011). Artificial language learning. In A. Trevisan, & V. Wannmacher Pereira (Eds.), Alfabeltização e cognição (pp. 71-90). Porto Alegre, Brasil: Edipucrs.

    Abstract

    Neste artigo fazemos uma revisão breve de investigações actuais com técnicas comportamentais e de neuroimagem funcional sobre a aprendizagem de uma linguagem artificial em crianças e adultos. Na secção final, discutimos uma possível associação entre dislexia e aprendizagem implícita. Resultados recentes sugerem que a presença de um défice ao nível da aprendizagem implícita pode contribuir para as dificuldades de leitura e escrita observadas em indivíduos disléxicos.
  • Phillips, W., & Majid, A. (2011). Emotional sound symbolism. In K. Kendrick, & A. Majid (Eds.), Field manual volume 14 (pp. 16-18). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.1005615.
  • Plate, L., Fisher, V. J., Nabibaks, F., & Feenstra, M. (2024). Feeling the traces of the Dutch colonial past: Dance as an affective methodology in Farida Nabibaks’s radiant shadow. In E. Van Bijnen, P. Brandon, K. Fatah-Black, I. Limon, W. Modest, & M. Schavemaker (Eds.), The future of the Dutch colonial past: From dialogues to new narratives (pp. 126-139). Amsterdam: Amsterdam University Press.
  • Plomp, R., & Levelt, W. J. M. (1966). Perception of tonal consonance. In M. A. Bouman (Ed.), Studies in Perception - dedicated to M.A. Bouman (pp. 105-118). Soesterberg: Institute for Perception RVO-TNO.
  • Poellmann, K., McQueen, J. M., & Mitterer, H. (2011). The time course of perceptual learning. In W.-S. Lee, & E. Zee (Eds.), Proceedings of the 17th International Congress of Phonetic Sciences 2011 [ICPhS XVII] (pp. 1618-1621). Hong Kong: Department of Chinese, Translation and Linguistics, City University of Hong Kong.

    Abstract

    Two groups of participants were trained to perceive an ambiguous sound [s/f] as either /s/ or /f/ based on lexical bias: One group heard the ambiguous fricative in /s/-final words, the other in /f/-final words. This kind of exposure leads to a recalibration of the /s/-/f/ contrast [e.g., 4]. In order to investigate when and how this recalibration emerges, test trials were interspersed among training and filler trials. The learning effect needed at least 10 clear training items to arise. Its emergence seemed to occur in a rather step-wise fashion. Learning did not improve much after it first appeared. It is likely, however, that the early test trials attracted participants' attention and therefore may have interfered with the learning process.
  • Rapold, C. J. (2011). Semantics of Khoekhoe reciprocal constructions. In N. Evans, A. Gaby, S. C. Levinson, & A. Majid (Eds.), Reciprocals and semantic typology (pp. 61-74). Amsterdam: Benjamins.

    Abstract

    This paper identifies four reciprocal construction types in Khoekhoe (Central Khoisan). After a brief description of the morphosyntax of each construction, semantic factors governing their choice are explored. Besides lexical semantics, the number of participants, timing of symmetric subevents, and symmetric conceptualisation are shown to account for the distribution of the four partially competing reciprocal constructions.
  • Razafindrazaka, H., & Brucato, N. (2008). Esclavage et diaspora Africaine. In É. Crubézy, J. Braga, & G. Larrouy (Eds.), Anthropobiologie: Évolution humaine (pp. 326-328). Issy-les-Moulineaux: Elsevier Masson.
  • Razafindrazaka, H., Brucato, N., & Mazières, S. (2008). Les Noirs marrons. In É. Crubézy, J. Braga, & G. Larrouy (Eds.), Anthropobiologie: Évolution humaine (pp. 319-320). Issy-les-Moulineaux: Elsevier Masson.
  • Regier, T., Khetarpal, N., & Majid, A. (2011). Inferring conceptual structure from cross-language data. In L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 1488). Austin, TX: Cognitive Science Society.
  • Reinisch, E., Jesse, A., & McQueen, J. M. (2008). The strength of stress-related lexical competition depends on the presence of first-syllable stress. In Proceedings of Interspeech 2008 (pp. 1954-1954).

    Abstract

    Dutch listeners' looks to printed words were tracked while they listened to instructions to click with their mouse on one of them. When presented with targets from word pairs where the first two syllables were segmentally identical but differed in stress location, listeners used stress information to recognize the target before segmental information disambiguated the words. Furthermore, the amount of lexical competition was influenced by the presence or absence of word-initial stress.
  • Reinisch, E., & Weber, A. (2011). Adapting to lexical stress in a foreign accent. In W.-S. Lee, & E. Zee (Eds.), Proceedings of the 17th International Congress of Phonetic Sciences 2011 [ICPhS XVII] (pp. 1678-1681). Hong Kong: Department of Chinese, Translation and Linguistics, City University of Hong Kong.

    Abstract

    An exposure-test paradigm was used to examine whether Dutch listeners can adapt their perception to non-canonical marking of lexical stress in Hungarian-accented Dutch. During exposure, one group of listeners heard only words with correct initial stress, while another group also heard examples of unstressed initial syllables that were marked by high pitch, a possible stress cue in Dutch. Subsequently, listeners’ eye movements to target-competitor pairs with segmental overlap but different stress patterns were tracked while hearing Hungarian-accented Dutch. Listeners who had heard non-canonically produced words previously distinguished target-competitor pairs faster than listeners who had only been exposed to canonical forms before. This suggests that listeners can adapt quickly to speaker-specific realizations of non-canonical lexical stress.
  • Reinisch, E., Jesse, A., & McQueen, J. M. (2008). Lexical stress information modulates the time-course of spoken-word recognition. In Proceedings of Acoustics' 08 (pp. 3183-3188).

    Abstract

    Segmental as well as suprasegmental information is used by Dutch listeners to recognize words. The time-course of the effect of suprasegmental stress information on spoken-word recognition was investigated in a previous study, in which we tracked Dutch listeners' looks to arrays of four printed words as they listened to spoken sentences. Each target was displayed along with a competitor that did not differ segmentally in its first two syllables but differed in stress placement (e.g., 'CENtimeter' and 'sentiMENT'). The listeners' eye-movements showed that stress information is used to recognize the target before distinct segmental information is available. Here, we examine the role of durational information in this effect. Two experiments showed that initial-syllable duration, as a cue to lexical stress, is not interpreted dependent on the speaking rate of the preceding carrier sentence. This still held when other stress cues like pitch and amplitude were removed. Rather, the speaking rate of the preceding carrier affected the speed of word recognition globally, even though the rate of the target itself was not altered. Stress information modulated lexical competition, but did so independently of the rate of the preceding carrier, even if duration was the only stress cue present.
  • Reinisch, E., Weber, A., & Mitterer, H. (2011). Listeners retune phoneme boundaries across languages [Abstract]. Journal of the Acoustical Society of America. Program abstracts of the 162nd Meeting of the Acoustical Society of America, 130(4), 2572-2572.

    Abstract

    Listeners can flexibly retune category boundaries of their native language to adapt to non-canonically produced phonemes. This only occurs, however, if the pronunciation peculiarities can be attributed to stable and not transient speaker-specific characteristics. Listening to someone speaking a second language, listeners could attribute non-canonical pronunciations either to the speaker or to the fact that she is modifying her categories in the second language. We investigated whether, following exposure to Dutch-accented English, Dutch listeners show effects of category retuning during test where they hear the same speaker speaking her native language, Dutch. Exposure was a lexical-decision task where either word-final [f] or [s] was replaced by an ambiguous sound. At test listeners categorized minimal word pairs ending in sounds along an [f]-[s] continuum. Following exposure to English words, Dutch listeners showed boundary shifts of a similar magnitude as following exposure to the same phoneme variants in their native language. This suggests that production patterns in a second language are deemed a stable characteristic. A second experiment suggests that category retuning also occurs when listeners are exposed to and tested with a native speaker of their second language. Listeners thus retune phoneme boundaries across languages.
  • Reis, A., Faísca, L., & Petersson, K. M. (2011). Literacia: Modelo para o estudo dos efeitos de uma aprendizagem específica na cognição e nas suas bases cerebrais. In A. Trevisan, J. J. Mouriño Mosquera, & V. Wannmacher Pereira (Eds.), Alfabeltização e cognição (pp. 23-36). Porto Alegro, Brasil: Edipucrs.

    Abstract

    A aquisição de competências de leitura e de escrita pode ser vista como um processo formal de transmissão cultural, onde interagem factores neurobiológicos e culturais. O treino sistemático exigido pela aprendizagem da leitura e da escrita poderá produzir mudanças quantitativas e qualitativas tanto a nível cognitivo como ao nível da organização do cérebro. Estudar sujeitos iletrados e letrados representa, assim, uma oportunidade para investigar efeitos de uma aprendizagem específica no desenvolvimento cognitivo e suas bases cerebrais. Neste trabalho, revemos um conjunto de investigações comportamentais e com métodos de imagem cerebral que indicam que a literacia tem um impacto nas nossas funções cognitivas e na organização cerebral. Mais especificamente, discutiremos diferenças entre letrados e iletrados para domínios cognitivos verbais e não-verbais, sugestivas de que a arquitectura cognitiva é formatada, em parte, pela aprendizagem da leitura e da escrita. Os dados de neuroimagem funcionais e estruturais são também indicadores que a aquisição de uma ortografia alfabética interfere nos processos de organização e lateralização das funções cognitivas.
  • de Reus, K., Benítez-Burraco, A., Hersh, T. A., Groot, N., Lambert, M. L., Slocombe, K. E., Vernes, S. C., & Raviv, L. (2024). Self-domestication traits in vocal learning mammals. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 105-108). Nijmegen: The Evolution of Language Conferences.
  • Roberts, L. (2008). Processing temporal constraints and some implications for the investigation of second language sentence processing and acquisition. Commentary on Baggio. In P. Indefrey, & M. Gullberg (Eds.), Time to speak: Cognitive and neural prerequisites for time in language (pp. 57-61). Oxford: Blackwell.
  • Robinson, S. (2011). Reciprocals in Rotokas. In N. Evans, A. Gaby, S. C. Levinson, & A. Majid (Eds.), Reciprocals and semantic typology (pp. 195-211). Amsterdam: Benjamins.

    Abstract

    This paper describes the syntax and semantics of reciprocity in the Central dialect of Rotokas, a non-Austronesian (Papuan) language spoken in Bougainville, Papua New Guinea. In Central Rotokas, there are three main reciprocal construction types, which differ formally according to where the reflexive/reciprocal marker (ora-) occurs in the clause: on the verb, on a pronominal argument or adjunct, or on a body part noun. The choice of construction type is determined by two considerations: the valency of the verb (i.e., whether it has one or two core arguments) and whether the reciprocal action is performed on a body part. The construction types are compatible with a wide range of the logical subtypes of reciprocity (strong, melee, chaining, etc.).
  • Robotham, L., Trinkler, I., & Sauter, D. (2008). The power of positives: Evidence for an overall emotional recognition deficit in Huntington's disease [Abstract]. Journal of Neurology, Neurosurgery & Psychiatry, 79, A12.

    Abstract

    The recognition of emotions of disgust, anger and fear have been shown to be significantly impaired in Huntington’s disease (eg,Sprengelmeyer et al, 1997, 2006; Gray et al, 1997; Milders et al, 2003,Montagne et al, 2006; Johnson et al, 2007; De Gelder et al, 2008). The relative impairment of these emotions might have implied a recognition impairment specific to negative emotions. Could the asymmetric recognition deficits be due not to the complexity of the emotion but rather reflect the complexity of the task? In the current study, 15 Huntington’s patients and 16 control subjects were presented with negative and positive non-speech emotional vocalisations that were to be identified as anger, fear, sadness, disgust, achievement, pleasure and amusement in a forced-choice paradigm. This experiment more accurately matched the negative emotions with positive emotions in a homogeneous modality. The resulting dually impaired ability of Huntington’s patients to identify negative and positive non-speech emotional vocalisations correctly provides evidence for an overall emotional recognition deficit in the disease. These results indicate that previous findings of a specificity in emotional recognition deficits might instead be due to the limitations of the visual modality. Previous experiments may have found an effect of emotional specificy due to the presence of a single positive emotion, happiness, in the midst of multiple negative emotions. In contrast with the previous literature, the study presented here points to a global deficit in the recognition of emotional sounds.
  • Rohrer, P. L., Bujok, R., Van Maastricht, L., & Bosker, H. R. (2024). The timing of beat gestures affects lexical stress perception in Spanish. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings Speech Prosody 2024 (pp. 702-706). doi:10.21437/SpeechProsody.2024-142.

    Abstract

    It has been shown that when speakers produce hand gestures, addressees are attentive towards these gestures, using them to facilitate speech processing. Even relatively simple “beat” gestures are taken into account to help process aspects of speech such as prosodic prominence. In fact, recent evidence suggests that the timing of a beat gesture can influence spoken word recognition. Termed the manual McGurk Effect, Dutch participants, when presented with lexical stress minimal pair continua in Dutch, were biased to hear lexical stress on the syllable that coincided with a beat gesture. However, little is known about how this manual McGurk effect would surface in languages other than Dutch, with different acoustic cues to prominence, and variable gestures. Therefore, this study tests the effect in Spanish where lexical stress is arguably even more important, being a contrastive cue in the regular verb conjugation system. Results from 24 participants corroborate the effect in Spanish, namely that when given the same auditory stimulus, participants were biased to perceive lexical stress on the syllable that visually co-occurred with a beat gesture. These findings extend the manual McGurk effect to a different language, emphasizing the impact of gestures' timing on prosody perception and spoken word recognition.
  • Rohrer, P. L., Hong, Y., & Bosker, H. R. (2024). Gestures time to vowel onset and change the acoustics of the word in Mandarin. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 866-870). doi:10.21437/SpeechProsody.2024-175.

    Abstract

    Recent research on multimodal language production has revealed that prominence in speech and gesture go hand-in-hand. Specifically, peaks in gesture (i.e., the apex) seem to closely coordinate with peaks in fundamental frequency (F0). The nature of this relationship may also be bi-directional, as it has also been shown that the production of gesture directly affects speech acoustics. However, most studies on the topic have largely focused on stress-based languages, where fundamental frequency has a prominence-lending function. Less work has been carried out on lexical tone languages such as Mandarin, where F0 is lexically distinctive. In this study, four native Mandarin speakers were asked to produce single monosyllabic CV words, taken from minimal lexical tone triplets (e.g., /pi1/, /pi2/, /pi3/), either with or without a beat gesture. Our analyses of the timing of the gestures showed that the gesture apex most stably occurred near vowel onset, with consonantal duration being the strongest predictor of apex placement. Acoustic analyses revealed that words produced with gesture showed raised F0 contours, greater intensity, and shorter durations. These findings further our understanding of gesture-speech alignment in typologically diverse languages, and add to the discussion about multimodal prominence.
  • Ronderos, C. R., Zhang, Y., & Rubio-Fernandez, P. (2024). Weighted parameters in demonstrative use: The case of Spanish teens and adults. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 3279-3286).
  • Rubio-Fernandez, P., Long, M., Shukla, V., Bhatia, V., Mahapatra, A., Ralekar, C., Ben-Ami, S., & Sinha, P. (2024). Multimodal communication in newly sighted children: An investigation of the relation between visual experience and pragmatic development. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 2560-2567).

    Abstract

    We investigated the relationship between visual experience and pragmatic development by testing the socio-communicative skills of a unique population: the Prakash children of India, who received treatment for congenital cataracts after years of visual deprivation. Using two different referential communication tasks, our study investigated Prakash' children ability to produce sufficiently informative referential expressions (e.g., ‘the green pear' or ‘the small plate') and pay attention to their interlocutor's face during the task (Experiment 1), as well as their ability to recognize a speaker's referential intent through non-verbal cues such as head turning and pointing (Experiment 2). Our results show that Prakash children have strong pragmatic skills, but do not look at their interlocutor's face as often as neurotypical children do. However, longitudinal analyses revealed an increase in face fixations, suggesting that over time, Prakash children come to utilize their improved visual skills for efficient referential communication.

    Additional information

    link to eScholarship
  • De Ruiter, L. E. (2008). How useful are polynomials for analyzing intonation? In Proceedings of Interspeech 2008 (pp. 785-789).

    Abstract

    This paper presents the first application of polynomial modeling as a means for validating phonological pitch accent labels to German data. It is compared to traditional phonetic analysis (measuring minima, maxima, alignment). The traditional method fares better in classification, but results are comparable in statistical accent pair testing. Robustness tests show that pitch correction is necessary in both cases. The approaches are discussed in terms of their practicability, applicability to other domains of research and interpretability of their results.
  • Sadakata, M., & McQueen, J. M. (2011). The role of variability in non-native perceptual learning of a Japanese geminate-singleton fricative contrast. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 873-876).

    Abstract

    The current study reports the enhancing effect of a high variability training procedure in the learning of a Japanese geminate-singleton fricative contrast. Dutch natives took part in a five-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. They heard either many repetitions of a limited set of words recorded by a single speaker (simple training) or fewer repetitions of a more variable set of words recorded by multiple speakers (variable training). Pre-post identification evaluations and a transfer test indicated clear benefits of the variable training.
  • Sander, J., Çetinçelik, M., Zhang, Y., Rowland, C. F., & Harmon, Z. (2024). Why does joint attention predict vocabulary acquisition? The answer depends on what coding scheme you use. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 1607-1613).

    Abstract

    Despite decades of study, we still know less than we would like about the association between joint attention (JA) and language acquisition. This is partly because of disagreements on how to operationalise JA. In this study, we examine the impact of applying two different, influential JA operationalisation schemes to the same dataset of child-caregiver interactions, to determine which yields a better fit to children's later vocabulary size. Two coding schemes— one defining JA in terms of gaze overlap and one in terms of social aspects of shared attention—were applied to video-recordings of dyadic naturalistic toy-play interactions (N=45). We found that JA was predictive of later production vocabulary when operationalised as shared focus (study 1), but also that its operationalisation as shared social awareness increased its predictive power (study 2). Our results emphasise the critical role of methodological choices in understanding how and why JA is associated with vocabulary size.
  • Sauermann, A., Höhle, B., Chen, A., & Järvikivi, J. (2011). Intonational marking of focus in different word orders in German children. In M. B. Washburn, K. McKinney-Bock, E. Varis, & A. Sawyer (Eds.), Proceedings of the 28th West Coast Conference on Formal Linguistics (pp. 313-322). Somerville, MA: Cascadilla Proceedings Project.

    Abstract

    The use of word order and intonation to mark focus in child speech has received some attention. However, past work usually examined each device separately or only compared the realizations of focused vs. non-focused constituents. This paper investigates the interaction between word order and intonation in the marking of different focus types in 4- to 5-year old German-speaking children and an adult control group. An answer-reconstruction task was used to elicit syntactic (word order) and intonational focus marking of subject and objects (locus of focus) in three focus types (broad, narrow, and contrastive focus). The results indicate that both children and adults used intonation to distinguish broad from contrastive focus but they differed in the marking of narrow focus. Further, both groups preferred intonation to word order as device for focus marking. But children showed an early sensitivity for the impact of focus type and focus location on word order variation and on phonetic means to mark focus.
  • Sauter, D., Eisner, F., Rosen, S., & Scott, S. K. (2008). The role of source and filter cues in emotion recognition in speech [Abstract]. Journal of the Acoustical Society of America, 123, 3739-3740.

    Abstract

    In the context of the source-filter theory of speech, it is well established that intelligibility is heavily reliant on information carried by the filter, that is, spectral cues (e.g., Faulkner et al., 2001; Shannon et al., 1995). However, the extraction of other types of information in the speech signal, such as emotion and identity, is less well understood. In this study we investigated the extent to which emotion recognition in speech depends on filterdependent cues, using a forced-choice emotion identification task at ten levels of noise-vocoding ranging between one and 32 channels. In addition, participants performed a speech intelligibility task with the same stimuli. Our results indicate that compared to speech intelligibility, emotion recognition relies less on spectral information and more on cues typically signaled by source variations, such as voice pitch, voice quality, and intensity. We suggest that, while the reliance on spectral dynamics is likely a unique aspect of human speech, greater phylogenetic continuity across species may be found in the communication of affect in vocalizations.
  • Sauter, D. (2008). The time-course of emotional voice processing [Abstract]. Neurocase, 14, 455-455.

    Abstract

    Research using event-related brain potentials (ERPs) has demonstrated an early differential effect in fronto-central regions when processing emotional, as compared to affectively neutral facial stimuli (e.g., Eimer & Holmes, 2002). In this talk, data demonstrating a similar effect in the auditory domain will be presented. ERPs were recorded in a one-back task where participants had to identify immediate repetitions of emotion category, such as a fearful sound followed by another fearful sound. The stimulus set consisted of non-verbal emotional vocalisations communicating positive and negative sounds, as well as neutral baseline conditions. Similarly to the facial domain, fear sounds as compared to acoustically controlled neutral sounds, elicited a frontally distributed positivity with an onset latency of about 150 ms after stimulus onset. These data suggest the existence of a rapid multi-modal frontocentral mechanism discriminating emotional from non-emotional human signals.
  • Scharenborg, O., & Cooke, M. P. (2008). Comparing human and machine recognition performance on a VCV corpus. In ISCA Tutorial and Research Workshop (ITRW) on "Speech Analysis and Processing for Knowledge Discovery".

    Abstract

    Listeners outperform ASR systems in every speech recognition task. However, what is not clear is where this human advantage originates. This paper investigates the role of acoustic feature representations. We test four (MFCCs, PLPs, Mel Filterbanks, Rate Maps) acoustic representations, with and without ‘pitch’ information, using the same backend. The results are compared with listener results at the level of articulatory feature classification. While no acoustic feature representation reached the levels of human performance, both MFCCs and Rate maps achieved good scores, with Rate maps nearing human performance on the classification of voicing. Comparing the results on the most difficult articulatory features to classify showed similarities between the humans and the SVMs: e.g., ‘dental’ was by far the least well identified by both groups. Overall, adding pitch information seemed to hamper classification performance.
  • Scharenborg, O. (2008). Modelling fine-phonetic detail in a computational model of word recognition. In INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association (pp. 1473-1476). ISCA Archive.

    Abstract

    There is now considerable evidence that fine-grained acoustic-phonetic detail in the speech signal helps listeners to segment a speech signal into syllables and words. In this paper, we compare two computational models of word recognition on their ability to capture and use this finephonetic detail during speech recognition. One model, SpeM, is phoneme-based, whereas the other, newly developed Fine- Tracker, is based on articulatory features. Simulations dealt with modelling the ability of listeners to distinguish short words (e.g., ‘ham’) from the longer words in which they are embedded (e.g., ‘hamster’). The simulations with Fine- Tracker showed that it was, like human listeners, able to distinguish between short words from the longer words in which they are embedded. This suggests that it is possible to extract this fine-phonetic detail from the speech signal and use it during word recognition.
  • Scharenborg, O., Mitterer, H., & McQueen, J. M. (2011). Perceptual learning of liquids. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 149-152).

    Abstract

    Previous research on lexically-guided perceptual learning has focussed on contrasts that differ primarily in local cues, such as plosive and fricative contrasts. The present research had two aims: to investigate whether perceptual learning occurs for a contrast with non-local cues, the /l/-/r/ contrast, and to establish whether STRAIGHT can be used to create ambiguous sounds on an /l/-/r/ continuum. Listening experiments showed lexically-guided learning about the /l/-/r/ contrast. Listeners can thus tune in to unusual speech sounds characterised by non-local cues. Moreover, STRAIGHT can be used to create stimuli for perceptual learning experiments, opening up new research possibilities. Index Terms: perceptual learning, morphing, liquids, human word recognition, STRAIGHT.
  • Schmidt, T., Duncan, S., Ehmer, O., Hoyt, J., Kipp, M., Loehr, D., Magnusson, M., Rose, T., & Sloetjes, H. (2008). An exchange format for multimodal annotations. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    This paper presents the results of a joint effort of a group of multimodality researchers and tool developers to improve the interoperability between several tools used for the annotation of multimodality. We propose a multimodal annotation exchange format, based on the annotation graph formalism, which is supported by import and export routines in the respective tools
  • Schmiedtova, B., & Flecken, M. (2008). The role of aspectual distinctions in event encoding: Implications for second language acquisition. In S. Müller-de Knop, & T. Mortelmans (Eds.), Pedagogical grammar (pp. 357-384). Berlin: Mouton de Gruyter.
  • Schuppler, B., Ernestus, M., Scharenborg, O., & Boves, L. (2008). Preparing a corpus of Dutch spontaneous dialogues for automatic phonetic analysis. In INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association (pp. 1638-1641). ISCA Archive.

    Abstract

    This paper presents the steps needed to make a corpus of Dutch spontaneous dialogues accessible for automatic phonetic research aimed at increasing our understanding of reduction phenomena and the role of fine phonetic detail. Since the corpus was not created with automatic processing in mind, it needed to be reshaped. The first part of this paper describes the actions needed for this reshaping in some detail. The second part reports the results of a preliminary analysis of the reduction phenomena in the corpus. For this purpose a phonemic transcription of the corpus was created by means of a forced alignment, first with a lexicon of canonical pronunciations and then with multiple pronunciation variants per word. In this study pronunciation variants were generated by applying a large set of phonetic processes that have been implicated in reduction to the canonical pronunciations of the words. This relatively straightforward procedure allows us to produce plausible pronunciation variants and to verify and extend the results of previous reduction studies reported in the literature.
  • Sekine, K. (2011). The development of spatial perspective in the description of large-scale environments. In G. Stam, & M. Ishino (Eds.), Integrating Gestures: The interdisciplinary nature of gesture (pp. 175-186). Amsterdam: John Benjamins Publishing Company.

    Abstract

    This research investigated developmental changes in children’s representations of large-scale environments as reflected in spontaneous gestures and speech produced during route descriptions Four-, five-, and six-year-olds (N = 122) described the route from their nursery school to their own homes. Analysis of the children’s gestures showed that some 5- and 6-year-olds produced gestures that represented survey mapping, and they were categorized as a survey group. Children who did not produce such gestures were categorized as a route group. A comparison of the two groups revealed no significant differences in speech indices, with the exception that the survey group showed significantly fewer right/left terms. As for gesture, the survey group produced more gestures than the route group. These results imply that an initial form of survey-map representation is acquired beginning at late preschool age.
  • Senft, G. (2008). The teaching of Tokunupei. In J. Kommers, & E. Venbrux (Eds.), Cultural styles of knowledge transmission: Essays in honour of Ad Borsboom (pp. 139-144). Amsterdam: Aksant.

    Abstract

    The paper describes how the documentation of a popular song of the adolescents of Tauwema in 1982 lead to the collection of the myth of Imdeduya and Yolina, one of the most important myths of the Trobriand Islands. When I returned to my fieldsite in 1989 Tokunupei, one of my best consultants in Tauwema, remembered my interest in the myth and provided me with further information on this topic. Tokunupei's teachings open up an important access to Trobriand eschatology.
  • Senft, G. (2008). Zur Bedeutung der Sprache für die Feldforschung. In B. Beer (Ed.), Methoden und Techniken der Feldforschung (pp. 103-118). Berlin: Reimer.
  • Senft, G. (2008). Event conceptualization and event report in serial verb constructions in Kilivila: Towards a new approach to research and old phenomenon. In G. Senft (Ed.), Serial verb constructions in Austronesian and Papuan languages (pp. 203-230). Canberra: Pacific Linguistics Publishers.
  • Senft, G. (2008). Introduction. In G. Senft (Ed.), Serial verb constructions in Austronesian and Papuan languages (pp. 1-15). Canberra: Pacific Linguistics Publishers.

Share this page