Publications

Displaying 101 - 200 of 272
  • Hashemzadeh, M., Kaufeld, G., White, M., Martin, A. E., & Fyshe, A. (2020). From language to language-ish: How brain-like is an LSTM representation of nonsensical language stimuli? In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 645-655). Association for Computational Linguistics.

    Abstract

    The representations generated by many mod-
    els of language (word embeddings, recurrent
    neural networks and transformers) correlate
    to brain activity recorded while people read.
    However, these decoding results are usually
    based on the brain’s reaction to syntactically
    and semantically sound language stimuli. In
    this study, we asked: how does an LSTM (long
    short term memory) language model, trained
    (by and large) on semantically and syntac-
    tically intact language, represent a language
    sample with degraded semantic or syntactic
    information? Does the LSTM representation
    still resemble the brain’s reaction? We found
    that, even for some kinds of nonsensical lan-
    guage, there is a statistically significant rela-
    tionship between the brain’s activity and the
    representations of an LSTM. This indicates
    that, at least in some instances, LSTMs and the
    human brain handle nonsensical data similarly.
  • De Heer Kloots, M., Carlson, D., Garcia, M., Kotz, S., Lowry, A., Poli-Nardi, L., de Reus, K., Rubio-García, A., Sroka, M., Varola, M., & Ravignani, A. (2020). Rhythmic perception, production and interactivity in harbour and grey seals. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 59-62). Nijmegen: The Evolution of Language Conferences.
  • Hoeksema, N., Villanueva, S., Mengede, J., Salazar-Casals, A., Rubio-García, A., Curcic-Blake, B., Vernes, S. C., & Ravignani, A. (2020). Neuroanatomy of the grey seal brain: Bringing pinnipeds into the neurobiological study of vocal learning. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 162-164). Nijmegen: The Evolution of Language Conferences.
  • Hoeksema, N., Wiesmann, M., Kiliaan, A., Hagoort, P., & Vernes, S. C. (2020). Bats and the comparative neurobiology of vocal learning. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 165-167). Nijmegen: The Evolution of Language Conferences.
  • Holler, J., Tutton, M., & Wilkin, K. (2011). Co-speech gestures in the process of meaning coordination. In Proceedings of the 2nd GESPIN - Gesture & Speech in Interaction Conference, Bielefeld, 5-7 Sep 2011.

    Abstract

    This study uses a classical referential communication task to
    investigate the role of co-speech gestures in the process of
    coordination. The study manipulates both the common ground between the interlocutors, as well as the visibility of the gestures they use. The findings show that co-speech gestures are an integral part of the referential utterances speakers
    produced with regard to both initial references as well as repeated references, and that the availability of gestures appears to impact on interlocutors’ referential oordination. The results are discussed with regard to past research on
    common ground as well as theories of gesture production.
  • Isbilen, E. S., McCauley, S. M., Kidd, E., & Christiansen, M. H. (2017). Testing statistical learning implicitly: A novel chunk-based measure of statistical learning. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 564-569). Austin, TX: Cognitive Science Society.

    Abstract

    Attempts to connect individual differences in statistical learning with broader aspects of cognition have received considerable attention, but have yielded mixed results. A possible explanation is that statistical learning is typically tested using the two-alternative forced choice (2AFC) task. As a meta-cognitive task relying on explicit familiarity judgments, 2AFC may not accurately capture implicitly formed statistical computations. In this paper, we adapt the classic serial-recall memory paradigm to implicitly test statistical learning in a statistically-induced chunking recall (SICR) task. We hypothesized that artificial language exposure would lead subjects to chunk recurring statistical patterns, facilitating recall of words from the input. Experiment 1 demonstrates that SICR offers more fine-grained insights into individual differences in statistical learning than 2AFC. Experiment 2 shows that SICR has higher test-retest reliability than that reported for 2AFC. Thus, SICR offers a more sensitive measure of individual differences, suggesting that basic chunking abilities may explain statistical learning.
  • Janse, E., Sennema, A., & Slis, A. (2000). Fast speech timing in Dutch: The durational correlates of lexical stress and pitch accent. In Proceedings of the VIth International Conference on Spoken Language Processing, Vol. III (pp. 251-254).

    Abstract

    n this study we investigated the durational correlates of lexical stress and pitch accent at normal and fast speech rate in Dutch. Previous literature on English shows that durations of lexically unstressed vowels are reduced more than stressed vowels when speakers increase their speech rate. We found that the same holds for Dutch, irrespective of whether the unstressed vowel is schwa or a "full" vowel. In the same line, we expected that vowels in words without a pitch accent would be shortened relatively more than vowels in words with a pitch accent. This was not the case: if anything, the accented vowels were shortened relatively more than the unaccented vowels. We conclude that duration is an important cue for lexical stress, but not for pitch accent.
  • Janse, E. (2000). Intelligibility of time-compressed speech: Three ways of time-compression. In Proceedings of the VIth International Conference on Spoken Language Processing, vol. III (pp. 786-789).

    Abstract

    Studies on fast speech have shown that word-level timing of fast speech differs from that of normal rate speech in that unstressed syllables are shortened more than stressed syllables as speech rate increases. An earlier experiment showed that the intelligibility of time-compressed speech could not be improved by making its temporal organisation closer to natural fast speech. To test the hypothesis that segmental intelligibility is more important than prosodic timing in listening to timecompressed speech, the intelligibility of bisyllabic words was tested in three time-compression conditions: either stressed and unstressed syllable were compressed to the same degree, or the stressed syllable was compressed more than the unstressed syllable, or the reverse. As was found before, imitating wordlevel timing of fast speech did not improve intelligibility over linear compression. However, the results did not confirm the hypothesis either: there was no difference in intelligibility between the three compression conditions. We conclude that segmental intelligibility plays an important role, but further research is necessary to decide between the contributions of prosody and segmental intelligibility to the word-level intelligibility of time-compressed speech.
  • Janse, E. (2009). Hearing and cognitive measures predict elderly listeners' difficulty ignoring competing speech. In M. Boone (Ed.), Proceedings of the International Conference on Acoustics (pp. 1532-1535).
  • Janzen, G., & Weststeijn, C. (2004). Neural representation of object location and route direction: An fMRI study. NeuroImage, 22(Supplement 1), e634-e635.
  • Janzen, G., & Van Turennout, M. (2004). Neuronale Markierung navigationsrelevanter Objekte im räumlichen Gedächtnis: Ein fMRT Experiment. In D. Kerzel (Ed.), Beiträge zur 46. Tagung experimentell arbeitender Psychologen (pp. 125-125). Lengerich: Pabst Science Publishers.
  • Jasmin, K., & Casasanto, D. (2011). The QWERTY effect: How stereo-typing shapes the mental lexicon. In L. Carlson, C. Holscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society. Austin, TX: Cognitive Science Society.
  • Jesse, A., & Mitterer, H. (2011). Pointing gestures do not influence the perception of lexical stress. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 2445-2448).

    Abstract

    We investigated whether seeing a pointing gesture influences the perceived lexical stress. A pitch contour continuum between the Dutch words “CAnon” (‘canon’) and “kaNON” (‘cannon’) was presented along with a pointing gesture during the first or the second syllable. Pointing gestures following natural recordings but not Gaussian functions influenced stress perception (Experiment 1 and 2), especially when auditory context preceded (Experiment 2). This was not replicated in Experiment 3. Natural pointing gestures failed to affect the categorization of a pitch peak timing continuum (Experiment 4). There is thus no convincing evidence that seeing a pointing gesture influences lexical stress perception.
  • Jesse, A., & Janse, E. (2009). Visual speech information aids elderly adults in stream segregation. In B.-J. Theobald, & R. Harvey (Eds.), Proceedings of the International Conference on Auditory-Visual Speech Processing 2009 (pp. 22-27). Norwich, UK: School of Computing Sciences, University of East Anglia.

    Abstract

    Listening to a speaker while hearing another speaker talks is a challenging task for elderly listeners. We show that elderly listeners over the age of 65 with various degrees of age-related hearing loss benefit in this situation from also seeing the speaker they intend to listen to. In a phoneme monitoring task, listeners monitored the speech of a target speaker for either the phoneme /p/ or /k/ while simultaneously hearing a competing speaker. Critically, on some trials, the target speaker was also visible. Elderly listeners benefited in their response times and accuracy levels from seeing the target speaker when monitoring for the less visible /k/, but more so when monitoring for the highly visible /p/. Visual speech therefore aids elderly listeners not only by providing segmental information about the target phoneme, but also by providing more global information that allows for better performance in this adverse listening situation.
  • Johns, T. G., Perera, R. M., Vitali, A. A., Vernes, S. C., & Scott, A. (2004). Phosphorylation of a glioma-specific mutation of the EGFR [Abstract]. Neuro-Oncology, 6, 317.

    Abstract

    Mutations of the epidermal growth factor receptor (EGFR) gene are found at a relatively high frequency in glioma, with the most common being the de2-7 EGFR (or EGFRvIII). This mutation arises from an in-frame deletion of exons 2-7, which removes 267 amino acids from the extracellular domain of the receptor. Despite being unable to bind ligand, the de2-7 EGFR is constitutively active at a low level. Transfection of human glioma cells with the de2-7 EGFR has little effect in vitro, but when grown as tumor xenografts this mutated receptor imparts a dramatic growth advantage. We mapped the phosphorylation pattern of de2-7 EGFR, both in vivo and in vitro, using a panel of antibodies specific for different phosphorylated tyrosine residues. Phosphorylation of de2-7 EGFR was detected constitutively at all tyrosine sites surveyed in vitro and in vivo, including tyrosine 845, a known target in the wild-type EGFR for src kinase. There was a substantial upregulation of phosphorylation at every yrosine residue of the de2-7 EGFR when cells were grown in vivo compared to the receptor isolated from cells cultured in vitro. Upregulation of phosphorylation at tyrosine 845 could be stimulated in vitro by the addition of specific components of the ECM via an integrindependent mechanism. These observations may partially explain why the growth enhancement mediated by de2-7 EGFR is largely restricted to the in vivo environment
  • Johnson, E. K., Jusczyk, P. W., Cutler, A., & Norris, D. (2000). The development of word recognition: The use of the possible-word constraint by 12-month-olds. In L. Gleitman, & A. Joshi (Eds.), Proceedings of CogSci 2000 (pp. 1034). London: Erlbaum.
  • Kapatsinski, V., & Harmon, Z. (2017). A Hebbian account of entrenchment and (over)-extension in language learning. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Meeting of the Cognitive Science Society (CogSci 2017) (pp. 2366-2371). Austin, TX: Cognitive Science Society.

    Abstract

    In production, frequently used words are preferentially extended to new, though related meanings. In comprehension, frequent exposure to a word instead makes the learner confident that all of the word’s legitimate uses have been experienced, resulting in an entrenched form-meaning mapping between the word and its experienced meaning(s). This results in a perception-production dissociation, where the forms speakers are most likely to map onto a novel meaning are precisely the forms that they believe can never be used that way. At first glance, this result challenges the idea of bidirectional form-meaning mappings, assumed by all current approaches to linguistic theory. In this paper, we show that bidirectional form-meaning mappings are not in fact challenged by this production-perception dissociation. We show that the production-perception dissociation is expected even if learners of the lexicon acquire simple symmetrical form-meaning associations through simple Hebbian learning.
  • Karadöller, D. Z., Sumer, B., & Ozyurek, A. (2017). Effects of delayed language exposure on spatial language acquisition by signing children and adults. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 2372-2376). Austin, TX: Cognitive Science Society.

    Abstract

    Deaf children born to hearing parents are exposed to language input quite late, which has long-lasting effects on language production. Previous studies with deaf individuals mostly focused on linguistic expressions of motion events, which have several event components. We do not know if similar effects emerge in simple events such as descriptions of spatial configurations of objects. Moreover, previous data mainly come from late adult signers. There is not much known about language development of late signing children soon after learning sign language. We compared simple event descriptions of late signers of Turkish Sign Language (adults, children) to age-matched native signers. Our results indicate that while late signers in both age groups are native-like in frequency of expressing a relational encoding, they lag behind native signers in using morphologically complex linguistic forms compared to other simple forms. Late signing children perform similar to adults and thus showed no development over time.
  • Kember, H., Grohe, A.-.-K., Zahner, K., Braun, B., Weber, A., & Cutler, A. (2017). Similar prosodic structure perceived differently in German and English. In Proceedings of Interspeech 2017 (pp. 1388-1392). doi:10.21437/Interspeech.2017-544.

    Abstract

    English and German have similar prosody, but their speakers realize some pitch falls (not rises) in subtly different ways. We here test for asymmetry in perception. An ABX discrimination task requiring F0 slope or duration judgements on isolated vowels revealed no cross-language difference in duration or F0 fall discrimination, but discrimination of rises (realized similarly in each language) was less accurate for English than for German listeners. This unexpected finding may reflect greater sensitivity to rising patterns by German listeners, or reduced sensitivity by English listeners as a result of extensive exposure to phrase-final rises (“uptalk”) in their language
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses? A corpus study revealing unexpected rigidity. In S. Kepser, & M. Reis (Eds.), Pre-Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: Niemeyer.
  • Kempen, G. (2004). Interactive visualization of syntactic structure assembly for grammar-intensive first- and second-language instruction. In R. Delmonte, P. Delcloque, & S. Tonelli (Eds.), Proceedings of InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced language learning systems (pp. 183-186). Venice: University of Venice.
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses?: A corpus study revealing unexpected rigidity. In Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: University of Tübingen.
  • Kempen, G. (2004). Human grammatical coding: Shared structure formation resources for grammatical encoding and decoding. In Cuny 2004 - The 17th Annual CUNY Conference on Human Sentence Processing. March 25-27, 2004. University of Maryland (pp. 66).
  • Kempen, G., & Hoenkamp, E. (1982). Incremental sentence generation: Implications for the structure of a syntactic processor. In J. Horecký (Ed.), COLING 82. Proceedings of the Ninth International Conference on Computational Linguistics, Prague, July 5-10, 1982 (pp. 151-156). Amsterdam: North-Holland.

    Abstract

    Human speakers often produce sentences incrementally. They can start speaking having in mind only a fragmentary idea of what they want to say, and while saying this they refine the contents underlying subsequent parts of the utterance. This capability imposes a number of constraints on the design of a syntactic processor. This paper explores these constraints and evaluates some recent computational sentence generators from the perspective of incremental production.
  • Khetarpal, N., Majid, A., & Regier, T. (2009). Spatial terms reflect near-optimal spatial categories. In N. Taatgen, & H. Van Rijn (Eds.), Proceedings of the Thirty-First Annual Conference of the Cognitive Science Society (pp. 2396-2401). Austin, TX: Cognitive Science Society.

    Abstract

    Spatial terms in the world’s languages appear to reflect both universal conceptual tendencies and linguistic convention. A similarly mixed picture in the case of color naming has been accounted for in terms of near-optimal partitions of color space. Here, we demonstrate that this account generalizes to spatial terms. We show that the spatial terms of 9 diverse languages near-optimally partition a similarity space of spatial meanings, just as color terms near-optimally partition color space. This account accommodates both universal tendencies and cross-language differences in spatial category extension, and identifies general structuring principles that appear to operate across different semantic domains.
  • Khoe, Y. H., Tsoukala, C., Kootstra, G. J., & Frank, S. L. (2020). Modeling cross-language structural priming in sentence production. In T. C. Stewart (Ed.), Proceedings of the 18th Annual Meeting of the International Conference on Cognitive Modeling (pp. 131-137). University Park, PA, USA: The Penn State Applied Cognitive Science Lab.

    Abstract

    A central question in the psycholinguistic study of multilingualism is how syntax is shared across languages. We implement a model to investigate whether error-based implicit learning can provide an account of cross-language structural priming. The model is based on the Dual-path model of
    sentence-production (Chang, 2002). We implement our model using the Bilingual version of Dual-path (Tsoukala, Frank, & Broersma, 2017). We answer two main questions: (1) Can structural priming of active and passive constructions occur between English and Spanish in a bilingual version of the Dual-
    path model? (2) Does cross-language priming differ quantitatively from within-language priming in this model? Our results show that cross-language priming does occur in the model. This finding adds to the viability of implicit learning as an account of structural priming in general and cross-language
    structural priming specifically. Furthermore, we find that the within-language priming effect is somewhat stronger than the cross-language effect. In the context of mixed results from
    behavioral studies, we interpret the latter finding as an indication that the difference between cross-language and within-
    language priming is small and difficult to detect statistically.
  • Klein, W. (Ed.). (2004). Philologie auf neuen Wegen [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 136.
  • Klein, W. (Ed.). (2004). Universitas [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik (LiLi), 134.
  • Klein, W. (2000). Changing concepts of the nature-nurture debate. In R. Hide, J. Mittelstrass, & W. Singer (Eds.), Changing concepts of nature at the turn of the millenium: Proceedings plenary session of the Pontifical academy of sciences, 26-29 October 1998 (pp. 289-299). Vatican City: Pontificia Academia Scientiarum.
  • Klein, W. (Ed.). (1989). Kindersprache [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (73).
  • Klein, W. (Ed.). (2000). Sprache des Rechts [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (118).
  • Klein, W., & Meibauer, J. (Eds.). (2011). Spracherwerb und Kinderliteratur [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 162.
  • Klein, W. (Ed.). (1985). Schriftlichkeit [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (59).
  • Klein, W., & Dimroth, C. (Eds.). (2009). Worauf kann sich der Sprachunterricht stützen? [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 153.
  • Klein, W. (Ed.). (1982). Zweitspracherwerb [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (45).
  • Koenig, A., Ringersma, J., & Trilsbeek, P. (2009). The Language Archiving Technology domain. In Z. Vetulani (Ed.), Human Language Technologies as a Challenge for Computer Science and Linguistics (pp. 295-299).

    Abstract

    The Max Planck Institute for Psycholinguistics (MPI) manages an archive of linguistic research data with a current size of almost 20 Terabytes. Apart from in-house researchers other projects also store their data in the archive, most notably the Documentation of Endangered Languages (DoBeS) projects. The archive is available online and can be accessed by anybody with Internet access. To be able to manage this large amount of data the MPI's technical group has developed a software suite called Language Archiving Technology (LAT) that on the one hand helps researchers and archive managers to manage the data and on the other hand helps users in enriching their primary data with additional layers. All the MPI software is Java-based and developed according to open source principles (GNU, 2007). All three major operating systems (Windows, Linux, MacOS) are supported and the software works similarly on all of them. As the archive is online, many of the tools, especially the ones for accessing the data, are browser based. Some of these browser-based tools make use of Adobe Flex to create nice-looking GUIs. The LAT suite is a complete set of management and enrichment tools, and given the interaction between the tools the result is a complete LAT software domain. Over the last 10 years, this domain has proven its functionality and use, and is being deployed to servers in other institutions. This deployment is an important step in getting the archived resources back to the members of the speech communities whose languages are documented. In the paper we give an overview of the tools of the LAT suite and we describe their functionality and role in the integrated process of archiving, management and enrichment of linguistic data.
  • Lai, V. T., Hagoort, P., & Casasanto, D. (2011). Affective and non-affective meaning in words and pictures. In L. Carlson, C. Holscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (pp. 390-395). Austin, TX: Cognitive Science Society.
  • Lansner, A., Sandberg, A., Petersson, K. M., & Ingvar, M. (2000). On forgetful attractor network memories. In H. Malmgren, M. Borga, & L. Niklasson (Eds.), Artificial neural networks in medicine and biology: Proceedings of the ANNIMAB-1 Conference, Göteborg, Sweden, 13-16 May 2000 (pp. 54-62). Heidelberg: Springer Verlag.

    Abstract

    A recurrently connected attractor neural network with a Hebbian learning rule is currently our best ANN analogy for a piece cortex. Functionally biological memory operates on a spectrum of time scales with regard to induction and retention, and it is modulated in complex ways by sub-cortical neuromodulatory systems. Moreover, biological memory networks are commonly believed to be highly distributed and engage many co-operating cortical areas. Here we focus on the temporal aspects of induction and retention of memory in a connectionist type attractor memory model of a piece of cortex. A continuous time, forgetful Bayesian-Hebbian learning rule is described and compared to the characteristics of LTP and LTD seen experimentally. More generally, an attractor network implementing this learning rule can operate as a long-term, intermediate-term, or short-term memory. Modulation of the print-now signal of the learning rule replicates some experimental memory phenomena, like e.g. the von Restorff effect.
  • Lattenkamp, E. Z., Linnenschmidt, M., Mardus, E., Vernes, S. C., Wiegrebe, L., & Schutte, M. (2020). Impact of auditory feedback on bat vocal development. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 249-251). Nijmegen: The Evolution of Language Conferences.
  • Lausberg, H., & Sloetjes, H. (2009). NGCS/ELAN - Coding movement behaviour in psychotherapy [Meeting abstract]. PPmP - Psychotherapie · Psychosomatik · Medizinische Psychologie, 59: A113, 103.

    Abstract

    Individual and interactive movement behaviour (non-verbal behaviour / communication) specifically reflects implicit processes in psychotherapy [1,4,11]. However, thus far, the registration of movement behaviour has been a methodological challenge. We will present a coding system combined with an annotation tool for the analysis of movement behaviour during psychotherapy interviews [9]. The NGCS coding system enables to classify body movements based on their kinetic features alone [5,7]. The theoretical assumption behind the NGCS is that its main kinetic and functional movement categories are differentially associated with specific psychological functions and thus, have different neurobiological correlates [5-8]. ELAN is a multimodal annotation tool for digital video media [2,3,12]. The NGCS / ELAN template enables to link any movie to the same coding system and to have different raters independently work on the same file. The potential of movement behaviour analysis as an objective tool for psychotherapy research and for supervision in the psychosomatic practice is discussed by giving examples of the NGCS/ELAN analyses of psychotherapy sessions. While the quality of kinetic turn-taking and the therapistrsquor;s (implicit) adoption of the patientrsquor;s movements may predict therapy outcome, changes in the patientrsquor;s movement behaviour pattern may indicate changes in cognitive concepts and emotional states and thus, may help to identify therapeutically relevant processes [10].
  • Lee, R., Chambers, C. G., Huettig, F., & Ganea, P. A. (2017). Children’s semantic and world knowledge overrides fictional information during anticipatory linguistic processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Meeting of the Cognitive Science Society (CogSci 2017) (pp. 730-735). Austin, TX: Cognitive Science Society.

    Abstract

    Using real-time eye-movement measures, we asked how a fantastical discourse context competes with stored representations of semantic and world knowledge to influence children's and adults' moment-by-moment interpretation of a story. Seven-year- olds were less effective at bypassing stored semantic and world knowledge during real-time interpretation than adults. Nevertheless, an effect of discourse context on comprehension was still apparent.
  • Lei, L., Raviv, L., & Alday, P. M. (2020). Using spatial visualizations and real-world social networks to understand language evolution and change. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 252-254). Nijmegen: The Evolution of Language Conferences.
  • Lenkiewicz, P., Pereira, M., Freire, M. M., & Fernandes, J. (2009). A new 3D image segmentation method for parallel architectures. In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo [ICME 2009] June 28 – July 3, 2009, New York (pp. 1813-1816).

    Abstract

    This paper presents a novel model for 3D image segmentation and reconstruction. It has been designed with the aim to be implemented over a computer cluster or a multi-core platform. The required features include a nearly absolute independence between the processes participating in the segmentation task and providing amount of work as equal as possible for all the participants. As a result, it is avoid many drawbacks often encountered when performing a parallelization of an algorithm that was constructed to operate in a sequential manner. Furthermore, the proposed algorithm based on the new segmentation model is efficient and shows a very good, nearly linear performance growth along with the growing number of processing units.
  • Lenkiewicz, P., Wittenburg, P., Schreer, O., Masneri, S., Schneider, D., & Tschöpel, S. (2011). Application of audio and video processing methods for language research. In Proceedings of the conference Supporting Digital Humanities 2011 [SDH 2011], Copenhagen, Denmark, November 17-18, 2011.

    Abstract

    Annotations of media recordings are the grounds for linguistic research. Since creating those annotations is a very laborious task, reaching 100 times longer than the length of the annotated media, innovative audio and video processing algorithms are needed, in order to improve the efficiency and quality of annotation process. The AVATecH project, started by the Max-Planck Institute for Psycholinguistics (MPI) and the Fraunhofer institutes HHI and IAIS, aims at significantly speeding up the process of creating annotations of audio-visual data for humanities research. In order for this to be achieved a range of state-of-the-art audio and video pattern recognition algorithms have been developed and integrated into widely used ELAN annotation tool. To address the problem of heterogeneous annotation tasks and recordings we provide modular components extended by adaptation and feedback mechanisms to achieve competitive annotation quality within significantly less annotation time.
  • Lenkiewicz, P., Wittenburg, P., Gebre, B. G., Lenkiewicz, A., Schreer, O., & Masneri, S. (2011). Application of video processing methods for linguistic research. In Z. Vetulani (Ed.), Human language technologies as a challenge for computer science and linguistics. Proceedings of the 5th Language and Technology Conference (LTC 2011), November 25-27, 2011, Poznań, Poland (pp. 561-564).

    Abstract

    Evolution and changes of all modern languages is a well-known fact. However, recently it is reaching dynamics never seen before, which results in loss of the vast amount of information encoded in every language. In order to preserve such heritage, properly annotated recordings of world languages are necessary. Since creating those annotations is a very laborious task, reaching times 100 longer than the length of the annotated media, innovative video processing algorithms are needed, in order to improve the efficiency and quality of annotation process.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2011). Extended whole mesh deformation model: Full 3D processing. In Proceedings of the 2011 IEEE International Conference on Image Processing (pp. 1633-1636).

    Abstract

    Processing medical data has always been an interesting field that has shown the need for effective image segmentation methods. Modern medical image segmentation solutions are focused on 3D image volumes, which originate at advanced acquisition devices. Operating on such data in a 3D envi- ronment is essential in order to take the full advantage of the available information. In this paper we present an extended version of our 3D image segmentation and reconstruction model that belongs to the family of Deformable Models and is capable of processing large image volumes in competitive times and in fully 3D environment, offering a big level of automation of the process and a high precision of results. It is also capable of handling topology changes and offers a very good scalability on multi-processing unit architectures. We present a description of the model and show its capabilities in the field of medical image processing.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2009). The dynamic topology changes model for unsupervised image segmentation. In Proceedings of the 11th IEEE International Workshop on Multimedia Signal Processing (MMSP'09) (pp. 1-5).

    Abstract

    Deformable models are a popular family of image segmentation techniques, which has been gaining significant focus in the last two decades, serving both for real-world applications as well as the base for research work. One of the features that the deformable models offer and that is considered a much desired one, is the ability to change their topology during the segmentation process. Using this characteristic it is possible to perform segmentation of objects with discontinuities in their bodies or to detect an undefined number of objects in the scene. In this paper we present our model for handling the topology changes in image segmentation methods based on the Active Volumes solution. The said model is capable of performing the changes in the structure of objects while the segmentation progresses, what makes it efficient and suitable for implementations over powerful execution environment, like multi-core architectures or computer clusters.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2009). The whole mesh Deformation Model for 2D and 3D image segmentation. In Proceedings of the 2009 IEEE International Conference on Image Processing (ICIP 2009) (pp. 4045-4048).

    Abstract

    In this paper we present a novel approach for image segmentation using Active Nets and Active Volumes. Those solutions are based on the Deformable Models, with slight difference in the method for describing the shapes of interests - instead of using a contour or a surface they represented the segmented objects with a mesh structure, which allows to describe not only the surface of the objects but also to model their interiors. This is obtained by dividing the nodes of the mesh in two categories, namely internal and external ones, which will be responsible for two different tasks. In our new approach we propose to negate this separation and use only one type of nodes. Using that assumption we manage to significantly shorten the time of segmentation while maintaining its quality.
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Levinson, S. C. (2000). Language as nature and language as art. In J. Mittelstrass, & W. Singer (Eds.), Proceedings of the Symposium on ‘Changing concepts of nature and the turn of the Millennium (pp. 257-287). Vatican City: Pontificae Academiae Scientiarium Scripta Varia.
  • Levinson, S. C. (2000). H.P. Grice on location on Rossel Island. In S. S. Chang, L. Liaw, & J. Ruppenhofer (Eds.), Proceedings of the 25th Annual Meeting of the Berkeley Linguistic Society (pp. 210-224). Berkeley: Berkeley Linguistic Society.
  • Levshina, N. (2020). How tight is your language? A semantic typology based on Mutual Information. In K. Evang, L. Kallmeyer, R. Ehren, S. Petitjean, E. Seyffarth, & D. Seddah (Eds.), Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (pp. 70-78). Düsseldorf, Germany: Association for Computational Linguistics. doi:10.18653/v1/2020.tlt-1.7.

    Abstract

    Languages differ in the degree of semantic flexibility of their syntactic roles. For example, Eng-
    lish and Indonesian are considered more flexible with regard to the semantics of subjects,
    whereas German and Japanese are less flexible. In Hawkins’ classification, more flexible lan-
    guages are said to have a loose fit, and less flexible ones are those that have a tight fit. This
    classification has been based on manual inspection of example sentences. The present paper
    proposes a new, quantitative approach to deriving the measures of looseness and tightness from
    corpora. We use corpora of online news from the Leipzig Corpora Collection in thirty typolog-
    ically and genealogically diverse languages and parse them syntactically with the help of the
    Universal Dependencies annotation software. Next, we compute Mutual Information scores for
    each language using the matrices of lexical lemmas and four syntactic dependencies (intransi-
    tive subjects, transitive subject, objects and obliques). The new approach allows us not only to
    reproduce the results of previous investigations, but also to extend the typology to new lan-
    guages. We also demonstrate that verb-final languages tend to have a tighter relationship be-
    tween lexemes and syntactic roles, which helps language users to recognize thematic roles early
    during comprehension.

    Additional information

    full text via ACL website
  • Little, H., Perlman, M., & Eryilmaz, K. (2017). Repeated interactions can lead to more iconic signals. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 760-765). Austin, TX: Cognitive Science Society.

    Abstract

    Previous research has shown that repeated interactions can cause iconicity in signals to reduce. However, data from several recent studies has shown the opposite trend: an increase in iconicity as the result of repeated interactions. Here, we discuss whether signals may become less or more iconic as a result of the modality used to produce them. We review several recent experimental results before presenting new data from multi-modal signals, where visual input creates audio feedback. Our results show that the growth in iconicity present in the audio information may come at a cost to iconicity in the visual information. Our results have implications for how we think about and measure iconicity in artificial signalling experiments. Further, we discuss how iconicity in real world speech may stem from auditory, kinetic or visual information, but iconicity in these different modalities may conflict.
  • Little, H. (Ed.). (2017). Special Issue on the Emergence of Sound Systems [Special Issue]. The Journal of Language Evolution, 2(1).
  • MacDonald, K., Räsänen, O., Casillas, M., & Warlaumont, A. S. (2020). Measuring prosodic predictability in children’s home language environments. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Virtual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 695-701). Montreal, QB: Cognitive Science Society.

    Abstract

    Children learn language from the speech in their home environment. Recent work shows that more infant-directed speech
    (IDS) leads to stronger lexical development. But what makes IDS a particularly useful learning signal? Here, we expand on an attention-based account first proposed by Räsänen et al. (2018): that prosodic modifications make IDS less predictable, and thus more interesting. First, we reproduce the critical finding from Räsänen et al.: that lab-recorded IDS pitch is less predictable compared to adult-directed speech (ADS). Next, we show that this result generalizes to the home language environment, finding that IDS in daylong recordings is also less predictable than ADS but that this pattern is much less robust than for IDS recorded in the lab. These results link experimental work on attention and prosodic modifications of IDS to real-world language-learning environments, highlighting some challenges of scaling up analyses of IDS to larger datasets that better capture children’s actual input.
  • Yu, J., Mailhammer, R., & Cutler, A. (2020). Vocabulary structure affects word recognition: Evidence from German listeners. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 474-478). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-97.

    Abstract

    Lexical stress is realised similarly in English, German, and
    Dutch. On a suprasegmental level, stressed syllables tend to be
    longer and more acoustically salient than unstressed syllables;
    segmentally, vowels in unstressed syllables are often reduced.
    The frequency of unreduced unstressed syllables (where only
    the suprasegmental cues indicate lack of stress) however,
    differs across the languages. The present studies test whether
    listener behaviour is affected by these vocabulary differences,
    by investigating German listeners’ use of suprasegmental cues
    to lexical stress in German and English word recognition. In a
    forced-choice identification task, German listeners correctly
    assigned single-syllable fragments (e.g., Kon-) to one of two
    words differing in stress (KONto, konZEPT). Thus, German
    listeners can exploit suprasegmental information for
    identifying words. German listeners also performed above
    chance in a similar task in English (with, e.g., DIver, diVERT),
    i.e., their sensitivity to these cues also transferred to a nonnative
    language. An English listener group, in contrast, failed
    in the English fragment task. These findings mirror vocabulary
    patterns: German has more words with unreduced unstressed
    syllables than English does.
  • Majid, A., Van Staden, M., & Enfield, N. J. (2004). The human body in cognition, brain, and typology. In K. Hovie (Ed.), Forum Handbook, 4th International Forum on Language, Brain, and Cognition - Cognition, Brain, and Typology: Toward a Synthesis (pp. 31-35). Sendai: Tohoku University.

    Abstract

    The human body is unique: it is both an object of perception and the source of human experience. Its universality makes it a perfect resource for asking questions about how cognition, brain and typology relate to one another. For example, we can ask how speakers of different languages segment and categorize the human body. A dominant view is that body parts are “given” by visual perceptual discontinuities, and that words are merely labels for these visually determined parts (e.g., Andersen, 1978; Brown, 1976; Lakoff, 1987). However, there are problems with this view. First it ignores other perceptual information, such as somatosensory and motoric representations. By looking at the neural representations of sesnsory representations, we can test how much of the categorization of the human body can be done through perception alone. Second, we can look at language typology to see how much universality and variation there is in body-part categories. A comparison of a range of typologically, genetically and areally diverse languages shows that the perceptual view has only limited applicability (Majid, Enfield & van Staden, in press). For example, using a “coloring-in” task, where speakers of seven different languages were given a line drawing of a human body and asked to color in various body parts, Majid & van Staden (in prep) show that languages vary substantially in body part segmentation. For example, Jahai (Mon-Khmer) makes a lexical distinction between upper arm, lower arm, and hand, but Lavukaleve (Papuan Isolate) has just one word to refer to arm, hand, and leg. This shows that body part categorization is not a straightforward mapping of words to visually determined perceptual parts.
  • Majid, A., Van Staden, M., Boster, J. S., & Bowerman, M. (2004). Event categorization: A cross-linguistic perspective. In K. Forbus, D. Gentner, & T. Tegier (Eds.), Proceedings of the 26th Annual Meeting of the Cognitive Science Society (pp. 885-890). Mahwah, NJ: Erlbaum.

    Abstract

    Many studies in cognitive science address how people categorize objects, but there has been comparatively little research on event categorization. This study investigated the categorization of events involving material destruction, such as “cutting” and “breaking”. Speakers of 28 typologically, genetically, and areally diverse languages described events shown in a set of video-clips. There was considerable cross-linguistic agreement in the dimensions along which the events were distinguished, but there was variation in the number of categories and the placement of their boundaries.
  • Majid, A., & Levinson, S. C. (Eds.). (2011). The senses in language and culture [Special Issue]. The Senses & Society, 6(1).
  • Majid, A., & Levinson, S. C. (2011). The language of perception across cultures [Abstract]. Abstracts of the XXth Congress of European Chemoreception Research Organization, ECRO-2010. Publ. in Chemical Senses, 36(1), E7-E8.

    Abstract

    How are the senses structured by the languages we speak, the cultures we inhabit? To what extent is the encoding of perceptual experiences in languages a matter of how the mind/brain is ―wired-up‖ and to what extent is it a question of local cultural preoccupation? The ―Language of Perception‖ project tests the hypothesis that some perceptual domains may be more ―ineffable‖ – i.e. difficult or impossible to put into words – than others. While cognitive scientists have assumed that proximate senses (olfaction, taste, touch) are more ineffable than distal senses (vision, hearing), anthropologists have illustrated the exquisite variation and elaboration the senses achieve in different cultural milieus. The project is designed to test whether the proximate senses are universally ineffable – suggesting an architectural constraint on cognition – or whether they are just accidentally so in Indo-European languages, so expanding the role of cultural interests and preoccupations. To address this question, a standardized set of stimuli of color patches, geometric shapes, simple sounds, tactile textures, smells and tastes have been used to elicit descriptions from speakers of more than twenty languages—including three sign languages. The languages are typologically, genetically and geographically diverse, representing a wide-range of cultures. The communities sampled vary in subsistence modes (hunter-gatherer to industrial), ecological zones (rainforest jungle to desert), dwelling types (rural and urban), and various other parameters. We examine how codable the different sensory modalities are by comparing how consistent speakers are in how they describe the materials in each modality. Our current analyses suggest that taste may, in fact, be the most codable sensorial domain across languages. Moreover, we have identified exquisite elaboration in the olfactory domains in some cultural settings, contrary to some contemporary predictions within the cognitive sciences. These results suggest that differential codability may be at least partly the result of cultural preoccupation. This shows that the senses are not just physiological phenomena but are constructed through linguistic, cultural and social practices.
  • Malt, B. C., Ameel, E., Gennari, S., Imai, M., Saji, N., & Majid, A. (2011). Do words reveal concepts? In L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 519-524). Austin, TX: Cognitive Science Society.

    Abstract

    To study concepts, cognitive scientists must first identify some. The prevailing assumption is that they are revealed by words such as triangle, table, and robin. But languages vary dramatically in how they carve up the world by name. Either ordinary concepts must be heavily language-dependent or names cannot be a direct route to concepts. We asked English, Dutch, Spanish, and Japanese speakers to name videos of human locomotion and judge their similarities. We investigated what name inventories and scaling solutions on name similarity and on physical similarity for the groups individually and together suggest about the underlying concepts. Aggregated naming and similarity solutions converged on results distinct from the answers suggested by the word inventories and scaling solutions of any single language. Words such as triangle, table, and robin can help identify the conceptual space of a domain, but they do not directly reveal units of knowledge usefully considered 'concepts'.
  • de Marneffe, M.-C., Tomlinson, J. J., Tice, M., & Sumner, M. (2011). The interaction of lexical frequency and phonetic variation in the perception of accented speech. In L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 3575-3580). Austin, TX: Cognitive Science Society.

    Abstract

    How listeners understand spoken words despite massive variation in the speech signal is a central issue for linguistic theory. A recent focus on lexical frequency and specificity has proved fruitful in accounting for this phenomenon. Speech perception, though, is a multi-faceted process and likely incorporates a number of mechanisms to map a variable signal to meaning. We examine a well-established language use factor — lexical frequency — and how this factor is integrated with phonetic variability during the perception of accented speech. We show that an integrated perspective highlights a low-level perceptual mechanism that accounts for the perception of accented speech absent native contrasts, while shedding light on the use of interactive language factors in the perception of spoken words.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech rate affects perception depends on who is talking. In Proceedings of Interspeech 2017 (pp. 586-590). doi:10.21437/Interspeech.2017-1517.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /ɑ/-/a:/. In Experiment 1, one low-rate group listened to 'neutral' rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the 'neutral' trials revealed that the low-rate group reported a higher proportion of /a:/ in A's 'neutral' speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one's own speech rate also contributes to effects of long-term tracking of rate. Here, talker B's speech was replaced by playback of participants' own fast or slow speech. No evidence was found that one's own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • Matsuo, A. (2004). Young children's understanding of ongoing vs. completion in present and perfective participles. In J. v. Kampen, & S. Baauw (Eds.), Proceedings of GALA 2003 (pp. 305-316). Utrecht: Netherlands Graduate School of Linguistics (LOT).
  • McDonough, J., Lehnert-LeHouillier, H., & Bardhan, N. P. (2009). The perception of nasalized vowels in American English: An investigation of on-line use of vowel nasalization in lexical access. In Nasal 2009.

    Abstract

    The goal of the presented study was to investigate the use of coarticulatory vowel nasalization in lexical access by native speakers of American English. In particular, we compare the use of coart culatory place of articulation cues to that of coarticulatory vowel nasalization. Previous research on lexical access has shown that listeners use cues to the place of articulation of a postvocalic stop in the preceding vowel. However, vowel nasalization as cue to an upcoming nasal consonant has been argued to be a more complex phenomenon. In order to establish whether coarticulatory vowel nasalization aides in the process of lexical access in the same way as place of articulation cues do, we conducted two perception experiments: an off-line 2AFC discrimination task and an on-line eyetracking study using the visual world paradigm. The results of our study suggest that listeners are indeed able to use vowel nasalization in similar ways to place of articulation information, and that both types of cues aide in lexical access.
  • McQueen, J. M., Cutler, A., & Norris, D. (2000). Positive and negative influences of the lexicon on phonemic decision-making. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 3 (pp. 778-781). Beijing: China Military Friendship Publish.

    Abstract

    Lexical knowledge influences how human listeners make decisions about speech sounds. Positive lexical effects (faster responses to target sounds in words than in nonwords) are robust across several laboratory tasks, while negative effects (slower responses to targets in more word-like nonwords than in less word-like nonwords) have been found in phonetic decision tasks but not phoneme monitoring tasks. The present experiments tested whether negative lexical effects are therefore a task-specific consequence of the forced choice required in phonetic decision. We compared phoneme monitoring and phonetic decision performance using the same Dutch materials in each task. In both experiments there were positive lexical effects, but no negative lexical effects. We observe that in all studies showing negative lexical effects, the materials were made by cross-splicing, which meant that they contained perceptual evidence supporting the lexically-consistent phonemes. Lexical knowledge seems to influence phonemic decision-making only when there is evidence for the lexically-consistent phoneme in the speech signal.
  • McQueen, J. M., Cutler, A., & Norris, D. (2000). Why Merge really is autonomous and parsimonious. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 47-50). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    We briefly describe the Merge model of phonemic decision-making, and, in the light of general arguments about the possible role of feedback in spoken-word recognition, defend Merge's feedforward structure. Merge not only accounts adequately for the data, without invoking feedback connections, but does so in a parsimonious manner.
  • Mengede, J., Devanna, P., Hörpel, S. G., Firzla, U., & Vernes, S. C. (2020). Studying the genetic bases of vocal learning in bats. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 280-282). Nijmegen: The Evolution of Language Conferences.
  • Mitterer, H. (2011). Social accountability influences phonetic alignment. Journal of the Acoustical Society of America. Program abstracts of the 162nd Meeting of the Acoustical Society of America, 130(4), 2442.

    Abstract

    Speakers tend to take over the articulatory habits of their interlocutors [e.g., Pardo, JASA (2006)]. This phonetic alignment could be the consequence of either a social mechanism or a direct and automatic link between speech perception and production. The latter assumes that social variables should have little influence on phonetic alignment. To test that participants were engaged in a "cloze task" (i.e., Stimulus: "In fantasy movies, silver bullets are used to kill ..." Response: "werewolves") with either one or four interlocutors. Given findings with the Asch-conformity paradigm in social psychology, multiple consistent speakers should exert a stronger force on the participant to align. To control the speech style of the interlocutors, their questions and answers were pre-recorded in either a formal or a casual speech style. The stimuli's speech style was then manipulated between participants and was consistent throughout the experiment for a given participant. Surprisingly, participants aligned less with the speech style if there were multiple interlocutors. This may reflect a "diffusion of responsibility:" Participants may find it more important to align when they interact with only one person than with a larger group.
  • Monaghan, P., Brand, J., Frost, R. L. A., & Taylor, G. (2017). Multiple variable cues in the environment promote accurate and robust word learning. In G. Gunzelman, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 817-822). Retrieved from https://mindmodeling.org/cogsci2017/papers/0164/index.html.

    Abstract

    Learning how words refer to aspects of the environment is a complex task, but one that is supported by numerous cues within the environment which constrain the possibilities for matching words to their intended referents. In this paper we tested the predictions of a computational model of multiple cue integration for word learning, that predicted variation in the presence of cues provides an optimal learning situation. In a cross-situational learning task with adult participants, we varied the reliability of presence of distributional, prosodic, and gestural cues. We found that the best learning occurred when cues were often present, but not always. The effect of variability increased the salience of individual cues for the learner, but resulted in robust learning that was not vulnerable to individual cues’ presence or absence. Thus, variability of multiple cues in the language-learning environment provided the optimal circumstances for word learning.
  • Mudd, K., Lutzenberger, H., De Vos, C., Fikkert, P., Crasborn, O., & De Boer, B. (2020). How does social structure shape language variation? A case study of the Kata Kolok lexicon. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 302-304). Nijmegen: The Evolution of Language Conferences.
  • Musgrave, S., & Cutfield, S. (2009). Language documentation and an Australian National Corpus. In M. Haugh, K. Burridge, J. Mulder, & P. Peters (Eds.), Selected proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus: Mustering Languages (pp. 10-18). Somerville: Cascadilla Proceedings Project.

    Abstract

    Corpus linguistics and language documentation are usually considered separate subdisciplines within linguistics, having developed from different traditions and often operating on different scales, but the authors will suggest that there are commonalities to the two: both aim to represent language use in a community, and both are concerned with managing digital data. The authors propose that the development of the Australian National Corpus (AusNC) be guided by the experience of language documentation in the management of multimodal digital data and its annotation, and in ethical issues pertaining to making the data accessible. This would allow an AusNC that is distributed, multimodal, and multilingual, with holdings of text, audio, and video data distributed across multiple institutions; and including Indigenous, sign, and migrant community languages. An audit of language material held by Australian institutions and individuals is necessary to gauge the diversity and volume of possible content, and to inform common technical standards.
  • Nijland, L., & Janse, E. (Eds.). (2009). Auditory processing in speakers with acquired or developmental language disorders [Special Issue]. Clinical Linguistics and Phonetics, 23(3).
  • Nordhoff, S., & Hammarström, H. (2011). Glottolog/Langdoc: Defining dialects, languages, and language families as collections of resources. Proceedings of the First International Workshop on Linked Science 2011 (LISC2011), Bonn, Germany, October 24, 2011.

    Abstract

    This paper describes the Glottolog/Langdoc project, an at- tempt to provide near-total bibliographical coverage of descriptive re- sources to the world's languages. Every reference is treated as a resource, as is every \languoid"[1]. References are linked to the languoids which they describe, and languoids are linked to the references described by them. Family relations between languoids are modeled in SKOS, as are relations across dierent classications of the same languages. This setup allows the representation of languoids as collections of references, render- ing the question of the denition of entities like `Scots', `West-Germanic' or `Indo-European' more empirical.
  • Norris, D., Cutler, A., McQueen, J. M., Butterfield, S., & Kearns, R. K. (2000). Language-universal constraints on the segmentation of English. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 43-46). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    Two word-spotting experiments are reported that examine whether the Possible-Word Constraint (PWC) [1] is a language-specific or language-universal strategy for the segmentation of continuous speech. The PWC disfavours parses which leave an impossible residue between the end of a candidate word and a known boundary. The experiments examined cases where the residue was either a CV syllable with a lax vowel, or a CVC syllable with a schwa. Although neither syllable context is a possible word in English, word-spotting in both contexts was easier than with a context consisting of a single consonant. The PWC appears to be language-universal rather than language-specific.
  • Norris, D., Cutler, A., & McQueen, J. M. (2000). The optimal architecture for simulating spoken-word recognition. In C. Davis, T. Van Gelder, & R. Wales (Eds.), Cognitive Science in Australia, 2000: Proceedings of the Fifth Biennial Conference of the Australasian Cognitive Science Society. Adelaide: Causal Productions.

    Abstract

    Simulations explored the inability of the TRACE model of spoken-word recognition to model the effects on human listening of subcategorical mismatch in word forms. The source of TRACE's failure lay not in interactive connectivity, not in the presence of inter-word competition, and not in the use of phonemic representations, but in the need for continuously optimised interpretation of the input. When an analogue of TRACE was allowed to cycle to asymptote on every slice of input, an acceptable simulation of the subcategorical mismatch data was achieved. Even then, however, the simulation was not as close as that produced by the Merge model, which has inter-word competition, phonemic representations and continuous optimisation (but no interactive connectivity).
  • Ortega, G., Schiefner, A., & Ozyurek, A. (2017). Speakers’ gestures predict the meaning and perception of iconicity in signs. In G. Gunzelmann, A. Howe, & T. Tenbrink (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 889-894). Austin, TX: Cognitive Science Society.

    Abstract

    Sign languages stand out in that there is high prevalence of
    conventionalised linguistic forms that map directly to their
    referent (i.e., iconic). Hearing adults show low performance
    when asked to guess the meaning of iconic signs suggesting
    that their iconic features are largely inaccessible to them.
    However, it has not been investigated whether speakers’
    gestures, which also share the property of iconicity, may
    assist non-signers in guessing the meaning of signs. Results
    from a pantomime generation task (Study 1) show that
    speakers’ gestures exhibit a high degree of systematicity, and
    share different degrees of form overlap with signs (full,
    partial, and no overlap). Study 2 shows that signs with full
    and partial overlap are more accurately guessed and are
    assigned higher iconicity ratings than signs with no overlap.
    Deaf and hearing adults converge in their iconic depictions
    for some concepts due to the shared conceptual knowledge
    and manual-visual modality.
  • Otake, T., & Cutler, A. (2000). A set of Japanese word cohorts rated for relative familiarity. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 3 (pp. 766-769). Beijing: China Military Friendship Publish.

    Abstract

    A database is presented of relative familiarity ratings for 24 sets of Japanese words, each set comprising words overlapping in the initial portions. These ratings are useful for the generation of material sets for research in the recognition of spoken words.
  • Ozyurek, A. (2020). From hands to brains: How does human body talk, think and interact in face-to-face language use? In K. Truong, D. Heylen, & M. Czerwinski (Eds.), ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 1-2). New York, NY, USA: Association for Computing Machinery. doi:10.1145/3382507.3419442.
  • Ozyurek, A., & Ozcaliskan, S. (2000). How do children learn to conflate manner and path in their speech and gestures? Differences in English and Turkish. In E. V. Clark (Ed.), The proceedings of the Thirtieth Child Language Research Forum (pp. 77-85). Stanford: CSLI Publications.
  • Pacheco, A., Araújo, S., Faísca, L., Petersson, K. M., & Reis, A. (2009). Profiling dislexic children: Phonology and visual naming skills. In Abstracts presented at the International Neuropsychological Society, Finnish Neuropsychological Society, Joint Mid-Year Meeting July 29-August 1, 2009. Helsinki, Finland & Tallinn, Estonia (pp. 40). Retrieved from http://www.neuropsykologia.fi/ins2009/INS_MY09_Abstract.pdf.
  • Paplu, S. H., Mishra, C., & Berns, K. (2020). Pseudo-randomization in automating robot behaviour during human-robot interaction. In 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp. 1-6). Institute of Electrical and Electronics Engineers. doi:10.1109/ICDL-EpiRob48136.2020.9278115.

    Abstract

    Automating robot behavior in a specific situation is an active area of research. There are several approaches available in the literature of robotics to cater for the automatic behavior of a robot. However, when it comes to humanoids or human-robot interaction in general, the area has been less explored. In this paper, a pseudo-randomization approach has been introduced to automatize the gestures and facial expressions of an interactive humanoid robot called ROBIN based on its mental state. A significant number of gestures and facial expressions have been implemented to allow the robot more options to perform a relevant action or reaction based on visual stimuli. There is a display of noticeable differences in the behaviour of the robot for the same stimuli perceived from an interaction partner. This slight autonomous behavioural change in the robot clearly shows a notion of automation in behaviour. The results from experimental scenarios and human-centered evaluation of the system help validate the approach.

    Files private

    Request files
  • Perlman, M., Fusaroli, R., Fein, D., & Naigles, L. (2017). The use of iconic words in early child-parent interactions. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 913-918). Austin, TX: Cognitive Science Society.

    Abstract

    This paper examines the use of iconic words in early conversations between children and caregivers. The longitudinal data include a span of six observations of 35 children-parent dyads in the same semi-structured activity. Our findings show that children’s speech initially has a high proportion of iconic words, and over time, these words become diluted by an increase of arbitrary words. Parents’ speech is also initially high in iconic words, with a decrease in the proportion of iconic words over time – in this case driven by the use of fewer iconic words. The level and development of iconicity are related to individual differences in the children’s cognitive skills. Our findings fit with the hypothesis that iconicity facilitates early word learning and may play an important role in learning to produce new words.
  • Perniss, P. M., Zwitserlood, I., & Ozyurek, A. (2011). Does space structure spatial language? Linguistic encoding of space in sign languages. In L. Carlson, C. Holscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Meeting of the Cognitive Science Society (pp. 1595-1600). Austin, TX: Cognitive Science Society.
  • Poellmann, K., McQueen, J. M., & Mitterer, H. (2011). The time course of perceptual learning. In W.-S. Lee, & E. Zee (Eds.), Proceedings of the 17th International Congress of Phonetic Sciences 2011 [ICPhS XVII] (pp. 1618-1621). Hong Kong: Department of Chinese, Translation and Linguistics, City University of Hong Kong.

    Abstract

    Two groups of participants were trained to perceive an ambiguous sound [s/f] as either /s/ or /f/ based on lexical bias: One group heard the ambiguous fricative in /s/-final words, the other in /f/-final words. This kind of exposure leads to a recalibration of the /s/-/f/ contrast [e.g., 4]. In order to investigate when and how this recalibration emerges, test trials were interspersed among training and filler trials. The learning effect needed at least 10 clear training items to arise. Its emergence seemed to occur in a rather step-wise fashion. Learning did not improve much after it first appeared. It is likely, however, that the early test trials attracted participants' attention and therefore may have interfered with the learning process.
  • Popov, V., Ostarek, M., & Tenison, C. (2017). Inferential Pitfalls in Decoding Neural Representations. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 961-966). Austin, TX: Cognitive Science Society.

    Abstract

    A key challenge for cognitive neuroscience is to decipher the representational schemes of the brain. A recent class of decoding algorithms for fMRI data, stimulus-feature-based encoding models, is becoming increasingly popular for inferring the dimensions of neural representational spaces from stimulus-feature spaces. We argue that such inferences are not always valid, because decoding can occur even if the neural representational space and the stimulus-feature space use different representational schemes. This can happen when there is a systematic mapping between them. In a simulation, we successfully decoded the binary representation of numbers from their decimal features. Since binary and decimal number systems use different representations, we cannot conclude that the binary representation encodes decimal features. The same argument applies to the decoding of neural patterns from stimulus-feature spaces and we urge caution in inferring the nature of the neural code from such methods. We discuss ways to overcome these inferential limitations.
  • Pouw, W., Aslanidou, A., Kamermans, K. L., & Paas, F. (2017). Is ambiguity detection in haptic imagery possible? Evidence for Enactive imaginings. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 2925-2930). Austin, TX: Cognitive Science Society.

    Abstract

    A classic discussion about visual imagery is whether it affords reinterpretation, like discovering two interpretations in the duck/rabbit illustration. Recent findings converge on reinterpretation being possible in visual imagery, suggesting functional equivalence with pictorial representations. However, it is unclear whether such reinterpretations are necessarily a visual-pictorial achievement. To assess this, 68 participants were briefly presented 2-d ambiguous figures. One figure was presented visually, the other via manual touch alone. Afterwards participants mentally rotated the memorized figures as to discover a novel interpretation. A portion (20.6%) of the participants detected a novel interpretation in visual imagery, replicating previous research. Strikingly, 23.6% of participants were able to reinterpret figures they had only felt. That reinterpretation truly involved haptic processes was further supported, as some participants performed co-thought gestures on an imagined figure during retrieval. These results are promising for further development of an Enactivist approach to imagination.
  • Rasenberg, M., Dingemanse, M., & Ozyurek, A. (2020). Lexical and gestural alignment in interaction and the emergence of novel shared symbols. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 356-358). Nijmegen: The Evolution of Language Conferences.
  • Raviv, L., Meyer, A. S., & Lev-Ari, S. (2020). Network structure and the cultural evolution of linguistic structure: A group communication experiment. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 359-361). Nijmegen: The Evolution of Language Conferences.
  • Regier, T., Khetarpal, N., & Majid, A. (2011). Inferring conceptual structure from cross-language data. In L. Carlson, C. Hölscher, & T. Shipley (Eds.), Proceedings of the 33rd Annual Conference of the Cognitive Science Society (pp. 1488). Austin, TX: Cognitive Science Society.
  • Reinisch, E., & Weber, A. (2011). Adapting to lexical stress in a foreign accent. In W.-S. Lee, & E. Zee (Eds.), Proceedings of the 17th International Congress of Phonetic Sciences 2011 [ICPhS XVII] (pp. 1678-1681). Hong Kong: Department of Chinese, Translation and Linguistics, City University of Hong Kong.

    Abstract

    An exposure-test paradigm was used to examine whether Dutch listeners can adapt their perception to non-canonical marking of lexical stress in Hungarian-accented Dutch. During exposure, one group of listeners heard only words with correct initial stress, while another group also heard examples of unstressed initial syllables that were marked by high pitch, a possible stress cue in Dutch. Subsequently, listeners’ eye movements to target-competitor pairs with segmental overlap but different stress patterns were tracked while hearing Hungarian-accented Dutch. Listeners who had heard non-canonically produced words previously distinguished target-competitor pairs faster than listeners who had only been exposed to canonical forms before. This suggests that listeners can adapt quickly to speaker-specific realizations of non-canonical lexical stress.
  • Reinisch, E., Weber, A., & Mitterer, H. (2011). Listeners retune phoneme boundaries across languages [Abstract]. Journal of the Acoustical Society of America. Program abstracts of the 162nd Meeting of the Acoustical Society of America, 130(4), 2572-2572.

    Abstract

    Listeners can flexibly retune category boundaries of their native language to adapt to non-canonically produced phonemes. This only occurs, however, if the pronunciation peculiarities can be attributed to stable and not transient speaker-specific characteristics. Listening to someone speaking a second language, listeners could attribute non-canonical pronunciations either to the speaker or to the fact that she is modifying her categories in the second language. We investigated whether, following exposure to Dutch-accented English, Dutch listeners show effects of category retuning during test where they hear the same speaker speaking her native language, Dutch. Exposure was a lexical-decision task where either word-final [f] or [s] was replaced by an ambiguous sound. At test listeners categorized minimal word pairs ending in sounds along an [f]-[s] continuum. Following exposure to English words, Dutch listeners showed boundary shifts of a similar magnitude as following exposure to the same phoneme variants in their native language. This suggests that production patterns in a second language are deemed a stable characteristic. A second experiment suggests that category retuning also occurs when listeners are exposed to and tested with a native speaker of their second language. Listeners thus retune phoneme boundaries across languages.
  • de Reus, K., Carlson, D., Jadoul, Y., Lowry, A., Gross, S., Garcia, M., Salazar-Casals, A., Rubio-García, A., Haas, C. E., De Boer, B., & Ravignani, A. (2020). Relationships between vocal ontogeny and vocal tract anatomy in harbour seals (Phoca vitulina). In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 63-66). Nijmegen: The Evolution of Language Conferences.
  • Ringersma, J., Zinn, C., & Kemps-Snijders, M. (2009). LEXUS & ViCoS From lexical to conceptual spaces. In 1st International Conference on Language Documentation and Conservation (ICLDC).

    Abstract

    LEXUS and ViCoS: from lexicon to conceptual spaces LEXUS is a web-based lexicon tool and the knowledge space software ViCoS is an extension of LEXUS, allowing users to create relations between objects in and across lexica. LEXUS and ViCoS are part of the Language Archiving Technology software, developed at the MPI for Psycholinguistics to archive and enrich linguistic resources collected in the framework of language documentation projects. LEXUS is of primary interest for language documentation, offering the possibility to not just create a digital dictionary, but additionally it allows the creation of multi-media encyclopedic lexica. ViCoS provides an interface between the lexical space and the ontological space. Its approach permits users to model a world of concepts and their interrelations based on categorization patterns made by the speech community. We describe the LEXUS and ViCoS functionalities using three cases from DoBeS language documentation projects: (1) Marquesan The Marquesan lexicon was initially created in Toolbox and imported into LEXUS using the Toolbox import functionality. The lexicon is enriched with multi-media to illustrate the meaning of the words in its cultural environment. Members of the speech community consider words as keys to access and describe relevant parts of their life and traditions. Their understanding of words is best described by the various associations they evoke rather than in terms of any formal theory of meaning. Using ViCoS a knowledge space of related concepts is being created. (2) Kola-Sámi Two lexica are being created in LEXUS: RuSaDic lexicon is a Russian-Kildin wordlist in which the entries are of relative limited structure and content. SaRuDiC is a more complex structured lexicon with much richer content, including multi-media fragments and derivations. Using ViCoS we have created a connection between the two lexica, so that speakers who are familiair with Russian and wish to revitalize their Kildin can enter the lexicon through the RuSaDic and from there approach the informative SaRuDic. Similary we will create relations from the two lexica to external open databases, like e.g. Álgu. (3) Beaver A speaker database including kinship relations has been created and the database has been imported into LEXUS. In the LEXUS views the relations for individual speakers are being displayed. Using ViCoS the relational information from the database will be extracted to form a kisnhip relation space with specific relation types, like e.g 'mother-of'. The whole set of relations from the database can be displayed in one ViCoS relation window, and zoom functionality is available.
  • De Ruiter, J. P. (2004). On the primacy of language in multimodal communication. In Workshop Proceedings on Multimodal Corpora: Models of Human Behaviour for the Specification and Evaluation of Multimodal Input and Output Interfaces.(LREC2004) (pp. 38-41). Paris: ELRA - European Language Resources Association (CD-ROM).

    Abstract

    In this paper, I will argue that although the study of multimodal interaction offers exciting new prospects for Human Computer Interaction and human-human communication research, language is the primary form of communication, even in multimodal systems. I will support this claim with theoretical and empirical arguments, mainly drawn from human-human communication research, and will discuss the implications for multimodal communication research and Human-Computer Interaction.
  • Sadakata, M., & McQueen, J. M. (2011). The role of variability in non-native perceptual learning of a Japanese geminate-singleton fricative contrast. In Proceedings of the 12th Annual Conference of the International Speech Communication Association (Interspeech 2011), Florence, Italy (pp. 873-876).

    Abstract

    The current study reports the enhancing effect of a high variability training procedure in the learning of a Japanese geminate-singleton fricative contrast. Dutch natives took part in a five-day training procedure in which they identified geminate and singleton variants of the Japanese fricative /s/. They heard either many repetitions of a limited set of words recorded by a single speaker (simple training) or fewer repetitions of a more variable set of words recorded by multiple speakers (variable training). Pre-post identification evaluations and a transfer test indicated clear benefits of the variable training.
  • Sauermann, A., Höhle, B., Chen, A., & Järvikivi, J. (2011). Intonational marking of focus in different word orders in German children. In M. B. Washburn, K. McKinney-Bock, E. Varis, & A. Sawyer (Eds.), Proceedings of the 28th West Coast Conference on Formal Linguistics (pp. 313-322). Somerville, MA: Cascadilla Proceedings Project.

    Abstract

    The use of word order and intonation to mark focus in child speech has received some attention. However, past work usually examined each device separately or only compared the realizations of focused vs. non-focused constituents. This paper investigates the interaction between word order and intonation in the marking of different focus types in 4- to 5-year old German-speaking children and an adult control group. An answer-reconstruction task was used to elicit syntactic (word order) and intonational focus marking of subject and objects (locus of focus) in three focus types (broad, narrow, and contrastive focus). The results indicate that both children and adults used intonation to distinguish broad from contrastive focus but they differed in the marking of narrow focus. Further, both groups preferred intonation to word order as device for focus marking. But children showed an early sensitivity for the impact of focus type and focus location on word order variation and on phonetic means to mark focus.
  • Sauter, D., Scott, S., & Calder, A. (2004). Categorisation of vocally expressed positive emotion: A first step towards basic positive emotions? [Abstract]. Proceedings of the British Psychological Society, 12, 111.

    Abstract

    Most of the study of basic emotion expressions has focused on facial expressions and little work has been done to specifically investigate happiness, the only positive of the basic emotions (Ekman & Friesen, 1971). However, a theoretical suggestion has been made that happiness could be broken down into discrete positive emotions, which each fulfil the criteria of basic emotions, and that these would be expressed vocally (Ekman, 1992). To empirically test this hypothesis, 20 participants categorised 80 paralinguistic sounds using the labels achievement, amusement, contentment, pleasure and relief. The results suggest that achievement, amusement and relief are perceived as distinct categories, which subjects accurately identify. In contrast, the categories of contentment and pleasure were systematically confused with other responses, although performance was still well above chance levels. These findings are initial evidence that the positive emotions engage distinct vocal expressions and may be considered to be distinct emotion categories.
  • Sauter, D., Eisner, F., Ekman, P., & Scott, S. K. (2009). Universal vocal signals of emotion. In N. Taatgen, & H. Van Rijn (Eds.), Proceedings of the 31st Annual Meeting of the Cognitive Science Society (CogSci 2009) (pp. 2251-2255). Cognitive Science Society.

    Abstract

    Emotional signals allow for the sharing of important information with conspecifics, for example to warn them of danger. Humans use a range of different cues to communicate to others how they feel, including facial, vocal, and gestural signals. Although much is known about facial expressions of emotion, less research has focused on affect in the voice. We compare British listeners to individuals from remote Namibian villages who have had no exposure to Western culture, and examine recognition of non-verbal emotional vocalizations, such as screams and laughs. We show that a number of emotions can be universally recognized from non-verbal vocal signals. In addition we demonstrate the specificity of this pattern, with a set of additional emotions only recognized within, but not across these cultural groups. Our findings indicate that a small set of primarily negative emotions have evolved signals across several modalities, while most positive emotions are communicated with culture-specific signals.
  • Scharenborg, O., Bouwman, G., & Boves, L. (2000). Connected digit recognition with class specific word models. In Proceedings of the COST249 Workshop on Voice Operated Telecom Services workshop (pp. 71-74).

    Abstract

    This work focuses on efficient use of the training material by selecting the optimal set of model topologies. We do this by training multiple word models of each word class, based on a subclassification according to a priori knowledge of the training material. We will examine classification criteria with respect to duration of the word, gender of the speaker, position of the word in the utterance, pauses in the vicinity of the word, and combinations of these. Comparative experiments were carried out on a corpus consisting of Dutch spoken connected digit strings and isolated digits, which are recorded in a wide variety of acoustic conditions. The results show, that classification based on gender of the speaker, position of the digit in the string, pauses in the vicinity of the training tokens, and models based on a combination of these criteria perform significantly better than the set with single models per digit.

Share this page