Publications

Displaying 101 - 200 of 310
  • Gaby, A., & Faller, M. (2003). Reciprocity questionnaire. In N. J. Enfield (Ed.), Field research manual 2003, part I: Multimodal interaction, space, event representation (pp. 77-80). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.877641.

    Abstract

    This project is part of a collaborative project with the research group “Reciprocals across languages” led by Nick Evans. One goal of this project is to develop a typology of reciprocals. This questionnaire is designed to help field workers get an overview over the type of markers used in the expression of reciprocity in the language studied.
  • Galke, L., Vagliano, I., & Scherp, A. (2019). Can graph neural networks go „online“? An analysis of pretraining and inference. In Proceedings of the Representation Learning on Graphs and Manifolds: ICLR2019 Workshop.

    Abstract

    Large-scale graph data in real-world applications is often not static but dynamic,
    i. e., new nodes and edges appear over time. Current graph convolution approaches
    are promising, especially, when all the graph’s nodes and edges are available dur-
    ing training. When unseen nodes and edges are inserted after training, it is not
    yet evaluated whether up-training or re-training from scratch is preferable. We
    construct an experimental setup, in which we insert previously unseen nodes and
    edges after training and conduct a limited amount of inference epochs. In this
    setup, we compare adapting pretrained graph neural networks against retraining
    from scratch. Our results show that pretrained models yield high accuracy scores
    on the unseen nodes and that pretraining is preferable over retraining from scratch.
    Our experiments represent a first step to evaluate and develop truly online variants
    of graph neural networks.
  • Galke, L., Melnychuk, T., Seidlmayer, E., Trog, S., Foerstner, K., Schultz, C., & Tochtermann, K. (2019). Inductive learning of concept representations from library-scale bibliographic corpora. In K. David, K. Geihs, M. Lange, & G. Stumme (Eds.), Informatik 2019: 50 Jahre Gesellschaft für Informatik - Informatik für Gesellschaft (pp. 219-232). Bonn: Gesellschaft für Informatik e.V. doi:10.18420/inf2019_26.
  • Galke, L., Mai, F., Schelten, A., Brunch, D., & Scherp, A. (2017). Using titles vs. full-text as source for automated semantic document annotation. In O. Corcho, K. Janowicz, G. Rizz, I. Tiddi, & D. Garijo (Eds.), Proceedings of the 9th International Conference on Knowledge Capture (K-CAP 2017). New York: ACM.

    Abstract

    We conduct the first systematic comparison of automated semantic
    annotation based on either the full-text or only on the title metadata
    of documents. Apart from the prominent text classification baselines
    kNN and SVM, we also compare recent techniques of Learning
    to Rank and neural networks and revisit the traditional methods
    logistic regression, Rocchio, and Naive Bayes. Across three of our
    four datasets, the performance of the classifications using only titles
    reaches over 90% of the quality compared to the performance when
    using the full-text.
  • Galke, L., Saleh, A., & Scherp, A. (2017). Word embeddings for practical information retrieval. In M. Eibl, & M. Gaedke (Eds.), INFORMATIK 2017 (pp. 2155-2167). Bonn: Gesellschaft für Informatik. doi:10.18420/in2017_215.

    Abstract

    We assess the suitability of word embeddings for practical information retrieval scenarios. Thus, we assume that users issue ad-hoc short queries where we return the first twenty retrieved documents after applying a boolean matching operation between the query and the documents. We compare the performance of several techniques that leverage word embeddings in the retrieval models to compute the similarity between the query and the documents, namely word centroid similarity, paragraph vectors, Word Mover’s distance, as well as our novel inverse document frequency (IDF) re-weighted word centroid similarity. We evaluate the performance using the ranking metrics mean average precision, mean reciprocal rank, and normalized discounted cumulative gain. Additionally, we inspect the retrieval models’ sensitivity to document length by using either only the title or the full-text of the documents for the retrieval task. We conclude that word centroid similarity is the best competitor to state-of-the-art retrieval models. It can be further improved by re-weighting the word frequencies with IDF before aggregating the respective word vectors of the embedding. The proposed cosine similarity of IDF re-weighted word vectors is competitive to the TF-IDF baseline and even outperforms it in case of the news domain with a relative percentage of 15%.
  • Goldrick, M., Brehm, L., Pyeong Whan, C., & Smolensky, P. (2019). Transient blend states and discrete agreement-driven errors in sentence production. In G. J. Snover, M. Nelson, B. O'Connor, & J. Pater (Eds.), Proceedings of the Society for Computation in Linguistics (SCiL 2019) (pp. 375-376). doi:10.7275/n0b2-5305.
  • Goudbeek, M., Smits, R., Cutler, A., & Swingley, D. (2017). Auditory and phonetic category formation. In H. Cohen, & C. Lefebvre (Eds.), Handbook of categorization in cognitive science (2nd revised ed.) (pp. 687-708). Amsterdam: Elsevier.
  • Gretsch, P. (2003). Omission impossible?: Topic and Focus in Focal Ellipsis. In K. Schwabe, & S. Winkler (Eds.), The Interfaces: Deriving and interpreting omitted structures (pp. 341-365). Amsterdam: John Benjamins.
  • Gullberg, M. (2003). Eye movements and gestures in human face-to-face interaction. In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind's eyes: Cognitive and applied aspects of eye movements (pp. 685-703). Oxford: Elsevier.

    Abstract

    Gestures are visuospatial events, meaning carriers, and social interactional phenomena. As such they constitute a particularly favourable area for investigating visual attention in a complex everyday situation under conditions of competitive processing. This chapter discusses visual attention to spontaneous gestures in human face-to-face interaction as explored with eye-tracking. Some basic fixation patterns are described, live and video-based settings are compared, and preliminary results on the relationship between fixations and information processing are outlined.
  • Gullberg, M., & Kita, S. (2003). Das Beachten von Gesten: Eine Studie zu Blickverhalten und Integration gestisch ausgedrückter Informationen. In Max-Planck-Gesellschaft (Ed.), Jahrbuch der Max Planck Gesellschaft 2003 (pp. 949-953). Göttingen: Vandenhoeck & Ruprecht.
  • Gullberg, M. (2003). Gestures, referents, and anaphoric linkage in learner varieties. In C. Dimroth, & M. Starren (Eds.), Information structure, linguistic structure and the dynamics of language acquisition. (pp. 311-328). Amsterdam: Benjamins.

    Abstract

    This paper discusses how the gestural modality can contribute to our understanding of anaphoric linkage in learner varieties, focusing on gestural anaphoric linkage marking the introduction, maintenance, and shift of reference in story retellings by learners of French and Swedish. The comparison of gestural anaphoric linkage in native and non-native varieties reveals what appears to be a particular learner variety of gestural cohesion, which closely reflects the characteristics of anaphoric linkage in learners' speech. Specifically, particular forms co-occur with anaphoric gestures depending on the information organisation in discourse. The typical nominal over-marking of maintained referents or topic elements in speech is mirrored by gestural (over-)marking of the same items. The paper discusses two ways in which this finding may further the understanding of anaphoric over-explicitness of learner varieties. An addressee-based communicative perspective on anaphoric linkage highlights how over-marking in gesture and speech may be related to issues of hyper-clarity and ambiguity. An alternative speaker-based perspective is also explored in which anaphoric over-marking is seen as related to L2 speech planning.
  • Hagoort, P. (2017). It is the facts, stupid. In J. Brockman, F. Van der Wa, & H. Corver (Eds.), Wetenschappelijke parels: het belangrijkste wetenschappelijke nieuws volgens 193 'briljante geesten'. Amsterdam: Maven Press.
  • Hagoort, P. (2003). De verloving tussen neurowetenschap en psychologie. In K. Hilberdink (Ed.), Interdisciplinariteit in de geesteswetenschappen (pp. 73-81). Amsterdam: KNAW.
  • Hagoort, P. (2003). Die einzigartige, grösstenteils aber unbewusste Fähigkeit der Menschen zu sprachlicher Kommunikation. In G. Kaiser (Ed.), Jahrbuch 2002-2003 / Wissenschaftszentrum Nordrhein-Westfalen (pp. 33-46). Düsseldorf: Wissenschaftszentrum Nordrhein-Westfalen.
  • Hagoort, P. (2003). Functional brain imaging. In W. J. Frawley (Ed.), International encyclopedia of linguistics (pp. 142-145). New York: Oxford University Press.
  • Hagoort, P., & Beckmann, C. F. (2019). Key issues and future directions: The neural architecture for language. In P. Hagoort (Ed.), Human language: From genes and brains to behavior (pp. 527-532). Cambridge, MA: MIT Press.
  • Hagoort, P. (2019). Introduction. In P. Hagoort (Ed.), Human language: From genes and brains to behavior (pp. 1-6). Cambridge, MA: MIT Press.
  • Hagoort, P. (2017). The neural basis for primary and acquired language skills. In E. Segers, & P. Van den Broek (Eds.), Developmental Perspectives in Written Language and Literacy: In honor of Ludo Verhoeven (pp. 17-28). Amsterdam: Benjamins. doi:10.1075/z.206.02hag.

    Abstract

    Reading is a cultural invention that needs to recruit cortical infrastructure that was not designed for it (cultural recycling of cortical maps). In the case of reading both visual cortex and networks for speech processing are recruited. Here I discuss current views on the neurobiological underpinnings of spoken language that deviate in a number of ways from the classical Wernicke-Lichtheim-Geschwind model. More areas than Broca’s and Wernicke’s region are involved in language. Moreover, a division along the axis of language production and language comprehension does not seem to be warranted. Instead, for central aspects of language processing neural infrastructure is shared between production and comprehension. Arguments are presented in favor of a dynamic network view, in which the functionality of a region is co-determined by the network of regions in which it is embedded at particular moments in time. Finally, core regions of language processing need to interact with other networks (e.g. the attentional networks and the ToM network) to establish full functionality of language and communication. The consequences of this architecture for reading are discussed.
  • Hahn, L. E., Ten Buuren, M., De Nijs, M., Snijders, T. M., & Fikkert, P. (2019). Acquiring novel words in a second language through mutual play with child songs - The Noplica Energy Center. In L. Nijs, H. Van Regenmortel, & C. Arculus (Eds.), MERYC19 Counterpoints of the senses: Bodily experiences in musical learning (pp. 78-87). Ghent, Belgium: EuNet MERYC 2019.

    Abstract

    Child songs are a great source for linguistic learning. Here we explore whether children can acquire novel words in a second language by playing a game featuring child songs in a playhouse. We present data from three studies that serve as scientific proof for the functionality of one game of the playhouse: the Energy Center. For this game, three hand-bikes were mounted on a panel. When children start moving the hand-bikes, child songs start playing simultaneously. Once the children produce enough energy with the hand-bikes, the songs are additionally accompanied with the sounds of musical instruments. In our studies, children executed a picture-selection task to evaluate whether they acquired new vocabulary from the songs presented during the game. Two of our studies were run in the field, one at a Dutch and one at an Indian pre-school. The third study features data from a more controlled laboratory setting. Our results partly confirm that the Energy Center is a successful means to support vocabulary acquisition in a second language. More research with larger sample sizes and longer access to the Energy Center is needed to evaluate the overall functionality of the game. Based on informal observations at our test sites, however, we are certain that children do pick up linguistic content from the songs during play, as many of the children repeat words and phrases from songs they heard. We will pick up upon these promising observations during future studies
  • Hammarström, H. (2019). An inventory of Bantu languages. In M. Van de Velde, K. Bostoen, D. Nurse, & G. Philippson (Eds.), The Bantu languages (2nd). London: Routledge.

    Abstract

    This chapter aims to provide an updated list of all Bantu languages known at present and to provide individual pointers to further information on the inventory. The area division has some correlation with what are perceived genealogical relations between Bantu languages, but they are not defined as such and do not change whenever there is an update in our understanding of genealogical relations. Given the popularity of Guthrie codes in Bantu linguistics, our listing also features a complete mapping to Guthrie codes. The language inventory listed excludes sign languages used in the Bantu area, speech registers, pidgins, drummed/whistled languages and urban youth languages. Pointers to such languages in the Bantu area are included in the continent-wide overview in Hammarstrom. The most important alternative names, subvarieties and spelling variants are given for each language, though such lists are necessarily incomplete and reflect some degree of arbitrary selection.
  • Haun, D. B. M., & Waller, D. (2003). Alignment task. In N. J. Enfield (Ed.), Field research manual 2003, part I: Multimodal interaction, space, event representation (pp. 39-48). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Haun, D. B. M. (2003). Path integration. In N. J. Enfield (Ed.), Field research manual 2003, part I: Multimodal interaction, space, event representation (pp. 33-38). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.877644.
  • Haun, D. B. M. (2003). Spatial updating. In N. J. Enfield (Ed.), Field research manual 2003, part I: Multimodal interaction, space, event representation (pp. 49-56). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Hawkins, J. A., & Cutler, A. (1988). Psycholinguistic factors in morphological asymmetry. In J. A. Hawkins (Ed.), Explaining language universals (pp. 280-317). Oxford: Blackwell.
  • Heilbron, M., Ehinger, B., Hagoort, P., & De Lange, F. P. (2019). Tracking naturalistic linguistic predictions with deep neural language models. In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience (pp. 424-427). doi:10.32470/CCN.2019.1096-0.

    Abstract

    Prediction in language has traditionally been studied using
    simple designs in which neural responses to expected
    and unexpected words are compared in a categorical
    fashion. However, these designs have been contested
    as being ‘prediction encouraging’, potentially exaggerating
    the importance of prediction in language understanding.
    A few recent studies have begun to address
    these worries by using model-based approaches to probe
    the effects of linguistic predictability in naturalistic stimuli
    (e.g. continuous narrative). However, these studies
    so far only looked at very local forms of prediction, using
    models that take no more than the prior two words into
    account when computing a word’s predictability. Here,
    we extend this approach using a state-of-the-art neural
    language model that can take roughly 500 times longer
    linguistic contexts into account. Predictability estimates
    fromthe neural network offer amuch better fit to EEG data
    from subjects listening to naturalistic narrative than simpler
    models, and reveal strong surprise responses akin to
    the P200 and N400. These results show that predictability
    effects in language are not a side-effect of simple designs,
    and demonstrate the practical use of recent advances
    in AI for the cognitive neuroscience of language.
  • Holler, J., & Bavelas, J. (2017). Multi-modal communication of common ground: A review of social functions. In R. B. Church, M. W. Alibali, & S. D. Kelly (Eds.), Why gesture? How the hands function in speaking, thinking and communicating (pp. 213-240). Amsterdam: Benjamins.

    Abstract

    Until recently, the literature on common ground depicted its influence as a purely verbal phenomenon. We review current research on how common ground influences gesture. With informative exceptions, most experiments found that speakers used fewer gestures as well as fewer words in common ground contexts; i.e., the gesture/word ratio did not change. Common ground often led to more poorly articulated gestures, which parallels its effect on words. These findings support the principle of recipient design as well as more specific social functions such as grounding, the given-new contract, and Grice’s maxims. However, conceptual pacts or linking old with new information may maintain the original form. All together, these findings implicate gesture-speech ensembles rather than isolated effects on gestures alone.
  • Isbilen, E. S., McCauley, S. M., Kidd, E., & Christiansen, M. H. (2017). Testing statistical learning implicitly: A novel chunk-based measure of statistical learning. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 564-569). Austin, TX: Cognitive Science Society.

    Abstract

    Attempts to connect individual differences in statistical learning with broader aspects of cognition have received considerable attention, but have yielded mixed results. A possible explanation is that statistical learning is typically tested using the two-alternative forced choice (2AFC) task. As a meta-cognitive task relying on explicit familiarity judgments, 2AFC may not accurately capture implicitly formed statistical computations. In this paper, we adapt the classic serial-recall memory paradigm to implicitly test statistical learning in a statistically-induced chunking recall (SICR) task. We hypothesized that artificial language exposure would lead subjects to chunk recurring statistical patterns, facilitating recall of words from the input. Experiment 1 demonstrates that SICR offers more fine-grained insights into individual differences in statistical learning than 2AFC. Experiment 2 shows that SICR has higher test-retest reliability than that reported for 2AFC. Thus, SICR offers a more sensitive measure of individual differences, suggesting that basic chunking abilities may explain statistical learning.
  • Janse, E. (2003). Word perception in natural-fast and artificially time-compressed speech. In M. SolÉ, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of the Phonetic Sciences (pp. 3001-3004).
  • Johnson, E. K. (2003). Speaker intent influences infants' segmentation of potentially ambiguous utterances. In Proceedings of the 15th International Congress of Phonetic Sciences (PCPhS 2003) (pp. 1995-1998). Adelaide: Causal Productions.
  • De Jong, N. H., Schreuder, R., & Baayen, R. H. (2003). Morphological resonance in the mental lexicon. In R. Baayen, & R. Schreuder (Eds.), Morphological structure in language processing (pp. 65-88). Berlin: Mouton de Gruyter.
  • Joo, H., Jang, J., Kim, S., Cho, T., & Cutler, A. (2019). Prosodic structural effects on coarticulatory vowel nasalization in Australian English in comparison to American English. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 835-839). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    This study investigates effects of prosodic factors (prominence, boundary) on coarticulatory Vnasalization in Australian English (AusE) in CVN and NVC in comparison to those in American English
    (AmE). As in AmE, prominence was found to
    lengthen N, but to reduce V-nasalization, enhancing N’s nasality and V’s orality, respectively (paradigmatic contrast enhancement). But the prominence effect in CVN was more robust than that in AmE. Again similar to findings in AmE, boundary
    induced a reduction of N-duration and V-nasalization phrase-initially (syntagmatic contrast enhancement), and increased the nasality of both C and V phrasefinally.
    But AusE showed some differences in terms
    of the magnitude of V nasalization and N duration. The results suggest that the linguistic contrast enhancements underlie prosodic-structure modulation of coarticulatory V-nasalization in
    comparable ways across dialects, while the fine phonetic detail indicates that the phonetics-prosody interplay is internalized in the individual dialect’s phonetic grammar.
  • Jordens, P. (2003). Constraints on the shape of second language learner varieties. In G. Rickheit, T. Herrmann, & W. Deutsch (Eds.), Psycholinguistik/Psycholinguistics: Ein internationales Handbuch. [An International Handbook] (pp. 819-833). Berlin: Mouton de Gruyter.
  • Kapatsinski, V., & Harmon, Z. (2017). A Hebbian account of entrenchment and (over)-extension in language learning. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Meeting of the Cognitive Science Society (CogSci 2017) (pp. 2366-2371). Austin, TX: Cognitive Science Society.

    Abstract

    In production, frequently used words are preferentially extended to new, though related meanings. In comprehension, frequent exposure to a word instead makes the learner confident that all of the word’s legitimate uses have been experienced, resulting in an entrenched form-meaning mapping between the word and its experienced meaning(s). This results in a perception-production dissociation, where the forms speakers are most likely to map onto a novel meaning are precisely the forms that they believe can never be used that way. At first glance, this result challenges the idea of bidirectional form-meaning mappings, assumed by all current approaches to linguistic theory. In this paper, we show that bidirectional form-meaning mappings are not in fact challenged by this production-perception dissociation. We show that the production-perception dissociation is expected even if learners of the lexicon acquire simple symmetrical form-meaning associations through simple Hebbian learning.
  • Karadöller, D. Z., Sumer, B., & Ozyurek, A. (2017). Effects of delayed language exposure on spatial language acquisition by signing children and adults. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 2372-2376). Austin, TX: Cognitive Science Society.

    Abstract

    Deaf children born to hearing parents are exposed to language input quite late, which has long-lasting effects on language production. Previous studies with deaf individuals mostly focused on linguistic expressions of motion events, which have several event components. We do not know if similar effects emerge in simple events such as descriptions of spatial configurations of objects. Moreover, previous data mainly come from late adult signers. There is not much known about language development of late signing children soon after learning sign language. We compared simple event descriptions of late signers of Turkish Sign Language (adults, children) to age-matched native signers. Our results indicate that while late signers in both age groups are native-like in frequency of expressing a relational encoding, they lag behind native signers in using morphologically complex linguistic forms compared to other simple forms. Late signing children perform similar to adults and thus showed no development over time.
  • Keating, P., Cho, T., Fougeron, C., & Hsu, C.-S. (2003). Domain-initial strengthening in four languages. In J. Local, R. Ogden, & R. Temple (Eds.), Laboratory phonology VI: Phonetic interpretation (pp. 145-163). Cambridge: Cambridge University Press.
  • Kember, H., Grohe, A.-.-K., Zahner, K., Braun, B., Weber, A., & Cutler, A. (2017). Similar prosodic structure perceived differently in German and English. In Proceedings of Interspeech 2017 (pp. 1388-1392). doi:10.21437/Interspeech.2017-544.

    Abstract

    English and German have similar prosody, but their speakers realize some pitch falls (not rises) in subtly different ways. We here test for asymmetry in perception. An ABX discrimination task requiring F0 slope or duration judgements on isolated vowels revealed no cross-language difference in duration or F0 fall discrimination, but discrimination of rises (realized similarly in each language) was less accurate for English than for German listeners. This unexpected finding may reflect greater sensitivity to rising patterns by German listeners, or reduced sensitivity by English listeners as a result of extensive exposure to phrase-final rises (“uptalk”) in their language
  • Kempen, G., & Harbusch, K. (2003). A corpus study into word order variation in German subordinate clauses: Animacy affects linearization independently of function assignment. In Proceedings of AMLaP 2003 (pp. 153-154). Glasgow: Glasgow University.
  • Kempen, G. (1988). De netwerker: Spin in het web of rat in een doolhof? In SURF in theorie en praktijk: Van personal tot supercomputer (pp. 59-61). Amsterdam: Elsevier Science Publishers.
  • Kempen, G., & Harbusch, K. (2003). Dutch and German verb clusters in performance grammar. In P. A. Seuren, & G. Kempen (Eds.), Verb constructions in German and Dutch (pp. 185-221). Amsterdam: Benjamins.
  • Kempen, G. (2003). Language generation. In W. Frawley (Ed.), International encyclopedia of linguistics (pp. 362-364). New York: Oxford University Press.
  • Kempen, G., & Harbusch, K. (2017). Frequential test of (S)OV as unmarked word order in Dutch and German clauses: A serendipitous corpus-linguistic experiment. In H. Reckman, L. L. S. Cheng, M. Hijzelendoorn, & R. Sybesma (Eds.), Crossroads semantics: Computation, experiment and grammar (pp. 107-123). Amsterdam: Benjamins.

    Abstract

    In a paper entitled “Against markedness (and what to replace it with)”, Haspelmath argues “that the term ‘markedness’ is superfluous”, and that frequency asymmetries often explain structural (un)markedness asymmetries (Haspelmath 2006). We investigate whether this argument applies to Object and Verb orders in main (VO, marked) and subordinate (OV, unmarked) clauses of spoken and written German and Dutch, using English (without VO/OV alternation) as control. Frequency counts from six treebanks (three languages, two output modalities) do not support Haspelmath’s proposal. However, they reveal an unexpected phenomenon, most prominently in spoken Dutch and German: a small set of extremely high-frequent finite verbs with unspecific meanings populates main clauses much more densely than subordinate clauses. We suggest these verbs accelerate the start-up of grammatical encoding, thus facilitating sentence-initial output fluency
  • Kempen, G. (1981). Taalpsychologie. In H. Duijker, & P. Vroon (Eds.), Codex Psychologicus (pp. 205-221). Amsterdam: Elsevier.
  • Kempen, G., & Harbusch, K. (2003). Word order scrambling as a consequence of incremental sentence production. In H. Härtl, & H. Tappe (Eds.), Mediating between concepts and grammar (pp. 141-164). Berlin: Mouton de Gruyter.
  • Kita, S. (2003). Pointing: A foundational building block in human communication. In S. Kita (Ed.), Pointing: Where language, culture, and cognition meet (pp. 1-8). Mahwah, NJ: Erlbaum.
  • Kita, S. (2003). Interplay of gaze, hand, torso orientation and language in pointing. In S. Kita (Ed.), Pointing: Where language, culture, and cognition meet (pp. 307-328). Mahwah, NJ: Erlbaum.
  • Kita, S., & Essegbey, J. (2003). Left-hand taboo on direction-indicating gestures in Ghana: When and why people still use left-hand gestures. In M. Rector, I. Poggi, & N. Trigo (Eds.), Gesture: Meaning and use (pp. 301-306). Oporto: Edições Universidade Fernando Pessoa, Fundação Fernado Pessoa.
  • Kita, S., & Enfield, N. J. (2003). Recording recommendations for video research. In N. J. Enfield (Ed.), Field research manual 2003, part I: Multimodal interaction, space, event representation (pp. 8-9). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Klamer, M., Trilsbeek, P., Hoogervorst, T., & Haskett, C. (2017). Creating a Language Archive of Insular South East Asia and West New Guinea. In J. Odijk, & A. Van Hessen (Eds.), CLARIN in the Low Countries (pp. 113-121). London: Ubiquity Press. doi:10.5334/bbi.10.

    Abstract

    The geographical region of Insular South East Asia and New Guinea is well-known as an
    area of mega-biodiversity. Less well-known is the extreme linguistic diversity in this area:
    over a quarter of the world’s 6,000 languages are spoken here. As small minority languages,
    most of them will cease to be spoken in the coming few generations. The project described
    here ensures the preservation of unique records of languages and the cultures encapsulated
    by them in the region. The language resources were gathered by twenty linguists at,
    or in collaboration with, Dutch universities over the last 40 years, and were compiled and
    archived in collaboration with The Language Archive (TLA) at the Max Planck Institute in
    Nijmegen. The resulting archive constitutes a collection ofmultimediamaterials and written
    documents from 48 languages in Insular South East Asia and West New Guinea. At TLA,
    the data was archived according to state-of-the-art standards (TLA holds the Data Seal of
    Approval): the component metadata infrastructure CMDI was used; all metadata categories
    as well as relevant units of annotation were linked to the ISO data category registry ISOcat.
    This guaranteed proper integration of the language resources into the CLARIN framework.
    Through the archive, future speaker communities and researchers will be able to extensively
    search thematerials for answers to their own questions, even if they do not themselves know the language, and even if the language dies.
  • Klein, W., & Rath, R. (1981). Automatische Lemmatisierung deutscher Flexionsformen. In R. Herzog (Ed.), Computer in der Übersetzungswissenschaft (pp. 94-142). Framkfurt am Main, Bern: Verlag Peter Lang.
  • Klein, W. (1973). Eine Analyse der Kerne in Schillers "Räuber". In S. Marcus (Ed.), Mathematische Poetik (pp. 326-333). Frankfurt am Main: Athenäum.
  • Klein, W. (1981). Eine kommentierte Bibliographie zur Computerlinguistik. In R. Herzog (Ed.), Computer in der Übersetzungswissenschaft (pp. 95-142). Frankfurt am Main: Lang.
  • Klein, W., & Dimroth, C. (2003). Der ungesteuerte Zweitspracherwerb Erwachsener: Ein Überblick über den Forschungsstand. In U. Maas, & U. Mehlem (Eds.), Qualitätsanforderungen für die Sprachförderung im Rahmen der Integration von Zuwanderern (Heft 21) (pp. 127-161). Osnabrück: IMIS.
  • Klein, W. (1973). Dialekt und Einheitssprache im Fremdsprachenunterricht. In Beiträge zu den Sommerkursen des Goethe-Instituts München (pp. 53-60).
  • Klein, W. (1981). Knowing a language and knowing to communicate: A case study in foreign workers' communication. In A. Vermeer (Ed.), Language problems of minority groups (pp. 75-95). Tilburg: Tilburg University.
  • Klein, W. (1981). Logik der Argumentation. In Institut für deutsche Sprache (Ed.), Dialogforschung: Jahrbuch 1980 des Instituts für deutsche Sprache (pp. 226-264). Düsseldorf: Schwann.
  • Klein, W. (1981). Some rules of regular ellipsis in German. In W. Klein, & W. J. M. Levelt (Eds.), Crossing the boundaries in linguistics: Studies presented to Manfred Bierwisch (pp. 51-78). Dordrecht: Reidel.
  • Klein, W. (1988). The unity of a vernacular: Some remarks on "Berliner Stadtsprache". In N. Dittmar, & P. Schlobinski (Eds.), The sociolinguistics of urban vernaculars: Case studies and their evaluation (pp. 147-153). Berlin: de Gruyter.
  • Klein, W. (1988). Varietätengrammatik. In U. Ammon, N. Dittmar, & K. J. Mattheier (Eds.), Sociolinguistics: An international handbook of the science of language and society: Vol. 2 (pp. 997-1060). Berlin: de Gruyter.
  • Kuzla, C. (2003). Prosodically-conditioned variation in the realization of domain-final stops and voicing assimilation of domain-initial fricatives in German. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2829-2832). Adelaide: Causal Productions.
  • De Lange, F. P., Hagoort, P., & Toni, I. (2003). Differential fronto-parietal contributions to visual and motor imagery. NeuroImage, 19(2), e2094-e2095.

    Abstract

    Mental imagery is a cognitive process crucial to human reasoning. Numerous studies have characterized specific
    instances of this cognitive ability, as evoked by visual imagery (VI) or motor imagery (MI) tasks. However, it
    remains unclear which neural resources are shared between VI and MI, and which are exclusively related to MI.
    To address this issue, we have used fMRI to measure human brain activity during performance of VI and MI
    tasks. Crucially, we have modulated the imagery process by manipulating the degree of mental rotation necessary
    to solve the tasks. We focused our analysis on changes in neural signal as a function of the degree of mental
    rotation in each task.
  • Lee, R., Chambers, C. G., Huettig, F., & Ganea, P. A. (2017). Children’s semantic and world knowledge overrides fictional information during anticipatory linguistic processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Meeting of the Cognitive Science Society (CogSci 2017) (pp. 730-735). Austin, TX: Cognitive Science Society.

    Abstract

    Using real-time eye-movement measures, we asked how a fantastical discourse context competes with stored representations of semantic and world knowledge to influence children's and adults' moment-by-moment interpretation of a story. Seven-year- olds were less effective at bypassing stored semantic and world knowledge during real-time interpretation than adults. Nevertheless, an effect of discourse context on comprehension was still apparent.
  • Lev-Ari, S. (2019). The influence of social network properties on language processing and use. In M. S. Vitevitch (Ed.), Network Science in Cognitive Psychology (pp. 10-29). New York, NY: Routledge.

    Abstract

    Language is a social phenomenon. The author learns, processes, and uses it in social contexts. In other words, the social environment shapes the linguistic knowledge and use of the knowledge. To a degree, this is trivial. A child exposed to Japanese will become fluent in Japanese, whereas a child exposed to only Spanish will not understand Japanese but will master the sounds, vocabulary, and grammar of Spanish. Language is a structured system. Sounds and words do not occur randomly but are characterized by regularities. Learners are sensitive to these regularities and exploit them when learning language. People differ in the sizes of their social networks. Some people tend to interact with only a few people, whereas others might interact with a wide range of people. This is reflected in people’s holiday greeting habits: some people might send cards to only a few people, whereas other would send greeting cards to more than 350 people.
  • Levelt, W. J. M. (1988). Psycholinguistics: An overview. In W. Bright (Ed.), International encyclopedia of linguistics: Vol. 3 (pp. 290-294). Oxford: Oxford University press.
  • Levelt, C. C., Fikkert, P., & Schiller, N. O. (2003). Metrical priming in speech production. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2481-2485). Adelaide: Causal Productions.

    Abstract

    In this paper we report on four experiments in which we attempted to prime the stress position of Dutch bisyllabic target nouns. These nouns, picture names, had stress on either the first or the second syllable. Auditory prime words had either the same stress as the target or a different stress (e.g., WORtel – MOtor vs. koSTUUM – MOtor; capital letters indicate stressed syllables in prime – target pairs). Furthermore, half of the prime words were semantically related, the other half were unrelated. In none of the experiments a stress priming effect was found. This could mean that stress is not stored in the lexicon. An additional finding was that targets with initial stress had a faster response than targets with a final stress. We hypothesize that bisyllabic words with final stress take longer to be encoded because this stress pattern is irregular with respect to the lexical distribution of bisyllabic stress patterns, even though it can be regular in terms of the metrical stress rules of Dutch.
  • Levelt, W. J. M. (1962). Motion breaking and the perception of causality. In A. Michotte (Ed.), Causalité, permanence et réalité phénoménales: Etudes de psychologie expérimentale (pp. 244-258). Louvain: Publications Universitaires.
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Levelt, W. J. M., & Maassen, B. (1981). Lexical search and order of mention in sentence production. In W. Klein, & W. J. M. Levelt (Eds.), Crossing the boundaries in linguistics (pp. 221-252). Dordrecht: Reidel.
  • Levelt, W. J. M., & Plomp, K. (1968). The appreciation of musical intervals. In J. M. M. Aler (Ed.), Proceedings of the fifth International Congress of Aesthetics, Amsterdam 1964 (pp. 901-904). The Hague: Mouton.
  • Levelt, W. J. M. (1966). The perceptual conflict in binocular rivalry. In M. A. Bouman (Ed.), Studies in perception: Dedicated to M.A. Bouman (pp. 47-60). Soesterberg: Institute for Perception RVO-TNO.
  • Levinson, S. C. (2003). Spatial language. In L. Nadel (Ed.), Encyclopedia of cognitive science (pp. 131-137). London: Nature Publishing Group.
  • Levinson, S. C. (1988). Conceptual problems in the study of regional and cultural style. In N. Dittmar, & P. Schlobinski (Eds.), The sociolinguistics of urban vernaculars: Case studies and their evaluation (pp. 161-190). Berlin: De Gruyter.
  • Levinson, S. C. (2003). Contextualizing 'contextualization cues'. In S. Eerdmans, C. Prevignano, & P. Thibault (Eds.), Language and interaction: Discussions with John J. Gumperz (pp. 31-39). Amsterdam: John Benjamins.
  • Levinson, S. C. (2003). Language and cognition. In W. Frawley (Ed.), International Encyclopedia of Linguistics (pp. 459-463). Oxford: Oxford University Press.
  • Levinson, S. C. (2003). Language and mind: Let's get the issues straight! In D. Gentner, & S. Goldin-Meadow (Eds.), Language in mind: Advances in the study of language and cognition (pp. 25-46). Cambridge, MA: MIT Press.
  • Levinson, S. C., & Toni, I. (2019). Key issues and future directions: Interactional foundations of language. In P. Hagoort (Ed.), Human language: From genes and brain to behavior (pp. 257-261). Cambridge, MA: MIT Press.
  • Levinson, S. C. (2017). Living with Manny's dangerous idea. In G. Raymond, G. H. Lerner, & J. Heritage (Eds.), Enabling human conduct: Studies of talk-in-interaction in honor of Emanuel A. Schegloff (pp. 327-349). Amsterdam: Benjamins.
  • Levinson, S. C. (2019). Interactional foundations of language: The interaction engine hypothesis. In P. Hagoort (Ed.), Human language: From genes and brain to behavior (pp. 189-200). Cambridge, MA: MIT Press.
  • Levinson, S. C. (2019). Natural forms of purposeful interaction among humans: What makes interaction effective? In K. A. Gluck, & J. E. Laird (Eds.), Interactive task learning: Humans, robots, and agents acquiring new tasks through natural interactions (pp. 111-126). Cambridge, MA: MIT Press.
  • Levinson, S. C. (1988). Putting linguistics on a proper footing: Explorations in Goffman's participation framework. In P. Drew, & A. Wootton (Eds.), Goffman: Exploring the interaction order (pp. 161-227). Oxford: Polity Press.
  • Levinson, S. C. (1981). The essential inadequacies of speech act models of dialogue. In H. Parret, M. Sbisà, & J. Verscheuren (Eds.), Possibilities and limitations of pragmatics: Proceedings of the Conference on Pragmatics, Urbino, July 8–14, 1979 (pp. 473-492). Amsterdam: John Benjamins.
  • Levinson, S. C. (2017). Speech acts. In Y. Huang (Ed.), Oxford handbook of pragmatics (pp. 199-216). Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199697960.013.22.

    Abstract

    The essential insight of speech act theory was that when we use language, we perform actions—in a more modern parlance, core language use in interaction is a form of joint action. Over the last thirty years, speech acts have been relatively neglected in linguistic pragmatics, although important work has been done especially in conversation analysis. Here we review the core issues—the identifying characteristics, the degree of universality, the problem of multiple functions, and the puzzle of speech act recognition. Special attention is drawn to the role of conversation structure, probabilistic linguistic cues, and plan or sequence inference in speech act recognition, and to the centrality of deep recursive structures in sequences of speech acts in conversation

    Files private

    Request files
  • Liszkowski, U., & Epps, P. (2003). Directing attention and pointing in infants: A cross-cultural approach. In N. J. Enfield (Ed.), Field research manual 2003, part I: Multimodal interaction, space, event representation (pp. 25-27). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.877649.

    Abstract

    Recent research suggests that 12-month-old infants in German cultural settings have the motive of sharing their attention to and interest in various events with a social interlocutor. To do so, these preverbal infants predominantly use the pointing gesture (in this case the extended arm with or without extended index finger) as a means to direct another person’s attention. This task systematically investigates different types of motives underlying infants’ pointing. The occurrence of a protodeclarative (as opposed to protoimperative) motive is of particular interest because it requires an understanding of the recipient’s psychological states, such as attention and interest, that can be directed and accessed.
  • Little, H., Perlman, M., & Eryilmaz, K. (2017). Repeated interactions can lead to more iconic signals. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 760-765). Austin, TX: Cognitive Science Society.

    Abstract

    Previous research has shown that repeated interactions can cause iconicity in signals to reduce. However, data from several recent studies has shown the opposite trend: an increase in iconicity as the result of repeated interactions. Here, we discuss whether signals may become less or more iconic as a result of the modality used to produce them. We review several recent experimental results before presenting new data from multi-modal signals, where visual input creates audio feedback. Our results show that the growth in iconicity present in the audio information may come at a cost to iconicity in the visual information. Our results have implications for how we think about and measure iconicity in artificial signalling experiments. Further, we discuss how iconicity in real world speech may stem from auditory, kinetic or visual information, but iconicity in these different modalities may conflict.
  • Liu, S., & Zhang, Y. (2019). Why some verbs are harder to learn than others – A micro-level analysis of everyday learning contexts for early verb learning. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2173-2178). Montreal, QB: Cognitive Science Society.

    Abstract

    Verb learning is important for young children. While most
    previous research has focused on linguistic and conceptual
    challenges in early verb learning (e.g. Gentner, 1982, 2006),
    the present paper examined early verb learning at the
    attentional level and quantified the input for early verb learning
    by measuring verb-action co-occurrence statistics in parent-
    child interaction from the learner’s perspective. To do so, we
    used head-mounted eye tracking to record fine-grained
    multimodal behaviors during parent-infant joint play, and
    analyzed parent speech, parent and infant action, and infant
    attention at the moments when parents produced verb labels.
    Our results show great variability across different action verbs,
    in terms of frequency of verb utterances, frequency of
    corresponding actions related to verb meanings, and infants’
    attention to verbs and actions, which provide new insights on
    why some verbs are harder to learn than others.
  • Mai, F., Galke, L., & Scherp, A. (2019). CBOW is not all you need: Combining CBOW with the compositional matrix space model. In Proceedings of the Seventh International Conference on Learning Representations (ICLR 2019). OpenReview.net.

    Abstract

    Continuous Bag of Words (CBOW) is a powerful text embedding method. Due to its strong capabilities to encode word content, CBOW embeddings perform well on a wide range of downstream tasks while being efficient to compute. However, CBOW is not capable of capturing the word order. The reason is that the computation of CBOW's word embeddings is commutative, i.e., embeddings of XYZ and ZYX are the same. In order to address this shortcoming, we propose a
    learning algorithm for the Continuous Matrix Space Model, which we call Continual Multiplication of Words (CMOW). Our algorithm is an adaptation of word2vec, so that it can be trained on large quantities of unlabeled text. We empirically show that CMOW better captures linguistic properties, but it is inferior to CBOW in memorizing word content. Motivated by these findings, we propose a hybrid model that combines the strengths of CBOW and CMOW. Our results show that the hybrid CBOW-CMOW-model retains CBOW's strong ability to memorize word content while at the same time substantially improving its ability to encode other linguistic information by 8%. As a result, the hybrid also performs better on 8 out of 11 supervised downstream tasks with an average improvement of 1.2%.
  • Majid, A., & Enfield, N. J. (2017). Body. In H. Burkhardt, J. Seibt, G. Imaguire, & S. Gerogiorgakis (Eds.), Handbook of mereology (pp. 100-103). Munich: Philosophia.
  • Majid, A., & Bödeker, K. (2003). Folk theories of objects in motion. In N. J. Enfield (Ed.), Field research manual 2003, part I: Multimodal interaction, space, event representation (pp. 72-76). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.877654.

    Abstract

    There are three main strands of research which have investigated people’s intuitive knowledge of objects in motion. (1) Knowledge of the trajectories of objects in motion; (2) knowledge of the causes of motion; and (3) the categorisation of motion as to whether it has been produced by something animate or inanimate. We provide a brief introduction to each of these areas. We then point to some linguistic and cultural differences which may have consequences for people’s knowledge of objects in motion. Finally, we describe two experimental tasks and an ethnographic task that will allow us to collect data in order to establish whether, indeed, there are interesting cross-linguistic/cross-cultural differences in lay theories of objects in motion.
  • Majid, A., Manko, P., & De Valk, J. (2017). Language of the senses. In S. Dekker (Ed.), Scientific breakthroughs in the classroom! (pp. 40-76). Nijmegen: Science Education Hub Radboud University.

    Abstract

    The project that we describe in this chapter has the theme ‘Language of the senses’. This theme is
    based on the research of Asifa Majid and her team regarding the influence of language and culture on
    sensory perception. The chapter consists of two sections. Section 2.1 describes how different sensory
    perceptions are spoken of in different languages. Teachers can use this section as substantive preparation
    before they launch this theme in the classroom. Section 2.2 describes how teachers can handle
    this theme in accordance with the seven phases of inquiry-based learning. Chapter 1, in which the
    general guideline of the seven phases is described, forms the basis for this. We therefore recommend
    the use of chapter 1 as the starting point for the execution of a project in the classroom. This chapter
    provides the thematic additions.

    Additional information

    Materials Language of the senses
  • Majid, A., Manko, P., & de Valk, J. (2017). Taal der Zintuigen. In S. Dekker, & J. Van Baren-Nawrocka (Eds.), Wetenschappelijke doorbraken de klas in! Molecuulbotsingen, Stress en Taal der Zintuigen (pp. 128-166). Nijmegen: Wetenschapsknooppunt Radboud Universiteit.

    Abstract

    Taal der zintuigen gaat over de invloed van taal en cultuur op zintuiglijke waarnemingen. Hoe omschrijf je wat je ziet, voelt, proeft of ruikt? In sommige culturen zijn er veel verschillende woorden voor kleur, in andere culturen juist weer heel weinig. Worden we geboren met deze verschillende kleurgroepen? En bepaalt hoe je ergens over praat ook wat je waarneemt?
  • Majid, A. (2019). Preface. In L. J. Speed, C. O'Meara, L. San Roque, & A. Majid (Eds.), Perception Metaphors (pp. vii-viii). Amsterdam: Benjamins.
  • Mamus, E., Rissman, L., Majid, A., & Ozyurek, A. (2019). Effects of blindfolding on verbal and gestural expression of path in auditory motion events. In A. K. Goel, C. M. Seifert, & C. C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2275-2281). Montreal, QB: Cognitive Science Society.

    Abstract

    Studies have claimed that blind people’s spatial representations are different from sighted people, and blind people display superior auditory processing. Due to the nature of auditory and haptic information, it has been proposed that blind people have spatial representations that are more sequential than sighted people. Even the temporary loss of sight—such as through blindfolding—can affect spatial representations, but not much research has been done on this topic. We compared blindfolded and sighted people’s linguistic spatial expressions and non-linguistic localization accuracy to test how blindfolding affects the representation of path in auditory motion events. We found that blindfolded people were as good as sighted people when localizing simple sounds, but they outperformed sighted people when localizing auditory motion events. Blindfolded people’s path related speech also included more sequential, and less holistic elements. Our results indicate that even temporary loss of sight influences spatial representations of auditory motion events
  • Marcoux, K., & Ernestus, M. (2019). Differences between native and non-native Lombard speech in terms of pitch range. In M. Ochmann, M. Vorländer, & J. Fels (Eds.), Proceedings of the ICA 2019 and EAA Euroregio. 23rd International Congress on Acoustics, integrating 4th EAA Euroregio 2019 (pp. 5713-5720). Berlin: Deutsche Gesellschaft für Akustik.

    Abstract

    Lombard speech, speech produced in noise, is acoustically different from speech produced in quiet (plain speech) in several ways, including having a higher and wider F0 range (pitch). Extensive research on native Lombard speech does not consider that non-natives experience a higher cognitive load while producing
    speech and that the native language may influence the non-native speech. We investigated pitch range in plain and Lombard speech in native and non-natives.
    Dutch and American-English speakers read contrastive question-answer pairs in quiet and in noise in English, while the Dutch also read Dutch sentence pairs. We found that Lombard speech is characterized by a wider pitch range than plain speech, for all speakers (native English, non-native English, and native Dutch).
    This shows that non-natives also widen their pitch range in Lombard speech. In sentences with early-focus, we see the same increase in pitch range when going from plain to Lombard speech in native and non-native English, but a smaller increase in native Dutch. In sentences with late-focus, we see the biggest increase for the native English, followed by non-native English and then native Dutch. Together these results indicate an effect of the native language on non-native Lombard speech.
  • Marcoux, K., & Ernestus, M. (2019). Pitch in native and non-native Lombard speech. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 2605-2609). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Lombard speech, speech produced in noise, is
    typically produced with a higher fundamental
    frequency (F0, pitch) compared to speech in quiet. This paper examined the potential differences in native and non-native Lombard speech by analyzing median pitch in sentences with early- or late-focus produced in quiet and noise. We found an increase in pitch in late-focus sentences in noise for Dutch speakers in both English and Dutch, and for American-English speakers in English. These results
    show that non-native speakers produce Lombard speech, despite their higher cognitive load. For the early-focus sentences, we found a difference between the Dutch and the American-English speakers. Whereas the Dutch showed an increased F0 in noise
    in English and Dutch, the American-English speakers did not in English. Together, these results suggest that some acoustic characteristics of Lombard speech, such as pitch, may be language-specific, potentially
    resulting in the native language influencing the non-native Lombard speech.
  • Maslowski, M., Meyer, A. S., & Bosker, H. R. (2017). Whether long-term tracking of speech rate affects perception depends on who is talking. In Proceedings of Interspeech 2017 (pp. 586-590). doi:10.21437/Interspeech.2017-1517.

    Abstract

    Speech rate is known to modulate perception of temporally ambiguous speech sounds. For instance, a vowel may be perceived as short when the immediate speech context is slow, but as long when the context is fast. Yet, effects of long-term tracking of speech rate are largely unexplored. Two experiments tested whether long-term tracking of rate influences perception of the temporal Dutch vowel contrast /ɑ/-/a:/. In Experiment 1, one low-rate group listened to 'neutral' rate speech from talker A and to slow speech from talker B. Another high-rate group was exposed to the same neutral speech from A, but to fast speech from B. Between-group comparison of the 'neutral' trials revealed that the low-rate group reported a higher proportion of /a:/ in A's 'neutral' speech, indicating that A sounded faster when B was slow. Experiment 2 tested whether one's own speech rate also contributes to effects of long-term tracking of rate. Here, talker B's speech was replaced by playback of participants' own fast or slow speech. No evidence was found that one's own voice affected perception of talker A in larger speech contexts. These results carry implications for our understanding of the mechanisms involved in rate-dependent speech perception and of dialogue.
  • McQueen, J. M., & Cho, T. (2003). The use of domain-initial strengthening in segmentation of continuous English speech. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2993-2996). Adelaide: Causal Productions.
  • McQueen, J. M., Dahan, D., & Cutler, A. (2003). Continuity and gradedness in speech processing. In N. O. Schiller, & A. S. Meyer (Eds.), Phonetics and phonology in language comprehension and production: Differences and similarities (pp. 39-78). Berlin: Mouton de Gruyter.
  • McQueen, J. M., & Meyer, A. S. (2019). Key issues and future directions: Towards a comprehensive cognitive architecture for language use. In P. Hagoort (Ed.), Human language: From genes and brain to behavior (pp. 85-96). Cambridge, MA: MIT Press.
  • Meeuwissen, M., Roelofs, A., & Levelt, W. J. M. (2003). Naming analog clocks conceptually facilitates naming digital clocks. In Proceedings of XIII Conference of the European Society of Cognitive Psychology (ESCOP 2003) (pp. 271-271).
  • Meira, S. (2003). 'Addressee effects' in demonstrative systems: The cases of Tiriyó and Brazilian Portugese. In F. Lenz (Ed.), Deictic conceptualization of space, time and person (pp. 3-12). Amsterdam/Philadelphia: John Benjamins.
  • Merkx, D., Frank, S., & Ernestus, M. (2019). Language learning using speech to image retrieval. In Proceedings of Interspeech 2019 (pp. 1841-1845). doi:10.21437/Interspeech.2019-3067.

    Abstract

    Humans learn language by interaction with their environment and listening to other humans. It should also be possible for computational models to learn language directly from speech but so far most approaches require text. We improve on existing neural network approaches to create visually grounded embeddings for spoken utterances. Using a combination of a multi-layer GRU, importance sampling, cyclic learning rates, ensembling and vectorial self-attention our results show a remarkable increase in image-caption retrieval performance over previous work. Furthermore, we investigate which layers in the model learn to recognise words in the input. We find that deeper network layers are better at encoding word presence, although the final layer has slightly lower performance. This shows that our visually grounded sentence encoder learns to recognise words from the input even though it is not explicitly trained for word recognition.

Share this page