Publications

Displaying 301 - 338 of 338
  • Ten Bosch, L., Mulder, K., & Boves, L. (2019). Phase synchronization between EEG signals as a function of differences between stimuli characteristics. In Proceedings of Interspeech 2019 (pp. 1213-1217). doi:10.21437/Interspeech.2019-2443.

    Abstract

    The neural processing of speech leads to specific patterns in the brain which can be measured as, e.g., EEG signals. When properly aligned with the speech input and averaged over many tokens, the Event Related Potential (ERP) signal is able to differentiate specific contrasts between speech signals. Well-known effects relate to the difference between expected and unexpected words, in particular in the N400, while effects in N100 and P200 are related to attention and acoustic onset effects. Most EEG studies deal with the amplitude of EEG signals over time, sidestepping the effect of phase and phase synchronization. This paper investigates the relation between phase in the EEG signals measured in an auditory lexical decision task by Dutch participants listening to full and reduced English word forms. We show that phase synchronization takes place across stimulus conditions, and that the so-called circular variance is narrowly related to the type of contrast between stimuli.
  • Ten Bosch, L., & Boves, L. (2018). Information encoding by deep neural networks: what can we learn? In Proceedings of Interspeech 2018 (pp. 1457-1461). doi:10.21437/Interspeech.2018-1896.

    Abstract

    The recent advent of deep learning techniques in speech tech-nology and in particular in automatic speech recognition hasyielded substantial performance improvements. This suggeststhat deep neural networks (DNNs) are able to capture structurein speech data that older methods for acoustic modeling, suchas Gaussian Mixture Models and shallow neural networks failto uncover. In image recognition it is possible to link repre-sentations on the first couple of layers in DNNs to structuralproperties of images, and to representations on early layers inthe visual cortex. This raises the question whether it is possi-ble to accomplish a similar feat with representations on DNNlayers when processing speech input. In this paper we presentthree different experiments in which we attempt to untanglehow DNNs encode speech signals, and to relate these repre-sentations to phonetic knowledge, with the aim to advance con-ventional phonetic concepts and to choose the topology of aDNNs more efficiently. Two experiments investigate represen-tations formed by auto-encoders. A third experiment investi-gates representations on convolutional layers that treat speechspectrograms as if they were images. The results lay the basisfor future experiments with recursive networks.
  • Ter Bekke, M., Ozyurek, A., & Ünal, E. (2019). Speaking but not gesturing predicts motion event memory within and across languages. In A. Goel, C. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2940-2946). Montreal, QB: Cognitive Science Society.

    Abstract

    In everyday life, people see, describe and remember motion events. We tested whether the type of motion event information (path or manner) encoded in speech and gesture predicts which information is remembered and if this varies across speakers of typologically different languages. We focus on intransitive motion events (e.g., a woman running to a tree) that are described differently in speech and co-speech gesture across languages, based on how these languages typologically encode manner and path information (Kita & Özyürek, 2003; Talmy, 1985). Speakers of Dutch (n = 19) and Turkish (n = 22) watched and described motion events. With a surprise (i.e. unexpected) recognition memory task, memory for manner and path components of these events was measured. Neither Dutch nor Turkish speakers’ memory for manner went above chance levels. However, we found a positive relation between path speech and path change detection: participants who described the path during encoding were more accurate at detecting changes to the path of an event during the memory task. In addition, the relation between path speech and path memory changed with native language: for Dutch speakers encoding path in speech was related to improved path memory, but for Turkish speakers no such relation existed. For both languages, co-speech gesture did not predict memory speakers. We discuss the implications of these findings for our understanding of the relations between speech, gesture, type of encoding in language and memory.
  • Thomassen, A. J., & Kempen, G. (1979). Memory. In J. A. Michon, E. Eijkman, & L. Klerk (Eds.), Handbook of psychonomics (pp. 75-137 ). Amsterdam: North-Holland Publishing Company.
  • Thomaz, A. L., Lieven, E., Cakmak, M., Chai, J. Y., Garrod, S., Gray, W. D., Levinson, S. C., Paiva, A., & Russwinkel, N. (2019). Interaction for task instruction and learning. In K. A. Gluck, & J. E. Laird (Eds.), Interactive task learning: Humans, robots, and agents acquiring new tasks through natural interactions (pp. 91-110). Cambridge, MA: MIT Press.
  • Thompson, B., & Lupyan, G. (2018). Automatic estimation of lexical concreteness in 77 languages. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1122-1127). Austin, TX: Cognitive Science Society.

    Abstract

    We estimate lexical Concreteness for millions of words across 77 languages. Using a simple regression framework, we combine vector-based models of lexical semantics with experimental norms of Concreteness in English and Dutch. By applying techniques to align vector-based semantics across distinct languages, we compute and release Concreteness estimates at scale in numerous languages for which experimental norms are not currently available. This paper lays out the technique and its efficacy. Although this is a difficult dataset to evaluate immediately, Concreteness estimates computed from English correlate with Dutch experimental norms at $\rho$ = .75 in the vocabulary at large, increasing to $\rho$ = .8 among Nouns. Our predictions also recapitulate attested relationships with word frequency. The approach we describe can be readily applied to numerous lexical measures beyond Concreteness
  • Thompson, B., Roberts, S., & Lupyan, G. (2018). Quantifying semantic similarity across languages. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 2551-2556). Austin, TX: Cognitive Science Society.

    Abstract

    Do all languages convey semantic knowledge in the same way? If language simply mirrors the structure of the world, the answer should be a qualified “yes”. If, however, languages impose structure as much as reflecting it, then even ostensibly the “same” word in different languages may mean quite different things. We provide a first pass at a large-scale quantification of cross-linguistic semantic alignment of approximately 1000 meanings in 55 languages. We find that the translation equivalents in some domains (e.g., Time, Quantity, and Kinship) exhibit high alignment across languages while the structure of other domains (e.g., Politics, Food, Emotions, and Animals) exhibits substantial cross-linguistic variability. Our measure of semantic alignment correlates with known phylogenetic distances between languages: more phylogenetically distant languages have less semantic alignment. We also find semantic alignment to correlate with cultural distances between societies speaking the languages, suggesting a rich co-adaptation of language and culture even in domains of experience that appear most constrained by the natural world
  • Tourtouri, E. N., Delogu, F., & Crocker, M. W. (2018). Specificity and entropy reduction in situated referential processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 3356-3361). Austin: Cognitive Science Society.

    Abstract

    In situated communication, reference to an entity in the shared visual context can be established using eitheranexpression that conveys precise (minimally specified) or redundant (over-specified) information. There is, however, along-lasting debate in psycholinguistics concerningwhether the latter hinders referential processing. We present evidence from an eyetrackingexperiment recordingfixations as well asthe Index of Cognitive Activity –a novel measure of cognitive workload –supporting the view that over-specifications facilitate processing. We further present originalevidence that, above and beyond the effect of specificity,referring expressions thatuniformly reduce referential entropyalso benefitprocessing
  • Troncoso Ruiz, A., Ernestus, M., & Broersma, M. (2019). Learning to produce difficult L2 vowels: The effects of awareness-rasing, exposure and feedback. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 1094-1098). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
  • Tsutsui, S., Wang, X., Weng, G., Zhang, Y., Crandall, D., & Yu, C. (2022). Action recognition based on cross-situational action-object statistics. In Proceedings of the 2022 IEEE International Conference on Development and Learning (ICDL 2022).

    Abstract

    Machine learning models of visual action recognition are typically trained and tested on data from specific situations where actions are associated with certain objects. It is an open question how action-object associations in the training set influence a model's ability to generalize beyond trained situations. We set out to identify properties of training data that lead to action recognition models with greater generalization ability. To do this, we take inspiration from a cognitive mechanism called cross-situational learning, which states that human learners extract the meaning of concepts by observing instances of the same concept across different situations. We perform controlled experiments with various types of action-object associations, and identify key properties of action-object co-occurrence in training data that lead to better classifiers. Given that these properties are missing in the datasets that are typically used to train action classifiers in the computer vision literature, our work provides useful insights on how we should best construct datasets for efficiently training for better generalization.
  • Udden, J., & Männel, C. (2018). Artificial grammar learning and its neurobiology in relation to language processing and development. In S.-A. Rueschemeyer, & M. G. Gaskell (Eds.), The Oxford Handbook of Psycholinguistics (2nd ed., pp. 755-783). Oxford: Oxford University Press.

    Abstract

    The artificial grammar learning (AGL) paradigm enables systematic investigation of the acquisition of linguistically relevant structures. It is a paradigm of interest for language processing research, interfacing with theoretical linguistics, and for comparative research on language acquisition and evolution. This chapter presents a key for understanding major variants of the paradigm. An unbiased summary of neuroimaging findings of AGL is presented, using meta-analytic methods, pointing to the crucial involvement of the bilateral frontal operculum and regions in the right lateral hemisphere. Against a background of robust posterior temporal cortex involvement in processing complex syntax, the evidence for involvement of the posterior temporal cortex in AGL is reviewed. Infant AGL studies testing for neural substrates are reviewed, covering the acquisition of adjacent and non-adjacent dependencies as well as algebraic rules. The language acquisition data suggest that comparisons of learnability of complex grammars performed with adults may now also be possible with children.
  • Ünal, E., & Papafragou, A. (2018). Evidentials, information sources and cognition. In A. Y. Aikhenvald (Ed.), The Oxford Handbook of Evidentiality (pp. 175-184). Oxford University Press.
  • Ünal, E., & Papafragou, A. (2018). The relation between language and mental state reasoning. In J. Proust, & M. Fortier (Eds.), Metacognitive diversity: An interdisciplinary approach (pp. 153-169). Oxford: Oxford University Press.
  • Vagliano, I., Galke, L., Mai, F., & Scherp, A. (2018). Using adversarial autoencoders for multi-modal automatic playlist continuation. In C.-W. Chen, P. Lamere, M. Schedl, & H. Zamani (Eds.), RecSys Challenge '18: Proceedings of the ACM Recommender Systems Challenge 2018 (pp. 5.1-5.6). New York: ACM. doi:10.1145/3267471.3267476.

    Abstract

    The task of automatic playlist continuation is generating a list of recommended tracks that can be added to an existing playlist. By suggesting appropriate tracks, i. e., songs to add to a playlist, a recommender system can increase the user engagement by making playlist creation easier, as well as extending listening beyond the end of current playlist. The ACM Recommender Systems Challenge 2018 focuses on such task. Spotify released a dataset of playlists, which includes a large number of playlists and associated track listings. Given a set of playlists from which a number of tracks have been withheld, the goal is predicting the missing tracks in those playlists. We participated in the challenge as the team Unconscious Bias and, in this paper, we present our approach. We extend adversarial autoencoders to the problem of automatic playlist continuation. We show how multiple input modalities, such as the playlist titles as well as track titles, artists and albums, can be incorporated in the playlist continuation task.
  • Van Leeuwen, T. M., & Dingemanse, M. (2022). Samenwerkende zintuigen. In S. Dekker, & H. Kause (Eds.), Wetenschappelijke doorbraken de klas in!: Geloven, Neustussenschot en Samenwerkende zintuigen (pp. 85-116). Nijmegen: Wetenschapsknooppunt Radboud Universiteit.

    Abstract

    Ook al hebben we het niet altijd door, onze zintuigen werken altijd samen. Als je iemand ziet praten, bijvoorbeeld, verwerken je hersenen automatisch tegelijkertijd het geluid van de woorden en de bewegingen van de lippen. Omdat onze zintuigen altijd samenwerken zijn onze hersenen erg gevoelig voor dingen die ‘samenhoren’ en goed bij elkaar passen. In dit hoofdstuk beschrijven we een project onderzoekend leren met als thema ‘Samenwerkende zintuigen’.
  • Van Dooren, A., Tulling, M., Cournane, A., & Hacquard, V. (2019). Discovering modal polysemy: Lexical aspect might help. In M. Brown, & B. Dailey (Eds.), BUCLD 43: Proceedings of the 43rd annual Boston University Conference on Language Development (pp. 203-216). Sommerville, MA: Cascadilla Press.
  • Van den Heuvel, H., Oostdijk, N., Rowland, C. F., & Trilsbeek, P. (2022). The CLARIN Knowledge Centre for Atypical Communication Expertise. In D. Fišer, & A. Witt (Eds.), CLARIN: The Infrastructure for Language Resources (pp. 373-388). Berlin, Boston: De Gruyter.

    Abstract

    In this chapter we introduce the CLARIN Knowledge Centre for Atypical Communication Expertise. The mission of ACE is to support researchers engaged in languages which pose particular challenges for analysis; for this, we use the umbrella term “atypical communication”. This includes language use by second-language learners, people with language disorders or those suffering from lan-guage disabilities, and languages that pose unique challenges for analysis, such as sign languages and languages spoken in a multilingual context. The chapter presents details about the collaborations and outreach of the centre, the services offered, and a number of showcases for its activities.
  • Van Ooijen, B., Cutler, A., & Norris, D. (1991). Detection times for vowels versus consonants. In Eurospeech 91: Vol. 3 (pp. 1451-1454). Genova: Istituto Internazionale delle Comunicazioni.

    Abstract

    This paper reports two experiments with vowels and consonants as phoneme detection targets in real words. In the first experiment, two relatively distinct vowels were compared with two confusible stop consonants. Response times to the vowels were longer than to the consonants. Response times correlated negatively with target phoneme length. In the second, two relatively distinct vowels were compared with their corresponding semivowels. This time, the vowels were detected faster than the semivowels. We conclude that response time differences between vowels and stop consonants in this task may reflect differences between phoneme categories in the variability of tokens, both in the acoustic realisation of targets and in the' representation of targets by subjects.
  • Van Berkum, J. J. A., & Nieuwland, M. S. (2019). A cognitive neuroscience perspective on language comprehension in context. In P. Hagoort (Ed.), Human language: From genes and brain to behavior (pp. 429-442). Cambridge, MA: MIT Press.
  • Van Valin Jr., R. D. (2000). Focus structure or abstract syntax? A role and reference grammar account of some ‘abstract’ syntactic phenomena. In Z. Estrada Fernández, & I. Barreras Aguilar (Eds.), Memorias del V Encuentro Internacional de Lingüística en el Noroeste: (2 v.) Estudios morfosintácticos (pp. 39-62). Hermosillo: Editorial Unison.
  • Vernes, S. C. (2019). Neuromolecular approaches to the study of language. In P. Hagoort (Ed.), Human language: From genes and brain to behavior (pp. 577-593). Cambridge, MA: MIT Press.
  • Vernes, S. C. (2018). Vocal learning in bats: From genes to behaviour. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 516-518). Toruń, Poland: NCU Press. doi:10.12775/3991-1.128.
  • Vessel, E. A., Ishizu, T., & Bignardi, G. (2022). Neural correlates of visual aesthetic appeal. In M. Skov, & M. Nadal (Eds.), The Routledge international handbook of neuroaesthetics (pp. 103-133). London: Routledge.
  • Von Holzen, K., & Bergmann, C. (2018). A Meta-Analysis of Infants’ Mispronunciation Sensitivity Development. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1159-1164). Austin, TX: Cognitive Science Society.

    Abstract

    Before infants become mature speakers of their native language, they must acquire a robust word-recognition system which allows them to strike the balance between allowing some variation (mood, voice, accent) and recognizing variability that potentially changes meaning (e.g. cat vs hat). The current meta-analysis quantifies how the latter, termed mispronunciation sensitivity, changes over infants’ first three years, testing competing predictions of mainstream language acquisition theories. Our results show that infants were sensitive to mispronunciations, but accepted them as labels for target objects. Interestingly, and in contrast to predictions of mainstream theories, mispronunciation sensitivity was not modulated by infant age, suggesting that a sufficiently flexible understanding of native language phonology is in place at a young age.
  • Von Stutterheim, C., & Klein, W. (1989). Referential movement in descriptive and narrative discourse. In R. Dietrich, & C. F. Graumann (Eds.), Language processing in social context (pp. 39-76). Amsterdam: Elsevier.
  • Vosse, T., & Kempen, G. (1991). A hybrid model of human sentence processing: Parsing right-branching, center-embedded and cross-serial dependencies. In M. Tomita (Ed.), Proceedings of the Second International Workshop on Parsing Technologies.
  • Wagner, M. A., Broersma, M., McQueen, J. M., & Lemhöfer, K. (2019). Imitating speech in an unfamiliar language and an unfamiliar non-native accent in the native language. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1362-1366). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    This study concerns individual differences in speech imitation ability and the role that lexical representations play in imitation. We examined 1) whether imitation of sounds in an unfamiliar language (L0) is related to imitation of sounds in an unfamiliar
    non-native accent in the speaker’s native language (L1) and 2) whether it is easier or harder to imitate speech when you know the words to be imitated. Fifty-nine native Dutch speakers imitated words with target vowels in Basque (/a/ and /e/) and Greekaccented
    Dutch (/i/ and /u/). Spectral and durational
    analyses of the target vowels revealed no relationship between the success of L0 and L1 imitation and no difference in performance between tasks (i.e., L1
    imitation was neither aided nor blocked by lexical knowledge about the correct pronunciation). The results suggest instead that the relationship of the vowels to native phonological categories plays a bigger role in imitation
  • Weber, A. (2000). Phonotactic and acoustic cues for word segmentation in English. In Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP 2000) (pp. 782-785).

    Abstract

    This study investigates the influence of both phonotactic and acoustic cues on the segmentation of spoken English. Listeners detected embedded English words in nonsense sequences (word spotting). Words aligned with phonotactic boundaries were easier to detect than words without such alignment. Acoustic cues to boundaries could also have signaled word boundaries, especially when word onsets lacked phonotactic alignment. However, only one of several durational boundary cues showed a marginally significant correlation with response times (RTs). The results suggest that word segmentation in English is influenced primarily by phonotactic constraints and only secondarily by acoustic aspects of the speech signal.
  • Weber, A. (2000). The role of phonotactics in the segmentation of native and non-native continuous speech. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP, Workshop on Spoken Word Access Processes. Nijmegen: MPI for Psycholinguistics.

    Abstract

    Previous research has shown that listeners make use of their knowledge of phonotactic constraints to segment speech into individual words. The present study investigates the influence of phonotactics when segmenting a non-native language. German and English listeners detected embedded English words in nonsense sequences. German listeners also had knowledge of English, but English listeners had no knowledge of German. Word onsets were either aligned with a syllable boundary or not, according to the phonotactics of the two languages. Words aligned with either German or English phonotactic boundaries were easier for German listeners to detect than words without such alignment. Responses of English listeners were influenced primarily by English phonotactic alignment. The results suggest that both native and non-native phonotactic constraints influence lexical segmentation of a non-native, but familiar, language.
  • Weissenborn, J. (1988). Von der demonstratio ad oculos zur Deixis am Phantasma. Die Entwicklung der lokalen Referenz bei Kindern. In Karl Bühler's Theory of Language. Proceedings of the Conference held at Kirchberg, August 26, 1984 and Essen, November 21–24, 1984 (pp. 257-276). Amsterdam: Benjamins.
  • Willems, R. M., & Cristia, A. (2018). Hemodynamic methods: fMRI and fNIRS. In A. M. B. De Groot, & P. Hagoort (Eds.), Research methods in psycholinguistics and the neurobiology of language: A practical guide (pp. 266-287). Hoboken: Wiley.
  • Willems, R. M., & Van Gerven, M. (2018). New fMRI methods for the study of language. In S.-A. Rueschemeyer, & M. G. Gaskell (Eds.), The Oxford Handbook of Psycholinguistics (2nd ed., pp. 975-991). Oxford: Oxford University Press.
  • Woensdregt, M., Jara-Ettinger, J., & Rubio-Fernandez, P. (2022). Language universals rely on social cognition: Computational models of the use of this and that to redirect the receiver’s attention. In J. Culbertson, A. Perfors, H. Rabagliati, & V. Ramenzoni (Eds.), Proceedings of the 44th Annual Conference of the Cognitive Science Society (CogSci 2022) (pp. 1382-1388). Toronto, Canada: Cognitive Science Society.

    Abstract

    Demonstratives—simple referential devices like this and that—are linguistic universals, but their meaning varies cross-linguistically. In languages like English and Italian, demonstratives are thought to encode the referent’s distance from the producer (e.g., that one means “the one far away from me”),
    while in others, like Portuguese and Spanish, they encode relative distance from both producer and receiver (e.g., aquel means “the one far away from both of us”). Here we propose that demonstratives are also sensitive to the receiver’s focus of attention, hence requiring a deeper form of social cognition
    than previously thought. We provide initial empirical and computational evidence for this idea, suggesting that producers use
    demonstratives to redirect the receiver’s attention towards the intended referent, rather than only to indicate its physical distance.
  • Wolf, M. C., Smith, A. C., Meyer, A. S., & Rowland, C. F. (2019). Modality effects in vocabulary acquisition. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 1212-1218). Montreal, QB: Cognitive Science Society.

    Abstract

    It is unknown whether modality affects the efficiency with which humans learn novel word forms and their meanings, with previous studies reporting both written and auditory advantages. The current study implements controls whose absence in previous work likely offers explanation for such contradictory findings. In two novel word learning experiments, participants were trained and tested on pseudoword - novel object pairs, with controls on: modality of test, modality of meaning, duration of exposure and transparency of word form. In both experiments word forms were presented in either their written or spoken form, each paired with a pictorial meaning (novel object). Following a 20-minute filler task, participants were tested on their ability to identify the picture-word form pairs on which they were trained. A between subjects design generated four participant groups per experiment 1) written training, written test; 2) written training, spoken test; 3) spoken training, written test; 4) spoken training, spoken test. In Experiment 1 the written stimulus was presented for a time period equal to the duration of the spoken form. Results showed that when the duration of exposure was equal, participants displayed a written training benefit. Given words can be read faster than the time taken for the spoken form to unfold, in Experiment 2 the written form was presented for 300 ms, sufficient time to read the word yet 65% shorter than the duration of the spoken form. No modality effect was observed under these conditions, when exposure to the word form was equivalent. These results demonstrate, at least for proficient readers, that when exposure to the word form is controlled across modalities the efficiency with which word form-meaning associations are learnt does not differ. Our results therefore suggest that, although we typically begin as aural-only word learners, we ultimately converge on developing learning mechanisms that learn equally efficiently from both written and spoken materials.
  • Zavala, R. (2000). Multiple classifier systems in Akatek (Mayan). In G. Senft (Ed.), Systems of nominal classification (pp. 114-146). Cambridge University Press.
  • Zhang, Y., & Yu, C. (2022). Examining real-time attention dynamics in parent-infant picture book reading. In J. Culbertson, A. Perfors, H. Rabagliati, & V. Ramenzoni (Eds.), Proceedings of the 44th Annual Conference of the Cognitive Science Society (CogSci 2022) (pp. 1367-1374). Toronto, Canada: Cognitive Science Society.

    Abstract

    Picture book reading is a common word-learning context from which parents repeatedly name objects to their child and it has been found to facilitate early word learning. To learn the correct word-object mappings in a book-reading context, infants need to be able to link what they see with what they hear. However, given multiple objects on every book page, it is not clear how infants direct their attention to objects named by parents. The aim of the current study is to examine how infants mechanistically discover the correct word-object mappings during book reading in real time. We used head-mounted eye-tracking during parent-infant picture book reading and measured the infant's moment-by-moment visual attention to the named referent. We also examined how gesture cues provided by both the child and the parent may influence infants' attention to the named target. We found that although parents provided many object labels during book reading, infants were not able to attend to the named objects easily. However, their abilities to follow and use gestures to direct the other social partner’s attention increase the chance of looking at the named target during parent naming.
  • Zhang, Y., Chen, C.-h., & Yu, C. (2019). Mechanisms of cross-situational learning: Behavioral and computational evidence. In Advances in Child Development and Behavior; vol. 56 (pp. 37-63).

    Abstract

    Word learning happens in everyday contexts with many words and many potential referents for those words in view at the same time. It is challenging for young learners to find the correct referent upon hearing an unknown word at the moment. This problem of referential uncertainty has been deemed as the crux of early word learning (Quine, 1960). Recent empirical and computational studies have found support for a statistical solution to the problem termed cross-situational learning. Cross-situational learning allows learners to acquire word meanings across multiple exposures, despite each individual exposure is referentially uncertain. Recent empirical research shows that infants, children and adults rely on cross-situational learning to learn new words (Smith & Yu, 2008; Suanda, Mugwanya, & Namy, 2014; Yu & Smith, 2007). However, researchers have found evidence supporting two very different theoretical accounts of learning mechanisms: Hypothesis Testing (Gleitman, Cassidy, Nappa, Papafragou, & Trueswell, 2005; Markman, 1992) and Associative Learning (Frank, Goodman, & Tenenbaum, 2009; Yu & Smith, 2007). Hypothesis Testing is generally characterized as a form of learning in which a coherent hypothesis regarding a specific word-object mapping is formed often in conceptually constrained ways. The hypothesis will then be either accepted or rejected with additional evidence. However, proponents of the Associative Learning framework often characterize learning as aggregating information over time through implicit associative mechanisms. A learner acquires the meaning of a word when the association between the word and the referent becomes relatively strong. In this chapter, we consider these two psychological theories in the context of cross-situational word-referent learning. By reviewing recent empirical and cognitive modeling studies, our goal is to deepen our understanding of the underlying word learning mechanisms by examining and comparing the two theoretical learning accounts.
  • Zuidema, W., & Fitz, H. (2019). Key issues and future directions: Models of human language and speech processing. In P. Hagoort (Ed.), Human language: From genes and brain to behavior (pp. 353-358). Cambridge, MA: MIT Press.

Share this page