Publications

Displaying 101 - 200 of 315
  • Gebre, B. G., Crasborn, O., Wittenburg, P., Drude, S., & Heskes, T. (2014). Unsupervised feature learning for visual sign language identification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Vol 2 (pp. 370-376). Redhook, NY: Curran Proceedings.

    Abstract

    Prior research on language identification focused primarily on text and speech. In this paper, we focus on the visual modality and present a method for identifying sign languages solely from short video samples. The method is trained on unlabelled video data (unsupervised feature learning) and using these features, it is trained to discriminate between six sign languages (supervised learning). We ran experiments on video samples involving 30 signers (running for a total of 6 hours). Using leave-one-signer-out cross-validation, our evaluation on short video samples shows an average best accuracy of 84%. Given that sign languages are under-resourced, unsupervised feature learning techniques are the right tools and our results indicate that this is realistic for sign language identification.
  • Gentzsch, W., Lecarpentier, D., & Wittenburg, P. (2014). Big data in science and the EUDAT project. In Proceeding of the 2014 Annual SRII Global Conference.
  • Gerwien, J., & Flecken, M. (2016). First things first? Top-down influences on event apprehension. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 2633-2638). Austin, TX: Cognitive Science Society.

    Abstract

    Not much is known about event apprehension, the earliest stage of information processing in elicited language production studies, using pictorial stimuli. A reason for our lack of knowledge on this process is that apprehension happens very rapidly (<350 ms after stimulus onset, Griffin & Bock 2000), making it difficult to measure the process directly. To broaden our understanding of apprehension, we analyzed landing positions and onset latencies of first fixations on visual stimuli (pictures of real-world events) given short stimulus presentation times, presupposing that the first fixation directly results from information processing during apprehension
  • Goldrick, M., Brehm, L., Pyeong Whan, C., & Smolensky, P. (2019). Transient blend states and discrete agreement-driven errors in sentence production. In G. J. Snover, M. Nelson, B. O'Connor, & J. Pater (Eds.), Proceedings of the Society for Computation in Linguistics (SCiL 2019) (pp. 375-376). doi:10.7275/n0b2-5305.
  • Goriot, C. (2019). Early-English education works no miracles: Cognitive and linguistic development in mainstream, early-English, and bilingual primary-school pupils in the Netherlands. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Guerra, E., Huettig, F., & Knoeferle, P. (2014). Assessing the time course of the influence of featural, distributional and spatial representations during reading. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2309-2314). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/402/.

    Abstract

    What does semantic similarity between two concepts mean? How could we measure it? The way in which semantic similarity is calculated might differ depending on the theoretical notion of semantic representation. In an eye-tracking reading experiment, we investigated whether two widely used semantic similarity measures (based on featural or distributional representations) have distinctive effects on sentence reading times. In other words, we explored whether these measures of semantic similarity differ qualitatively. In addition, we examined whether visually perceived spatial distance interacts with either or both of these measures. Our results showed that the effect of featural and distributional representations on reading times can differ both in direction and in its time course. Moreover, both featural and distributional information interacted with spatial distance, yet in different sentence regions and reading measures. We conclude that featural and distributional representations are distinct components of semantic representation.
  • Guerra, E., & Knoeferle, P. (2014). Spatial distance modulates reading times for sentences about social relations: evidence from eye tracking. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2315-2320). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/403/.

    Abstract

    Recent evidence from eye tracking during reading showed that non-referential spatial distance presented in a visual context can modulate semantic interpretation of similarity relations rapidly and incrementally. In two eye-tracking reading experiments we extended these findings in two important ways; first, we examined whether other semantic domains (social relations) could also be rapidly influenced by spatial distance during sentence comprehension. Second, we aimed to further specify how abstract language is co-indexed with spatial information by varying the syntactic structure of sentences between experiments. Spatial distance rapidly modulated reading times as a function of the social relation expressed by a sentence. Moreover, our findings suggest that abstract language can be co-indexed as soon as critical information becomes available for the reader.
  • Hahn, L. E., Ten Buuren, M., De Nijs, M., Snijders, T. M., & Fikkert, P. (2019). Acquiring novel words in a second language through mutual play with child songs - The Noplica Energy Center. In L. Nijs, H. Van Regenmortel, & C. Arculus (Eds.), MERYC19 Counterpoints of the senses: Bodily experiences in musical learning (pp. 78-87). Ghent, Belgium: EuNet MERYC 2019.

    Abstract

    Child songs are a great source for linguistic learning. Here we explore whether children can acquire novel words in a second language by playing a game featuring child songs in a playhouse. We present data from three studies that serve as scientific proof for the functionality of one game of the playhouse: the Energy Center. For this game, three hand-bikes were mounted on a panel. When children start moving the hand-bikes, child songs start playing simultaneously. Once the children produce enough energy with the hand-bikes, the songs are additionally accompanied with the sounds of musical instruments. In our studies, children executed a picture-selection task to evaluate whether they acquired new vocabulary from the songs presented during the game. Two of our studies were run in the field, one at a Dutch and one at an Indian pre-school. The third study features data from a more controlled laboratory setting. Our results partly confirm that the Energy Center is a successful means to support vocabulary acquisition in a second language. More research with larger sample sizes and longer access to the Energy Center is needed to evaluate the overall functionality of the game. Based on informal observations at our test sites, however, we are certain that children do pick up linguistic content from the songs during play, as many of the children repeat words and phrases from songs they heard. We will pick up upon these promising observations during future studies
  • Harmon, Z., & Kapatsinski, V. (2016). Fuse to be used: A weak cue’s guide to attracting attention. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016). Austin, TX: Cognitive Science Society (pp. 520-525). Austin, TX: Cognitive Science Society.

    Abstract

    Several studies examined cue competition in human learning by testing learners on a combination of conflicting cues rooting for different outcomes, with each cue perfectly predicting its outcome. A common result has been that learners faced with cue conflict choose the outcome associated with the rare cue (the Inverse Base Rate Effect, IBRE). Here, we investigate cue competition including IBRE with sentences containing cues to meanings in a visual world. We do not observe IBRE. Instead we find that position in the sentence strongly influences cue salience. Faced with conflict between an initial cue and a non-initial cue, learners choose the outcome associated with the initial cue, whether frequent or rare. However, a frequent configuration of non-initial cues that are not sufficiently salient on their own can overcome a competing salient initial cue rooting for a different meaning. This provides a possible explanation for certain recurring patterns in language change.
  • Harmon, Z., & Kapatsinski, V. (2016). Determinants of lengths of repetition disfluencies: Probabilistic syntactic constituency in speech production. In R. Burkholder, C. Cisneros, E. R. Coppess, J. Grove, E. A. Hanink, H. McMahan, C. Meyer, N. Pavlou, Ö. Sarıgül, A. R. Singerman, & A. Zhang (Eds.), Proceedings of the Fiftieth Annual Meeting of the Chicago Linguistic Society (pp. 237-248). Chicago: Chicago Linguistic Society.
  • Haveman, A. (1997). The open-/closed-class distinction in spoken-word recognition. PhD Thesis, Radboud University Nijmegen, Nijmegen. doi:10.17617/2.2057704.
  • Heilbron, M., Ehinger, B., Hagoort, P., & De Lange, F. P. (2019). Tracking naturalistic linguistic predictions with deep neural language models. In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience (pp. 424-427). doi:10.32470/CCN.2019.1096-0.

    Abstract

    Prediction in language has traditionally been studied using
    simple designs in which neural responses to expected
    and unexpected words are compared in a categorical
    fashion. However, these designs have been contested
    as being ‘prediction encouraging’, potentially exaggerating
    the importance of prediction in language understanding.
    A few recent studies have begun to address
    these worries by using model-based approaches to probe
    the effects of linguistic predictability in naturalistic stimuli
    (e.g. continuous narrative). However, these studies
    so far only looked at very local forms of prediction, using
    models that take no more than the prior two words into
    account when computing a word’s predictability. Here,
    we extend this approach using a state-of-the-art neural
    language model that can take roughly 500 times longer
    linguistic contexts into account. Predictability estimates
    fromthe neural network offer amuch better fit to EEG data
    from subjects listening to naturalistic narrative than simpler
    models, and reveal strong surprise responses akin to
    the P200 and N400. These results show that predictability
    effects in language are not a side-effect of simple designs,
    and demonstrate the practical use of recent advances
    in AI for the cognitive neuroscience of language.
  • Hendricks, I., Lefever, E., Croijmans, I., Majid, A., & Van den Bosch, A. (2016). Very quaffable and great fun: Applying NLP to wine reviews. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics: Vol 2 (pp. 306-312). Stroudsburg, PA: Association for Computational Linguistics.

    Abstract

    We automatically predict properties of
    wines on the basis of smell and flavor de-
    scriptions from experts’ wine reviews. We
    show wine experts are capable of describ-
    ing their smell and flavor experiences in
    wine reviews in a sufficiently consistent
    manner, such that we can use their descrip-
    tions to predict properties of a wine based
    solely on language. The experimental re-
    sults show promising F-scores when using
    lexical and semantic information to predict
    the color, grape variety, country of origin,
    and price of a wine. This demonstrates,
    contrary to popular opinion, that wine ex-
    perts’ reviews really are informative.
  • Heyselaar, E., Hagoort, P., & Segaert, K. (2014). In dialogue with an avatar, syntax production is identical compared to dialogue with a human partner. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2351-2356). Austin, Tx: Cognitive Science Society.

    Abstract

    The use of virtual reality (VR) as a methodological tool is
    becoming increasingly popular in behavioural research due
    to its seemingly limitless possibilities. This new method has
    not been used frequently in the field of psycholinguistics,
    however, possibly due to the assumption that humancomputer
    interaction does not accurately reflect human-human
    interaction. In the current study we compare participants’
    language behaviour in a syntactic priming task with human
    versus avatar partners. Our study shows comparable priming
    effects between human and avatar partners (Human: 12.3%;
    Avatar: 12.6% for passive sentences) suggesting that VR is a
    valid platform for conducting language research and studying
    dialogue interactions.
  • Hill, C. (2018). Person reference and interaction in Umpila/Kuuku Ya'u narrative. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Hintz, F., & Scharenborg, O. (2016). Neighbourhood density influences word recognition in native and non-native speech recognition in noise. In H. Van den Heuvel, B. Cranen, & S. Mattys (Eds.), Proceedings of the Speech Processing in Realistic Environments (SPIRE) workshop (pp. 46-47). Groningen.
  • Hintz, F., & Scharenborg, O. (2016). The effect of background noise on the activation of phonological and semantic information during spoken-word recognition. In Proceedings of Interspeech 2016: The 17th Annual Conference of the International Speech Communication Association (pp. 2816-2820).

    Abstract

    During spoken-word recognition, listeners experience phonological competition between multiple word candidates, which increases, relative to optimal listening conditions, when speech is masked by noise. Moreover, listeners activate semantic word knowledge during the word’s unfolding. Here, we replicated the effect of background noise on phonological competition and investigated to which extent noise affects the activation of semantic information in phonological competitors. Participants’ eye movements were recorded when they listened to sentences containing a target word and looked at three types of displays. The displays either contained a picture of the target word, or a picture of a phonological onset competitor, or a picture of a word semantically related to the onset competitor, each along with three unrelated distractors. The analyses revealed that, in noise, fixations to the target and to the phonological onset competitor were delayed and smaller in magnitude compared to the clean listening condition, most likely reflecting enhanced phonological competition. No evidence for the activation of semantic information in the phonological competitors was observed in noise and, surprisingly, also not in the clear. We discuss the implications of the lack of an effect and differences between the present and earlier studies.
  • Hoffmann, C. W. G., Sadakata, M., Chen, A., Desain, P., & McQueen, J. M. (2014). Within-category variance and lexical tone discrimination in native and non-native speakers. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 45-49). Nijmegen: Radboud University Nijmegen.

    Abstract

    In this paper, we show how acoustic variance within lexical tones in disyllabic Mandarin Chinese pseudowords affects discrimination abilities in both native and non-native speakers of Mandarin Chinese. Within-category acoustic variance did not hinder native speakers in discriminating between lexical tones, whereas it precludes Dutch native speakers from reaching native level performance. Furthermore, the influence of acoustic variance was not uniform but asymmetric, dependent on the presentation order of the lexical tones to be discriminated. An exploratory analysis using an active adaptive oddball paradigm was used to quantify the extent of the perceptual asymmetry. We discuss two possible mechanisms underlying this asymmetry and propose possible paradigms to investigate these mechanisms
  • Hömke, P. (2019). The face in face-to-face communication: Signals of understanding and non-understanding. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Hopman, E., Thompson, B., Austerweil, J., & Lupyan, G. (2018). Predictors of L2 word learning accuracy: A big data investigation. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 513-518). Austin, TX: Cognitive Science Society.

    Abstract

    What makes some words harder to learn than others in a second language? Although some robust factors have been identified based on small scale experimental studies, many relevant factors are difficult to study in such experiments due to the amount of data necessary to test them. Here, we investigate what factors affect the ease of learning of a word in a second language using a large data set of users learning English as a second language through the Duolingo mobile app. In a regression analysis, we test and confirm the well-studied effect of cognate status on word learning accuracy. Furthermore, we find significant effects for both cross-linguistic semantic alignment and English semantic density, two novel predictors derived from large scale distributional models of lexical semantics. Finally, we provide data on several other psycholinguistically plausible word level predictors. We conclude with a discussion of the limits, benefits and future research potential of using big data for investigating second language learning.
  • Huettig, F., Kolinsky, R., & Lachmann, T. (Eds.). (2018). The effects of literacy on cognition and brain functioning [Special Issue]. Language, Cognition and Neuroscience, 33(3).
  • Irivine, E., & Roberts, S. G. (2016). Deictic tools can limit the emergence of referential symbol systems. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/99.html.

    Abstract

    Previous experiments and models show that the pressure to communicate can lead to the emergence of symbols in specific tasks. The experiment presented here suggests that the ability to use deictic gestures can reduce the pressure for symbols to emerge in co-operative tasks. In the 'gesture-only' condition, pairs built a structure together in 'Minecraft', and could only communicate using a small range of gestures. In the 'gesture-plus' condition, pairs could also use sound to develop a symbol system if they wished. All pairs were taught a pointing convention. None of the pairs we tested developed a symbol system, and performance was no different across the two conditions. We therefore suggest that deictic gestures, and non-referential means of organising activity sequences, are often sufficient for communication. This suggests that the emergence of linguistic symbols in early hominids may have been late and patchy with symbols only emerging in contexts where they could significantly improve task success or efficiency. Given the communicative power of pointing however, these contexts may be fewer than usually supposed. An approach for identifying these situations is outlined.
  • Irizarri van Suchtelen, P. (2016). Spanish as a heritage language in the Netherlands. A cognitive linguistic exploration. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Isbilen, E., Frost, R. L. A., Monaghan, P., & Christiansen, M. (2018). Bridging artificial and natural language learning: Comparing processing- and reflection-based measures of learning. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1856-1861). Austin, TX: Cognitive Science Society.

    Abstract

    A common assumption in the cognitive sciences is that artificial and natural language learning rely on shared mechanisms. However, attempts to bridge the two have yielded ambiguous results. We suggest that an empirical disconnect between the computations employed during learning and the methods employed at test may explain these mixed results. Further, we propose statistically-based chunking as a potential computational link between artificial and natural language learning. We compare the acquisition of non-adjacent dependencies to that of natural language structure using two types of tasks: reflection-based 2AFC measures, and processing-based recall measures, the latter being more computationally analogous to the processes used during language acquisition. Our results demonstrate that task-type significantly influences the correlations observed between artificial and natural language acquisition, with reflection-based and processing-based measures correlating within – but not across – task-type. These findings have fundamental implications for artificial-to-natural language comparisons, both methodologically and theoretically.
  • Janse, E., & Quené, H. (1999). On the suitability of the cross-modal semantic priming task. In Proceedings of the XIVth International Congress of Phonetic Sciences (pp. 1937-1940).
  • Janssen, D. (1999). Producing past and plural inflections. PhD Thesis, Radboud University Nijmegen, Nijmegen. doi:10.17617/2.2057667.
  • Janssen, R., Moisik, S. R., & Dediu, D. (2018). Agent model reveals the influence of vocal tract anatomy on speech during ontogeny and glossogeny. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 171-174). Toruń, Poland: NCU Press. doi:10.12775/3991-1.042.
  • Janssen, R. (2018). Let the agents do the talking: On the influence of vocal tract anatomy no speech during ontogeny. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Janssen, R., Winter, B., Dediu, D., Moisik, S. R., & Roberts, S. G. (2016). Nonlinear biases in articulation constrain the design space of language. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/86.html.

    Abstract

    In Iterated Learning (IL) experiments, a participant’s learned output serves as the next participant’s learning input (Kirby et al., 2014). IL can be used to model cultural transmission and has indicated that weak biases can be amplified through repeated cultural transmission (Kirby et al., 2007). So, for example, structural language properties can emerge over time because languages come to reflect the cognitive constraints in the individuals that learn and produce the language. Similarly, we propose that languages may also reflect certain anatomical biases. Do sound systems adapt to the affordances of the articulation space induced by the vocal tract?
    The human vocal tract has inherent nonlinearities which might derive from acoustics and aerodynamics (cf. quantal theory, see Stevens, 1989) or biomechanics (cf. Gick & Moisik, 2015). For instance, moving the tongue anteriorly along the hard palate to produce a fricative does not result in large changes in acoustics in most cases, but for a small range there is an abrupt change from a perceived palato-alveolar [ʃ] to alveolar [s] sound (Perkell, 2012). Nonlinearities such as these might bias all human speakers to converge on a very limited set of phonetic categories, and might even be a basis for combinatoriality or phonemic ‘universals’.
    While IL typically uses discrete symbols, Verhoef et al. (2014) have used slide whistles to produce a continuous signal. We conducted an IL experiment with human subjects who communicated using a digital slide whistle for which the degree of nonlinearity is controlled. A single parameter (α) changes the mapping from slide whistle position (the ‘articulator’) to the acoustics. With α=0, the position of the slide whistle maps Bark-linearly to the acoustics. As α approaches 1, the mapping gets more double-sigmoidal, creating three plateaus where large ranges of positions map to similar frequencies. In more abstract terms, α represents the strength of a nonlinear (anatomical) bias in the vocal tract.
    Six chains (138 participants) of dyads were tested, each chain with a different, fixed α. Participants had to communicate four meanings by producing a continuous signal using the slide-whistle in a ‘director-matcher’ game, alternating roles (cf. Garrod et al., 2007).
    Results show that for high αs, subjects quickly converged on the plateaus. This quick convergence is indicative of a strong bias, repelling subjects away from unstable regions already within-subject. Furthermore, high αs lead to the emergence of signals that oscillate between two (out of three) plateaus. Because the sigmoidal spaces are spatially constrained, participants increasingly used the sequential/temporal dimension. As a result of this, the average duration of signals with high α was ~100ms longer than with low α. These oscillations could be an expression of a basis for phonemic combinatoriality.
    We have shown that it is possible to manipulate the magnitude of an articulator-induced non-linear bias in a slide whistle IL framework. The results suggest that anatomical biases might indeed constrain the design space of language. In particular, the signaling systems in our study quickly converged (within-subject) on the use of stable regions. While these conclusions were drawn from experiments using slide whistles with a relatively strong bias, weaker biases could possibly be amplified over time by repeated cultural transmission, and likely lead to similar outcomes.
  • Janssen, R., Dediu, D., & Moisik, S. R. (2016). Simple agents are able to replicate speech sounds using 3d vocal tract model. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/97.html.

    Abstract

    Many factors have been proposed to explain why groups of people use different speech sounds in their language. These range from cultural, cognitive, environmental (e.g., Everett, et al., 2015) to anatomical (e.g., vocal tract (VT) morphology). How could such anatomical properties have led to the similarities and differences in speech sound distributions between human languages?

    It is known that hard palate profile variation can induce different articulatory strategies in speakers (e.g., Brunner et al., 2009). That is, different hard palate profiles might induce a kind of bias on speech sound production, easing some types of sounds while impeding others. With a population of speakers (with a proportion of individuals) that share certain anatomical properties, even subtle VT biases might become expressed at a population-level (through e.g., bias amplification, Kirby et al., 2007). However, before we look into population-level effects, we should first look at within-individual anatomical factors. For that, we have developed a computer-simulated analogue for a human speaker: an agent. Our agent is designed to replicate speech sounds using a production and cognition module in a computationally tractable manner.

    Previous agent models have often used more abstract (e.g., symbolic) signals. (e.g., Kirby et al., 2007). We have equipped our agent with a three-dimensional model of the VT (the production module, based on Birkholz, 2005) to which we made numerous adjustments. Specifically, we used a 4th-order Bezier curve that is able to capture hard palate variation on the mid-sagittal plane (XXX, 2015). Using an evolutionary algorithm, we were able to fit the model to human hard palate MRI tracings, yielding high accuracy fits and using as little as two parameters. Finally, we show that the samples map well-dispersed to the parameter-space, demonstrating that the model cannot generate unrealistic profiles. We can thus use this procedure to import palate measurements into our agent’s production module to investigate the effects on acoustics. We can also exaggerate/introduce novel biases.

    Our agent is able to control the VT model using the cognition module.

    Previous research has focused on detailed neurocomputation (e.g., Kröger et al., 2014) that highlights e.g., neurobiological principles or speech recognition performance. However, the brain is not the focus of our current study. Furthermore, present-day computing throughput likely does not allow for large-scale deployment of these architectures, as required by the population model we are developing. Thus, the question whether a very simple cognition module is able to replicate sounds in a computationally tractable manner, and even generalize over novel stimuli, is one worthy of attention in its own right.

    Our agent’s cognition module is based on running an evolutionary algorithm on a large population of feed-forward neural networks (NNs). As such, (anatomical) bias strength can be thought of as an attractor basin area within the parameter-space the agent has to explore. The NN we used consists of a triple-layered (fully-connected), directed graph. The input layer (three neurons) receives the formants frequencies of a target-sound. The output layer (12 neurons) projects to the articulators in the production module. A hidden layer (seven neurons) enables the network to deal with nonlinear dependencies. The Euclidean distance (first three formants) between target and replication is used as fitness measure. Results show that sound replication is indeed possible, with Euclidean distance quickly approaching a close-to-zero asymptote.

    Statistical analysis should reveal if the agent can also: a) Generalize: Can it replicate sounds not exposed to during learning? b) Replicate consistently: Do different, isolated agents always converge on the same sounds? c) Deal with consolidation: Can it still learn new sounds after an extended learning phase (‘infancy’) has been terminated? Finally, a comparison with more complex models will be used to demonstrate robustness.
  • Jeske, J., Kember, H., & Cutler, A. (2016). Native and non-native English speakers' use of prosody to predict sentence endings. In Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016).
  • St. John-Saaltink, E. (2016). When the past influences the present: Modulations of the sensory response by prior knowledge and task set. PhD Thesis, Radboud University, Nijmegen.
  • Jongman, S. R. (2016). Sustained attention in language production. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Joo, H., Jang, J., Kim, S., Cho, T., & Cutler, A. (2019). Prosodic structural effects on coarticulatory vowel nasalization in Australian English in comparison to American English. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 835-839). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    This study investigates effects of prosodic factors (prominence, boundary) on coarticulatory Vnasalization in Australian English (AusE) in CVN and NVC in comparison to those in American English
    (AmE). As in AmE, prominence was found to
    lengthen N, but to reduce V-nasalization, enhancing N’s nasality and V’s orality, respectively (paradigmatic contrast enhancement). But the prominence effect in CVN was more robust than that in AmE. Again similar to findings in AmE, boundary
    induced a reduction of N-duration and V-nasalization phrase-initially (syntagmatic contrast enhancement), and increased the nasality of both C and V phrasefinally.
    But AusE showed some differences in terms
    of the magnitude of V nasalization and N duration. The results suggest that the linguistic contrast enhancements underlie prosodic-structure modulation of coarticulatory V-nasalization in
    comparable ways across dialects, while the fine phonetic detail indicates that the phonetics-prosody interplay is internalized in the individual dialect’s phonetic grammar.
  • Jung, D., Klessa, K., Duray, Z., Oszkó, B., Sipos, M., Szeverényi, S., Várnai, Z., Trilsbeek, P., & Váradi, T. (2014). Languagesindanger.eu - Including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 530-535).

    Abstract

    The present paper describes the development of the languagesindanger.eu interactive website as an example of including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. The website is a product of INNET (Innovative networking in infrastructure for endangered languages), European FP7 project. Its main functions can be summarized as related to the three following areas: (1) raising students' awareness of language endangerment and arouse their interest in linguistic diversity, language maintenance and language documentation; (2) informing both students and teachers about these topics and show ways how they can enlarge their knowledge further with a special emphasis on information about language archives; (3) helping teachers include these topics into their classes. The website has been localized into five language versions with the intention to be accessible to both scientific and non-scientific communities such as (primarily) secondary school teachers and students, beginning university students of linguistics, journalists, the interested public, and also members of speech communities who speak minority languages
  • Kanero, J., Franko, I., Oranç, C., Uluşahin, O., Koskulu, S., Adigüzel, Z., Küntay, A. C., & Göksun, T. (2018). Who can benefit from robots? Effects of individual differences in robot-assisted language learning. In Proceedings of the 8th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp. 212-217). Piscataway, NJ, USA: IEEE.

    Abstract

    It has been suggested that some individuals may benefit more from social robots than do others. Using second
    language (L2) as an example, the present study examined how individual differences in attitudes toward robots and personality
    traits may be related to learning outcomes. Preliminary results with 24 Turkish-speaking adults suggest that negative attitudes
    toward robots, more specifically thoughts and anxiety about the negative social impact that robots may have on the society,
    predicted how well adults learned L2 words from a social robot. The possible implications of the findings as well as future directions are also discussed
  • Kember, H., Choi, J., & Cutler, A. (2016). Processing advantages for focused words in Korean. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 702-705).

    Abstract

    In Korean, focus is expressed in accentual phrasing. To ascertain whether words focused in this manner enjoy a processing advantage analogous to that conferred by focus as expressed in, e.g, English and Dutch, we devised sentences with target words in one of four conditions: prosodic focus, syntactic focus, prosodic + syntactic focus, and no focus as a control. 32 native speakers of Korean listened to blocks of 10 sentences, then were presented visually with words and asked whether or not they had heard them. Overall, words with focus were recognised significantly faster and more accurately than unfocused words. In addition, words with syntactic focus or syntactic + prosodic focus were recognised faster than words with prosodic focus alone. As for other languages, Korean focus confers processing advantage on the words carrying it. While prosodic focus does provide an advantage, however, syntactic focus appears to provide the greater beneficial effect for recognition memory
  • Kempen, G. (1997). De ontdubbelde taalgebruiker: Maken taalproductie en taalperceptie gebruik van één en dezelfde syntactische processor? [Abstract]. In 6e Winter Congres NvP. Programma and abstracts (pp. 31-32). Nederlandse Vereniging voor Psychonomie.
  • Kempen, G., Kooij, A., & Van Leeuwen, T. (1997). Do skilled readers exploit inflectional spelling cues that do not mirror pronunciation? An eye movement study of morpho-syntactic parsing in Dutch. In Abstracts of the Orthography Workshop "What spelling changes". Nijmegen: Max Planck Institute for Psycholinguistics.
  • Kempen, G., & Hoenkamp, E. (1982). Incremental sentence generation: Implications for the structure of a syntactic processor. In J. Horecký (Ed.), COLING 82. Proceedings of the Ninth International Conference on Computational Linguistics, Prague, July 5-10, 1982 (pp. 151-156). Amsterdam: North-Holland.

    Abstract

    Human speakers often produce sentences incrementally. They can start speaking having in mind only a fragmentary idea of what they want to say, and while saying this they refine the contents underlying subsequent parts of the utterance. This capability imposes a number of constraints on the design of a syntactic processor. This paper explores these constraints and evaluates some recent computational sentence generators from the perspective of incremental production.
  • Kirsch, J. (2018). Listening for the WHAT and the HOW: Older adults' processing of semantic and affective information in speech. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Klatter-Folmer, J., Van Hout, R., Van den Heuvel, H., Fikkert, P., Baker, A., De Jong, J., Wijnen, F., Sanders, E., & Trilsbeek, P. (2014). Vulnerability in acquisition, language impairments in Dutch: Creating a VALID data archive. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 357-364).

    Abstract

    The VALID Data Archive is an open multimedia data archive (under construction) with data from speakers suffering from language impairments. We report on a pilot project in the CLARIN-NL framework in which five data resources were curated. For all data sets concerned, written informed consent from the participants or their caretakers has been obtained. All materials were anonymized. The audio files were converted into wav (linear PCM) files and the transcriptions into CHAT or ELAN format. Research data that consisted of test, SPSS and Excel files were documented and converted into CSV files. All data sets obtained appropriate CMDI metadata files. A new CMDI metadata profile for this type of data resources was established and care was taken that ISOcat metadata categories were used to optimize interoperability. After curation all data are deposited at the Max Planck Institute for Psycholinguistics Nijmegen where persistent identifiers are linked to all resources. The content of the transcriptions in CHAT and plain text format can be searched with the TROVA search engine
  • Klein, W., & Musan, R. (Eds.). (1999). Das deutsche Perfekt [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (113).
  • Klein, W. (Ed.). (1997). Technologischer Wandel in den Philologien [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (106).
  • Klein, W. (Ed.). (1984). Textverständlichkeit - Textverstehen [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (55).
  • Klein, W. (Ed.). (1982). Zweitspracherwerb [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (45).
  • Koch, X. (2018). Age and hearing loss effects on speech processing. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Kok, P. (2014). On the role of expectation in visual perception: A top-down view of early visual cortex. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Kolipakam, V. (2018). A holistic approach to understanding pre-history. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Kösem, A. (2014). Cortical oscillations as temporal reference frames for perception. PhD Thesis, Université Pierre et Marie Curie-Paris VI, Paris.
  • Koster, M., & Cutler, A. (1997). Segmental and suprasegmental contributions to spoken-word recognition in Dutch. In Proceedings of EUROSPEECH 97 (pp. 2167-2170). Grenoble, France: ESCA.

    Abstract

    Words can be distinguished by segmental differences or by suprasegmental differences or both. Studies from English suggest that suprasegmentals play little role in human spoken-word recognition; English stress, however, is nearly always unambiguously coded in segmental structure (vowel quality); this relationship is less close in Dutch. The present study directly compared the effects of segmental and suprasegmental mispronunciation on word recognition in Dutch. There was a strong effect of suprasegmental mispronunciation, suggesting that Dutch listeners do exploit suprasegmental information in word recognition. Previous findings indicating the effects of mis-stressing for Dutch differ with stress position were replicated only when segmental change was involved, suggesting that this is an effect of segmental rather than suprasegmental processing.
  • Kouwenhoven, H. (2016). Situational variation in non-native communication: Studies into register variation, discourse management and pronunciation in Spanish English. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Kung, C. (2018). Speech comprehension in a tone language: The role of lexical tone, context, and intonation in Cantonese-Chinese. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Lam, K. J. Y. (2016). Understanding action-related language: Sensorimotor contributions to meaning. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Lartseva, A. (2016). Reading emotions: How people with Autism Spectrum Disorders process emotional language. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Lasser, I. (1997). Finiteness in adult and child German. PhD Thesis, Radboud University Nijmegen, Nijmegen. doi:10.17617/2.2057674.
  • Latrouite, A., & Van Valin Jr., R. D. (2014). Event existentials in Tagalog: A Role and Reference Grammar account. In W. Arka, & N. L. K. Mas Indrawati (Eds.), Argument realisations and related constructions in Austronesian languages: papers from 12-ICAL (pp. 161-174). Canberra: Pacific Linguistics.
  • Lattenkamp, E. Z., Vernes, S. C., & Wiegrebe, L. (2018). Mammalian models for the study of vocal learning: A new paradigm in bats. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 235-237). Toruń, Poland: NCU Press. doi:10.12775/3991-1.056.
  • Lauscher, A., Eckert, K., Galke, L., Scherp, A., Rizvi, S. T. R., Ahmed, S., Dengel, A., Zumstein, P., & Klein, A. (2018). Linked open citation database: Enabling libraries to contribute to an open and interconnected citation graph. In J. Chen, M. A. Gonçalves, J. M. Allen, E. A. Fox, M.-Y. Kan, & V. Petras (Eds.), JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries (pp. 109-118). New York: ACM. doi:10.1145/3197026.3197050.

    Abstract

    Citations play a crucial role in the scientific discourse, in information retrieval, and in bibliometrics. Many initiatives are currently promoting the idea of having free and open citation data. Creation of citation data, however, is not part of the cataloging workflow in libraries nowadays.
    In this paper, we present our project Linked Open Citation Database, in which we design distributed processes and a system infrastructure based on linked data technology. The goal is to show that efficiently cataloging citations in libraries using a semi-automatic approach is possible. We specifically describe the current state of the workflow and its implementation. We show that we could significantly improve the automatic reference extraction that is crucial for the subsequent data curation. We further give insights on the curation and linking process and provide evaluation results that not only direct the further development of the project, but also allow us to discuss its overall feasibility.
  • Lefever, E., Hendrickx, I., Croijmans, I., Van den Bosch, A., & Majid, A. (2018). Discovering the language of wine reviews: A text mining account. In N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, & T. Tokunaga (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 3297-3302). Paris: LREC.

    Abstract

    It is widely held that smells and flavors are impossible to put into words. In this paper we test this claim by seeking predictive patterns in wine reviews, which ostensibly aim to provide guides to perceptual content. Wine reviews have previously been critiqued as random and meaningless. We collected an English corpus of wine reviews with their structured metadata, and applied machine learning techniques to automatically predict the wine's color, grape variety, and country of origin. To train the three supervised classifiers, three different information sources were incorporated: lexical bag-of-words features, domain-specific terminology features, and semantic word embedding features. In addition, using regression analysis we investigated basic review properties, i.e., review length, average word length, and their relationship to the scalar values of price and review score. Our results show that wine experts do share a common vocabulary to describe wines and they use this in a consistent way, which makes it possible to automatically predict wine characteristics based on the review text alone. This means that odors and flavors may be more expressible in language than typically acknowledged.
  • Lenkiewicz, P., Drude, S., Lenkiewicz, A., Gebre, B. G., Masneri, S., Schreer, O., Schwenninger, J., & Bardeli, R. (2014). Application of audio and video processing methods for language research and documentation: The AVATecH Project. In Z. Vetulani, & J. Mariani (Eds.), 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers (pp. 288-299). Berlin: Springer.

    Abstract

    Evolution and changes of all modern languages is a wellknown fact. However, recently it is reaching dynamics never seen before, which results in loss of the vast amount of information encoded in every language. In order to preserve such rich heritage, and to carry out linguistic research, properly annotated recordings of world languages are necessary. Since creating those annotations is a very laborious task, reaching times 100 longer than the length of the annotated media, innovative video processing algorithms are needed, in order to improve the efficiency and quality of annotation process. This is the scope of the AVATecH project presented in this article
  • Lenkiewicz, P., Shkaravska, O., Goosen, T., Windhouwer, M., Broeder, D., Roth, S., & Olsson, O. (2014). The DWAN framework: Application of a web annotation framework for the general humanities to the domain of language resources. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3644-3649).
  • Lev-Ari, S., & Peperkamp, S. (2014). Do people converge to the linguistic patterns of non-reliable speakers? Perceptual learning from non-native speakers. In S. Fuchs, M. Grice, A. Hermes, L. Lancia, & D. Mücke (Eds.), Proceedings of the 10th International Seminar on Speech Production (ISSP) (pp. 261-264).

    Abstract

    People's language is shaped by the input from the environment. The environment, however, offers a range of linguistic inputs that differ in their reliability. We test whether listeners accordingly weigh input from sources that differ in reliability differently. Using a perceptual learning paradigm, we show that listeners adjust their representations according to linguistic input provided by native but not by non-native speakers. This is despite the fact that listeners are able to learn the characteristics of the speech of both speakers. These results provide evidence for a disassociation between adaptation to the characteristic of specific speakers and adjustment of linguistic representations in general based on these learned characteristics. This study also has implications for theories of language change. In particular, it cast doubts on the hypothesis that a large proportion of non-native speakers in a community can bring about linguistic changes
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Levelt, W. J. M. (1984). Spontaneous self-repairs in speech: Processes and representations. In M. P. R. Van den Broecke, & A. Cohen (Eds.), Proceedings of the 10th International Congress of Phonetic Sciences (pp. 105-117). Dordrecht: Foris.
  • Lew, A. A., Hall-Lew, L., & Fairs, A. (2014). Language and Tourism in Sabah, Malaysia and Edinburgh, Scotland. In B. O'Rourke, N. Bermingham, & S. Brennan (Eds.), Opening New Lines of Communication in Applied Linguistics: Proceedings of the 46th Annual Meeting of the British Association for Applied Linguistics (pp. 253-259). London, UK: Scitsiugnil Press.
  • Little, H., Eryılmaz, K., & De Boer, B. (2016). Emergence of signal structure: Effects of duration constraints. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/25.html.

    Abstract

    Recent work has investigated the emergence of structure in speech using experiments which use artificial continuous signals. Some experiments have had no limit on the duration which signals can have (e.g. Verhoef et al., 2014), and others have had time limitations (e.g. Verhoef et al., 2015). However, the effect of time constraints on the structure in signals has never been experimentally investigated.
  • Little, H., & de Boer, B. (2016). Did the pressure for discrimination trigger the emergence of combinatorial structure? In Proceedings of the 2nd Conference of the International Association for Cognitive Semiotics (pp. 109-110).
  • Little, H., Eryılmaz, K., & De Boer, B. (2016). Differing signal-meaning dimensionalities facilitates the emergence of structure. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/25.html.

    Abstract

    Structure of language is not only caused by cognitive processes, but also by physical aspects of the signalling modality. We test the assumptions surrounding the role which the physical aspects of the signal space will have on the emergence of structure in speech. Here, we use a signal creation task to test whether a signal space and a meaning space having similar dimensionalities will generate an iconic system with signal-meaning mapping and whether, when the topologies differ, the emergence of non-iconic structure is facilitated. In our experiments, signals are created using infrared sensors which use hand position to create audio signals. We find that people take advantage of signal-meaning mappings where possible. Further, we use trajectory probabilities and measures of variance to show that when there is a dimensionality mismatch, more structural strategies are used.
  • Little, H. (2016). Nahran Bhannamz: Language Evolution in an Online Zombie Apocalypse Game. In Createvolang: creativity and innovation in language evolution.
  • Little, H., & Silvey, C. (2014). Interpreting emerging structures: The interdependence of combinatoriality and compositionality. In Proceedings of the First Conference of the International Association for Cognitive Semiotics (IACS 2014) (pp. 113-114).
  • Little, H., & Eryilmaz, K. (2014). The effect of physical articulation constraints on the emergence of combinatorial structure. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-17).
  • Little, H., & De Boer, B. (2014). The effect of size of articulation space on the emergence of combinatorial structure. In E. Cartmill A., S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th international conference (EvoLangX) (pp. 479-481). Singapore: World Scientific.
  • Liu, Z., Chen, A., & Van de Velde, H. (2014). Prosodic focus marking in Bai. In N. Campbell, D. Gibbon, & D. Hirst (Eds.), Proceedings of Speech Prosody 2014 (pp. 628-631).

    Abstract

    This study investigates prosodic marking of focus in Bai, a Sino-Tibetan language spoken in the Southwest of China, by adopting a semi-spontaneous experimental approach. Our data show that Bai speakers increase the duration of the focused constituent and reduce the duration of the post-focus constituent to encode focus. However, duration is not used in Bai to distinguish focus types differing in size and contrastivity. Further, pitch plays no role in signaling focus and differentiating focus types. The results thus suggest that Bai uses prosody to mark focus, but to a lesser extent, compared to Mandarin Chinese, with which Bai has been in close contact for centuries, and Cantonese, to which Bai is similar in the tonal system, although Bai is similar to Cantonese in its reliance on duration in prosodic focus marking.
  • Liu, S., & Zhang, Y. (2019). Why some verbs are harder to learn than others – A micro-level analysis of everyday learning contexts for early verb learning. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2173-2178). Montreal, QB: Cognitive Science Society.

    Abstract

    Verb learning is important for young children. While most
    previous research has focused on linguistic and conceptual
    challenges in early verb learning (e.g. Gentner, 1982, 2006),
    the present paper examined early verb learning at the
    attentional level and quantified the input for early verb learning
    by measuring verb-action co-occurrence statistics in parent-
    child interaction from the learner’s perspective. To do so, we
    used head-mounted eye tracking to record fine-grained
    multimodal behaviors during parent-infant joint play, and
    analyzed parent speech, parent and infant action, and infant
    attention at the moments when parents produced verb labels.
    Our results show great variability across different action verbs,
    in terms of frequency of verb utterances, frequency of
    corresponding actions related to verb meanings, and infants’
    attention to verbs and actions, which provide new insights on
    why some verbs are harder to learn than others.
  • Lockwood, G., Hagoort, P., & Dingemanse, M. (2016). Synthesized Size-Sound Sound Symbolism. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 1823-1828). Austin, TX: Cognitive Science Society.

    Abstract

    Studies of sound symbolism have shown that people can associate sound and meaning in consistent ways when presented with maximally contrastive stimulus pairs of nonwords such as bouba/kiki (rounded/sharp) or mil/mal (small/big). Recent work has shown the effect extends to antonymic words from natural languages and has proposed a role for shared cross-modal correspondences in biasing form-to-meaning associations. An important open question is how the associations work, and particularly what the role is of sound-symbolic matches versus mismatches. We report on a learning task designed to distinguish between three existing theories by using a spectrum of sound-symbolically matching, mismatching, and neutral (neither matching nor mismatching) stimuli. Synthesized stimuli allow us to control for prosody, and the inclusion of a neutral condition allows a direct test of competing accounts. We find evidence for a sound-symbolic match boost, but not for a mismatch difficulty compared to the neutral condition.
  • Long, M. (2018). The lifelong interplay between language and cognition: From language learning to perspective-taking, new insights into the ageing mind. PhD Thesis, University of Edinburgh, Edinburgh.
  • Lopopolo, A., Frank, S. L., Van den Bosch, A., Nijhof, A., & Willems, R. M. (2018). The Narrative Brain Dataset (NBD), an fMRI dataset for the study of natural language processing in the brain. In B. Devereux, E. Shutova, & C.-R. Huang (Eds.), Proceedings of LREC 2018 Workshop "Linguistic and Neuro-Cognitive Resources (LiNCR) (pp. 8-11). Paris: LREC.

    Abstract

    We present the Narrative Brain Dataset, an fMRI dataset that was collected during spoken presentation of short excerpts of three
    stories in Dutch. Together with the brain imaging data, the dataset contains the written versions of the stimulation texts. The texts are
    accompanied with stochastic (perplexity and entropy) and semantic computational linguistic measures. The richness and unconstrained
    nature of the data allows the study of language processing in the brain in a more naturalistic setting than is common for fMRI studies.
    We hope that by making NBD available we serve the double purpose of providing useful neural data to researchers interested in natural
    language processing in the brain and to further stimulate data sharing in the field of neuroscience of language.
  • Lupyan, G., Wendorf, A., Berscia, L. M., & Paul, J. (2018). Core knowledge or language-augmented cognition? The case of geometric reasoning. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 252-254). Toruń, Poland: NCU Press. doi:10.12775/3991-1.062.
  • Macuch Silva, V., & Roberts, S. G. (2016). Language adapts to signal disruption in interaction. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/20.html.

    Abstract

    Linguistic traits are often seen as reflecting cognitive biases and constraints (e.g. Christiansen & Chater, 2008). However, language must also adapt to properties of the channel through which communication between individuals occurs. Perhaps the most basic aspect of any communication channel is noise. Communicative signals can be blocked, degraded or distorted by other sources in the environment. This poses a fundamental problem for communication. On average, channel disruption accompanies problems in conversation every 3 minutes (27% of cases of other-initiated repair, Dingemanse et al., 2015). Linguistic signals must adapt to this harsh environment. While modern language structures are robust to noise (e.g. Piantadosi et al., 2011), we investigate how noise might have shaped the early emergence of structure in language. The obvious adaptation to noise is redundancy. Signals which are maximally different from competitors are harder to render ambiguous by noise. Redundancy can be increased by adding differentiating segments to each signal (increasing the diversity of segments). However, this makes each signal more complex and harder to learn. Under this strategy, holistic languages may emerge. Another strategy is reduplication - repeating parts of the signal so that noise is less likely to disrupt all of the crucial information. This strategy does not increase the difficulty of learning the language - there is only one extra rule which applies to all signals. Therefore, under pressures for learnability, expressivity and redundancy, reduplicated signals are expected to emerge. However, reduplication is not a pervasive feature of words (though it does occur in limited domains like plurals or iconic meanings). We suggest that this is due to the pressure for redundancy being lifted by conversational infrastructure for repair. Receivers can request that senders repeat signals only after a problem occurs. That is, robustness is achieved by repeating the signal across conversational turns (when needed) instead of within single utterances. As a proof of concept, we ran two iterated learning chains with pairs of individuals in generations learning and using an artificial language (e.g. Kirby et al., 2015). The meaning space was a structured collection of unfamiliar images (3 shapes x 2 textures x 2 outline types). The initial language for each chain was the same written, unstructured, fully expressive language. Signals produced in each generation formed the training language for the next generation. Within each generation, pairs played an interactive communication game. The director was given a target meaning to describe, and typed a word for the matcher, who guessed the target meaning from a set. With a 50% probability, a contiguous section of 3-5 characters in the typed word was replaced by ‘noise’ characters (#). In one chain, the matcher could initiate repair by requesting that the director type and send another signal. Parallel generations across chains were matched for the number of signals sent (if repair was initiated for a meaning, then it was presented twice in the parallel generation where repair was not possible) and noise (a signal for a given meaning which was affected by noise in one generation was affected by the same amount of noise in the parallel generation). For the final set of signals produced in each generation we measured the signal redundancy (the zip compressibility of the signals), the character diversity (entropy of the characters of the signals) and systematic structure (z-score of the correlation between signal edit distance and meaning hamming distance). In the condition without repair, redundancy increased with each generation (r=0.97, p=0.01), and the character diversity decreased (r=-0.99,p=0.001) which is consistent with reduplication, as shown below (part of the initial and the final language): Linear regressions revealed that generations with repair had higher overall systematic structure (main effect of condition, t = 2.5, p < 0.05), increasing character diversity (interaction between condition and generation, t = 3.9, p = 0.01) and redundancy increased at a slower rate (interaction between condition and generation, t = -2.5, p < 0.05). That is, the ability to repair counteracts the pressure from noise, and facilitates the emergence of compositional structure. Therefore, just as systems to repair damage to DNA replication are vital for the evolution of biological species (O’Brien, 2006), conversational repair may regulate replication of linguistic forms in the cultural evolution of language. Future studies should further investigate how evolving linguistic structure is shaped by interaction pressures, drawing on experimental methods and naturalistic studies of emerging languages, both spoken (e.g Botha, 2006; Roberge, 2008) and signed (e.g Senghas, Kita, & Ozyurek, 2004; Sandler et al., 2005).
  • Mai, F., Galke, L., & Scherp, A. (2019). CBOW is not all you need: Combining CBOW with the compositional matrix space model. In Proceedings of the Seventh International Conference on Learning Representations (ICLR 2019). OpenReview.net.

    Abstract

    Continuous Bag of Words (CBOW) is a powerful text embedding method. Due to its strong capabilities to encode word content, CBOW embeddings perform well on a wide range of downstream tasks while being efficient to compute. However, CBOW is not capable of capturing the word order. The reason is that the computation of CBOW's word embeddings is commutative, i.e., embeddings of XYZ and ZYX are the same. In order to address this shortcoming, we propose a
    learning algorithm for the Continuous Matrix Space Model, which we call Continual Multiplication of Words (CMOW). Our algorithm is an adaptation of word2vec, so that it can be trained on large quantities of unlabeled text. We empirically show that CMOW better captures linguistic properties, but it is inferior to CBOW in memorizing word content. Motivated by these findings, we propose a hybrid model that combines the strengths of CBOW and CMOW. Our results show that the hybrid CBOW-CMOW-model retains CBOW's strong ability to memorize word content while at the same time substantially improving its ability to encode other linguistic information by 8%. As a result, the hybrid also performs better on 8 out of 11 supervised downstream tasks with an average improvement of 1.2%.
  • Mai, F., Galke, L., & Scherp, A. (2018). Using deep learning for title-based semantic subject indexing to reach competitive performance to full-text. In J. Chen, M. A. Gonçalves, J. M. Allen, E. A. Fox, M.-Y. Kan, & V. Petras (Eds.), JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries (pp. 169-178). New York: ACM.

    Abstract

    For (semi-)automated subject indexing systems in digital libraries, it is often more practical to use metadata such as the title of a publication instead of the full-text or the abstract. Therefore, it is desirable to have good text mining and text classification algorithms that operate well already on the title of a publication. So far, the classification performance on titles is not competitive with the performance on the full-texts if the same number of training samples is used for training. However, it is much easier to obtain title data in large quantities and to use it for training than full-text data. In this paper, we investigate the question how models obtained from training on increasing amounts of title training data compare to models from training on a constant number of full-texts. We evaluate this question on a large-scale dataset from the medical domain (PubMed) and from economics (EconBiz). In these datasets, the titles and annotations of millions of publications are available, and they outnumber the available full-texts by a factor of 20 and 15, respectively. To exploit these large amounts of data to their full potential, we develop three strong deep learning classifiers and evaluate their performance on the two datasets. The results are promising. On the EconBiz dataset, all three classifiers outperform their full-text counterparts by a large margin. The best title-based classifier outperforms the best full-text method by 9.4%. On the PubMed dataset, the best title-based method almost reaches the performance of the best full-text classifier, with a difference of only 2.9%.
  • Mainz, N. (2018). Vocabulary knowledge and learning: Individual differences in adult native speakers. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Mamus, E., Rissman, L., Majid, A., & Ozyurek, A. (2019). Effects of blindfolding on verbal and gestural expression of path in auditory motion events. In A. K. Goel, C. M. Seifert, & C. C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2275-2281). Montreal, QB: Cognitive Science Society.

    Abstract

    Studies have claimed that blind people’s spatial representations are different from sighted people, and blind people display superior auditory processing. Due to the nature of auditory and haptic information, it has been proposed that blind people have spatial representations that are more sequential than sighted people. Even the temporary loss of sight—such as through blindfolding—can affect spatial representations, but not much research has been done on this topic. We compared blindfolded and sighted people’s linguistic spatial expressions and non-linguistic localization accuracy to test how blindfolding affects the representation of path in auditory motion events. We found that blindfolded people were as good as sighted people when localizing simple sounds, but they outperformed sighted people when localizing auditory motion events. Blindfolded people’s path related speech also included more sequential, and less holistic elements. Our results indicate that even temporary loss of sight influences spatial representations of auditory motion events
  • Marcoux, K., & Ernestus, M. (2019). Differences between native and non-native Lombard speech in terms of pitch range. In M. Ochmann, M. Vorländer, & J. Fels (Eds.), Proceedings of the ICA 2019 and EAA Euroregio. 23rd International Congress on Acoustics, integrating 4th EAA Euroregio 2019 (pp. 5713-5720). Berlin: Deutsche Gesellschaft für Akustik.

    Abstract

    Lombard speech, speech produced in noise, is acoustically different from speech produced in quiet (plain speech) in several ways, including having a higher and wider F0 range (pitch). Extensive research on native Lombard speech does not consider that non-natives experience a higher cognitive load while producing
    speech and that the native language may influence the non-native speech. We investigated pitch range in plain and Lombard speech in native and non-natives.
    Dutch and American-English speakers read contrastive question-answer pairs in quiet and in noise in English, while the Dutch also read Dutch sentence pairs. We found that Lombard speech is characterized by a wider pitch range than plain speech, for all speakers (native English, non-native English, and native Dutch).
    This shows that non-natives also widen their pitch range in Lombard speech. In sentences with early-focus, we see the same increase in pitch range when going from plain to Lombard speech in native and non-native English, but a smaller increase in native Dutch. In sentences with late-focus, we see the biggest increase for the native English, followed by non-native English and then native Dutch. Together these results indicate an effect of the native language on non-native Lombard speech.
  • Marcoux, K., & Ernestus, M. (2019). Pitch in native and non-native Lombard speech. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 2605-2609). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Lombard speech, speech produced in noise, is
    typically produced with a higher fundamental
    frequency (F0, pitch) compared to speech in quiet. This paper examined the potential differences in native and non-native Lombard speech by analyzing median pitch in sentences with early- or late-focus produced in quiet and noise. We found an increase in pitch in late-focus sentences in noise for Dutch speakers in both English and Dutch, and for American-English speakers in English. These results
    show that non-native speakers produce Lombard speech, despite their higher cognitive load. For the early-focus sentences, we found a difference between the Dutch and the American-English speakers. Whereas the Dutch showed an increased F0 in noise
    in English and Dutch, the American-English speakers did not in English. Together, these results suggest that some acoustic characteristics of Lombard speech, such as pitch, may be language-specific, potentially
    resulting in the native language influencing the non-native Lombard speech.
  • Margetts, A. (1999). Valence and transitivity in Saliba: An Oceanic language of Papua New Guinea. PhD Thesis, Radboud University Nijmegen, Nijmegen. doi:10.17617/2.2057646.
  • Maslowski, M. (2019). Fast speech can sound slow: Effects of contextual speech rate on word recognition. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Matic, D., & Nikolaeva, I. (2014). Focus feature percolation: Evidence from Tundra Nenets and Tundra Yukaghir. In S. Müller (Ed.), Proceedings of the 21st International Conference on Head-Driven Phrase Structure Grammar (HPSG 2014) (pp. 299-317). Stanford, CA: CSLI Publications.

    Abstract

    Two Siberian languages, Tundra Nenets and Tundra Yukaghir, do not obey strong island constraints in questioning: any sub-constituent of a relative or adverbial clause can be questioned. We argue that this has to do with how focusing works in these languages. The focused sub-constituent remains in situ, but there is abundant morphosyntactic evidence that the focus feature is passed up to the head of the clause. The result is the formation of a complex focus structure in which both the head and non head daughter are overtly marked as focus, and they are interpreted as a pairwise list such that the focus background is applicable to this list, but not to other alternative lists
  • Merkx, D., Frank, S., & Ernestus, M. (2019). Language learning using speech to image retrieval. In Proceedings of Interspeech 2019 (pp. 1841-1845). doi:10.21437/Interspeech.2019-3067.

    Abstract

    Humans learn language by interaction with their environment and listening to other humans. It should also be possible for computational models to learn language directly from speech but so far most approaches require text. We improve on existing neural network approaches to create visually grounded embeddings for spoken utterances. Using a combination of a multi-layer GRU, importance sampling, cyclic learning rates, ensembling and vectorial self-attention our results show a remarkable increase in image-caption retrieval performance over previous work. Furthermore, we investigate which layers in the model learn to recognise words in the input. We find that deeper network layers are better at encoding word presence, although the final layer has slightly lower performance. This shows that our visually grounded sentence encoder learns to recognise words from the input even though it is not explicitly trained for word recognition.
  • Merkx, D., & Scharenborg, O. (2018). Articulatory feature classification using convolutional neural networks. In Proceedings of Interspeech 2018 (pp. 2142-2146). doi:10.21437/Interspeech.2018-2275.

    Abstract

    The ultimate goal of our research is to improve an existing speech-based computational model of human speech recognition on the task of simulating the role of fine-grained phonetic information in human speech processing. As part of this work we are investigating articulatory feature classifiers that are able to create reliable and accurate transcriptions of the articulatory behaviour encoded in the acoustic speech signal. Articulatory feature (AF) modelling of speech has received a considerable amount of attention in automatic speech recognition research. Different approaches have been used to build AF classifiers, most notably multi-layer perceptrons. Recently, deep neural networks have been applied to the task of AF classification. This paper aims to improve AF classification by investigating two different approaches: 1) investigating the usefulness of a deep Convolutional neural network (CNN) for AF classification; 2) integrating the Mel filtering operation into the CNN architecture. The results showed a remarkable improvement in classification accuracy of the CNNs over state-of-the-art AF classification results for Dutch, most notably in the minority classes. Integrating the Mel filtering operation into the CNN architecture did not further improve classification performance.
  • Meyer, A. S., & Huettig, F. (Eds.). (2016). Speaking and Listening: Relationships Between Language Production and Comprehension [Special Issue]. Journal of Memory and Language, 89.
  • Micklos, A. (2016). Interaction for facilitating conventionalization: Negotiating the silent gesture communication of noun-verb pairs. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/143.html.

    Abstract

    This study demonstrates how interaction – specifically negotiation and repair – facilitates the emergence, evolution, and conventionalization of a silent gesture communication system. In a modified iterated learning paradigm, partners communicated noun-verb meanings using only silent gesture. The need to disambiguate similar noun-verb pairs drove these "new" language users to develop a morphology that allowed for quicker processing, easier transmission, and improved accuracy. The specific morphological system that emerged came about through a process of negotiation within the dyad, namely by means of repair. By applying a discourse analytic approach to the use of repair in an experimental methodology for language evolution, we are able to determine not only if interaction facilitates the emergence and learnability of a new communication system, but also how interaction affects such a system
  • Micklos, A., Macuch Silva, V., & Fay, N. (2018). The prevalence of repair in studies of language evolution. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 316-318). Toruń, Poland: NCU Press. doi:10.12775/3991-1.075.
  • Micklos, A. (2014). The nature of language in interaction. In E. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference.
  • Mizera, P., Pollak, P., Kolman, A., & Ernestus, M. (2014). Impact of irregular pronunciation on phonetic segmentation of Nijmegen corpus of Casual Czech. In P. Sojka, A. Horák, I. Kopecek, & K. Pala (Eds.), Text, Speech and Dialogue: 17th International Conference, TSD 2014, Brno, Czech Republic, September 8-12, 2014. Proceedings (pp. 499-506). Heidelberg: Springer.

    Abstract

    This paper describes the pilot study of phonetic segmentation applied to Nijmegen Corpus of Casual Czech (NCCCz). This corpus contains informal speech of strong spontaneous nature which influences the character of produced speech at various levels. This work is the part of wider research related to the analysis of pronunciation reduction in such informal speech. We present the analysis of the accuracy of phonetic segmentation when canonical or reduced pronunciation is used. The achieved accuracy of realized phonetic segmentation provides information about general accuracy of proper acoustic modelling which is supposed to be applied in spontaneous speech recognition. As a byproduct of presented spontaneous speech segmentation, this paper also describes the created lexicon with canonical pronunciations of words in NCCCz, a tool supporting pronunciation check of lexicon items, and finally also a minidatabase of selected utterances from NCCCz manually labelled on phonetic level suitable for evaluation purposes
  • Moisik, S. R., Zhi Yun, D. P., & Dediu, D. (2019). Active adjustment of the cervical spine during pitch production compensates for shape: The ArtiVarK study. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 864-868). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    The anterior lordosis of the cervical spine is thought
    to contribute to pitch (fo) production by influencing
    cricoid rotation as a function of larynx height. This
    study examines the matter of inter-individual
    variation in cervical spine shape and whether this has
    an influence on how fo is produced along increasing
    or decreasing scales, using the ArtiVarK dataset,
    which contains real-time MRI pitch production data.
    We find that the cervical spine actively participates in
    fo production, but the amount of displacement
    depends on individual shape. In general, anterior
    spine motion (tending toward cervical lordosis)
    occurs for low fo, while posterior movement (tending
    towards cervical kyphosis) occurs for high fo.
  • Mulder, K., Ten Bosch, L., & Boves, L. (2016). Comparing different methods for analyzing ERP signals. In Proceedings of Interspeech 2016: The 17th Annual Conference of the International Speech Communication Association (pp. 1373-1377). doi:10.21437/Interspeech.2016-967.
  • Mulder, K., Ten Bosch, L., & Boves, L. (2018). Analyzing EEG Signals in Auditory Speech Comprehension Using Temporal Response Functions and Generalized Additive Models. In Proceedings of Interspeech 2018 (pp. 1452-1456). doi:10.21437/Interspeech.2018-1676.

    Abstract

    Analyzing EEG signals recorded while participants are listening to continuous speech with the purpose of testing linguistic hypotheses is complicated by the fact that the signals simultaneously reflect exogenous acoustic excitation and endogenous linguistic processing. This makes it difficult to trace subtle differences that occur in mid-sentence position. We apply an analysis based on multivariate temporal response functions to uncover subtle mid-sentence effects. This approach is based on a per-stimulus estimate of the response of the neural system to speech input. Analyzing EEG signals predicted on the basis of the response functions might then bring to light conditionspecific differences in the filtered signals. We validate this approach by means of an analysis of EEG signals recorded with isolated word stimuli. Then, we apply the validated method to the analysis of the responses to the same words in the middle of meaningful sentences.
  • Nijveld, A., Ten Bosch, L., & Ernestus, M. (2019). ERP signal analysis with temporal resolution using a time window bank. In Proceedings of Interspeech 2019 (pp. 1208-1212). doi:10.21437/Interspeech.2019-2729.

    Abstract

    In order to study the cognitive processes underlying speech comprehension, neuro-physiological measures (e.g., EEG and MEG), or behavioural measures (e.g., reaction times and response accuracy) can be applied. Compared to behavioural measures, EEG signals can provide a more fine-grained and complementary view of the processes that take place during the unfolding of an auditory stimulus.

    EEG signals are often analysed after having chosen specific time windows, which are usually based on the temporal structure of ERP components expected to be sensitive to the experimental manipulation. However, as the timing of ERP components may vary between experiments, trials, and participants, such a-priori defined analysis time windows may significantly hamper the exploratory power of the analysis of components of interest. In this paper, we explore a wide-window analysis method applied to EEG signals collected in an auditory repetition priming experiment.

    This approach is based on a bank of temporal filters arranged along the time axis in combination with linear mixed effects modelling. Crucially, it permits a temporal decomposition of effects in a single comprehensive statistical model which captures the entire EEG trace.

Share this page