Publications

  • Melinger, A., Schulte im Walde, S., & Weber, A. (2006). Characterizing response types and revealing noun ambiguity in German association norms. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics. Trento: Association for Computational Linguistics.

    Abstract

    This paper presents an analysis of semantic association norms for German nouns. In contrast to prior studies, we not only collected associations elicited by written representations of target objects but also by their pictorial representations. In a first analysis, we identified systematic differences in the type and distribution of associate responses for the two presentation forms. In a second analysis, we applied a soft cluster analysis to the collected target-response pairs. We subsequently used the clustering to predict noun ambiguity and to discriminate senses in our target nouns.
  • Meyer, A. S., & Wheeldon, L. (Eds.). (2006). Language production across the life span [Special Issue]. Language and Cognitive Processes, 21(1-3).
  • Mishra, C., Nandanwar, A., & Mishra, S. (2024). HRI in Indian education: Challenges & opportunities. In H. Admoni, D. Szafir, W. Johal, & A. Sandygulova (Eds.), Designing an introductory HRI course (workshop at HRI 2024). ArXiv. doi:10.48550/arXiv.2403.12223.

    Abstract

    With the recent advancements in the field of robotics and the increased focus on having general-purpose robots widely available to the general public, it has become increasingly necessary to pursue research into Human-robot interaction (HRI). While there have been a lot of works discussing frameworks for teaching HRI in educational institutions with a few institutions already offering courses to students, a consensus on the course content still eludes the field. In this work, we highlight a few challenges and opportunities while designing an HRI course from an Indian perspective. These topics warrant further deliberations as they have a direct impact on the design of HRI courses and wider implications for the entire field.
  • Mitterer, H. (2005). Short- and medium-term plasticity for speaker adaptation seem to be independent. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 83-86).
  • Motiekaitytė, K., Grosseck, O., Wolf, L., Bosker, H. R., Peeters, D., Perlman, M., Ortega, G., & Raviv, L. (2024). Iconicity and compositionality in emerging vocal communication systems: a Virtual Reality approach. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 387-389). Nijmegen: The Evolution of Language Conferences.
  • Norris, D., McQueen, J. M., & Cutler, A. (1994). Competition and segmentation in spoken word recognition. In Proceedings of the Third International Conference on Spoken Language Processing: Vol. 1 (pp. 401-404). Yokohama: PACIFICO.

    Abstract

    This paper describes recent experimental evidence which shows that models of spoken word recognition must incorporate both inhibition between competing lexical candidates and a sensitivity to metrical cues to lexical segmentation. A new version of the Shortlist [1][2] model incorporating the Metrical Segmentation Strategy [3] provides a detailed simulation of the data.
  • Offenga, F., Broeder, D., Wittenburg, P., Ducret, J., & Romary, L. (2006). Metadata profile in the ISO data category registry. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1866-1869).
  • Ozyurek, A. (1998). An analysis of the basic meaning of Turkish demonstratives in face-to-face conversational interaction. In S. Santi, I. Guaitella, C. Cave, & G. Konopczynski (Eds.), Oralite et gestualite: Communication multimodale, interaction: actes du colloque ORAGE 98 (pp. 609-614). Paris: L'Harmattan.
  • Ozyurek, A. (1994). How children talk about a conversation. In K. Beals, J. Denton, R. Knippen, L. Melnar, H. Suzuki, & E. Zeinfeld (Eds.), Papers from the Thirtieth Regional Meeting of the Chicago Linguistic Society: Main Session (pp. 309-319). Chicago, Ill: Chicago Linguistic Society.
  • Ozyurek, A. (1994). How children talk about conversations: Development of roles and voices. In E. V. Clark (Ed.), Proceedings of the Twenty-Sixth Annual Child Language Research Forum (pp. 197-206). Stanford: CSLI Publications.
  • Papafragou, A., & Ozturk, O. (2006). The acquisition of epistemic modality. In A. Botinis (Ed.), Proceedings of ITRW on Experimental Linguistics in ExLing-2006 (pp. 201-204). ISCA Archive.

    Abstract

    In this paper we try to contribute to the body of knowledge about the acquisition of English epistemic modal verbs (e.g. Mary may/has to be at school). Semantically, these verbs encode possibility or necessity with respect to available evidence. Pragmatically, the use of epistemic modals often gives rise to scalar conversational inferences (Mary may be at school -> Mary doesn’t have to be at school). The acquisition of epistemic modals is challenging for children on both these levels. In this paper, we present findings from two studies which were conducted with 5-year-old children and adults. Our findings, unlike previous work, show that 5-year-olds have mastered epistemic modal semantics, including the notions of necessity and possibility. However, they are still in the process of acquiring epistemic modal pragmatics.
  • Peirolo, M., Meyer, A. S., & Frances, C. (2024). Investigating the causes of prosodic marking in self-repairs: An automatic process? In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 1080-1084). doi:10.21437/SpeechProsody.2024-218.

    Abstract

    Natural speech involves repair. These repairs are often highlighted through prosodic marking (Levelt & Cutler, 1983). Prosodic marking usually entails an increase in pitch, loudness, and/or duration that draws attention to the corrected word. While it is established that natural self-repairs typically elicit prosodic marking, the exact cause of this is unclear. This study investigates whether producing a prosodic marking emerges from an automatic correction process or has a communicative purpose. In the current study, we elicit corrections to test whether all self-corrections elicit prosodic marking. Participants carried out a picture-naming task in which they described two images presented on-screen. To prompt self-correction, the second image was altered in some cases, requiring participants to abandon their initial utterance and correct their description to match the new image. This manipulation was compared to a control condition in which only the orientation of the object would change, eliciting no self-correction while still presenting a visual change. We found that the replacement of the item did not elicit a prosodic marking, regardless of the type of change. Theoretical implications and research directions are discussed, in particular theories of prosodic planning.
  • Pereiro Estevan, Y., Wan, V., Scharenborg, O., & Gallardo Antolín, A. (2006). Segmentación de fonemas no supervisada basada en métodos kernel de máximo margen. In Proceedings of IV Jornadas en Tecnología del Habla.

    Abstract

    This paper develops an automatic method for unsupervised phoneme segmentation. The method uses the maximum margin clustering algorithm [1] to segment phonemes in continuous speech without requiring any a priori information for training the system.
  • Petersson, K. M., Grenholm, P., & Forkstam, C. (2005). Artificial grammar learning and neural networks. In B. G. Bara, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th Annual Conference of the Cognitive Science Society (pp. 1726-1731).

    Abstract

    Recent fMRI studies indicate that language-related brain regions are engaged in artificial grammar (AG) processing. In the present study we investigate the Reber grammar by means of formal analysis and network simulations. We outline a new method for describing the network dynamics and propose an approach to grammar extraction based on the state-space dynamics of the network. We conclude that statistical frequency-based and rule-based acquisition procedures can be viewed as complementary perspectives on grammar learning, and more generally, that classical cognitive models can be viewed as a special case of a dynamical systems perspective on information processing.
  • Pluymaekers, M., Ernestus, M., Baayen, R. H., & Booij, G. (2006). The role of morphology in fine phonetic detail: The case of Dutch -igheid. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 53-54).
  • Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2006). Effects of word frequency on the acoustic durations of affixes. In Proceedings of Interspeech 2006 (pp. 953-956). Pittsburgh: ICSLP.

    Abstract

    This study investigates whether the acoustic durations of derivational affixes in Dutch are affected by the frequency of the word they occur in. In a word naming experiment, subjects were presented with a large number of words containing one of the affixes ge-, ver-, ont-, or -lijk. Their responses were recorded on DAT tapes, and the durations of the affixes were measured using Automatic Speech Recognition technology. To investigate whether frequency also affected durations when speech rate was high, the presentation rate of the stimuli was varied. The results show that a higher frequency of the word as a whole led to shorter acoustic realizations for all affixes. Furthermore, affixes became shorter as the presentation rate of the stimuli increased. There was no interaction between word frequency and presentation rate, suggesting that the frequency effect also applies in situations in which the speed of articulation is very high.
  • Poletiek, F. H., & Chater, N. (2006). Grammar induction profits from representative stimulus sampling. In R. Sun (Ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006) (pp. 1968-1973). Austin, TX, USA: Cognitive Science Society.
  • Poletiek, F. H., & Rassin, E. (Eds.). (2005). Het (on)bewuste [Special Issue]. De Psycholoog.
  • de Reus, K., Benítez-Burraco, A., Hersh, T. A., Groot, N., Lambert, M. L., Slocombe, K. E., Vernes, S. C., & Raviv, L. (2024). Self-domestication traits in vocal learning mammals. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 105-108). Nijmegen: The Evolution of Language Conferences.
  • Rohrer, P. L., Bujok, R., Van Maastricht, L., & Bosker, H. R. (2024). The timing of beat gestures affects lexical stress perception in Spanish. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 702-706). doi:10.21437/SpeechProsody.2024-142.

    Abstract

    It has been shown that when speakers produce hand gestures, addressees are attentive towards these gestures, using them to facilitate speech processing. Even relatively simple “beat” gestures are taken into account to help process aspects of speech such as prosodic prominence. In fact, recent evidence suggests that the timing of a beat gesture can influence spoken word recognition. In what has been termed the manual McGurk effect, Dutch participants presented with lexical stress minimal pair continua in Dutch were biased to hear lexical stress on the syllable that coincided with a beat gesture. However, little is known about how this manual McGurk effect would surface in languages other than Dutch, with different acoustic cues to prominence, and variable gestures. Therefore, this study tests the effect in Spanish where lexical stress is arguably even more important, being a contrastive cue in the regular verb conjugation system. Results from 24 participants corroborate the effect in Spanish, namely that when given the same auditory stimulus, participants were biased to perceive lexical stress on the syllable that visually co-occurred with a beat gesture. These findings extend the manual McGurk effect to a different language, emphasizing the impact of gestures' timing on prosody perception and spoken word recognition.
  • Rohrer, P. L., Hong, Y., & Bosker, H. R. (2024). Gestures time to vowel onset and change the acoustics of the word in Mandarin. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 866-870). doi:10.21437/SpeechProsody.2024-175.

    Abstract

    Recent research on multimodal language production has revealed that prominence in speech and gesture go hand-in-hand. Specifically, peaks in gesture (i.e., the apex) seem to closely coordinate with peaks in fundamental frequency (F0). The nature of this relationship may also be bi-directional, as it has also been shown that the production of gesture directly affects speech acoustics. However, most studies on the topic have largely focused on stress-based languages, where fundamental frequency has a prominence-lending function. Less work has been carried out on lexical tone languages such as Mandarin, where F0 is lexically distinctive. In this study, four native Mandarin speakers were asked to produce single monosyllabic CV words, taken from minimal lexical tone triplets (e.g., /pi1/, /pi2/, /pi3/), either with or without a beat gesture. Our analyses of the timing of the gestures showed that the gesture apex most stably occurred near vowel onset, with consonantal duration being the strongest predictor of apex placement. Acoustic analyses revealed that words produced with gesture showed raised F0 contours, greater intensity, and shorter durations. These findings further our understanding of gesture-speech alignment in typologically diverse languages, and add to the discussion about multimodal prominence.
  • Ronderos, C. R., Zhang, Y., & Rubio-Fernandez, P. (2024). Weighted parameters in demonstrative use: The case of Spanish teens and adults. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 3279-3286).
  • Rubio-Fernandez, P., Long, M., Shukla, V., Bhatia, V., Mahapatra, A., Ralekar, C., Ben-Ami, S., & Sinha, P. (2024). Multimodal communication in newly sighted children: An investigation of the relation between visual experience and pragmatic development. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 2560-2567).

    Abstract

    We investigated the relationship between visual experience and pragmatic development by testing the socio-communicative skills of a unique population: the Prakash children of India, who received treatment for congenital cataracts after years of visual deprivation. Using two different referential communication tasks, our study investigated Prakash children's ability to produce sufficiently informative referential expressions (e.g., ‘the green pear’ or ‘the small plate’) and pay attention to their interlocutor's face during the task (Experiment 1), as well as their ability to recognize a speaker's referential intent through non-verbal cues such as head turning and pointing (Experiment 2). Our results show that Prakash children have strong pragmatic skills, but do not look at their interlocutor's face as often as neurotypical children do. However, longitudinal analyses revealed an increase in face fixations, suggesting that over time, Prakash children come to utilize their improved visual skills for efficient referential communication.

  • Sander, J., Çetinçelik, M., Zhang, Y., Rowland, C. F., & Harmon, Z. (2024). Why does joint attention predict vocabulary acquisition? The answer depends on what coding scheme you use. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 1607-1613).

    Abstract

    Despite decades of study, we still know less than we would like about the association between joint attention (JA) and language acquisition. This is partly because of disagreements on how to operationalise JA. In this study, we examine the impact of applying two different, influential JA operationalisation schemes to the same dataset of child-caregiver interactions, to determine which yields a better fit to children's later vocabulary size. Two coding schemes, one defining JA in terms of gaze overlap and one in terms of social aspects of shared attention, were applied to video-recordings of dyadic naturalistic toy-play interactions (N=45). We found that JA was predictive of later production vocabulary when operationalised as shared focus (study 1), but also that its operationalisation as shared social awareness increased its predictive power (study 2). Our results emphasise the critical role of methodological choices in understanding how and why JA is associated with vocabulary size.
  • Sauter, D., Wiland, J., Warren, J., Eisner, F., Calder, A., & Scott, S. K. (2005). Sounds of joy: An investigation of vocal expressions of positive emotions [Abstract]. Journal of Cognitive Neuroscience, 61(Supplement), B99.

    Abstract

    A series of experiments tested Ekman’s (1992) hypothesis that there is a set of positive basic emotions that are expressed using vocal para-linguistic sounds, e.g. laughter and cheers. The proposed categories investigated were amusement, contentment, pleasure, relief and triumph. Behavioural testing using a forced-choice task indicated that participants were able to reliably recognize vocal expressions of the proposed emotions. A cross-cultural study in the preliterate Himba culture in Namibia confirmed that these categories are also recognized across cultures. A recognition test of acoustically manipulated emotional vocalizations established that the recognition of different emotions utilizes different vocal cues, and that these in turn differ from the cues used when comprehending speech. In a study using fMRI we found that relative to a signal-correlated noise baseline, the paralinguistic expressions of emotion activated bilateral superior temporal gyri and sulci, lateral and anterior to primary auditory cortex, which is consistent with the processing of non-linguistic vocal cues in the auditory ‘what’ pathway. Notably, amusement was associated with greater activation extending into both temporal poles and amygdala and insular cortex. Overall, these results support the claim that ‘happiness’ can be fractionated into amusement, pleasure, relief and triumph.
  • Scharenborg, O., Wan, V., & Moore, R. K. (2006). Capturing fine-phonetic variation in speech through automatic classification of articulatory features. In Speech Recognition and Intrinsic Variation Workshop [SRIV2006] (pp. 77-82). ISCA Archive.

    Abstract

    The ultimate goal of our research is to develop a computational model of human speech recognition that is able to capture the effects of fine-grained acoustic variation on speech recognition behaviour. As part of this work we are investigating automatic feature classifiers that are able to create reliable and accurate transcriptions of the articulatory behaviour encoded in the acoustic speech signal. In the experiments reported here, we compared support vector machines (SVMs) with multilayer perceptrons (MLPs). MLPs have been widely (and rather successfully) used for the task of multi-value articulatory feature classification, while (to the best of our knowledge) SVMs have not. This paper compares the performances of the two classifiers and analyses the results in order to better understand the articulatory representations. It was found that the MLPs outperformed the SVMs, but it is concluded that both classifiers exhibit similar behaviour in terms of patterns of errors.
  • Scharenborg, O., & Seneff, S. (2005). A two-pass strategy for handling OOVs in a large vocabulary recognition task. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology (pp. 1669-1672). ISCA Archive.

    Abstract

    This paper addresses the issue of large-vocabulary recognition in a specific word class. We propose a two-pass strategy in which only major cities are explicitly represented in the first stage lexicon. An unknown word model encoded as a phone loop is used to detect OOV city names (referred to as rare city names). SpeM, a tool that can extract words and word-initial cohorts from phone graphs on the basis of a large fallback lexicon, then provides an N-best list of promising city names on the basis of the phone sequences generated in the first stage. This N-best list is then inserted into the second stage lexicon for a subsequent recognition pass. Experiments were conducted on a set of spontaneous telephone-quality utterances each containing one rare city name. We tested the size of the N-best list and three types of language models (LMs). The experiments showed that SpeM was able to include nearly 85% of the correct city names into an N-best list of 3000 city names when a unigram LM, which also boosted the unigram scores of a city name in a given state, was used.
  • Scharenborg, O. (2005). Parallels between HSR and ASR: How ASR can contribute to HSR. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology (pp. 1237-1240). ISCA Archive.

    Abstract

    In this paper, we illustrate the close parallels between the research fields of human speech recognition (HSR) and automatic speech recognition (ASR) using a computational model of human word recognition, SpeM, which was built using techniques from ASR. We show that ASR has proven to be useful for improving models of HSR by relieving them of some of their shortcomings. However, in order to build an integrated computational model of all aspects of HSR, a lot of issues remain to be resolved. In this process, ASR algorithms and techniques definitely can play an important role.
  • Scott, S., & Sauter, D. (2006). Non-verbal expressions of emotion - acoustics, valence, and cross cultural factors. In Third International Conference on Speech Prosody 2006. ISCA.

    Abstract

    This presentation will address aspects of the expression of emotion in non-verbal vocal behaviour, specifically attempting to determine the roles of both positive and negative emotions, their acoustic bases, and the extent to which these are recognized in non-Western cultures.
  • Seuren, P. A. M. (1971). Qualche osservazione sulla frase durativa e iterativa in italiano. In M. Medici, & R. Simone (Eds.), Grammatica trasformazionale italiana (pp. 209-224). Roma: Bulzoni.
  • Seuren, P. A. M. (1994). The computational lexicon: All lexical content is predicate. In Z. Yusoff (Ed.), Proceedings of the International Conference on Linguistic Applications 26-28 July 1994 (pp. 211-216). Penang: Universiti Sains Malaysia, Unit Terjemahan Melalui Komputer (UTMK).
  • Seuren, P. A. M. (1994). Translation relations in semantic syntax. In G. Bouma, & G. Van Noord (Eds.), CLIN IV: Papers from the Fourth CLIN Meeting (pp. 149-162). Groningen: Vakgroep Alfa-informatica, Rijksuniversiteit Groningen.
  • Seuren, P. A. M. (1980). Variabele competentie: Linguïstiek en sociolinguïstiek anno 1980. In Handelingen van het 36e Nederlands Filologencongres: Gehouden te Groningen op woensdag 9, donderdag 10 en vrijdag 11 April 1980 (pp. 41-56). Amsterdam: Holland University Press.
  • Sidnell, J., & Stivers, T. (Eds.). (2005). Multimodal Interaction [Special Issue]. Semiotica, 156.
  • Silverstein, P., Bergmann, C., & Syed, M. (Eds.). (2024). Open science and metascience in developmental psychology [Special Issue]. Infant and Child Development, 33(1).
  • Sprenger, S. A., & Van Rijn, H. (2005). Clock time naming: Complexities of a simple task. In B. G. Bara, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th Annual Meeting of the Cognitive Science Society (pp. 2062-2067).
  • Ten Bosch, L., Baayen, R. H., & Ernestus, M. (2006). On speech variation and word type differentiation by articulatory feature representations. In Proceedings of Interspeech 2006 (pp. 2230-2233).

    Abstract

    This paper describes ongoing research aiming at the description of variation in speech as represented by asynchronous articulatory features. We will first illustrate how distances in the articulatory feature space can be used for event detection along speech trajectories in this space. The temporal structure imposed by the cosine distance in articulatory feature space coincides to a large extent with the manual segmentation on phone level. The analysis also indicates that the articulatory feature representation yields better alignments of this kind than the MFCC representation does. Secondly, we will present first results that indicate that articulatory features can be used to probe for acoustic differences in the onsets of Dutch singulars and plurals.
  • ten Bosch, L., Hämäläinen, A., Scharenborg, O., & Boves, L. (2006). Acoustic scores and symbolic mismatch penalties in phone lattices. In Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing [ICASSP 2006]. IEEE.

    Abstract

    This paper builds on previous work that aims at unraveling the structure of the speech signal by means of using probabilistic representations. The context of this work is a multi-pass speech recognition system in which a phone lattice is created and used as a basis for a lexical search in which symbolic mismatches are allowed at certain costs. The focus is on the optimization of the costs of phone insertions, deletions and substitutions that are used in the lexical decoding pass. Two optimization approaches are presented, one related to a multi-pass computational model for human speech recognition, the other based on a decoding in which Bayes’ risks are minimized. In the final section, the advantages of these optimization methods are discussed and compared.
  • ten Bosch, L., & Scharenborg, O. (2005). ASR decoding in a computational model of human word recognition. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology (pp. 1241-1244). ISCA Archive.

    Abstract

    This paper investigates the interaction between acoustic scores and symbolic mismatch penalties in multi-pass speech decoding techniques that are based on the creation of a segment graph followed by a lexical search. The interaction between acoustic and symbolic mismatches determines to a large extent the structure of the search space of these multipass approaches. The background of this study is a recently developed computational model of human word recognition, called SpeM. SpeM is able to simulate human word recognition data and is built as a multi-pass speech decoder. Here, we focus on unravelling the structure of the search space that is used in SpeM and similar decoding strategies. Finally, we elaborate on the close relation between distances in this search space, and distance measures in search spaces that are based on a combination of acoustic and phonetic features.
  • Tuinman, A. (2006). Overcompensation of /t/ reduction in Dutch by German/Dutch bilinguals. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 101-102).
  • Uluşahin, O., Bosker, H. R., McQueen, J. M., & Meyer, A. S. (2024). Knowledge of a talker’s f0 affects subsequent perception of voiceless fricatives. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 432-436).

    Abstract

    The human brain deals with the infinite variability of speech through multiple mechanisms. Some of them rely solely on information in the speech input (i.e., signal-driven) whereas some rely on linguistic or real-world knowledge (i.e., knowledge-driven). Many signal-driven perceptual processes rely on the enhancement of acoustic differences between incoming speech sounds, producing contrastive adjustments. For instance, when an ambiguous voiceless fricative is preceded by a high fundamental frequency (f0) sentence, the fricative is perceived as having a lower spectral center of gravity (CoG). However, it is not clear whether knowledge of a talker’s typical f0 can lead to similar contrastive effects. This study investigated a possible talker f0 effect on fricative CoG perception. In the exposure phase, two groups of participants (N=16 each) heard the same talker at high or low f0 for 20 minutes. Later, in the test phase, participants rated fixed-f0 /?ɔk/ tokens as being /sɔk/ (i.e., high CoG) or /ʃɔk/ (i.e., low CoG), where /?/ represents a fricative from a 5-step /s/-/ʃ/ continuum. Surprisingly, the data revealed the opposite of our contrastive hypothesis, whereby hearing high f0 instead biased perception towards high CoG. Thus, we demonstrated that talker f0 information affects fricative CoG perception.
  • Van den Bos, E. J., & Poletiek, F. H. (2006). Implicit artificial grammar learning in adults and children. In R. Sun (Ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006) (pp. 2619). Austin, TX, USA: Cognitive Science Society.
  • van der Burght, C. L., & Meyer, A. S. (2024). Interindividual variation in weighting prosodic and semantic cues during sentence comprehension – a partial replication of Van der Burght et al. (2021). In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 792-796). doi:10.21437/SpeechProsody.2024-160.

    Abstract

    Contrastive pitch accents can mark sentence elements occupying parallel roles. In “Mary kissed John, not Peter”, a pitch accent on Mary or John cues the implied syntactic role of Peter. Van der Burght, Friederici, Goucha, and Hartwigsen (2021) showed that listeners can build expectations concerning syntactic and semantic properties of upcoming words, derived from pitch accent information they heard previously. To further explore these expectations, we attempted a partial replication of the original German study in Dutch. In the experimental sentences “Yesterday, the police officer arrested the thief, not the inspector/murderer”, a pitch accent on subject or object cued the subject/object role of the ellipsis clause. Contrasting elements were additionally cued by the thematic role typicality of the nouns. Participants listened to sentences in which the ellipsis clause was omitted and selected the most plausible sentence-final noun (presented visually) via button press. Replicating the original study results, listeners based their sentence-final preference on the pitch accent information available in the sentence. However, as in the original study, individual differences between listeners were found, with some following prosodic information and others relying on a structural bias. The results complement the literature on ellipsis resolution and on interindividual variability in cue weighting.
  • Weber, A. (1998). Listening to nonnative language which violates native assimilation rules. In D. Duez (Ed.), Proceedings of the European Scientific Communication Association workshop: Sound patterns of Spontaneous Speech (pp. 101-104).

    Abstract

    Recent studies using phoneme detection tasks have shown that spoken-language processing is neither facilitated nor interfered with by optional assimilation, but is inhibited by violation of obligatory assimilation. Interpretation of these results depends on an assessment of their generality, specifically, whether they also obtain when listeners are processing nonnative language. Two separate experiments are presented in which native listeners of German and native listeners of Dutch had to detect a target fricative in legal monosyllabic Dutch nonwords. All of the nonwords were correct realisations in standard Dutch. For German listeners, however, half of the nonwords contained phoneme strings which violate the German fricative assimilation rule. Whereas the Dutch listeners showed no significant effects, German listeners detected the target fricative faster when the German fricative assimilation was violated than when no violation occurred. The results might suggest that violation of assimilation rules does not have to make processing more difficult per se.
  • Widlok, T. (2006). Two ways of looking at a Mangetti grove. In A. Takada (Ed.), Proceedings of the workshop: Landscape and society (pp. 11-16). Kyoto: 21st Century Center of Excellence Program.
  • Wittek, A. (1998). Learning verb meaning via adverbial modification: Change-of-state verbs in German and the adverb "wieder" again. In A. Greenhill, M. Hughes, H. Littlefield, & H. Walsh (Eds.), Proceedings of the 22nd Annual Boston University Conference on Language Development (pp. 779-790). Somerville, MA: Cascadilla Press.
  • Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: a professional framework for multimodality research. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1556-1559).

    Abstract

    Utilization of computer tools in linguistic research has gained importance with the maturation of media frameworks for the handling of digital audio and video. The increased use of these tools in gesture, sign language and multimodal interaction studies has led to stronger requirements on the flexibility, the efficiency and in particular the time accuracy of annotation tools. This paper describes the efforts made to make ELAN a tool that meets these requirements, with special attention to the developments in the area of time accuracy. In subsequent sections an overview will be given of other enhancements in the latest versions of ELAN that make it a useful tool in multimodality research.
  • Wittenburg, P., Broeder, D., Klein, W., Levinson, S. C., & Romary, L. (2006). Foundations of modern language resource archives. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 625-628).

    Abstract

    A number of serious reasons will convince an increasing number of researchers to store their relevant material in centers which we will call "language resource archives". They combine the duty of taking care of long-term preservation with the task of giving different user groups access to their material. Access here is meant in the sense that an active interaction with the data will be made possible to support the integration of new data, new versions or commentaries of all sorts. Modern Language Resource Archives will have to adhere to a number of basic principles to fulfill all requirements and they will have to be involved in federations to create joint language resource domains, making it even simpler for researchers to access the data. This paper makes an attempt to formulate the essential pillars language resource archives have to adhere to.
  • Yang, J., Zhang, Y., & Yu, C. (2024). Learning semantic knowledge based on infant real-time. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (Eds.), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 741-747).

    Abstract

    Early word learning involves mapping individual words to their meanings and building organized semantic representations among words. Previous corpus-based studies (e.g., using text from websites, newspapers, child-directed speech corpora) demonstrated that linguistic information such as word co-occurrence alone is sufficient to build semantically organized word knowledge. The present study explored two new research directions to advance understanding of how infants acquire semantically organized word knowledge. First, infants in the real world hear words surrounded by contextual information. Going beyond inferring semantic knowledge merely from language input, we examined the role of extra-linguistic contextual information in learning semantic knowledge. Second, previous research relies on large amounts of linguistic data to demonstrate in-principle learning, which is unrealistic compared with the input children receive. Here, we showed that incorporating extra-linguistic information provides an efficient mechanism through which semantic knowledge can be acquired with a small amount of data infants perceive in everyday learning contexts, such as toy play.

  • Zhou, Y., van der Burght, C. L., & Meyer, A. S. (2024). Investigating the role of semantics and perceptual salience in the memory benefit of prosodic prominence. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 1250-1254). doi:10.21437/SpeechProsody.2024-252.

    Abstract

    Prosodic prominence can enhance memory for the prominent words. This mnemonic benefit has been linked to listeners’ allocation of attention and deeper processing, which leads to more robust semantic representations. We investigated whether, in addition to the well-established effect at the semantic level, there was a memory benefit for prominent words at the phonological level. To do so, participants (48 native speakers of Dutch) first performed an accent judgement task, where they had to discriminate accented from unaccented words, and accented from unaccented pseudowords. All stimuli were presented in lists. They then performed an old/new recognition task for the stimuli. Accuracy in the accent judgement task was equally high for words and pseudowords. In the recognition task, performance was, as expected, better for words than pseudowords. More importantly, there was an interaction of accent with word type, with a significant advantage for accented compared to unaccented words, but not for pseudowords. The results confirm the memory benefit for accented compared to unaccented words seen in earlier studies, and they are consistent with the view that prominence primarily affects the semantic encoding of words. There was no evidence for an additional memory benefit arising at the phonological level.
  • Zora, H., Bowin, H., Heldner, M., Riad, T., & Hagoort, P. (2024). The role of pitch accent in discourse comprehension and the markedness of Accent 2 in Central Swedish. In Y. Chen, A. Chen, & A. Arvaniti (Eds.), Proceedings of Speech Prosody 2024 (pp. 921-925). doi:10.21437/SpeechProsody.2024-186.

    Abstract

    In Swedish, words are associated with either of two pitch contours known as Accent 1 and Accent 2. Using a psychometric test, we investigated how listeners judge pitch accent violations while interpreting discourse. Forty native speakers of Central Swedish were presented with auditory dialogues, where test words were appropriately or inappropriately accented in a given context, and asked to judge the correctness of sentences containing the test words. Data indicated a statistically significant effect of wrong accent pattern on the correctness judgment. Both Accent 1 and Accent 2 violations interfered with the coherent interpretation of discourse and were judged as incorrect by the listeners. Moreover, there was a statistically significant difference in the perceived correctness between the accent patterns. Accent 2 violations led to a lower correctness score compared to Accent 1 violations, indicating that the listeners were more sensitive to pitch accent violations in Accent 2 words than in Accent 1 words. This result is in line with the notion that Accent 2 is marked and lexically represented in Central Swedish. Taken together, these findings indicate that listeners use both Accent 1 and Accent 2 to arrive at the correct interpretation of the linguistic input, while assigning varying degrees of relevance to them depending on their markedness.
