Publications

Displaying 101 - 128 of 128
  • Scheu, O., & Zinn, C. (2007). How did the e-learning session go? The student inspector. In Proceedings of the 13th International Conference on Artificial Intelligence and Education (AIED 2007). Amsterdam: IOS Press.

    Abstract

    Good teachers know their students, and exploit this knowledge to adapt or optimise their instruction. Traditional teachers know their students because they interact with them face-to-face in classroom or one-to-one tutoring sessions. In these settings, they can build student models, i.e., by exploiting the multi-faceted nature of human-human communication. In distance-learning contexts, teacher and student have to cope with the lack of such direct interaction, and this must have detrimental effects for both teacher and student. In a past study we have analysed teacher requirements for tracking student actions in computer-mediated settings. Given the results of this study, we have devised and implemented a tool that allows teachers to keep track of their learners'interaction in e-learning systems. We present the tool's functionality and user interfaces, and an evaluation of its usability.
  • Schiller, N. O. (2003). Metrical stress in speech production: A time course study. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 451-454). Adelaide: Causal Productions.

    Abstract

    This study investigated the encoding of metrical information during speech production in Dutch. In Experiment 1, participants were asked to judge whether bisyllabic picture names had initial or final stress. Results showed significantly faster decision times for initially stressed targets (e.g., LEpel 'spoon') than for targets with final stress (e.g., liBEL 'dragon fly'; capital letters indicate stressed syllables) and revealed that the monitoring latencies are not a function of the picture naming or object recognition latencies to the same pictures. Experiments 2 and 3 replicated the outcome of the first experiment with bi- and trisyllabic picture names. These results demonstrate that metrical information of words is encoded rightward incrementally during phonological encoding in speech production. The results of these experiments are in line with Levelt's model of phonological encoding.
  • Schiller, N. O., Van Lieshout, P. H. H. M., Meyer, A. S., & Levelt, W. J. M. (1997). Is the syllable an articulatory unit in speech production? Evidence from an Emma study. In P. Wille (Ed.), Fortschritte der Akustik: Plenarvorträge und Fachbeiträge der 23. Deutschen Jahrestagung für Akustik (DAGA 97) (pp. 605-606). Oldenburg: DEGA.
  • Schulte im Walde, S., Melinger, A., Roth, M., & Weber, A. (2007). An empirical characterization of response types in German association norms. In Proceedings of the GLDV workshop on lexical-semantic and ontological resources.
  • Scott, D. R., & Cutler, A. (1982). Segmental cues to syntactic structure. In Proceedings of the Institute of Acoustics 'Spectral Analysis and its Use in Underwater Acoustics' (pp. E3.1-E3.4). London: Institute of Acoustics.
  • Seidl, A., & Johnson, E. K. (2003). Position and vowel quality effects in infant's segmentation of vowel-initial words. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2233-2236). Adelaide: Causal Productions.
  • Sekine, K., & Kajikawa, T. (2023). Does the spatial distribution of a speaker's gaze and gesture impact on a listener's comprehension of discourse? In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527208.

    Abstract

    This study investigated the impact of a speaker's gaze direction
    on a listener's comprehension of discourse. Previous research
    suggests that hand gestures play a role in referent allocation,
    enabling listeners to better understand the discourse. The
    current study aims to determine whether the speaker's gaze
    direction has a similar effect on reference resolution as co-
    speech gestures. Thirty native Japanese speakers participated in
    the study and were assigned to one of three conditions:
    congruent, incongruent, or speech-only. Participants watched
    36 videos of an actor narrating a story consisting of three
    sentences with two protagonists. The speaker consistently
    used hand gestures to allocate one protagonist to the lower right
    and the other to the lower left space, while directing her gaze to
    either space of the target person (congruent), the other person
    (incongruent), or no particular space (speech-only). Participants
    were required to verbally answer a question about the target
    protagonist involved in an accidental event as quickly as
    possible. Results indicate that participants in the congruent
    condition exhibited faster reaction times than those in the
    incongruent condition, although the difference was not
    significant. These findings suggest that the speaker's gaze
    direction is not enough to facilitate a listener's comprehension
    of discourse.
  • Senft, G. (2007). Language, culture and cognition: Frames of spatial reference and why we need ontologies of space [Abstract]. In A. G. Cohn, C. Freksa, & B. Bebel (Eds.), Spatial cognition: Specialization and integration (pp. 12).

    Abstract

    One of the many results of the "Space" research project conducted at the MPI for Psycholinguistics is that there are three "Frames of spatial Reference" (FoRs), the relative, the intrinsic and the absolute FoR. Cross-linguistic research showed that speakers who prefer one FoR in verbal spatial references rely on a comparable coding system for memorizing spatial configurations and for making inferences with respect to these spatial configurations in non-verbal problem solving. Moreover, research results also revealed that in some languages these verbal FoRs also influence gestural behavior. These results document the close interrelationship between language, culture and cognition in the domain "Space". The proper description of these interrelationships in the spatial domain requires language and culture specific ontologies.
  • Seuren, P. A. M. (1982). Riorientamenti metodologici nello studio della variabilità linguistica. In D. Gambarara, & A. D'Atri (Eds.), Ideologia, filosofia e linguistica: Atti del Convegno Internazionale di Studi, Rende (CS) 15-17 Settembre 1978 ( (pp. 499-515). Roma: Bulzoni.
  • Severijnen, G. G. A., Bosker, H. R., & McQueen, J. M. (2023). Syllable rate drives rate normalization, but is not the only factor. In R. Skarnitzl, & J. Volín (Eds.), Proceedings of the 20th International Congress of the Phonetic Sciences (ICPhS 2023) (pp. 56-60). Prague: Guarant International.

    Abstract

    Speech is perceived relative to the speech rate in the context. It is unclear, however, what information listeners use to compute speech rate. The present study examines whether listeners use the number of
    syllables per unit time (i.e., syllable rate) as a measure of speech rate, as indexed by subsequent vowel perception. We ran two rate-normalization experiments in which participants heard duration-matched word lists that contained either monosyllabic
    vs. bisyllabic words (Experiment 1), or monosyllabic vs. trisyllabic pseudowords (Experiment 2). The participants’ task was to categorize an /ɑ-aː/ continuum that followed the word lists. The monosyllabic condition was perceived as slower (i.e., fewer /aː/ responses) than the bisyllabic and
    trisyllabic condition. However, no difference was observed between bisyllabic and trisyllabic contexts. Therefore, while syllable rate is used in perceiving speech rate, other factors, such as fast speech processes, mean F0, and intensity, must also influence rate normalization.
  • Shi, R., Werker, J., & Cutler, A. (2003). Function words in early speech perception. In Proceedings of the 15th International Congress of Phonetic Sciences (pp. 3009-3012).

    Abstract

    Three experiments examined whether infants recognise functors in phrases, and whether their representations of functors are phonetically well specified. Eight- and 13- month-old English infants heard monosyllabic lexical words preceded by real functors (e.g., the, his) versus nonsense functors (e.g., kuh); the latter were minimally modified segmentally (but not prosodically) from real functors. Lexical words were constant across conditions; thus recognition of functors would appear as longer listening time to sequences with real functors. Eightmonth- olds' listening times to sequences with real versus nonsense functors did not significantly differ, suggesting that they did not recognise real functors, or functor representations lacked phonetic specification. However, 13-month-olds listened significantly longer to sequences with real functors. Thus, somewhere between 8 and 13 months of age infants learn familiar functors and represent them with segmental detail. We propose that accumulated frequency of functors in input in general passes a critical threshold during this time.
  • Siahaan, P., & Wijaya Rajeg, G. P. (2023). Multimodal language use in Indonesian: Recurrent gestures associated with negation. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527196.

    Abstract

    This paper presents research findings on manual gestures
    associated with negation in Indonesian, utilizing data sourced
    from talk shows available on YouTube. The study reveals that
    Indonesian speakers employ six recurrent negation gestures,
    which have been observed in various languages worldwide.
    This suggests that gestures exhibiting a stable form-meaning
    relationship and recurring frequently in relation to negation are
    prevalent around the globe, although their distribution may
    differ across cultures and languages. Furthermore, the paper
    demonstrates that negation gestures are not strictly tied to
    verbal negation. Overall, the aim of this paper is to contribute
    to a deeper understanding of the conventional usage and cross-
    linguistic distribution of recurrent gestures.
  • Stern, G. (2023). On embodied use of recognitional demonstratives. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527204.

    Abstract

    This study focuses on embodied uses of recognitional
    demonstratives. While multimodal conversation analytic
    studies have shown how gesture and speech interact in the
    elaboration of exophoric references, little attention has been
    given to the multimodal configuration of other types of
    referential actions. Based on a video-recorded corpus of
    professional meetings held in French, this qualitative study
    shows that a subtype of deictic references, namely recognitional
    references, are frequently associated with iconic gestures, thus
    challenging the traditional distinction between exophoric and
    endophoric uses of deixis.
  • Stevens, M. A., McQueen, J. M., & Hartsuiker, R. J. (2007). No lexically-driven perceptual adjustments of the [x]-[h] boundary. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1897-1900). Dudweiler: Pirrot.

    Abstract

    Listeners can make perceptual adjustments to phoneme categories in response to a talker who consistently produces a specific phoneme ambiguously. We investigate here whether this type of perceptual learning is also used to adapt to regional accent differences. Listeners were exposed to words produced by a Flemish talker whose realization of [x℄or [h℄ was ambiguous (producing [x℄like [h℄is a property of the West-Flanders regional accent). Before and after exposure they categorized a [x℄-[h℄continuum. For both Dutch and Flemish listeners there was no shift of the categorization boundary after exposure to ambiguous sounds in [x℄- or [h℄-biasing contexts. The absence of a lexically-driven learning effect for this contrast may be because [h℄is strongly influenced by coarticulation. As is not stable across contexts, it may be futile to adapt its representation when new realizations are heard
  • Tuinman, A., Mitterer, H., & Cutler, A. (2007). Speakers differentiate English intrusive and onset /r/, but L2 listeners do not. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1905-1908). Dudweiler: Pirrot.

    Abstract

    We investigated whether non-native listeners can exploit phonetic detail in recognizing potentially ambiguous utterances, as native listeners can [6, 7, 8, 9, 10]. Due to the phenomenon of intrusive /r/, the English phrase extra ice may sound like extra rice. A production study indicates that the intrusive /r/ can be distinguished from the onset /r/ in rice, as it is phonetically weaker. In two cross-modal identity priming studies, however, we found no conclusive evidence that Dutch learners of English are able to make use of this difference. Instead, auditory primes such as extra rice and extra ice with onset and intrusive /r/s activate both types of targets such as ice and rice. This supports the notion of spurious lexical activation in L2 perception.
  • Uhrig, P., Payne, E., Pavlova, I., Burenko, I., Dykes, N., Baltazani, M., Burrows, E., Hale, S., Torr, P., & Wilson, A. (2023). Studying time conceptualisation via speech, prosody, and hand gesture: Interweaving manual and computational methods of analysis. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527220.

    Abstract

    This paper presents a new interdisciplinary methodology for the
    analysis of future conceptualisations in big messy media data.
    More specifically, it focuses on the depictions of post-Covid
    futures by RT during the pandemic, i.e. on data which are of
    interest not just from the perspective of academic research but
    also of policy engagement. The methodology has been
    developed to support the scaling up of fine-grained data-driven
    analysis of discourse utterances larger than individual lexical
    units which are centred around ‘will’ + the infinitive. It relies
    on the true integration of manual analytical and computational
    methods and tools in researching three modalities – textual,
    prosodic1, and gestural. The paper describes the process of
    building a computational infrastructure for the collection and
    processing of video data, which aims to empower the manual
    analysis. It also shows how manual analysis can motivate the
    development of computational tools. The paper presents
    individual computational tools to demonstrate how the
    combination of human and machine approaches to analysis can
    reveal new manifestations of cohesion between gesture and
    prosody. To illustrate the latter, the paper shows how the
    boundaries of prosodic units can work to help determine the
    boundaries of gestural units for future conceptualisations.
  • Uluşahin, O., Bosker, H. R., McQueen, J. M., & Meyer, A. S. (2023). No evidence for convergence to sub-phonemic F2 shifts in shadowing. In R. Skarnitzl, & J. Volín (Eds.), Proceedings of the 20th International Congress of the Phonetic Sciences (ICPhS 2023) (pp. 96-100). Prague: Guarant International.

    Abstract

    Over the course of a conversation, interlocutors sound more and more like each other in a process called convergence. However, the automaticity and grain size of convergence are not well established. This study therefore examined whether female native Dutch speakers converge to large yet sub-phonemic shifts in the F2 of the vowel /e/. Participants first performed a short reading task to establish baseline F2s for the vowel /e/, then shadowed 120 target words (alongside 360 fillers) which contained one instance of a manipulated vowel /e/ where the F2 had been shifted down to that of the vowel /ø/. Consistent exposure to large (sub-phonemic) downward shifts in F2 did not result in convergence. The results raise issues for theories which view convergence as a product of automatic integration between perception and production.
  • Van Alphen, P. M., De Bree, E., Fikkert, P., & Wijnen, F. (2007). The role of metrical stress in comprehension and production of Dutch children at risk of dyslexia. In Proceedings of Interspeech 2007 (pp. 2313-2316). Adelaide: Causal Productions.

    Abstract

    The present study compared the role of metrical stress in comprehension and production of three-year-old children with a familial risk of dyslexia with that of normally developing children. A visual fixation task with stress (mis-)matches in bisyllabic words, as well as a non-word repetition task with bisyllabic targets were presented to the control and at-risk children. Results show that the at-risk group is less sensitive to stress mismatches in word recognition than the control group. Correct production of metrical stress patterns did not differ significantly between the groups, but the percentages of phonemes produced correctly were lower for the at-risk than the control group. The findings indicate that processing of metrical stress patterns is not impaired in at-risk children, but that the at-risk group cannot exploit metrical stress in word recognition
  • Van de Weijer, J. (1997). Language input to a prelingual infant. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the GALA '97 conference on language acquisition (pp. 290-293). Edinburgh University Press.

    Abstract

    Pitch, intonation, and speech rate were analyzed in a collection of everyday speech heard by one Dutch infant between the ages of six and nine months. Components of each of these variables were measured in the speech of three adult speakers (mother, father, baby-sitter) when they addressed the infant, and when they addressed another adult. The results are in line with previously reported findings which are usually based on laboratory or prearranged settings: infant-directed speech in a natural setting exhibits more pitch variation, a larger number of simple intonation contours, and slower speech rate than does adult-directed speech.
  • Van Heuven, V. J., Haan, J., Janse, E., & Van der Torre, E. J. (1997). Perceptual identification of sentence type and the time-distribution of prosodic interrogativity markers in Dutch. In Proceedings of the ESCA Tutorial and Research Workshop on Intonation: Theory, Models and Applications, Athens, Greece, 1997 (pp. 317-320).

    Abstract

    Dutch distinguishes at least four sentence types: statements and questions, the latter type being subdivided into wh-questions (beginning with a question word), yes/no-questions (with inversion of subject and finite), and declarative questions (lexico-syntactically identical to statement). Acoustically, each of these (sub)types was found to have clearly distinct global F0-patterns, as well as a characteristic distribution of final rises [1,2]. The present paper explores the separate contribution of parameters of global downtrend and size of accent-lending pitch movements versus aspects of the terminal rise to the human identification of the four sentence (sub)types, at various positions in the time-course of the utterance. The results show that interrogativity in Dutch can be identified at an early point in the utterance. However, wh-questions are not distinct from statements.
  • Vogel, C., Koutsombogera, M., Murat, A. C., Khosrobeigi, Z., & Ma, X. (2023). Gestural linguistic context vectors encode gesture meaning. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527176.

    Abstract

    Linguistic context vectors are adapted for measuring the linguistic contexts that accompany gestures and comparable co-linguistic behaviours. Focusing on gestural semiotic types, it is demonstrated that gestural linguistic context vectors carry information associated with gesture. It is suggested that these may be used to approximate gesture meaning in a similar manner to the approximation of word meaning by context vectors.
  • Wagner, A., & Braun, A. (2003). Is voice quality language-dependent? Acoustic analyses based on speakers of three different languages. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 651-654). Adelaide: Causal Productions.
  • Weber, A., & Smits, R. (2003). Consonant and vowel confusion patterns by American English listeners. In M. J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences.

    Abstract

    This study investigated the perception of American English phonemes by native listeners. Listeners identified either the consonant or the vowel in all possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (0 dB, 8 dB, and 16 dB). Effects of syllable position, signal-to-noise ratio, and articulatory features on vowel and consonant identification are discussed. The results constitute the largest source of data that is currently available on phoneme confusion patterns of American English phonemes by native listeners.
  • Weber, A., & Smits, R. (2003). Consonant and vowel confusion patterns by American English listeners. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 1437-1440). Adelaide: Causal Productions.

    Abstract

    This study investigated the perception of American English phonemes by native listeners. Listeners identified either the consonant or the vowel in all possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signalto-noise ratios (0 dB, 8 dB, and 16 dB). Effects of syllable position, signal-to-noise ratio, and articulatory features on vowel and consonant identification are discussed. The results constitute the largest source of data that is currently available on phoneme confusion patterns of American English phonemes by native listeners.
  • Weber, A. (1998). Listening to nonnative language which violates native assimilation rules. In D. Duez (Ed.), Proceedings of the European Scientific Communication Association workshop: Sound patterns of Spontaneous Speech (pp. 101-104).

    Abstract

    Recent studies using phoneme detection tasks have shown that spoken-language processing is neither facilitated nor interfered with by optional assimilation, but is inhibited by violation of obligatory assimilation. Interpretation of these results depends on an assessment of their generality, specifically, whether they also obtain when listeners are processing nonnative language. Two separate experiments are presented in which native listeners of German and native listeners of Dutch had to detect a target fricative in legal monosyllabic Dutch nonwords. All of the nonwords were correct realisations in standard Dutch. For German listeners, however, half of the nonwords contained phoneme strings which violate the German fricative assimilation rule. Whereas the Dutch listeners showed no significant effects, German listeners detected the target fricative faster when the German fricative assimilation was violated than when no violation occurred. The results might suggest that violation of assimilation rules does not have to make processing more difficult per se.
  • Weber, A., Melinger, A., & Lara Tapia, L. (2007). The mapping of phonetic information to lexical presentations in Spanish: Evidence from eye movements. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1941-1944). Dudweiler: Pirrot.

    Abstract

    In a visual-world study, we examined spoken-wordrecognition in Spanish. Spanish listeners followed spoken instructions to click on pictures while their eye movements were monitored. When instructed to click on the picture of a door (puerta), they experienced interference from the picture of a pig (p u e r c o ). The same interference from phonologically related items was observed when the displays contained printed names or a combination of pictures with their names printed underneath, although the effect was strongest for displays with printed names. Implications of the finding that the interference effect can be induced with standard pictorial displays as well as with orthographic displays are discussed.
  • Wittek, A. (1998). Learning verb meaning via adverbial modification: Change-of-state verbs in German and the adverb "wieder" again. In A. Greenhill, M. Hughes, H. Littlefield, & H. Walsh (Eds.), Proceedings of the 22nd Annual Boston University Conference on Language Development (pp. 779-790). Somerville, MA: Cascadilla Press.
  • Witteman, J., Karaseva, E., Schiller, N. O., & McQueen, J. M. (2023). What does successful L2 vowel acquisition depend on? A conceptual replication. In R. Skarnitzl, & J. Volín (Eds.), Proceedings of the 20th International Congress of the Phonetic Sciences (ICPhS 2023) (pp. 928-931). Prague: Guarant International.

    Abstract

    It has been suggested that individual variation in vowel compactness of the native language (L1) and the distance between L1 vowels and vowels in the second language (L2) predict successful L2 vowel acquisition. Moreover, general articulatory skills have been proposed to account for variation in vowel compactness. In the present work, we conceptually replicate a previous study to test these hypotheses with a large sample size, a new language pair and a
    new vowel pair. We find evidence that individual variation in L1 vowel compactness has opposing effects for two different vowels. We do not find evidence that individual variation in L1 compactness
    is explained by general articulatory skills. We conclude that the results found previously might be specific to sub-groups of L2 learners and/or specific sub-sets of vowel pairs.

Share this page