Joe Rodd

Presentations

  • Maslowski, M., & Rodd, J. (2019). Speech rate variation: How to perceive fast and slow speech, and how to speed up and slow down in speech production. Talk presented at the ACLC Seminar. Amsterdam, The Netherlands. 2019-04-26.

    Abstract

    Speech rate is one of the more salient stylistic dimensions along which speech can vary. We present both sides of this story: how listeners make use of this variation to optimise speech perception, and how the speech production system is modulated to produce speech at different rates. Listeners take speech rate variation into account by normalizing vowel duration for contextual speech rate: an ambiguous Dutch word /m?t/ is perceived as short /mAt/ when embedded in a slow context, but long /ma:t/ in a fast context. Many have argued that rate normalization involves low-level early and automatic perceptual processing. However, prior research on rate-dependent speech perception has only used explicit recognition tasks to investigate the phenomenon, involving both perceptual processing and decision making. Speech rate effects are induced by both local adjacent temporal cues and global non-adjacent cues. In this talk, I present evidence that local rate normalization takes place, at least in part, at a perceptual level, and even in the absence of an explicit recognition task. In contrast, global effects of speech rate seem to involve higher-level cognitive adjustments, possibly taking place at a later decision-making level. That speakers can vary their speech rate is evident, but how they accomplish this has hardly been studied. Consider this analogy: when walking, speed can be continuously increased, within limits, but to speed up further, humans must run. Are there multiple qualitatively distinct speech 'gaits' that resemble walking and running? Or is control achieved solely by continuous modulation of a single gait? These possibilities are investigated through simulations of a new connectionist computational model of the cognitive process of speech production. The model has parameters that can be adjusted to fit the temporal characteristics of natural speech at different rates. During training, different clusters of parameter values (regimes) were identified for different speech rates. In a one-gait system, the regimes used to achieve fast and slow speech are qualitatively similar, but quantitatively different. In a multiple-gait system, there is no linear relationship between the parameter settings associated with each gait, resulting in an abrupt shift in parameter values to move from speaking slowly to speaking fast. After training, the model achieved good fits in all three speech rates. The parameter settings associated with each speech rate were not linearly related, suggesting the presence of cognitive gaits, and thus that speakers make use of distinct cognitive configurations for different speech rates.

    Additional information

    Link to ACLC Seminar site
  • Rodd, J., & Maslowski, M. (2019). Speech rate variation: How to speed up and slow down in speech production, and how to perceive fast and slow speech. Talk presented at the Experimental Linguistics Talks Utrecht (ELiTU). Utrecht, The Netherlands. 2019-04-15.

    Abstract

    Speech rate is one of the more salient stylistic dimensions along which speech can vary. We present both sides of this story: how the speech production system is modulated to produce speech at different rates, and how listeners make use of this variation to optimise speech perception.

    Joe Rodd: Speakers switch between qualitatively different cognitive ‘gaits’ to produce speech at different rates

    That speakers can vary their speech rate is evident, but how they accomplish this has hardly been studied. Consider this analogy: when walking, speed can be continuously increased, within limits, but to speed up further, humans must run. Are there multiple qualitatively distinct speech 'gaits' that resemble walking and running? Or is control achieved solely by continuous modulation of a single gait? These possibilities are investigated through simulations of a new connectionist computational model of the cognitive process of speech production. The model has parameters that can be adjusted to fit the temporal characteristics of natural speech at different rates. During training, different clusters of parameter values (regimes) were identified for different speech rates. In a one-gait system, the regimes used to achieve fast and slow speech are qualitatively similar, but quantitatively different. In a multiple-gait system, there is no linear relationship between the parameter settings associated with each gait, resulting in an abrupt shift in parameter values to move from speaking slowly to speaking fast. After training, the model achieved good fits in all three speech rates. The parameter settings associated with each speech rate were not linearly related, suggesting the presence of cognitive gaits, and thus that speakers make use of distinct cognitive configurations for different speech rates.

    Merel Maslowski: Listeners use the speech rate context to tune their speech perceptions

    Listeners take speech rate variation into account by normalizing vowel duration for contextual speech rate: an ambiguous Dutch word /m?t/ is perceived as short /mAt/ when embedded in a slow context, but long /ma:t/ in a fast context. Many have argued that rate normalization involves low-level early and automatic perceptual processing. However, prior research on rate-dependent speech perception has only used explicit recognition tasks to investigate the phenomenon, involving both perceptual processing and decision making. Speech rate effects are induced by both local adjacent temporal cues and global non-adjacent cues. In this talk, I present evidence that local rate normalization takes place, at least in part, at a perceptual level, and even in the absence of an explicit recognition task. In contrast, global effects of speech rate seem to involve higher-level cognitive adjustments, possibly taking place at a later decision-making level.
  • Rodd, J. (2019). The EPONA model: Simulation of the control of speaking rate. Talk presented at the Seminar of the DFG Research Group "Spoken Morphology". Düsseldorf, Germany. 2019-03-26.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2019). The speech production system is reconfigured to change speaking rate. Poster presented at Crossing the Boundaries: Language in Interaction Symposium, Nijmegen, The Netherlands.

    Abstract

    It is evident that speakers can freely vary stylistic features of their speech, such as speech rate, but how they accomplish this has hardly been studied, let alone implemented in a formal model of speech production. Much as in walking and running, where qualitatively different gaits are required to cover the gamut of different speeds, we might predict there to be multiple qualitatively distinct configurations, or ‘gaits’, in the speech planning system that speakers must switch between to alter their speaking rate or style. Alternatively, control might involve continuous modulation of a single ‘gait’. We investigate these possibilities by simulation of a connectionist computational model which mimics the temporal characteristics of observed speech. Different ‘regimes’ (combinations of parameter settings) can be engaged to achieve different speaking rates. The model was trained separately for each speaking rate, by an evolutionary optimisation algorithm. The training identified the parameter values with which the model best approximated the syllable duration distributions characteristic of each speaking rate. In a one-gait system, the regimes used to achieve fast and slow speech are qualitatively similar, but quantitatively different. In parameter space, they would be arranged along a straight line. Different points along this axis correspond to different speaking rates. In a multiple-gait system, this linearity would be missing. Instead, the arrangement of the regimes would be triangular, with no obvious relationship between the regions associated with each gait, and an abrupt shift in parameter values to move from speeds associated with ‘walk-speaking’ to ‘run-speaking’. Our model achieved good fits in all three speaking rates. In parameter space, the arrangement of the parameter settings selected for the different speaking rates is non-axial, suggesting that ‘gaits’ are present in the speech planning system.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2019). The speech production system is reconfigured to change speaking rate. Poster presented at the 3rd Phonetics and Phonology in Europe conference (PaPe 2019), Lecce, Italy.
  • Rodd, J., Bosker, H. R., Ernestus, M., & Ten Bosch, L. (2018). A connectionist model of serial order applied to speaking rate control. Poster presented at Computational Linguistics in the Netherlands 28, Nijmegen, The Netherlands.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2018). Run-speaking? Simulations of rate control in speech production. Poster presented at Architectures and Mechanisms for Language Processing (AMLaP 2018), Berlin, Germany.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2018). Running or speed-walking? Simulations of speech production at different rates. Poster presented at the International Workshop on Language Production (IWLP 2018), Nijmegen, The Netherlands.

    Abstract

    That speakers can vary their speaking rate is evident, but how they accomplish this has hardly been studied. The effortful experience of deviating from one's preferred speaking rate might result from shifting between different regimes (system configurations) of the speech planning system. This study investigates control over speech rate through simulations of a new connectionist computational model of the cognitive process of speech production, derived from Dell, Burger and Svec’s (1997) model to fit the temporal characteristics of observed speech. We draw an analogy from human movement: the selection of walking and running gaits to achieve different movement speeds. Are the regimes of the speech production system arranged into multiple ‘gaits’ that resemble walking and running? During training of the model, different parameter settings are identified for different speech rates, which can be conflated with the regimes of the speech production system. The parameters can be considered to be dimensions of a high-dimensional ‘regime space’, in which different regimes occupy different parts of the space. In a single-gait system, the regimes are qualitatively similar, but quantitatively different. They are arranged along a straight line through regime space. Different points along this axis correspond directly to different speaking rates. In a multiple-gait system, the arrangement of the regimes is more dispersed, with no obvious relationship between the regions associated with each gait. After training, the model achieved good fits in all three speaking rates, and the parameter settings associated with each speaking rate were different. The broad arrangement of the parameter settings for the different speaking rates in regime space was non-axial, suggesting that ‘gaits’ may be present in the speech planning system.
  • Rodd, J., Bosker, H. R., Meyer, A. S., Ernestus, M., & Ten Bosch, L. (2018). How to speed up and slow down: Speaking rate control to the level of the syllable. Talk presented at the New Observations in Speech and Hearing seminar series, Institute of Phonetics and Speech processing, LMU Munich. Munich, Germany.
  • Rodd, J., Bosker, H. R., Ernestus, M., Ten Bosch, L., & Meyer, A. S. (2018). To speed up, turn up the gain: Acoustic evidence of a 'gain-strategy' for speech planning in accelerated and decelerated speech. Poster presented at LabPhon16 - Variation, development and impairment: Between phonetics and phonology, Lisbon, Portugal.
  • Terband, H., Rodd, J., & Maas, E. (2018). Testing hypotheses about the underlying deficit of Apraxia of Speech (AOS) through computational neural modelling with the DIVA model. Talk presented at Dag van de Fonetiek. Amsterdam, The Netherlands. 2018-12-21.
  • Terband, H., Rodd, J., & Maas, E. (2018). Testing hypotheses about the underlying deficit of apraxia of speech (AOS) through computational neural modelling: Effects of noise masking on vowel production in the DIVA model. Talk presented at the Madonna Motor Speech Conference. Savannah, GA, USA. 2018-02-22 - 2018-02-25.
  • Rodd, J., Bosker, H. R., Ernestus, M., Ten Bosch, L., & Meyer, A. S. (2017). How we regulate speech rate: Phonetic evidence for a 'gain strategy' in speech planning. Poster presented at the Abstraction, Diversity and Speech Dynamics Workshop, Herrsching, Germany.
  • Rodd, J., Bosker, H. R., Ernestus, M., Meyer, A. S., & Ten Bosch, L. (2017). Simulating speaking rate control: A spreading activation model of syllable timing. Poster presented at the Workshop Conversational speech and lexical representations, Nijmegen, The Netherlands.

    Abstract

    Speech can be produced at different rates. The ability to produce faster or slower speech may be thought to result from executive control processes enlisted to modulate lexical selection and phonological encoding stages of speech planning. This study used simulations of the model of serial order in language by Dell, Burger and Svec (1997, DBS) to characterise the strategies adopted by speakers when naming pictures at fast, medium and slow prescribed rates. Our new implementation of DBS was able to produce activation patterns that correlated strongly with observed syllable-level timing of disyllabic words from this task. For each participant, different speaking rates were associated with different regions of the DBS parameter space. The precise placement of the speaking rates in the parameter space differed markedly between participants. Participants applied broadly the same parameter manipulation to accelerate their speech. This was however not the case for deceleration. Hierarchical clustering revealed two distinct patterns of parameter adjustment employed to decelerate speech, suggesting that deceleration is not necessarily achieved by the inverse process of acceleration. In addition, potential refinements to the DBS model are discussed.
  • Rodd, J., & Chen, A. (2016). Pitch accents show a perceptual magnet effect: Evidence of internal structure in intonation categories. Talk presented at Speech Prosody 2016. Boston, MA, USA. 2016-05-31 - 2016-06-03.
  • Rodd, J. (2016). How to slow down and speed up: Controlling speech rate. Poster presented at the Language in Interaction Summerschool on Human Language: From Genes and Brains to Behavior, Berg en Dal, The Netherlands.
  • Smorenburg, L., Rodd, J., & Chen, A. (2015). The effect of explicit training on the prosodic production of L2 sarcasm by Dutch learners of English. Poster presented at The 18th International Congress of Phonetic Sciences (ICPhS 2015), Glasgow, UK.

    Abstract

    Previous research [9] suggests that Dutch learners of (British) English are not able to express sarcasm prosodically in their L2. The present study investigates whether explicit training on the prosodic markers of sarcasm in English can improve learners’ realisation of sarcasm. Sarcastic speech was elicited in short simulated telephone conversations between Dutch advanced learners of English and a native British English-speaking ‘friend’ in two sessions, fourteen days apart. Between the two sessions, participants were trained by means of (1) a presentation, (2) directed independent practice, and (3) evaluation of participants’ production and individual feedback in small groups. L1 British English-speaking raters subsequently rated how sarcastic the participants’ responses sounded on a five-point scale. Significantly higher sarcasm ratings were given to L2 learners’ productions obtained after the training than to those obtained before the training; explicit training on prosody has a positive effect on learners’ production of sarcasm.
  • Terband, H., Rodd, J., & Maas, E. (2015). Simulations of feedforward and feedback control in apraxia of speech (AOS): Effects of noise masking on vowel production in the DIVA model. Talk presented at The 18th International Congress of Phonetic Sciences (ICPhS 2015). Glasgow, UK. 2015-08-10 - 2015-08-14.

    Abstract

    Apraxia of Speech (AOS) is a motor speech disorder whose precise nature is still poorly understood. A recent behavioural experiment featuring a noise masking paradigm suggests that AOS reflects a disruption of feedforward control, whereas feedback control is spared and plays a more prominent role in achieving and maintaining segmental contrasts [10]. In the present study, we set out to validate the interpretation of AOS as a feedforward impairment by means of a series of computational simulations with the DIVA model [6, 7] mimicking the behavioural experiment. Simulation results showed a larger reduction in vowel spacing and a smaller vowel dispersion in the masking condition compared to the no-masking condition for the simulated feedforward deficit, whereas the other groups showed an opposite pattern. These results mimic the patterns observed in the human data, corroborating the notion that AOS can be conceptualized as a deficit in feedforward control.
