Publications

Displaying 201 - 294 of 294
  • Oostdijk, N., Goedertier, W., Van Eynde, F., Boves, L., Martens, J.-P., Moortgat, M., & Baayen, R. H. (2002). Experiences from the Spoken Dutch Corpus Project. In Third international conference on language resources and evaluation (pp. 340-347). Paris: European Language Resources Association.
  • Ouni, S., Cohen, M. M., Young, K., & Jesse, A. (2003). Internationalization of a talking head. In M. Sole, D. Recasens, & J. Romero (Eds.), Proceedings of 15th International Congress of Phonetics Sciences (pp. 2569-2572). Barcelona: Casual Productions.

    Abstract

    In this paper we describe a general scheme for internationalization of our talking head, Baldi, to speak other languages. We describe the modular structure of the auditory/visual synthesis software. As an example, we have created a synthetic Arabic talker, which is evaluated using a noisy word recognition task comparing this talker with a natural one.
  • Ozyurek, A. (1998). An analysis of the basic meaning of Turkish demonstratives in face-to-face conversational interaction. In S. Santi, I. Guaitella, C. Cave, & G. Konopczynski (Eds.), Oralite et gestualite: Communication multimodale, interaction: actes du colloque ORAGE 98 (pp. 609-614). Paris: L'Harmattan.
  • Ozyurek, A. (2002). Speech-gesture relationship across languages and in second language learners: Implications for spatial thinking and speaking. In B. Skarabela, S. Fish, & A. H. Do (Eds.), Proceedings of the 26th annual Boston University Conference on Language Development (pp. 500-509). Somerville, MA: Cascadilla Press.
  • Pallier, C., Cutler, A., & Sebastian-Galles, N. (1997). Prosodic structure and phonetic processing: A cross-linguistic study. In Proceedings of EUROSPEECH 97 (pp. 2131-2134). Grenoble, France: ESCA.

    Abstract

    Dutch and Spanish differ in how predictable the stress pattern is as a function of the segmental content: it is correlated with syllable weight in Dutch but not in Spanish. In the present study, two experiments were run to compare the abilities of Dutch and Spanish speakers to separately process segmental and stress information. It was predicted that the Spanish speakers would have more difficulty focusing on the segments and ignoring the stress pattern than the Dutch speakers. The task was a speeded classification task on CVCV syllables, with blocks of trials in which the stress pattern could vary versus blocks in which it was fixed. First, we found interference due to stress variability in both languages, suggesting that the processing of segmental information cannot be performed independently of stress. Second, the effect was larger for Spanish than for Dutch, suggesting that that the degree of interference from stress variation may be partially mitigated by the predictability of stress placement in the language.
  • Papafragou, A., & Ozturk, O. (2006). The acquisition of epistemic modality. In A. Botinis (Ed.), Proceedings of ITRW on Experimental Linguistics in ExLing-2006 (pp. 201-204). ISCA Archive.

    Abstract

    In this paper we try to contribute to the body of knowledge about the acquisition of English epistemic modal verbs (e.g. Mary may/has to be at school). Semantically, these verbs encode possibility or necessity with respect to available evidence. Pragmatically, the use of epistemic modals often gives rise to scalar conversational inferences (Mary may be at school -> Mary doesn’t have to be at school). The acquisition of epistemic modals is challenging for children on both these levels. In this paper, we present findings from two studies which were conducted with 5-year-old children and adults. Our findings, unlike previous work, show that 5-yr-olds have mastered epistemic modal semantics, including the notions of necessity and possibility. However, they are still in the process of acquiring epistemic modal pragmatics.
  • Parhammer*, S. I., Ebersberg*, M., Tippmann*, J., Stärk*, K., Opitz, A., Hinger, B., & Rossi, S. (2019). The influence of distraction on speech processing: How selective is selective attention? In Proceedings of Interspeech 2019 (pp. 3093-3097). doi:10.21437/Interspeech.2019-2699.

    Abstract

    -* indicates shared first authorship -
    The present study investigated the effects of selective attention on the processing of morphosyntactic errors in unattended parts of speech. Two groups of German native (L1) speakers participated in the present study. Participants listened to sentences in which irregular verbs were manipulated in three different conditions (correct, incorrect but attested ablaut pattern, incorrect and crosslinguistically unattested ablaut pattern). In order to track fast dynamic neural reactions to the stimuli, electroencephalography was used. After each sentence, participants in Experiment 1 performed a semantic judgement task, which deliberately distracted the participants from the syntactic manipulations and directed their attention to the semantic content of the sentence. In Experiment 2, participants carried out a syntactic judgement task, which put their attention on the critical stimuli. The use of two different attentional tasks allowed for investigating the impact of selective attention on speech processing and whether morphosyntactic processing steps are performed automatically. In Experiment 2, the incorrect attested condition elicited a larger N400 component compared to the correct condition, whereas in Experiment 1 no differences between conditions were found. These results suggest that the processing of morphosyntactic violations in irregular verbs is not entirely automatic but seems to be strongly affected by selective attention.
  • Pereiro Estevan, Y., Wan, V., Scharenborg, O., & Gallardo Antolín, A. (2006). Segmentación de fonemas no supervisada basada en métodos kernel de máximo margen. In Proceedings of IV Jornadas en Tecnología del Habla.

    Abstract

    En este artículo se desarrolla un método automático de segmentación de fonemas no supervisado. Este método utiliza el algoritmo de agrupación de máximo margen [1] para realizar segmentación de fonemas sobre habla continua sin necesidad de información a priori para el entrenamiento del sistema.
  • Petersson, K. M., Grenholm, P., & Forkstam, C. (2005). Artificial grammar learning and neural networks. In G. B. Bruna, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th Annual Conference of the Cognitive Science Society (pp. 1726-1731).

    Abstract

    Recent FMRI studies indicate that language related brain regions are engaged in artificial grammar (AG) processing. In the present study we investigate the Reber grammar by means of formal analysis and network simulations. We outline a new method for describing the network dynamics and propose an approach to grammar extraction based on the state-space dynamics of the network. We conclude that statistical frequency-based and rule-based acquisition procedures can be viewed as complementary perspectives on grammar learning, and more generally, that classical cognitive models can be viewed as a special case of a dynamical systems perspective on information processing
  • Petersson, K. M. (2002). Brain physiology. In R. Behn, & C. Veranda (Eds.), Proceedings of The 4th Southern European School of the European Physical Society - Physics in Medicine (pp. 37-38). Montreux: ESF.
  • Pluymaekers, M., Ernestus, M., Baayen, R. H., & Booij, G. (2006). The role of morphology in fine phonetic detail: The case of Dutch -igheid. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 53-54).
  • Pluymaekers, M., Ernestus, M., & Baayen, R. H. (2006). Effects of word frequency on the acoustic durations of affixes. In Proceedings of Interspeech 2006 (pp. 953-956). Pittsburgh: ICSLP.

    Abstract

    This study investigates whether the acoustic durations of derivational affixes in Dutch are affected by the frequency of the word they occur in. In a word naming experiment, subjects were presented with a large number of words containing one of the affixes ge-, ver-, ont, or -lijk. Their responses were recorded on DAT tapes, and the durations of the affixes were measured using Automatic Speech Recognition technology. To investigate whether frequency also affected durations when speech rate was high, the presentation rate of the stimuli was varied. The results show that a higher frequency of the word as a whole led to shorter acoustic realizations for all affixes. Furthermore, affixes became shorter as the presentation rate of the stimuli increased. There was no interaction between word frequency and presentation rate, suggesting that the frequency effect also applies in situations in which the speed of articulation is very high.
  • Poletiek, F. H., & Chater, N. (2006). Grammar induction profits from representative stimulus sampling. In R. Sun (Ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006) (pp. 1968-1973). Austin, TX, USA: Cognitive Science Society.
  • Poletiek, F. H., & Rassin E. (Eds.). (2005). Het (on)bewuste [Special Issue]. De Psycholoog.
  • Pouw, W., Paxton, A., Harrison, S. J., & Dixon, J. A. (2019). Acoustic specification of upper limb movement in voicing. In A. Grimminger (Ed.), Proceedings of the 6th Gesture and Speech in Interaction – GESPIN 6 (pp. 68-74). Paderborn: Universitaetsbibliothek Paderborn. doi:10.17619/UNIPB/1-812.
  • Pouw, W., & Dixon, J. A. (2019). Quantifying gesture-speech synchrony. In A. Grimminger (Ed.), Proceedings of the 6th Gesture and Speech in Interaction – GESPIN 6 (pp. 75-80). Paderborn: Universitaetsbibliothek Paderborn. doi:10.17619/UNIPB/1-812.

    Abstract

    Spontaneously occurring speech is often seamlessly accompanied by hand gestures. Detailed
    observations of video data suggest that speech and gesture are tightly synchronized in time,
    consistent with a dynamic interplay between body and mind. However, spontaneous gesturespeech
    synchrony has rarely been objectively quantified beyond analyses of video data, which
    do not allow for identification of kinematic properties of gestures. Consequently, the point in
    gesture which is held to couple with speech, the so-called moment of “maximum effort”, has
    been variably equated with the peak velocity, peak acceleration, peak deceleration, or the onset
    of the gesture. In the current exploratory report, we provide novel evidence from motiontracking
    and acoustic data that peak velocity is closely aligned, and shortly leads, the peak pitch
    (F0) of speech

    Additional information

    https://osf.io/9843h/
  • Rissman, L., & Majid, A. (2019). Agency drives category structure in instrumental events. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2661-2667). Montreal, QB: Cognitive Science Society.

    Abstract

    Thematic roles such as Agent and Instrument have a long-standing place in theories of event representation. Nonetheless, the structure of these categories has been difficult to determine. We investigated how instrumental events, such as someone slicing bread with a knife, are categorized in English. Speakers described a variety of typical and atypical instrumental events, and we determined the similarity structure of their descriptions using correspondence analysis. We found that events where the instrument is an extension of an intentional agent were most likely to elicit similar language, highlighting the importance of agency in structuring instrumental categories.
  • Rubio-Fernández, P., Breheny, R., & Lee, M. W. (2003). Context-independent information in concepts: An investigation of the notion of ‘core features’. In Proceedings of the 25th Annual Conference of the Cognitive Science Society (CogSci 2003). Austin, TX: Cognitive Science Society.
  • De Ruiter, J. P. (2004). On the primacy of language in multimodal communication. In Workshop Proceedings on Multimodal Corpora: Models of Human Behaviour for the Specification and Evaluation of Multimodal Input and Output Interfaces.(LREC2004) (pp. 38-41). Paris: ELRA - European Language Resources Association (CD-ROM).

    Abstract

    In this paper, I will argue that although the study of multimodal interaction offers exciting new prospects for Human Computer Interaction and human-human communication research, language is the primary form of communication, even in multimodal systems. I will support this claim with theoretical and empirical arguments, mainly drawn from human-human communication research, and will discuss the implications for multimodal communication research and Human-Computer Interaction.
  • Sauter, D., Scott, S., & Calder, A. (2004). Categorisation of vocally expressed positive emotion: A first step towards basic positive emotions? [Abstract]. Proceedings of the British Psychological Society, 12, 111.

    Abstract

    Most of the study of basic emotion expressions has focused on facial expressions and little work has been done to specifically investigate happiness, the only positive of the basic emotions (Ekman & Friesen, 1971). However, a theoretical suggestion has been made that happiness could be broken down into discrete positive emotions, which each fulfil the criteria of basic emotions, and that these would be expressed vocally (Ekman, 1992). To empirically test this hypothesis, 20 participants categorised 80 paralinguistic sounds using the labels achievement, amusement, contentment, pleasure and relief. The results suggest that achievement, amusement and relief are perceived as distinct categories, which subjects accurately identify. In contrast, the categories of contentment and pleasure were systematically confused with other responses, although performance was still well above chance levels. These findings are initial evidence that the positive emotions engage distinct vocal expressions and may be considered to be distinct emotion categories.
  • Sauter, D., Wiland, J., Warren, J., Eisner, F., Calder, A., & Scott, S. K. (2005). Sounds of joy: An investigation of vocal expressions of positive emotions [Abstract]. Journal of Cognitive Neuroscience, 61(Supplement), B99.

    Abstract

    A series of experiment tested Ekman’s (1992) hypothesis that there are a set of positive basic emotions that are expressed using vocal para-linguistic sounds, e.g. laughter and cheers. The proposed categories investigated were amusement, contentment, pleasure, relief and triumph. Behavioural testing using a forced-choice task indicated that participants were able to reliably recognize vocal expressions of the proposed emotions. A cross-cultural study in the preliterate Himba culture in Namibia confirmed that these categories are also recognized across cultures. A recognition test of acoustically manipulated emotional vocalizations established that the recognition of different emotions utilizes different vocal cues, and that these in turn differ from the cues used when comprehending speech. In a study using fMRI we found that relative to a signal correlated noise baseline, the paralinguistic expressions of emotion activated bilateral superior temporal gyri and sulci, lateral and anterior to primary auditory cortex, which is consistent with the processing of non linguistic vocal cues in the auditory ‘what’ pathway. Notably amusement was associated with greater activation extending into both temporal poles and amygdale and insular cortex. Overall, these results support the claim that ‘happiness’ can be fractionated into amusement, pleasure, relief and triumph.
  • Scharenborg, O., Wan, V., & Moore, R. K. (2006). Capturing fine-phonetic variation in speech through automatic classification of articulatory features. In Speech Recognition and Intrinsic Variation Workshop [SRIV2006] (pp. 77-82). ISCA Archive.

    Abstract

    The ultimate goal of our research is to develop a computational model of human speech recognition that is able to capture the effects of fine-grained acoustic variation on speech recognition behaviour. As part of this work we are investigating automatic feature classifiers that are able to create reliable and accurate transcriptions of the articulatory behaviour encoded in the acoustic speech signal. In the experiments reported here, we compared support vector machines (SVMs) with multilayer perceptrons (MLPs). MLPs have been widely (and rather successfully) used for the task of multi-value articulatory feature classification, while (to the best of our knowledge) SVMs have not. This paper compares the performances of the two classifiers and analyses the results in order to better understand the articulatory representations. It was found that the MLPs outperformed the SVMs, but it is concluded that both classifiers exhibit similar behaviour in terms of patterns of errors.
  • Scharenborg, O., & Seneff, S. (2005). A two-pass strategy for handling OOVs in a large vocabulary recognition task. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology, (pp. 1669-1672). ISCA Archive.

    Abstract

    This paper addresses the issue of large-vocabulary recognition in a specific word class. We propose a two-pass strategy in which only major cities are explicitly represented in the first stage lexicon. An unknown word model encoded as a phone loop is used to detect OOV city names (referred to as rare city names). After which SpeM, a tool that can extract words and word-initial cohorts from phone graphs on the basis of a large fallback lexicon, provides an N-best list of promising city names on the basis of the phone sequences generated in the first stage. This N-best list is then inserted into the second stage lexicon for a subsequent recognition pass. Experiments were conducted on a set of spontaneous telephone-quality utterances each containing one rare city name. We tested the size of the N-best list and three types of language models (LMs). The experiments showed that SpeM was able to include nearly 85% of the correct city names into an N-best list of 3000 city names when a unigram LM, which also boosted the unigram scores of a city name in a given state, was used.
  • Scharenborg, O., Boves, L., & de Veth, J. (2002). ASR in a human word recognition model: Generating phonemic input for Shortlist. In J. H. L. Hansen, & B. Pellom (Eds.), ICSLP 2002 - INTERSPEECH 2002 - 7th International Conference on Spoken Language Processing (pp. 633-636). ISCA Archive.

    Abstract

    The current version of the psycholinguistic model of human word recognition Shortlist suffers from two unrealistic constraints. First, the input of Shortlist must consist of a single string of phoneme symbols. Second, the current version of the search in Shortlist makes it difficult to deal with insertions and deletions in the input phoneme string. This research attempts to fully automatically derive a phoneme string from the acoustic signal that is as close as possible to the number of phonemes in the lexical representation of the word. We optimised an Automatic Phone Recogniser (APR) using two approaches, viz. varying the value of the mismatch parameter and optimising the APR output strings on the output of Shortlist. The approaches show that it will be very difficult to satisfy the input requirements of the present version of Shortlist with a phoneme string generated by an APR.
  • Scharenborg, O., Boves, L., & Ten Bosch, L. (2004). ‘On-line early recognition’ of polysyllabic words in continuous speech. In S. Cassidy, F. Cox, R. Mannell, & P. Sallyanne (Eds.), Proceedings of the Tenth Australian International Conference on Speech Science & Technology (pp. 387-392). Canberra: Australian Speech Science and Technology Association Inc.

    Abstract

    In this paper, we investigate the ability of SpeM, our recognition system based on the combination of an automatic phone recogniser and a wordsearch module, to determine as early as possible during the word recognition process whether a word is likely to be recognised correctly (this we refer to as ‘on-line’ early word recognition). We present two measures that can be used to predict whether a word is correctly recognised: the Bayesian word activation and the amount of available (acoustic) information for a word. SpeM was tested on 1,463 polysyllabic words in 885 continuous speech utterances. The investigated predictors indicated that a word activation that is 1) high (but not too high) and 2) based on more phones is more reliable to predict the correctness of a word than a similarly high value based on a small number of phones or a lower value of the word activation.
  • Scharenborg, O., McQueen, J. M., Ten Bosch, L., & Norris, D. (2003). Modelling human speech recognition using automatic speech recognition paradigms in SpeM. In Proceedings of Eurospeech 2003 (pp. 2097-2100). Adelaide: Causal Productions.

    Abstract

    We have recently developed a new model of human speech recognition, based on automatic speech recognition techniques [1]. The present paper has two goals. First, we show that the new model performs well in the recognition of lexically ambiguous input. These demonstrations suggest that the model is able to operate in the same optimal way as human listeners. Second, we discuss how to relate the behaviour of a recogniser, designed to discover the optimum path through a word lattice, to data from human listening experiments. We argue that this requires a metric that combines both path-based and word-based measures of recognition performance. The combined metric varies continuously as the input speech signal unfolds over time.
  • Scharenborg, O. (2005). Parallels between HSR and ASR: How ASR can contribute to HSR. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology (pp. 1237-1240). ISCA Archive.

    Abstract

    In this paper, we illustrate the close parallels between the research fields of human speech recognition (HSR) and automatic speech recognition (ASR) using a computational model of human word recognition, SpeM, which was built using techniques from ASR. We show that ASR has proven to be useful for improving models of HSR by relieving them of some of their shortcomings. However, in order to build an integrated computational model of all aspects of HSR, a lot of issues remain to be resolved. In this process, ASR algorithms and techniques definitely can play an important role.
  • Scharenborg, O., & Boves, L. (2002). Pronunciation variation modelling in a model of human word recognition. In Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology [PMLA-2002] (pp. 65-70).

    Abstract

    Due to pronunciation variation, many insertions and deletions of phones occur in spontaneous speech. The psycholinguistic model of human speech recognition Shortlist is not well able to deal with phone insertions and deletions and is therefore not well suited for dealing with real-life input. The research presented in this paper explains how Shortlist can benefit from pronunciation variation modelling in dealing with real-life input. Pronunciation variation was modelled by including variants into the lexicon of Shortlist. A series of experiments was carried out to find the optimal acoustic model set for transcribing the training material that was used as basis for the generation of the variants. The Shortlist experiments clearly showed that Shortlist benefits from pronunciation variation modelling. However, the performance of Shortlist stays far behind the performance of other, more conventional speech recognisers.
  • Scharenborg, O., ten Bosch, L., & Boves, L. (2003). Recognising 'real-life' speech with SpeM: A speech-based computational model of human speech recognition. In Eurospeech 2003 (pp. 2285-2288).

    Abstract

    In this paper, we present a novel computational model of human speech recognition – called SpeM – based on the theory underlying Shortlist. We will show that SpeM, in combination with an automatic phone recogniser (APR), is able to simulate the human speech recognition process from the acoustic signal to the ultimate recognition of words. This joint model takes an acoustic speech file as input and calculates the activation flows of candidate words on the basis of the degree of fit of the candidate words with the input. Experiments showed that SpeM outperforms Shortlist on the recognition of ‘real-life’ input. Furthermore, SpeM performs only slightly worse than an off-the-shelf full-blown automatic speech recogniser in which all words are equally probable, while it provides a transparent computationally elegant paradigm for modelling word activations in human word recognition.
  • Schiller, N. O., Schmitt, B., Peters, J., & Levelt, W. J. M. (2002). 'BAnana'or 'baNAna'? Metrical encoding during speech production [Abstract]. In M. Baumann, A. Keinath, & J. Krems (Eds.), Experimentelle Psychologie: Abstracts der 44. Tagung experimentell arbeitender Psychologen. (pp. 195). TU Chemnitz, Philosophische Fakultät.

    Abstract

    The time course of metrical encoding, i.e. stress, during speech production is investigated. In a first experiment, participants were presented with pictures whose bisyllabic Dutch names had initial or final stress (KAno 'canoe' vs. kaNON 'cannon'; capital letters indicate stressed syllables). Picture names were matched for frequency and object recognition latencies. When participants were asked to judge whether picture names had stress on the first or second syllable, they showed significantly faster decision times for initially stressed targets than for targets with final stress. Experiment 2 replicated this effect with trisyllabic picture names (faster RTs for penultimate stress than for ultimate stress). In our view, these results reflect the incremental phonological encoding process. Wheeldon and Levelt (1995) found that segmental encoding is a process running from the beginning to the end of words. Here, we present evidence that the metrical pattern of words, i.e. stress, is also encoded incrementally.
  • Schiller, N. O. (2003). Metrical stress in speech production: A time course study. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 451-454). Adelaide: Causal Productions.

    Abstract

    This study investigated the encoding of metrical information during speech production in Dutch. In Experiment 1, participants were asked to judge whether bisyllabic picture names had initial or final stress. Results showed significantly faster decision times for initially stressed targets (e.g., LEpel 'spoon') than for targets with final stress (e.g., liBEL 'dragon fly'; capital letters indicate stressed syllables) and revealed that the monitoring latencies are not a function of the picture naming or object recognition latencies to the same pictures. Experiments 2 and 3 replicated the outcome of the first experiment with bi- and trisyllabic picture names. These results demonstrate that metrical information of words is encoded rightward incrementally during phonological encoding in speech production. The results of these experiments are in line with Levelt's model of phonological encoding.
  • Schiller, N. O., Van Lieshout, P. H. H. M., Meyer, A. S., & Levelt, W. J. M. (1997). Is the syllable an articulatory unit in speech production? Evidence from an Emma study. In P. Wille (Ed.), Fortschritte der Akustik: Plenarvorträge und Fachbeiträge der 23. Deutschen Jahrestagung für Akustik (DAGA 97) (pp. 605-606). Oldenburg: DEGA.
  • Schmiedtová, V., & Schmiedtová, B. (2002). The color spectrum in language: The case of Czech: Cognitive concepts, new idioms and lexical meanings. In H. Gottlieb, J. Mogensen, & A. Zettersten (Eds.), Proceedings of The 10th International Symposium on Lexicography (pp. 285-292). Tübingen: Max Niemeyer Verlag.

    Abstract

    The representative corpus SYN2000 in the Czech National Corpus (CNK) project containing 100 million word forms taken from different types of texts. I have tried to determine the extent and depth of the linguistic material in the corpus. First, I chose the adjectives indicating the basic colors of the spectrum and other parts of speech (names and adverbs) derived from these adjectives. An analysis of three examples - black, white and red - shows the extent of the linguistic wealth and diversity we are looking at: because of size limitations, no existing dictionary is capable of embracing all analyzed nuances. Currently, we can only hope that the next dictionary of contemporary Czech, built on the basis of the Czech National Corpus, will be electronic. Without the size limitations, we would be able us to include many of the fine nuances of language
  • Schoenmakers, G.-J., & De Swart, P. (2019). Adverbial hurdles in Dutch scrambling. In A. Gattnar, R. Hörnig, M. Störzer, & S. Featherston (Eds.), Proceedings of Linguistic Evidence 2018: Experimental Data Drives Linguistic Theory (pp. 124-145). Tübingen: University of Tübingen.

    Abstract

    This paper addresses the role of the adverb in Dutch direct object scrambling constructions. We report four experiments in which we investigate whether the structural position and the scope sensitivity of the adverb affect acceptability judgments of scrambling constructions and native speakers' tendency to scramble definite objects. We conclude that the type of adverb plays a key role in Dutch word ordering preferences.
  • Schuerman, W. L., McQueen, J. M., & Meyer, A. S. (2019). Speaker statistical averageness modulates word recognition in adverse listening conditions. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1203-1207). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    We tested whether statistical averageness (SA) at the level of the individual speaker could predict a speaker’s intelligibility. 28 female and 21 male speakers of Dutch were recorded producing 336 sentences,
    each containing two target nouns. Recordings were compared to those of all other same-sex speakers using dynamic time warping (DTW). For each sentence, the DTW distance constituted a metric
    of phonetic distance from one speaker to all other speakers. SA comprised the average of these distances. Later, the same participants performed a word recognition task on the target nouns in the same sentences, under three degraded listening conditions. In all three conditions, accuracy increased with SA. This held even when participants listened to their own utterances. These findings suggest that listeners process speech with respect to the statistical
    properties of the language spoken in their community, rather than using their own speech as a reference
  • Scott, S., & Sauter, D. (2006). Non-verbal expressions of emotion - acoustics, valence, and cross cultural factors. In Third International Conference on Speech Prosody 2006. ISCA.

    Abstract

    This presentation will address aspects of the expression of emotion in non-verbal vocal behaviour, specifically attempting to determine the roles of both positive and negative emotions, their acoustic bases, and the extent to which these are recognized in non-Western cultures.
  • Scott, S., & Sauter, D. (2004). Vocal expressions of emotion and positive and negative basic emotions [Abstract]. Proceedings of the British Psychological Society, 12, 156.

    Abstract

    Previous studies have indicated that vocal and facial expressions of the ‘basic’ emotions share aspects of processing. Thus amygdala damage compromises the perception of fear and anger from the face and from the voice. In the current study we tested the hypothesis that there exist positive basic emotions, expressed mainly in the voice (Ekman, 1992). Vocal stimuli were produced to express the specific positive emotions of amusement, achievement, pleasure, contentment and relief.
  • Seidl, A., & Johnson, E. K. (2003). Position and vowel quality effects in infant's segmentation of vowel-initial words. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2233-2236). Adelaide: Causal Productions.
  • Seidlmayer, E., Galke, L., Melnychuk, T., Schultz, C., Tochtermann, K., & Förstner, K. U. (2019). Take it personally - A Python library for data enrichment for infometrical applications. In M. Alam, R. Usbeck, T. Pellegrini, H. Sack, & Y. Sure-Vetter (Eds.), Proceedings of the Posters and Demo Track of the 15th International Conference on Semantic Systems co-located with 15th International Conference on Semantic Systems (SEMANTiCS 2019).

    Abstract

    Like every other social sphere, science is influenced by individual characteristics of researchers. However, for investigations on scientific networks, only little data about the social background of researchers, e.g. social origin, gender, affiliation etc., is available.
    This paper introduces ”Take it personally - TIP”, a conceptual model and library currently under development, which aims to support the
    semantic enrichment of publication databases with semantically related background information which resides elsewhere in the (semantic) web, such as Wikidata.
    The supplementary information enriches the original information in the publication databases and thus facilitates the creation of complex scientific knowledge graphs. Such enrichment helps to improve the scientometric analysis of scientific publications as they can also take social backgrounds of researchers into account and to understand social structure in research communities.
  • Seijdel, N., Sakmakidis, N., De Haan, E. H. F., Bohte, S. M., & Scholte, H. S. (2019). Implicit scene segmentation in deeper convolutional neural networks. In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience (pp. 1059-1062). doi:10.32470/CCN.2019.1149-0.

    Abstract

    Feedforward deep convolutional neural networks (DCNNs) are matching and even surpassing human performance on object recognition. This performance suggests that activation of a loose collection of image
    features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Recent findings in humans however, suggest that while feedforward activity may suffice for
    sparse scenes with isolated objects, additional visual operations ('routines') that aid the recognition process (e.g. segmentation or grouping) are needed for more complex scenes. Linking human visual processing to
    performance of DCNNs with increasing depth, we here explored if, how, and when object information is differentiated from the backgrounds they appear on. To this end, we controlled the information in both objects
    and backgrounds, as well as the relationship between them by adding noise, manipulating background congruence and systematically occluding parts of the image. Results indicated less distinction between object- and background features for more shallow networks. For those networks, we observed a benefit of training on segmented objects (as compared to unsegmented objects). Overall, deeper networks trained on natural
    (unsegmented) scenes seem to perform implicit 'segmentation' of the objects from their background, possibly by improved selection of relevant features.
  • Senft, G. (2002). What should the ideal online-archive documenting linguistic data of various (endangered) languages and cultures offer to interested parties? Some ideas of a technically naive linguistic field researcher and potential user. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 11-15). Paris: European Language Resources Association.
  • Senft, G. (1991). Bakavilisi Biga - we can 'turn' the language - or: What happens to English words in Kilivila language? In W. Bahner, J. Schildt, & D. Viehwegger (Eds.), Proceedings of the XIVth International Congress of Linguists (pp. 1743-1746). Berlin: Akademie Verlag.
  • Seuren, P. A. M. (2002). Existential import. In D. De Jongh, M. Nilsenová, & H. Zeevat (Eds.), Proceedings of The 3rd and 4th International Symposium on Language, Logic and Computation. Amsterdam: ILLC Scientific Publ. U. of Amsterdam.
  • Seuren, P. A. M. (1991). Notes on noun phrases and quantification. In Proceedings of the International Conference on Current Issues in Computational Linguistics (pp. 19-44). Penang, Malaysia: Universiti Sains Malaysia.
  • Seuren, P. A. M. (1971). Qualche osservazione sulla frase durativa e iterativa in italiano. In M. Medici, & R. Simone (Eds.), Grammatica trasformazionale italiana (pp. 209-224). Roma: Bulzoni.
  • Seuren, P. A. M. (1991). What makes a text untranslatable? In H. M. N. Noor Ein, & H. S. Atiah (Eds.), Pragmatik Penterjemahan: Prinsip, Amalan dan Penilaian Menuju ke Abad 21 ("The Pragmatics of Translation: Principles, Practice and Evaluation Moving towards the 21st Century") (pp. 19-27). Kuala Lumpur: Dewan Bahasa dan Pustaka.
  • Shatzman, K. B. (2004). Segmenting ambiguous phrases using phoneme duration. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 329-332). Seoul: Sunjijn Printing Co.

    Abstract

    The results of an eye-tracking experiment are presented in which Dutch listeners' eye movements were monitored as they heard sentences and saw four pictured objects. Participants were instructed to click on the object mentioned in the sentence. In the critical sentences, a stop-initial target (e.g., "pot") was preceded by an [s], thus causing ambiguity regarding whether the sentence refers to a stop-initial or a cluster-initial word (e.g., "spot"). Participants made fewer fixations to the target pictures when the stop and the preceding [s] were cross-spliced from the cluster-initial word than when they were spliced from a different token of the sentence containing the stop-initial word. Acoustic analyses showed that the two versions differed in various measures, but only one of these - the duration of the [s] - correlated with the perceptual effect. Thus, in this context, the [s] duration information is an important factor guiding word recognition.
  • Shen, C., & Janse, E. (2019). Articulatory control in speech production. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 2533-2537). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
  • Shen, C., Cooke, M., & Janse, E. (2019). Individual articulatory control in speech enrichment. In M. Ochmann, M. Vorländer, & J. Fels (Eds.), Proceedings of the 23rd International Congress on Acoustics (pp. 5726-5730). Berlin: Deutsche Gesellschaft für Akustik.

    Abstract

    ndividual talkers may use various strategies to enrich their speech while speaking in noise (i.e., Lombard speech) to improve their intelligibility. The resulting acoustic-phonetic changes in Lombard speech vary amongst different speakers, but it is unclear what causes these talker differences, and what impact these differences have on intelligibility. This study investigates the potential role of articulatory control in talkers’ Lombard speech enrichment success. Seventy-eight speakers read out sentences in both their habitual style and in a condition where they were instructed to speak clearly while hearing loud speech-shaped noise. A diadochokinetic (DDK) speech task that requires speakers to repetitively produce word or non-word sequences as accurately and as rapidly as possible, was used to quantify their articulatory control. Individuals’ predicted intelligibility in both speaking styles (presented at -5 dB SNR) was measured using an acoustic glimpse-based metric: the High-Energy Glimpse Proportion (HEGP). Speakers’ HEGP scores show a clear effect of speaking condition (better HEGP scores in the Lombard than habitual condition), but no simple effect of articulatory control on HEGP, nor an interaction between speaking condition and articulatory control. This indicates that individuals’ speech enrichment success as measured by the HEGP metric was not predicted by DDK performance.
  • Shi, R., Werker, J., & Cutler, A. (2003). Function words in early speech perception. In Proceedings of the 15th International Congress of Phonetic Sciences (pp. 3009-3012).

    Abstract

    Three experiments examined whether infants recognise functors in phrases, and whether their representations of functors are phonetically well specified. Eight- and 13- month-old English infants heard monosyllabic lexical words preceded by real functors (e.g., the, his) versus nonsense functors (e.g., kuh); the latter were minimally modified segmentally (but not prosodically) from real functors. Lexical words were constant across conditions; thus recognition of functors would appear as longer listening time to sequences with real functors. Eightmonth- olds' listening times to sequences with real versus nonsense functors did not significantly differ, suggesting that they did not recognise real functors, or functor representations lacked phonetic specification. However, 13-month-olds listened significantly longer to sequences with real functors. Thus, somewhere between 8 and 13 months of age infants learn familiar functors and represent them with segmental detail. We propose that accumulated frequency of functors in input in general passes a critical threshold during this time.
  • Sidnell, J., & Stivers, T. (Eds.). (2005). Multimodal Interaction [Special Issue]. Semiotica, 156.
  • Silverstein, P., Bergmann, C., & Syed, M. (Eds.). (2024). Open science and metascience in developmental psychology [Special Issue]. Infant and Child Development, 33(1).
  • Sprenger, S. A., & Van Rijn, H. (2005). Clock time naming: Complexities of a simple task. In B. G. Bara, L. Barsalou, & M. Bucciarelli (Eds.), Proceedings of the 27th Annual Meeting of the Cognitive Science Society (pp. 2062-2067).
  • Ten Bosch, L., Baayen, R. H., & Ernestus, M. (2006). On speech variation and word type differentiation by articulatory feature representations. In Proceedings of Interspeech 2006 (pp. 2230-2233).

    Abstract

    This paper describes ongoing research aiming at the description of variation in speech as represented by asynchronous articulatory features. We will first illustrate how distances in the articulatory feature space can be used for event detection along speech trajectories in this space. The temporal structure imposed by the cosine distance in articulatory feature space coincides to a large extent with the manual segmentation on phone level. The analysis also indicates that the articulatory feature representation provides better such alignments than the MFCC representation does. Secondly, we will present first results that indicate that articulatory features can be used to probe for acoustic differences in the onsets of Dutch singulars and plurals.
  • Ten Bosch, L., Oostdijk, N., & De Ruiter, J. P. (2004). Turn-taking in social talk dialogues: Temporal, formal and functional aspects. In 9th International Conference Speech and Computer (SPECOM'2004) (pp. 454-461).

    Abstract

    This paper presents a quantitative analysis of the
    turn-taking mechanism evidenced in 93 telephone
    dialogues that were taken from the 9-million-word
    Spoken Dutch Corpus. While the first part of the paper
    focuses on the temporal phenomena of turn taking, such
    as durations of pauses and overlaps of turns in the
    dialogues, the second part explores the discoursefunctional
    aspects of utterances in a subset of 8
    dialogues that were annotated especially for this
    purpose. The results show that speakers adapt their turntaking
    behaviour to the interlocutor’s behaviour.
    Furthermore, the results indicate that male-male dialogs
    show a higher proportion of overlapping turns than
    female-female dialogues.
  • Ten Bosch, L., Mulder, K., & Boves, L. (2019). Phase synchronization between EEG signals as a function of differences between stimuli characteristics. In Proceedings of Interspeech 2019 (pp. 1213-1217). doi:10.21437/Interspeech.2019-2443.

    Abstract

    The neural processing of speech leads to specific patterns in the brain which can be measured as, e.g., EEG signals. When properly aligned with the speech input and averaged over many tokens, the Event Related Potential (ERP) signal is able to differentiate specific contrasts between speech signals. Well-known effects relate to the difference between expected and unexpected words, in particular in the N400, while effects in N100 and P200 are related to attention and acoustic onset effects. Most EEG studies deal with the amplitude of EEG signals over time, sidestepping the effect of phase and phase synchronization. This paper investigates the relation between phase in the EEG signals measured in an auditory lexical decision task by Dutch participants listening to full and reduced English word forms. We show that phase synchronization takes place across stimulus conditions, and that the so-called circular variance is narrowly related to the type of contrast between stimuli.
  • ten Bosch, L., Hämäläinen, A., Scharenborg, O., & Boves, L. (2006). Acoustic scores and symbolic mismatch penalties in phone lattices. In Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing [ICASSP 2006]. IEEE.

    Abstract

    This paper builds on previous work that aims at unraveling the structure of the speech signal by means of using probabilistic representations. The context of this work is a multi-pass speech recognition system in which a phone lattice is created and used as a basis for a lexical search in which symbolic mismatches are allowed at certain costs. The focus is on the optimization of the costs of phone insertions, deletions and substitutions that are used in the lexical decoding pass. Two optimization approaches are presented, one related to a multi-pass computational model for human speech recognition, the other based on a decoding in which Bayes’ risks are minimized. In the final section, the advantages of these optimization methods are discussed and compared.
  • ten Bosch, L., & Scharenborg, O. (2005). ASR decoding in a computational model of human word recognition. In Interspeech'2005 - Eurospeech, 9th European Conference on Speech Communication and Technology (pp. 1241-1244). ISCA Archive.

    Abstract

    This paper investigates the interaction between acoustic scores and symbolic mismatch penalties in multi-pass speech decoding techniques that are based on the creation of a segment graph followed by a lexical search. The interaction between acoustic and symbolic mismatches determines to a large extent the structure of the search space of these multipass approaches. The background of this study is a recently developed computational model of human word recognition, called SpeM. SpeM is able to simulate human word recognition data and is built as a multi-pass speech decoder. Here, we focus on unravelling the structure of the search space that is used in SpeM and similar decoding strategies. Finally, we elaborate on the close relation between distances in this search space, and distance measures in search spaces that are based on a combination of acoustic and phonetic features.
  • Ten Bosch, L., Oostdijk, N., & De Ruiter, J. P. (2004). Durational aspects of turn-taking in spontaneous face-to-face and telephone dialogues. In P. Sojka, I. Kopecek, & K. Pala (Eds.), Text, Speech and Dialogue: Proceedings of the 7th International Conference TSD 2004 (pp. 563-570). Heidelberg: Springer.

    Abstract

    On the basis of two-speaker spontaneous conversations, it is shown that the distributions of both pauses and speech-overlaps of telephone and faceto-face dialogues have different statistical properties. Pauses in a face-to-face
    dialogue last up to 4 times longer than pauses in telephone conversations in functionally comparable conditions. There is a high correlation (0.88 or larger) between the average pause duration for the two speakers across face-to-face
    dialogues and telephone dialogues. The data provided form a first quantitative analysis of the complex turn-taking mechanism evidenced in the dialogues available in the 9-million-word Spoken Dutch Corpus.
  • Ter Bekke, M., Ozyurek, A., & Ünal, E. (2019). Speaking but not gesturing predicts motion event memory within and across languages. In A. Goel, C. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2940-2946). Montreal, QB: Cognitive Science Society.

    Abstract

    In everyday life, people see, describe and remember motion events. We tested whether the type of motion event information (path or manner) encoded in speech and gesture predicts which information is remembered and if this varies across speakers of typologically different languages. We focus on intransitive motion events (e.g., a woman running to a tree) that are described differently in speech and co-speech gesture across languages, based on how these languages typologically encode manner and path information (Kita & Özyürek, 2003; Talmy, 1985). Speakers of Dutch (n = 19) and Turkish (n = 22) watched and described motion events. With a surprise (i.e. unexpected) recognition memory task, memory for manner and path components of these events was measured. Neither Dutch nor Turkish speakers’ memory for manner went above chance levels. However, we found a positive relation between path speech and path change detection: participants who described the path during encoding were more accurate at detecting changes to the path of an event during the memory task. In addition, the relation between path speech and path memory changed with native language: for Dutch speakers encoding path in speech was related to improved path memory, but for Turkish speakers no such relation existed. For both languages, co-speech gesture did not predict memory speakers. We discuss the implications of these findings for our understanding of the relations between speech, gesture, type of encoding in language and memory.
  • Troncoso Ruiz, A., Ernestus, M., & Broersma, M. (2019). Learning to produce difficult L2 vowels: The effects of awareness-rasing, exposure and feedback. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 1094-1098). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
  • Tuinman, A. (2006). Overcompensation of /t/ reduction in Dutch by German/Dutch bilinguals. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 101-102).
  • Van Dooren, A., Tulling, M., Cournane, A., & Hacquard, V. (2019). Discovering modal polysemy: Lexical aspect might help. In M. Brown, & B. Dailey (Eds.), BUCLD 43: Proceedings of the 43rd annual Boston University Conference on Language Development (pp. 203-216). Sommerville, MA: Cascadilla Press.
  • Van Ooijen, B., Cutler, A., & Norris, D. (1991). Detection times for vowels versus consonants. In Eurospeech 91: Vol. 3 (pp. 1451-1454). Genova: Istituto Internazionale delle Comunicazioni.

    Abstract

    This paper reports two experiments with vowels and consonants as phoneme detection targets in real words. In the first experiment, two relatively distinct vowels were compared with two confusible stop consonants. Response times to the vowels were longer than to the consonants. Response times correlated negatively with target phoneme length. In the second, two relatively distinct vowels were compared with their corresponding semivowels. This time, the vowels were detected faster than the semivowels. We conclude that response time differences between vowels and stop consonants in this task may reflect differences between phoneme categories in the variability of tokens, both in the acoustic realisation of targets and in the' representation of targets by subjects.
  • Van Valin Jr., R. D. (1987). Aspects of the interaction of syntax and pragmatics: Discourse coreference mechanisms and the typology of grammatical systems. In M. Bertuccelli Papi, & J. Verschueren (Eds.), The pragmatic perspective: Selected papers from the 1985 International Pragmatics Conference (pp. 513-531). Amsterdam: Benjamins.
  • Van den Bos, E. J., & Poletiek, F. H. (2006). Implicit artificial grammar learning in adults and children. In R. Sun (Ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006) (pp. 2619). Austin, TX, USA: Cognitive Science Society.
  • Van de Weijer, J. (1997). Language input to a prelingual infant. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the GALA '97 conference on language acquisition (pp. 290-293). Edinburgh University Press.

    Abstract

    Pitch, intonation, and speech rate were analyzed in a collection of everyday speech heard by one Dutch infant between the ages of six and nine months. Components of each of these variables were measured in the speech of three adult speakers (mother, father, baby-sitter) when they addressed the infant, and when they addressed another adult. The results are in line with previously reported findings which are usually based on laboratory or prearranged settings: infant-directed speech in a natural setting exhibits more pitch variation, a larger number of simple intonation contours, and slower speech rate than does adult-directed speech.
  • Van Heuven, V. J., Haan, J., Janse, E., & Van der Torre, E. J. (1997). Perceptual identification of sentence type and the time-distribution of prosodic interrogativity markers in Dutch. In Proceedings of the ESCA Tutorial and Research Workshop on Intonation: Theory, Models and Applications, Athens, Greece, 1997 (pp. 317-320).

    Abstract

    Dutch distinguishes at least four sentence types: statements and questions, the latter type being subdivided into wh-questions (beginning with a question word), yes/no-questions (with inversion of subject and finite), and declarative questions (lexico-syntactically identical to statement). Acoustically, each of these (sub)types was found to have clearly distinct global F0-patterns, as well as a characteristic distribution of final rises [1,2]. The present paper explores the separate contribution of parameters of global downtrend and size of accent-lending pitch movements versus aspects of the terminal rise to the human identification of the four sentence (sub)types, at various positions in the time-course of the utterance. The results show that interrogativity in Dutch can be identified at an early point in the utterance. However, wh-questions are not distinct from statements.
  • Van Valin Jr., R. D. (1987). Pragmatics, island phenomena, and linguistic competence. In A. M. Farley, P. T. Farley, & K.-E. McCullough (Eds.), CLS 22. Papers from the parasession on pragmatics and grammatical theory (pp. 223-233). Chicago Linguistic Society.
  • Vosse, T., & Kempen, G. (1991). A hybrid model of human sentence processing: Parsing right-branching, center-embedded and cross-serial dependencies. In M. Tomita (Ed.), Proceedings of the Second International Workshop on Parsing Technologies.
  • Wagner, A., & Braun, A. (2003). Is voice quality language-dependent? Acoustic analyses based on speakers of three different languages. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 651-654). Adelaide: Causal Productions.
  • Wagner, M. A., Broersma, M., McQueen, J. M., & Lemhöfer, K. (2019). Imitating speech in an unfamiliar language and an unfamiliar non-native accent in the native language. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1362-1366). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    This study concerns individual differences in speech imitation ability and the role that lexical representations play in imitation. We examined 1) whether imitation of sounds in an unfamiliar language (L0) is related to imitation of sounds in an unfamiliar
    non-native accent in the speaker’s native language (L1) and 2) whether it is easier or harder to imitate speech when you know the words to be imitated. Fifty-nine native Dutch speakers imitated words with target vowels in Basque (/a/ and /e/) and Greekaccented
    Dutch (/i/ and /u/). Spectral and durational
    analyses of the target vowels revealed no relationship between the success of L0 and L1 imitation and no difference in performance between tasks (i.e., L1
    imitation was neither aided nor blocked by lexical knowledge about the correct pronunciation). The results suggest instead that the relationship of the vowels to native phonological categories plays a bigger role in imitation
  • Warner, N., & Weber, A. (2002). Stop epenthesis at syllable boundaries. In J. H. L. Hansen, & B. Pellom (Eds.), 7th International Conference on Spoken Language Processing (ICSLP2002 - INTERSPEECH 2002) (pp. 1121-1124). ISCA Archive.

    Abstract

    This paper investigates the production and perception of epenthetic stops at syllable boundaries in Dutch and compares the experimental data with lexical statistics for Dutch and English. This extends past work on epenthesis in coda position [1]. The current work is particularly informative regarding the question of phonotactic constraints’ influence on parsing of speech variability.
  • Warner, N., Jongman, A., & Mücke, D. (2002). Variability in direction of dorsal movement during production of /l/. In J. H. L. Hansen, & B. Pellom (Eds.), 7th International Conference on Spoken Language Processing (ICSLP2002 - INTERSPEECH 2002) (pp. 1089-1092). ISCA Archive.

    Abstract

    This paper presents articulatory data on the production of /l/ in various environments in Dutch, and shows that the direction of movement of the tongue dorsum varies across environments. This makes it impossible to measure tongue position at the peak of the dorsal gesture. We argue for an alternative method in such cases: measurement of position of one articulator at a time point defined by the gesture of another. We present new data measured this way which confirms a previous finding on the articulation of Dutch /l/.
  • Weber, A., & Smits, R. (2003). Consonant and vowel confusion patterns by American English listeners. In M. J. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences.

    Abstract

    This study investigated the perception of American English phonemes by native listeners. Listeners identified either the consonant or the vowel in all possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (0 dB, 8 dB, and 16 dB). Effects of syllable position, signal-to-noise ratio, and articulatory features on vowel and consonant identification are discussed. The results constitute the largest source of data that is currently available on phoneme confusion patterns of American English phonemes by native listeners.
  • Weber, A., & Smits, R. (2003). Consonant and vowel confusion patterns by American English listeners. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 1437-1440). Adelaide: Causal Productions.

    Abstract

    This study investigated the perception of American English phonemes by native listeners. Listeners identified either the consonant or the vowel in all possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signalto-noise ratios (0 dB, 8 dB, and 16 dB). Effects of syllable position, signal-to-noise ratio, and articulatory features on vowel and consonant identification are discussed. The results constitute the largest source of data that is currently available on phoneme confusion patterns of American English phonemes by native listeners.
  • Weber, A. (1998). Listening to nonnative language which violates native assimilation rules. In D. Duez (Ed.), Proceedings of the European Scientific Communication Association workshop: Sound patterns of Spontaneous Speech (pp. 101-104).

    Abstract

    Recent studies using phoneme detection tasks have shown that spoken-language processing is neither facilitated nor interfered with by optional assimilation, but is inhibited by violation of obligatory assimilation. Interpretation of these results depends on an assessment of their generality, specifically, whether they also obtain when listeners are processing nonnative language. Two separate experiments are presented in which native listeners of German and native listeners of Dutch had to detect a target fricative in legal monosyllabic Dutch nonwords. All of the nonwords were correct realisations in standard Dutch. For German listeners, however, half of the nonwords contained phoneme strings which violate the German fricative assimilation rule. Whereas the Dutch listeners showed no significant effects, German listeners detected the target fricative faster when the German fricative assimilation was violated than when no violation occurred. The results might suggest that violation of assimilation rules does not have to make processing more difficult per se.
  • Weber, A., & Paris, G. (2004). The origin of the linguistic gender effect in spoken-word recognition: Evidence from non-native listening. In K. Forbus, D. Gentner, & T. Tegier (Eds.), Proceedings of the 26th Annual Meeting of the Cognitive Science Society. Mahwah, NJ: Erlbaum.

    Abstract

    Two eye-tracking experiments examined linguistic gender effects in non-native spoken-word recognition. French participants, who knew German well, followed spoken instructions in German to click on pictures on a computer screen (e.g., Wo befindet sich die Perle, “where is the pearl”) while their eye movements were monitored. The name of the target picture was preceded by a gender-marked article in the instructions. When a target and a competitor picture (with phonologically similar names) were of the same gender in both German and French, French participants fixated competitor pictures more than unrelated pictures. However, when target and competitor were of the same gender in German but of different gender in French, early fixations to the competitor picture were reduced. Competitor activation in the non-native language was seemingly constrained by native gender information. German listeners showed no such viewing time difference. The results speak against a form-based account of the linguistic gender effect. They rather support the notion that the effect originates from the grammatical level of language processing.
  • Weber, A., & Mueller, K. (2004). Word order variation in German main clauses: A corpus analysis. In Proceedings of the 20th International Conference on Computational Linguistics.

    Abstract

    In this paper, we present empirical data from a corpus study on the linear order of subjects and objects in German main clauses. The aim was to establish the validity of three well-known ordering constraints: given complements tend to occur before new complements, definite before indefinite, and pronoun before full noun phrase complements. Frequencies of occurrences were derived for subject-first and object-first sentences from the German Negra corpus. While all three constraints held on subject-first sentences, results for object-first sentences varied. Our findings suggest an influence of grammatical functions on the ordering of verb complements.
  • Widlok, T. (2006). Two ways of looking at a Mangetti grove. In A. Takada (Ed.), Proceedings of the workshop: Landscape and society (pp. 11-16). Kyoto: 21st Century Center of Excellence Program.
  • Wittek, A. (1998). Learning verb meaning via adverbial modification: Change-of-state verbs in German and the adverb "wieder" again. In A. Greenhill, M. Hughes, H. Littlefield, & H. Walsh (Eds.), Proceedings of the 22nd Annual Boston University Conference on Language Development (pp. 779-790). Somerville, MA: Cascadilla Press.
  • Wittenburg, P. (2004). The IMDI metadata concept. In S. F. Ferreira (Ed.), Workingmaterial on Building the LR&E Roadmap: Joint COCOSDA and ICCWLRE Meeting, (LREC2004). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Brugman, H., Broeder, D., & Russel, A. (2004). XML-based language archiving. In Workshop Proceedings on XML-based Richly Annotaded Corpora (LREC2004) (pp. 63-69). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Gulrajani, G., Broeder, D., & Uneson, M. (2004). Cross-disciplinary integration of metadata descriptions. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC2004) (pp. 113-116). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Kita, S., & Brugman, H. (2002). Crosslinguistic studies of multimodal communication.
  • Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: a professional framework for multimodality research. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1556-1559).

    Abstract

    Utilization of computer tools in linguistic research has gained importance with the maturation of media frameworks for the handling of digital audio and video. The increased use of these tools in gesture, sign language and multimodal interaction studies has led to stronger requirements on the flexibility, the efficiency and in particular the time accuracy of annotation tools. This paper describes the efforts made to make ELAN a tool that meets these requirements, with special attention to the developments in the area of time accuracy. In subsequent sections an overview will be given of other enhancements in the latest versions of ELAN, that make it a useful tool in multimodality research.
  • Wittenburg, P., Johnson, H., Buchhorn, M., Brugman, H., & Broeder, D. (2004). Architecture for distributed language resource management and archiving. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC2004) (pp. 361-364). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Peters, W., & Drude, S. (2002). Analysis of lexical structures from field linguistics and language engineering. In M. R. González, & C. P. S. Araujo (Eds.), Third international conference on language resources and evaluation (pp. 682-686). Paris: European Language Resources Association.

    Abstract

    Lexica play an important role in every linguistic discipline. We are confronted with many types of lexica. Depending on the type of lexicon and the language we are currently faced with a large variety of structures from very simple tables to complex graphs, as was indicated by a recent overview of structures found in dictionaries from field linguistics and language engineering. It is important to assess these differences and aim at the integration of lexical resources in order to improve lexicon creation, exchange and reuse. This paper describes the first step towards the integration of existing structures and standards into a flexible abstract model.
  • Wittenburg, P., Broeder, D., Klein, W., Levinson, S. C., & Romary, L. (2006). Foundations of modern language resource archives. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 625-628).

    Abstract

    A number of serious reasons will convince an increasing amount of researchers to store their relevant material in centers which we will call "language resource archives". They combine the duty of taking care of long-term preservation as well as the task to give access to their material to different user groups. Access here is meant in the sense that an active interaction with the data will be made possible to support the integration of new data, new versions or commentaries of all sort. Modern Language Resource Archives will have to adhere to a number of basic principles to fulfill all requirements and they will have to be involved in federations to create joint language resource domains making it even more simple for the researchers to access the data. This paper makes an attempt to formulate the essential pillars language resource archives have to adhere to.
  • Wittenburg, P., & Broeder, D. (2002). Metadata overview and the semantic web. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics. Paris: European Language Resources Association.

    Abstract

    The increasing quantity and complexity of language resources leads to new management problems for those that collect and those that need to preserve them. At the same time the desire to make these resources available on the Internet demands an efficient way characterizing their properties to allow discovery and re-use. The use of metadata is seen as a solution for both these problems. However, the question is what specific requirements there are for the specific domain and if these are met by existing frameworks. Any possible solution should be evaluated with respect to its merit for solving the domain specific problems but also with respect to its future embedding in “global” metadata frameworks as part of the Semantic Web activities.
  • Wittenburg, P., Peters, W., & Broeder, D. (2002). Metadata proposals for corpora and lexica. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 1321-1326). Paris: European Language Resources Association.
  • Wittenburg, P., Mosel, U., & Dwyer, A. (2002). Methods of language documentation in the DOBES program. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 36-42). Paris: European Language Resources Association.
  • Wolf, M. C., Smith, A. C., Meyer, A. S., & Rowland, C. F. (2019). Modality effects in vocabulary acquisition. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 1212-1218). Montreal, QB: Cognitive Science Society.

    Abstract

    It is unknown whether modality affects the efficiency with which humans learn novel word forms and their meanings, with previous studies reporting both written and auditory advantages. The current study implements controls whose absence in previous work likely offers explanation for such contradictory findings. In two novel word learning experiments, participants were trained and tested on pseudoword - novel object pairs, with controls on: modality of test, modality of meaning, duration of exposure and transparency of word form. In both experiments word forms were presented in either their written or spoken form, each paired with a pictorial meaning (novel object). Following a 20-minute filler task, participants were tested on their ability to identify the picture-word form pairs on which they were trained. A between subjects design generated four participant groups per experiment 1) written training, written test; 2) written training, spoken test; 3) spoken training, written test; 4) spoken training, spoken test. In Experiment 1 the written stimulus was presented for a time period equal to the duration of the spoken form. Results showed that when the duration of exposure was equal, participants displayed a written training benefit. Given words can be read faster than the time taken for the spoken form to unfold, in Experiment 2 the written form was presented for 300 ms, sufficient time to read the word yet 65% shorter than the duration of the spoken form. No modality effect was observed under these conditions, when exposure to the word form was equivalent. These results demonstrate, at least for proficient readers, that when exposure to the word form is controlled across modalities the efficiency with which word form-meaning associations are learnt does not differ. Our results therefore suggest that, although we typically begin as aural-only word learners, we ultimately converge on developing learning mechanisms that learn equally efficiently from both written and spoken materials.
  • Zwitserlood, I. (2002). The complex structure of ‘simple’ signs in NGT. In J. Van Koppen, E. Thrift, E. Van der Torre, & M. Zimmermann (Eds.), Proceedings of ConSole IX (pp. 232-246).

    Abstract

    In this paper, I argue that components in a set of simple signs in Nederlandse Gebarentaal (also called Sign Language of the Netherlands; henceforth: NGT), i.e. hand configuration (including orientation), movement and place of articulation, can also have morphological status. Evidence for this is provided by: firstly, the fact that handshape, orientation, movement and place of articulation show regular meaningful patterns in signs, which patterns also occur in newly formed signs, and secondly, the gradual change of formerly noninflecting predicates into inflectional predicates. The morphological complexity of signs can best be accounted for in autosegmental morphological templates.

Share this page