Publications

Displaying 101 - 200 of 227
  • Janse, E., Van der Werff, M., & Quené, H. (2007). Listening to fast speech: Aging and sentence context. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 681-684). Dudweiler: Pirrot.

    Abstract

    In this study we investigated to what extent a meaningful sentence context facilitates spoken word processing in young and older listeners if listening is made taxing by time-compressing the speech. Even though elderly listeners have been shown to benefit more from sentence context in difficult listening conditions than young listeners, time compression of speech may interfere with semantic comprehension, particularly in older listeners because of cognitive slowing. The results of a target detection experiment showed that, unlike young listeners who showed facilitation by context at both rates, elderly listeners showed context facilitation at the intermediate, but not at the fastest rate. This suggests that semantic interpretation lags behind target identification.
  • Jesse, A., & McQueen, J. M. (2007). Prelexical adjustments to speaker idiosyncracies: Are they position-specific? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1597-1600). Adelaide: Causal Productions.

    Abstract

    Listeners use lexical knowledge to adjust their prelexical representations of speech sounds in response to the idiosyncratic pronunciations of particular speakers. We used an exposure-test paradigm to investigate whether this type of perceptual learning transfers across syllabic positions. No significant learning effect was found in Experiment 1, where exposure sounds were onsets and test sounds were codas. Experiments 2-4 showed that there was no learning even when both exposure and test sounds were onsets. But a trend was found when exposure sounds were codas and test sounds were onsets (Experiment 5). This trend was smaller than the robust effect previously found for the coda-to-coda case. These findings suggest that knowledge about idiosyncratic pronunciations may be position specific: Knowledge about how a speaker produces sounds in one position, if it can be acquired at all, influences perception of sounds in that position more strongly than of sounds in another position.
  • Jesse, A., McQueen, J. M., & Page, M. (2007). The locus of talker-specific effects in spoken-word recognition. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1921-1924). Dudweiler: Pirrot.

    Abstract

    Words repeated in the same voice are better recognized than when they are repeated in a different voice. Such findings have been taken as evidence for the storage of talker-specific lexical episodes. But results on perceptual learning suggest that talker-specific adjustments concern sublexical representations. This study thus investigates whether voice-specific repetition effects in auditory lexical decision are lexical or sublexical. The same critical set of items in Block 2 were, depending on materials in Block 1, either same-voice or different-voice word repetitions, new words comprising re-orderings of phonemes used in the same voice in Block 1, or new words with previously unused phonemes. Results show a benefit for words repeated by the same talker, and a smaller benefit for words consisting of phonemes repeated by the same talker. Talker-specific information thus appears to influence word recognition at multiple representational levels.
  • Jesse, A., & McQueen, J. M. (2007). Visual lexical stress information in audiovisual spoken-word recognition. In J. Vroomen, M. Swerts, & E. Krahmer (Eds.), Proceedings of the International Conference on Auditory-Visual Speech Processing 2007 (pp. 162-166). Tilburg: University of Tilburg.

    Abstract

    Listeners use suprasegmental auditory lexical stress information to resolve the competition words engage in during spoken-word recognition. The present study investigated whether (a) visual speech provides lexical stress information, and, more importantly, (b) whether this visual lexical stress information is used to resolve lexical competition. Dutch word pairs that differ in the lexical stress realization of their first two syllables, but not segmentally (e.g., 'OCtopus' and 'okTOber'; capitals marking primary stress) served as auditory-only, visual-only, and audiovisual speech primes. These primes either matched (e.g., 'OCto-'), mismatched (e.g., 'okTO-'), or were unrelated to (e.g., 'maCHI-') a subsequent printed target (octopus), which participants had to make a lexical decision to. To the degree that visual speech contains lexical stress information, lexical decisions to printed targets should be modulated through the addition of visual speech. Results show, however, no evidence for a role of visual lexical stress information in audiovisual spoken-word recognition.
  • Jesse, A., & Johnson, E. K. (2008). Audiovisual alignment in child-directed speech facilitates word learning. In Proceedings of the International Conference on Auditory-Visual Speech Processing (pp. 101-106). Adelaide, Aust: Causal Productions.

    Abstract

    Adult-to-child interactions are often characterized by prosodically-exaggerated speech accompanied by visually captivating co-speech gestures. In a series of adult studies, we have shown that these gestures are linked in a sophisticated manner to the prosodic structure of adults' utterances. In the current study, we use the Preferential Looking Paradigm to demonstrate that two-year-olds can use the alignment of these gestures to speech to deduce the meaning of words.
  • Joo, H., Jang, J., Kim, S., Cho, T., & Cutler, A. (2019). Prosodic structural effects on coarticulatory vowel nasalization in Australian English in comparison to American English. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 835-839). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    This study investigates effects of prosodic factors (prominence, boundary) on coarticulatory Vnasalization in Australian English (AusE) in CVN and NVC in comparison to those in American English
    (AmE). As in AmE, prominence was found to
    lengthen N, but to reduce V-nasalization, enhancing N’s nasality and V’s orality, respectively (paradigmatic contrast enhancement). But the prominence effect in CVN was more robust than that in AmE. Again similar to findings in AmE, boundary
    induced a reduction of N-duration and V-nasalization phrase-initially (syntagmatic contrast enhancement), and increased the nasality of both C and V phrasefinally.
    But AusE showed some differences in terms
    of the magnitude of V nasalization and N duration. The results suggest that the linguistic contrast enhancements underlie prosodic-structure modulation of coarticulatory V-nasalization in
    comparable ways across dialects, while the fine phonetic detail indicates that the phonetics-prosody interplay is internalized in the individual dialect’s phonetic grammar.
  • Jordan, F. (2007). A comparative phylogenetic approach to Austronesian cultural evolution. PhD Thesis, University College London, London.
  • Kelly, S. D., & Ozyurek, A. (Eds.). (2007). Gesture, language, and brain [Special Issue]. Brain and Language, 101(3).
  • Kempen, G., & Harbusch, K. (1998). A 'tree adjoining' grammar without adjoining: The case of scrambling in German. In Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4).
  • Kempen, G., & Hoenkamp, E. (1982). Incremental sentence generation: Implications for the structure of a syntactic processor. In J. Horecký (Ed.), COLING 82. Proceedings of the Ninth International Conference on Computational Linguistics, Prague, July 5-10, 1982 (pp. 151-156). Amsterdam: North-Holland.

    Abstract

    Human speakers often produce sentences incrementally. They can start speaking having in mind only a fragmentary idea of what they want to say, and while saying this they refine the contents underlying subsequent parts of the utterance. This capability imposes a number of constraints on the design of a syntactic processor. This paper explores these constraints and evaluates some recent computational sentence generators from the perspective of incremental production.
  • Kemps-Snijders, M., Klassmann, A., Zinn, C., Berck, P., Russel, A., & Wittenburg, P. (2008). Exploring and enriching a language resource archive via the web. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    The ”download first, then process paradigm” is still the predominant working method amongst the research community. The web-based paradigm, however, offers many advantages from a tool development and data management perspective as they allow a quick adaptation to changing research environments. Moreover, new ways of combining tools and data are increasingly becoming available and will eventually enable a true web-based workflow approach, thus challenging the ”download first, then process” paradigm. The necessary infrastructure for managing, exploring and enriching language resources via the Web will need to be delivered by projects like CLARIN and DARIAH
  • Kemps-Snijders, M., Zinn, C., Ringersma, J., & Windhouwer, M. (2008). Ensuring semantic interoperability on lexical resources. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    In this paper, we describe a unifying approach to tackle data heterogeneity issues for lexica and related resources. We present LEXUS, our software that implements the Lexical Markup Framework (LMF) to uniformly describe and manage lexica of different structures. LEXUS also makes use of a central Data Category Registry (DCR) to address terminological issues with regard to linguistic concepts as well as the handling of working and object languages. Finally, we report on ViCoS, a LEXUS extension, providing support for the definition of arbitrary semantic relations between lexical entries or parts thereof.
  • Kemps-Snijders, M., Windhouwer, M., Wittenburg, P., & Wright, S. E. (2008). ISOcat: Corralling data categories in the wild. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    To achieve true interoperability for valuable linguistic resources different levels of variation need to be addressed. ISO Technical Committee 37, Terminology and other language and content resources, is developing a Data Category Registry. This registry will provide a reusable set of data categories. A new implementation, dubbed ISOcat, of the registry is currently under construction. This paper shortly describes the new data model for data categories that will be introduced in this implementation. It goes on with a sketch of the standardization process. Completed data categories can be reused by the community. This is done by either making a selection of data categories using the ISOcat web interface, or by other tools which interact with the ISOcat system using one of its various Application Programming Interfaces. Linguistic resources that use data categories from the registry should include persistent references, e.g. in the metadata or schemata of the resource, which point back to their origin. These data category references can then be used to determine if two or more resources share common semantics, thus providing a level of interoperability close to the source data and a promising layer for semantic alignment on higher levels
  • Khemlani, S., Leslie, S.-J., Glucksberg, S., & Rubio-Fernández, P. (2007). Do ducks lay eggs? How people interpret generic assertions. In D. S. McNamara, & J. G. Trafton (Eds.), Proceedings of the 29th Annual Conference of the Cognitive Science Society (CogSci 2007). Austin, TX: Cognitive Science Society.
  • Kita, S., van Gijn, I., & van der Hulst, H. (1998). Movement phases in signs and co-speech gestures, and their transcription by human coders. In Gesture and Sign-Language in Human-Computer Interaction (Lecture Notes in Artificial Intelligence - LNCS Subseries, Vol. 1371) (pp. 23-35). Berlin, Germany: Springer-Verlag.

    Abstract

    The previous literature has suggested that the hand movement in co-speech gestures and signs consists of a series of phases with qualitatively different dynamic characteristics. In this paper, we propose a syntagmatic rule system for movement phases that applies to both co-speech gestures and signs. Descriptive criteria for the rule system were developed for the analysis video-recorded continuous production of signs and gesture. It involves segmenting a stream of body movement into phases and identifying different phase types. Two human coders used the criteria to analyze signs and cospeech gestures that are produced in natural discourse. It was found that the criteria yielded good inter-coder reliability. These criteria can be used for the technology of automatic recognition of signs and co-speech gestures in order to segment continuous production and identify the potentially meaningbearing phase.
  • Klein, W., & Von Stutterheim, C. (Eds.). (2007). Sprachliche Perspektivierung [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 145.
  • Klein, W. (Ed.). (1989). Kindersprache [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (73).
  • Klein, W., & Schnell, R. (Eds.). (2008). Literaturwissenschaft und Linguistik [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (150).
  • Klein, W. (Ed.). (1983). Intonation [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (49).
  • Klein, W. (Ed.). (2008). Ist Schönheit messbar? [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 152.
  • Klein, W. (Ed.). (1998). Kaleidoskop [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (112).
  • Klein, W. (Ed.). (1982). Zweitspracherwerb [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (45).
  • Kooijman, V. (2007). Continuous-speech segmentation at the beginning of language acquisition: Electrophysiological evidence. PhD Thesis, Radboud University Nijmegen, Nijmegen.

    Abstract

    Word segmentation, or detecting word boundaries in continuous speech, is not an easy task. Spoken language does not contain silences to indicate word boundaries and words partly overlap due to coarticalution. Still, adults listening to their native language perceive speech as individual words. They are able to combine different distributional cues in the language, such as the statistical distribution of sounds and metrical cues, with lexical information, to efficiently detect word boundaries. Infants in the first year of life do not command these cues. However, already between seven and ten months of age, before they know word meaning, infants learn to segment words from speech. This important step in language acquisition is the topic of this dissertation. In chapter 2, the first Event Related Brain Potential (ERP) study on word segmentation in Dutch ten-month-olds is discussed. The results show that ten-month-olds can already segment words with a strong-weak stress pattern from speech and they need roughly the first half of a word to do so. Chapter 3 deals with segmentation of words beginning with a weak syllable, as a considerable number of words in Dutch do not follow the predominant strong-weak stress pattern. The results show that ten-month-olds still largely rely on the strong syllable in the language, and do not show an ERP response to the initial weak syllable. In chapter 4, seven-month-old infants' segmentation of strong-weak words was studied. An ERP response was found to strong-weak words presented in sentences. However, a behavioral response was not found in an additional Headturn Preference Procedure study. There results suggest that the ERP response is a precursor to the behavioral response that infants show at a later age.

    Additional information

    full text via Radboud Repository
  • Kuperman, V. (2008). Lexical processing of morphologically complex words: An information-theoretical perspective. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Kuzla, C., & Ernestus, M. (2007). Prosodic conditioning of phonetic detail of German plosives. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 461-464). Dudweiler: Pirrot.

    Abstract

    The present study investigates the influence of prosodic structure on the fine-grained phonetic details of German plosives which also cue the phonological fortis-lenis contrast. Closure durations were found to be longer at higher prosodic boundaries. There was also less glottal vibration in lenis plosives at higher prosodic boundaries. Voice onset time in lenis plosives was not affected by prosody. In contrast, for the fortis plosives VOT decreased at higher boundaries, as did the maximal intensity of the release. These results demonstrate that the effects of prosody on different phonetic cues can go into opposite directions, but are overall constrained by the need to maintain phonological contrasts. While prosodic effects on some cues are compatible with a ‘fortition’ account of prosodic strengthening or with a general feature enhancement explanation, the effects on others enhance paradigmatic contrasts only within a given prosodic position.
  • Lai, V. T., Chang, M., Duffield, C., Hwang, J., Xue, N., & Palmer, M. (2007). Defining a methodology for mapping Chinese and English sense inventories. In Proceedings of the 8th Chinese Lexical Semantics Workshop 2007 (CLSW 2007). The Hong Kong Polytechnic University, Hong Kong, May 21-23 (pp. 59-65).

    Abstract

    In this study, we explored methods for linking Chinese and English sense inventories using two opposing approaches: creating links (1) bottom-up: by starting at the finer-grained sense level then proceeding to the verb subcategorization frames and (2) top-down: by starting directly with the more coarse-grained frame levels. The sense inventories for linking include pre-existing corpora, such as English Propbank (Palmer, Gildea, and Kingsbury, 2005), Chinese Propbank (Xue and Palmer, 2004) and English WordNet (Fellbaum, 1998) and newly created corpora, the English and Chinese Sense Inventories from DARPA-GALE OntoNotes. In the linking task, we selected a group of highly frequent and polysemous communication verbs, including say, ask, talk, and speak in English, and shuo, biao-shi, jiang, and wen in Chinese. We found that with the bottom-up method, although speakers of both languages agreed on the links between senses, the subcategorization frames of the corresponding senses did not match consistently. With the top-down method, if the verb frames match in both languages, their senses line up more quickly to each other. The results indicate that the top-down method is more promising in linking English and Chinese sense inventories.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2008). Accelerating 3D medical image segmentation with high performance computing. In Proceedings of the IEEE International Workshops on Image Processing Theory, Tools and Applications - IPT (pp. 1-8).

    Abstract

    Digital processing of medical images has helped physicians and patients during past years by allowing examination and diagnosis on a very precise level. Nowadays possibly the biggest deal of support it can offer for modern healthcare is the use of high performance computing architectures to treat the huge amounts of data that can be collected by modern acquisition devices. This paper presents a parallel processing implementation of an image segmentation algorithm that operates on a computer cluster equipped with 10 processing units. Thanks to well-organized distribution of the workload we manage to significantly shorten the execution time of the developed algorithm and reach a performance gain very close to linear.
  • Levelt, W. J. M. (1991). Lexical access in speech production: Stages versus cascading. In H. Peters, W. Hulstijn, & C. Starkweather (Eds.), Speech motor control and stuttering (pp. 3-10). Amsterdam: Excerpta Medica.
  • Levelt, W. J. M. (1983). The speaker's organization of discourse. In Proceedings of the XIIIth International Congress of Linguists (pp. 278-290).
  • Liu, S., & Zhang, Y. (2019). Why some verbs are harder to learn than others – A micro-level analysis of everyday learning contexts for early verb learning. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2173-2178). Montreal, QB: Cognitive Science Society.

    Abstract

    Verb learning is important for young children. While most
    previous research has focused on linguistic and conceptual
    challenges in early verb learning (e.g. Gentner, 1982, 2006),
    the present paper examined early verb learning at the
    attentional level and quantified the input for early verb learning
    by measuring verb-action co-occurrence statistics in parent-
    child interaction from the learner’s perspective. To do so, we
    used head-mounted eye tracking to record fine-grained
    multimodal behaviors during parent-infant joint play, and
    analyzed parent speech, parent and infant action, and infant
    attention at the moments when parents produced verb labels.
    Our results show great variability across different action verbs,
    in terms of frequency of verb utterances, frequency of
    corresponding actions related to verb meanings, and infants’
    attention to verbs and actions, which provide new insights on
    why some verbs are harder to learn than others.
  • Lucas, C., Griffiths, T., Xu, F., & Fawcett, C. (2008). A rational model of preference learning and choice prediction by children. In D. Koller, Y. Bengio, D. Schuurmans, L. Bottou, & A. Culotta (Eds.), Advances in Neural Information Processing Systems.

    Abstract

    Young children demonstrate the ability to make inferences about the preferences of other agents based on their choices. However, there exists no overarching account of what children are doing when they learn about preferences or how they use that knowledge. We use a rational model of preference learning, drawing on ideas from economics and computer science, to explain the behavior of children in several recent experiments. Specifically, we show how a simple econometric model can be extended to capture two- to four-year-olds’ use of statistical information in inferring preferences, and their generalization of these preferences.
  • Magyari, L., & De Ruiter, J. P. (2008). Timing in conversation: The anticipation of turn endings. In J. Ginzburg, P. Healey, & Y. Sato (Eds.), Proceedings of the 12th Workshop on the Semantics and Pragmatics Dialogue (pp. 139-146). London: King's college.

    Abstract

    We examined how communicators can switch between speaker and listener role with such accurate timing. During conversations, the majority of role transitions happens with a gap or overlap of only a few hundred milliseconds. This suggests that listeners can predict when the turn of the current speaker is going to end. Our hypothesis is that listeners know when a turn ends because they know how it ends. Anticipating the last words of a turn can help the next speaker in predicting when the turn will end, and also in anticipating the content of the turn, so that an appropriate response can be prepared in advance. We used the stimuli material of an earlier experiment (De Ruiter, Mitterer & Enfield, 2006), in which subjects were listening to turns from natural conversations and had to press a button exactly when the turn they were listening to ended. In the present experiment, we investigated if the subjects can complete those turns when only an initial fragment of the turn is presented to them. We found that the subjects made better predictions about the last words of those turns that had more accurate responses in the earlier button press experiment.
  • Mai, F., Galke, L., & Scherp, A. (2019). CBOW is not all you need: Combining CBOW with the compositional matrix space model. In Proceedings of the Seventh International Conference on Learning Representations (ICLR 2019). OpenReview.net.

    Abstract

    Continuous Bag of Words (CBOW) is a powerful text embedding method. Due to its strong capabilities to encode word content, CBOW embeddings perform well on a wide range of downstream tasks while being efficient to compute. However, CBOW is not capable of capturing the word order. The reason is that the computation of CBOW's word embeddings is commutative, i.e., embeddings of XYZ and ZYX are the same. In order to address this shortcoming, we propose a
    learning algorithm for the Continuous Matrix Space Model, which we call Continual Multiplication of Words (CMOW). Our algorithm is an adaptation of word2vec, so that it can be trained on large quantities of unlabeled text. We empirically show that CMOW better captures linguistic properties, but it is inferior to CBOW in memorizing word content. Motivated by these findings, we propose a hybrid model that combines the strengths of CBOW and CMOW. Our results show that the hybrid CBOW-CMOW-model retains CBOW's strong ability to memorize word content while at the same time substantially improving its ability to encode other linguistic information by 8%. As a result, the hybrid also performs better on 8 out of 11 supervised downstream tasks with an average improvement of 1.2%.
  • Majid, A., & Bowerman, M. (Eds.). (2007). Cutting and breaking events: A crosslinguistic perspective [Special Issue]. Cognitive Linguistics, 18(2).

    Abstract

    This special issue of Cognitive Linguistics explores the linguistic encoding of events of cutting and breaking. In this article we first introduce the project on which it is based by motivating the selection of this conceptual domain, presenting the methods of data collection used by all the investigators, and characterizing the language sample. We then present a new approach to examining crosslinguistic similarities and differences in semantic categorization. Applying statistical modeling to the descriptions of cutting and breaking events elicited from speakers of all the languages, we show that although there is crosslinguistic variation in the number of distinctions made and in the placement of category boundaries, these differences take place within a strongly constrained semantic space: across languages, there is a surprising degree of consensus on the partitioning of events in this domain. In closing, we compare our statistical approach with more conventional semantic analyses, and show how an extensional semantic typological approach like the one illustrated here can help illuminate the intensional distinctions made by languages.
  • Malaisé, V., Gazendam, L., & Brugman, H. (2007). Disambiguating automatic semantic annotation based on a thesaurus structure. In Proceedings of TALN 2007.
  • Mamus, E., Rissman, L., Majid, A., & Ozyurek, A. (2019). Effects of blindfolding on verbal and gestural expression of path in auditory motion events. In A. K. Goel, C. M. Seifert, & C. C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2275-2281). Montreal, QB: Cognitive Science Society.

    Abstract

    Studies have claimed that blind people’s spatial representations are different from sighted people, and blind people display superior auditory processing. Due to the nature of auditory and haptic information, it has been proposed that blind people have spatial representations that are more sequential than sighted people. Even the temporary loss of sight—such as through blindfolding—can affect spatial representations, but not much research has been done on this topic. We compared blindfolded and sighted people’s linguistic spatial expressions and non-linguistic localization accuracy to test how blindfolding affects the representation of path in auditory motion events. We found that blindfolded people were as good as sighted people when localizing simple sounds, but they outperformed sighted people when localizing auditory motion events. Blindfolded people’s path related speech also included more sequential, and less holistic elements. Our results indicate that even temporary loss of sight influences spatial representations of auditory motion events
  • Marcoux, K., & Ernestus, M. (2019). Differences between native and non-native Lombard speech in terms of pitch range. In M. Ochmann, M. Vorländer, & J. Fels (Eds.), Proceedings of the ICA 2019 and EAA Euroregio. 23rd International Congress on Acoustics, integrating 4th EAA Euroregio 2019 (pp. 5713-5720). Berlin: Deutsche Gesellschaft für Akustik.

    Abstract

    Lombard speech, speech produced in noise, is acoustically different from speech produced in quiet (plain speech) in several ways, including having a higher and wider F0 range (pitch). Extensive research on native Lombard speech does not consider that non-natives experience a higher cognitive load while producing
    speech and that the native language may influence the non-native speech. We investigated pitch range in plain and Lombard speech in native and non-natives.
    Dutch and American-English speakers read contrastive question-answer pairs in quiet and in noise in English, while the Dutch also read Dutch sentence pairs. We found that Lombard speech is characterized by a wider pitch range than plain speech, for all speakers (native English, non-native English, and native Dutch).
    This shows that non-natives also widen their pitch range in Lombard speech. In sentences with early-focus, we see the same increase in pitch range when going from plain to Lombard speech in native and non-native English, but a smaller increase in native Dutch. In sentences with late-focus, we see the biggest increase for the native English, followed by non-native English and then native Dutch. Together these results indicate an effect of the native language on non-native Lombard speech.
  • Marcoux, K., & Ernestus, M. (2019). Pitch in native and non-native Lombard speech. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 2605-2609). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Lombard speech, speech produced in noise, is
    typically produced with a higher fundamental
    frequency (F0, pitch) compared to speech in quiet. This paper examined the potential differences in native and non-native Lombard speech by analyzing median pitch in sentences with early- or late-focus produced in quiet and noise. We found an increase in pitch in late-focus sentences in noise for Dutch speakers in both English and Dutch, and for American-English speakers in English. These results
    show that non-native speakers produce Lombard speech, despite their higher cognitive load. For the early-focus sentences, we found a difference between the Dutch and the American-English speakers. Whereas the Dutch showed an increased F0 in noise
    in English and Dutch, the American-English speakers did not in English. Together, these results suggest that some acoustic characteristics of Lombard speech, such as pitch, may be language-specific, potentially
    resulting in the native language influencing the non-native Lombard speech.
  • Maslowski, M. (2019). Fast speech can sound slow: Effects of contextual speech rate on word recognition. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • McCafferty, S. G., & Gullberg, M. (Eds.). (2008). Gesture and SLA: Toward an integrated approach [Special Issue]. Studies in Second Language Acquisition, 30(2).
  • McQueen, J. M., & Cutler, A. (1998). Spotting (different kinds of) words in (different kinds of) context. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2791-2794). Sydney: ICSLP.

    Abstract

    The results of a word-spotting experiment are presented in which Dutch listeners tried to spot different types of bisyllabic Dutch words embedded in different types of nonsense contexts. Embedded verbs were not reliably harder to spot than embedded nouns; this suggests that nouns and verbs are recognised via the same basic processes. Iambic words were no harder to spot than trochaic words, suggesting that trochaic words are not in principle easier to recognise than iambic words. Words were harder to spot in consonantal contexts (i.e., contexts which themselves could not be words) than in longer contexts which contained at least one vowel (i.e., contexts which, though not words, were possible words of Dutch). A control experiment showed that this difference was not due to acoustic differences between the words in each context. The results support the claim that spoken-word recognition is sensitive to the viability of sound sequences as possible words.
  • Merkx, D., Frank, S., & Ernestus, M. (2019). Language learning using speech to image retrieval. In Proceedings of Interspeech 2019 (pp. 1841-1845). doi:10.21437/Interspeech.2019-3067.

    Abstract

    Humans learn language by interaction with their environment and listening to other humans. It should also be possible for computational models to learn language directly from speech but so far most approaches require text. We improve on existing neural network approaches to create visually grounded embeddings for spoken utterances. Using a combination of a multi-layer GRU, importance sampling, cyclic learning rates, ensembling and vectorial self-attention our results show a remarkable increase in image-caption retrieval performance over previous work. Furthermore, we investigate which layers in the model learn to recognise words in the input. We find that deeper network layers are better at encoding word presence, although the final layer has slightly lower performance. This shows that our visually grounded sentence encoder learns to recognise words from the input even though it is not explicitly trained for word recognition.
  • Mitterer, H. (2007). Top-down effects on compensation for coarticulation are not replicable. In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1601-1604). Adelaide: Causal Productions.

    Abstract

    Listeners use lexical knowledge to judge what speech sounds they heard. I investigated whether such lexical influences are truly top-down or just reflect a merging of perceptual and lexical constraints. This is achieved by testing whether the lexically determined identity of a phone exerts the appropriate context effects on surrounding phones. The current investigations focuses on compensation for coarticulation in vowel-fricative sequences, where the presence of a rounded vowel (/y/ rather than /i/) leads fricatives to be perceived as /s/ rather than //. This results was consistently found in all three experiments. A vowel was also more likely to be perceived as rounded /y/ if that lead listeners to be perceive words rather than nonwords (Dutch: meny, English id. vs. meni nonword). This lexical influence on the perception of the vowel had, however, no consistent influence on the perception of following fricative.
  • Mitterer, H., & McQueen, J. M. (2007). Tracking perception of pronunciation variation by tracking looks to printed words: The case of word-final /t/. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1929-1932). Dudweiler: Pirrot.

    Abstract

    We investigated perception of words with reduced word-final /t/ using an adapted eyetracking paradigm. Dutch listeners followed spoken instructions to click on printed words which were accompanied on a computer screen by simple shapes (e.g., a circle). Targets were either above or next to their shapes, and the shapes uniquely identified the targets when the spoken forms were ambiguous between words with or without final /t/ (e.g., bult, bump, vs. bul, diploma). Analysis of listeners’ eye-movements revealed, in contrast to earlier results, that listeners use the following segmental context when compensating for /t/-reduction. Reflecting that /t/-reduction is more likely to occur before bilabials, listeners were more likely to look at the /t/-final words if the next word’s first segment was bilabial. This result supports models of speech perception in which prelexical phonological processes use segmental context to modulate word recognition.
  • Mitterer, H. (2007). Behavior reflects the (degree of) reality of phonological features in the brain as well. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 127-130). Dudweiler: Pirrot.

    Abstract

    To assess the reality of phonological features in language processing (vs. language description), one needs to specify the distinctive claims of distinctive-feature theory. Two of the more farreaching claims are compositionality and generalizability. I will argue that there is some evidence for the first and evidence against the second claim from a recent behavioral paradigm. Highlighting the contribution of a behavioral paradigm also counterpoints the use of brain measures as the only way to elucidate what is "real for the brain". The contributions of the speakers exemplify how brain measures can help us to understand the reality of phonological features in language processing. The evidence is, however, not convincing for a) the claim for underspecification of phonological features—which has to deal with counterevidence from behavioral as well as brain measures—, and b) the claim of position independence of phonological features.
  • Mitterer, H. (2008). How are words reduced in spontaneous speech? In A. Botonis (Ed.), Proceedings of ISCA Tutorial and Research Workshop On Experimental Linguistics (pp. 165-168). Athens: University of Athens.

    Abstract

    Words are reduced in spontaneous speech. If reductions are constrained by functional (i.e., perception and production) constraints, they should not be arbitrary. This hypothesis was tested by examing the pronunciations of high- to mid-frequency words in a Dutch and a German spontaneous speech corpus. In logistic-regression models the "reduction likelihood" of a phoneme was predicted by fixed-effect predictors such as position within the word, word length, word frequency, and stress, as well as random effects such as phoneme identity and word. The models for Dutch and German show many communalities. This is in line with the assumption that similar functional constraints influence reductions in both languages.
  • Moisik, S. R., Zhi Yun, D. P., & Dediu, D. (2019). Active adjustment of the cervical spine during pitch production compensates for shape: The ArtiVarK study. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 864-868). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    The anterior lordosis of the cervical spine is thought
    to contribute to pitch (fo) production by influencing
    cricoid rotation as a function of larynx height. This
    study examines the matter of inter-individual
    variation in cervical spine shape and whether this has
    an influence on how fo is produced along increasing
    or decreasing scales, using the ArtiVarK dataset,
    which contains real-time MRI pitch production data.
    We find that the cervical spine actively participates in
    fo production, but the amount of displacement
    depends on individual shape. In general, anterior
    spine motion (tending toward cervical lordosis)
    occurs for low fo, while posterior movement (tending
    towards cervical kyphosis) occurs for high fo.
  • Narasimhan, B., Eisenbeiss, S., & Brown, P. (Eds.). (2007). The linguistic encoding of multiple-participant events [Special Issue]. Linguistics, 45(3).

    Abstract

    This issue investigates the linguistic encoding of events with three or more participants from the perspectives of language typology and acquisition. Such “multiple-participant events” include (but are not limited to) any scenario involving at least three participants, typically encoded using transactional verbs like 'give' and 'show', placement verbs like 'put', and benefactive and applicative constructions like 'do (something for someone)', among others. There is considerable crosslinguistic and withinlanguage variation in how the participants (the Agent, Causer, Theme, Goal, Recipient, or Experiencer) and the subevents involved in multipleparticipant situations are encoded, both at the lexical and the constructional levels
  • Nijveld, A., Ten Bosch, L., & Ernestus, M. (2019). ERP signal analysis with temporal resolution using a time window bank. In Proceedings of Interspeech 2019 (pp. 1208-1212). doi:10.21437/Interspeech.2019-2729.

    Abstract

    In order to study the cognitive processes underlying speech comprehension, neuro-physiological measures (e.g., EEG and MEG), or behavioural measures (e.g., reaction times and response accuracy) can be applied. Compared to behavioural measures, EEG signals can provide a more fine-grained and complementary view of the processes that take place during the unfolding of an auditory stimulus.

    EEG signals are often analysed after having chosen specific time windows, which are usually based on the temporal structure of ERP components expected to be sensitive to the experimental manipulation. However, as the timing of ERP components may vary between experiments, trials, and participants, such a-priori defined analysis time windows may significantly hamper the exploratory power of the analysis of components of interest. In this paper, we explore a wide-window analysis method applied to EEG signals collected in an auditory repetition priming experiment.

    This approach is based on a bank of temporal filters arranged along the time axis in combination with linear mixed effects modelling. Crucially, it permits a temporal decomposition of effects in a single comprehensive statistical model which captures the entire EEG trace.
  • Nijveld, A. (2019). The role of exemplars in speech comprehension. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • Omar, R., Henley, S. M., Hailstone, J. C., Sauter, D., Scott, S. K., Fox, N. C., Rossor, M. N., & Warren, J. D. (2007). Recognition of emotions in faces, voices and music in frontotemporal lobar regeneration [Abstract]. Journal of Neurology, Neurosurgery & Psychiatry, 78(9), 1014.

    Abstract

    Frontotemporal lobar degeneration (FTLD) is a group of neurodegenerative conditions characterised by focal frontal and/or temporal lobe atrophy. Patients develop a range of cognitive and behavioural abnormalities, including prominent difficulties in comprehending and expressing emotions, with significant clinical and social consequences. Here we report a systematic prospective analysis of emotion processing in different input modalities in patients with FTLD. We examined recognition of happiness, sadness, fear and anger in facial expressions, non-verbal vocalisations and music in patients with FTLD and in healthy age matched controls. The FTLD group was significantly impaired in all modalities compared with controls, and this effect was most marked for music. Analysing each emotion separately, recognition of negative emotions was impaired in all three modalities in FTLD, and this effect was most marked for fear and anger. Recognition of happiness was deficient only with music. Our findings support the idea that FTLD causes impaired recognition of emotions across input channels, consistent with a common central representation of emotion concepts. Music may be a sensitive probe of emotional deficits in FTLD, perhaps because it requires a more abstract representation of emotion than do animate stimuli such as faces and voices.
  • Ozturk, O., & Papafragou, A. (2008). Acquisition of evidentiality and source monitoring. In H. Chan, H. Jacob, & E. Kapia (Eds.), Proceedings from the 32nd Annual Boston University Conference on Language Development [BUCLD 32] (pp. 368-377). Somerville, Mass.: Cascadilla Press.
  • Ozyurek, A. (1998). An analysis of the basic meaning of Turkish demonstratives in face-to-face conversational interaction. In S. Santi, I. Guaitella, C. Cave, & G. Konopczynski (Eds.), Oralite et gestualite: Communication multimodale, interaction: actes du colloque ORAGE 98 (pp. 609-614). Paris: L'Harmattan.
  • Papafragou, A., & Ozturk, O. (2007). Children's acquisition of modality. In Proceedings of the 2nd Conference on Generative Approaches to Language Acquisition North America (GALANA 2) (pp. 320-327). Somerville, Mass.: Cascadilla Press.
  • Papafragou, A. (2007). On the acquisition of modality. In T. Scheffler, & L. Mayol (Eds.), Penn Working Papers in Linguistics. Proceedings of the 30th Annual Penn Linguistics Colloquium (pp. 281-293). Department of Linguistics, University of Pennsylvania.
  • Parhammer*, S. I., Ebersberg*, M., Tippmann*, J., Stärk*, K., Opitz, A., Hinger, B., & Rossi, S. (2019). The influence of distraction on speech processing: How selective is selective attention? In Proceedings of Interspeech 2019 (pp. 3093-3097). doi:10.21437/Interspeech.2019-2699.

    Abstract

    -* indicates shared first authorship -
    The present study investigated the effects of selective attention on the processing of morphosyntactic errors in unattended parts of speech. Two groups of German native (L1) speakers participated in the present study. Participants listened to sentences in which irregular verbs were manipulated in three different conditions (correct, incorrect but attested ablaut pattern, incorrect and crosslinguistically unattested ablaut pattern). In order to track fast dynamic neural reactions to the stimuli, electroencephalography was used. After each sentence, participants in Experiment 1 performed a semantic judgement task, which deliberately distracted the participants from the syntactic manipulations and directed their attention to the semantic content of the sentence. In Experiment 2, participants carried out a syntactic judgement task, which put their attention on the critical stimuli. The use of two different attentional tasks allowed for investigating the impact of selective attention on speech processing and whether morphosyntactic processing steps are performed automatically. In Experiment 2, the incorrect attested condition elicited a larger N400 component compared to the correct condition, whereas in Experiment 1 no differences between conditions were found. These results suggest that the processing of morphosyntactic violations in irregular verbs is not entirely automatic but seems to be strongly affected by selective attention.
  • Perniss, P. M. (2007). Space and iconicity in German sign language (DGS). PhD Thesis, Radboud University Nijmegen, Nijmegen. doi:10.17617/2.57482.

    Abstract

    This dissertation investigates the expression of spatial relationships in German Sign Language (Deutsche Gebärdensprache, DGS). The analysis focuses on linguistic expression in the spatial domain in two types of discourse: static scene description (location) and event narratives (location and motion). Its primary theoretical objectives are to characterize the structure of locative descriptions in DGS; to explain the use of frames of reference and perspective in the expression of location and motion; to clarify the interrelationship between the systems of frames of reference, signing perspective, and classifier predicates; and to characterize the interplay between iconicity principles, on the one hand, and grammatical and discourse constraints, on the other hand, in the use of these spatial devices. In more general terms, the dissertation provides a usage-based account of iconic mapping in the visual-spatial modality. The use of space in sign language expression is widely assumed to be guided by iconic principles, which are furthermore assumed to hold in the same way across sign languages. Thus, there has been little expectation of variation between sign languages in the spatial domain in the use of spatial devices. Consequently, perhaps, there has been little systematic investigation of linguistic expression in the spatial domain in individual sign languages, and less investigation of spatial language in extended signed discourse. This dissertation provides an investigation of spatial expressions in DGS by investigating the impact of different constraints on iconicity in sign language structure. The analyses have important implications for our understanding of the role of iconicity in the visual-spatial modality, the possible language-specific variation within the spatial domain in the visual-spatial modality, the structure of spatial language in both natural language modalities, and the relationship between spatial language and cognition

    Additional information

    full text via Radboud Repository
  • Petersson, K. M. (2008). On cognition, structured sequence processing, and adaptive dynamical systems. American Institute of Physics Conference Proceedings, 1060(1), 195-200.

    Abstract

    Cognitive neuroscience approaches the brain as a cognitive system: a system that functionally is conceptualized in terms of information processing. We outline some aspects of this concept and consider a physical system to be an information processing device when a subclass of its physical states can be viewed as representational/cognitive and transitions between these can be conceptualized as a process operating on these states by implementing operations on the corresponding representational structures. We identify a generic and fundamental problem in cognition: sequentially organized structured processing. Structured sequence processing provides the brain, in an essential sense, with its processing logic. In an approach addressing this problem, we illustrate how to integrate levels of analysis within a framework of adaptive dynamical systems. We note that the dynamical system framework lends itself to a description of asynchronous event-driven devices, which is likely to be important in cognition because the brain appears to be an asynchronous processing system. We use the human language faculty and natural language processing as a concrete example through out.
  • Pluymaekers, M. (2007). Affix reduction in spoken Dutch: Probabilistic effects in production and perception. PhD Thesis, Radboud University Nijmegen, Nijmegen. doi:10.17617/2.58146.

    Abstract

    This dissertation investigates the roles of several probabilistic variables in the production and comprehension of reduced Dutch affixes. The central hypothesis is that linguistic units with a high probability of occurrence are
    more likely to be reduced (Jurafsky et al., 2001; Aylett & Turk, 2004). This hypothesis is tested by analyzing the acoustic realizations of affixes, which are meaning-carrying elements embedded in larger lexical units. Most of the results prove to be compatible with the main hypothesis, but some appear to run counter to its predictions. The final chapter of the thesis discusses the implications of these findings for models of speech production, models of speech perception, and probability-based accounts of reduction.

    Additional information

    full text via Radboud Repository
  • Poort, E. D. (2019). The representation of cognates and interlingual homographs in the bilingual lexicon. PhD Thesis, University College London, London, UK.

    Abstract

    Cognates and interlingual homographs are words that exist in multiple languages. Cognates, like “wolf” in Dutch and English, also carry the same meaning. Interlingual homographs do not: the word “angel” in English refers to a spiritual being, but in Dutch to the sting of a bee. The six experiments included in this thesis examined how these words are represented in the bilingual mental lexicon. Experiment 1 and 2 investigated the issue of task effects on the processing of cognates. Bilinguals often process cognates more quickly than single-language control words (like “carrot”, which exists in English but not Dutch). These experiments showed that the size of this cognate facilitation effect depends on the other types of stimuli included in the task. These task effects were most likely due to response competition, indicating that cognates are subject to processes of facilitation and inhibition both within the lexicon and at the level of decision making. Experiment 3 and 4 examined whether seeing a cognate or interlingual homograph in one’s native language affects subsequent processing in one’s second language. This method was used to determine whether non-identical cognates share a form representation. These experiments were inconclusive: they revealed no effect of cross-lingual long-term priming. Most likely this was because a lexical decision task was used to probe an effect that is largely semantic in nature. Given these caveats to using lexical decision tasks, two final experiments used a semantic relatedness task instead. Both experiments revealed evidence for an interlingual homograph inhibition effect but no cognate facilitation effect. Furthermore, the second experiment found evidence for a small effect of cross-lingual long-term priming. After comparing these findings to the monolingual literature on semantic ambiguity resolution, this thesis concludes that it is necessary to explore the viability of a distributed connectionist account of the bilingual mental lexicon.

    Additional information

    full text via UCL
  • Pouw, W., Paxton, A., Harrison, S. J., & Dixon, J. A. (2019). Acoustic specification of upper limb movement in voicing. In A. Grimminger (Ed.), Proceedings of the 6th Gesture and Speech in Interaction – GESPIN 6 (pp. 68-74). Paderborn: Universitaetsbibliothek Paderborn. doi:10.17619/UNIPB/1-812.
  • Pouw, W., & Dixon, J. A. (2019). Quantifying gesture-speech synchrony. In A. Grimminger (Ed.), Proceedings of the 6th Gesture and Speech in Interaction – GESPIN 6 (pp. 75-80). Paderborn: Universitaetsbibliothek Paderborn. doi:10.17619/UNIPB/1-812.

    Abstract

    Spontaneously occurring speech is often seamlessly accompanied by hand gestures. Detailed
    observations of video data suggest that speech and gesture are tightly synchronized in time,
    consistent with a dynamic interplay between body and mind. However, spontaneous gesturespeech
    synchrony has rarely been objectively quantified beyond analyses of video data, which
    do not allow for identification of kinematic properties of gestures. Consequently, the point in
    gesture which is held to couple with speech, the so-called moment of “maximum effort”, has
    been variably equated with the peak velocity, peak acceleration, peak deceleration, or the onset
    of the gesture. In the current exploratory report, we provide novel evidence from motiontracking
    and acoustic data that peak velocity is closely aligned, and shortly leads, the peak pitch
    (F0) of speech

    Additional information

    https://osf.io/9843h/
  • Rapold, C. J. (2007). From demonstratives to verb agreement in Benchnon: A diachronic perspective. In A. Amha, M. Mous, & G. Savà (Eds.), Omotic and Cushitic studies: Papers from the Fourth Cushitic Omotic Conference, Leiden, 10-12 April 2003 (pp. 69-88). Cologne: Rüdiger Köppe.
  • Reinisch, E., Jesse, A., & McQueen, J. M. (2008). The strength of stress-related lexical competition depends on the presence of first-syllable stress. In Proceedings of Interspeech 2008 (pp. 1954-1954).

    Abstract

    Dutch listeners' looks to printed words were tracked while they listened to instructions to click with their mouse on one of them. When presented with targets from word pairs where the first two syllables were segmentally identical but differed in stress location, listeners used stress information to recognize the target before segmental information disambiguated the words. Furthermore, the amount of lexical competition was influenced by the presence or absence of word-initial stress.
  • Reinisch, E., Jesse, A., & McQueen, J. M. (2008). Lexical stress information modulates the time-course of spoken-word recognition. In Proceedings of Acoustics' 08 (pp. 3183-3188).

    Abstract

    Segmental as well as suprasegmental information is used by Dutch listeners to recognize words. The time-course of the effect of suprasegmental stress information on spoken-word recognition was investigated in a previous study, in which we tracked Dutch listeners' looks to arrays of four printed words as they listened to spoken sentences. Each target was displayed along with a competitor that did not differ segmentally in its first two syllables but differed in stress placement (e.g., 'CENtimeter' and 'sentiMENT'). The listeners' eye-movements showed that stress information is used to recognize the target before distinct segmental information is available. Here, we examine the role of durational information in this effect. Two experiments showed that initial-syllable duration, as a cue to lexical stress, is not interpreted dependent on the speaking rate of the preceding carrier sentence. This still held when other stress cues like pitch and amplitude were removed. Rather, the speaking rate of the preceding carrier affected the speed of word recognition globally, even though the rate of the target itself was not altered. Stress information modulated lexical competition, but did so independently of the rate of the preceding carrier, even if duration was the only stress cue present.
  • Ringersma, J., & Kemps-Snijders, M. (2007). Creating multimedia dictionaries of endangered languages using LEXUS. In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 65-68). Baixas, France: ISCA-Int.Speech Communication Assoc.

    Abstract

    This paper reports on the development of a flexible web based lexicon tool, LEXUS. LEXUS is targeted at linguists involved in language documentation (of endangered languages). It allows the creation of lexica within the structure of the proposed ISO LMF standard and uses the proposed concept naming conventions from the ISO data categories, thus enabling interoperability, search and merging. LEXUS also offers the possibility to visualize language, since it provides functionalities to include audio, video and still images to the lexicon. With LEXUS it is possible to create semantic network knowledge bases, using typed relations. The LEXUS tool is free for use. Index Terms: lexicon, web based application, endangered languages, language documentation.
  • Rissman, L., & Majid, A. (2019). Agency drives category structure in instrumental events. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2661-2667). Montreal, QB: Cognitive Science Society.

    Abstract

    Thematic roles such as Agent and Instrument have a long-standing place in theories of event representation. Nonetheless, the structure of these categories has been difficult to determine. We investigated how instrumental events, such as someone slicing bread with a knife, are categorized in English. Speakers described a variety of typical and atypical instrumental events, and we determined the similarity structure of their descriptions using correspondence analysis. We found that events where the instrument is an extension of an intentional agent were most likely to elicit similar language, highlighting the importance of agency in structuring instrumental categories.
  • Robotham, L., Trinkler, I., & Sauter, D. (2008). The power of positives: Evidence for an overall emotional recognition deficit in Huntington's disease [Abstract]. Journal of Neurology, Neurosurgery & Psychiatry, 79, A12.

    Abstract

    The recognition of emotions of disgust, anger and fear have been shown to be significantly impaired in Huntington’s disease (eg,Sprengelmeyer et al, 1997, 2006; Gray et al, 1997; Milders et al, 2003,Montagne et al, 2006; Johnson et al, 2007; De Gelder et al, 2008). The relative impairment of these emotions might have implied a recognition impairment specific to negative emotions. Could the asymmetric recognition deficits be due not to the complexity of the emotion but rather reflect the complexity of the task? In the current study, 15 Huntington’s patients and 16 control subjects were presented with negative and positive non-speech emotional vocalisations that were to be identified as anger, fear, sadness, disgust, achievement, pleasure and amusement in a forced-choice paradigm. This experiment more accurately matched the negative emotions with positive emotions in a homogeneous modality. The resulting dually impaired ability of Huntington’s patients to identify negative and positive non-speech emotional vocalisations correctly provides evidence for an overall emotional recognition deficit in the disease. These results indicate that previous findings of a specificity in emotional recognition deficits might instead be due to the limitations of the visual modality. Previous experiments may have found an effect of emotional specificy due to the presence of a single positive emotion, happiness, in the midst of multiple negative emotions. In contrast with the previous literature, the study presented here points to a global deficit in the recognition of emotional sounds.
  • Rojas-Berscia, L. M. (2019). From Kawapanan to Shawi: Topics in language variation and change. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • De Ruiter, J. P. (2007). Some multimodal signals in humans. In I. Van de Sluis, M. Theune, E. Reiter, & E. Krahmer (Eds.), Proceedings of the Workshop on Multimodal Output Generation (MOG 2007) (pp. 141-148).

    Abstract

    In this paper, I will give an overview of some well-studied multimodal signals that humans produce while they communicate with other humans, and discuss the implications of those studies for HCI. I will first discuss a conceptual framework that allows us to distinguish between functional and sensory modalities. This distinction is important, as there are multiple functional modalities using the same sensory modality (e.g., facial expression and eye-gaze in the visual modality). A second theoretically important issue is redundancy. Some signals appear to be redundant with a signal in another modality, whereas others give new information or even appear to give conflicting information (see e.g., the work of Susan Goldin-Meadows on speech accompanying gestures). I will argue that multimodal signals are never truly redundant. First, many gestures that appear at first sight to express the same meaning as the accompanying speech generally provide extra (analog) information about manner, path, etc. Second, the simple fact that the same information is expressed in more than one modality is itself a communicative signal. Armed with this conceptual background, I will then proceed to give an overview of some multimodalsignals that have been investigated in human-human research, and the level of understanding we have of the meaning of those signals. The latter issue is especially important for potential implementations of these signals in artificial agents. First, I will discuss pointing gestures. I will address the issue of the timing of pointing gestures relative to the speech it is supposed to support, the mutual dependency between pointing gestures and speech, and discuss the existence of alternative ways of pointing from other cultures. The most frequent form of pointing that does not involve the index finger is a cultural practice called lip-pointing which employs two visual functional modalities, mouth-shape and eye-gaze, simultaneously for pointing. Next, I will address the issue of eye-gaze. A classical study by Kendon (1967) claims that there is a systematic relationship between eye-gaze (at the interlocutor) and turn-taking states. Research at our institute has shown that this relationship is weaker than has often been assumed. If the dialogue setting contains a visible object that is relevant to the dialogue (e.g., a map), the rate of eye-gaze-at-other drops dramatically and its relationship to turn taking disappears completely. The implications for machine generated eye-gaze are discussed. Finally, I will explore a theoretical debate regarding spontaneous gestures. It has often been claimed that the class of gestures that is called iconic by McNeill (1992) are a “window into the mind”. That is, they are claimed to give the researcher (or even the interlocutor) a direct view into the speaker’s thought, without being obscured by the complex transformation that take place when transforming a thought into a verbal utterance. I will argue that this is an illusion. Gestures can be shown to be specifically designed such that the listener can be expected to interpret them. Although the transformations carried out to express a thought in gesture are indeed (partly) different from the corresponding transformations for speech, they are a) complex, and b) severely understudied. This obviously has consequences both for the gesture research agenda, and for the generation of iconic gestures by machines.
  • De Ruiter, J. P. (1998). Gesture and speech production. PhD Thesis, Radboud University Nijmegen, Nijmegen. doi:10.17617/2.2057686.
  • De Ruiter, L. E. (2008). How useful are polynomials for analyzing intonation? In Proceedings of Interspeech 2008 (pp. 785-789).

    Abstract

    This paper presents the first application of polynomial modeling as a means for validating phonological pitch accent labels to German data. It is compared to traditional phonetic analysis (measuring minima, maxima, alignment). The traditional method fares better in classification, but results are comparable in statistical accent pair testing. Robustness tests show that pitch correction is necessary in both cases. The approaches are discussed in terms of their practicability, applicability to other domains of research and interpretability of their results.
  • De Ruiter, J. P., & Enfield, N. J. (2007). The BIC model: A blueprint for the communicator. In C. Stephanidis (Ed.), Universal access in Human-Computer Interaction: Applications and services (pp. 251-258). Berlin: Springer.
  • Sauter, D., Eisner, F., Rosen, S., & Scott, S. K. (2008). The role of source and filter cues in emotion recognition in speech [Abstract]. Journal of the Acoustical Society of America, 123, 3739-3740.

    Abstract

    In the context of the source-filter theory of speech, it is well established that intelligibility is heavily reliant on information carried by the filter, that is, spectral cues (e.g., Faulkner et al., 2001; Shannon et al., 1995). However, the extraction of other types of information in the speech signal, such as emotion and identity, is less well understood. In this study we investigated the extent to which emotion recognition in speech depends on filterdependent cues, using a forced-choice emotion identification task at ten levels of noise-vocoding ranging between one and 32 channels. In addition, participants performed a speech intelligibility task with the same stimuli. Our results indicate that compared to speech intelligibility, emotion recognition relies less on spectral information and more on cues typically signaled by source variations, such as voice pitch, voice quality, and intensity. We suggest that, while the reliance on spectral dynamics is likely a unique aspect of human speech, greater phylogenetic continuity across species may be found in the communication of affect in vocalizations.
  • Sauter, D. (2008). The time-course of emotional voice processing [Abstract]. Neurocase, 14, 455-455.

    Abstract

    Research using event-related brain potentials (ERPs) has demonstrated an early differential effect in fronto-central regions when processing emotional, as compared to affectively neutral facial stimuli (e.g., Eimer & Holmes, 2002). In this talk, data demonstrating a similar effect in the auditory domain will be presented. ERPs were recorded in a one-back task where participants had to identify immediate repetitions of emotion category, such as a fearful sound followed by another fearful sound. The stimulus set consisted of non-verbal emotional vocalisations communicating positive and negative sounds, as well as neutral baseline conditions. Similarly to the facial domain, fear sounds as compared to acoustically controlled neutral sounds, elicited a frontally distributed positivity with an onset latency of about 150 ms after stimulus onset. These data suggest the existence of a rapid multi-modal frontocentral mechanism discriminating emotional from non-emotional human signals.
  • Scharenborg, O., Ernestus, M., & Wan, V. (2007). Segmentation of speech: Child's play? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1953-1956). Adelaide: Causal Productions.

    Abstract

    The difficulty of the task of segmenting a speech signal into its words is immediately clear when listening to a foreign language; it is much harder to segment the signal into its words, since the words of the language are unknown. Infants are faced with the same task when learning their first language. This study provides a better understanding of the task that infants face while learning their native language. We employed an automatic algorithm on the task of speech segmentation without prior knowledge of the labels of the phonemes. An analysis of the boundaries erroneously placed inside a phoneme showed that the algorithm consistently placed additional boundaries in phonemes in which acoustic changes occur. These acoustic changes may be as great as the transition from the closure to the burst of a plosive or as subtle as the formant transitions in low or back vowels. Moreover, we found that glottal vibration may attenuate the relevance of acoustic changes within obstruents. An interesting question for further research is how infants learn to overcome the natural tendency to segment these ‘dynamic’ phonemes.
  • Scharenborg, O., & Wan, V. (2007). Can unquantised articulatory feature continuums be modelled? In INTERSPEECH 2007 - 8th Annual Conference of the International Speech Communication Association (pp. 2473-2476). ISCA Archive.

    Abstract

    Articulatory feature (AF) modelling of speech has received a considerable amount of attention in automatic speech recognition research. Although termed ‘articulatory’, previous definitions make certain assumptions that are invalid, for instance, that articulators ‘hop’ from one fixed position to the next. In this paper, we studied two methods, based on support vector classification (SVC) and regression (SVR), in which the articulation continuum is modelled without being restricted to using discrete AF value classes. A comparison with a baseline system trained on quantised values of the articulation continuum showed that both SVC and SVR outperform the baseline for two of the three investigated AFs, with improvements up to 5.6% absolute.
  • Scharenborg, O., & Cooke, M. P. (2008). Comparing human and machine recognition performance on a VCV corpus. In ISCA Tutorial and Research Workshop (ITRW) on "Speech Analysis and Processing for Knowledge Discovery".

    Abstract

    Listeners outperform ASR systems in every speech recognition task. However, what is not clear is where this human advantage originates. This paper investigates the role of acoustic feature representations. We test four (MFCCs, PLPs, Mel Filterbanks, Rate Maps) acoustic representations, with and without ‘pitch’ information, using the same backend. The results are compared with listener results at the level of articulatory feature classification. While no acoustic feature representation reached the levels of human performance, both MFCCs and Rate maps achieved good scores, with Rate maps nearing human performance on the classification of voicing. Comparing the results on the most difficult articulatory features to classify showed similarities between the humans and the SVMs: e.g., ‘dental’ was by far the least well identified by both groups. Overall, adding pitch information seemed to hamper classification performance.
  • Scharenborg, O. (2008). Modelling fine-phonetic detail in a computational model of word recognition. In INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association (pp. 1473-1476). ISCA Archive.

    Abstract

    There is now considerable evidence that fine-grained acoustic-phonetic detail in the speech signal helps listeners to segment a speech signal into syllables and words. In this paper, we compare two computational models of word recognition on their ability to capture and use this finephonetic detail during speech recognition. One model, SpeM, is phoneme-based, whereas the other, newly developed Fine- Tracker, is based on articulatory features. Simulations dealt with modelling the ability of listeners to distinguish short words (e.g., ‘ham’) from the longer words in which they are embedded (e.g., ‘hamster’). The simulations with Fine- Tracker showed that it was, like human listeners, able to distinguish between short words from the longer words in which they are embedded. This suggests that it is possible to extract this fine-phonetic detail from the speech signal and use it during word recognition.
  • Scheu, O., & Zinn, C. (2007). How did the e-learning session go? The student inspector. In Proceedings of the 13th International Conference on Artificial Intelligence and Education (AIED 2007). Amsterdam: IOS Press.

    Abstract

    Good teachers know their students, and exploit this knowledge to adapt or optimise their instruction. Traditional teachers know their students because they interact with them face-to-face in classroom or one-to-one tutoring sessions. In these settings, they can build student models, i.e., by exploiting the multi-faceted nature of human-human communication. In distance-learning contexts, teacher and student have to cope with the lack of such direct interaction, and this must have detrimental effects for both teacher and student. In a past study we have analysed teacher requirements for tracking student actions in computer-mediated settings. Given the results of this study, we have devised and implemented a tool that allows teachers to keep track of their learners'interaction in e-learning systems. We present the tool's functionality and user interfaces, and an evaluation of its usability.
  • Schmidt, T., Duncan, S., Ehmer, O., Hoyt, J., Kipp, M., Loehr, D., Magnusson, M., Rose, T., & Sloetjes, H. (2008). An exchange format for multimodal annotations. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    This paper presents the results of a joint effort of a group of multimodality researchers and tool developers to improve the interoperability between several tools used for the annotation of multimodality. We propose a multimodal annotation exchange format, based on the annotation graph formalism, which is supported by import and export routines in the respective tools
  • Schoenmakers, G.-J., & De Swart, P. (2019). Adverbial hurdles in Dutch scrambling. In A. Gattnar, R. Hörnig, M. Störzer, & S. Featherston (Eds.), Proceedings of Linguistic Evidence 2018: Experimental Data Drives Linguistic Theory (pp. 124-145). Tübingen: University of Tübingen.

    Abstract

    This paper addresses the role of the adverb in Dutch direct object scrambling constructions. We report four experiments in which we investigate whether the structural position and the scope sensitivity of the adverb affect acceptability judgments of scrambling constructions and native speakers' tendency to scramble definite objects. We conclude that the type of adverb plays a key role in Dutch word ordering preferences.
  • Schuerman, W. L., McQueen, J. M., & Meyer, A. S. (2019). Speaker statistical averageness modulates word recognition in adverse listening conditions. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1203-1207). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    We tested whether statistical averageness (SA) at the level of the individual speaker could predict a speaker’s intelligibility. 28 female and 21 male speakers of Dutch were recorded producing 336 sentences,
    each containing two target nouns. Recordings were compared to those of all other same-sex speakers using dynamic time warping (DTW). For each sentence, the DTW distance constituted a metric
    of phonetic distance from one speaker to all other speakers. SA comprised the average of these distances. Later, the same participants performed a word recognition task on the target nouns in the same sentences, under three degraded listening conditions. In all three conditions, accuracy increased with SA. This held even when participants listened to their own utterances. These findings suggest that listeners process speech with respect to the statistical
    properties of the language spoken in their community, rather than using their own speech as a reference
  • Schulte im Walde, S., Melinger, A., Roth, M., & Weber, A. (2007). An empirical characterization of response types in German association norms. In Proceedings of the GLDV workshop on lexical-semantic and ontological resources.
  • Schuppler, B., Ernestus, M., Scharenborg, O., & Boves, L. (2008). Preparing a corpus of Dutch spontaneous dialogues for automatic phonetic analysis. In INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association (pp. 1638-1641). ISCA Archive.

    Abstract

    This paper presents the steps needed to make a corpus of Dutch spontaneous dialogues accessible for automatic phonetic research aimed at increasing our understanding of reduction phenomena and the role of fine phonetic detail. Since the corpus was not created with automatic processing in mind, it needed to be reshaped. The first part of this paper describes the actions needed for this reshaping in some detail. The second part reports the results of a preliminary analysis of the reduction phenomena in the corpus. For this purpose a phonemic transcription of the corpus was created by means of a forced alignment, first with a lexicon of canonical pronunciations and then with multiple pronunciation variants per word. In this study pronunciation variants were generated by applying a large set of phonetic processes that have been implicated in reduction to the canonical pronunciations of the words. This relatively straightforward procedure allows us to produce plausible pronunciation variants and to verify and extend the results of previous reduction studies reported in the literature.
  • Scott, D. R., & Cutler, A. (1982). Segmental cues to syntactic structure. In Proceedings of the Institute of Acoustics 'Spectral Analysis and its Use in Underwater Acoustics' (pp. E3.1-E3.4). London: Institute of Acoustics.
  • Seidlmayer, E., Galke, L., Melnychuk, T., Schultz, C., Tochtermann, K., & Förstner, K. U. (2019). Take it personally - A Python library for data enrichment for infometrical applications. In M. Alam, R. Usbeck, T. Pellegrini, H. Sack, & Y. Sure-Vetter (Eds.), Proceedings of the Posters and Demo Track of the 15th International Conference on Semantic Systems co-located with 15th International Conference on Semantic Systems (SEMANTiCS 2019).

    Abstract

    Like every other social sphere, science is influenced by individual characteristics of researchers. However, for investigations on scientific networks, only little data about the social background of researchers, e.g. social origin, gender, affiliation etc., is available.
    This paper introduces ”Take it personally - TIP”, a conceptual model and library currently under development, which aims to support the
    semantic enrichment of publication databases with semantically related background information which resides elsewhere in the (semantic) web, such as Wikidata.
    The supplementary information enriches the original information in the publication databases and thus facilitates the creation of complex scientific knowledge graphs. Such enrichment helps to improve the scientometric analysis of scientific publications as they can also take social backgrounds of researchers into account and to understand social structure in research communities.
  • Seijdel, N., Sakmakidis, N., De Haan, E. H. F., Bohte, S. M., & Scholte, H. S. (2019). Implicit scene segmentation in deeper convolutional neural networks. In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience (pp. 1059-1062). doi:10.32470/CCN.2019.1149-0.

    Abstract

    Feedforward deep convolutional neural networks (DCNNs) are matching and even surpassing human performance on object recognition. This performance suggests that activation of a loose collection of image
    features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Recent findings in humans however, suggest that while feedforward activity may suffice for
    sparse scenes with isolated objects, additional visual operations ('routines') that aid the recognition process (e.g. segmentation or grouping) are needed for more complex scenes. Linking human visual processing to
    performance of DCNNs with increasing depth, we here explored if, how, and when object information is differentiated from the backgrounds they appear on. To this end, we controlled the information in both objects
    and backgrounds, as well as the relationship between them by adding noise, manipulating background congruence and systematically occluding parts of the image. Results indicated less distinction between object- and background features for more shallow networks. For those networks, we observed a benefit of training on segmented objects (as compared to unsegmented objects). Overall, deeper networks trained on natural
    (unsegmented) scenes seem to perform implicit 'segmentation' of the objects from their background, possibly by improved selection of relevant features.
  • Senft, G. (1991). Bakavilisi Biga - we can 'turn' the language - or: What happens to English words in Kilivila language? In W. Bahner, J. Schildt, & D. Viehwegger (Eds.), Proceedings of the XIVth International Congress of Linguists (pp. 1743-1746). Berlin: Akademie Verlag.
  • Senft, G. (2007). Language, culture and cognition: Frames of spatial reference and why we need ontologies of space [Abstract]. In A. G. Cohn, C. Freksa, & B. Bebel (Eds.), Spatial cognition: Specialization and integration (pp. 12).

    Abstract

    One of the many results of the "Space" research project conducted at the MPI for Psycholinguistics is that there are three "Frames of spatial Reference" (FoRs), the relative, the intrinsic and the absolute FoR. Cross-linguistic research showed that speakers who prefer one FoR in verbal spatial references rely on a comparable coding system for memorizing spatial configurations and for making inferences with respect to these spatial configurations in non-verbal problem solving. Moreover, research results also revealed that in some languages these verbal FoRs also influence gestural behavior. These results document the close interrelationship between language, culture and cognition in the domain "Space". The proper description of these interrelationships in the spatial domain requires language and culture specific ontologies.
  • Seuren, P. A. M. (1991). Notes on noun phrases and quantification. In Proceedings of the International Conference on Current Issues in Computational Linguistics (pp. 19-44). Penang, Malaysia: Universiti Sains Malaysia.
  • Seuren, P. A. M. (1982). Riorientamenti metodologici nello studio della variabilità linguistica. In D. Gambarara, & A. D'Atri (Eds.), Ideologia, filosofia e linguistica: Atti del Convegno Internazionale di Studi, Rende (CS) 15-17 Settembre 1978 ( (pp. 499-515). Roma: Bulzoni.
  • Seuren, P. A. M. (1991). What makes a text untranslatable? In H. M. N. Noor Ein, & H. S. Atiah (Eds.), Pragmatik Penterjemahan: Prinsip, Amalan dan Penilaian Menuju ke Abad 21 ("The Pragmatics of Translation: Principles, Practice and Evaluation Moving towards the 21st Century") (pp. 19-27). Kuala Lumpur: Dewan Bahasa dan Pustaka.
  • Shen, C., & Janse, E. (2019). Articulatory control in speech production. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 2533-2537). Canberra, Australia: Australasian Speech Science and Technology Association Inc.
  • Shen, C., Cooke, M., & Janse, E. (2019). Individual articulatory control in speech enrichment. In M. Ochmann, M. Vorländer, & J. Fels (Eds.), Proceedings of the 23rd International Congress on Acoustics (pp. 5726-5730). Berlin: Deutsche Gesellschaft für Akustik.

    Abstract

    ndividual talkers may use various strategies to enrich their speech while speaking in noise (i.e., Lombard speech) to improve their intelligibility. The resulting acoustic-phonetic changes in Lombard speech vary amongst different speakers, but it is unclear what causes these talker differences, and what impact these differences have on intelligibility. This study investigates the potential role of articulatory control in talkers’ Lombard speech enrichment success. Seventy-eight speakers read out sentences in both their habitual style and in a condition where they were instructed to speak clearly while hearing loud speech-shaped noise. A diadochokinetic (DDK) speech task that requires speakers to repetitively produce word or non-word sequences as accurately and as rapidly as possible, was used to quantify their articulatory control. Individuals’ predicted intelligibility in both speaking styles (presented at -5 dB SNR) was measured using an acoustic glimpse-based metric: the High-Energy Glimpse Proportion (HEGP). Speakers’ HEGP scores show a clear effect of speaking condition (better HEGP scores in the Lombard than habitual condition), but no simple effect of articulatory control on HEGP, nor an interaction between speaking condition and articulatory control. This indicates that individuals’ speech enrichment success as measured by the HEGP metric was not predicted by DDK performance.
  • Sloetjes, H., & Wittenburg, P. (2008). Annotation by category - ELAN and ISO DCR. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    The Data Category Registry is one of the ISO initiatives towards the establishment of standards for Language Resource management, creation and coding. Successful application of the DCR depends on the availability of tools that can interact with it. This paper describes the first steps that have been taken to provide users of the multimedia annotation tool ELAN, with the means to create references from tiers and annotations to data categories defined in the ISO Data Category Registry. It first gives a brief description of the capabilities of ELAN and the structure of the documents it creates. After a concise overview of the goals and current state of the ISO DCR infrastructure, a description is given of how the preliminary connectivity with the DCR is implemented in ELAN
  • Sollis, E. (2019). A network of interacting proteins disrupted in language-related disorders. PhD Thesis, Radboud University Nijmegen, Nijmegen.
  • De Sousa, H. (2008). The development of echo-subject markers in Southern Vanuatu. In T. J. Curnow (Ed.), Selected papers from the 2007 Conference of the Australian Linguistic Society. Australian Linguistic Society.

    Abstract

    One of the defining features of the Southern Vanuatu language family is the echo-subject (ES) marker (Lynch 2001: 177-178). Canonically, an ES marker indicates that the subject of the clause is coreferential with the subject of the preceding clause. This paper begins with a survey of the various ES systems found in Southern Vanuatu. Two prominent differences amongst the ES systems are: a) the level of obligatoriness of the ES marker; and b) the level of grammatical integration between an ES clauses and the preceding clause. The variation found amongst the ES systems reveals a clear path of grammaticalisation from the VP coordinator *ma in Proto–Southern Vanuatu to the various types of ES marker in contemporary Southern Vanuatu languages
  • Stehouwer, H., & Van den Bosch, A. (2008). Putting the t where it belongs: Solving a confusion problem in Dutch. In S. Verberne, H. Van Halteren, & P.-A. Coppen (Eds.), Computational Linguistics in the Netherlands 2007: Selected Papers from the 18th CLIN Meeting (pp. 21-36). Utrecht: LOT.

    Abstract

    A common Dutch writing error is to confuse a word ending in -d with a neighbor word ending in -dt. In this paper we describe the development of a machine-learning-based disambiguator that can determine which word ending is appropriate, on the basis of its local context. We develop alternative disambiguators, varying between a single monolithic classifier and having multiple confusable experts disambiguate between confusable pairs. Disambiguation accuracy of the best developed disambiguators exceeds 99%; when we apply these disambiguators to an external test set of collected errors, our detection strategy correctly identifies up to 79% of the errors.
  • Stevens, M. E. (2007). Perceptual adaptation to phonological differences between language varieties. PhD Thesis, University of Ghent, Ghent.

Share this page