Publications

Displaying 201 - 261 of 261
  • Scharenborg, O., Ernestus, M., & Wan, V. (2007). Segmentation of speech: Child's play? In H. van Hamme, & R. van Son (Eds.), Proceedings of Interspeech 2007 (pp. 1953-1956). Adelaide: Causal Productions.

    Abstract

    The difficulty of the task of segmenting a speech signal into its words is immediately clear when listening to a foreign language; it is much harder to segment the signal into its words, since the words of the language are unknown. Infants are faced with the same task when learning their first language. This study provides a better understanding of the task that infants face while learning their native language. We employed an automatic algorithm on the task of speech segmentation without prior knowledge of the labels of the phonemes. An analysis of the boundaries erroneously placed inside a phoneme showed that the algorithm consistently placed additional boundaries in phonemes in which acoustic changes occur. These acoustic changes may be as great as the transition from the closure to the burst of a plosive or as subtle as the formant transitions in low or back vowels. Moreover, we found that glottal vibration may attenuate the relevance of acoustic changes within obstruents. An interesting question for further research is how infants learn to overcome the natural tendency to segment these ‘dynamic’ phonemes.
  • Scharenborg, O., & Merkx, D. (2018). The role of articulatory feature representation quality in a computational model of human spoken-word recognition. In Proceedings of the Machine Learning in Speech and Language Processing Workshop (MLSLP 2018).

    Abstract

    Fine-Tracker is a speech-based model of human speech
    recognition. While previous work has shown that Fine-Tracker
    is successful at modelling aspects of human spoken-word
    recognition, its speech recognition performance is not
    comparable to that of human performance, possibly due to
    suboptimal intermediate articulatory feature (AF)
    representations. This study investigates the effect of improved
    AF representations, obtained using a state-of-the-art deep
    convolutional network, on Fine-Tracker’s simulation and
    recognition performance: Although the improved AF quality
    resulted in improved speech recognition; it, surprisingly, did
    not lead to an improvement in Fine-Tracker’s simulation power.
  • Scharenborg, O., & Wan, V. (2007). Can unquantised articulatory feature continuums be modelled? In INTERSPEECH 2007 - 8th Annual Conference of the International Speech Communication Association (pp. 2473-2476). ISCA Archive.

    Abstract

    Articulatory feature (AF) modelling of speech has received a considerable amount of attention in automatic speech recognition research. Although termed ‘articulatory’, previous definitions make certain assumptions that are invalid, for instance, that articulators ‘hop’ from one fixed position to the next. In this paper, we studied two methods, based on support vector classification (SVC) and regression (SVR), in which the articulation continuum is modelled without being restricted to using discrete AF value classes. A comparison with a baseline system trained on quantised values of the articulation continuum showed that both SVC and SVR outperform the baseline for two of the three investigated AFs, with improvements up to 5.6% absolute.
  • Scharenborg, O., Wan, V., & Moore, R. K. (2006). Capturing fine-phonetic variation in speech through automatic classification of articulatory features. In Speech Recognition and Intrinsic Variation Workshop [SRIV2006] (pp. 77-82). ISCA Archive.

    Abstract

    The ultimate goal of our research is to develop a computational model of human speech recognition that is able to capture the effects of fine-grained acoustic variation on speech recognition behaviour. As part of this work we are investigating automatic feature classifiers that are able to create reliable and accurate transcriptions of the articulatory behaviour encoded in the acoustic speech signal. In the experiments reported here, we compared support vector machines (SVMs) with multilayer perceptrons (MLPs). MLPs have been widely (and rather successfully) used for the task of multi-value articulatory feature classification, while (to the best of our knowledge) SVMs have not. This paper compares the performances of the two classifiers and analyses the results in order to better understand the articulatory representations. It was found that the MLPs outperformed the SVMs, but it is concluded that both classifiers exhibit similar behaviour in terms of patterns of errors.
  • Scharenborg, O., Boves, L., & de Veth, J. (2002). ASR in a human word recognition model: Generating phonemic input for Shortlist. In J. H. L. Hansen, & B. Pellom (Eds.), ICSLP 2002 - INTERSPEECH 2002 - 7th International Conference on Spoken Language Processing (pp. 633-636). ISCA Archive.

    Abstract

    The current version of the psycholinguistic model of human word recognition Shortlist suffers from two unrealistic constraints. First, the input of Shortlist must consist of a single string of phoneme symbols. Second, the current version of the search in Shortlist makes it difficult to deal with insertions and deletions in the input phoneme string. This research attempts to fully automatically derive a phoneme string from the acoustic signal that is as close as possible to the number of phonemes in the lexical representation of the word. We optimised an Automatic Phone Recogniser (APR) using two approaches, viz. varying the value of the mismatch parameter and optimising the APR output strings on the output of Shortlist. The approaches show that it will be very difficult to satisfy the input requirements of the present version of Shortlist with a phoneme string generated by an APR.
  • Scharenborg, O., & Boves, L. (2002). Pronunciation variation modelling in a model of human word recognition. In Pronunciation Modeling and Lexicon Adaptation for Spoken Language Technology [PMLA-2002] (pp. 65-70).

    Abstract

    Due to pronunciation variation, many insertions and deletions of phones occur in spontaneous speech. The psycholinguistic model of human speech recognition Shortlist is not well able to deal with phone insertions and deletions and is therefore not well suited for dealing with real-life input. The research presented in this paper explains how Shortlist can benefit from pronunciation variation modelling in dealing with real-life input. Pronunciation variation was modelled by including variants into the lexicon of Shortlist. A series of experiments was carried out to find the optimal acoustic model set for transcribing the training material that was used as basis for the generation of the variants. The Shortlist experiments clearly showed that Shortlist benefits from pronunciation variation modelling. However, the performance of Shortlist stays far behind the performance of other, more conventional speech recognisers.
  • Scheu, O., & Zinn, C. (2007). How did the e-learning session go? The student inspector. In Proceedings of the 13th International Conference on Artificial Intelligence and Education (AIED 2007). Amsterdam: IOS Press.

    Abstract

    Good teachers know their students, and exploit this knowledge to adapt or optimise their instruction. Traditional teachers know their students because they interact with them face-to-face in classroom or one-to-one tutoring sessions. In these settings, they can build student models, i.e., by exploiting the multi-faceted nature of human-human communication. In distance-learning contexts, teacher and student have to cope with the lack of such direct interaction, and this must have detrimental effects for both teacher and student. In a past study we have analysed teacher requirements for tracking student actions in computer-mediated settings. Given the results of this study, we have devised and implemented a tool that allows teachers to keep track of their learners'interaction in e-learning systems. We present the tool's functionality and user interfaces, and an evaluation of its usability.
  • Schiller, N. O., Schmitt, B., Peters, J., & Levelt, W. J. M. (2002). 'BAnana'or 'baNAna'? Metrical encoding during speech production [Abstract]. In M. Baumann, A. Keinath, & J. Krems (Eds.), Experimentelle Psychologie: Abstracts der 44. Tagung experimentell arbeitender Psychologen. (pp. 195). TU Chemnitz, Philosophische Fakultät.

    Abstract

    The time course of metrical encoding, i.e. stress, during speech production is investigated. In a first experiment, participants were presented with pictures whose bisyllabic Dutch names had initial or final stress (KAno 'canoe' vs. kaNON 'cannon'; capital letters indicate stressed syllables). Picture names were matched for frequency and object recognition latencies. When participants were asked to judge whether picture names had stress on the first or second syllable, they showed significantly faster decision times for initially stressed targets than for targets with final stress. Experiment 2 replicated this effect with trisyllabic picture names (faster RTs for penultimate stress than for ultimate stress). In our view, these results reflect the incremental phonological encoding process. Wheeldon and Levelt (1995) found that segmental encoding is a process running from the beginning to the end of words. Here, we present evidence that the metrical pattern of words, i.e. stress, is also encoded incrementally.
  • Schmiedtová, V., & Schmiedtová, B. (2002). The color spectrum in language: The case of Czech: Cognitive concepts, new idioms and lexical meanings. In H. Gottlieb, J. Mogensen, & A. Zettersten (Eds.), Proceedings of The 10th International Symposium on Lexicography (pp. 285-292). Tübingen: Max Niemeyer Verlag.

    Abstract

    The representative corpus SYN2000 in the Czech National Corpus (CNK) project containing 100 million word forms taken from different types of texts. I have tried to determine the extent and depth of the linguistic material in the corpus. First, I chose the adjectives indicating the basic colors of the spectrum and other parts of speech (names and adverbs) derived from these adjectives. An analysis of three examples - black, white and red - shows the extent of the linguistic wealth and diversity we are looking at: because of size limitations, no existing dictionary is capable of embracing all analyzed nuances. Currently, we can only hope that the next dictionary of contemporary Czech, built on the basis of the Czech National Corpus, will be electronic. Without the size limitations, we would be able us to include many of the fine nuances of language
  • Schulte im Walde, S., Melinger, A., Roth, M., & Weber, A. (2007). An empirical characterization of response types in German association norms. In Proceedings of the GLDV workshop on lexical-semantic and ontological resources.
  • Scott, S., & Sauter, D. (2006). Non-verbal expressions of emotion - acoustics, valence, and cross cultural factors. In Third International Conference on Speech Prosody 2006. ISCA.

    Abstract

    This presentation will address aspects of the expression of emotion in non-verbal vocal behaviour, specifically attempting to determine the roles of both positive and negative emotions, their acoustic bases, and the extent to which these are recognized in non-Western cultures.
  • Scott, D. R., & Cutler, A. (1982). Segmental cues to syntactic structure. In Proceedings of the Institute of Acoustics 'Spectral Analysis and its Use in Underwater Acoustics' (pp. E3.1-E3.4). London: Institute of Acoustics.
  • Seifart, F. (2002). El sistema de clasificación nominal del miraña. Bogotá: CCELA/Universidad de los Andes.
  • Senft, G. (2002). What should the ideal online-archive documenting linguistic data of various (endangered) languages and cultures offer to interested parties? Some ideas of a technically naive linguistic field researcher and potential user. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 11-15). Paris: European Language Resources Association.
  • Senft, G. (1991). Bakavilisi Biga - we can 'turn' the language - or: What happens to English words in Kilivila language? In W. Bahner, J. Schildt, & D. Viehwegger (Eds.), Proceedings of the XIVth International Congress of Linguists (pp. 1743-1746). Berlin: Akademie Verlag.
  • Senft, G. (2007). Language, culture and cognition: Frames of spatial reference and why we need ontologies of space [Abstract]. In A. G. Cohn, C. Freksa, & B. Bebel (Eds.), Spatial cognition: Specialization and integration (pp. 12).

    Abstract

    One of the many results of the "Space" research project conducted at the MPI for Psycholinguistics is that there are three "Frames of spatial Reference" (FoRs), the relative, the intrinsic and the absolute FoR. Cross-linguistic research showed that speakers who prefer one FoR in verbal spatial references rely on a comparable coding system for memorizing spatial configurations and for making inferences with respect to these spatial configurations in non-verbal problem solving. Moreover, research results also revealed that in some languages these verbal FoRs also influence gestural behavior. These results document the close interrelationship between language, culture and cognition in the domain "Space". The proper description of these interrelationships in the spatial domain requires language and culture specific ontologies.
  • Senft, G. (1986). Kilivila: The language of the Trobriand Islanders. Berlin: Mouton de Gruyter.
  • Senft, B., & Senft, G. (2018). Growing up on the Trobriand Islands in Papua New Guinea - Childhood and educational ideologies in Tauwema. Amsterdam: Benjamins. doi:10.1075/clu.21.

    Abstract

    This volume deals with the children’s socialization on the Trobriands. After a survey of ethnographic studies on childhood, the book zooms in on indigenous ideas of conception and birth-giving, the children’s early development, their integration into playgroups, their games and their education within their `own little community’ until they reach the age of seven years. During this time children enjoy much autonomy and independence. Attempts of parental education are confined to a minimum. However, parents use subtle means to raise their children. Educational ideologies are manifest in narratives and in speeches addressed to children. They provide guidelines for their integration into the Trobrianders’ “balanced society” which is characterized by cooperation and competition. It does not allow individual accumulation of wealth – surplus property gained has to be redistributed – but it values the fame acquired by individuals in competitive rituals. Fame is not regarded as threatening the balance of their society.
  • Seuren, P. A. M. (1982). De spelling van het Sranan: Een diskussie en een voorstel. Nijmegen: Masusa.
  • Seuren, P. A. M. (2002). Existential import. In D. De Jongh, M. Nilsenová, & H. Zeevat (Eds.), Proceedings of The 3rd and 4th International Symposium on Language, Logic and Computation. Amsterdam: ILLC Scientific Publ. U. of Amsterdam.
  • Seuren, P. A. M. (1991). Notes on noun phrases and quantification. In Proceedings of the International Conference on Current Issues in Computational Linguistics (pp. 19-44). Penang, Malaysia: Universiti Sains Malaysia.
  • Seuren, P. A. M. (2018). Semantic syntax (2nd rev. ed.). Leiden: Brill.

    Abstract

    This book presents a detailed formal machinery for the conversion of the Semantic Analyses (SAs) of sentences into surface structures of English, French, German, Dutch, and to some extent Turkish. The SAs are propositional structures consisting of a predicate and one, two or three argument terms, some of which can themselves be propositional structures. The surface structures are specified up to, but not including, the morphology. The book is thus an implementation of the programme formulated first by Albert Sechehaye (1870-1946) and then, independently, by James McCawley (1938-1999) in the school of Generative Semantics. It is the first, and so far the only formally precise and empirically motivated machinery in existence converting meaning representations into sentences of natural languages.
  • Seuren, P. A. M. (1982). Riorientamenti metodologici nello studio della variabilità linguistica. In D. Gambarara, & A. D'Atri (Eds.), Ideologia, filosofia e linguistica: Atti del Convegno Internazionale di Studi, Rende (CS) 15-17 Settembre 1978 ( (pp. 499-515). Roma: Bulzoni.
  • Seuren, P. A. M. (2018). Saussure and Sechehaye: A study in the history of linguistics and the foundations of language. Leiden: Brill.
  • Seuren, P. A. M. (1991). What makes a text untranslatable? In H. M. N. Noor Ein, & H. S. Atiah (Eds.), Pragmatik Penterjemahan: Prinsip, Amalan dan Penilaian Menuju ke Abad 21 ("The Pragmatics of Translation: Principles, Practice and Evaluation Moving towards the 21st Century") (pp. 19-27). Kuala Lumpur: Dewan Bahasa dan Pustaka.
  • Speed, L., & Majid, A. (2018). Music and odor in harmony: A case of music-odor synaesthesia. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 2527-2532). Austin, TX: Cognitive Science Society.

    Abstract

    We report an individual with music-odor synaesthesia who experiences automatic and vivid odor sensations when she hears music. S’s odor associations were recorded on two days, and compared with those of two control participants. Overall, S produced longer descriptions, and her associations were of multiple odors at once, in comparison to controls who typically reported a single odor. Although odor associations were qualitatively different between S and controls, ratings of the consistency of their descriptions did not differ. This demonstrates that crossmodal associations between music and odor exist in non-synaesthetes too. We also found that S is better at discriminating between odors than control participants, and is more likely to experience emotion, memories and evaluations triggered by odors, demonstrating the broader impact of her synaesthesia.

    Additional information

    link to conference website
  • Stevens, M. A., McQueen, J. M., & Hartsuiker, R. J. (2007). No lexically-driven perceptual adjustments of the [x]-[h] boundary. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1897-1900). Dudweiler: Pirrot.

    Abstract

    Listeners can make perceptual adjustments to phoneme categories in response to a talker who consistently produces a specific phoneme ambiguously. We investigate here whether this type of perceptual learning is also used to adapt to regional accent differences. Listeners were exposed to words produced by a Flemish talker whose realization of [x℄or [h℄ was ambiguous (producing [x℄like [h℄is a property of the West-Flanders regional accent). Before and after exposure they categorized a [x℄-[h℄continuum. For both Dutch and Flemish listeners there was no shift of the categorization boundary after exposure to ambiguous sounds in [x℄- or [h℄-biasing contexts. The absence of a lexically-driven learning effect for this contrast may be because [h℄is strongly influenced by coarticulation. As is not stable across contexts, it may be futile to adapt its representation when new realizations are heard
  • Stivers, T. (2007). Prescribing under pressure: Parent-physician conversations and antibiotics. Oxford: Oxford University Press.

    Abstract

    This book examines parent-physician conversations in detail, showing how parents put pressure on doctors in largely covert ways, for instance in specific communication practices for explaining why they have brought their child to the doctor or answering a history-taking question. This book also shows how physicians yield to this seemingly subtle pressure evidencing that apparently small differences in wording have important consequences for diagnosis and treatment recommendations. Following parents use of these interactional practices, physicians are more likely to make concessions, alter their diagnosis or alter their treatment recommendation. This book also shows how small changes in the way physicians present their findings and recommendations can decrease parent pressure for antibiotics. This book carefully documents the important and observable link between micro social interaction and macro public health domains.
  • Ten Bosch, L., Baayen, R. H., & Ernestus, M. (2006). On speech variation and word type differentiation by articulatory feature representations. In Proceedings of Interspeech 2006 (pp. 2230-2233).

    Abstract

    This paper describes ongoing research aiming at the description of variation in speech as represented by asynchronous articulatory features. We will first illustrate how distances in the articulatory feature space can be used for event detection along speech trajectories in this space. The temporal structure imposed by the cosine distance in articulatory feature space coincides to a large extent with the manual segmentation on phone level. The analysis also indicates that the articulatory feature representation provides better such alignments than the MFCC representation does. Secondly, we will present first results that indicate that articulatory features can be used to probe for acoustic differences in the onsets of Dutch singulars and plurals.
  • Ten Bosch, L., Ernestus, M., & Boves, L. (2018). Analyzing reaction time sequences from human participants in auditory experiments. In Proceedings of Interspeech 2018 (pp. 971-975). doi:10.21437/Interspeech.2018-1728.

    Abstract

    Sequences of reaction times (RT) produced by participants in an experiment are not only influenced by the stimuli, but by many other factors as well, including fatigue, attention, experience, IQ, handedness, etc. These confounding factors result in longterm effects (such as a participant’s overall reaction capability) and in short- and medium-time fluctuations in RTs (often referred to as ‘local speed effects’). Because stimuli are usually presented in a random sequence different for each participant, local speed effects affect the underlying ‘true’ RTs of specific trials in different ways across participants. To be able to focus statistical analysis on the effects of the cognitive process under study, it is necessary to reduce the effect of confounding factors as much as possible. In this paper we propose and compare techniques and criteria for doing so, with focus on reducing (‘filtering’) the local speed effects. We show that filtering matters substantially for the significance analyses of predictors in linear mixed effect regression models. The performance of filtering is assessed by the average between-participant correlation between filtered RT sequences and by Akaike’s Information Criterion, an important measure of the goodness-of-fit of linear mixed effect regression models.
  • ten Bosch, L., Hämäläinen, A., Scharenborg, O., & Boves, L. (2006). Acoustic scores and symbolic mismatch penalties in phone lattices. In Proceedings of the 2006 IEEE International Conference on Acoustics, Speech and Signal Processing [ICASSP 2006]. IEEE.

    Abstract

    This paper builds on previous work that aims at unraveling the structure of the speech signal by means of using probabilistic representations. The context of this work is a multi-pass speech recognition system in which a phone lattice is created and used as a basis for a lexical search in which symbolic mismatches are allowed at certain costs. The focus is on the optimization of the costs of phone insertions, deletions and substitutions that are used in the lexical decoding pass. Two optimization approaches are presented, one related to a multi-pass computational model for human speech recognition, the other based on a decoding in which Bayes’ risks are minimized. In the final section, the advantages of these optimization methods are discussed and compared.
  • Ten Bosch, L., & Boves, L. (2018). Information encoding by deep neural networks: what can we learn? In Proceedings of Interspeech 2018 (pp. 1457-1461). doi:10.21437/Interspeech.2018-1896.

    Abstract

    The recent advent of deep learning techniques in speech tech-nology and in particular in automatic speech recognition hasyielded substantial performance improvements. This suggeststhat deep neural networks (DNNs) are able to capture structurein speech data that older methods for acoustic modeling, suchas Gaussian Mixture Models and shallow neural networks failto uncover. In image recognition it is possible to link repre-sentations on the first couple of layers in DNNs to structuralproperties of images, and to representations on early layers inthe visual cortex. This raises the question whether it is possi-ble to accomplish a similar feat with representations on DNNlayers when processing speech input. In this paper we presentthree different experiments in which we attempt to untanglehow DNNs encode speech signals, and to relate these repre-sentations to phonetic knowledge, with the aim to advance con-ventional phonetic concepts and to choose the topology of aDNNs more efficiently. Two experiments investigate represen-tations formed by auto-encoders. A third experiment investi-gates representations on convolutional layers that treat speechspectrograms as if they were images. The results lay the basisfor future experiments with recursive networks.
  • Terrill, A. (2002). Dharumbal: The language of Rockhampton, Australia. Canberra: Pacific Linguistics.
  • Thompson, B., & Lupyan, G. (2018). Automatic estimation of lexical concreteness in 77 languages. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1122-1127). Austin, TX: Cognitive Science Society.

    Abstract

    We estimate lexical Concreteness for millions of words across 77 languages. Using a simple regression framework, we combine vector-based models of lexical semantics with experimental norms of Concreteness in English and Dutch. By applying techniques to align vector-based semantics across distinct languages, we compute and release Concreteness estimates at scale in numerous languages for which experimental norms are not currently available. This paper lays out the technique and its efficacy. Although this is a difficult dataset to evaluate immediately, Concreteness estimates computed from English correlate with Dutch experimental norms at $\rho$ = .75 in the vocabulary at large, increasing to $\rho$ = .8 among Nouns. Our predictions also recapitulate attested relationships with word frequency. The approach we describe can be readily applied to numerous lexical measures beyond Concreteness
  • Thompson, B., Roberts, S., & Lupyan, G. (2018). Quantifying semantic similarity across languages. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 2551-2556). Austin, TX: Cognitive Science Society.

    Abstract

    Do all languages convey semantic knowledge in the same way? If language simply mirrors the structure of the world, the answer should be a qualified “yes”. If, however, languages impose structure as much as reflecting it, then even ostensibly the “same” word in different languages may mean quite different things. We provide a first pass at a large-scale quantification of cross-linguistic semantic alignment of approximately 1000 meanings in 55 languages. We find that the translation equivalents in some domains (e.g., Time, Quantity, and Kinship) exhibit high alignment across languages while the structure of other domains (e.g., Politics, Food, Emotions, and Animals) exhibits substantial cross-linguistic variability. Our measure of semantic alignment correlates with known phylogenetic distances between languages: more phylogenetically distant languages have less semantic alignment. We also find semantic alignment to correlate with cultural distances between societies speaking the languages, suggesting a rich co-adaptation of language and culture even in domains of experience that appear most constrained by the natural world
  • Tourtouri, E. N., Delogu, F., & Crocker, M. W. (2018). Specificity and entropy reduction in situated referential processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 3356-3361). Austin: Cognitive Science Society.

    Abstract

    In situated communication, reference to an entity in the shared visual context can be established using eitheranexpression that conveys precise (minimally specified) or redundant (over-specified) information. There is, however, along-lasting debate in psycholinguistics concerningwhether the latter hinders referential processing. We present evidence from an eyetrackingexperiment recordingfixations as well asthe Index of Cognitive Activity –a novel measure of cognitive workload –supporting the view that over-specifications facilitate processing. We further present originalevidence that, above and beyond the effect of specificity,referring expressions thatuniformly reduce referential entropyalso benefitprocessing
  • Troncarelli, M. C., & Drude, S. (2002). Awytyza Ti'ingku. Livro para alfabetização na língua aweti: Awytyza Ti’ingku. Alphabetisierungs‐Fibel der Awetí‐Sprache. São Paulo: Instituto Sócio-Ambiental.
  • Tuinman, A. (2006). Overcompensation of /t/ reduction in Dutch by German/Dutch bilinguals. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 101-102).
  • Tuinman, A., Mitterer, H., & Cutler, A. (2007). Speakers differentiate English intrusive and onset /r/, but L2 listeners do not. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1905-1908). Dudweiler: Pirrot.

    Abstract

    We investigated whether non-native listeners can exploit phonetic detail in recognizing potentially ambiguous utterances, as native listeners can [6, 7, 8, 9, 10]. Due to the phenomenon of intrusive /r/, the English phrase extra ice may sound like extra rice. A production study indicates that the intrusive /r/ can be distinguished from the onset /r/ in rice, as it is phonetically weaker. In two cross-modal identity priming studies, however, we found no conclusive evidence that Dutch learners of English are able to make use of this difference. Instead, auditory primes such as extra rice and extra ice with onset and intrusive /r/s activate both types of targets such as ice and rice. This supports the notion of spurious lexical activation in L2 perception.
  • Vagliano, I., Galke, L., Mai, F., & Scherp, A. (2018). Using adversarial autoencoders for multi-modal automatic playlist continuation. In C.-W. Chen, P. Lamere, M. Schedl, & H. Zamani (Eds.), RecSys Challenge '18: Proceedings of the ACM Recommender Systems Challenge 2018 (pp. 5.1-5.6). New York: ACM. doi:10.1145/3267471.3267476.

    Abstract

    The task of automatic playlist continuation is generating a list of recommended tracks that can be added to an existing playlist. By suggesting appropriate tracks, i. e., songs to add to a playlist, a recommender system can increase the user engagement by making playlist creation easier, as well as extending listening beyond the end of current playlist. The ACM Recommender Systems Challenge 2018 focuses on such task. Spotify released a dataset of playlists, which includes a large number of playlists and associated track listings. Given a set of playlists from which a number of tracks have been withheld, the goal is predicting the missing tracks in those playlists. We participated in the challenge as the team Unconscious Bias and, in this paper, we present our approach. We extend adversarial autoencoders to the problem of automatic playlist continuation. We show how multiple input modalities, such as the playlist titles as well as track titles, artists and albums, can be incorporated in the playlist continuation task.
  • Van Alphen, P. M., De Bree, E., Fikkert, P., & Wijnen, F. (2007). The role of metrical stress in comprehension and production of Dutch children at risk of dyslexia. In Proceedings of Interspeech 2007 (pp. 2313-2316). Adelaide: Causal Productions.

    Abstract

    The present study compared the role of metrical stress in comprehension and production of three-year-old children with a familial risk of dyslexia with that of normally developing children. A visual fixation task with stress (mis-)matches in bisyllabic words, as well as a non-word repetition task with bisyllabic targets were presented to the control and at-risk children. Results show that the at-risk group is less sensitive to stress mismatches in word recognition than the control group. Correct production of metrical stress patterns did not differ significantly between the groups, but the percentages of phonemes produced correctly were lower for the at-risk than the control group. The findings indicate that processing of metrical stress patterns is not impaired in at-risk children, but that the at-risk group cannot exploit metrical stress in word recognition
  • Van Ooijen, B., Cutler, A., & Norris, D. (1991). Detection times for vowels versus consonants. In Eurospeech 91: Vol. 3 (pp. 1451-1454). Genova: Istituto Internazionale delle Comunicazioni.

    Abstract

    This paper reports two experiments with vowels and consonants as phoneme detection targets in real words. In the first experiment, two relatively distinct vowels were compared with two confusible stop consonants. Response times to the vowels were longer than to the consonants. Response times correlated negatively with target phoneme length. In the second, two relatively distinct vowels were compared with their corresponding semivowels. This time, the vowels were detected faster than the semivowels. We conclude that response time differences between vowels and stop consonants in this task may reflect differences between phoneme categories in the variability of tokens, both in the acoustic realisation of targets and in the' representation of targets by subjects.
  • Van den Bos, E. J., & Poletiek, F. H. (2006). Implicit artificial grammar learning in adults and children. In R. Sun (Ed.), Proceedings of the 28th Annual Conference of the Cognitive Science Society (CogSci 2006) (pp. 2619). Austin, TX, USA: Cognitive Science Society.
  • Vernes, S. C. (2018). Vocal learning in bats: From genes to behaviour. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 516-518). Toruń, Poland: NCU Press. doi:10.12775/3991-1.128.
  • Von Holzen, K., & Bergmann, C. (2018). A Meta-Analysis of Infants’ Mispronunciation Sensitivity Development. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1159-1164). Austin, TX: Cognitive Science Society.

    Abstract

    Before infants become mature speakers of their native language, they must acquire a robust word-recognition system which allows them to strike the balance between allowing some variation (mood, voice, accent) and recognizing variability that potentially changes meaning (e.g. cat vs hat). The current meta-analysis quantifies how the latter, termed mispronunciation sensitivity, changes over infants’ first three years, testing competing predictions of mainstream language acquisition theories. Our results show that infants were sensitive to mispronunciations, but accepted them as labels for target objects. Interestingly, and in contrast to predictions of mainstream theories, mispronunciation sensitivity was not modulated by infant age, suggesting that a sufficiently flexible understanding of native language phonology is in place at a young age.
  • De Vos, C. (2006). Mixed signals: Combining affective and linguistic functions of eyebrows in sign language of The Netherlands (Master's thesis). Nijmegen: Department of Linguistics, Radboud University.

    Abstract

    Sign Language of the Netherlands (NGT) is a visual-gestural language in which linguistic information is conveyed through manual as well as non-manual channels; not only the hands, but also body position, head position and facial expression are important for the language structure. Facial expressions serve grammatical functions in the marking of topics, yes/no questions, and wh-questions (Coerts, 1992). Furthermore, facial expression is used nonlinguistically in the expression of affect (Ekman, 1979). Consequently, at the phonetic level obligatory marking of grammar using facial expression may conflict with the expression of affect. In this study, I investigated the interplay of linguistic and affective functions of brow movements in NGT. Three hypotheses were tested in this thesis. The first is that the affective markers of eyebrows would dominate over the linguistic markers. The second hypothesis predicts that the grammatical markers dominate over the affective brow movements. A third possibility is that a Phonetic Sum would occur in which both functions are combined simultaneously. I elicited sentences combining grammatical and affective functions of eyebrows using a randomised design. Five sentence types were included: declarative sentences, topic sentences, yes-no questions, wh-questions with the wh-sign sentence-final and wh-questions with the wh-sign sentence-initial. These sentences were combined with neutral, surprised, angry, and distressed affect. The brow movements were analysed using the Facial Action Coding System (Ekman, Friesen, & Hager, 2002a). In these sentences, the eyebrows serve a linguistic function, an affective function, or both. One of the possibilities in the latter cases was that a Phonetic Sum would occur that combines both functions simultaneously. Surprisingly, it was found that a Phonetic Sum occurs in which the phonetic weight of Action Unit 4 appears to play an important role. The results show that affect displays may alter question signals in NGT.
  • Vosse, T., & Kempen, G. (1991). A hybrid model of human sentence processing: Parsing right-branching, center-embedded and cross-serial dependencies. In M. Tomita (Ed.), Proceedings of the Second International Workshop on Parsing Technologies.
  • Warner, N., & Weber, A. (2002). Stop epenthesis at syllable boundaries. In J. H. L. Hansen, & B. Pellom (Eds.), 7th International Conference on Spoken Language Processing (ICSLP2002 - INTERSPEECH 2002) (pp. 1121-1124). ISCA Archive.

    Abstract

    This paper investigates the production and perception of epenthetic stops at syllable boundaries in Dutch and compares the experimental data with lexical statistics for Dutch and English. This extends past work on epenthesis in coda position [1]. The current work is particularly informative regarding the question of phonotactic constraints’ influence on parsing of speech variability.
  • Warner, N., Jongman, A., & Mücke, D. (2002). Variability in direction of dorsal movement during production of /l/. In J. H. L. Hansen, & B. Pellom (Eds.), 7th International Conference on Spoken Language Processing (ICSLP2002 - INTERSPEECH 2002) (pp. 1089-1092). ISCA Archive.

    Abstract

    This paper presents articulatory data on the production of /l/ in various environments in Dutch, and shows that the direction of movement of the tongue dorsum varies across environments. This makes it impossible to measure tongue position at the peak of the dorsal gesture. We argue for an alternative method in such cases: measurement of position of one articulator at a time point defined by the gesture of another. We present new data measured this way which confirms a previous finding on the articulation of Dutch /l/.
  • Weber, A., Melinger, A., & Lara Tapia, L. (2007). The mapping of phonetic information to lexical presentations in Spanish: Evidence from eye movements. In J. Trouvain, & W. J. Barry (Eds.), Proceedings of the 16th International Congress of Phonetic Sciences (ICPhS 2007) (pp. 1941-1944). Dudweiler: Pirrot.

    Abstract

    In a visual-world study, we examined spoken-wordrecognition in Spanish. Spanish listeners followed spoken instructions to click on pictures while their eye movements were monitored. When instructed to click on the picture of a door (puerta), they experienced interference from the picture of a pig (p u e r c o ). The same interference from phonologically related items was observed when the displays contained printed names or a combination of pictures with their names printed underneath, although the effect was strongest for displays with printed names. Implications of the finding that the interference effect can be induced with standard pictorial displays as well as with orthographic displays are discussed.
  • Widlok, T. (2006). Two ways of looking at a Mangetti grove. In A. Takada (Ed.), Proceedings of the workshop: Landscape and society (pp. 11-16). Kyoto: 21st Century Center of Excellence Program.
  • Wittenburg, P., Kita, S., & Brugman, H. (2002). Crosslinguistic studies of multimodal communication.
  • Wittenburg, P., Brugman, H., Russel, A., Klassmann, A., & Sloetjes, H. (2006). ELAN: a professional framework for multimodality research. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1556-1559).

    Abstract

    Utilization of computer tools in linguistic research has gained importance with the maturation of media frameworks for the handling of digital audio and video. The increased use of these tools in gesture, sign language and multimodal interaction studies has led to stronger requirements on the flexibility, the efficiency and in particular the time accuracy of annotation tools. This paper describes the efforts made to make ELAN a tool that meets these requirements, with special attention to the developments in the area of time accuracy. In subsequent sections an overview will be given of other enhancements in the latest versions of ELAN, that make it a useful tool in multimodality research.
  • Wittenburg, P., Peters, W., & Drude, S. (2002). Analysis of lexical structures from field linguistics and language engineering. In M. R. González, & C. P. S. Araujo (Eds.), Third international conference on language resources and evaluation (pp. 682-686). Paris: European Language Resources Association.

    Abstract

    Lexica play an important role in every linguistic discipline. We are confronted with many types of lexica. Depending on the type of lexicon and the language we are currently faced with a large variety of structures from very simple tables to complex graphs, as was indicated by a recent overview of structures found in dictionaries from field linguistics and language engineering. It is important to assess these differences and aim at the integration of lexical resources in order to improve lexicon creation, exchange and reuse. This paper describes the first step towards the integration of existing structures and standards into a flexible abstract model.
  • Wittenburg, P., Broeder, D., Klein, W., Levinson, S. C., & Romary, L. (2006). Foundations of modern language resource archives. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 625-628).

    Abstract

    A number of serious reasons will convince an increasing amount of researchers to store their relevant material in centers which we will call "language resource archives". They combine the duty of taking care of long-term preservation as well as the task to give access to their material to different user groups. Access here is meant in the sense that an active interaction with the data will be made possible to support the integration of new data, new versions or commentaries of all sort. Modern Language Resource Archives will have to adhere to a number of basic principles to fulfill all requirements and they will have to be involved in federations to create joint language resource domains making it even more simple for the researchers to access the data. This paper makes an attempt to formulate the essential pillars language resource archives have to adhere to.
  • Wittenburg, P., & Broeder, D. (2002). Metadata overview and the semantic web. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics. Paris: European Language Resources Association.

    Abstract

    The increasing quantity and complexity of language resources leads to new management problems for those that collect and those that need to preserve them. At the same time the desire to make these resources available on the Internet demands an efficient way characterizing their properties to allow discovery and re-use. The use of metadata is seen as a solution for both these problems. However, the question is what specific requirements there are for the specific domain and if these are met by existing frameworks. Any possible solution should be evaluated with respect to its merit for solving the domain specific problems but also with respect to its future embedding in “global” metadata frameworks as part of the Semantic Web activities.
  • Wittenburg, P., Peters, W., & Broeder, D. (2002). Metadata proposals for corpora and lexica. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 1321-1326). Paris: European Language Resources Association.
  • Wittenburg, P., Mosel, U., & Dwyer, A. (2002). Methods of language documentation in the DOBES program. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 36-42). Paris: European Language Resources Association.
  • Zeshan, U. (Ed.). (2006). Interrogative and negative constructions in sign languages. Nijmegen: Ishara Press.
  • Zwitserlood, I. (2002). The complex structure of ‘simple’ signs in NGT. In J. Van Koppen, E. Thrift, E. Van der Torre, & M. Zimmermann (Eds.), Proceedings of ConSole IX (pp. 232-246).

    Abstract

    In this paper, I argue that components in a set of simple signs in Nederlandse Gebarentaal (also called Sign Language of the Netherlands; henceforth: NGT), i.e. hand configuration (including orientation), movement and place of articulation, can also have morphological status. Evidence for this is provided by: firstly, the fact that handshape, orientation, movement and place of articulation show regular meaningful patterns in signs, which patterns also occur in newly formed signs, and secondly, the gradual change of formerly noninflecting predicates into inflectional predicates. The morphological complexity of signs can best be accounted for in autosegmental morphological templates.

Share this page