Publications

Displaying 1 - 100 of 180
  • Alday, P. M. (2016). Towards a rigorous motivation for Ziph's law. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/178.html.

    Abstract

    Language evolution can be viewed from two viewpoints: the development of a communicative system and the biological adaptations necessary for producing and perceiving said system. The communicative-system vantage point has enjoyed a wealth of mathematical models based on simple distributional properties of language, often formulated as empirical laws. However, be- yond vague psychological notions of “least effort”, no principled explanation has been proposed for the existence and success of such laws. Meanwhile, psychological and neurobiological mod- els have focused largely on the computational constraints presented by incremental, real-time processing. In the following, we show that information-theoretic entropy underpins successful models of both types and provides a more principled motivation for Zipf’s Law
  • Alhama, R. G., & Zuidema, W. (2016). Generalization in Artificial Language Learning: Modelling the Propensity to Generalize. In Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning (pp. 64-72). Association for Computational Linguistics. doi:10.18653/v1/W16-1909.

    Abstract

    Experiments in Artificial Language Learn- ing have revealed much about the cogni- tive mechanisms underlying sequence and language learning in human adults, in in- fants and in non-human animals. This pa- per focuses on their ability to generalize to novel grammatical instances (i.e., in- stances consistent with a familiarization pattern). Notably, the propensity to gen- eralize appears to be negatively correlated with the amount of exposure to the artifi- cial language, a fact that has been claimed to be contrary to the predictions of statis- tical models (Pe ̃ na et al. (2002); Endress and Bonatti (2007)). In this paper, we pro- pose to model generalization as a three- step process, and we demonstrate that the use of statistical models for the first two steps, contrary to widespread intuitions in the ALL-field, can explain the observed decrease of the propensity to generalize with exposure time.
  • Alhama, R. G., & Zuidema, W. (2016). Pre-Wiring and Pre-Training: What does a neural network need to learn truly general identity rules? In T. R. Besold, A. Bordes, & A. D'Avila Garcez (Eds.), CoCo 2016 Cognitive Computation: Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016. CEUR Workshop Proceedings.

    Abstract

    In an influential paper, Marcus et al. [1999] claimed that connectionist models cannot account for human success at learning tasks that involved generalization of abstract knowledge such as grammatical rules. This claim triggered a heated debate, centered mostly around variants of the Simple Recurrent Network model [Elman, 1990]. In our work, we revisit this unresolved debate and analyze the underlying issues from a different perspective. We argue that, in order to simulate human-like learning of grammatical rules, a neural network model should not be used as a tabula rasa , but rather, the initial wiring of the neural connections and the experience acquired prior to the actual task should be incorporated into the model. We present two methods that aim to provide such initial state: a manipu- lation of the initial connections of the network in a cognitively plausible manner (concretely, by implementing a “delay-line” memory), and a pre-training algorithm that incrementally challenges the network with novel stimuli. We implement such techniques in an Echo State Network [Jaeger, 2001], and we show that only when combining both techniques the ESN is able to learn truly general identity rules.
  • Alhama, R. G., Scha, R., & Zuidema, W. (2014). Rule learning in humans and animals. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The evolution of language: Proceedings of the 10th International Conference (EVOLANG 10) (pp. 371-372). Singapore: World Scientific.
  • Allen, S. E. M. (1998). A discourse-pragmatic explanation for the subject-object asymmetry in early null arguments. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the GALA '97 Conference on Language Acquisition (pp. 10-15). Edinburgh, UK: Edinburgh University Press.

    Abstract

    The present paper assesses discourse-pragmatic factors as a potential explanation for the subject-object assymetry in early child language. It identifies a set of factors which characterize typical situations of informativeness (Greenfield & Smith, 1976), and uses these factors to identify informative arguments in data from four children aged 2;0 through 3;6 learning Inuktitut as a first language. In addition, it assesses the extent of the links between features of informativeness on one hand and lexical vs. null and subject vs. object arguments on the other. Results suggest that a pragmatics account of the subject-object asymmetry can be upheld to a greater extent than previous research indicates, and that several of the factors characterizing informativeness are good indicators of those arguments which tend to be omitted in early child language.
  • Azar, Z., Backus, A., & Ozyurek, A. (2016). Pragmatic relativity: Gender and context affect the use of personal pronouns in discourse differentially across languages. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 1295-1300). Austin, TX: Cognitive Science Society.

    Abstract

    Speakers use differential referring expressions in pragmatically appropriate ways to produce coherent narratives. Languages, however, differ in a) whether REs as arguments can be dropped and b) whether personal pronouns encode gender. We examine two languages that differ from each other in these two aspects and ask whether the co-reference context and the gender encoding options affect the use of REs differentially. We elicited narratives from Dutch and Turkish speakers about two types of three-person events, one including people of the same and the other of mixed-gender. Speakers re-introduced referents into the discourse with fuller forms (NPs) and maintained them with reduced forms (overt or null pronoun). Turkish speakers used pronouns mainly to mark emphasis and only Dutch speakers used pronouns differentially across the two types of videos. We argue that linguistic possibilities available in languages tune speakers into taking different principles into account to produce pragmatically coherent narratives
  • Bauer, B. L. M. (2014). Indefinite HOMO in the Gospels of the Vulgata. In P. Molinell, P. Cuzzoli, & C. Fedriani (Eds.), Latin vulgaire – latin tardif X (pp. 415-435). Bergamo: Bergamo University Press.
  • Bergmann, C., Ten Bosch, L., & Boves, L. (2014). A computational model of the headturn preference procedure: Design, challenges, and insights. In J. Mayor, & P. Gomez (Eds.), Computational Models of Cognitive Processes (pp. 125-136). World Scientific. doi:10.1142/9789814458849_0010.

    Abstract

    The Headturn Preference Procedure (HPP) is a frequently used method (e.g., Jusczyk & Aslin; and subsequent studies) to investigate linguistic abilities in infants. In this paradigm infants are usually first familiarised with words and then tested for a listening preference for passages containing those words in comparison to unrelated passages. Listening preference is defined as the time an infant spends attending to those passages with his or her head turned towards a flashing light and the speech stimuli. The knowledge and abilities inferred from the results of HPP studies have been used to reason about and formally model early linguistic skills and language acquisition. However, the actual cause of infants' behaviour in HPP experiments has been subject to numerous assumptions as there are no means to directly tap into cognitive processes. To make these assumptions explicit, and more crucially, to understand how infants' behaviour emerges if only general learning mechanisms are assumed, we introduce a computational model of the HPP. Simulations with the computational HPP model show that the difference in infant behaviour between familiarised and unfamiliar words in passages can be explained by a general learning mechanism and that many assumptions underlying the HPP are not necessarily warranted. We discuss the implications for conventional interpretations of the outcomes of HPP experiments.
  • Bergmann, C., Cristia, A., & Dupoux, E. (2016). Discriminability of sound contrasts in the face of speaker variation quantified. In Proceedings of the 38th Annual Conference of the Cognitive Science Society. (pp. 1331-1336). Austin, TX: Cognitive Science Society.

    Abstract

    How does a naive language learner deal with speaker variation irrelevant to distinguishing word meanings? Experimental data is contradictory, and incompatible models have been proposed. Here, we examine basic assumptions regarding the acoustic signal the learner deals with: Is speaker variability a hurdle in discriminating sounds or can it easily be ignored? To this end, we summarize existing infant data. We then present machine-based discriminability scores of sound pairs obtained without any language knowledge. Our results show that speaker variability decreases sound contrast discriminability, and that some contrasts are affected more than others. However, chance performance is rare; most contrasts remain discriminable in the face of speaker variation. We take our results to mean that speaker variation is not a uniform hurdle to discriminating sound contrasts, and careful examination is necessary when planning and interpreting studies testing whether and to what extent infants (and adults) are sensitive to speaker differences.

    Additional information

    Scripts and data
  • Blasi, D. E., Christiansen, M. H., Wichmann, S., Hammarström, H., & Stadler, P. F. (2014). Sound symbolism and the origins of language. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Cornish (Eds.), The evolution of language: Proceedings of the 10th International Conference (EVOLANG 10) (pp. 391-392). Singapore: World Scientific.
  • Bocanegra, B. R., Poletiek, F. H., & Zwaan, R. A. (2014). Asymmetrical feature binding across language and perception. In Proceedings of the 7th annual Conference on Embodied and Situated Language Processing (ESLP 2014).
  • Bohnemeyer, J. (2004). Argument and event structure in Yukatek verb classes. In J.-Y. Kim, & A. Werle (Eds.), Proceedings of The Semantics of Under-Represented Languages in the Americas. Amherst, Mass: GLSA.

    Abstract

    In Yukatek Maya, event types are lexicalized in verb roots and stems that fall into a number of different form classes on the basis of (a) patterns of aspect-mood marking and (b) priviledges of undergoing valence-changing operations. Of particular interest are the intransitive classes in the light of Perlmutter’s (1978) Unaccusativity hypothesis. In the spirit of Levin & Rappaport Hovav (1995) [L&RH], Van Valin (1990), Zaenen (1993), and others, this paper investigates whether (and to what extent) the association between formal predicate classes and event types is determined by argument structure features such as ‘agentivity’ and ‘control’ or features of lexical aspect such as ‘telicity’ and ‘durativity’. It is shown that mismatches between agentivity/control and telicity/durativity are even more extensive in Yukatek than they are in English (Abusch 1985; L&RH, Van Valin & LaPolla 1997), providing new evidence against Dowty’s (1979) reconstruction of Vendler’s (1967) ‘time schemata of verbs’ in terms of argument structure configurations. Moreover, contrary to what has been claimed in earlier studies of Yukatek (Krämer & Wunderlich 1999, Lucy 1994), neither agentivity/control nor telicity/durativity turn out to be good predictors of verb class membership. Instead, the patterns of aspect-mood marking prove to be sensitive only to the presence or absense of state change, in a way that supports the unified analysis of all verbs of gradual change proposed by Kennedy & Levin (2001). The presence or absence of ‘internal causation’ (L&RH) may motivate the semantic interpretation of transitivization operations. An explicit semantics for the valence-changing operations is proposed, based on Parsons’s (1990) Neo-Davidsonian approach.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2016). Listening under cognitive load makes speech sound fast. In H. van den Heuvel, B. Cranen, & S. Mattys (Eds.), Proceedings of the Speech Processing in Realistic Environments [SPIRE] Workshop (pp. 23-24). Groningen.
  • Bosker, H. R. (2016). Our own speech rate influences speech perception. In J. Barnes, A. Brugos, S. Stattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 227-231).

    Abstract

    During conversation, spoken utterances occur in rich acoustic contexts, including speech produced by our interlocutor(s) and speech we produced ourselves. Prosodic characteristics of the acoustic context have been known to influence speech perception in a contrastive fashion: for instance, a vowel presented in a fast context is perceived to have a longer duration than the same vowel in a slow context. Given the ubiquity of the sound of our own voice, it may be that our own speech rate - a common source of acoustic context - also influences our perception of the speech of others. Two experiments were designed to test this hypothesis. Experiment 1 replicated earlier contextual rate effects by showing that hearing pre-recorded fast or slow context sentences alters the perception of ambiguous Dutch target words. Experiment 2 then extended this finding by showing that talking at a fast or slow rate prior to the presentation of the target words also altered the perception of those words. These results suggest that between-talker variation in speech rate production may induce between-talker variation in speech perception, thus potentially explaining why interlocutors tend to converge on speech rate in dialogue settings.

    Additional information

    pdf via conference website227
  • Broeder, D., Declerck, T., Romary, L., Uneson, M., Strömqvist, S., & Wittenburg, P. (2004). A large metadata domain of language resources. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 369-372). Paris: European Language Resources Association.
  • Broeder, D., Schuurman, I., & Windhouwer, M. (2014). Experiences with the ISOcat Data Category Registry. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 4565-4568).
  • Broeder, D., Nava, M., & Declerck, T. (2004). INTERA - a Distributed Domain of Metadata Resources. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Spoken Language Resources and Evaluation (LREC 2004) (pp. 369-372). Paris: European Language Resources Association.
  • Broeder, D., Wittenburg, P., & Crasborn, O. (2004). Using Profiles for IMDI Metadata Creation. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 1317-1320). Paris: European Language Resources Association.
  • Broeder, D., Brugman, H., Oostdijk, N., & Wittenburg, P. (2004). Towards Dynamic Corpora: Workshop on compiling and processing spoken corpora. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC 2004) (pp. 59-62). Paris: European Language Resource Association.
  • Broersma, M., & Kolkman, K. M. (2004). Lexical representation of non-native phonemes. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 1241-1244). Seoul: Sunjijn Printing Co.
  • Bruggeman, L., & Cutler, A. (2016). Lexical manipulation as a discovery tool for psycholinguistic research. In C. Carignan, & M. D. Tyler (Eds.), Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016) (pp. 313-316).
  • Brugman, H., & Russel, A. (2004). Annotating Multi-media/Multi-modal resources with ELAN. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Language Evaluation (LREC 2004) (pp. 2065-2068). Paris: European Language Resources Association.
  • Brugman, H., Crasborn, O., & Russel, A. (2004). Collaborative annotation of sign language data with Peer-to-Peer technology. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Language Evaluation (LREC 2004) (pp. 213-216). Paris: European Language Resources Association.
  • Burenhult, N. (2004). Spatial deixis in Jahai. In S. Burusphat (Ed.), Papers from the 11th Annual Meeting of the Southeast Asian Linguistics Society 2001 (pp. 87-100). Arizona State University: Program for Southeast Asian Studies.
  • Chen, A. (2014). Production-comprehension (A)Symmetry: Individual differences in the acquisition of prosodic focus-marking. In N. Campbell, D. Gibbon, & D. Hirst (Eds.), Proceedings of Speech Prosody 2014 (pp. 423-427).

    Abstract

    Previous work based on different groups of children has shown that four- to five-year-old children are similar to adults in both producing and comprehending the focus-toaccentuation mapping in Dutch, contra the alleged productionprecedes- comprehension asymmetry in earlier studies. In the current study, we addressed the question of whether there are individual differences in the production-comprehension (a)symmetricity. To this end, we examined the use of prosody in focus marking in production and the processing of focusrelated prosody in online language comprehension in the same group of 4- to 5-year-olds. We have found that the relationship between comprehension and production can be rather diverse at an individual level. This result suggests some degree of independence in learning to use prosody to mark focus in production and learning to process focus-related prosodic information in online language comprehension, and implies influences of other linguistic and non-linguistic factors on the production-comprehension (a)symmetricity
  • Chen, A., Chen, A., Kager, R., & Wong, P. (2014). Rises and falls in Dutch and Mandarin Chinese. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 83-86).

    Abstract

    Despite of the different functions of pitch in tone and nontone languages, rises and falls are common pitch patterns across different languages. In the current study, we ask what is the language specific phonetic realization of rises and falls. Chinese and Dutch speakers participated in a production experiment. We used contexts composed for conveying specific communicative purposes to elicit rises and falls. We measured both tonal alignment and tonal scaling for both patterns. For the alignment measurements, we found language specific patterns for the rises, but for falls. For rises, both peak and valley were aligned later among Chinese speakers compared to Dutch speakers. For all the scaling measurements (maximum pitch, minimum pitch, and pitch range), no language specific patterns were found for either the rises or the falls
  • Cho, T., & Johnson, E. K. (2004). Acoustic correlates of phrase-internal lexical boundaries in Dutch. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 1297-1300). Seoul: Sunjin Printing Co.

    Abstract

    The aim of this study was to determine if Dutch speakers reliably signal phrase-internal lexical boundaries, and if so, how. Six speakers recorded 4 pairs of phonemically identical strong-weak-strong (SWS) strings with matching syllable boundaries but mismatching intended word boundaries (e.g. reis # pastei versus reispas # tij, or more broadly C1V2(C)#C2V2(C)C3V3(C) vs. C1V2(C)C2V2(C)#C3V3(C)). An Analysis of Variance revealed 3 acoustic parameters that were significantly greater in S#WS items (C2 DURATION, RIME1 DURATION, C3 BURST AMPLITUDE) and 5 parameters that were significantly greater in the SW#S items (C2 VOT, C3 DURATION, RIME2 DURATION, RIME3 DURATION, and V2 AMPLITUDE). Additionally, center of gravity measurements suggested that the [s] to [t] coarticulation was greater in reis # pa[st]ei versus reispa[s] # [t]ij. Finally, a Logistic Regression Analysis revealed that the 3 parameters (RIME1 DURATION, RIME2 DURATION, and C3 DURATION) contributed most reliably to a S#WS versus SW#S classification.
  • Cho, T., & McQueen, J. M. (2004). Phonotactics vs. phonetic cues in native and non-native listening: Dutch and Korean listeners' perception of Dutch and English. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 1301-1304). Seoul: Sunjijn Printing Co.

    Abstract

    We investigated how listeners of two unrelated languages, Dutch and Korean, process phonotactically legitimate and illegitimate sounds spoken in Dutch and American English. To Dutch listeners, unreleased word-final stops are phonotactically illegal because word-final stops in Dutch are generally released in isolation, but to Korean listeners, released final stops are illegal because word-final stops are never released in Korean. Two phoneme monitoring experiments showed a phonotactic effect: Dutch listeners detected released stops more rapidly than unreleased stops whereas the reverse was true for Korean listeners. Korean listeners with English stimuli detected released stops more accurately than unreleased stops, however, suggesting that acoustic-phonetic cues associated with released stops improve detection accuracy. We propose that in non-native speech perception, phonotactic legitimacy in the native language speeds up phoneme recognition, the richness of acousticphonetic cues improves listening accuracy, and familiarity with the non-native language modulates the relative influence of these two factors.
  • Clark, N., & Perlman, M. (2014). Breath, vocal, and supralaryngeal flexibility in a human-reared gorilla. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-15).

    Abstract

    “Gesture-first” theories dismiss ancestral great apes’ vocalization as a substrate for language evolution based on the claim that extant apes exhibit minimal learning and volitional control of vocalization. Contrary to this claim, we present data of novel learned and voluntarily controlled vocal behaviors produced by a human-fostered gorilla (G. gorilla gorilla). These behaviors demonstrate varying degrees of flexibility in the vocal apparatus (including diaphragm, lungs, larynx, and supralaryngeal articulators), and are predominantly performed in coordination with manual behaviors and gestures. Instead of a gesture-first theory, we suggest that these findings support multimodal theories of language evolution in which vocal and gestural forms are coordinated and supplement one another
  • Cooper, N., & Cutler, A. (2004). Perception of non-native phonemes in noise. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 469-472). Seoul: Sunjijn Printing Co.

    Abstract

    We report an investigation of the perception of American English phonemes by Dutch listeners proficient in English. Listeners identified either the consonant or the vowel in most possible English CV and VC syllables. The syllables were embedded in multispeaker babble at three signal-to-noise ratios (16 dB, 8 dB, and 0 dB). Effects of signal-to-noise ratio on vowel and consonant identification are discussed as a function of syllable position and of relationship to the native phoneme inventory. Comparison of the results with previously reported data from native listeners reveals that noise affected the responding of native and non-native listeners similarly.
  • Crago, M. B., Allen, S. E. M., & Pesco, D. (1998). Issues of Complexity in Inuktitut and English Child Directed Speech. In Proceedings of the twenty-ninth Annual Stanford Child Language Research Forum (pp. 37-46).
  • Crasborn, O., Hulsbosch, M., Lampen, L., & Sloetjes, H. (2014). New multilayer concordance functions in ELAN and TROVA. In Proceedings of the Tilburg Gesture Research Meeting [TiGeR 2013].

    Abstract

    Collocations generated by concordancers are a standard instrument in the exploitation of text corpora for the analysis of language use. Multimodal corpora show similar types of patterns, activities that frequently occur together, but there is no tool that offers facilities for visualising such patterns. Examples include timing of eye contact with respect to speech, and the alignment of activities of the two hands in signed languages. This paper describes recent enhancements to the standard CLARIN tools ELAN and TROVA for multimodal annotation to address these needs: first of all the query and concordancing functions were improved, and secondly the tools now generate visualisations of multilayer collocations that allow for intuitive explorations and analyses of multimodal data. This will provide a boost to the linguistic fields of gesture and sign language studies, as it will improve the exploitation of multimodal corpora.
  • Crasborn, O., & Sloetjes, H. (2014). Improving the exploitation of linguistic annotations in ELAN. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3604-3608).

    Abstract

    This paper discusses some improvements in recent and planned versions of the multimodal annotation tool ELAN, which are targeted at improving the usability of annotated files. Increased support for multilingual documents is provided, by allowing for multilingual vocabularies and by specifying a language per document, annotation layer (tier) or annotation. In addition, improvements in the search possibilities and the display of the results have been implemented, which are especially relevant in the interpretation of the results of complex multi-tier searches.
  • Croijmans, I., & Majid, A. (2016). Language does not explain the wine-specific memory advantage of wine experts. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 141-146). Austin, TX: Cognitive Science Society.

    Abstract

    Although people are poor at naming odors, naming a smell helps to remember that odor. Previous studies show wine experts have better memory for smells, and they also name smells differently than novices. Is wine experts’ odor memory is verbally mediated? And is the odor memory advantage that experts have over novices restricted to odors in their domain of expertise, or does it generalize? Twenty-four wine experts and 24 novices smelled wines, wine-related odors and common odors, and remembered these. Half the participants also named the smells. Wine experts had better memory for wines, but not for the other odors, indicating their memory advantage is restricted to wine. Wine experts named odors better than novices, but there was no relationship between experts’ ability to name odors and their memory for odors. This suggests experts’ odor memory advantage is not linguistically mediated, but may be the result of differential perceptual learning
  • Cutler, A., & Otake, T. (1998). Assimilation of place in Japanese and Dutch. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: vol. 5 (pp. 1751-1754). Sydney: ICLSP.

    Abstract

    Assimilation of place of articulation across a nasal and a following stop consonant is obligatory in Japanese, but not in Dutch. In four experiments the processing of assimilated forms by speakers of Japanese and Dutch was compared, using a task in which listeners blended pseudo-word pairs such as ranga-serupa. An assimilated blend of this pair would be rampa, an unassimilated blend rangpa. Japanese listeners produced significantly more assimilated than unassimilated forms, both with pseudo-Japanese and pseudo-Dutch materials, while Dutch listeners produced significantly more unassimilated than assimilated forms in each materials set. This suggests that Japanese listeners, whose native-language phonology involves obligatory assimilation constraints, represent the assimilated nasals in nasal-stop sequences as unmarked for place of articulation, while Dutch listeners, who are accustomed to hearing unassimilated forms, represent the same nasal segments as marked for place of articulation.
  • Ip, M., & Cutler, A. (2016). Cross-language data on five types of prosodic focus. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 330-334).

    Abstract

    To examine the relative roles of language-specific and language-universal mechanisms in the production of prosodic focus, we compared production of five different types of focus by native speakers of English and Mandarin. Two comparable dialogues were constructed for each language, with the same words appearing in focused and unfocused position; 24 speakers recorded each dialogue in each language. Duration, F0 (mean, maximum, range), and rms-intensity (mean, maximum) of all critical word tokens were measured. Across the different types of focus, cross-language differences were observed in the degree to which English versus Mandarin speakers use the different prosodic parameters to mark focus, suggesting that while prosody may be universally available for expressing focus, the means of its employment may be considerably language-specific
  • Cutler, A. (1998). How listeners find the right words. In Proceedings of the Sixteenth International Congress on Acoustics: Vol. 2 (pp. 1377-1380). Melville, NY: Acoustical Society of America.

    Abstract

    Languages contain tens of thousands of words, but these are constructed from a tiny handful of phonetic elements. Consequently, words resemble one another, or can be embedded within one another, a coup stick snot with standing. me process of spoken-word recognition by human listeners involves activation of multiple word candidates consistent with the input, and direct competition between activated candidate words. Further, human listeners are sensitive, at an early, prelexical, stage of speeeh processing, to constraints on what could potentially be a word of the language.
  • Cutler, A., Treiman, R., & Van Ooijen, B. (1998). Orthografik inkoncistensy ephekts in foneme detektion? In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2783-2786). Sydney: ICSLP.

    Abstract

    The phoneme detection task is widely used in spoken word recognition research. Alphabetically literate participants, however, are more used to explicit representations of letters than of phonemes. The present study explored whether phoneme detection is sensitive to how target phonemes are, or may be, orthographically realised. Listeners detected the target sounds [b,m,t,f,s,k] in word-initial position in sequences of isolated English words. Response times were faster to the targets [b,m,t], which have consistent word-initial spelling, than to the targets [f,s,k], which are inconsistently spelled, but only when listeners’ attention was drawn to spelling by the presence in the experiment of many irregularly spelled fillers. Within the inconsistent targets [f,s,k], there was no significant difference between responses to targets in words with majority and minority spellings. We conclude that performance in the phoneme detection task is not necessarily sensitive to orthographic effects, but that salient orthographic manipulation can induce such sensitivity.
  • Cutler, A., Norris, D., & Sebastián-Gallés, N. (2004). Phonemic repertoire and similarity within the vocabulary. In S. Kin, & M. J. Bae (Eds.), Proceedings of the 8th International Conference on Spoken Language Processing (Interspeech 2004-ICSLP) (pp. 65-68). Seoul: Sunjijn Printing Co.

    Abstract

    Language-specific differences in the size and distribution of the phonemic repertoire can have implications for the task facing listeners in recognising spoken words. A language with more phonemes will allow shorter words and reduced embedding of short words within longer ones, decreasing the potential for spurious lexical competitors to be activated by speech signals. We demonstrate that this is the case via comparative analyses of the vocabularies of English and Spanish. A language which uses suprasegmental as well as segmental contrasts, however, can substantially reduce the extent of spurious embedding.
  • Cutler, A. (1998). The recognition of spoken words with variable representations. In D. Duez (Ed.), Proceedings of the ESCA Workshop on Sound Patterns of Spontaneous Speech (pp. 83-92). Aix-en-Provence: Université de Aix-en-Provence.
  • Dalli, A., Tablan, V., Bontcheva, K., Wilks, Y., Broeder, D., Brugman, H., & Wittenburg, P. (2004). Web services architecture for language resources. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC2004) (pp. 365-368). Paris: ELRA - European Language Resources Association.
  • Dediu, D., & Moisik, S. R. (2016). Anatomical biasing of click learning and production: An MRI and 3d palate imaging study. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/57.html.

    Abstract

    The current paper presents results for data on click learning obtained from a larger imaging study (using MRI and 3D intraoral scanning) designed to quantify and characterize intra- and inter-population variation of vocal tract structures and the relation of this to speech production. The aim of the click study was to ascertain whether and to what extent vocal tract morphology influences (1) the ability to learn to produce clicks and (2) the productions of those that successfully learn to produce these sounds. The results indicate that the presence of an alveolar ridge certainly does not prevent an individual from learning to produce click sounds (1). However, the subtle details of how clicks are produced may indeed be driven by palate shape (2).
  • Dediu, D., & Moisik, S. (2016). Defining and counting phonological classes in cross-linguistic segment databases. In N. Calzolari, K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2016: 10th International Conference on Language Resources and Evaluation (pp. 1955-1962). Paris: European Language Resources Association (ELRA).

    Abstract

    Recently, there has been an explosion in the availability of large, good-quality cross-linguistic databases such as WALS (Dryer & Haspelmath, 2013), Glottolog (Hammarstrom et al., 2015) and Phoible (Moran & McCloy, 2014). Databases such as Phoible contain the actual segments used by various languages as they are given in the primary language descriptions. However, this segment-level representation cannot be used directly for analyses that require generalizations over classes of segments that share theoretically interesting features. Here we present a method and the associated R (R Core Team, 2014) code that allows the exible denition of such meaningful classes and that can identify the sets of segments falling into such a class for any language inventory. The method and its results are important for those interested in exploring cross-linguistic patterns of phonetic and phonological diversity and their relationship to extra-linguistic factors and processes such as climate, economics, history or human genetics.
  • Dediu, D., & Levinson, S. C. (2014). Language and speech are old: A review of the evidence and consequences for modern linguistic diversity. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 421-422). Singapore: World Scientific.
  • Dingemanse, M., Torreira, F., & Enfield, N. J. (2014). Conversational infrastructure and the convergent evolution of linguistic items. In E. A. Cartmill, S. G. Roberts, H. Lyn, & H. Cornish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 425-426). Singapore: World Scientific.
  • Dingemanse, M., Verhoef, T., & Roberts, S. G. (2014). The role of iconicity in the cultural evolution of communicative signals. In B. De Boer, & T. Verhoef (Eds.), Proceedings of Evolang X, Workshop on Signals, Speech, and Signs (pp. 11-15).
  • Dolscheid, S., Willems, R. M., Hagoort, P., & Casasanto, D. (2014). The relation of space and musical pitch in the brain. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 421-426). Austin, Tx: Cognitive Science Society.

    Abstract

    Numerous experiments show that space and musical pitch are closely linked in people's minds. However, the exact nature of space-pitch associations and their neuronal underpinnings are not well understood. In an fMRI experiment we investigated different types of spatial representations that may underlie musical pitch. Participants judged stimuli that varied in spatial height in both the visual and tactile modalities, as well as auditory stimuli that varied in pitch height. In order to distinguish between unimodal and multimodal spatial bases of musical pitch, we examined whether pitch activations were present in modality-specific (visual or tactile) versus multimodal (visual and tactile) regions active during spatial height processing. Judgments of musical pitch were found to activate unimodal visual areas, suggesting that space-pitch associations may involve modality-specific spatial representations, supporting a key assumption of embodied theories of metaphorical mental representation.
  • Doumas, L. A., & Martin, A. E. (2016). Abstraction in time: Finding hierarchical linguistic structure in a model of relational processing. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 2279-2284). Austin, TX: Cognitive Science Society.

    Abstract

    Abstract mental representation is fundamental for human cognition. Forming such representations in time, especially from dynamic and noisy perceptual input, is a challenge for any processing modality, but perhaps none so acutely as for language processing. We show that LISA (Hummel & Holyaok, 1997) and DORA (Doumas, Hummel, & Sandhofer, 2008), models built to process and to learn structured (i.e., symbolic) rep resentations of conceptual properties and relations from unstructured inputs, show oscillatory activation during processing that is highly similar to the cortical activity elicited by the linguistic stimuli from Ding et al.(2016). We argue, as Ding et al.(2016), that this activation reflects formation of hierarchical linguistic representation, and furthermore, that the kind of computational mechanisms in LISA/DORA (e.g., temporal binding by systematic asynchrony of firing) may underlie formation of abstract linguistic representations in the human brain. It may be this repurposing that allowed for the generation or mergence of hierarchical linguistic structure, and therefore, human language, from extant cognitive and neural systems. We conclude that models of thinking and reasoning and models of language processing must be integrated —not only for increased plausiblity, but in order to advance both fields towards a larger integrative model of human cognition
  • Drozd, K. F. (1998). No as a determiner in child English: A summary of categorical evidence. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the Gala '97 Conference on Language Acquisition (pp. 34-39). Edinburgh, UK: Edinburgh University Press,.

    Abstract

    This paper summarizes the results of a descriptive syntactic category analysis of child English no which reveals that young children use and represent no as a determiner and negatives like no pen as NPs, contra standard analyses.
  • Drozdova, P., Van Hout, R., & Scharenborg, O. (2016). Processing and adaptation to ambiguous sounds during the course of perceptual learning. In Proceedings of Interspeech 2016: The 17th Annual Conference of the International Speech Communication Association (pp. 2811-2815). doi:10.21437/Interspeech.2016-814.

    Abstract

    Listeners use their lexical knowledge to interpret ambiguous sounds, and retune their phonetic categories to include this ambiguous sound. Although there is ample evidence for lexically-guided retuning, the adaptation process is not fully understood. Using a lexical decision task with an embedded auditory semantic priming task, the present study investigates whether words containing an ambiguous sound are processed in the same way as “natural” words and whether adaptation to the ambiguous sound tends to equalize the processing of “ambiguous” and natural words. Analyses of the yes/no responses and reaction times to natural and “ambiguous” words showed that words containing an ambiguous sound were accepted as words less often and were processed slower than the same words without ambiguity. The difference in acceptance disappeared after exposure to approximately 15 ambiguous items. Interestingly, lower acceptance rates and slower processing did not have an effect on the processing of semantic information of the following word. However, lower acceptance rates of ambiguous primes predict slower reaction times of these primes, suggesting an important role of stimulus-specific characteristics in triggering lexically-guided perceptual learning.
  • Drozdova, P., Van Hout, R., & Scharenborg, O. (2014). Phoneme category retuning in a non-native language. In Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 553-557).

    Abstract

    Previous studies have demonstrated that native listeners modify their interpretation of a speech sound when a talker produces an ambiguous sound in order to quickly tune into a speaker, but there is hardly any evidence that non-native listeners employ a similar mechanism when encountering ambiguous pronunciations. So far, one study demonstrated this lexically-guided perceptual learning effect for nonnatives, using phoneme categories similar in the native language of the listeners and the non-native language of the stimulus materials. The present study investigates the question whether phoneme category retuning is possible in a nonnative language for a contrast, /l/-/r/, which is phonetically differently embedded in the native (Dutch) and nonnative (English) languages involved. Listening experiments indeed showed a lexically-guided perceptual learning effect. Assuming that Dutch listeners have different phoneme categories for the native Dutch and non-native English /r/, as marked differences between the languages exist for /r/, these results, for the first time, seem to suggest that listeners are not only able to retune their native phoneme categories but also their non-native phoneme categories to include ambiguous pronunciations.
  • Enfield, N. J. (2004). Areal grammaticalisation of postverbal 'acquire' in mainland Southeast Asia. In S. Burusphat (Ed.), Proceedings of the 11th Southeast Asia Linguistics Society Meeting (pp. 275-296). Arizona State University: Tempe.
  • Ernestus, M., Kočková-Amortová, L., & Pollak, P. (2014). The Nijmegen corpus of casual Czech. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 365-370).

    Abstract

    This article introduces a new speech corpus, the Nijmegen Corpus of Casual Czech (NCCCz), which contains more than 30 hours of high-quality recordings of casual conversations in Common Czech, among ten groups of three male and ten groups of three female friends. All speakers were native speakers of Czech, raised in Prague or in the region of Central Bohemia, and were between 19 and 26 years old. Every group of speakers consisted of one confederate, who was instructed to keep the conversations lively, and two speakers naive to the purposes of the recordings. The naive speakers were engaged in conversations for approximately 90 minutes, while the confederate joined them for approximately the last 72 minutes. The corpus was orthographically annotated by experienced transcribers and this orthographic transcription was aligned with the speech signal. In addition, the conversations were videotaped. This corpus can form the basis for all types of research on casual conversations in Czech, including phonetic research and research on how to improve automatic speech recognition. The corpus will be freely available
  • Eryilmaz, K., Little, H., & De Boer, B. (2016). Using HMMs To Attribute Structure To Artificial Languages. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/125.html.

    Abstract

    We investigated the use of Hidden Markov Models (HMMs) as a way of representing repertoires of continuous signals in order to infer their building blocks. We tested the idea on a dataset from an artificial language experiment. The study demonstrates using HMMs for this purpose is viable, but also that there is a lot of room for refinement such as explicit duration modeling, incorporation of autoregressive elements and relaxing the Markovian assumption, in order to accommodate specific details.
  • Filippi, P. (2014). Linguistic animals: understanding language through a comparative approach. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Crnish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 74-81). doi:10.1142/9789814603638_0082.

    Abstract

    With the aim to clarify the definition of humans as “linguistic animals”, in the present paper I functionally distinguish three types of language competences: i) language as a general biological tool for communication, ii) “perceptual syntax”, iii) propositional language. Following this terminological distinction, I review pivotal findings on animals' communication systems, which constitute useful evidence for the investigation of the nature of three core components of humans' faculty of language: semantics, syntax, and theory of mind. In fact, despite the capacity to process and share utterances with an open-ended structure is uniquely human, some isolated components of our linguistic competence are in common with nonhuman animals. Therefore, as I argue in the present paper, the investigation of animals' communicative competence provide crucial insights into the range of cognitive constraints underlying humans' ability of language, enabling at the same time the analysis of its phylogenetic path as well as of the selective pressures that have led to its emergence.
  • Filippi, P., Congdon, J. V., Hoang, J., Bowling, D. L., Reber, S., Pašukonis, A., Hoeschele, M., Ocklenburg, S., de Boer, B., Sturdy, C. B., Newen, A., & Güntürkün, O. (2016). Humans Recognize Vocal Expressions Of Emotional States Universally Across Species. In The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/91.html.

    Abstract

    The perception of danger in the environment can induce physiological responses (such as a heightened state of arousal) in animals, which may cause measurable changes in the prosodic modulation of the voice (Briefer, 2012). The ability to interpret the prosodic features of animal calls as an indicator of emotional arousal may have provided the first hominins with an adaptive advantage, enabling, for instance, the recognition of a threat in the surroundings. This ability might have paved the ability to process meaningful prosodic modulations in the emerging linguistic utterances.
  • Filippi, P., Ocklenburg, S., Bowling, D. L., Heege, L., Newen, A., Güntürkün, O., & de Boer, B. (2016). Multimodal Processing Of Emotional Meanings: A Hypothesis On The Adaptive Value Of Prosody. In The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/90.html.

    Abstract

    Humans combine multiple sources of information to comprehend meanings. These sources can be characterized as linguistic (i.e., lexical units and/or sentences) or paralinguistic (e.g. body posture, facial expression, voice intonation, pragmatic context). Emotion communication is a special case in which linguistic and paralinguistic dimensions can simultaneously denote the same, or multiple incongruous referential meanings. Think, for instance, about when someone says “I’m sad!”, but does so with happy intonation and a happy facial expression. Here, the communicative channels express very specific (although conflicting) emotional states as denotations. In such cases of intermodal incongruence, are we involuntarily biased to respond to information in one channel over the other? We hypothesize that humans are involuntary biased to respond to prosody over verbal content and facial expression, since the ability to communicate socially relevant information such as basic emotional states through prosodic modulation of the voice might have provided early hominins with an adaptive advantage that preceded the emergence of segmental speech (Darwin 1871; Mithen, 2005). To address this hypothesis, we examined the interaction between multiple communicative channels in recruiting attentional resources, within a Stroop interference task (i.e. a task in which different channels give conflicting information; Stroop, 1935). In experiment 1, we used synonyms of “happy” and “sad” spoken with happy and sad prosody. Participants were asked to identify the emotion expressed by the verbal content while ignoring prosody (Word task) or vice versa (Prosody task). Participants responded faster and more accurately in the Prosody task. Within the Word task, incongruent stimuli were responded to more slowly and less accurately than congruent stimuli. In experiment 2, we adopted synonyms of “happy” and “sad” spoken in happy and sad prosody, while a happy or sad face was displayed. Participants were asked to identify the emotion expressed by the verbal content while ignoring prosody and face (Word task), to identify the emotion expressed by prosody while ignoring verbal content and face (Prosody task), or to identify the emotion expressed by the face while ignoring prosody and verbal content (Face task). Participants responded faster in the Face task and less accurately when the two non-focused channels were expressing an emotion that was incongruent with the focused one, as compared with the condition where all the channels were congruent. In addition, in the Word task, accuracy was lower when prosody was incongruent to verbal content and face, as compared with the condition where all the channels were congruent. Our data suggest that prosody interferes with emotion word processing, eliciting automatic responses even when conflicting with both verbal content and facial expressions at the same time. In contrast, although processed significantly faster than prosody and verbal content, faces alone are not sufficient to interfere in emotion processing within a three-dimensional Stroop task. Our findings align with the hypothesis that the ability to communicate emotions through prosodic modulation of the voice – which seems to be dominant over verbal content - is evolutionary older than the emergence of segmental articulation (Mithen, 2005; Fitch, 2010). This hypothesis fits with quantitative data suggesting that prosody has a vital role in the perception of well-formed words (Johnson & Jusczyk, 2001), in the ability to map sounds to referential meanings (Filippi et al., 2014), and in syntactic disambiguation (Soderstrom et al., 2003). This research could complement studies on iconic communication within visual and auditory domains, providing new insights for models of language evolution. Further work aimed at how emotional cues from different modalities are simultaneously integrated will improve our understanding of how humans interpret multimodal emotional meanings in real life interactions.
  • Filippi, P., Gingras, B., & Fitch, W. T. (2014). The effect of pitch enhancement on spoken language acquisition. In E. A. Cartmill, S. Roberts, H. Lyn, & H. Crnish (Eds.), The Evolution of Language: Proceedings of the 10th International Conference (pp. 437-438). doi:10.1142/9789814603638_0082.

    Abstract

    The aim of this study is to investigate the word-learning phenomenon utilizing a new model that integrates three processes: a) extracting a word out of a continuous sounds sequence, b) inducing referential meanings, c) mapping a word onto its intended referent, with the possibility to extend the acquired word over a potentially infinite sets of objects of the same semantic category, and over not-previously-heard utterances. Previous work has examined the role of statistical learning and/or of prosody in each of these processes separately. In order to examine the multilayered word-learning task, we integrate these two strands of investigation into a single approach. We have conducted the study on adults and included six different experimental conditions, each including specific perceptual manipulations of the signal. In condition 1, the only cue to word-meaning mapping was the co-occurrence between words and referents (“statistical cue”). This cue was present in all the conditions. In condition 2, we added infant-directed-speech (IDS) typical pitch enhancement as a marker of the target word and of the statistical cue. In condition 3 we placed IDS typical pitch enhancement on random words of the utterances, i.e. inconsistently matching the statistical cue. In conditions 4, 5 and 6 we manipulated respectively duration, a non-prosodic acoustic cue and a visual cue as markers of the target word and of the statistical cue. Systematic comparisons between learning performance in condition 1 with the other conditions revealed that the word-learning process is facilitated only when pitch prominence consistently marks the target word and the statistical cue…
  • Floyd, S. (2004). Purismo lingüístico y realidad local: ¿Quichua puro o puro quichuañol? In Proceedings of the Conference on Indigenous Languages of Latin America (CILLA)-I.
  • Francisco, A. A., Jesse, A., Groen, M. a., & McQueen, J. M. (2014). Audiovisual temporal sensitivity in typical and dyslexic adult readers. In Proceedings of the 15th Annual Conference of the International Speech Communication Association (INTERSPEECH 2014) (pp. 2575-2579).

    Abstract

    Reading is an audiovisual process that requires the learning of systematic links between graphemes and phonemes. It is thus possible that reading impairments reflect an audiovisual processing deficit. In this study, we compared audiovisual processing in adults with developmental dyslexia and adults without reading difficulties. We focused on differences in cross-modal temporal sensitivity both for speech and for non-speech events. When compared to adults without reading difficulties, adults with developmental dyslexia presented a wider temporal window in which unsynchronized speech events were perceived as synchronized. No differences were found between groups for the non-speech events. These results suggests a deficit in dyslexia in the perception of cross-modal temporal synchrony for speech events.
  • Frost, R. L. A., Monaghan, P., & Christiansen, M. H. (2016). Using Statistics to Learn Words and Grammatical Categories: How High Frequency Words Assist Language Acquisition. In A. Papafragou, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 81-86). Austin, Tx: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2016/papers/0027/index.html.

    Abstract

    Recent studies suggest that high-frequency words may benefit speech segmentation (Bortfeld, Morgan, Golinkoff, & Rathbun, 2005) and grammatical categorisation (Monaghan, Christiansen, & Chater, 2007). To date, these tasks have been examined separately, but not together. We familiarised adults with continuous speech comprising repetitions of target words, and compared learning to a language in which targets appeared alongside high-frequency marker words. Marker words reliably preceded targets, and distinguished them into two otherwise unidentifiable categories. Participants completed a 2AFC segmentation test, and a similarity judgement categorisation test. We tested transfer to a word-picture mapping task, where words from each category were used either consistently or inconsistently to label actions/objects. Participants segmented the speech successfully, but only demonstrated effective categorisation when speech contained high-frequency marker words. The advantage of marker words extended to the early stages of the transfer task. Findings indicate the same high-frequency words may assist speech segmentation and grammatical categorisation.
  • Gannon, E., He, J., Gao, X., & Chaparro, B. (2016). RSVP Reading on a Smart Watch. In Proceedings of the Human Factors and Ergonomics Society 2016 Annual Meeting (pp. 1130-1134).

    Abstract

    Reading with Rapid Serial Visual Presentation (RSVP) has shown promise for optimizing screen space and increasing reading speed without compromising comprehension. Given the wide use of small-screen devices, the present study compared RSVP and traditional reading on three types of reading comprehension, reading speed, and subjective measures on a smart watch. Results confirm previous studies that show faster reading speed with RSVP without detracting from comprehension. Subjective data indicate that Traditional is strongly preferred to RSVP as a primary reading method. Given the optimal use of screen space, increased speed and comparable comprehension, future studies should focus on making RSVP a more comfortable format.
  • Ganushchak, L. Y., & Acheson, D. J. (Eds.). (2014). What's to be learned from speaking aloud? - Advances in the neurophysiological measurement of overt language production. [Research topic] [Special Issue]. Frontiers in Language Sciences. Retrieved from http://www.frontiersin.org/Language_Sciences/researchtopics/What_s_to_be_Learned_from_Spea/1671.

    Abstract

    Researchers have long avoided neurophysiological experiments of overt speech production due to the suspicion that artifacts caused by muscle activity may lead to a bad signal-to-noise ratio in the measurements. However, the need to actually produce speech may influence earlier processing and qualitatively change speech production processes and what we can infer from neurophysiological measures thereof. Recently, however, overt speech has been successfully investigated using EEG, MEG, and fMRI. The aim of this Research Topic is to draw together recent research on the neurophysiological basis of language production, with the aim of developing and extending theoretical accounts of the language production process. In this Research Topic of Frontiers in Language Sciences, we invite both experimental and review papers, as well as those about the latest methods in acquisition and analysis of overt language production data. All aspects of language production are welcome: i.e., from conceptualization to articulation during native as well as multilingual language production. Focus should be placed on using the neurophysiological data to inform questions about the processing stages of language production. In addition, emphasis should be placed on the extent to which the identified components of the electrophysiological signal (e.g., ERP/ERF, neuronal oscillations, etc.), brain areas or networks are related to language comprehension and other cognitive domains. By bringing together electrophysiological and neuroimaging evidence on language production mechanisms, a more complete picture of the locus of language production processes and their temporal and neurophysiological signatures will emerge.
  • Gebre, B. G., Wittenburg, P., Heskes, T., & Drude, S. (2014). Motion history images for online speaker/signer diarization. In Proceedings of the 2014 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) (pp. 1537-1541). Piscataway, NJ: IEEE.

    Abstract

    We present a solution to the problem of online speaker/signer diarization - the task of determining "who spoke/signed when?". Our solution is based on the idea that gestural activity (hands and body movement) is highly correlated with uttering activity. This correlation is necessarily true for sign languages and mostly true for spoken languages. The novel part of our solution is the use of motion history images (MHI) as a likelihood measure for probabilistically detecting uttering activities. MHI is an efficient representation of where and how motion occurred for a fixed period of time. We conducted experiments on 4.9 hours of a publicly available dataset (the AMI meeting data) and 1.4 hours of sign language dataset (Kata Kolok data). The best performance obtained is 15.70% for sign language and 31.90% for spoken language (measurements are in DER). These results show that our solution is applicable in real-world applications like video conferences.

    Files private

    Request files
  • Gebre, B. G., Wittenburg, P., Drude, S., Huijbregts, M., & Heskes, T. (2014). Speaker diarization using gesture and speech. In H. Li, & P. Ching (Eds.), Proceedings of Interspeech 2014: 15th Annual Conference of the International Speech Communication Association (pp. 582-586).

    Abstract

    We demonstrate how the problem of speaker diarization can be solved using both gesture and speaker parametric models. The novelty of our solution is that we approach the speaker diarization problem as a speaker recognition problem after learning speaker models from speech samples corresponding to gestures (the occurrence of gestures indicates the presence of speech and the location of gestures indicates the identity of the speaker). This new approach offers many advantages: comparable state-of-the-art performance, faster computation and more adaptability. In our implementation, parametric models are used to model speakers' voice and their gestures: more specifically, Gaussian mixture models are used to model the voice characteristics of each person and all persons, and gamma distributions are used to model gestural activity based on features extracted from Motion History Images. Tests on 4.24 hours of the AMI meeting data show that our solution makes DER score improvements of 19% on speech-only segments and 4% on all segments including silence (the comparison is with the AMI system).
  • Gebre, B. G., Crasborn, O., Wittenburg, P., Drude, S., & Heskes, T. (2014). Unsupervised feature learning for visual sign language identification. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics: Vol 2 (pp. 370-376). Redhook, NY: Curran Proceedings.

    Abstract

    Prior research on language identification focused primarily on text and speech. In this paper, we focus on the visual modality and present a method for identifying sign languages solely from short video samples. The method is trained on unlabelled video data (unsupervised feature learning) and using these features, it is trained to discriminate between six sign languages (supervised learning). We ran experiments on video samples involving 30 signers (running for a total of 6 hours). Using leave-one-signer-out cross-validation, our evaluation on short video samples shows an average best accuracy of 84%. Given that sign languages are under-resourced, unsupervised feature learning techniques are the right tools and our results indicate that this is realistic for sign language identification.
  • Gentzsch, W., Lecarpentier, D., & Wittenburg, P. (2014). Big data in science and the EUDAT project. In Proceeding of the 2014 Annual SRII Global Conference.
  • Gerwien, J., & Flecken, M. (2016). First things first? Top-down influences on event apprehension. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 2633-2638). Austin, TX: Cognitive Science Society.

    Abstract

    Not much is known about event apprehension, the earliest stage of information processing in elicited language production studies, using pictorial stimuli. A reason for our lack of knowledge on this process is that apprehension happens very rapidly (<350 ms after stimulus onset, Griffin & Bock 2000), making it difficult to measure the process directly. To broaden our understanding of apprehension, we analyzed landing positions and onset latencies of first fixations on visual stimuli (pictures of real-world events) given short stimulus presentation times, presupposing that the first fixation directly results from information processing during apprehension
  • Guerra, E., Huettig, F., & Knoeferle, P. (2014). Assessing the time course of the influence of featural, distributional and spatial representations during reading. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2309-2314). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/402/.

    Abstract

    What does semantic similarity between two concepts mean? How could we measure it? The way in which semantic similarity is calculated might differ depending on the theoretical notion of semantic representation. In an eye-tracking reading experiment, we investigated whether two widely used semantic similarity measures (based on featural or distributional representations) have distinctive effects on sentence reading times. In other words, we explored whether these measures of semantic similarity differ qualitatively. In addition, we examined whether visually perceived spatial distance interacts with either or both of these measures. Our results showed that the effect of featural and distributional representations on reading times can differ both in direction and in its time course. Moreover, both featural and distributional information interacted with spatial distance, yet in different sentence regions and reading measures. We conclude that featural and distributional representations are distinct components of semantic representation.
  • Guerra, E., & Knoeferle, P. (2014). Spatial distance modulates reading times for sentences about social relations: evidence from eye tracking. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2315-2320). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2014/papers/403/.

    Abstract

    Recent evidence from eye tracking during reading showed that non-referential spatial distance presented in a visual context can modulate semantic interpretation of similarity relations rapidly and incrementally. In two eye-tracking reading experiments we extended these findings in two important ways; first, we examined whether other semantic domains (social relations) could also be rapidly influenced by spatial distance during sentence comprehension. Second, we aimed to further specify how abstract language is co-indexed with spatial information by varying the syntactic structure of sentences between experiments. Spatial distance rapidly modulated reading times as a function of the social relation expressed by a sentence. Moreover, our findings suggest that abstract language can be co-indexed as soon as critical information becomes available for the reader.
  • Hendricks, I., Lefever, E., Croijmans, I., Majid, A., & Van den Bosch, A. (2016). Very quaffable and great fun: Applying NLP to wine reviews. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics: Vol 2 (pp. 306-312). Stroudsburg, PA: Association for Computational Linguistics.

    Abstract

    We automatically predict properties of wines on the basis of smell and flavor de- scriptions from experts’ wine reviews. We show wine experts are capable of describ- ing their smell and flavor experiences in wine reviews in a sufficiently consistent manner, such that we can use their descrip- tions to predict properties of a wine based solely on language. The experimental re- sults show promising F-scores when using lexical and semantic information to predict the color, grape variety, country of origin, and price of a wine. This demonstrates, contrary to popular opinion, that wine ex- perts’ reviews really are informative.
  • Heyselaar, E., Hagoort, P., & Segaert, K. (2014). In dialogue with an avatar, syntax production is identical compared to dialogue with a human partner. In P. Bello, M. Guarini, M. McShane, & B. Scassellati (Eds.), Proceedings of the 36th Annual Meeting of the Cognitive Science Society (CogSci 2014) (pp. 2351-2356). Austin, Tx: Cognitive Science Society.

    Abstract

    The use of virtual reality (VR) as a methodological tool is becoming increasingly popular in behavioural research due to its seemingly limitless possibilities. This new method has not been used frequently in the field of psycholinguistics, however, possibly due to the assumption that humancomputer interaction does not accurately reflect human-human interaction. In the current study we compare participants’ language behaviour in a syntactic priming task with human versus avatar partners. Our study shows comparable priming effects between human and avatar partners (Human: 12.3%; Avatar: 12.6% for passive sentences) suggesting that VR is a valid platform for conducting language research and studying dialogue interactions.
  • Hintz, F., & Scharenborg, O. (2016). Neighbourhood density influences word recognition in native and non-native speech recognition in noise. In H. Van den Heuvel, B. Cranen, & S. Mattys (Eds.), Proceedings of the Speech Processing in Realistic Environments (SPIRE) workshop (pp. 46-47). Groningen.
  • Hintz, F., & Scharenborg, O. (2016). The effect of background noise on the activation of phonological and semantic information during spoken-word recognition. In Proceedings of Interspeech 2016: The 17th Annual Conference of the International Speech Communication Association (pp. 2816-2820).

    Abstract

    During spoken-word recognition, listeners experience phonological competition between multiple word candidates, which increases, relative to optimal listening conditions, when speech is masked by noise. Moreover, listeners activate semantic word knowledge during the word’s unfolding. Here, we replicated the effect of background noise on phonological competition and investigated to which extent noise affects the activation of semantic information in phonological competitors. Participants’ eye movements were recorded when they listened to sentences containing a target word and looked at three types of displays. The displays either contained a picture of the target word, or a picture of a phonological onset competitor, or a picture of a word semantically related to the onset competitor, each along with three unrelated distractors. The analyses revealed that, in noise, fixations to the target and to the phonological onset competitor were delayed and smaller in magnitude compared to the clean listening condition, most likely reflecting enhanced phonological competition. No evidence for the activation of semantic information in the phonological competitors was observed in noise and, surprisingly, also not in the clear. We discuss the implications of the lack of an effect and differences between the present and earlier studies.
  • Hoffmann, C. W. G., Sadakata, M., Chen, A., Desain, P., & McQueen, J. M. (2014). Within-category variance and lexical tone discrimination in native and non-native speakers. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 45-49). Nijmegen: Radboud University Nijmegen.

    Abstract

    In this paper, we show how acoustic variance within lexical tones in disyllabic Mandarin Chinese pseudowords affects discrimination abilities in both native and non-native speakers of Mandarin Chinese. Within-category acoustic variance did not hinder native speakers in discriminating between lexical tones, whereas it precludes Dutch native speakers from reaching native level performance. Furthermore, the influence of acoustic variance was not uniform but asymmetric, dependent on the presentation order of the lexical tones to be discriminated. An exploratory analysis using an active adaptive oddball paradigm was used to quantify the extent of the perceptual asymmetry. We discuss two possible mechanisms underlying this asymmetry and propose possible paradigms to investigate these mechanisms
  • Irivine, E., & Roberts, S. G. (2016). Deictic tools can limit the emergence of referential symbol systems. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/99.html.

    Abstract

    Previous experiments and models show that the pressure to communicate can lead to the emergence of symbols in specific tasks. The experiment presented here suggests that the ability to use deictic gestures can reduce the pressure for symbols to emerge in co-operative tasks. In the 'gesture-only' condition, pairs built a structure together in 'Minecraft', and could only communicate using a small range of gestures. In the 'gesture-plus' condition, pairs could also use sound to develop a symbol system if they wished. All pairs were taught a pointing convention. None of the pairs we tested developed a symbol system, and performance was no different across the two conditions. We therefore suggest that deictic gestures, and non-referential means of organising activity sequences, are often sufficient for communication. This suggests that the emergence of linguistic symbols in early hominids may have been late and patchy with symbols only emerging in contexts where they could significantly improve task success or efficiency. Given the communicative power of pointing however, these contexts may be fewer than usually supposed. An approach for identifying these situations is outlined.
  • Janssen, R., Winter, B., Dediu, D., Moisik, S. R., & Roberts, S. G. (2016). Nonlinear biases in articulation constrain the design space of language. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/86.html.

    Abstract

    In Iterated Learning (IL) experiments, a participant’s learned output serves as the next participant’s learning input (Kirby et al., 2014). IL can be used to model cultural transmission and has indicated that weak biases can be amplified through repeated cultural transmission (Kirby et al., 2007). So, for example, structural language properties can emerge over time because languages come to reflect the cognitive constraints in the individuals that learn and produce the language. Similarly, we propose that languages may also reflect certain anatomical biases. Do sound systems adapt to the affordances of the articulation space induced by the vocal tract? The human vocal tract has inherent nonlinearities which might derive from acoustics and aerodynamics (cf. quantal theory, see Stevens, 1989) or biomechanics (cf. Gick & Moisik, 2015). For instance, moving the tongue anteriorly along the hard palate to produce a fricative does not result in large changes in acoustics in most cases, but for a small range there is an abrupt change from a perceived palato-alveolar [ʃ] to alveolar [s] sound (Perkell, 2012). Nonlinearities such as these might bias all human speakers to converge on a very limited set of phonetic categories, and might even be a basis for combinatoriality or phonemic ‘universals’. While IL typically uses discrete symbols, Verhoef et al. (2014) have used slide whistles to produce a continuous signal. We conducted an IL experiment with human subjects who communicated using a digital slide whistle for which the degree of nonlinearity is controlled. A single parameter (α) changes the mapping from slide whistle position (the ‘articulator’) to the acoustics. With α=0, the position of the slide whistle maps Bark-linearly to the acoustics. As α approaches 1, the mapping gets more double-sigmoidal, creating three plateaus where large ranges of positions map to similar frequencies. In more abstract terms, α represents the strength of a nonlinear (anatomical) bias in the vocal tract. Six chains (138 participants) of dyads were tested, each chain with a different, fixed α. Participants had to communicate four meanings by producing a continuous signal using the slide-whistle in a ‘director-matcher’ game, alternating roles (cf. Garrod et al., 2007). Results show that for high αs, subjects quickly converged on the plateaus. This quick convergence is indicative of a strong bias, repelling subjects away from unstable regions already within-subject. Furthermore, high αs lead to the emergence of signals that oscillate between two (out of three) plateaus. Because the sigmoidal spaces are spatially constrained, participants increasingly used the sequential/temporal dimension. As a result of this, the average duration of signals with high α was ~100ms longer than with low α. These oscillations could be an expression of a basis for phonemic combinatoriality. We have shown that it is possible to manipulate the magnitude of an articulator-induced non-linear bias in a slide whistle IL framework. The results suggest that anatomical biases might indeed constrain the design space of language. In particular, the signaling systems in our study quickly converged (within-subject) on the use of stable regions. While these conclusions were drawn from experiments using slide whistles with a relatively strong bias, weaker biases could possibly be amplified over time by repeated cultural transmission, and likely lead to similar outcomes.
  • Janssen, R., Dediu, D., & Moisik, S. R. (2016). Simple agents are able to replicate speech sounds using 3d vocal tract model. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/97.html.

    Abstract

    Many factors have been proposed to explain why groups of people use different speech sounds in their language. These range from cultural, cognitive, environmental (e.g., Everett, et al., 2015) to anatomical (e.g., vocal tract (VT) morphology). How could such anatomical properties have led to the similarities and differences in speech sound distributions between human languages? It is known that hard palate profile variation can induce different articulatory strategies in speakers (e.g., Brunner et al., 2009). That is, different hard palate profiles might induce a kind of bias on speech sound production, easing some types of sounds while impeding others. With a population of speakers (with a proportion of individuals) that share certain anatomical properties, even subtle VT biases might become expressed at a population-level (through e.g., bias amplification, Kirby et al., 2007). However, before we look into population-level effects, we should first look at within-individual anatomical factors. For that, we have developed a computer-simulated analogue for a human speaker: an agent. Our agent is designed to replicate speech sounds using a production and cognition module in a computationally tractable manner. Previous agent models have often used more abstract (e.g., symbolic) signals. (e.g., Kirby et al., 2007). We have equipped our agent with a three-dimensional model of the VT (the production module, based on Birkholz, 2005) to which we made numerous adjustments. Specifically, we used a 4th-order Bezier curve that is able to capture hard palate variation on the mid-sagittal plane (XXX, 2015). Using an evolutionary algorithm, we were able to fit the model to human hard palate MRI tracings, yielding high accuracy fits and using as little as two parameters. Finally, we show that the samples map well-dispersed to the parameter-space, demonstrating that the model cannot generate unrealistic profiles. We can thus use this procedure to import palate measurements into our agent’s production module to investigate the effects on acoustics. We can also exaggerate/introduce novel biases. Our agent is able to control the VT model using the cognition module. Previous research has focused on detailed neurocomputation (e.g., Kröger et al., 2014) that highlights e.g., neurobiological principles or speech recognition performance. However, the brain is not the focus of our current study. Furthermore, present-day computing throughput likely does not allow for large-scale deployment of these architectures, as required by the population model we are developing. Thus, the question whether a very simple cognition module is able to replicate sounds in a computationally tractable manner, and even generalize over novel stimuli, is one worthy of attention in its own right. Our agent’s cognition module is based on running an evolutionary algorithm on a large population of feed-forward neural networks (NNs). As such, (anatomical) bias strength can be thought of as an attractor basin area within the parameter-space the agent has to explore. The NN we used consists of a triple-layered (fully-connected), directed graph. The input layer (three neurons) receives the formants frequencies of a target-sound. The output layer (12 neurons) projects to the articulators in the production module. A hidden layer (seven neurons) enables the network to deal with nonlinear dependencies. The Euclidean distance (first three formants) between target and replication is used as fitness measure. Results show that sound replication is indeed possible, with Euclidean distance quickly approaching a close-to-zero asymptote. Statistical analysis should reveal if the agent can also: a) Generalize: Can it replicate sounds not exposed to during learning? b) Replicate consistently: Do different, isolated agents always converge on the same sounds? c) Deal with consolidation: Can it still learn new sounds after an extended learning phase (‘infancy’) has been terminated? Finally, a comparison with more complex models will be used to demonstrate robustness.
  • Janzen, G., & Weststeijn, C. (2004). Neural representation of object location and route direction: An fMRI study. NeuroImage, 22(Supplement 1), e634-e635.
  • Janzen, G., & Van Turennout, M. (2004). Neuronale Markierung navigationsrelevanter Objekte im räumlichen Gedächtnis: Ein fMRT Experiment. In D. Kerzel (Ed.), Beiträge zur 46. Tagung experimentell arbeitender Psychologen (pp. 125-125). Lengerich: Pabst Science Publishers.
  • Jeske, J., Kember, H., & Cutler, A. (2016). Native and non-native English speakers' use of prosody to predict sentence endings. In Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016).
  • Johns, T. G., Perera, R. M., Vitali, A. A., Vernes, S. C., & Scott, A. (2004). Phosphorylation of a glioma-specific mutation of the EGFR [Abstract]. Neuro-Oncology, 6, 317.

    Abstract

    Mutations of the epidermal growth factor receptor (EGFR) gene are found at a relatively high frequency in glioma, with the most common being the de2-7 EGFR (or EGFRvIII). This mutation arises from an in-frame deletion of exons 2-7, which removes 267 amino acids from the extracellular domain of the receptor. Despite being unable to bind ligand, the de2-7 EGFR is constitutively active at a low level. Transfection of human glioma cells with the de2-7 EGFR has little effect in vitro, but when grown as tumor xenografts this mutated receptor imparts a dramatic growth advantage. We mapped the phosphorylation pattern of de2-7 EGFR, both in vivo and in vitro, using a panel of antibodies specific for different phosphorylated tyrosine residues. Phosphorylation of de2-7 EGFR was detected constitutively at all tyrosine sites surveyed in vitro and in vivo, including tyrosine 845, a known target in the wild-type EGFR for src kinase. There was a substantial upregulation of phosphorylation at every yrosine residue of the de2-7 EGFR when cells were grown in vivo compared to the receptor isolated from cells cultured in vitro. Upregulation of phosphorylation at tyrosine 845 could be stimulated in vitro by the addition of specific components of the ECM via an integrindependent mechanism. These observations may partially explain why the growth enhancement mediated by de2-7 EGFR is largely restricted to the in vivo environment
  • Jung, D., Klessa, K., Duray, Z., Oszkó, B., Sipos, M., Szeverényi, S., Várnai, Z., Trilsbeek, P., & Váradi, T. (2014). Languagesindanger.eu - Including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 530-535).

    Abstract

    The present paper describes the development of the languagesindanger.eu interactive website as an example of including multimedia language resources to disseminate knowledge and create educational material on less-resourced languages. The website is a product of INNET (Innovative networking in infrastructure for endangered languages), European FP7 project. Its main functions can be summarized as related to the three following areas: (1) raising students' awareness of language endangerment and arouse their interest in linguistic diversity, language maintenance and language documentation; (2) informing both students and teachers about these topics and show ways how they can enlarge their knowledge further with a special emphasis on information about language archives; (3) helping teachers include these topics into their classes. The website has been localized into five language versions with the intention to be accessible to both scientific and non-scientific communities such as (primarily) secondary school teachers and students, beginning university students of linguistics, journalists, the interested public, and also members of speech communities who speak minority languages
  • Kember, H., Choi, J., & Cutler, A. (2016). Processing advantages for focused words in Korean. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 702-705).

    Abstract

    In Korean, focus is expressed in accentual phrasing. To ascertain whether words focused in this manner enjoy a processing advantage analogous to that conferred by focus as expressed in, e.g, English and Dutch, we devised sentences with target words in one of four conditions: prosodic focus, syntactic focus, prosodic + syntactic focus, and no focus as a control. 32 native speakers of Korean listened to blocks of 10 sentences, then were presented visually with words and asked whether or not they had heard them. Overall, words with focus were recognised significantly faster and more accurately than unfocused words. In addition, words with syntactic focus or syntactic + prosodic focus were recognised faster than words with prosodic focus alone. As for other languages, Korean focus confers processing advantage on the words carrying it. While prosodic focus does provide an advantage, however, syntactic focus appears to provide the greater beneficial effect for recognition memory
  • Kempen, G., & Harbusch, K. (1998). A 'tree adjoining' grammar without adjoining: The case of scrambling in German. In Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4).
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses? A corpus study revealing unexpected rigidity. In S. Kepser, & M. Reis (Eds.), Pre-Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: Niemeyer.
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses?: A corpus study revealing unexpected rigidity. In Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: University of Tübingen.
  • Kempen, G. (2004). Interactive visualization of syntactic structure assembly for grammar-intensive first- and second-language instruction. In R. Delmonte, P. Delcloque, & S. Tonelli (Eds.), Proceedings of InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced language learning systems (pp. 183-186). Venice: University of Venice.
  • Kempen, G., & Hoenkamp, E. (1982). Incremental sentence generation: Implications for the structure of a syntactic processor. In J. Horecký (Ed.), COLING 82. Proceedings of the Ninth International Conference on Computational Linguistics, Prague, July 5-10, 1982 (pp. 151-156). Amsterdam: North-Holland.

    Abstract

    Human speakers often produce sentences incrementally. They can start speaking having in mind only a fragmentary idea of what they want to say, and while saying this they refine the contents underlying subsequent parts of the utterance. This capability imposes a number of constraints on the design of a syntactic processor. This paper explores these constraints and evaluates some recent computational sentence generators from the perspective of incremental production.
  • Kempen, G. (2004). Human grammatical coding: Shared structure formation resources for grammatical encoding and decoding. In Cuny 2004 - The 17th Annual CUNY Conference on Human Sentence Processing. March 25-27, 2004. University of Maryland (pp. 66).
  • Kita, S., van Gijn, I., & van der Hulst, H. (1998). Movement phases in signs and co-speech gestures, and their transcription by human coders. In Gesture and Sign-Language in Human-Computer Interaction (Lecture Notes in Artificial Intelligence - LNCS Subseries, Vol. 1371) (pp. 23-35). Berlin, Germany: Springer-Verlag.

    Abstract

    The previous literature has suggested that the hand movement in co-speech gestures and signs consists of a series of phases with qualitatively different dynamic characteristics. In this paper, we propose a syntagmatic rule system for movement phases that applies to both co-speech gestures and signs. Descriptive criteria for the rule system were developed for the analysis video-recorded continuous production of signs and gesture. It involves segmenting a stream of body movement into phases and identifying different phase types. Two human coders used the criteria to analyze signs and cospeech gestures that are produced in natural discourse. It was found that the criteria yielded good inter-coder reliability. These criteria can be used for the technology of automatic recognition of signs and co-speech gestures in order to segment continuous production and identify the potentially meaningbearing phase.
  • Klatter-Folmer, J., Van Hout, R., Van den Heuvel, H., Fikkert, P., Baker, A., De Jong, J., Wijnen, F., Sanders, E., & Trilsbeek, P. (2014). Vulnerability in acquisition, language impairments in Dutch: Creating a VALID data archive. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 357-364).

    Abstract

    The VALID Data Archive is an open multimedia data archive (under construction) with data from speakers suffering from language impairments. We report on a pilot project in the CLARIN-NL framework in which five data resources were curated. For all data sets concerned, written informed consent from the participants or their caretakers has been obtained. All materials were anonymized. The audio files were converted into wav (linear PCM) files and the transcriptions into CHAT or ELAN format. Research data that consisted of test, SPSS and Excel files were documented and converted into CSV files. All data sets obtained appropriate CMDI metadata files. A new CMDI metadata profile for this type of data resources was established and care was taken that ISOcat metadata categories were used to optimize interoperability. After curation all data are deposited at the Max Planck Institute for Psycholinguistics Nijmegen where persistent identifiers are linked to all resources. The content of the transcriptions in CHAT and plain text format can be searched with the TROVA search engine
  • Klein, W. (Ed.). (1998). Kaleidoskop [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (112).
  • Klein, W. (Ed.). (2004). Philologie auf neuen Wegen [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 136.
  • Klein, W. (Ed.). (1982). Zweitspracherwerb [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (45).
  • Klein, W. (Ed.). (2004). Universitas [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik (LiLi), 134.
  • Latrouite, A., & Van Valin Jr., R. D. (2014). Event existentials in Tagalog: A Role and Reference Grammar account. In W. Arka, & N. L. K. Mas Indrawati (Eds.), Argument realisations and related constructions in Austronesian languages: papers from 12-ICAL (pp. 161-174). Canberra: Pacific Linguistics.
  • Lenkiewicz, P., Drude, S., Lenkiewicz, A., Gebre, B. G., Masneri, S., Schreer, O., Schwenninger, J., & Bardeli, R. (2014). Application of audio and video processing methods for language research and documentation: The AVATecH Project. In Z. Vetulani, & J. Mariani (Eds.), 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers (pp. 288-299). Berlin: Springer.

    Abstract

    Evolution and changes of all modern languages is a wellknown fact. However, recently it is reaching dynamics never seen before, which results in loss of the vast amount of information encoded in every language. In order to preserve such rich heritage, and to carry out linguistic research, properly annotated recordings of world languages are necessary. Since creating those annotations is a very laborious task, reaching times 100 longer than the length of the annotated media, innovative video processing algorithms are needed, in order to improve the efficiency and quality of annotation process. This is the scope of the AVATecH project presented in this article
  • Lenkiewicz, P., Shkaravska, O., Goosen, T., Windhouwer, M., Broeder, D., Roth, S., & Olsson, O. (2014). The DWAN framework: Application of a web annotation framework for the general humanities to the domain of language resources. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3644-3649).
  • Lev-Ari, S., & Peperkamp, S. (2014). Do people converge to the linguistic patterns of non-reliable speakers? Perceptual learning from non-native speakers. In S. Fuchs, M. Grice, A. Hermes, L. Lancia, & D. Mücke (Eds.), Proceedings of the 10th International Seminar on Speech Production (ISSP) (pp. 261-264).

    Abstract

    People's language is shaped by the input from the environment. The environment, however, offers a range of linguistic inputs that differ in their reliability. We test whether listeners accordingly weigh input from sources that differ in reliability differently. Using a perceptual learning paradigm, we show that listeners adjust their representations according to linguistic input provided by native but not by non-native speakers. This is despite the fact that listeners are able to learn the characteristics of the speech of both speakers. These results provide evidence for a disassociation between adaptation to the characteristic of specific speakers and adjustment of linguistic representations in general based on these learned characteristics. This study also has implications for theories of language change. In particular, it cast doubts on the hypothesis that a large proportion of non-native speakers in a community can bring about linguistic changes

Share this page