Publications

Displaying 1 - 100 of 155
  • Almeida, L., Amdal, I., Beires, N., Boualem, M., Boves, L., Den Os, E., Filoche, P., Gomes, R., Knudsen, J. E., Kvale, K., Rugelbak, J., Tallec, C., & Warakagoda, N. (2002). Implementing and evaluating a multimodal tourist guide. In J. v. Kuppevelt, L. Dybkjær, & N. Bernsen (Eds.), Proceedings of the International CLASS Workshop on Natural, Intelligent and Effective Interaction in Multimodal Dialogue System (pp. 1-7). Copenhagen: Kluwer.
  • Anastasopoulos, A., Lekakou, M., Quer, J., Zimianiti, E., DeBenedetto, J., & Chiang, D. (2018). Part-of-speech tagging on an endangered language: a parallel Griko-Italian Resource. In Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018) (pp. 2529-2539).

    Abstract

    Most work on part-of-speech (POS) tagging is focused on high resource languages, or examines low-resource and active learning settings through simulated studies. We evaluate POS tagging techniques on an actual endangered language, Griko. We present a resource that contains 114 narratives in Griko, along with sentence-level translations in Italian, and provides gold annotations for the test set. Based on a previously collected small corpus, we investigate several traditional methods, as well as methods that take advantage of monolingual data or project cross-lingual POS tags. We show that the combination of a semi-supervised method with cross-lingual transfer is more appropriate for this extremely challenging setting, with the best tagger achieving an accuracy of 72.9%. With an applied active learning scheme, which we use to collect sentence-level annotations over the test set, we achieve improvements of more than 21 percentage points
  • Bavin, E. L., & Kidd, E. (2000). Learning new verbs: Beyond the input. In C. Davis, T. J. Van Gelder, & R. Wales (Eds.), Cognitive Science in Australia, 2000: Proceedings of the Fifth Biennial Conference of the Australasian Cognitive Science Society.
  • Bentz, C., Dediu, D., Verkerk, A., & Jäger, G. (2018). Language family trees reflect geography and demography beyond neutral drift. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 38-40). Toruń, Poland: NCU Press. doi:10.12775/3991-1.006.
  • Bögels, S., Barr, D., Garrod, S., & Kessler, K. (2013). "Are we still talking about the same thing?" MEG reveals perspective-taking in response to pragmatic violations, but not in anticipation. In M. Knauff, N. Pauen, I. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 215-220). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0066/index.html.

    Abstract

    The current study investigates whether mentalizing, or taking the perspective of your interlocutor, plays an essential role throughout a conversation or whether it is mostly used in reaction to misunderstandings. This study is the first to use a brain-imaging method, MEG, to answer this question. In a first phase of the experiment, MEG participants interacted "live" with a confederate who set naming precedents for certain pictures. In a later phase, these precedents were sometimes broken by a speaker who named the same picture in a different way. This could be done by the same speaker, who set the precedent, or by a different speaker. Source analysis of MEG data showed that in the 800 ms before the naming, when the picture was already on the screen, episodic memory and language areas were activated, but no mentalizing areas, suggesting that the speaker's naming intentions were not anticipated by the listener on the basis of shared experiences. Mentalizing areas only became activated after the same speaker had broken a precedent, which we interpret as a reaction to the violation of conversational pragmatics.
  • Bone, D., Ramanarayanan, V., Narayanan, S., Hoedemaker, R. S., & Gordon, P. C. (2013). Analyzing eye-voice coordination in rapid automatized naming. In F. Bimbot, C. Cerisara, G. Fougeron, L. Gravier, L. Lamel, F. Pelligrino, & P. Perrier (Eds.), INTERSPEECH-2013: 14thAnnual Conference of the International Speech Communication Association (pp. 2425-2429). ISCA Archive. Retrieved from http://www.isca-speech.org/archive/interspeech_2013/i13_2425.html.

    Abstract

    Rapid Automatized Naming (RAN) is a powerful tool for pre- dicting future reading skill. A person’s ability to quickly name symbols as they scan a table is related to higher-level reading proficiency in adults and is predictive of future literacy gains in children. However, noticeable differences are present in the strategies or patterns within groups having similar task comple- tion times. Thus, a further stratification of RAN dynamics may lead to better characterization and later intervention to support reading skill acquisition. In this work, we analyze the dynamics of the eyes, voice, and the coordination between the two during performance. It is shown that fast performers are more similar to each other than to slow performers in their patterns, but not vice versa. Further insights are provided about the patterns of more proficient subjects. For instance, fast performers tended to exhibit smoother behavior contours, suggesting a more sta- ble perception-production process.
  • Bowerman, M., Brown, P., Eisenbeiss, S., Narasimhan, B., & Slobin, D. I. (2002). Putting things in places: Developmental consequences of linguistic typology. In E. V. Clark (Ed.), Proceedings of the 31st Stanford Child Language Research Forum. Space in language location, motion, path, and manner (pp. 1-29). Stanford: Center for the Study of Language & Information.

    Abstract

    This study explores how adults and children describe placement events (e.g., putting a book on a table) in a range of different languages (Finnish, English, German, Russian, Hindi, Tzeltal Maya, Spanish, and Turkish). Results show that the eight languages grammatically encode placement events in two main ways (Talmy, 1985, 1991), but further investigation reveals fine-grained crosslinguistic variation within each of the two groups. Children are sensitive to these finer-grained characteristics of the input language at an early age, but only when such features are perceptually salient. Our study demonstrates that a unitary notion of 'event' does not suffice to characterize complex but systematic patterns of event encoding crosslinguistically, and that children are sensitive to multiple influences, including the distributional properties of the target language, in constructing these patterns in their own speech.
  • Brand, J., Monaghan, P., & Walker, P. (2018). Changing Signs: Testing How Sound-Symbolism Supports Early Word Learning. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1398-1403). Austin, TX: Cognitive Science Society.

    Abstract

    Learning a language involves learning how to map specific forms onto their associated meanings. Such mappings can utilise arbitrariness and non-arbitrariness, yet, our understanding of how these two systems operate at different stages of vocabulary development is still not fully understood. The Sound-Symbolism Bootstrapping Hypothesis (SSBH) proposes that sound-symbolism is essential for word learning to commence, but empirical evidence of exactly how sound-symbolism influences language learning is still sparse. It may be the case that sound-symbolism supports acquisition of categories of meaning, or that it enables acquisition of individualized word meanings. In two Experiments where participants learned form-meaning mappings from either sound-symbolic or arbitrary languages, we demonstrate the changing roles of sound-symbolism and arbitrariness for different vocabulary sizes, showing that sound-symbolism provides an advantage for learning of broad categories, which may then transfer to support learning individual words, whereas an arbitrary language impedes acquisition of categories of sound to meaning.
  • Broeder, D., Offenga, F., & Willems, D. (2002). Metadata tools supporting controlled vocabulary services. In M. Rodriguez González, & C. Paz SuárezR Araujo (Eds.), Third international conference on language resources and evaluation (pp. 1055-1059). Paris: European Language Resources Association.

    Abstract

    Within the ISLE Metadata Initiative (IMDI) project a user-friendly editor to enter metadata descriptions and a browser operating on the linked metadata descriptions were developed. Both tools support the usage of Controlled Vocabulary (CV) repositories by means of the specification of an URL where the formal CV definition data is available.
  • Broeder, D., Wittenburg, P., Declerck, T., & Romary, L. (2002). LREP: A language repository exchange protocol. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 1302-1305). Paris: European Language Resources Association.

    Abstract

    The recent increase in the number and complexity of the language resources available on the Internet is followed by a similar increase of available tools for linguistic analysis. Ideally the user does not need to be confronted with the question in how to match tools with resources. If resource repositories and tool repositories offer adequate metadata information and a suitable exchange protocol is developed this matching process could be performed (semi-) automatically.
  • Broersma, M. (2002). Comprehension of non-native speech: Inaccurate phoneme processing and activation of lexical competitors. In ICSLP-2002 (pp. 261-264). Denver: Center for Spoken Language Research, U. of Colorado Boulder.

    Abstract

    Native speakers of Dutch with English as a second language and native speakers of English participated in an English lexical decision experiment. Phonemes in real words were replaced by others from which they are hard to distinguish for Dutch listeners. Non-native listeners judged the resulting near-words more often as a word than native listeners. This not only happened when the phonemes that were exchanged did not exist as separate phonemes in the native language Dutch, but also when phoneme pairs that do exist in Dutch were used in word-final position, where they are not distinctive in Dutch. In an English bimodal priming experiment with similar groups of participants, word pairs were used which differed in one phoneme. These phonemes were hard to distinguish for the non-native listeners. Whereas in native listening both words inhibited each other, in non-native listening presentation of one word led to unresolved competition between both words. The results suggest that inaccurate phoneme processing by non-native listeners leads to the activation of spurious lexical competitors.
  • Brugman, H., Levinson, S. C., Skiba, R., & Wittenburg, P. (2002). The DOBES archive: It's purpose and implementation. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 11-11). Paris: European Language Resources Association.
  • Brugman, H., Spenke, H., Kramer, M., & Klassmann, A. (2002). Multimedia annotation with multilingual input methods and search support.
  • Brugman, H., Wittenburg, P., Levinson, S. C., & Kita, S. (2002). Multimodal annotations in gesture and sign language studies. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 176-182). Paris: European Language Resources Association.

    Abstract

    For multimodal annotations an exhaustive encoding system for gestures was developed to facilitate research. The structural requirements of multimodal annotations were analyzed to develop an Abstract Corpus Model which is the basis for a powerful annotation and exploitation tool for multimedia recordings and the definition of the XML-based EUDICO Annotation Format. Finally, a metadata-based data management environment has been setup to facilitate resource discovery and especially corpus management. Bt means of an appropriate digitization policy and their online availability researchers have been able to build up a large corpus covering gesture and sign language data.
  • Byun, K.-S., De Vos, C., Roberts, S. G., & Levinson, S. C. (2018). Interactive sequences modulate the selection of expressive forms in cross-signing. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 67-69). Toruń, Poland: NCU Press. doi:10.12775/3991-1.012.
  • Cablitz, G. (2002). The acquisition of an absolute system: learning to talk about space in Marquesan (Oceanic, French Polynesia). In E. V. Clark (Ed.), Space in language location, motion, path, and manner (pp. 40-49). Stanford: Center for the Study of Language & Information (Electronic proceedings.
  • Casillas, M., & Frank, M. C. (2013). The development of predictive processes in children’s discourse understanding. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society. (pp. 299-304). Austin,TX: Cognitive Society.

    Abstract

    We investigate children’s online predictive processing as it occurs naturally, in conversation. We showed 1–7 year-olds short videos of improvised conversation between puppets, controlling for available linguistic information through phonetic manipulation. Even one- and two-year-old children made accurate and spontaneous predictions about when a turn-switch would occur: they gazed at the upcoming speaker before they heard a response begin. This predictive skill relies on both lexical and prosodic information together, and is not tied to either type of information alone. We suggest that children integrate prosodic, lexical, and visual information to effectively predict upcoming linguistic material in conversation.
  • Chen, A., Gussenhoven, C., & Rietveld, T. (2002). Language-specific uses of the effort code. In B. Bel, & I. Marlien (Eds.), Proceedings of the 1st Conference on Speech Prosody (pp. 215-218). Aix=en-Provence: Université de Provence.

    Abstract

    Two groups of listeners with Dutch and British English language backgrounds judged Dutch and British English utterances, respectively, which varied in the intonation contour on the scales EMPHATIC vs. NOT EMPHATIC and SURPRISED vs. NOT SURPRISED, two meanings derived from the Effort Code. The stimuli, which differed in sentence mode but were otherwise lexically equivalent, were varied in peak height, peak alignment, end pitch, and overall register. In both languages, there are positive correlations between peak height and degree of emphasis, between peak height and degree of surprise, between peak alignment and degree of surprise, and between pitch register and degree of surprise. However, in all these cases, Dutch stimuli lead to larger perceived meaning differences than the British English stimuli. This difference in the extent to which increased pitch height triggers increases in perceived emphasis and surprise is argued to be due to the difference in the standard pitch ranges between Dutch and British English. In addition, we found a positive correlation between pitch register and the degree of emphasis in Dutch, but a negative correlation in British English. This is an unexpected difference, which illustrates a case of ambiguity in the meaning of pitch.
  • Cristia, A., Ganesh, S., Casillas, M., & Ganapathy, S. (2018). Talker diarization in the wild: The case of child-centered daylong audio-recordings. In Proceedings of Interspeech 2018 (pp. 2583-2587). doi:10.21437/Interspeech.2018-2078.

    Abstract

    Speaker diarization (answering 'who spoke when') is a widely researched subject within speech technology. Numerous experiments have been run on datasets built from broadcast news, meeting data, and call centers—the task sometimes appears close to being solved. Much less work has begun to tackle the hardest diarization task of all: spontaneous conversations in real-world settings. Such diarization would be particularly useful for studies of language acquisition, where researchers investigate the speech children produce and hear in their daily lives. In this paper, we study audio gathered with a recorder worn by small children as they went about their normal days. As a result, each child was exposed to different acoustic environments with a multitude of background noises and a varying number of adults and peers. The inconsistency of speech and noise within and across samples poses a challenging task for speaker diarization systems, which we tackled via retraining and data augmentation techniques. We further studied sources of structured variation across raw audio files, including the impact of speaker type distribution, proportion of speech from children, and child age on diarization performance. We discuss the extent to which these findings might generalize to other samples of speech in the wild.
  • Cutler, A., McQueen, J. M., Jansonius, M., & Bayerl, S. (2002). The lexical statistics of competitor activation in spoken-word recognition. In C. Bow (Ed.), Proceedings of the 9th Australian International Conference on Speech Science and Technology (pp. 40-45). Canberra: Australian Speech Science and Technology Association (ASSTA).

    Abstract

    The Possible Word Constraint is a proposed mechanism whereby listeners avoid recognising words spuriously embedded in other words. It applies to words leaving a vowelless residue between their edge and the nearest known word or syllable boundary. The present study tests the usefulness of this constraint via lexical statistics of both English and Dutch. The analyses demonstrate that the constraint removes a clear majority of embedded words in speech, and thus can contribute significantly to the efficiency of human speech recognition
  • Ip, M. H. K., & Cutler, A. (2018). Asymmetric efficiency of juncture perception in L1 and L2. In K. Klessa, J. Bachan, A. Wagner, M. Karpiński, & D. Śledziński (Eds.), Proceedings of Speech Prosody 2018 (pp. 289-296). Baixas, France: ISCA. doi:10.21437/SpeechProsody.2018-59.

    Abstract

    In two experiments, Mandarin listeners resolved potential syntactic ambiguities in spoken utterances in (a) their native language (L1) and (b) English which they had learned as a second language (L2). A new disambiguation task was used, requiring speeded responses to select the correct meaning for structurally ambiguous sentences. Importantly, the ambiguities used in the study are identical in Mandarin and in English, and production data show that prosodic disambiguation of this type of ambiguity is also realised very similarly in the two languages. The perceptual results here showed however that listeners’ response patterns differed for L1 and L2, although there was a significant increase in similarity between the two response patterns with increasing exposure to the L2. Thus identical ambiguity and comparable disambiguation patterns in L1 and L2 do not lead to immediate application of the appropriate L1 listening strategy to L2; instead, it appears that such a strategy may have to be learned anew for the L2.
  • Ip, M. H. K., & Cutler, A. (2018). Cue equivalence in prosodic entrainment for focus detection. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 153-156).

    Abstract

    Using a phoneme detection task, the present series of
    experiments examines whether listeners can entrain to
    different combinations of prosodic cues to predict where focus
    will fall in an utterance. The stimuli were recorded by four
    female native speakers of Australian English who happened to
    have used different prosodic cues to produce sentences with
    prosodic focus: a combination of duration cues, mean and
    maximum F0, F0 range, and longer pre-target interval before
    the focused word onset, only mean F0 cues, only pre-target
    interval, and only duration cues. Results revealed that listeners
    can entrain in almost every condition except for where
    duration was the only reliable cue. Our findings suggest that
    listeners are flexible in the cues they use for focus processing.
  • Cutler, A., Burchfield, L. A., & Antoniou, M. (2018). Factors affecting talker adaptation in a second language. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 33-36).

    Abstract

    Listeners adapt rapidly to previously unheard talkers by
    adjusting phoneme categories using lexical knowledge, in a
    process termed lexically-guided perceptual learning. Although
    this is firmly established for listening in the native language
    (L1), perceptual flexibility in second languages (L2) is as yet
    less well understood. We report two experiments examining L1
    and L2 perceptual learning, the first in Mandarin-English late
    bilinguals, the second in Australian learners of Mandarin. Both
    studies showed stronger learning in L1; in L2, however,
    learning appeared for the English-L1 group but not for the
    Mandarin-L1 group. Phonological mapping differences from
    the L1 to the L2 are suggested as the reason for this result.
  • Cutler, A., & Koster, M. (2000). Stress and lexical activation in Dutch. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 1 (pp. 593-596). Beijing: China Military Friendship Publish.

    Abstract

    Dutch listeners were slower to make judgements about the semantic relatedness between a spoken target word (e.g. atLEET, 'athlete') and a previously presented visual prime word (e.g. SPORT 'sport') when the spoken word was mis-stressed. The adverse effect of mis-stressing confirms the role of stress information in lexical recognition in Dutch. However, although the erroneous stress pattern was always initially compatible with a competing word (e.g. ATlas, 'atlas'), mis-stressed words did not produced high false alarm rates in unrelated pairs (e.g. SPORT - atLAS). This suggests that stress information did not completely rule out segmentally matching but suprasegmentally mismatching words, a finding consistent with spoken-word recognition models involving multiple activation and inter-word competition.
  • Cutler, A., & Butterfield, S. (1986). The perceptual integrity of initial consonant clusters. In R. Lawrence (Ed.), Speech and Hearing: Proceedings of the Institute of Acoustics (pp. 31-36). Edinburgh: Institute of Acoustics.
  • Cutler, A., Norris, D., & McQueen, J. M. (2000). Tracking TRACE’s troubles. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 63-66). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    Simulations explored the inability of the TRACE model of spoken-word recognition to model the effects on human listening of acoustic-phonetic mismatches in word forms. The source of TRACE's failure lay not in its interactive connectivity, not in the presence of interword competition, and not in the use of phonemic representations, but in the need for continuously optimised interpretation of the input. When an analogue of TRACE was allowed to cycle to asymptote on every slice of input, an acceptable simulation of the subcategorical mismatch data was achieved. Even then, however, the simulation was not as close as that produced by the Merge model.
  • Cutler, A., & Bruggeman, L. (2013). Vocabulary structure and spoken-word recognition: Evidence from French reveals the source of embedding asymmetry. In Proceedings of INTERSPEECH: 14th Annual Conference of the International Speech Communication Association (pp. 2812-2816).

    Abstract

    Vocabularies contain hundreds of thousands of words built from only a handful of phonemes, so that inevitably longer words tend to contain shorter ones. In many languages (but not all) such embedded words occur more often word-initially than word-finally, and this asymmetry, if present, has farreaching consequences for spoken-word recognition. Prior research had ascribed the asymmetry to suffixing or to effects of stress (in particular, final syllables containing the vowel schwa). Analyses of the standard French vocabulary here reveal an effect of suffixing, as predicted by this account, and further analyses of an artificial variety of French reveal that extensive final schwa has an independent and additive effect in promoting the embedding asymmetry.
  • Delgado, T., Ravignani, A., Verhoef, T., Thompson, B., Grossi, T., & Kirby, S. (2018). Cultural transmission of melodic and rhythmic universals: Four experiments and a model. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 89-91). Toruń, Poland: NCU Press. doi:10.12775/3991-1.019.
  • Dimroth, C., & Lasser, I. (Eds.). (2002). Finite options: How L1 and L2 learners cope with the acquisition of finiteness [Special Issue]. Linguistics, 40(4).
  • Dolscheid, S., Graver, C., & Casasanto, D. (2013). Spatial congruity effects reveal metaphors, not markedness. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2213-2218). Austin,TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0405/index.html.

    Abstract

    Spatial congruity effects have often been interpreted as evidence for metaphorical thinking, but an alternative markedness-based account challenges this view. In two experiments, we directly compared metaphor and markedness explanations for spatial congruity effects, using musical pitch as a testbed. English speakers who talk about pitch in terms of spatial height were tested in speeded space-pitch compatibility tasks. To determine whether space-pitch congruency effects could be elicited by any marked spatial continuum, participants were asked to classify high- and low-frequency pitches as 'high' and 'low' or as 'front' and 'back' (both pairs of terms constitute cases of marked continuums). We found congruency effects in high/low conditions but not in front/back conditions, indicating that markedness is not sufficient to account for congruity effects (Experiment 1). A second experiment showed that congruency effects were specific to spatial words that cued a vertical schema (tall/short), and that congruity effects were not an artifact of polysemy (e.g., 'high' referring both to space and pitch). Together, these results suggest that congruency effects reveal metaphorical uses of spatial schemas, not markedness effects.
  • Duarte, R., Uhlmann, M., Van den Broek, D., Fitz, H., Petersson, K. M., & Morrison, A. (2018). Encoding symbolic sequences with spiking neural reservoirs. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN). doi:10.1109/IJCNN.2018.8489114.

    Abstract

    Biologically inspired spiking networks are an important tool to study the nature of computation and cognition in neural systems. In this work, we investigate the representational capacity of spiking networks engaged in an identity mapping task. We compare two schemes for encoding symbolic input, one in which input is injected as a direct current and one where input is delivered as a spatio-temporal spike pattern. We test the ability of networks to discriminate their input as a function of the number of distinct input symbols. We also compare performance using either membrane potentials or filtered spike trains as state variable. Furthermore, we investigate how the circuit behavior depends on the balance between excitation and inhibition, and the degree of synchrony and regularity in its internal dynamics. Finally, we compare different linear methods of decoding population activity onto desired target labels. Overall, our results suggest that even this simple mapping task is strongly influenced by design choices on input encoding, state-variables, circuit characteristics and decoding methods, and these factors can interact in complex ways. This work highlights the importance of constraining computational network models of behavior by available neurobiological evidence.
  • Durco, M., & Windhouwer, M. (2013). Semantic Mapping in CLARIN Component Metadata. In Proceedings of MTSR 2013, the 7th Metadata and Semantics Research Conference (pp. 163-168). New York: Springer.

    Abstract

    In recent years, large scale initiatives like CLARIN set out to overcome the notorious heterogeneity of metadata formats in the domain of language resource. The CLARIN Component Metadata Infrastructure established means for flexible resouce descriptions for the domain of language resources. The Data Category Registry ISOcat and the accompanying Relation Registry foster semantic interoperability within the growing heterogeneous collection of metadata records. This paper describes the CMD Infrastructure focusing on the facilities for semantic mapping, and gives also an overview of the current status in the joint component metadata domain.
  • Enfield, N. J. (2002). Parallel innovation and 'coincidence' in linguistic areas: On a bi-clausal extent/result constructions of mainland Southeast Asia. In P. Chew (Ed.), Proceedings of the 28th meeting of the Berkeley Linguistics Society. Special session on Tibeto-Burman and Southeast Asian linguistics (pp. 121-128). Berkeley: Berkeley Linguistics Society.
  • Enfield, N. J., & Evans, G. (2000). Transcription as standardisation: The problem of Tai languages. In S. Burusphat (Ed.), Proceedings: the International Conference on Tai Studies, July 29-31, 1998, (pp. 201-212). Bangkok, Thailand: Institute of Language and Culture for Rural Development, Mahidol University.
  • Ergin, R., Senghas, A., Jackendoff, R., & Gleitman, L. (2018). Structural cues for symmetry, asymmetry, and non-symmetry in Central Taurus Sign Language. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 104-106). Toruń, Poland: NCU Press. doi:10.12775/3991-1.025.
  • Flecken, M., & Gerwien, J. (2013). Grammatical aspect modulates event duration estimations: findings from Dutch. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th annual meeting of the Cognitive Science Society (CogSci 2013) (pp. 2309-2314). Austin,TX: Cognitive Science Society.
  • Galke, L., Gerstenkorn, G., & Scherp, A. (2018). A case study of closed-domain response suggestion with limited training data. In M. Elloumi, M. Granitzer, A. Hameurlain, C. Seifert, B. Stein, A. Min Tjoa, & R. Wagner (Eds.), Database and Expert Systems Applications: DEXA 2018 International Workshops, BDMICS, BIOKDD, and TIR, Regensburg, Germany, September 3–6, 2018, Proceedings (pp. 218-229). Cham, Switzerland: Springer.

    Abstract

    We analyze the problem of response suggestion in a closed domain along a real-world scenario of a digital library. We present a text-processing pipeline to generate question-answer pairs from chat transcripts. On this limited amount of training data, we compare retrieval-based, conditioned-generation, and dedicated representation learning approaches for response suggestion. Our results show that retrieval-based methods that strive to find similar, known contexts are preferable over parametric approaches from the conditioned-generation family, when the training data is limited. We, however, identify a specific representation learning approach that is competitive to the retrieval-based approaches despite the training data limitation.
  • Galke, L., Mai, F., & Vagliano, I. (2018). Multi-modal adversarial autoencoders for recommendations of citations and subject labels. In T. Mitrovic, J. Zhang, L. Chen, & D. Chin (Eds.), UMAP '18: Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization (pp. 197-205). New York: ACM. doi:10.1145/3209219.3209236.

    Abstract

    We present multi-modal adversarial autoencoders for recommendation and evaluate them on two different tasks: citation recommendation and subject label recommendation. We analyze the effects of adversarial regularization, sparsity, and different input modalities. By conducting 408 experiments, we show that adversarial regularization consistently improves the performance of autoencoders for recommendation. We demonstrate, however, that the two tasks differ in the semantics of item co-occurrence in the sense that item co-occurrence resembles relatedness in case of citations, yet implies diversity in case of subject labels. Our results reveal that supplying the partial item set as input is only helpful, when item co-occurrence resembles relatedness. When facing a new recommendation task it is therefore crucial to consider the semantics of item co-occurrence for the choice of an appropriate model.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). Automatic sign language identification. In Proceeding of the 20th IEEE International Conference on Image Processing (ICIP) (pp. 2626-2630).

    Abstract

    We propose a Random-Forest based sign language identification system. The system uses low-level visual features and is based on the hypothesis that sign languages have varying distributions of phonemes (hand-shapes, locations and movements). We evaluated the system on two sign languages -- British SL and Greek SL, both taken from a publicly available corpus, called Dicta Sign Corpus. Achieved average F1 scores are about 95% - indicating that sign languages can be identified with high accuracy using only low-level visual features.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). Automatic signer diarization - the mover is the signer approach. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2013 IEEE Conference on (pp. 283-287). doi:10.1109/CVPRW.2013.49.

    Abstract

    We present a vision-based method for signer diarization -- the task of automatically determining "who signed when?" in a video. This task has similar motivations and applications as speaker diarization but has received little attention in the literature. In this paper, we motivate the problem and propose a method for solving it. The method is based on the hypothesis that signers make more movements than their interlocutors. Experiments on four videos (a total of 1.4 hours and each consisting of two signers) show the applicability of the method. The best diarization error rate (DER) obtained is 0.16.
  • Gebre, B. G., Zampieri, M., Wittenburg, P., & Heskes, T. (2013). Improving Native Language Identification with TF-IDF weighting. In Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 216-223).

    Abstract

    This paper presents a Native Language Identification (NLI) system based on TF-IDF weighting schemes and using linear classifiers - support vector machines, logistic regressions and perceptrons. The system was one of the participants of the 2013 NLI Shared Task in the closed-training track, achieving 0.814 overall accuracy for a set of 11 native languages. This accuracy was only 2.2 percentage points lower than the winner's performance. Furthermore, with subsequent evaluations using 10-fold cross-validation (as given by the organizers) on the combined training and development data, the best average accuracy obtained is 0.8455 and the features that contributed to this accuracy are the TF-IDF of the combined unigrams and bigrams of words.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). The gesturer is the speaker. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013) (pp. 3751-3755).

    Abstract

    We present and solve the speaker diarization problem in a novel way. We hypothesize that the gesturer is the speaker and that identifying the gesturer can be taken as identifying the active speaker. We provide evidence in support of the hypothesis from gesture literature and audio-visual synchrony studies. We also present a vision-only diarization algorithm that relies on gestures (i.e. upper body movements). Experiments carried out on 8.9 hours of a publicly available dataset (the AMI meeting data) show that diarization error rates as low as 15% can be achieved.
  • Gijssels, T., Bottini, R., Rueschemeyer, S.-A., & Casasanto, D. (2013). Space and time in the parietal cortex: fMRI Evidence for a meural asymmetry. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 495-500). Austin,TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0113/index.html.

    Abstract

    How are space and time related in the brain? This study contrasts two proposals that make different predictions about the interaction between spatial and temporal magnitudes. Whereas ATOM implies that space and time are symmetrically related, Metaphor Theory claims they are asymmetrically related. Here we investigated whether space and time activate the same neural structures in the inferior parietal cortex (IPC) and whether the activation is symmetric or asymmetric across domains. We measured participants’ neural activity while they made temporal and spatial judgments on the same visual stimuli. The behavioral results replicated earlier observations of a space-time asymmetry: Temporal judgments were more strongly influenced by irrelevant spatial information than vice versa. The BOLD fMRI data indicated that space and time activated overlapping clusters in the IPC and that, consistent with Metaphor Theory, this activation was asymmetric: The shared region of IPC was activated more strongly during temporal judgments than during spatial judgments. We consider three possible interpretations of this neural asymmetry, based on 3 possible functions of IPC.
  • Guirardello-Damian, R., & Skiba, R. (2002). Trumai Corpus: An example of presenting multi-media data in the IMDI-browser. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 16-1-16-8). Paris: European Language Resources Association.

    Abstract

    Trumai, a genetically isolated language spoken in Brazil (Xingu reserve), is an example of an endangered language. Although the Trumai population consists of more than 100 individuals, only 51 people speak the language. The oral traditions are progressively dying. Given the current scenario, the documentation of this language and its cultural aspects is of great importance. In the framework of the DoBeS program (Documentation of Endangered Languages), the project "Documentation of Trumai" has selected and organized a collection of Trumai texts, with a multi-media representation of the corpus. Several kinds of information and data types are being included in the archive of the language: texts with audio and video recordings; written texts from educational materials; drawings; photos; songs; annotations in different formats; lexicon; field notes; results from scientific studies of the language (sound system, sketch grammar, comparative studies with other Xinguan languages), etc. All materials are integrated into the IMDI-Browser, a specialized tool for presenting and searching for linguistic data. This paper explores the processing phases and the results of the Trumai project taking into consideration the issue of how to combine the needs and wishes of field linguistics (content and research aspects) and the needs of archiving (structure and workflow aspects) in a well-organized corpus.
  • Gulrajani, G., & Harrison, D. (2002). SHAWEL: Sharable and interactive web-lexicons. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 9-1-9-4). Paris: European Language Resources Association.

    Abstract

    A prototypical lexicon tool was implemented which was intended to allow researchers to collaboratively create lexicons of endangered languages. Increasingly often researchers documenting or analyzing a language work at different locations. Lexicons that evolve through continuous interaction between the collaborators can only be efficiently produced when it can be accessed and manipulated via the Internet. The SHAWEL tool was developed to address these needs; it makes use of a thin Java client and a central database solution.
  • Gussenhoven, C., & Zhou, W. (2013). Revisiting pitch slope and height effects on perceived duration. In Proceedings of INTERSPEECH 2013: 14th Annual Conference of the International Speech Communication Association (pp. 1365-1369).

    Abstract

    The shape of pitch contours has been shown to have an effect on the perceived duration of vowels. For instance, vowels with high level pitch and vowels with falling contours sound longer than vowels with low level pitch. Depending on whether the
    comparison is between level pitches or between level and dynamic contours, these findings have been interpreted in two ways. For inter-level comparisons, where the duration results are the reverse of production results, a hypercorrection strategy in production has been proposed [1]. By contrast, for comparisons between level pitches and dynamic contours, the
    longer production data for dynamic contours have been held responsible. We report an experiment with Dutch and Chinese listeners which aimed to show that production data and perception data are each other’s opposites for high, low, falling and rising contours. We explain the results, which are consistent with earlier findings, in terms of the compensatory listening strategy of [2], arguing that the perception effects are due to a perceptual compensation of articulatory strategies and
    constraints, rather than that differences in production compensate for psycho-acoustic perception effects.
  • Gussenhoven, C., & Chen, A. (2000). Universal and language-specific effects in the perception of question intonation. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP) (pp. 91-94). Beijing: China Military Friendship Publish.

    Abstract

    Three groups of monolingual listeners, with Standard Chinese, Dutch and Hungarian as their native language, judged pairs of trisyllabic stimuli which differed only in their itch pattern. The segmental structure of the stimuli was made up by the experimenters and presented to subjects as being taken from a little-known language spoken on a South Pacific island. Pitch patterns consisted of a single rise-fall located on or near the second syllable. By and large, listeners selected the stimulus with the higher peak, the later eak, and the higher end rise as the one that signalled a question, regardless of language group. The result is argued to reflect innate, non-linguistic knowledge of the meaning of pitch variation, notably Ohala’s Frequency Code. A significant difference between groups is explained as due to the influence of the mother tongue.
  • Gussenhoven, C., & Chen, A. (2000). Universal and language-specific effects in the perception of question intonation. In Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP) (pp. 91-94).
  • Harbusch, K., & Kempen, G. (2000). Complexity of linear order computation in Performance Grammar, TAG and HPSG. In Proceedings of Fifth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+5) (pp. 101-106).

    Abstract

    This paper investigates the time and space complexity of word order computation in the psycholinguistically motivated grammar formalism of Performance Grammar (PG). In PG, the first stage of syntax assembly yields an unordered tree ('mobile') consisting of a hierarchy of lexical frames (lexically anchored elementary trees). Associated with each lexica l frame is a linearizer—a Finite-State Automaton that locally computes the left-to-right order of the branches of the frame. Linearization takes place after the promotion component may have raised certain constituents (e.g. Wh- or focused phrases) into the domain of lexical frames higher up in the syntactic mobile. We show that the worst-case time and space complexity of analyzing input strings of length n is O(n5) and O(n4), respectively. This result compares favorably with the time complexity of word-order computations in Tree Adjoining Grammar (TAG). A comparison with Head-Driven Phrase Structure Grammar (HPSG) reveals that PG yields a more declarative linearization method, provided that the FSA is rewritten as an equivalent regular expression.
  • Harbusch, K., & Kempen, G. (2002). A quantitative model of word order and movement in English, Dutch and German complement constructions. In Proceedings of the 19th international conference on Computational linguistics. San Francisco: Morgan Kaufmann.

    Abstract

    We present a quantitative model of word order and movement constraints that enables a simple and uniform treatment of a seemingly heterogeneous collection of linear order phenomena in English, Dutch and German complement constructions (Wh-extraction, clause union, extraposition, verb clustering, particle movement, etc.). Underlying the scheme are central assumptions of the psycholinguistically motivated Performance Grammar (PG). Here we describe this formalism in declarative terms based on typed feature unification. PG allows a homogenous treatment of both the within- and between-language variations of the ordering phenomena under discussion, which reduce to different settings of a small number of quantitative parameters.
  • Holler, J., Schubotz, L., Kelly, S., Schuetze, M., Hagoort, P., & Ozyurek, A. (2013). Here's not looking at you, kid! Unaddressed recipients benefit from co-speech gestures when speech processing suffers. In M. Knauff, M. Pauen, I. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2560-2565). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0463/index.html.

    Abstract

    In human face-to-face communication, language comprehension is a multi-modal, situated activity. However, little is known about how we combine information from these different modalities, and how perceived communicative intentions, often signaled through visual signals, such as eye
    gaze, may influence this processing. We address this question by simulating a triadic communication context in which a
    speaker alternated her gaze between two different recipients. Participants thus viewed speech-only or speech+gesture
    object-related utterances when being addressed (direct gaze) or unaddressed (averted gaze). Two object images followed
    each message and participants’ task was to choose the object that matched the message. Unaddressed recipients responded significantly slower than addressees for speech-only
    utterances. However, perceiving the same speech accompanied by gestures sped them up to a level identical to
    that of addressees. That is, when speech processing suffers due to not being addressed, gesture processing remains intact and enhances the comprehension of a speaker’s message
  • Hopman, E., Thompson, B., Austerweil, J., & Lupyan, G. (2018). Predictors of L2 word learning accuracy: A big data investigation. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 513-518). Austin, TX: Cognitive Science Society.

    Abstract

    What makes some words harder to learn than others in a second language? Although some robust factors have been identified based on small scale experimental studies, many relevant factors are difficult to study in such experiments due to the amount of data necessary to test them. Here, we investigate what factors affect the ease of learning of a word in a second language using a large data set of users learning English as a second language through the Duolingo mobile app. In a regression analysis, we test and confirm the well-studied effect of cognate status on word learning accuracy. Furthermore, we find significant effects for both cross-linguistic semantic alignment and English semantic density, two novel predictors derived from large scale distributional models of lexical semantics. Finally, we provide data on several other psycholinguistically plausible word level predictors. We conclude with a discussion of the limits, benefits and future research potential of using big data for investigating second language learning.
  • Huettig, F., Kolinsky, R., & Lachmann, T. (Eds.). (2018). The effects of literacy on cognition and brain functioning [Special Issue]. Language, Cognition and Neuroscience, 33(3).
  • Irvine, L., Roberts, S. G., & Kirby, S. (2013). A robustness approach to theory building: A case study of language evolution. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2614-2619). Retrieved from http://mindmodeling.org/cogsci2013/papers/0472/index.html.

    Abstract

    Models of cognitive processes often include simplifications, idealisations, and fictionalisations, so how should we learn about cognitive processes from such models? Particularly in cognitive science, when many features of the target system are unknown, it is not always clear which simplifications, idealisations, and so on, are appropriate for a research question, and which are highly misleading. Here we use a case-study from studies of language evolution, and ideas from philosophy of science, to illustrate a robustness approach to learning from models. Robust properties are those that arise across a range of models, simulations and experiments, and can be used to identify key causal structures in the models, and the phenomenon, under investigation. For example, in studies of language evolution, the emergence of compositional structure is a robust property across models, simulations and experiments of cultural transmission, but only under pressures for learnability and expressivity. This arguably illustrates the principles underlying real cases of language evolution. We provide an outline of the robustness approach, including its limitations, and suggest that this methodology can be productively used throughout cognitive science. Perhaps of most importance, it suggests that different modelling frameworks should be used as tools to identify the abstract properties of a system, rather than being definitive expressions of theories.
  • Isbilen, E., Frost, R. L. A., Monaghan, P., & Christiansen, M. (2018). Bridging artificial and natural language learning: Comparing processing- and reflection-based measures of learning. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1856-1861). Austin, TX: Cognitive Science Society.

    Abstract

    A common assumption in the cognitive sciences is that artificial and natural language learning rely on shared mechanisms. However, attempts to bridge the two have yielded ambiguous results. We suggest that an empirical disconnect between the computations employed during learning and the methods employed at test may explain these mixed results. Further, we propose statistically-based chunking as a potential computational link between artificial and natural language learning. We compare the acquisition of non-adjacent dependencies to that of natural language structure using two types of tasks: reflection-based 2AFC measures, and processing-based recall measures, the latter being more computationally analogous to the processes used during language acquisition. Our results demonstrate that task-type significantly influences the correlations observed between artificial and natural language acquisition, with reflection-based and processing-based measures correlating within – but not across – task-type. These findings have fundamental implications for artificial-to-natural language comparisons, both methodologically and theoretically.
  • Janse, E., Sennema, A., & Slis, A. (2000). Fast speech timing in Dutch: The durational correlates of lexical stress and pitch accent. In Proceedings of the VIth International Conference on Spoken Language Processing, Vol. III (pp. 251-254).

    Abstract

    n this study we investigated the durational correlates of lexical stress and pitch accent at normal and fast speech rate in Dutch. Previous literature on English shows that durations of lexically unstressed vowels are reduced more than stressed vowels when speakers increase their speech rate. We found that the same holds for Dutch, irrespective of whether the unstressed vowel is schwa or a "full" vowel. In the same line, we expected that vowels in words without a pitch accent would be shortened relatively more than vowels in words with a pitch accent. This was not the case: if anything, the accented vowels were shortened relatively more than the unaccented vowels. We conclude that duration is an important cue for lexical stress, but not for pitch accent.
  • Janse, E. (2000). Intelligibility of time-compressed speech: Three ways of time-compression. In Proceedings of the VIth International Conference on Spoken Language Processing, vol. III (pp. 786-789).

    Abstract

    Studies on fast speech have shown that word-level timing of fast speech differs from that of normal rate speech in that unstressed syllables are shortened more than stressed syllables as speech rate increases. An earlier experiment showed that the intelligibility of time-compressed speech could not be improved by making its temporal organisation closer to natural fast speech. To test the hypothesis that segmental intelligibility is more important than prosodic timing in listening to timecompressed speech, the intelligibility of bisyllabic words was tested in three time-compression conditions: either stressed and unstressed syllable were compressed to the same degree, or the stressed syllable was compressed more than the unstressed syllable, or the reverse. As was found before, imitating wordlevel timing of fast speech did not improve intelligibility over linear compression. However, the results did not confirm the hypothesis either: there was no difference in intelligibility between the three compression conditions. We conclude that segmental intelligibility plays an important role, but further research is necessary to decide between the contributions of prosody and segmental intelligibility to the word-level intelligibility of time-compressed speech.
  • Janse, E. (2002). Time-compressing natural and synthetic speech. In Proceedings of 7th International Conference on Spoken Language Processing (pp. 1645-1648).
  • Janssen, R., Moisik, S. R., & Dediu, D. (2018). Agent model reveals the influence of vocal tract anatomy on speech during ontogeny and glossogeny. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 171-174). Toruń, Poland: NCU Press. doi:10.12775/3991-1.042.
  • Johnson, E. K., Jusczyk, P. W., Cutler, A., & Norris, D. (2000). The development of word recognition: The use of the possible-word constraint by 12-month-olds. In L. Gleitman, & A. Joshi (Eds.), Proceedings of CogSci 2000 (pp. 1034). London: Erlbaum.
  • De Jong, N. H., & Bosker, H. R. (2013). Choosing a threshold for silent pauses to measure second language fluency. In R. Eklund (Ed.), Proceedings of the 6th Workshop on Disfluency in Spontaneous Speech (DiSS) (pp. 17-20).

    Abstract

    Second language (L2) research often involves analyses of acoustic measures of fluency. The studies investigating fluency, however, have been difficult to compare because the measures of fluency that were used differed widely. One of the differences between studies concerns the lower cut-off point for silent pauses, which has been set anywhere between 100 ms and 1000 ms. The goal of this paper is to find an optimal cut-off point. We calculate acoustic measures of fluency using different pause thresholds and then relate these measures to a measure of L2 proficiency and to ratings on fluency.
  • Kanero, J., Franko, I., Oranç, C., Uluşahin, O., Koskulu, S., Adigüzel, Z., Küntay, A. C., & Göksun, T. (2018). Who can benefit from robots? Effects of individual differences in robot-assisted language learning. In Proceedings of the 8th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp. 212-217). Piscataway, NJ, USA: IEEE.

    Abstract

    It has been suggested that some individuals may benefit more from social robots than do others. Using second
    language (L2) as an example, the present study examined how individual differences in attitudes toward robots and personality
    traits may be related to learning outcomes. Preliminary results with 24 Turkish-speaking adults suggest that negative attitudes
    toward robots, more specifically thoughts and anxiety about the negative social impact that robots may have on the society,
    predicted how well adults learned L2 words from a social robot. The possible implications of the findings as well as future directions are also discussed
  • Kearns, R. K., Norris, D., & Cutler, A. (2002). Syllable processing in English. In Proceedings of the 7th International Conference on Spoken Language Processing [ICSLP 2002] (pp. 1657-1660).

    Abstract

    We describe a reaction time study in which listeners detected word or nonword syllable targets (e.g. zoo, trel) in sequences consisting of the target plus a consonant or syllable residue (trelsh, trelshek). The pattern of responses differed from an earlier word-spotting study with the same material, in which words were always harder to find if only a consonant residue remained. The earlier results should thus not be viewed in terms of syllabic parsing, but in terms of a universal role for syllables in speech perception; words which are accidentally present in spoken input (e.g. sell in self) can be rejected when they leave a residue of the input which could not itself be a word.
  • Kempen, G., & Van Breugel, C. (2002). A workbench for visual-interactive grammar instruction at the secondary education level. In Proceedings of the 10th International CALL Conference (pp. 157-158). Antwerp: University of Antwerp.
  • Kempen, G., & Hoenkamp, E. (1982). Incremental sentence generation: Implications for the structure of a syntactic processor. In J. Horecký (Ed.), COLING 82. Proceedings of the Ninth International Conference on Computational Linguistics, Prague, July 5-10, 1982 (pp. 151-156). Amsterdam: North-Holland.

    Abstract

    Human speakers often produce sentences incrementally. They can start speaking having in mind only a fragmentary idea of what they want to say, and while saying this they refine the contents underlying subsequent parts of the utterance. This capability imposes a number of constraints on the design of a syntactic processor. This paper explores these constraints and evaluates some recent computational sentence generators from the perspective of incremental production.
  • Kempen, G., & Harbusch, K. (2002). Rethinking the architecture of human syntactic processing: The relationship between grammatical encoding and decoding. In Proceedings of the 35th Meeting of the Societas Linguistica Europaea. University of Potsdam.
  • Khetarpal, N., Neveu, G., Majid, A., Michael, L., & Regier, T. (2013). Spatial terms across languages support near-optimal communication: Evidence from Peruvian Amazonia, and computational analyses. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (pp. 764-769). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0158/index.html.

    Abstract

    Why do languages have the categories they do? It has been argued that spatial terms in the world’s languages reflect categories that support highly informative communication, and that this accounts for the spatial categories found across languages. However, this proposal has been tested against only nine languages, and in a limited fashion. Here, we consider two new languages: Maijɨki, an under-documented language of Peruvian Amazonia, and English. We analyze spatial data from these two new languages and the original nine, using thorough and theoretically targeted computational tests. The results support the hypothesis that spatial terms across dissimilar languages enable near-optimally informative communication, over an influential competing hypothesis
  • Klein, W. (Ed.). (2002). Sprache des Rechts II [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 128.
  • Klein, W. (2000). Changing concepts of the nature-nurture debate. In R. Hide, J. Mittelstrass, & W. Singer (Eds.), Changing concepts of nature at the turn of the millenium: Proceedings plenary session of the Pontifical academy of sciences, 26-29 October 1998 (pp. 289-299). Vatican City: Pontificia Academia Scientiarum.
  • Klein, W., & Jungbluth, K. (Eds.). (2002). Deixis [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 125.
  • Klein, W. (2013). L'effettivo declino e la crescita potenziale della lessicografia tedesca. In N. Maraschio, D. De Martiono, & G. Stanchina (Eds.), L'italiano dei vocabolari: Atti di La piazza delle lingue 2012 (pp. 11-20). Firenze: Accademia della Crusca.
  • Klein, W. (Ed.). (2000). Sprache des Rechts [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (118).
  • Klein, W. (Ed.). (1986). Sprachverfall [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (62).
  • Klein, W. (Ed.). (1982). Zweitspracherwerb [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (45).
  • Kuijpers, C., Van Donselaar, W., & Cutler, A. (2002). Perceptual effects of assimilation-induced violation of final devoicing in Dutch. In J. H. L. Hansen, & B. Pellum (Eds.), The 7th International Conference on Spoken Language Processing (pp. 1661-1664). Denver: ICSA.

    Abstract

    Voice assimilation in Dutch is an optional phonological rule which changes the surface forms of words and in doing so may violate the otherwise obligatory phonological rule of syllablefinal devoicing. We report two experiments examining the influence of voice assimilation on phoneme processing, in lexical compound words and in noun-verb phrases. Processing was not impaired in appropriate assimilation contexts across morpheme boundaries, but was impaired when devoicing was violated (a) in an inappropriate non-assimilatory) context, or (b) across a syntactic boundary.
  • Kuntay, A., & Ozyurek, A. (2002). Joint attention and the development of the use of demonstrative pronouns in Turkish. In B. Skarabela, S. Fish, & A. H. Do (Eds.), Proceedings of the 26th annual Boston University Conference on Language Development (pp. 336-347). Somerville, MA: Cascadilla Press.
  • Lansner, A., Sandberg, A., Petersson, K. M., & Ingvar, M. (2000). On forgetful attractor network memories. In H. Malmgren, M. Borga, & L. Niklasson (Eds.), Artificial neural networks in medicine and biology: Proceedings of the ANNIMAB-1 Conference, Göteborg, Sweden, 13-16 May 2000 (pp. 54-62). Heidelberg: Springer Verlag.

    Abstract

    A recurrently connected attractor neural network with a Hebbian learning rule is currently our best ANN analogy for a piece cortex. Functionally biological memory operates on a spectrum of time scales with regard to induction and retention, and it is modulated in complex ways by sub-cortical neuromodulatory systems. Moreover, biological memory networks are commonly believed to be highly distributed and engage many co-operating cortical areas. Here we focus on the temporal aspects of induction and retention of memory in a connectionist type attractor memory model of a piece of cortex. A continuous time, forgetful Bayesian-Hebbian learning rule is described and compared to the characteristics of LTP and LTD seen experimentally. More generally, an attractor network implementing this learning rule can operate as a long-term, intermediate-term, or short-term memory. Modulation of the print-now signal of the learning rule replicates some experimental memory phenomena, like e.g. the von Restorff effect.
  • Lattenkamp, E. Z., Vernes, S. C., & Wiegrebe, L. (2018). Mammalian models for the study of vocal learning: A new paradigm in bats. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 235-237). Toruń, Poland: NCU Press. doi:10.12775/3991-1.056.
  • Lauscher, A., Eckert, K., Galke, L., Scherp, A., Rizvi, S. T. R., Ahmed, S., Dengel, A., Zumstein, P., & Klein, A. (2018). Linked open citation database: Enabling libraries to contribute to an open and interconnected citation graph. In J. Chen, M. A. Gonçalves, J. M. Allen, E. A. Fox, M.-Y. Kan, & V. Petras (Eds.), JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries (pp. 109-118). New York: ACM. doi:10.1145/3197026.3197050.

    Abstract

    Citations play a crucial role in the scientific discourse, in information retrieval, and in bibliometrics. Many initiatives are currently promoting the idea of having free and open citation data. Creation of citation data, however, is not part of the cataloging workflow in libraries nowadays.
    In this paper, we present our project Linked Open Citation Database, in which we design distributed processes and a system infrastructure based on linked data technology. The goal is to show that efficiently cataloging citations in libraries using a semi-automatic approach is possible. We specifically describe the current state of the workflow and its implementation. We show that we could significantly improve the automatic reference extraction that is crucial for the subsequent data curation. We further give insights on the curation and linking process and provide evaluation results that not only direct the further development of the project, but also allow us to discuss its overall feasibility.
  • Lefever, E., Hendrickx, I., Croijmans, I., Van den Bosch, A., & Majid, A. (2018). Discovering the language of wine reviews: A text mining account. In N. Calzolari, K. Choukri, C. Cieri, T. Declerck, S. Goggi, K. Hasida, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, S. Piperidis, & T. Tokunaga (Eds.), Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018) (pp. 3297-3302). Paris: LREC.

    Abstract

    It is widely held that smells and flavors are impossible to put into words. In this paper we test this claim by seeking predictive patterns in wine reviews, which ostensibly aim to provide guides to perceptual content. Wine reviews have previously been critiqued as random and meaningless. We collected an English corpus of wine reviews with their structured metadata, and applied machine learning techniques to automatically predict the wine's color, grape variety, and country of origin. To train the three supervised classifiers, three different information sources were incorporated: lexical bag-of-words features, domain-specific terminology features, and semantic word embedding features. In addition, using regression analysis we investigated basic review properties, i.e., review length, average word length, and their relationship to the scalar values of price and review score. Our results show that wine experts do share a common vocabulary to describe wines and they use this in a consistent way, which makes it possible to automatically predict wine characteristics based on the review text alone. This means that odors and flavors may be more expressible in language than typically acknowledged.
  • Lenkiewicz, A., & Drude, S. (2013). Automatic annotation of linguistic 2D and Kinect recordings with the Media Query Language for Elan. In Proceedings of Digital Humanities 2013 (pp. 276-278).

    Abstract

    Research in body language with use of gesture recognition and speech analysis has gained much attention in the recent times, influencing disciplines related to image and speech processing.

    This study aims to design the Media Query Language (MQL) (Lenkiewicz, et al. 2012) combined with the Linguistic Media Query Interface (LMQI) for Elan (Wittenburg, et al. 2006). The system integrated with the new achievements in audio-video recognition will allow querying media files with predefined gesture phases (or motion primitives) and speech characteristics as well as combinations of both. For the purpose of this work the predefined motions and speech characteristics are called patterns for atomic elements and actions for a sequence of patterns. The main assumption is that a user-customized library of patterns and actions and automated media annotation with LMQI will reduce annotation time, hence decreasing costs of creation of annotated corpora. Increase of the number of annotated data should influence the speed and number of possible research in disciplines in which human multimodal interaction is a subject of interest and where annotated corpora are required.
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Levinson, S. C. (2000). Language as nature and language as art. In J. Mittelstrass, & W. Singer (Eds.), Proceedings of the Symposium on ‘Changing concepts of nature and the turn of the Millennium (pp. 257-287). Vatican City: Pontificae Academiae Scientiarium Scripta Varia.
  • Levinson, S. C. (2000). H.P. Grice on location on Rossel Island. In S. S. Chang, L. Liaw, & J. Ruppenhofer (Eds.), Proceedings of the 25th Annual Meeting of the Berkeley Linguistic Society (pp. 210-224). Berkeley: Berkeley Linguistic Society.
  • Lopopolo, A., Frank, S. L., Van den Bosch, A., Nijhof, A., & Willems, R. M. (2018). The Narrative Brain Dataset (NBD), an fMRI dataset for the study of natural language processing in the brain. In B. Devereux, E. Shutova, & C.-R. Huang (Eds.), Proceedings of LREC 2018 Workshop "Linguistic and Neuro-Cognitive Resources (LiNCR) (pp. 8-11). Paris: LREC.

    Abstract

    We present the Narrative Brain Dataset, an fMRI dataset that was collected during spoken presentation of short excerpts of three
    stories in Dutch. Together with the brain imaging data, the dataset contains the written versions of the stimulation texts. The texts are
    accompanied with stochastic (perplexity and entropy) and semantic computational linguistic measures. The richness and unconstrained
    nature of the data allows the study of language processing in the brain in a more naturalistic setting than is common for fMRI studies.
    We hope that by making NBD available we serve the double purpose of providing useful neural data to researchers interested in natural
    language processing in the brain and to further stimulate data sharing in the field of neuroscience of language.
  • Lupyan, G., Wendorf, A., Berscia, L. M., & Paul, J. (2018). Core knowledge or language-augmented cognition? The case of geometric reasoning. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 252-254). Toruń, Poland: NCU Press. doi:10.12775/3991-1.062.
  • Mai, F., Galke, L., & Scherp, A. (2018). Using deep learning for title-based semantic subject indexing to reach competitive performance to full-text. In J. Chen, M. A. Gonçalves, J. M. Allen, E. A. Fox, M.-Y. Kan, & V. Petras (Eds.), JCDL '18: Proceedings of the 18th ACM/IEEE on Joint Conference on Digital Libraries (pp. 169-178). New York: ACM.

    Abstract

    For (semi-)automated subject indexing systems in digital libraries, it is often more practical to use metadata such as the title of a publication instead of the full-text or the abstract. Therefore, it is desirable to have good text mining and text classification algorithms that operate well already on the title of a publication. So far, the classification performance on titles is not competitive with the performance on the full-texts if the same number of training samples is used for training. However, it is much easier to obtain title data in large quantities and to use it for training than full-text data. In this paper, we investigate the question how models obtained from training on increasing amounts of title training data compare to models from training on a constant number of full-texts. We evaluate this question on a large-scale dataset from the medical domain (PubMed) and from economics (EconBiz). In these datasets, the titles and annotations of millions of publications are available, and they outnumber the available full-texts by a factor of 20 and 15, respectively. To exploit these large amounts of data to their full potential, we develop three strong deep learning classifiers and evaluate their performance on the two datasets. The results are promising. On the EconBiz dataset, all three classifiers outperform their full-text counterparts by a large margin. The best title-based classifier outperforms the best full-text method by 9.4%. On the PubMed dataset, the best title-based method almost reaches the performance of the best full-text classifier, with a difference of only 2.9%.
  • Majid, A. (2013). Olfactory language and cognition. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th annual meeting of the Cognitive Science Society (CogSci 2013) (pp. 68). Austin,TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0025/index.html.

    Abstract

    Since the cognitive revolution, a widely held assumption has been that—whereas content may vary across cultures—cognitive processes would be universal, especially those on the more basic levels. Even if scholars do not fully subscribe to this assumption, they often conceptualize, or tend to investigate, cognition as if it were universal (Henrich, Heine, & Norenzayan, 2010). The insight that universality must not be presupposed but scrutinized is now gaining ground, and cognitive diversity has become one of the hot (and controversial) topics in the field (Norenzayan & Heine, 2005). We argue that, for scrutinizing the cultural dimension of cognition, taking an anthropological perspective is invaluable, not only for the task itself, but for attenuating the home-field disadvantages that are inescapably linked to cross-cultural research (Medin, Bennis, & Chandler, 2010).
  • Matsuo, A., & Duffield, N. (2002). Assessing the generality of knowledge about English ellipsis in SLA. In J. Costa, & M. J. Freitas (Eds.), Proceedings of the GALA 2001 Conference on Language Acquisition (pp. 49-53). Lisboa: Associacao Portuguesa de Linguistica.
  • Matsuo, A., & Duffield, N. (2002). Finiteness and parallelism: Assessing the generality of knowledge about English ellipsis in SLA. In B. Skarabela, S. Fish, & A.-H.-J. Do (Eds.), Proceedings of the 26th Boston University Conference on Language Development (pp. 197-207). Somerville, Massachusetts: Cascadilla Press.
  • McQueen, J. M., Cutler, A., & Norris, D. (2000). Positive and negative influences of the lexicon on phonemic decision-making. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 3 (pp. 778-781). Beijing: China Military Friendship Publish.

    Abstract

    Lexical knowledge influences how human listeners make decisions about speech sounds. Positive lexical effects (faster responses to target sounds in words than in nonwords) are robust across several laboratory tasks, while negative effects (slower responses to targets in more word-like nonwords than in less word-like nonwords) have been found in phonetic decision tasks but not phoneme monitoring tasks. The present experiments tested whether negative lexical effects are therefore a task-specific consequence of the forced choice required in phonetic decision. We compared phoneme monitoring and phonetic decision performance using the same Dutch materials in each task. In both experiments there were positive lexical effects, but no negative lexical effects. We observe that in all studies showing negative lexical effects, the materials were made by cross-splicing, which meant that they contained perceptual evidence supporting the lexically-consistent phonemes. Lexical knowledge seems to influence phonemic decision-making only when there is evidence for the lexically-consistent phoneme in the speech signal.
  • McQueen, J. M., Cutler, A., & Norris, D. (2000). Why Merge really is autonomous and parsimonious. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 47-50). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    We briefly describe the Merge model of phonemic decision-making, and, in the light of general arguments about the possible role of feedback in spoken-word recognition, defend Merge's feedforward structure. Merge not only accounts adequately for the data, without invoking feedback connections, but does so in a parsimonious manner.
  • Merkx, D., & Scharenborg, O. (2018). Articulatory feature classification using convolutional neural networks. In Proceedings of Interspeech 2018 (pp. 2142-2146). doi:10.21437/Interspeech.2018-2275.

    Abstract

    The ultimate goal of our research is to improve an existing speech-based computational model of human speech recognition on the task of simulating the role of fine-grained phonetic information in human speech processing. As part of this work we are investigating articulatory feature classifiers that are able to create reliable and accurate transcriptions of the articulatory behaviour encoded in the acoustic speech signal. Articulatory feature (AF) modelling of speech has received a considerable amount of attention in automatic speech recognition research. Different approaches have been used to build AF classifiers, most notably multi-layer perceptrons. Recently, deep neural networks have been applied to the task of AF classification. This paper aims to improve AF classification by investigating two different approaches: 1) investigating the usefulness of a deep Convolutional neural network (CNN) for AF classification; 2) integrating the Mel filtering operation into the CNN architecture. The results showed a remarkable improvement in classification accuracy of the CNNs over state-of-the-art AF classification results for Dutch, most notably in the minority classes. Integrating the Mel filtering operation into the CNN architecture did not further improve classification performance.
  • Micklos, A., Macuch Silva, V., & Fay, N. (2018). The prevalence of repair in studies of language evolution. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 316-318). Toruń, Poland: NCU Press. doi:10.12775/3991-1.075.
  • Mulder, K., Ten Bosch, L., & Boves, L. (2018). Analyzing EEG Signals in Auditory Speech Comprehension Using Temporal Response Functions and Generalized Additive Models. In Proceedings of Interspeech 2018 (pp. 1452-1456). doi:10.21437/Interspeech.2018-1676.

    Abstract

    Analyzing EEG signals recorded while participants are listening to continuous speech with the purpose of testing linguistic hypotheses is complicated by the fact that the signals simultaneously reflect exogenous acoustic excitation and endogenous linguistic processing. This makes it difficult to trace subtle differences that occur in mid-sentence position. We apply an analysis based on multivariate temporal response functions to uncover subtle mid-sentence effects. This approach is based on a per-stimulus estimate of the response of the neural system to speech input. Analyzing EEG signals predicted on the basis of the response functions might then bring to light conditionspecific differences in the filtered signals. We validate this approach by means of an analysis of EEG signals recorded with isolated word stimuli. Then, we apply the validated method to the analysis of the responses to the same words in the middle of meaningful sentences.
  • Norris, D., Cutler, A., McQueen, J. M., Butterfield, S., & Kearns, R. K. (2000). Language-universal constraints on the segmentation of English. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 43-46). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    Two word-spotting experiments are reported that examine whether the Possible-Word Constraint (PWC) [1] is a language-specific or language-universal strategy for the segmentation of continuous speech. The PWC disfavours parses which leave an impossible residue between the end of a candidate word and a known boundary. The experiments examined cases where the residue was either a CV syllable with a lax vowel, or a CVC syllable with a schwa. Although neither syllable context is a possible word in English, word-spotting in both contexts was easier than with a context consisting of a single consonant. The PWC appears to be language-universal rather than language-specific.
  • Norris, D., Cutler, A., & McQueen, J. M. (2000). The optimal architecture for simulating spoken-word recognition. In C. Davis, T. Van Gelder, & R. Wales (Eds.), Cognitive Science in Australia, 2000: Proceedings of the Fifth Biennial Conference of the Australasian Cognitive Science Society. Adelaide: Causal Productions.

    Abstract

    Simulations explored the inability of the TRACE model of spoken-word recognition to model the effects on human listening of subcategorical mismatch in word forms. The source of TRACE's failure lay not in interactive connectivity, not in the presence of inter-word competition, and not in the use of phonemic representations, but in the need for continuously optimised interpretation of the input. When an analogue of TRACE was allowed to cycle to asymptote on every slice of input, an acceptable simulation of the subcategorical mismatch data was achieved. Even then, however, the simulation was not as close as that produced by the Merge model, which has inter-word competition, phonemic representations and continuous optimisation (but no interactive connectivity).
  • Oostdijk, N., Goedertier, W., Van Eynde, F., Boves, L., Martens, J.-P., Moortgat, M., & Baayen, R. H. (2002). Experiences from the Spoken Dutch Corpus Project. In Third international conference on language resources and evaluation (pp. 340-347). Paris: European Language Resources Association.
  • Ortega, G., & Ozyurek, A. (2013). Gesture-sign interface in hearing non-signers' first exposure to sign. In Proceedings of the Tilburg Gesture Research Meeting [TiGeR 2013].

    Abstract

    Natural sign languages and gestures are complex communicative systems that allow the incorporation of features of a referent into their structure. They differ, however, in that signs are more conventionalised because they consist of meaningless phonological parameters. There is some evidence that despite non-signers finding iconic signs more memorable they can have more difficulty at articulating their exact phonological components. In the present study, hearing non-signers took part in a sign repetition task in which they had to imitate as accurately as possible a set of iconic and arbitrary signs. Their renditions showed that iconic signs were articulated significantly less accurately than arbitrary signs. Participants were recalled six months later to take part in a sign generation task. In this task, participants were shown the English translation of the iconic signs they imitated six months prior. For each word, participants were asked to generate a sign (i.e., an iconic gesture). The handshapes produced in the sign repetition and sign generation tasks were compared to detect instances in which both renditions presented the same configuration. There was a significant correlation between articulation accuracy in the sign repetition task and handshape overlap. These results suggest some form of gestural interference in the production of iconic signs by hearing non-signers. We also suggest that in some instances non-signers may deploy their own conventionalised gesture when producing some iconic signs. These findings are interpreted as evidence that non-signers process iconic signs as gestures and that in production, only when sign and gesture have overlapping features will they be capable of producing the phonological components of signs accurately.
  • Otake, T., & Cutler, A. (2000). A set of Japanese word cohorts rated for relative familiarity. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 3 (pp. 766-769). Beijing: China Military Friendship Publish.

    Abstract

    A database is presented of relative familiarity ratings for 24 sets of Japanese words, each set comprising words overlapping in the initial portions. These ratings are useful for the generation of material sets for research in the recognition of spoken words.

Share this page