Publications

Displaying 401 - 428 of 428
  • Wilson, J. J., & Little, H. (2016). A Neo-Peircean framework for experimental semiotics. In Proceedings of the 2nd Conference of the International Association for Cognitive Semiotics (pp. 171-173).
  • Wilson, J. J., & Little, H. (2014). Emerging languages in Esoteric and Exoteric Niches: evidence from Rural Sign Languages. In Ways to Potolanguage 3 book of abstracts (pp. 54-55).
  • Windhouwer, M., Kemps-Snijders, M., Trilsbeek, P., Moreira, A., Van der Veen, B., Silva, G., & Von Rhein, D. (2016). FLAT: Constructing a CLARIN Compatible Home for Language Resources. In K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, & A. Moreno (Eds.), Proccedings of LREC 2016: 10th International Conference on Language Resources and Evalution (pp. 2478-2483). Paris: European Language Resources Association (ELRA).

    Abstract

    Language resources are valuable assets, both for institutions and researchers. To safeguard these resources requirements for repository systems and data management have been specified by various branch organizations, e.g., CLARIN and the Data Seal of Approval. This paper describes these and some additional ones posed by the authors’ home institutions. And it shows how they are met by FLAT, to provide a new home for language resources. The basis of FLAT is formed by the Fedora Commons repository system. This repository system can meet many of the requirements out-of-the box, but still additional configuration and some development work is needed to meet the remaining ones, e.g., to add support for Handles and Component Metadata. This paper describes design decisions taken in the construction of FLAT’s system architecture via a mix-and-match strategy, with a preference for the reuse of existing solutions. FLAT is developed and used by the a Institute and The Language Archive, but is also freely available for anyone in need of a CLARIN-compliant repository for their language resources.
  • Windhouwer, M., Petro, J., & Shayan, S. (2014). RELISH LMF: Unlocking the full power of the lexical markup framework. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 1032-1037).
  • Wittek, A. (1998). Learning verb meaning via adverbial modification: Change-of-state verbs in German and the adverb "wieder" again. In A. Greenhill, M. Hughes, H. Littlefield, & H. Walsh (Eds.), Proceedings of the 22nd Annual Boston University Conference on Language Development (pp. 779-790). Somerville, MA: Cascadilla Press.
  • Wittenburg, P. (2004). The IMDI metadata concept. In S. F. Ferreira (Ed.), Workingmaterial on Building the LR&E Roadmap: Joint COCOSDA and ICCWLRE Meeting, (LREC2004). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Brugman, H., Broeder, D., & Russel, A. (2004). XML-based language archiving. In Workshop Proceedings on XML-based Richly Annotaded Corpora (LREC2004) (pp. 63-69). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Gulrajani, G., Broeder, D., & Uneson, M. (2004). Cross-disciplinary integration of metadata descriptions. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC2004) (pp. 113-116). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Kita, S., & Brugman, H. (2002). Crosslinguistic studies of multimodal communication.
  • Wittenburg, P., Johnson, H., Buchhorn, M., Brugman, H., & Broeder, D. (2004). Architecture for distributed language resource management and archiving. In M. Lino, M. Xavier, F. Ferreira, R. Costa, & R. Silva (Eds.), Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC2004) (pp. 361-364). Paris: ELRA - European Language Resources Association.
  • Wittenburg, P., Peters, W., & Drude, S. (2002). Analysis of lexical structures from field linguistics and language engineering. In M. R. González, & C. P. S. Araujo (Eds.), Third international conference on language resources and evaluation (pp. 682-686). Paris: European Language Resources Association.

    Abstract

    Lexica play an important role in every linguistic discipline. We are confronted with many types of lexica. Depending on the type of lexicon and the language we are currently faced with a large variety of structures from very simple tables to complex graphs, as was indicated by a recent overview of structures found in dictionaries from field linguistics and language engineering. It is important to assess these differences and aim at the integration of lexical resources in order to improve lexicon creation, exchange and reuse. This paper describes the first step towards the integration of existing structures and standards into a flexible abstract model.
  • Wittenburg, P., & Broeder, D. (2002). Metadata overview and the semantic web. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics. Paris: European Language Resources Association.

    Abstract

    The increasing quantity and complexity of language resources leads to new management problems for those that collect and those that need to preserve them. At the same time the desire to make these resources available on the Internet demands an efficient way characterizing their properties to allow discovery and re-use. The use of metadata is seen as a solution for both these problems. However, the question is what specific requirements there are for the specific domain and if these are met by existing frameworks. Any possible solution should be evaluated with respect to its merit for solving the domain specific problems but also with respect to its future embedding in “global” metadata frameworks as part of the Semantic Web activities.
  • Wittenburg, P., Peters, W., & Broeder, D. (2002). Metadata proposals for corpora and lexica. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 1321-1326). Paris: European Language Resources Association.
  • Wittenburg, P., Mosel, U., & Dwyer, A. (2002). Methods of language documentation in the DOBES program. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 36-42). Paris: European Language Resources Association.
  • Wnuk, E. (2016). Specificity at the basic level in event taxonomies: The case of Maniq verbs of ingestion. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 2687-2692). Austin, TX: Cognitive Science Society.

    Abstract

    Previous research on basic-level object categories shows there is cross-cultural variation in basic-level concepts, arguing against the idea that the basic level reflects an objective reality. In this paper, I extend the investigation to the domain of events. More specifically, I present a case study of verbs of ingestion in Maniq illustrating a highly specific categorization of ingestion events at the basic level. A detailed analysis of these verbs reveals they tap into culturally salient notions. Yet, cultural salience alone cannot explain specificity of basic-level verbs, since ingestion is a domain of universal human experience. Further analysis reveals, however, that another key factor is the language itself. Maniq’s preference for encoding specific meaning in basic-level verbs is not a peculiarity of one domain, but a recurrent characteristic of its verb lexicon, pointing to the significant role of the language system in the structure of event concepts
  • Woensdregt, M., & Dingemanse, M. (2020). Other-initiated repair can facilitate the emergence of compositional language. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 474-476). Nijmegen: The Evolution of Language Conferences.
  • Won, S.-O., Hu, I., Kim, M.-Y., Bae, J.-M., Kim, Y.-M., & Byun, K.-S. (2009). Theory and practice of Sign Language interpretation. Pyeongtaek: Korea National College of Rehabilitation & Welfare.
  • Wright, S. E., Windhouwer, M., Schuurman, I., & Broeder, D. (2014). Segueing from a Data Category Registry to a Data Concept Registry. In Proceedings of the 11th International Conference on Terminology and Knowledge Engineering (TKE 2014).

    Abstract

    The terminology Community of Practice has long standardized data categories in the framework of ISO TC 37. ISO 12620:2009 specifies the data model and procedures for a Data Category Registry (DCR), which has been implemented by the Max Planck Institute for Psycholinguistics as the ISOcat DCR. The DCR has been used by not only ISO TC 37, but also by the CLARIN research infra-structure. This paper describes how the needs of these communities have started to diverge and the process of segueing from a DCR to a Data Concept Registry in order to meet the needs of both communities.
  • Xiao, M., Kong, X., Liu, J., & Ning, J. (2009). TMBF: Bloom filter algorithms of time-dependent multi bit-strings for incremental set. In Proceedings of the 2009 International Conference on Ultra Modern Telecommunications & Workshops.

    Abstract

    Set is widely used as a kind of basic data structure. However, when it is used for large scale data set the cost of storage, search and transport is overhead. The bloom filter uses a fixed size bit string to represent elements in a static set, which can reduce storage space and search cost that is a fixed constant. The time-space efficiency is achieved at the cost of a small probability of false positive in membership query. However, for many applications the space savings and locating time constantly outweigh this drawback. Dynamic bloom filter (DBF) can support concisely representation and approximate membership queries of dynamic set instead of static set. It has been proved that DBF not only possess the advantage of standard bloom filter, but also has better features when dealing with dynamic set. This paper proposes a time-dependent multiple bit-strings bloom filter (TMBF) which roots in the DBF and targets on dynamic incremental set. TMBF uses multiple bit-strings in time order to present a dynamic increasing set and uses backward searching to test whether an element is in a set. Based on the system logs from a real P2P file sharing system, the evaluation shows a 20% reduction in searching cost compared to DBF.
  • Yang, J., Van den Bosch, A., & Frank, S. L. (2020). Less is Better: A cognitively inspired unsupervised model for language segmentation. In M. Zock, E. Chersoni, A. Lenci, & E. Santus (Eds.), Proceedings of the Workshop on the Cognitive Aspects of the Lexicon ( 28th International Conference on Computational Linguistics) (pp. 33-45). Stroudsburg: Association for Computational Linguistics.

    Abstract

    Language users process utterances by segmenting them into many cognitive units, which vary in their sizes and linguistic levels. Although we can do such unitization/segmentation easily, its cognitive mechanism is still not clear. This paper proposes an unsupervised model, Less-is-Better (LiB), to simulate the human cognitive process with respect to language unitization/segmentation. LiB follows the principle of least effort and aims to build a lexicon which minimizes the number of unit tokens (alleviating the effort of analysis) and number of unit types (alleviating the effort of storage) at the same time on any given corpus. LiB’s workflow is inspired by empirical cognitive phenomena. The design makes the mechanism of LiB cognitively plausible and the computational requirement light-weight. The lexicon generated by LiB performs the best among different types of lexicons (e.g. ground-truth words) both from an information-theoretical view and a cognitive view, which suggests that the LiB lexicon may be a plausible proxy of the mental lexicon.

    Additional information

    full text via ACL website
  • Yang, A., & Chen, A. (2014). Prosodic focus marking in child and adult Mandarin Chinese. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 54-58).

    Abstract

    This study investigates how Mandarin Chinese speaking children and adults use prosody to mark focus in spontaneous speech. SVO sentences were elicited from 4- and 8-year-olds and adults in a game setting. Sentence-medial verbs were acoustically analysed for both duration and pitch range in different focus conditions. We have found that like the adults, the 8-year-olds used both duration and pitch range to distinguish focus from non-focus. The 4-year-olds used only duration to distinguish focus from non-focus, unlike the adults and 8-year-olds. None of the three groups of speakers distinguished contrastive focus from non-contrastive focus using pitch range or duration. Regarding the distinction between narrow focus from broad focus, the 4- and 8-year-olds used both pitch range and duration for this purpose, while the adults used only duration
  • Yang, A., & Chen, A. (2014). Prosodic focus-marking in Chinese four- and eight-year-olds. In N. Campbell, D. Gibbon, & D. Hirst (Eds.), Proceedings of Speech Prosody 2014 (pp. 713-717).

    Abstract

    This study investigates how Mandarin Chinese speaking children use prosody to distinguish focus from non-focus, and focus types differing in size of constituent and contrastivity. SVO sentences were elicited from four- and eight-year-olds in a game setting. Sentence-medial verbs were acoustically analysed for both duration and pitch range in different focus conditions. The children started to use duration to differentiate focus from non-focus at the age of four. But their use of pitch range varied with age and depended on non-focus conditions (pre- vs. postfocus) and the lexical tones of the verbs. Further, the children in both age groups used pitch range but not duration to differentiate narrow focus from broad focus, and they did not differentiate contrastive narrow focus from non-contrastive narrow focus using duration or pitch range. The results indicated that Chinese children acquire the prosodic means (duration and pitch range) of marking focus in stages, and their acquisition of these two means appear to be early, compared to children speaking an intonation language, for example, Dutch.
  • Zampieri, M., & Gebre, B. G. (2014). VarClass: An open-source language identification tool for language varieties. In N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2014: 9th International Conference on Language Resources and Evaluation (pp. 3305-3308).

    Abstract

    This paper presents VarClass, an open-source tool for language identification available both to be downloaded as well as through a graphical user-friendly interface. The main difference of VarClass in comparison to other state-of-the-art language identification tools is its focus on language varieties. General purpose language identification tools do not take language varieties into account and our work aims to fill this gap. VarClass currently contains language models for over 27 languages in which 10 of them are language varieties. We report an average performance of over 90.5% accuracy in a challenging dataset. More language models will be included in the upcoming months
  • Zeshan, U. (2004). Basic English course taught in Indian Sign Language (Ali Yavar Young National Institute for Hearing Handicapped, Ed.). National Institute for the Hearing Handicapped: Mumbai.
  • Zhang, Y., & Yu, C. (2016). Examining referential uncertainty in naturalistic contexts from the child’s view: Evidence from an eye-tracking study with infants. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016). Austin, TX: Cognitive Science Society (pp. 2027-2032). Austin, TX: Cognitive Science Society.

    Abstract

    Young Infants are prolific word learners even though they are facing the challenge of referential uncertainty (Quine, 1960). Many laboratory studies have shown that infants are skilled at inferring correct referents of words from ambiguous contexts (Swingley, 2009). However, little is known regarding how they visually attend to and select the target object among many other objects in view when parents name it during everyday interactions. By investigating the looking pattern of 12-month-old infants using naturalistic first-person images with varying degrees of referential ambiguity, we found that infants’ attention is selective and they only select a small subset of objects to attend to at each learning instance despite the complexity of the data in the real world. This work allows us to better understand how perceptual properties of objects in infants’ view influence their visual attention, which is also related to how they select candidate objects to build word-object mappings.
  • Zhang, Y., Amatuni, A., Crain, E., & Yu, C. (2020). Seeking meaning: Examining a cross-situational solution to learn action verbs using human simulation paradigm. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 2854-2860). Montreal, QB: Cognitive Science Society.

    Abstract

    To acquire the meaning of a verb, language learners not only need to find the correct mapping between a specific verb and an action or event in the world, but also infer the underlying relational meaning that the verb encodes. Most verb naming instances in naturalistic contexts are highly ambiguous as many possible actions can be embedded in the same scenario and many possible verbs can be used to describe those actions. To understand whether learners can find the correct verb meaning from referentially ambiguous learning situations, we conducted three experiments using the Human Simulation Paradigm with adult learners. Our results suggest that although finding the right verb meaning from one learning instance is hard, there is a statistical solution to this problem. When provided with multiple verb learning instances all referring to the same verb, learners are able to aggregate information across situations and gradually converge to the correct semantic space. Even in cases where they may not guess the exact target verb, they can still discover the right meaning by guessing a similar verb that is semantically close to the ground truth.
  • Zhou, W., & Broersma, M. (2014). Perception of birth language tone contrasts by adopted Chinese children. In C. Gussenhoven, Y. Chen, & D. Dediu (Eds.), Proceedings of the 4th International Symposium on Tonal Aspects of Language (pp. 63-66).

    Abstract

    The present study investigates how long after adoption adoptees forget the phonology of their birth language. Chinese children who were adopted by Dutch families were tested on the perception of birth language tone contrasts before, during, and after perceptual training. Experiment 1 investigated Cantonese tone 2 (High-Rising) and tone 5 (Low-Rising), and Experiment 2 investigated Mandarin tone 2 (High-Rising) and tone 3 (Low-Dipping). In both experiments, participants were adoptees and non-adopted Dutch controls. Results of both experiments show that the tone contrasts were very difficult to perceive for the adoptees, and that adoptees were not better at perceiving the tone contrasts than their non-adopted Dutch peers, before or after training. This demonstrates that forgetting took place relatively soon after adoption, and that the re-exposure that the adoptees were presented with did not lead to an improvement greater than that of the Dutch control participants. Thus, the findings confirm what has been anecdotally reported by adoptees and their parents, but what had not been empirically tested before, namely that birth language forgetting occurs very soon after adoption
  • Zwitserlood, I. (2002). The complex structure of ‘simple’ signs in NGT. In J. Van Koppen, E. Thrift, E. Van der Torre, & M. Zimmermann (Eds.), Proceedings of ConSole IX (pp. 232-246).

    Abstract

    In this paper, I argue that components in a set of simple signs in Nederlandse Gebarentaal (also called Sign Language of the Netherlands; henceforth: NGT), i.e. hand configuration (including orientation), movement and place of articulation, can also have morphological status. Evidence for this is provided by: firstly, the fact that handshape, orientation, movement and place of articulation show regular meaningful patterns in signs, which patterns also occur in newly formed signs, and secondly, the gradual change of formerly noninflecting predicates into inflectional predicates. The morphological complexity of signs can best be accounted for in autosegmental morphological templates.

Share this page