Publications

Displaying 301 - 347 of 347
  • Ten Bosch, L., & Scharenborg, O. (2012). Modeling cue trading in human word recognition. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 2003-2006).

    Abstract

    Classical phonetic studies have shown that acoustic-articulatory cues can be interchanged without affecting the resulting phoneme percept (‘cue trading’). Cue trading has so far mainly been investigated in the context of phoneme identification. In this study, we investigate cue trading in word recognition, because words are the units of speech through which we communicate. This paper aims to provide a method to quantify cue trading effects by using a computational model of human word recognition. This model takes the acoustic signal as input and represents speech using articulatory feature streams. Importantly, it allows cue trading and underspecification. Its set-up is inspired by the functionality of Fine-Tracker, a recent computational model of human word recognition. This approach makes it possible, for the first time, to quantify cue trading in terms of a trade-off between features and to investigate cue trading in the context of a word recognition task.
  • Torreira, F., & Ernestus, M. (2009). Probabilistic effects on French [t] duration. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 448-451). Causal Productions Pty Ltd.

    Abstract

    The present study shows that [t] consonants are affected by probabilistic factors in a syllable-timed language as French, and in spontaneous as well as in journalistic speech. Study 1 showed a word bigram frequency effect in spontaneous French, but its exact nature depended on the corpus on which the probabilistic measures were based. Study 2 investigated journalistic speech and showed an effect of the joint frequency of the test word and its following word. We discuss the possibility that these probabilistic effects are due to the speaker’s planning of upcoming words, and to the speaker’s adaptation to the listener’s needs.
  • Turco, G., & Gubian, M. (2012). L1 Prosodic transfer and priming effects: A quantitative study on semi-spontaneous dialogues. In Q. Ma, H. Ding, & D. Hirst (Eds.), Proceedings of the 6th International Conference on Speech Prosody (pp. 386-389). International Speech Communication Association (ISCA).

    Abstract

    This paper represents a pilot investigation of primed accentuation patterns produced by advanced Dutch speakers of Italian as a second language (L2). Contrastive accent patterns within prepositional phrases were elicited in a semispontaneous dialogue entertained with a confederate native speaker of Italian. The aim of the analysis was to compare learner’s contrastive accentual configurations induced by the confederate speaker’s prime against those produced by Italian and Dutch natives in the same testing conditions. F0 and speech rate data were analysed by applying powerful datadriven techniques available in the Functional Data Analysis statistical framework. Results reveal different accentual configurations in L1 and L2 Italian in response to the confederate’s prime. We conclude that learner’s accentual patterns mirror those ones produced by their L1 control group (prosodic-transfer hypothesis) although the hypothesis of a transient priming effect on learners’ choice of contrastive patterns cannot be completely ruled out.
  • Uddén, J., Araújo, S., Forkstam, C., Ingvar, M., Hagoort, P., & Petersson, K. M. (2009). A matter of time: Implicit acquisition of recursive sequence structures. In N. Taatgen, & H. Van Rijn (Eds.), Proceedings of the Thirty-First Annual Conference of the Cognitive Science Society (pp. 2444-2449).

    Abstract

    A dominant hypothesis in empirical research on the evolution of language is the following: the fundamental difference between animal and human communication systems is captured by the distinction between regular and more complex non-regular grammars. Studies reporting successful artificial grammar learning of nested recursive structures and imaging studies of the same have methodological shortcomings since they typically allow explicit problem solving strategies and this has been shown to account for the learning effect in subsequent behavioral studies. The present study overcomes these shortcomings by using subtle violations of agreement structure in a preference classification task. In contrast to the studies conducted so far, we use an implicit learning paradigm, allowing the time needed for both abstraction processes and consolidation to take place. Our results demonstrate robust implicit learning of recursively embedded structures (context-free grammar) and recursive structures with cross-dependencies (context-sensitive grammar) in an artificial grammar learning task spanning 9 days. Keywords: Implicit artificial grammar learning; centre embedded; cross-dependency; implicit learning; context-sensitive grammar; context-free grammar; regular grammar; non-regular grammar
  • Vainio, M., Suni, A., Raitio, T., Nurminen, J., Järvikivi, J., & Alku, P. (2009). New method for delexicalization and its application to prosodic tagging for text-to-speech synthesis. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 1703-1706).

    Abstract

    This paper describes a new flexible delexicalization method based on glottal excited parametric speech synthesis scheme. The system utilizes inverse filtered glottal flow and all-pole modelling of the vocal tract. The method provides a possibility to retain and manipulate all relevant prosodic features of any kind of speech. Most importantly, the features include voice quality, which has not been properly modeled in earlier delexicalization methods. The functionality of the new method was tested in a prosodic tagging experiment aimed at providing word prominence data for a text-to-speech synthesis system. The experiment confirmed the usefulness of the method and further corroborated earlier evidence that linguistic factors influence the perception of prosodic prominence.
  • Van Berkum, J. J. A. (2009). The neuropragmatics of 'simple' utterance comprehension: An ERP review. In U. Sauerland, & K. Yatsushiro (Eds.), Semantics and pragmatics: From experiment to theory (pp. 276-316). Basingstoke: Palgrave Macmillan.

    Abstract

    In this chapter, I review my EEG research on comprehending sentences in context from a pragmatics-oriented perspective. The review is organized around four questions: (1) When and how do extra-sentential factors such as the prior text, identity of the speaker, or value system of the comprehender affect the incremental sentence interpretation processes indexed by the so-called N400 component of the ERP? (2) When and how do people identify the referents for expressions such as “he” or “the review”, and how do referential processes interact with sense and syntax? (3) How directly pragmatic are the interpretation-relevant ERP effects reported here? (4) Do readers and listeners anticipate upcoming information? One important claim developed in the chapter is that the well-known N400 component, although often associated with ‘semantic integration’, only indirectly reflects the sense-making involved in structure-sensitive dynamic composition of the type studied in semantics and pragmatics. According to the multiple-cause intensified retrieval (MIR) account -- essentially an extension of the memory retrieval account proposed by Kutas and colleagues -- the amplitude of the word-elicited N400 reflects the computational resources used in retrieving the relatively invariant coded meaning stored in semantic long-term memory for, and made available by, the word at hand. Such retrieval becomes more resource-intensive when the coded meanings cued by this word do not match with expectations raised by the relevant interpretive context, but also when certain other relevance signals, such as strong affective connotation or a marked delivery, indicate the need for deeper processing. The most important consequence of this account is that pragmatic modulations of the N400 come about not because the N400 at hand directly reflects a rich compositional-semantic and/or Gricean analysis to make sense of the word’s coded meaning in this particular context, but simply because the semantic and pragmatic implications of the preceding words have already been computed, and now define a less or more helpful interpretive background within which to retrieve coded meaning for the critical word.
  • Van Valin Jr., R. D. (2009). Case in role and reference grammar. In A. Malchukov, & A. Spencer (Eds.), The Oxford handbook of case (pp. 102-120). Oxford University Press.
  • Van Valin Jr., R. D., & Guerrero, L. (2012). De sujetos, pivotes y controladores: El argumento sintácticamente privilegiado. In R. Marial, L. Guerrero, & C. González Vergara (Eds.), El funcionalismo en la teoría lingüística: La gramática del papel y la referencia (pp. 247-267). Madrid: Akal.

    Abstract

    Translated and expanded version of 'Privileged syntactic arguments, pivots and controllers
  • Van Berkum, J. J. A. (2009). Does the N400 directly reflect compositional sense-making? Psychophysiology, Special Issue: Society for Psychophysiological Research Abstracts for the Forty-Ninth Annual Meeting, 46(Suppl. 1), s2.

    Abstract

    A not uncommon assumption in psycholinguistics is that the N400 directly indexes high-level semantic integration, the compositional, word-driven construction of sentence- and discourse-level meaning in some language-relevant unification space. The various discourse- and speaker-dependent modulations of the N400 uncovered by us and others are often taken to support this 'compositional integration' position. In my talk, I will argue that these N400 modulations are probably better interpreted as only indirectly reflecting compositional sense-making. The account that I will advance for these N400 effects is a variant of the classic Kutas and Federmeier (2002, TICS) memory retrieval account in which context effects on the word-elicited N400 are taken to reflect contextual priming of LTM access. It differs from the latter in making more explicit that the contextual cues that prime access to a word's meaning in LTM can range from very simple (e.g., a single concept) to very complex ones (e.g., a structured representation of the current discourse). Furthermore, it incorporates the possibility, suggested by recent N400 findings, that semantic retrieval can also be intensified in response to certain ‘relevance signals’, such as strong value-relevance, or a marked delivery (linguistic focus, uncommon choice of words, etc). In all, the perspective I'll draw is that in the context of discourse-level language processing, N400 effects reflect an 'overlay of technologies', with the construction of discourse-level representations riding on top of more ancient sense-making technology.
  • Van Gijn, R., & Gipper, S. (2009). Irrealis in Yurakaré and other languages: On the cross-linguistic consistency of an elusive category. In L. Hogeweg, H. De Hoop, & A. Malchukov (Eds.), Cross-linguistic semantics of tense, aspect, and modality (pp. 155-178). Amsterdam: Benjamins.

    Abstract

    The linguistic category of irrealis does not show stable semantics across languages. This makes it difficult to formulate general statements about this category, and it has led some researchers to reject irrealis as a cross-linguistically valid category. In this paper we look at the semantics of the irrealis category of Yurakaré, an unclassified language spoken in central Bolivia, and compare it to irrealis semantics of a number of other languages. Languages differ with respect to the subcategories they subsume under the heading of irrealis. The variable subcategories are future tense, imperatives, negatives, and habitual aspect. We argue that the cross-linguistic variation is not random, and can be stated in terms of an implicational scale.
  • Van Geenhoven, V. (1998). On the Argument Structure of some Noun Incorporating Verbs in West Greenlandic. In M. Butt, & W. Geuder (Eds.), The Projection of Arguments - Lexical and Compositional Factors (pp. 225-263). Stanford, CA, USA: CSLI Publications.
  • Van Valin Jr., R. D. (2009). Privileged syntactic arguments, pivots and controllers. In L. Guerrero, S. Ibáñez, & V. A. Belloro (Eds.), Studies in role and reference grammar (pp. 45-68). Mexico: Universidad Nacional Autónoma de México.
  • Van Valin Jr., R. D. (1998). The acquisition of WH-questions and the mechanisms of language acquisition. In M. Tomasello (Ed.), The new psychology of language: Cognitive and functional approaches to language structure (pp. 221-249). Mahwah, New Jersey: Erlbaum.
  • Van Berkum, J. J. A. (2012). The electrophysiology of discourse and conversation. In M. J. Spivey, K. McRae, & M. F. Joanisse (Eds.), The Cambridge handbook of psycholinguistics (pp. 589-614). New York: Cambridge University Press.

    Abstract

    Introduction: What’s happening in the brains of two people having a conversation? One reasonable guess is that in the fMRI scanner we’d see most of their brains light up. Another is that their EEG will be a total mess, reflecting dozens of interacting neuronal systems. Conversation recruits all of the basic language systems reviewed in this book. It also heavily taxes cognitive systems more likely to be found in handbooks of memory, attention and control, or social cognition (Brownell & Friedman, 2001). With most conversations going beyond the single utterance, for instance, they place a heavy load on episodic memory, as well as on the systems that allow us to reallocate cognitive resources to meet the demands of a dynamically changing situation. Furthermore, conversation is a deeply social and collaborative enterprise (Clark, 1996; this volume), in which interlocutors have to keep track of each others state of mind and coordinate on such things as taking turns, establishing common ground, and the goals of the conversation.
  • Van Valin Jr., R. D. (2012). Some issues in the linking between syntax and semantics in relative clauses. In B. Comrie, & Z. Estrada-Fernández (Eds.), Relative Clauses in languages of the Americas: A typological overview (pp. 47-64). Amsterdam: Benjamins.

    Abstract

    Relative clauses present an interesting challenge for theories of the syntaxsemantics interface, because one element functions simultaneously in the matrix and relative clauses. The exact nature of the challenge depends on whether the relative clause is externally-headed or internallyheaded. Standard analyses of relative clauses are grounded in the analysis of Englishtype externally-headed constructions involving a relative pronoun, e.g. The horse which the man bought was a good horse, despite its typological rarity, and such accounts typically involve movement rules, both overt and covert, and phonologically null elements. The analysis of internally-headed relative clauses often involves the positing of an abstract structure including a null external head, with covert movement of the internal head to that position. The purpose of this paper is to show that the essential features of both types of relative clause can be captured in a syntactic theory that eschews movement rules and phonologically null elements, Role and Reference Grammar. It will be argued that a single set of linking principles can handle the syntax-to-semantics linking for both types. Keywords: Externally-headed relative clauses; internally-headed relative clauses; Role and Reference Grammar; linking syntax and semantics
  • Van Valin Jr., R. D. (2009). Role and reference grammar. In F. Brisard, J.-O. Östman, & J. Verschueren (Eds.), Grammar, meaning, and pragmatics (pp. 239-249). Amsterdam: Benjamins.
  • Van de Ven, M., Tucker, B. V., & Ernestus, M. (2009). Semantic context effects in the recognition of acoustically unreduced and reduced words. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (pp. 1867-1870). Causal Productions Pty Ltd.

    Abstract

    Listeners require context to understand the casual pronunciation variants of words that are typical of spontaneous speech (Ernestus et al., 2002). The present study reports two auditory lexical decision experiments, investigating listeners' use of semantic contextual information in the comprehension of unreduced and reduced words. We found a strong semantic priming effect for low frequency unreduced words, whereas there was no such effect for reduced words. Word frequency was facilitatory for all words. These results show that semantic context is relevant especially for the comprehension of unreduced words, which is unexpected given the listener driven explanation of reduction in spontaneous speech.
  • Van Uytvanck, D., Stehouwer, H., & Lampen, L. (2012). Semantic metadata mapping in practice: The Virtual Language Observatory. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 1029-1034). European Language Resources Association (ELRA).

    Abstract

    In this paper we present the Virtual Language Observatory (VLO), a metadata-based portal for language resources. It is completely based on the Component Metadata (CMDI) and ISOcat standards. This approach allows for the use of heterogeneous metadata schemas while maintaining the semantic compatibility. We describe the metadata harvesting process, based on OAI-PMH, and the conversion from several formats (OLAC, IMDI and the CLARIN LRT inventory) to their CMDI counterpart profiles. Then we focus on some post-processing steps to polish the harvested records. Next, the ingestion of the CMDI files into the VLO facet browser is described. We also include an overview of the changes since the first version of the VLO, based on user feedback from the CLARIN community. Finally there is an overview of additional ideas and improvements for future versions of the VLO.
  • van Hell, J. G., & Witteman, M. J. (2009). The neurocognition of switching between languages: A review of electrophysiological studies. In L. Isurin, D. Winford, & K. de Bot (Eds.), Multidisciplinary approaches to code switching (pp. 53-84). Philadelphia: John Benjamins.

    Abstract

    The seemingly effortless switching between languages and the merging of two languages into a coherent utterance is a hallmark of bilingual language processing, and reveals the flexibility of human speech and skilled cognitive control. That skill appears to be available not only to speakers when they produce language-switched utterances, but also to listeners and readers when presented with mixed language information. In this chapter, we review electrophysiological studies in which Event-Related Potentials (ERPs) are derived from recordings of brain activity to examine the neurocognitive aspects of comprehending and producing mixed language. Topics we discuss include the time course of brain activity associated with language switching between single stimuli and language switching of words embedded in a meaningful sentence context. The majority of ERP studies report that switching between languages incurs neurocognitive costs, but –more interestingly- ERP patterns differ as a function of L2 proficiency and the amount of daily experience with language switching, the direction of switching (switching into L2 is typically associated with higher switching costs than switching into L1), the type of language switching task, and the predictability of the language switch. Finally, we outline some future directions for this relatively new approach to the study of language switching.
  • Verhagen, J. (2009). Light verbs and the acquisition of finiteness and negation in Dutch as a second language. In C. Dimroth, & P. Jordens (Eds.), Functional categories in learner language (pp. 203-234). Berlin: Mouton de Gruyter.
  • Verkerk, A. (2009). A semantic map of secondary predication. In B. Botma, & J. Van Kampen (Eds.), Linguistics in the Netherlands 2009 (pp. 115-126).
  • Viebahn, M. C., Ernestus, M., & McQueen, J. M. (2012). Co-occurrence of reduced word forms in natural speech. In Proceedings of INTERSPEECH 2012: 13th Annual Conference of the International Speech Communication Association (pp. 2019-2022).

    Abstract

    This paper presents a corpus study that investigates the co-occurrence of reduced word forms in natural speech. We extracted Dutch past participles from three different speech registers and investigated the influence of several predictor variables on the presence and duration of schwas in prefixes and /t/s in suffixes. Our results suggest that reduced word forms tend to co-occur even if we partial out the effect of speech rate. The implications of our findings for episodic and abstractionist models of lexical representation are discussed.
  • Von Stutterheim, C., Carroll, M., & Klein, W. (2009). New perspectives in analyzing aspectual distinctions across languages. In W. Klein, & P. Li (Eds.), The expression of time (pp. 195-216). Berlin: Mouton de Gruyter.
  • De Vos, C., & Zeshan, U. (2012). Introduction: Demographic, sociocultural, and linguistic variation across rural signing communities. In U. Zeshan, & C. de Vos (Eds.), Sign languages in village communities: Anthropological and linguistic insights (pp. 2-23). Berlin: Mouton De Gruyter.
  • De Vos, C. (2012). Kata Kolok: An updated sociolinguistic profile. In U. Zeshan (Ed.), Sign languages in village communities: Anthropological and linguistic insights (pp. 381-386). Berlin: Mouton de Gruyter.
  • De Vos, C. (2012). The Kata Kolok perfective in child signing: Coordination of manual and non-manual components. In U. Zeshan, & C. De Vos (Eds.), Sign languages in village communities: Anthropological and linguistic insights (pp. 127-152). Berlin: Mouton de Gruyter.
  • Warner, N. L., McQueen, J. M., Liu, P. Z., Hoffmann, M., & Cutler, A. (2012). Timing of perception for all English diphones [Abstract]. Program abstracts from the 164th Meeting of the Acoustical Society of America published in the Journal of the Acoustical Society of America, 132(3), 1967.

    Abstract

    Information in speech does not unfold discretely over time; perceptual cues are gradient and overlapped. However, this varies greatly across segments and environments: listeners cannot identify the affricate in /ptS/ until the frication, but information about the vowel in /li/ begins early. Unlike most prior studies, which have concentrated on subsets of language sounds, this study tests perception of every English segment in every phonetic environment, sampling perceptual identification at six points in time (13,470 stimuli/listener; 20 listeners). Results show that information about consonants after another segment is most localized for affricates (almost entirely in the release), and most gradual for voiced stops. In comparison to stressed vowels, unstressed vowels have less information spreading to
    neighboring segments and are less well identified. Indeed, many vowels,
    especially lax ones, are poorly identified even by the end of the following segment. This may partly reflect listeners’ familiarity with English vowels’ dialectal variability. Diphthongs and diphthongal tense vowels show the most sudden improvement in identification, similar to affricates among the consonants, suggesting that information about segments defined by acoustic change is highly localized. This large dataset provides insights into speech perception and data for probabilistic modeling of spoken word recognition.
  • Weber, A. (1998). Listening to nonnative language which violates native assimilation rules. In D. Duez (Ed.), Proceedings of the European Scientific Communication Association workshop: Sound patterns of Spontaneous Speech (pp. 101-104).

    Abstract

    Recent studies using phoneme detection tasks have shown that spoken-language processing is neither facilitated nor interfered with by optional assimilation, but is inhibited by violation of obligatory assimilation. Interpretation of these results depends on an assessment of their generality, specifically, whether they also obtain when listeners are processing nonnative language. Two separate experiments are presented in which native listeners of German and native listeners of Dutch had to detect a target fricative in legal monosyllabic Dutch nonwords. All of the nonwords were correct realisations in standard Dutch. For German listeners, however, half of the nonwords contained phoneme strings which violate the German fricative assimilation rule. Whereas the Dutch listeners showed no significant effects, German listeners detected the target fricative faster when the German fricative assimilation was violated than when no violation occurred. The results might suggest that violation of assimilation rules does not have to make processing more difficult per se.
  • Weber, A., & Broersma, M. (2012). Spoken word recognition in second language acquisition. In C. A. Chapelle (Ed.), The encyclopedia of applied linguistics. Bognor Regis: Wiley-Blackwell. doi:10.1002/9781405198431.wbeal1104.

    Abstract

    In order to decode the message of a speaker, listeners have to recognize individual words in the speaker's utterance.
  • Weber, A. (2009). The role of linguistic experience in lexical recognition [Abstract]. Journal of the Acoustical Society of America, 125, 2759.

    Abstract

    Lexical recognition is typically slower in L2 than in L1. Part of the difficulty comes from a not precise enough processing of L2 phonemes. Consequently, L2 listeners fail to eliminate candidate words that L1 listeners can exclude from competing for recognition. For instance, the inability to distinguish /r/ from /l/ in rocket and locker makes for Japanese listeners both words possible candidates when hearing their onset (e.g., Cutler, Weber, and Otake, 2006). The L2 disadvantage can, however, be dispelled: For L2 listeners, but not L1 listeners, L2 speech from a non-native talker with the same language background is known to be as intelligible as L2 speech from a native talker (e.g., Bent and Bradlow, 2003). A reason for this may be that L2 listeners have ample experience with segmental deviations that are characteristic for their own accent. On this account, only phonemic deviations that are typical for the listeners’ own accent will cause spurious lexical activation in L2 listening (e.g., English magic pronounced as megic for Dutch listeners). In this talk, I will present evidence from cross-modal priming studies with a variety of L2 listener groups, showing how the processing of phonemic deviations is accent-specific but withstands fine phonetic differences.
  • Weissenborn, J., & Stralka, R. (1984). Das Verstehen von Mißverständnissen. Eine ontogenetische Studie. In Zeitschrift für Literaturwissenschaft und Linguistik (pp. 113-134). Stuttgart: Metzler.
  • Weissenborn, J. (1984). La genèse de la référence spatiale en langue maternelle et en langue seconde: similarités et différences. In G. Extra, & M. Mittner (Eds.), Studies in second language acquisition by adult immigrants (pp. 262-286). Tilburg: Tilburg University.
  • Weissenborn, J. (1986). Learning how to become an interlocutor. The verbal negotiation of common frames of reference and actions in dyads of 7–14 year old children. In J. Cook-Gumperz, W. A. Corsaro, & J. Streeck (Eds.), Children's worlds and children's language (pp. 377-404). Berlin: Mouton de Gruyter.
  • Windhouwer, M., Broeder, D., & Van Uytvanck, D. (2012). A CMD core model for CLARIN web services. In Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 41-48).

    Abstract

    In the CLARIN infrastructure various national projects have started initiatives to allow users of the infrastructure to create chains or workflows of web services. The Component Metadata (CMD) core model for web services described in this paper tries to align the metadata descriptions of these various initiatives. This should allow chaining/workflow engines to find matching and invoke services. The paper describes the landscape of web services architectures and the state of the national initiatives. Based on this a CMD core model for CLARIN is proposed, which, within some limits, can be adapted to the specific needs of an initiative by the standard facilities of CMD. The paper closes with the current state and usage of the model and a look into the future.
  • Windhouwer, M., & Wright, S. E. (2012). Linking to linguistic data categories in ISOcat. In C. Chiarcos, S. Nordhoff, & S. Hellmann (Eds.), Linked data in linguistics: Representing and connecting language data and language metadata (pp. 99-107). Berlin: Springer.

    Abstract

    ISO Technical Committee 37, Terminology and other language and content resources, established an ISO 12620:2009 based Data Category Registry (DCR), called ISOcat (see http://www.isocat.org), to foster semantic interoperability of linguistic resources. However, this goal can only be met if the data categories are reused by a wide variety of linguistic resource types. A resource indicates its usage of data categories by linking to them. The small DC Reference XML vocabulary is used to embed links to data categories in XML documents. The link is established by an URI, which servers as the Persistent IDentifier (PID) of a data category. This paper discusses the efforts to mimic the same approach for RDF-based resources. It also introduces the RDF quad store based Relation Registry RELcat, which enables ontological relationships between data categories not supported by ISOcat and thus adds an extra level of linguistic knowledge.
  • Windhouwer, M. (2012). RELcat: a Relation Registry for ISOcat data categories. In N. Calzolari (Ed.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 3661-3664). European Language Resources Association (ELRA).

    Abstract

    The ISOcat Data Category Registry contains basically a flat and easily extensible list of data category specifications. To foster reuse and standardization only very shallow relationships among data categories are stored in the registry. However, to assist crosswalks, possibly based on personal views, between various (application) domains and to overcome possible proliferation of data categories more types of ontological relationships need to be specified. RELcat is a first prototype of a Relation Registry, which allows storing arbitrary relationships. These relationships can reflect the personal view of one linguist or a larger community. The basis of the registry is a relation type taxonomy that can easily be extended. This allows on one hand to load existing sets of relations specified in, for example, an OWL (2) ontology or SKOS taxonomy. And on the other hand allows algorithms that query the registry to traverse the stored semantic network to remain ignorant of the original source vocabulary. This paper describes first experiences with RELcat and explains some initial design decisions.
  • Windhouwer, M. (2012). Towards standardized descriptions of linguistic features: ISOcat and procedures for using common data categories. In J. Jancsary (Ed.), Proceedings of the Conference on Natural Language Processing 2012, (SFLR 2012 workshop), September 19-21, 2012, Vienna (pp. 494). Vienna: Österreichischen Gesellschaft für Artificial Intelligende (ÖGAI).

    Abstract

    Automatic Language Identification of written texts is a well-established area of research in Computational Linguistics. State-of-the-art algorithms often rely on n-gram character models to identify the correct language of texts, with good results seen for European languages. In this paper we propose the use of a character n-gram model and a word n-gram language model for the automatic classification of two written varieties of Portuguese: European and Brazilian. Results reached 0.998 for accuracy using character 4-grams.
  • Withers, P. (2012). Metadata management with Arbil. In V. Arranz, D. Broeder, B. Gaiffe, M. Gavrilidou, & M. Monachini (Eds.), Proceedings of LREC 2012: 8th International Conference on Language Resources and Evaluation (pp. 72-75). European Language Resources Association (ELRA).

    Abstract

    Arbil is an application designed to create and manage metadata for research data and to arrange this data into a structure appropriate for archiving. The metadata is displayed in tables, which allows an overview of the metadata and the ability to populate and update many metadata sections in bulk. Both IMDI and Clarin metadata formats are supported and Arbil has been designed as a local application so that it can also be used offline, for instance in remote field sites. The metadata can be entered in any order or at any stage that the user is able; once the metadata and its data are ready for archiving and an Internet connection is available it can be exported from Arbil and in the case of IMDI it can then be transferred to the main archive via LAMUS (archive management and upload system).
  • Wittek, A. (1998). Learning verb meaning via adverbial modification: Change-of-state verbs in German and the adverb "wieder" again. In A. Greenhill, M. Hughes, H. Littlefield, & H. Walsh (Eds.), Proceedings of the 22nd Annual Boston University Conference on Language Development (pp. 779-790). Somerville, MA: Cascadilla Press.
  • Wittenburg, P., Lenkiewicz, P., Auer, E., Gebre, B. G., Lenkiewicz, A., & Drude, S. (2012). AV Processing in eHumanities - a paradigm shift. In J. C. Meister (Ed.), Digital Humanities 2012 Conference Abstracts. University of Hamburg, Germany; July 16–22, 2012 (pp. 538-541).

    Abstract

    Introduction Speech research saw a dramatic change in paradigm in the 90-ies. While earlier the discussion was dominated by a phoneticians’ approach who knew about phenomena in the speech signal, the situation completely changed after stochastic machinery such as Hidden Markov Models [1] and Artificial Neural Networks [2] had been introduced. Speech processing was now dominated by a purely mathematic approach that basically ignored all existing knowledge about the speech production process and the perception mechanisms. The key was now to construct a large enough training set that would allow identifying the many free parameters of such stochastic engines. In case that the training set is representative and the annotations of the training sets are widely ‘correct’ we could assume to get a satisfyingly functioning recognizer. While the success of knowledge-based systems such as Hearsay II [3] was limited, the statistically based approach led to great improvements in recognition rates and to industrial applications.
  • Wittenburg, P., Drude, S., & Broeder, D. (2012). Psycholinguistik. In H. Neuroth, S. Strathmann, A. Oßwald, R. Scheffel, J. Klump, & J. Ludwig (Eds.), Langzeitarchivierung von Forschungsdaten. Eine Bestandsaufnahme (pp. 83-108). Boizenburg: Verlag Werner Hülsbusch.

    Abstract

    5.1 Einführung in den Forschungsbereich Die Psycholinguistik ist der Bereich der Linguistik, der sich mit dem Zusammenhang zwischen menschlicher Sprache und dem Denken und anderen mentalen Prozessen beschäftigt, d.h. sie stellt sich einer Reihe von essentiellen Fragen wie etwa (1) Wie schafft es unser Gehirn, im Wesentlichen akustische und visuelle kommunikative Informationen zu verstehen und in mentale Repräsentationen umzusetzen? (2) Wie kann unser Gehirn einen komplexen Sachverhalt, den wir anderen übermitteln wollen, in eine von anderen verarbeitbare Sequenz von verbalen und nonverbalen Aktionen umsetzen? (3) Wie gelingt es uns, in den verschiedenen Phasen des Lebens Sprachen zu erlernen? (4) Sind die kognitiven Prozesse der Sprachverarbeitung universell, obwohl die Sprachsysteme derart unterschiedlich sind, dass sich in den Strukturen kaum Universalien finden lassen?
  • Wnuk, E., & Majid, A. (2012). Olfaction in a hunter-gatherer society: Insights from language and culture. In N. Miyake, D. Peebles, & R. P. Cooper (Eds.), Proceedings of the 34th Annual Meeting of the Cognitive Science Society (CogSci 2012) (pp. 1155-1160). Austin, TX: Cognitive Science Society.

    Abstract

    According to a widely-held view among various scholars, olfaction is inferior to other human senses. It is also believed by many that languages do not have words for describing smells. Data collected among the Maniq, a small population of nomadic foragers in southern Thailand, challenge the above claims and point to a great linguistic and cultural elaboration of odor. This article presents evidence of the importance of olfaction in indigenous rituals and beliefs, as well as in the lexicon. The results demonstrate the richness and complexity of the domain of smell in Maniq society and thereby challenge the universal paucity of olfactory terms and insignificance of olfaction for humans.
  • Wood, N. (2009). Field recording for dummies. In A. Majid (Ed.), Field manual volume 12 (pp. V). Nijmegen: Max Planck Institute for Psycholinguistics.
  • Xiao, M., Kong, X., Liu, J., & Ning, J. (2009). TMBF: Bloom filter algorithms of time-dependent multi bit-strings for incremental set. In Proceedings of the 2009 International Conference on Ultra Modern Telecommunications & Workshops.

    Abstract

    Set is widely used as a kind of basic data structure. However, when it is used for large scale data set the cost of storage, search and transport is overhead. The bloom filter uses a fixed size bit string to represent elements in a static set, which can reduce storage space and search cost that is a fixed constant. The time-space efficiency is achieved at the cost of a small probability of false positive in membership query. However, for many applications the space savings and locating time constantly outweigh this drawback. Dynamic bloom filter (DBF) can support concisely representation and approximate membership queries of dynamic set instead of static set. It has been proved that DBF not only possess the advantage of standard bloom filter, but also has better features when dealing with dynamic set. This paper proposes a time-dependent multiple bit-strings bloom filter (TMBF) which roots in the DBF and targets on dynamic incremental set. TMBF uses multiple bit-strings in time order to present a dynamic increasing set and uses backward searching to test whether an element is in a set. Based on the system logs from a real P2P file sharing system, the evaluation shows a 20% reduction in searching cost compared to DBF.
  • Zampieri, M., & Gebre, B. G. (2012). Automatic identification of language varieties: The case of Portuguese. In J. Jancsary (Ed.), Proceedings of the Conference on Natural Language Processing 2012, September 19-21, 2012, Vienna (pp. 233-237). Vienna: Österreichischen Gesellschaft für Artificial Intelligende (ÖGAI).

    Abstract

    Automatic Language Identification of written texts is a well-established area of research in Computational Linguistics. State-of-the-art algorithms often rely on n-gram character models to identify the correct language of texts, with good results seen for European languages. In this paper we propose the use of a character n-gram model and a word n-gram language model for the automatic classification of two written varieties of Portuguese: European and Brazilian. Results reached 0.998 for accuracy using character 4-grams.
  • Zampieri, M., Gebre, B. G., & Diwersy, S. (2012). Classifying pluricentric languages: Extending the monolingual model. In Proceedings of SLTC 2012. The Fourth Swedish Language Technology Conference. Lund, October 24-26, 2012 (pp. 79-80). Lund University.

    Abstract

    This study presents a new language identification model for pluricentric languages that uses n-gram language models at the character and word level. The model is evaluated in two steps. The first step consists of the identification of two varieties of Spanish (Argentina and Spain) and two varieties of French (Quebec and France) evaluated independently in binary classification schemes. The second step integrates these language models in a six-class classification with two Portuguese varieties.
  • Zwitserlood, I. (2012). Classifiers. In R. Pfau, M. Steinbach, & B. Woll (Eds.), Sign Language: an International Handbook (pp. 158-186). Berlin: Mouton de Gruyter.

    Abstract

    Classifiers (currently also called 'depicting handshapes'), are observed in almost all signed languages studied to date and form a well-researched topic in sign language linguistics. Yet, these elements are still subject to much debate with respect to a variety of matters. Several different categories of classifiers have been posited on the basis of their semantics and the linguistic context in which they occur. The function(s) of classifiers are not fully clear yet. Similarly, there are differing opinions regarding their structure and the structure of the signs in which they appear. Partly as a result of comparison to classifiers in spoken languages, the term 'classifier' itself is under debate. In contrast to these disagreements, most studies on the acquisition of classifier constructions seem to consent that these are difficult to master for Deaf children. This article presents and discusses all these issues from the viewpoint that classifiers are linguistic elements.

Share this page