Publications

  • Adank, P., Smits, R., & Van Hout, R. (2003). Modeling perceived vowel height, advancement, and rounding. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 647-650). Adelaide: Causal Productions.
  • Agirrezabal, M., Paggio, P., Navarretta, C., & Jongejan, B. (2023). Multimodal detection and classification of head movements in face-to-face conversations: Exploring models, features and their interaction. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527200.

    Abstract

    In this work we perform multimodal detection and classification of head movements from face to face video conversation data. We have experimented with different models and feature sets and provided some insight on the effect of independent features, but also how their interaction can enhance a head movement classifier. Used features include nose, neck and mid hip position coordinates and their derivatives together with acoustic features, namely, intensity and pitch of the speaker on focus. Results show that when input features are sufficiently processed by interacting with each other, a linear classifier can reach a similar performance to a more complex non-linear neural model with several hidden layers. Our best models achieve state-of-the-art performance in the detection task, measured by macro-averaged F1 score.
  • Allen, S. E. M. (1997). Towards a discourse-pragmatic explanation for the subject-object asymmetry in early null arguments. In NET-Bulletin 1997 (pp. 1-16). Amsterdam, The Netherlands: Instituut voor Functioneel Onderzoek van Taal en Taalgebruik (IFOTT).
  • Almeida, L., Amdal, I., Beires, N., Boualem, M., Boves, L., Den Os, E., Filoche, P., Gomes, R., Knudsen, J. E., Kvale, K., Rugelbak, J., Tallec, C., & Warakagoda, N. (2002). Implementing and evaluating a multimodal tourist guide. In J. v. Kuppevelt, L. Dybkjær, & N. Bernsen (Eds.), Proceedings of the International CLASS Workshop on Natural, Intelligent and Effective Interaction in Multimodal Dialogue System (pp. 1-7). Copenhagen: Kluwer.
  • Bauer, B. L. M. (1997). The adjective in Italic and Romance: Genetic or areal factors affecting word order patterns? In B. Palek (Ed.), Proceedings of LP'96: Typology: Prototypes, item orderings and universals (pp. 295-306). Prague: Charles University Press.
  • Bauer, B. L. M. (2003). The adverbial formation in mente in Vulgar and Late Latin: A problem in grammaticalization. In H. Solin, M. Leiwo, & H. Halla-aho (Eds.), Latin vulgaire, latin tardif VI (pp. 439-457). Hildesheim: Olms.
  • Bavin, E. L., & Kidd, E. (2000). Learning new verbs: Beyond the input. In C. Davis, T. J. Van Gelder, & R. Wales (Eds.), Cognitive Science in Australia, 2000: Proceedings of the Fifth Biennial Conference of the Australasian Cognitive Science Society.
  • Bohnemeyer, J. (1997). Yucatec Mayan Lexicalization Patterns in Time and Space. In M. Biemans, & J. van de Weijer (Eds.), Proceedings of the CLS opening of the academic year '97-'98. Tilburg, The Netherlands: University Center for Language Studies.
  • Böttner, M. (1997). Visiting some relatives of Peirce's. In 3rd International Seminar on The use of Relational Methods in Computer Science.

    Abstract

    The notion of relational grammar is extended to ternary relations and illustrated by a fragment of English. Some of Peirce's terms for ternary relations are shown to be incorrect and corrected.
  • Bowerman, M., Brown, P., Eisenbeiss, S., Narasimhan, B., & Slobin, D. I. (2002). Putting things in places: Developmental consequences of linguistic typology. In E. V. Clark (Ed.), Proceedings of the 31st Stanford Child Language Research Forum. Space in language: Location, motion, path, and manner (pp. 1-29). Stanford: Center for the Study of Language & Information.

    Abstract

    This study explores how adults and children describe placement events (e.g., putting a book on a table) in a range of different languages (Finnish, English, German, Russian, Hindi, Tzeltal Maya, Spanish, and Turkish). Results show that the eight languages grammatically encode placement events in two main ways (Talmy, 1985, 1991), but further investigation reveals fine-grained crosslinguistic variation within each of the two groups. Children are sensitive to these finer-grained characteristics of the input language at an early age, but only when such features are perceptually salient. Our study demonstrates that a unitary notion of 'event' does not suffice to characterize complex but systematic patterns of event encoding crosslinguistically, and that children are sensitive to multiple influences, including the distributional properties of the target language, in constructing these patterns in their own speech.
  • Broeder, D., Offenga, F., & Willems, D. (2002). Metadata tools supporting controlled vocabulary services. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 1055-1059). Paris: European Language Resources Association.

    Abstract

    Within the ISLE Metadata Initiative (IMDI) project a user-friendly editor to enter metadata descriptions and a browser operating on the linked metadata descriptions were developed. Both tools support the usage of Controlled Vocabulary (CV) repositories by means of the specification of a URL where the formal CV definition data is available.
  • Broeder, D., Wittenburg, P., Declerck, T., & Romary, L. (2002). LREP: A language repository exchange protocol. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 1302-1305). Paris: European Language Resources Association.

    Abstract

    The recent increase in the number and complexity of the language resources available on the Internet is followed by a similar increase of available tools for linguistic analysis. Ideally the user does not need to be confronted with the question of how to match tools with resources. If resource repositories and tool repositories offer adequate metadata information and a suitable exchange protocol is developed this matching process could be performed (semi-)automatically.
  • Broersma, M. (2002). Comprehension of non-native speech: Inaccurate phoneme processing and activation of lexical competitors. In ICSLP-2002 (pp. 261-264). Denver: Center for Spoken Language Research, U. of Colorado Boulder.

    Abstract

    Native speakers of Dutch with English as a second language and native speakers of English participated in an English lexical decision experiment. Phonemes in real words were replaced by others from which they are hard to distinguish for Dutch listeners. Non-native listeners judged the resulting near-words more often as a word than native listeners. This not only happened when the phonemes that were exchanged did not exist as separate phonemes in the native language Dutch, but also when phoneme pairs that do exist in Dutch were used in word-final position, where they are not distinctive in Dutch. In an English bimodal priming experiment with similar groups of participants, word pairs were used which differed in one phoneme. These phonemes were hard to distinguish for the non-native listeners. Whereas in native listening both words inhibited each other, in non-native listening presentation of one word led to unresolved competition between both words. The results suggest that inaccurate phoneme processing by non-native listeners leads to the activation of spurious lexical competitors.
  • Brugman, H., Levinson, S. C., Skiba, R., & Wittenburg, P. (2002). The DOBES archive: Its purpose and implementation. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 11-11). Paris: European Language Resources Association.
  • Brugman, H., Spenke, H., Kramer, M., & Klassmann, A. (2002). Multimedia annotation with multilingual input methods and search support.
  • Brugman, H., Wittenburg, P., Levinson, S. C., & Kita, S. (2002). Multimodal annotations in gesture and sign language studies. In M. Rodriguez González, & C. Paz Suárez Araujo (Eds.), Third international conference on language resources and evaluation (pp. 176-182). Paris: European Language Resources Association.

    Abstract

    For multimodal annotations an exhaustive encoding system for gestures was developed to facilitate research. The structural requirements of multimodal annotations were analyzed to develop an Abstract Corpus Model which is the basis for a powerful annotation and exploitation tool for multimedia recordings and the definition of the XML-based EUDICO Annotation Format. Finally, a metadata-based data management environment has been set up to facilitate resource discovery and especially corpus management. By means of an appropriate digitization policy and their online availability researchers have been able to build up a large corpus covering gesture and sign language data.
  • Butterfield, S., & Cutler, A. (1988). Segmentation errors by human listeners: Evidence for a prosodic segmentation strategy. In W. Ainsworth, & J. Holmes (Eds.), Proceedings of SPEECH ’88: Seventh Symposium of the Federation of Acoustic Societies of Europe: Vol. 3 (pp. 827-833). Edinburgh: Institute of Acoustics.
  • Cablitz, G. (2002). The acquisition of an absolute system: learning to talk about space in Marquesan (Oceanic, French Polynesia). In E. V. Clark (Ed.), Space in language: Location, motion, path, and manner (pp. 40-49). Stanford: Center for the Study of Language & Information (Electronic proceedings).
  • Caplan, S., Peng, M. Z., Zhang, Y., & Yu, C. (2023). Using an Egocentric Human Simulation Paradigm to quantify referential and semantic ambiguity in early word learning. In M. Goldwater, F. K. Anggoro, B. K. Hayes, & D. C. Ong (Eds.), Proceedings of the 45th Annual Meeting of the Cognitive Science Society (CogSci 2023) (pp. 1043-1049).

    Abstract

    In order to understand early word learning we need to better understand and quantify properties of the input that young children receive. We extended the human simulation paradigm (HSP) using egocentric videos taken from infant head-mounted cameras. The videos were further annotated with gaze information indicating in-the-moment visual attention from the infant. Our new HSP prompted participants for two types of responses, thus differentiating referential from semantic ambiguity in the learning input. Consistent with findings on visual attention in word learning, we find a strongly bimodal distribution over HSP accuracy. Even in this open-ended task, most videos only lead to a small handful of common responses. What's more, referential ambiguity was the key bottleneck to performance: participants can nearly always recover the exact word that was said if they identify the correct referent. Finally, analysis shows that adult learners relied on particular, multimodal behavioral cues to infer those target referents.
  • Chen, A. (2003). Language dependence in continuation intonation. In M. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS) (pp. 1069-1072). Rundle Mall, SA, Austr.: Causal Productions Pty.
  • Chen, A., Gussenhoven, C., & Rietveld, T. (2002). Language-specific uses of the effort code. In B. Bel, & I. Marlien (Eds.), Proceedings of the 1st Conference on Speech Prosody (pp. 215-218). Aix-en-Provence: Université de Provence.

    Abstract

    Two groups of listeners with Dutch and British English language backgrounds judged Dutch and British English utterances, respectively, which varied in the intonation contour on the scales EMPHATIC vs. NOT EMPHATIC and SURPRISED vs. NOT SURPRISED, two meanings derived from the Effort Code. The stimuli, which differed in sentence mode but were otherwise lexically equivalent, were varied in peak height, peak alignment, end pitch, and overall register. In both languages, there are positive correlations between peak height and degree of emphasis, between peak height and degree of surprise, between peak alignment and degree of surprise, and between pitch register and degree of surprise. However, in all these cases, Dutch stimuli lead to larger perceived meaning differences than the British English stimuli. This difference in the extent to which increased pitch height triggers increases in perceived emphasis and surprise is argued to be due to the difference in the standard pitch ranges between Dutch and British English. In addition, we found a positive correlation between pitch register and the degree of emphasis in Dutch, but a negative correlation in British English. This is an unexpected difference, which illustrates a case of ambiguity in the meaning of pitch.
  • Chen, A. (2003). Reaction time as an indicator to discrete intonational contrasts in English. In Proceedings of Eurospeech 2003 (pp. 97-100).

    Abstract

    This paper reports a perceptual study using a semantically motivated identification task in which we investigated the nature of two pairs of intonational contrasts in English: (1) normal High accent vs. emphatic High accent; (2) early peak alignment vs. late peak alignment. Unlike previous inquiries, the present study employs an on-line method using the Reaction Time measurement, in addition to the measurement of response frequencies. Regarding the peak height continuum, the mean RTs are shortest for within-category identification but longest for across-category identification. As for the peak alignment contrast, no identification boundary emerges and the mean RTs only reflect a difference between peaks aligned with the vowel onset and peaks aligned elsewhere. We conclude that the peak height contrast is discrete but the previously claimed discreteness of the peak alignment contrast is not borne out.
  • Chevrefils, L., Morgenstern, A., Beaupoil-Hourdel, P., Bedoin, D., Caët, S., Danet, C., Danino, C., De Pontonx, S., & Parisse, C. (2023). Coordinating eating and languaging: The choreography of speech, sign, gesture and action in family dinners. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527183.

    Abstract

    In this study, we analyze one French signing and one French speaking family’s interaction during dinner. The families, composed of two parents and two children aged 3 to 11, were filmed with three cameras to capture all family members’ behaviors. The three videos per dinner were synchronized and coded on ELAN. We annotated all participants’ acting and languaging.
    Our quantitative analyses show how family members collaboratively manage multiple streams of activity through the embodied performances of dining and interacting. We uncover different profiles according to participants’ modality of expression and status (focusing on the mother and the younger child). The hearing participants’ co-activity management illustrates their monitoring of dining and conversing and how they progressively master the affordances of the visual and vocal channels to maintain the simultaneity of the two activities. The deaf mother skillfully manages to alternate smoothly between dining and interacting. The deaf younger child manifests how she is in the process of developing her skills to manage multi-activity. Our qualitative analyses focus on the ecology of visual-gestural and audio-vocal languaging in the context of co-activity according to language and participant. We open new perspectives on the management of gaze and body parts in multimodal languaging.
  • Cho, T. (2003). Lexical stress, phrasal accent and prosodic boundaries in the realization of domain-initial stops in Dutch. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2657-2660). Adelaide: Causal Productions.

    Abstract

    This study examines the effects of prosodic boundaries, lexical stress, and phrasal accent on the acoustic realization of stops (/t, d/) in Dutch, with special attention paid to language-specificity in the phonetics-prosody interface. The results obtained from various acoustic measures show systematic phonetic variations in the production of /t, d/ as a function of prosodic position, which may be interpreted as being due to prosodically-conditioned articulatory strengthening. Shorter VOTs were found for the voiceless stop /t/ in prosodically stronger locations (as opposed to longer VOTs in this position in English). The results suggest that prosodically-driven phonetic realization is bounded by a language-specific phonological feature system.
  • Crago, M. B., & Allen, S. E. M. (1997). Linguistic and cultural aspects of simplicity and complexity in Inuktitut child directed speech. In E. Hughes, M. Hughes, & A. Greenhill (Eds.), Proceedings of the 21st annual Boston University Conference on Language Development (pp. 91-102).
  • Cutler, A., Murty, L., & Otake, T. (2003). Rhythmic similarity effects in non-native listening? In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 329-332). Adelaide: Causal Productions.

    Abstract

    Listeners rely on native-language rhythm in segmenting speech; in different languages, stress-, syllable- or mora-based rhythm is exploited. This language-specificity affects listening to non-native speech, if native procedures are applied even though inefficient for the non-native language. However, speakers of two languages with similar rhythmic interpretation should segment their own and the other language similarly. This was observed to date only for related languages (English-Dutch; French-Spanish). We now report experiments in which Japanese listeners heard Telugu, a Dravidian language unrelated to Japanese, and Telugu listeners heard Japanese. In both cases detection of target sequences in speech was harder when target boundaries mismatched mora boundaries, exactly the pattern that Japanese listeners earlier exhibited with Japanese and other languages. These results suggest that Telugu and Japanese listeners use similar procedures in segmenting speech, and support the idea that languages fall into rhythmic classes, with aspects of phonological structure affecting listeners' speech segmentation.
  • Cutler, A., McQueen, J. M., Jansonius, M., & Bayerl, S. (2002). The lexical statistics of competitor activation in spoken-word recognition. In C. Bow (Ed.), Proceedings of the 9th Australian International Conference on Speech Science and Technology (pp. 40-45). Canberra: Australian Speech Science and Technology Association (ASSTA).

    Abstract

    The Possible Word Constraint is a proposed mechanism whereby listeners avoid recognising words spuriously embedded in other words. It applies to words leaving a vowelless residue between their edge and the nearest known word or syllable boundary. The present study tests the usefulness of this constraint via lexical statistics of both English and Dutch. The analyses demonstrate that the constraint removes a clear majority of embedded words in speech, and thus can contribute significantly to the efficiency of human speech recognition.
  • Cutler, A. (1994). How human speech recognition is affected by phonological diversity among languages. In R. Togneri (Ed.), Proceedings of the fifth Australian International Conference on Speech Science and Technology: Vol. 1 (pp. 285-288). Canberra: Australian Speech Science and Technology Association.

    Abstract

    Listeners process spoken language in ways which are adapted to the phonological structure of their native language. As a consequence, non-native speakers do not listen to a language in the same way as native speakers; moreover, listeners may use their native language listening procedures inappropriately with foreign input. With sufficient experience, however, it may be possible to inhibit this latter (counter-productive) behavior.
  • Cutler, A., & Butterfield, S. (1989). Natural speech cues to word segmentation under difficult listening conditions. In J. Tubach, & J. Mariani (Eds.), Proceedings of Eurospeech 89: European Conference on Speech Communication and Technology: Vol. 2 (pp. 372-375). Edinburgh: CEP Consultants.

    Abstract

    One of a listener's major tasks in understanding continuous speech is segmenting the speech signal into separate words. When listening conditions are difficult, speakers can help listeners by deliberately speaking more clearly. In three experiments, we examined how word boundaries are produced in deliberately clear speech. We found that speakers do indeed attempt to mark word boundaries; moreover, they differentiate between word boundaries in a way which suggests they are sensitive to listener needs. Application of heuristic segmentation strategies makes word boundaries before strong syllables easiest for listeners to perceive; but under difficult listening conditions speakers pay more attention to marking word boundaries before weak syllables, i.e. they mark those boundaries which are otherwise particularly hard to perceive.
  • Cutler, A., & Koster, M. (2000). Stress and lexical activation in Dutch. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the Sixth International Conference on Spoken Language Processing: Vol. 1 (pp. 593-596). Beijing: China Military Friendship Publish.

    Abstract

    Dutch listeners were slower to make judgements about the semantic relatedness between a spoken target word (e.g. atLEET, 'athlete') and a previously presented visual prime word (e.g. SPORT 'sport') when the spoken word was mis-stressed. The adverse effect of mis-stressing confirms the role of stress information in lexical recognition in Dutch. However, although the erroneous stress pattern was always initially compatible with a competing word (e.g. ATlas, 'atlas'), mis-stressed words did not produce high false alarm rates in unrelated pairs (e.g. SPORT - atLAS). This suggests that stress information did not completely rule out segmentally matching but suprasegmentally mismatching words, a finding consistent with spoken-word recognition models involving multiple activation and inter-word competition.
  • Cutler, A., & Young, D. (1994). Rhythmic structure of word blends in English. In Proceedings of the Third International Conference on Spoken Language Processing (pp. 1407-1410). Kobe: Acoustical Society of Japan.

    Abstract

    Word blends combine fragments from two words, either in speech errors or when a new word is created. Previous work has demonstrated that in Japanese, such blends preserve moraic structure; in English they do not. A similar effect of moraic structure is observed in perceptual research on segmentation of continuous speech in Japanese; English listeners, by contrast, exploit stress units in segmentation, suggesting that a general rhythmic constraint may underlie both findings. The present study examined whether this parallel would also hold for word blends. In spontaneous English polysyllabic blends, the source words were significantly more likely to be split before a strong than before a weak (unstressed) syllable, i.e. to be split at a stress unit boundary. In an experiment in which listeners were asked to identify the source words of blends, significantly more correct detections resulted when splits had been made before strong syllables. Word blending, like speech segmentation, appears to be constrained by language rhythm.
  • Cutler, A., McQueen, J. M., Baayen, R. H., & Drexler, H. (1994). Words within words in a real-speech corpus. In R. Togneri (Ed.), Proceedings of the 5th Australian International Conference on Speech Science and Technology: Vol. 1 (pp. 362-367). Canberra: Australian Speech Science and Technology Association.

    Abstract

    In a 50,000-word corpus of spoken British English the occurrence of words embedded within other words is reported. Within-word embedding in this real speech sample is common, and analogous to the extent of embedding observed in the vocabulary. Imposition of a syllable boundary matching constraint reduces but by no means eliminates spurious embedding. Embedded words are most likely to overlap with the beginning of matrix words, and thus may pose serious problems for speech recognisers.
  • Cutler, A., Norris, D., & McQueen, J. M. (2000). Tracking TRACE’s troubles. In A. Cutler, J. M. McQueen, & R. Zondervan (Eds.), Proceedings of SWAP (Workshop on Spoken Word Access Processes) (pp. 63-66). Nijmegen: Max-Planck-Institute for Psycholinguistics.

    Abstract

    Simulations explored the inability of the TRACE model of spoken-word recognition to model the effects on human listening of acoustic-phonetic mismatches in word forms. The source of TRACE's failure lay not in its interactive connectivity, not in the presence of interword competition, and not in the use of phonemic representations, but in the need for continuously optimised interpretation of the input. When an analogue of TRACE was allowed to cycle to asymptote on every slice of input, an acceptable simulation of the subcategorical mismatch data was achieved. Even then, however, the simulation was not as close as that produced by the Merge model.
  • Declerck, T., Cunningham, H., Saggion, H., Kuper, J., Reidsma, D., & Wittenburg, P. (2003). MUMIS - Advanced information extraction for multimedia indexing and searching digital media - Processing for multimedia interactive services. 4th European Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), 553-556.
  • Dimroth, C., & Lasser, I. (Eds.). (2002). Finite options: How L1 and L2 learners cope with the acquisition of finiteness [Special Issue]. Linguistics, 40(4).
  • Drude, S. (2003). Advanced glossing: A language documentation format and its implementation with Shoebox. In Proceedings of the 2002 International Conference on Language Resources and Evaluation (LREC 2002). Paris: ELRA.

    Abstract

    This paper presents Advanced Glossing, a proposal for a general glossing format designed for language documentation, and a specific setup for the Shoebox-program that implements Advanced Glossing to a large extent. Advanced Glossing (AG) goes beyond the traditional Interlinear Morphemic Translation, keeping syntactic and morphological information apart from each other in separate glossing tables. AG provides specific lines for different kinds of annotation – phonetic, phonological, orthographical, prosodic, categorial, structural, relational, and semantic, and it allows for gradual and successive, incomplete, and partial filling in case that some information may be irrelevant, unknown or uncertain. The implementation of AG in Shoebox sets up several databases. Each documented text is represented as a file of syntactic glossings. The morphological glossings are kept in a separate database. As an additional feature interaction with lexical databases is possible. The implementation makes use of the interlinearizing automatism provided by Shoebox, thus obtaining the table format for the alignment of lines in cells, and for semi-automatic filling-in of information in glossing tables which has been extracted from databases.
  • Drude, S. (2003). Digitizing and annotating texts and field recordings in the Awetí project. In Proceedings of the EMELD Language Digitization Project Conference 2003. Workshop on Digitizing and Annotating Text and Field Recordings, LSA Institute, Michigan State University, July 11th-13th.

    Abstract

    Given that several initiatives worldwide currently explore the new field of documentation of endangered languages, the E-MELD project proposes to survey and unite procedures, techniques and results in order to achieve its main goal, "the formulation and promulgation of best practice in linguistic markup of texts and lexicons". In this context, this year's workshop deals with the processing of recorded texts. I assume the most valuable contribution I could make to the workshop is to show the procedures and methods used in the Awetí Language Documentation Project. The procedures applied in the Awetí Project are not necessarily representative of all the projects in the DOBES program, and they may very well fall short in several respects of being best practice, but I hope they might provide a good and concrete starting point for comparison, criticism and further discussion. The procedures to be presented include: taping with digital devices; digitizing (preliminarily in the field, later definitively by the TIDEL-team at the Max Planck Institute in Nijmegen); segmenting and transcribing, using the transcriber computer program; translating (on paper, or while transcribing); adding more specific annotation, using the Shoebox program; and converting the annotation to the ELAN-format developed by the TIDEL-team, and doing annotation with ELAN. Focus will be on the different types of annotation. Especially, I will present, justify and discuss Advanced Glossing, a text annotation format developed by H.-H. Lieb and myself designed for language documentation. It will be shown how Advanced Glossing can be applied using the Shoebox program. The Shoebox setup used in the Awetí Project will be shown in greater detail, including lexical databases and semi-automatic interaction between different database types (jumping, interlinearization).
    (Freie Universität Berlin and Museu Paraense Emílio Goeldi, with funding from the Volkswagen Foundation.)
  • Duffield, N., & Matsuo, A. (2003). Factoring out the parallelism effect in ellipsis: An interactional approach? In J. Chilar, A. Franklin, D. Keizer, & I. Kimbara (Eds.), Proceedings of the 39th Annual Meeting of the Chicago Linguistic Society (CLS) (pp. 591-603). Chicago: Chicago Linguistic Society.

    Abstract

    Traditionally, there have been three standard assumptions made about the Parallelism Effect on VP-ellipsis, namely that the effect is categorical, that it applies asymmetrically and that it is uniquely due to syntactic factors. Based on the results of a series of experiments involving online and offline tasks, it will be argued that the Parallelism Effect is instead noncategorical and interactional. The factors investigated include construction type, conceptual and morpho-syntactic recoverability, finiteness and anaphor type (to test VP-anaphora). The results show that parallelism is gradient rather than categorical, affects both VP-ellipsis and anaphora, and is influenced by both structural and non-structural factors.
  • Enfield, N. J. (2002). Parallel innovation and 'coincidence' in linguistic areas: On a bi-clausal extent/result construction of mainland Southeast Asia. In P. Chew (Ed.), Proceedings of the 28th meeting of the Berkeley Linguistics Society. Special session on Tibeto-Burman and Southeast Asian linguistics (pp. 121-128). Berkeley: Berkeley Linguistics Society.
  • Enfield, N. J., & Evans, G. (2000). Transcription as standardisation: The problem of Tai languages. In S. Burusphat (Ed.), Proceedings: the International Conference on Tai Studies, July 29-31, 1998 (pp. 201-212). Bangkok, Thailand: Institute of Language and Culture for Rural Development, Mahidol University.
  • Ferré, G. (2023). Pragmatic gestures and prosody. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527215.

    Abstract

    The study presented here focuses on two pragmatic gestures: the hand flip (Ferré, 2011), a gesture of the Palm Up Open Hand/PUOH family (Müller, 2004) and the closed hand which can be considered as the opposite kind of movement to the opening of the hands present in the PUOH gesture. Whereas one of the functions of the hand flip has been described as presenting a new point in speech (Cienki, 2021), the closed hand gesture has not yet been described in the literature to the best of our knowledge. It can however be conceived of as having the opposite function of announcing the end of a point in discourse. The object of the present study is therefore to determine, with the study of prosodic features, if the two gestures are found in the same type of speech units and what their respective scope is.
    Drawing from a corpus of three TED Talks in French, the prosodic characteristics of the speech that accompanies the two gestures will be examined. The hypothesis developed in the present paper is that their scope should be reflected in the prosody of accompanying speech, especially pitch key, tone, and relative pitch range. The prediction is that hand flips and closing hand gestures are expected to be located at the periphery of Intonation Phrases (IPs), Inter-Pausal Units (IPUs) or more conversational Turn Constructional Units (TCUs), and are likely to be co-occurrent with pauses in speech. But because of the natural slope of intonation in speech, the speech that accompanies early gestures in Intonation Phrases should reveal different features from the speech at the end of intonational units. Tones should be different as well, considering the prosodic structure of spoken French.
  • Gamba, M., Raimondi, T., De Gregorio, C., Valente, D., Carugati, F., Cristiano, W., Ferrario, V., Torti, V., Favaro, L., Friard, O., Giacoma, C., & Ravignani, A. (2023). Rhythmic categories across primate vocal displays. In A. Astolfi, F. Asdrubali, & L. Shtrepi (Eds.), Proceedings of the 10th Convention of the European Acoustics Association Forum Acusticum 2023 (pp. 3971-3974). Torino: European Acoustics Association.

    Abstract

    The last few years have revealed that several species may share the building blocks of musicality with humans. The recognition of these building blocks (e.g., rhythm, frequency variation) was a necessary impetus for a new round of studies investigating rhythmic variation in animal vocal displays. Singing primates are a small group of primate species that produce modulated songs ranging from tens to thousands of vocal units. Previous studies showed that the indri, the only singing lemur, is currently the only known species that performs duets and choruses showing multiple rhythmic categories, as seen in human music. Rhythmic categories occur when temporal intervals between note onsets are not uniformly distributed, and rhythms with a small integer ratio between these intervals are typical of human music. Besides indris, white-handed gibbons and three crested gibbon species showed a prominent rhythmic category corresponding to a single small integer ratio, isochrony. This study reviews previous evidence on the co-occurrence of rhythmic categories in primates and focuses on the prospects for a comparative, multimodal study of rhythmicity in this clade.
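    As a rough illustration of the interval ratios mentioned in this abstract, the sketch below computes a ratio for each pair of adjacent inter-onset intervals. The formula r = t1 / (t1 + t2) is one common convention and is an assumption here, since the abstract does not state how ratios were computed; the function names and the onset times are hypothetical.

```python
from typing import List


def inter_onset_intervals(onsets: List[float]) -> List[float]:
    """Durations between consecutive note onsets (same unit as the input)."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


def rhythm_ratios(onsets: List[float]) -> List[float]:
    """Ratio r = t1 / (t1 + t2) for each pair of adjacent intervals.

    Assumed convention: r = 0.5 means equal intervals (isochrony, 1:1),
    while r near 1/3 or 2/3 corresponds to a 1:2 or 2:1 pattern.
    Onsets must be strictly increasing so every interval is positive.
    """
    iois = inter_onset_intervals(onsets)
    return [t1 / (t1 + t2) for t1, t2 in zip(iois, iois[1:])]


# Made-up onset times in seconds, not data from the study.
isochronous = [0.0, 0.5, 1.0, 1.5, 2.0]
long_short = [0.0, 1.0, 1.5, 2.5, 3.0]

print(rhythm_ratios(isochronous))
print(rhythm_ratios(long_short))
```

    Under this convention, a rhythmic category would show up as a peak in the distribution of ratios, for example around 0.5 for isochrony.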
  • Green, K., Osei-Cobbina, C., Perlman, M., & Kita, S. (2023). Infants can create different types of iconic gestures, with and without parental scaffolding. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527188.

    Abstract

    Despite the early emergence of pointing, children are generally not documented to produce iconic gestures until later in development. Although research has described this developmental trajectory and the types of iconic gestures that emerge first, there has been limited focus on iconic gestures within interactional contexts. This study identified the first 10 iconic gestures produced by five monolingual English-speaking children in a naturalistic longitudinal video corpus and analysed the interactional contexts. We found children produced their first iconic gesture between 12 and 20 months and that gestural types varied. Although 34% of gestures could have been imitated or derived from adult or child actions in the preceding context, the majority were produced independently of any observed model. In these cases, adults often led the interaction in a direction where iconic gesture was an appropriate response. Overall, we find infants can represent a referent symbolically and possess a greater capacity for innovation than previously assumed. In order to develop our understanding of how children learn to produce iconic gestures, it is important to consider the immediate interactional context. Conducting naturalistic corpus analyses could be a more ecologically valid approach to understanding how children learn to produce iconic gestures in real life contexts.
  • Guirardello-Damian, R., & Skiba, R. (2002). Trumai Corpus: An example of presenting multi-media data in the IMDI-browser. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 16-1-16-8). Paris: European Language Resources Association.

    Abstract

    Trumai, a genetically isolated language spoken in Brazil (Xingu reserve), is an example of an endangered language. Although the Trumai population consists of more than 100 individuals, only 51 people speak the language. The oral traditions are progressively dying. Given the current scenario, the documentation of this language and its cultural aspects is of great importance. In the framework of the DoBeS program (Documentation of Endangered Languages), the project "Documentation of Trumai" has selected and organized a collection of Trumai texts, with a multi-media representation of the corpus. Several kinds of information and data types are being included in the archive of the language: texts with audio and video recordings; written texts from educational materials; drawings; photos; songs; annotations in different formats; lexicon; field notes; results from scientific studies of the language (sound system, sketch grammar, comparative studies with other Xinguan languages), etc. All materials are integrated into the IMDI-Browser, a specialized tool for presenting and searching for linguistic data. This paper explores the processing phases and the results of the Trumai project taking into consideration the issue of how to combine the needs and wishes of field linguistics (content and research aspects) and the needs of archiving (structure and workflow aspects) in a well-organized corpus.
  • Gulrajani, G., & Harrison, D. (2002). SHAWEL: Sharable and interactive web-lexicons. In P. Austin, H. Dry, & P. Wittenburg (Eds.), Proceedings of the international LREC workshop on resources and tools in field linguistics (pp. 9-1-9-4). Paris: European Language Resources Association.

    Abstract

    A prototypical lexicon tool was implemented which was intended to allow researchers to collaboratively create lexicons of endangered languages. Increasingly often researchers documenting or analyzing a language work at different locations. Lexicons that evolve through continuous interaction between the collaborators can only be efficiently produced when they can be accessed and manipulated via the Internet. The SHAWEL tool was developed to address these needs; it makes use of a thin Java client and a central database solution.
  • Gussenhoven, C., & Chen, A. (2000). Universal and language-specific effects in the perception of question intonation. In B. Yuan, T. Huang, & X. Tang (Eds.), Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP) (pp. 91-94). Beijing: China Military Friendship Publish.

    Abstract

    Three groups of monolingual listeners, with Standard Chinese, Dutch and Hungarian as their native language, judged pairs of trisyllabic stimuli which differed only in their pitch pattern. The segmental structure of the stimuli was made up by the experimenters and presented to subjects as being taken from a little-known language spoken on a South Pacific island. Pitch patterns consisted of a single rise-fall located on or near the second syllable. By and large, listeners selected the stimulus with the higher peak, the later peak, and the higher end rise as the one that signalled a question, regardless of language group. The result is argued to reflect innate, non-linguistic knowledge of the meaning of pitch variation, notably Ohala’s Frequency Code. A significant difference between groups is explained as due to the influence of the mother tongue.
  • Gussenhoven, C., & Chen, A. (2000). Universal and language-specific effects in the perception of question intonation. In Proceedings of the 6th International Conference on Spoken Language Processing (ICSLP) (pp. 91-94).
  • Hamilton, A., & Holler, J. (Eds.). (2023). Face2face: Advancing the science of social interaction [Special Issue]. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences. Retrieved from https://royalsocietypublishing.org/toc/rstb/2023/378/1875.

    Abstract

    Face-to-face interaction is fundamental to human sociality but is very complex to study in a scientific fashion. This theme issue brings together cutting-edge approaches to the study of face-to-face interaction and showcases how we can make progress in this area. Researchers are now studying interaction in adult conversation, parent-child relationships, neurodiverse groups, interactions with virtual agents and various animal species. The theme issue reveals how new paradigms are leading to more ecologically grounded and comprehensive insights into what social interaction is. Scientific advances in this area can lead to improvements in education and therapy, better understanding of neurodiversity and more engaging artificial agents.
  • Harbusch, K., & Kempen, G. (2000). Complexity of linear order computation in Performance Grammar, TAG and HPSG. In Proceedings of Fifth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+5) (pp. 101-106).

    Abstract

    This paper investigates the time and space complexity of word order computation in the psycholinguistically motivated grammar formalism of Performance Grammar (PG). In PG, the first stage of syntax assembly yields an unordered tree ('mobile') consisting of a hierarchy of lexical frames (lexically anchored elementary trees). Associated with each lexical frame is a linearizer, a Finite-State Automaton that locally computes the left-to-right order of the branches of the frame. Linearization takes place after the promotion component may have raised certain constituents (e.g. Wh- or focused phrases) into the domain of lexical frames higher up in the syntactic mobile. We show that the worst-case time and space complexity of analyzing input strings of length n is O(n^5) and O(n^4), respectively. This result compares favorably with the time complexity of word-order computations in Tree Adjoining Grammar (TAG). A comparison with Head-Driven Phrase Structure Grammar (HPSG) reveals that PG yields a more declarative linearization method, provided that the FSA is rewritten as an equivalent regular expression.
  • Harbusch, K., & Kempen, G. (2002). A quantitative model of word order and movement in English, Dutch and German complement constructions. In Proceedings of the 19th international conference on Computational linguistics. San Francisco: Morgan Kaufmann.

    Abstract

    We present a quantitative model of word order and movement constraints that enables a simple and uniform treatment of a seemingly heterogeneous collection of linear order phenomena in English, Dutch and German complement constructions (Wh-extraction, clause union, extraposition, verb clustering, particle movement, etc.). Underlying the scheme are central assumptions of the psycholinguistically motivated Performance Grammar (PG). Here we describe this formalism in declarative terms based on typed feature unification. PG allows a homogeneous treatment of both the within- and between-language variations of the ordering phenomena under discussion, which reduce to different settings of a small number of quantitative parameters.
  • Hellwig, B., Allen, S. E. M., Davidson, L., Defina, R., Kelly, B. F., & Kidd, E. (Eds.). (2023). The acquisition sketch project [Special Issue]. Language Documentation and Conservation Special Publication, 28.

    Abstract

    This special publication aims to build a renewed enthusiasm for collecting acquisition data across many languages, including those facing endangerment and loss. It presents a guide for documenting and describing child language and child-directed language in diverse languages and cultures, as well as a collection of acquisition sketches based on this guide. The guide is intended for anyone interested in working across child language and language documentation, including, for example, field linguists and language documenters, community language workers, child language researchers or graduate students.
  • Jadoul, Y., Düngen, D., & Ravignani, A. (2023). Live-tracking acoustic parameters in animal behavioural experiments: Interactive bioacoustics with parselmouth. In A. Astolfi, F. Asdrubali, & L. Shtrepi (Eds.), Proceedings of the 10th Convention of the European Acoustics Association Forum Acusticum 2023 (pp. 4675-4678). Torino: European Acoustics Association.

    Abstract

    Most bioacoustics software is used to analyse already collected acoustic data in batch, i.e., after the data-collecting phase of a scientific study. However, experiments based on animal training require immediate and precise reactions from the experimenter, and thus do not easily dovetail with a typical bioacoustics workflow. Bridging this methodological gap, we have developed a custom application to live-monitor the vocal development of harbour seals in a behavioural experiment. In each trial, the application records and automatically detects an animal's call, and immediately measures duration and acoustic measures such as intensity, fundamental frequency, or formant frequencies. It then displays a spectrogram of the recording and the acoustic measurements, allowing the experimenter to instantly evaluate whether or not to reinforce the animal's vocalisation. From a technical perspective, the rapid and easy development of this custom software was made possible by combining multiple open-source software projects. Here, we integrated the acoustic analyses from Parselmouth, a Python library for Praat, together with PyAudio and Matplotlib's recording and plotting functionality, into a custom graphical user interface created with PyQt. This flexible recombination of different open-source Python libraries allows the whole program to be written in a mere couple of hundred lines of code.
  • Janse, E., Sennema, A., & Slis, A. (2000). Fast speech timing in Dutch: The durational correlates of lexical stress and pitch accent. In Proceedings of the VIth International Conference on Spoken Language Processing, Vol. III (pp. 251-254).

    Abstract

    In this study we investigated the durational correlates of lexical stress and pitch accent at normal and fast speech rate in Dutch. Previous literature on English shows that durations of lexically unstressed vowels are reduced more than stressed vowels when speakers increase their speech rate. We found that the same holds for Dutch, irrespective of whether the unstressed vowel is schwa or a "full" vowel. In the same line, we expected that vowels in words without a pitch accent would be shortened relatively more than vowels in words with a pitch accent. This was not the case: if anything, the accented vowels were shortened relatively more than the unaccented vowels. We conclude that duration is an important cue for lexical stress, but not for pitch accent.
  • Janse, E. (2000). Intelligibility of time-compressed speech: Three ways of time-compression. In Proceedings of the VIth International Conference on Spoken Language Processing, vol. III (pp. 786-789).

    Abstract

    Studies on fast speech have shown that word-level timing of fast speech differs from that of normal rate speech in that unstressed syllables are shortened more than stressed syllables as speech rate increases. An earlier experiment showed that the intelligibility of time-compressed speech could not be improved by making its temporal organisation closer to natural fast speech. To test the hypothesis that segmental intelligibility is more important than prosodic timing in listening to time-compressed speech, the intelligibility of bisyllabic words was tested in three time-compression conditions: either stressed and unstressed syllables were compressed to the same degree, or the stressed syllable was compressed more than the unstressed syllable, or the reverse. As was found before, imitating word-level timing of fast speech did not improve intelligibility over linear compression. However, the results did not confirm the hypothesis either: there was no difference in intelligibility between the three compression conditions. We conclude that segmental intelligibility plays an important role, but further research is necessary to decide between the contributions of prosody and segmental intelligibility to the word-level intelligibility of time-compressed speech.
  • Janse, E. (2002). Time-compressing natural and synthetic speech. In Proceedings of 7th International Conference on Spoken Language Processing (pp. 1645-1648).
  • Janse, E. (2003). Word perception in natural-fast and artificially time-compressed speech. In M. Solé, D. Recasens, & J. Romero (Eds.), Proceedings of the 15th International Congress of the Phonetic Sciences (pp. 3001-3004).
  • Johnson, E. K. (2003). Speaker intent influences infants' segmentation of potentially ambiguous utterances. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 1995-1998). Adelaide: Causal Productions.
  • Johnson, E. K., Jusczyk, P. W., Cutler, A., & Norris, D. (2000). The development of word recognition: The use of the possible-word constraint by 12-month-olds. In L. Gleitman, & A. Joshi (Eds.), Proceedings of CogSci 2000 (p. 1034). London: Erlbaum.
  • Jordanoska, I., Kocher, A., & Bendezú-Araujo, R. (Eds.). (2023). Marking the truth: A cross-linguistic approach to verum [Special Issue]. Zeitschrift für Sprachwissenschaft, 42(3).
  • Kanakanti, M., Singh, S., & Shrivastava, M. (2023). MultiFacet: A multi-tasking framework for speech-to-sign language generation. In E. André, M. Chetouani, D. Vaufreydaz, G. Lucas, T. Schultz, L.-P. Morency, & A. Vinciarelli (Eds.), ICMI '23 Companion: Companion Publication of the 25th International Conference on Multimodal Interaction (pp. 205-213). New York: ACM. doi:10.1145/3610661.3616550.

    Abstract

    Sign language is a rich form of communication, uniquely conveying meaning through a combination of gestures, facial expressions, and body movements. Existing research in sign language generation has predominantly focused on text-to-sign pose generation, while speech-to-sign pose generation remains relatively underexplored. Speech-to-sign language generation models can facilitate effective communication between the deaf and hearing communities. In this paper, we propose an architecture that utilises prosodic information from speech audio and semantic context from text to generate sign pose sequences. In our approach, we adopt a multi-tasking strategy that involves an additional task of predicting Facial Action Units (FAUs). FAUs capture the intricate facial muscle movements that play a crucial role in conveying specific facial expressions during sign language generation. We train our models on an existing Indian Sign language dataset that contains sign language videos with audio and text translations. To evaluate our models, we report Dynamic Time Warping (DTW) and Probability of Correct Keypoints (PCK) scores. We find that combining prosody and text as input, along with incorporating facial action unit prediction as an additional task, outperforms previous models in both DTW and PCK scores. We also discuss the challenges and limitations of speech-to-sign pose generation models to encourage future research in this domain. We release our models, results and code to foster reproducibility and encourage future research.
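    For readers unfamiliar with the two evaluation scores named in this abstract, the following is a minimal sketch of textbook versions of both. It is not the authors' evaluation code: the function names, the use of 1-D sequences for DTW, and the fixed pixel-style threshold for PCK are all assumptions made for illustration.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Textbook dynamic time warping cost between two 1-D sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def pck(pred: Sequence[Point], truth: Sequence[Point], threshold: float) -> float:
    """Fraction of predicted keypoints within `threshold` of the ground truth.

    Assumption: an absolute distance threshold; real evaluations often
    normalise it by a body-size measure instead.
    """
    hits = sum(1 for p, t in zip(pred, truth) if math.dist(p, t) <= threshold)
    return hits / len(truth)


# Made-up values for illustration only.
print(dtw_distance([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0]))
print(pck([(0.0, 0.0), (1.0, 1.0)], [(0.0, 0.0), (3.0, 5.0)], threshold=0.5))
```

    Lower DTW cost and higher PCK both indicate generated poses that are closer to the reference.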
  • Kearns, R. K., Norris, D., & Cutler, A. (2002). Syllable processing in English. In Proceedings of the 7th International Conference on Spoken Language Processing [ICSLP 2002] (pp. 1657-1660).

    Abstract

    We describe a reaction time study in which listeners detected word or nonword syllable targets (e.g. zoo, trel) in sequences consisting of the target plus a consonant or syllable residue (trelsh, trelshek). The pattern of responses differed from an earlier word-spotting study with the same material, in which words were always harder to find if only a consonant residue remained. The earlier results should thus not be viewed in terms of syllabic parsing, but in terms of a universal role for syllables in speech perception; words which are accidentally present in spoken input (e.g. sell in self) can be rejected when they leave a residue of the input which could not itself be a word.
  • Kempen, G., & Harbusch, K. (2003). A corpus study into word order variation in German subordinate clauses: Animacy affects linearization independently of function assignment. In Proceedings of AMLaP 2003 (pp. 153-154). Glasgow: Glasgow University.
  • Kempen, G., & Van Breugel, C. (2002). A workbench for visual-interactive grammar instruction at the secondary education level. In Proceedings of the 10th International CALL Conference (pp. 157-158). Antwerp: University of Antwerp.
  • Kempen, G. (1988). De netwerker: Spin in het web of rat in een doolhof? In SURF in theorie en praktijk: Van personal tot supercomputer (pp. 59-61). Amsterdam: Elsevier Science Publishers.
  • Kempen, G. (1997). De ontdubbelde taalgebruiker: Maken taalproductie en taalperceptie gebruik van één en dezelfde syntactische processor? [Abstract]. In 6e Winter Congres NvP. Programma and abstracts (pp. 31-32). Nederlandse Vereniging voor Psychonomie.
  • Kempen, G., Kooij, A., & Van Leeuwen, T. (1997). Do skilled readers exploit inflectional spelling cues that do not mirror pronunciation? An eye movement study of morpho-syntactic parsing in Dutch. In Abstracts of the Orthography Workshop "What spelling changes". Nijmegen: Max Planck Institute for Psycholinguistics.
  • Kempen, G. (1994). Innovative language checking software for Dutch. In J. Van Gent, & E. Peeters (Eds.), Proceedings of the 2e Dag van het Document (pp. 99-100). Delft: TNO Technisch Physische Dienst.
  • Kempen, G., & Harbusch, K. (2002). Rethinking the architecture of human syntactic processing: The relationship between grammatical encoding and decoding. In Proceedings of the 35th Meeting of the Societas Linguistica Europaea. University of Potsdam.
  • Kempen, G. (1994). The unification space: A hybrid model of human syntactic processing [Abstract]. In Cuny 1994 - The 7th Annual CUNY Conference on Human Sentence Processing. March 17-19, 1994. CUNY Graduate Center, New York.
  • Kempen, G., & Dijkstra, A. (1994). Toward an integrated system for grammar, writing and spelling instruction. In L. Appelo, & F. De Jong (Eds.), Computer-Assisted Language Learning: Proceedings of the Seventh Twente Workshop on Language Technology (pp. 41-46). Enschede: University of Twente.
  • Klein, W. (Ed.). (2002). Sprache des Rechts II [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 128.
  • Klein, W. (2000). Changing concepts of the nature-nurture debate. In R. Hide, J. Mittelstrass, & W. Singer (Eds.), Changing concepts of nature at the turn of the millenium: Proceedings plenary session of the Pontifical academy of sciences, 26-29 October 1998 (pp. 289-299). Vatican City: Pontificia Academia Scientiarum.
  • Klein, W., & Jungbluth, K. (Eds.). (2002). Deixis [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 125.
  • Klein, W., & Franceschini, R. (Eds.). (2003). Einfache Sprache [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 131.
  • Klein, W. (Ed.). (1989). Kindersprache [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (73).
  • Klein, W., & Dittmar, N. (Eds.). (1994). Interkulturelle Kommunikation [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (93).
  • Klein, W. (Ed.). (1997). Technologischer Wandel in den Philologien [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (106).
  • Klein, W. (Ed.). (2000). Sprache des Rechts [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (118).
  • Klein, W. (Ed.). (1988). Sprache Kranker [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (69).
  • Klein, W. (Ed.). (1979). Sprache und Kontext [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (33).
  • Koster, M., & Cutler, A. (1997). Segmental and suprasegmental contributions to spoken-word recognition in Dutch. In Proceedings of EUROSPEECH 97 (pp. 2167-2170). Grenoble, France: ESCA.

    Abstract

    Words can be distinguished by segmental differences or by suprasegmental differences or both. Studies from English suggest that suprasegmentals play little role in human spoken-word recognition; English stress, however, is nearly always unambiguously coded in segmental structure (vowel quality); this relationship is less close in Dutch. The present study directly compared the effects of segmental and suprasegmental mispronunciation on word recognition in Dutch. There was a strong effect of suprasegmental mispronunciation, suggesting that Dutch listeners do exploit suprasegmental information in word recognition. Previous findings indicating that the effects of mis-stressing in Dutch differ with stress position were replicated only when segmental change was involved, suggesting that this is an effect of segmental rather than suprasegmental processing.
  • Kuijpers, C., Van Donselaar, W., & Cutler, A. (2002). Perceptual effects of assimilation-induced violation of final devoicing in Dutch. In J. H. L. Hansen, & B. Pellom (Eds.), The 7th International Conference on Spoken Language Processing (pp. 1661-1664). Denver: ISCA.

    Abstract

    Voice assimilation in Dutch is an optional phonological rule which changes the surface forms of words and in doing so may violate the otherwise obligatory phonological rule of syllable-final devoicing. We report two experiments examining the influence of voice assimilation on phoneme processing, in lexical compound words and in noun-verb phrases. Processing was not impaired in appropriate assimilation contexts across morpheme boundaries, but was impaired when devoicing was violated (a) in an inappropriate (non-assimilatory) context, or (b) across a syntactic boundary.
  • Kuntay, A., & Ozyurek, A. (2002). Joint attention and the development of the use of demonstrative pronouns in Turkish. In B. Skarabela, S. Fish, & A. H. Do (Eds.), Proceedings of the 26th annual Boston University Conference on Language Development (pp. 336-347). Somerville, MA: Cascadilla Press.
  • Kuzla, C. (2003). Prosodically-conditioned variation in the realization of domain-final stops and voicing assimilation of domain-initial fricatives in German. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2829-2832). Adelaide: Causal Productions.
  • De Lange, F. P., Hagoort, P., & Toni, I. (2003). Differential fronto-parietal contributions to visual and motor imagery. NeuroImage, 19(2), e2094-e2095.

    Abstract

    Mental imagery is a cognitive process crucial to human reasoning. Numerous studies have characterized specific
    instances of this cognitive ability, as evoked by visual imagery (VI) or motor imagery (MI) tasks. However, it
    remains unclear which neural resources are shared between VI and MI, and which are exclusively related to MI.
    To address this issue, we have used fMRI to measure human brain activity during performance of VI and MI
    tasks. Crucially, we have modulated the imagery process by manipulating the degree of mental rotation necessary
    to solve the tasks. We focused our analysis on changes in neural signal as a function of the degree of mental
    rotation in each task.
  • Lansner, A., Sandberg, A., Petersson, K. M., & Ingvar, M. (2000). On forgetful attractor network memories. In H. Malmgren, M. Borga, & L. Niklasson (Eds.), Artificial neural networks in medicine and biology: Proceedings of the ANNIMAB-1 Conference, Göteborg, Sweden, 13-16 May 2000 (pp. 54-62). Heidelberg: Springer Verlag.

    Abstract

    A recurrently connected attractor neural network with a Hebbian learning rule is currently our best ANN analogy for a piece of cortex. Functionally, biological memory operates on a spectrum of time scales with regard to induction and retention, and it is modulated in complex ways by sub-cortical neuromodulatory systems. Moreover, biological memory networks are commonly believed to be highly distributed and engage many co-operating cortical areas. Here we focus on the temporal aspects of induction and retention of memory in a connectionist type attractor memory model of a piece of cortex. A continuous time, forgetful Bayesian-Hebbian learning rule is described and compared to the characteristics of LTP and LTD seen experimentally. More generally, an attractor network implementing this learning rule can operate as a long-term, intermediate-term, or short-term memory. Modulation of the print-now signal of the learning rule replicates some experimental memory phenomena, such as the von Restorff effect.
  • Laparle, S. (2023). Moving past the lexical affiliate with a frame-based analysis of gesture meaning. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527218.

    Abstract

    Interpreting the meaning of co-speech gesture often involves identifying a gesture’s ‘lexical affiliate’, the word or phrase to which it most closely relates (Schegloff 1984). Though there is work within gesture studies that resists this simplex mapping of meaning from speech to gesture (e.g. de Ruiter 2000; Kendon 2014; Parrill 2008), including an evolving body of literature on recurrent gesture and gesture families (e.g. Fricke et al. 2014; Müller 2017), it is still the lexical affiliate model that is most apparent in formal linguistic models of multimodal meaning (e.g. Alahverdzhieva et al. 2017; Lascarides and Stone 2009; Pustejovsky and Krishnaswamy 2021; Schlenker 2020). In this work, I argue that the lexical affiliate should be carefully reconsidered in the further development of such models.

    In place of the lexical affiliate, I suggest a further shift toward a frame-based, action schematic approach to gestural meaning in line with that proposed in, for example, Parrill and Sweetser (2004) and Müller (2017). To demonstrate the utility of this approach I present three types of compositional gesture sequences which I call spatial contrast, spatial embedding, and cooperative abstract deixis. All three rely on gestural context, rather than gesture-speech alignment, to convey interactive (i.e. pragmatic) meaning. The centrality of gestural context to gesture meaning in these examples demonstrates the necessity of developing a model of gestural meaning independent of its integration with speech.
  • Levelt, C. C., Fikkert, P., & Schiller, N. O. (2003). Metrical priming in speech production. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2481-2485). Adelaide: Causal Productions.

    Abstract

    In this paper we report on four experiments in which we attempted to prime the stress position of Dutch bisyllabic target nouns. These nouns, picture names, had stress on either the first or the second syllable. Auditory prime words had either the same stress as the target or a different stress (e.g., WORtel – MOtor vs. koSTUUM – MOtor; capital letters indicate stressed syllables in prime – target pairs). Furthermore, half of the prime words were semantically related, the other half were unrelated. No stress priming effect was found in any of the experiments. This could mean that stress is not stored in the lexicon. An additional finding was that targets with initial stress had a faster response than targets with a final stress. We hypothesize that bisyllabic words with final stress take longer to be encoded because this stress pattern is irregular with respect to the lexical distribution of bisyllabic stress patterns, even though it can be regular in terms of the metrical stress rules of Dutch.
  • Levelt, W. J. M. (1994). On the skill of speaking: How do we access words? In Proceedings ICSLP 94 (pp. 2253-2258). Yokohama: The Acoustical Society of Japan.
  • Levelt, W. J. M. (1994). Onder woorden brengen: Beschouwingen over het spreekproces. In Haarlemse voordrachten: voordrachten gehouden in de Hollandsche Maatschappij der Wetenschappen te Haarlem. Haarlem: Hollandsche maatschappij der wetenschappen.
  • Levelt, W. J. M. (1994). What can a theory of normal speaking contribute to AAC? In ISAAC '94 Conference Book and Proceedings. Hoensbroek: IRV.
  • Levinson, S. C. (2000). Language as nature and language as art. In J. Mittelstrass, & W. Singer (Eds.), Proceedings of the Symposium on ‘Changing concepts of nature and the turn of the Millennium’ (pp. 257-287). Vatican City: Pontificae Academiae Scientiarium Scripta Varia.
  • Levinson, S. C. (2000). H.P. Grice on location on Rossel Island. In S. S. Chang, L. Liaw, & J. Ruppenhofer (Eds.), Proceedings of the 25th Annual Meeting of the Berkeley Linguistic Society (pp. 210-224). Berkeley: Berkeley Linguistic Society.
  • Levinson, S. C., & Haviland, J. B. (Eds.). (1994). Space in Mayan languages [Special Issue]. Linguistics, 32(4/5).
  • Levinson, S. C. (1979). Pragmatics and social deixis: Reclaiming the notion of conventional implicature. In C. Chiarello (Ed.), Proceedings of the Fifth Annual Meeting of the Berkeley Linguistics Society (pp. 206-223).
  • Levshina, N. (2023). Testing communicative and learning biases in a causal model of language evolution: A study of cues to Subject and Object. In M. Degano, T. Roberts, G. Sbardolini, & M. Schouwstra (Eds.), The Proceedings of the 23rd Amsterdam Colloquium (pp. 383-387). Amsterdam: University of Amsterdam.
  • Liesenfeld, A., Lopez, A., & Dingemanse, M. (2023). Opening up ChatGPT: Tracking Openness, Transparency, and Accountability in Instruction-Tuned Text Generators. In CUI '23: Proceedings of the 5th International Conference on Conversational User Interfaces. doi:10.1145/3571884.3604316.

    Abstract

    Large language models that exhibit instruction-following behaviour represent one of the biggest recent upheavals in conversational interfaces, a trend in large part fuelled by the release of OpenAI's ChatGPT, a proprietary large language model for text generation fine-tuned through reinforcement learning from human feedback (LLM+RLHF). We review the risks of relying on proprietary software and survey the first crop of open-source projects of comparable architecture and functionality. The main contribution of this paper is to show that openness is differentiated, and to offer scientific documentation of degrees of openness in this fast-moving field. We evaluate projects in terms of openness of code, training data, model weights, RLHF data, licensing, scientific documentation, and access methods. We find that while there is a fast-growing list of projects billing themselves as 'open source', many inherit undocumented data of dubious legality, few share the all-important instruction-tuning (a key site where human labour is involved), and careful scientific documentation is exceedingly rare. Degrees of openness are relevant to fairness and accountability at all points, from data collection and curation to model architecture, and from training and fine-tuning to release and deployment.
  • Liesenfeld, A., Lopez, A., & Dingemanse, M. (2023). The timing bottleneck: Why timing and overlap are mission-critical for conversational user interfaces, speech recognition and dialogue systems. In Proceedings of the 24th Annual Meeting of the Special Interest Group on Discourse and Dialogue (SIGDial 2023). doi:10.18653/v1/2023.sigdial-1.45.

    Abstract

    Speech recognition systems are a key intermediary in voice-driven human-computer interaction. Although speech recognition works well for pristine monologic audio, real-life use cases in open-ended interactive settings still present many challenges. We argue that timing is mission-critical for dialogue systems, and evaluate 5 major commercial ASR systems for their conversational and multilingual support. We find that word error rates for natural conversational data in 6 languages remain abysmal, and that overlap remains a key challenge (study 1). This impacts especially the recognition of conversational words (study 2), and in turn has dire consequences for downstream intent recognition (study 3). Our findings help to evaluate the current state of conversational ASR, contribute towards multidimensional error analysis and evaluation, and identify phenomena that need most attention on the way to build robust interactive speech technologies.
  • Matsuo, A., & Duffield, N. (2002). Assessing the generality of knowledge about English ellipsis in SLA. In J. Costa, & M. J. Freitas (Eds.), Proceedings of the GALA 2001 Conference on Language Acquisition (pp. 49-53). Lisboa: Associacao Portuguesa de Linguistica.
  • Matsuo, A., & Duffield, N. (2002). Finiteness and parallelism: Assessing the generality of knowledge about English ellipsis in SLA. In B. Skarabela, S. Fish, & A.-H.-J. Do (Eds.), Proceedings of the 26th Boston University Conference on Language Development (pp. 197-207). Somerville, Massachusetts: Cascadilla Press.
