Publications

Displaying 1 - 100 of 119
  • Akamine, S., Kohatsu, T., Niikuni, K., Schafer, A. J., & Sato, M. (2022). Emotions in language processing: Affective priming in embodied cognition. In Proceedings of the 39th Annual Meeting of Japanese Cognitive Science Society (pp. 326-332). Tokyo: Japanese Cognitive Science Society.
  • Alhama, R. G., Rowland, C. F., & Kidd, E. (2020). Evaluating word embeddings for language acquisition. In E. Chersoni, C. Jacobs, Y. Oseki, L. Prévot, & E. Santus (Eds.), Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (pp. 38-42). Stroudsburg, PA, USA: Association for Computational Linguistics (ACL). doi:10.18653/v1/2020.cmcl-1.4.

    Abstract

    Continuous vector word representations (or
    word embeddings) have shown success in cap-turing semantic relations between words, as evidenced by evaluation against behavioral data of adult performance on semantic tasks (Pereira et al., 2016). Adult semantic knowl-edge is the endpoint of a language acquisition process; thus, a relevant question is whether these models can also capture emerging word
    representations of young language learners. However, the data for children’s semantic knowledge across development is scarce. In this paper, we propose to bridge this gap by using Age of Acquisition norms to evaluate word embeddings learnt from child-directed input. We present two methods that evaluate word embeddings in terms of (a) the semantic neighbourhood density of learnt words, and (b) con-
    vergence to adult word associations. We apply our methods to bag-of-words models, and find that (1) children acquire words with fewer semantic neighbours earlier, and (2) young learners only attend to very local context. These findings provide converging evidence for validity of our methods in understanding the prerequisite features for a distributional model of word learning.
  • Allen, S. E. M. (1998). A discourse-pragmatic explanation for the subject-object asymmetry in early null arguments. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the GALA '97 Conference on Language Acquisition (pp. 10-15). Edinburgh, UK: Edinburgh University Press.

    Abstract

    The present paper assesses discourse-pragmatic factors as a potential explanation for the subject-object assymetry in early child language. It identifies a set of factors which characterize typical situations of informativeness (Greenfield & Smith, 1976), and uses these factors to identify informative arguments in data from four children aged 2;0 through 3;6 learning Inuktitut as a first language. In addition, it assesses the extent of the links between features of informativeness on one hand and lexical vs. null and subject vs. object arguments on the other. Results suggest that a pragmatics account of the subject-object asymmetry can be upheld to a greater extent than previous research indicates, and that several of the factors characterizing informativeness are good indicators of those arguments which tend to be omitted in early child language.
  • Asano, Y., Yuan, C., Grohe, A.-K., Weber, A., Antoniou, M., & Cutler, A. (2020). Uptalk interpretation as a function of listening experience. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 735-739). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-150.

    Abstract

    The term “uptalk” describes utterance-final pitch rises that carry no sentence-structural information. Uptalk is usually dialectal or sociolectal, and Australian English (AusEng) is particularly known for this attribute. We ask here whether experience with an uptalk variety affects listeners’ ability to categorise rising pitch contours on the basis of the timing and height of their onset and offset. Listeners were two groups of English-speakers (AusEng, and American English), and three groups of listeners with L2 English: one group with Mandarin as L1 and experience of listening to AusEng, one with German as L1 and experience of listening to AusEng, and one with German as L1 but no AusEng experience. They heard nouns (e.g. flower, piano) in the framework “Got a NOUN”, each ending with a pitch rise artificially manipulated on three contrasts: low vs. high rise onset, low vs. high rise offset and early vs. late rise onset. Their task was to categorise the tokens as “question” or “statement”, and we analysed the effect of the pitch contrasts on their judgements. Only the native AusEng listeners were able to use the pitch contrasts systematically in making these categorisations.
  • Bauer, B. L. M. (1999). Aspects of impersonal constructions in Late Latin. In H. Petersmann, & R. Kettelmann (Eds.), Latin vulgaire – latin tardif V (pp. 209-211). Heidelberg: Winter.
  • Bauer, B. L. M. (2022). Finite verb + infinite + object in later Latin: Early brace constructions? In G. V. M. Haverling (Ed.), Studies on Late and Vulgar Latin in the Early 21st Century: Acts of the 12th International Colloquium "Latin vulgaire – Latin tardif (pp. 166-181). Uppsala: Acta Universitatis Upsaliensis.
  • De Boer, B., Thompson, B., Ravignani, A., & Boeckx, C. (2020). Analysis of mutation and fixation for language. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 56-58). Nijmegen: The Evolution of Language Conferences.
  • Bowerman, M. (1996). Argument structure and learnability: Is a solution in sight? In J. Johnson, M. L. Juge, & J. L. Moxley (Eds.), Proceedings of the Twenty-second Annual Meeting of the Berkeley Linguistics Society, February 16-19, 1996. General Session and Parasession on The Role of Learnability in Grammatical Theory (pp. 454-468). Berkeley Linguistics Society.
  • Bowerman, M., de León, L., & Choi, S. (1995). Verbs, particles, and spatial semantics: Learning to talk about spatial actions in typologically different languages. In E. V. Clark (Ed.), Proceedings of the Twenty-seventh Annual Child Language Research Forum (pp. 101-110). Stanford, CA: Center for the Study of Language and Information.
  • Bruggeman, L., Yu, J., & Cutler, A. (2022). Listener adjustment of stress cue use to fit language vocabulary structure. In S. Frota, M. Cruz, & M. Vigário (Eds.), Proceedings of Speech Prosody 2022 (pp. 264-267). doi:10.21437/SpeechProsody.2022-54.

    Abstract

    In lexical stress languages, phonemically identical syllables can differ suprasegmentally (in duration, amplitude, F0). Such stress
    cues allow listeners to speed spoken-word recognition by rejecting mismatching competitors (e.g., unstressed set- in settee
    rules out stressed set- in setting, setter, settle). Such processing effects have indeed been observed in Spanish, Dutch and German, but English listeners are known to largely ignore stress cues. Dutch and German listeners even outdo English listeners in distinguishing stressed versus unstressed English syllables. This has been attributed to the relative frequency across the stress languages of unstressed syllables with full vowels; in English most unstressed syllables contain schwa, instead, and stress cues on full vowels are thus least often informative in this language. If only informativeness matters, would English listeners who encounter situations where such cues would pay off for them (e.g., learning one of those other stress languages) then shift to using stress cues? Likewise, would stress cue users with English as L2, if mainly using English, shift away from
    using the cues in English? Here we report tests of these two questions, with each receiving a yes answer. We propose that
    English listeners’ disregard of stress cues is purely pragmatic.
  • Bujok, R., Meyer, A. S., & Bosker, H. R. (2022). Visible lexical stress cues on the face do not influence audiovisual speech perception. In S. Frota, M. Cruz, & M. Vigário (Eds.), Proceedings of Speech Prosody 2022 (pp. 259-263). doi:10.21437/SpeechProsody.2022-53.

    Abstract

    Producing lexical stress leads to visible changes on the face, such as longer duration and greater size of the opening of the mouth. Research suggests that these visual cues alone can inform participants about which syllable carries stress (i.e., lip-reading silent videos). This study aims to determine the influence of visual articulatory cues on lexical stress perception in more naturalistic audiovisual settings. Participants were presented with seven disyllabic, Dutch minimal stress pairs (e.g., VOORnaam [first name] & voorNAAM [respectable]) in audio-only (phonetic lexical stress continua without video), video-only (lip-reading silent videos), and audiovisual trials (e.g., phonetic lexical stress continua with video of talker saying VOORnaam or voorNAAM). Categorization data from video-only trials revealed that participants could distinguish the minimal pairs above chance from seeing the silent videos alone. However, responses in the audiovisual condition did not differ from the audio-only condition. We thus conclude that visual lexical stress information on the face, while clearly perceivable, does not play a major role in audiovisual speech perception. This study demonstrates that clear unimodal effects do not always generalize to more naturalistic multimodal communication, advocating that speech prosody is best considered in multimodal settings.
  • Cambier, N., Miletitch, R., Burraco, A. B., & Raviv, L. (2022). Prosociality in swarm robotics: A model to study self-domestication and language evolution. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 98-100). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Cheung, C.-Y., Yakpo, K., & Coupé, C. (2022). A computational simulation of the genesis and spread of lexical items in situations of abrupt language contact. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 115-122). Nijmegen: Joint Conference on Language Evolution (JCoLE).

    Abstract

    The current study presents an agent-based model which simulates the innovation and
    competition among lexical items in cases of language contact. It is inspired by relatively
    recent historical cases in which the linguistic ecology and sociohistorical context are highly complex. Pidgin and creole genesis offers an opportunity to obtain linguistic facts, social dynamics, and historical demography in a highly segregated society. This provides a solid ground for researching the interaction of populations with different pre-existing language systems, and how different factors contribute to the genesis of the lexicon of a newly generated mixed language. We take into consideration the population dynamics and structures, as well as a distribution of word frequencies related to language use, in order to study how social factors may affect the developmental trajectory of languages. Focusing on the case of Sranan in Suriname, our study shows that it is possible to account for the
    composition of its core lexicon in relation to different social groups, contact patterns, and
    large population movements.
  • Crago, M. B., Allen, S. E. M., & Pesco, D. (1998). Issues of Complexity in Inuktitut and English Child Directed Speech. In Proceedings of the twenty-ninth Annual Stanford Child Language Research Forum (pp. 37-46).
  • Cutler, A., & Otake, T. (1998). Assimilation of place in Japanese and Dutch. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: vol. 5 (pp. 1751-1754). Sydney: ICLSP.

    Abstract

    Assimilation of place of articulation across a nasal and a following stop consonant is obligatory in Japanese, but not in Dutch. In four experiments the processing of assimilated forms by speakers of Japanese and Dutch was compared, using a task in which listeners blended pseudo-word pairs such as ranga-serupa. An assimilated blend of this pair would be rampa, an unassimilated blend rangpa. Japanese listeners produced significantly more assimilated than unassimilated forms, both with pseudo-Japanese and pseudo-Dutch materials, while Dutch listeners produced significantly more unassimilated than assimilated forms in each materials set. This suggests that Japanese listeners, whose native-language phonology involves obligatory assimilation constraints, represent the assimilated nasals in nasal-stop sequences as unmarked for place of articulation, while Dutch listeners, who are accustomed to hearing unassimilated forms, represent the same nasal segments as marked for place of articulation.
  • Cutler, A. (1998). How listeners find the right words. In Proceedings of the Sixteenth International Congress on Acoustics: Vol. 2 (pp. 1377-1380). Melville, NY: Acoustical Society of America.

    Abstract

    Languages contain tens of thousands of words, but these are constructed from a tiny handful of phonetic elements. Consequently, words resemble one another, or can be embedded within one another, a coup stick snot with standing. me process of spoken-word recognition by human listeners involves activation of multiple word candidates consistent with the input, and direct competition between activated candidate words. Further, human listeners are sensitive, at an early, prelexical, stage of speeeh processing, to constraints on what could potentially be a word of the language.
  • Cutler, A., & Butterfield, S. (1989). Natural speech cues to word segmentation under difficult listening conditions. In J. Tubach, & J. Mariani (Eds.), Proceedings of Eurospeech 89: European Conference on Speech Communication and Technology: Vol. 2 (pp. 372-375). Edinburgh: CEP Consultants.

    Abstract

    One of a listener's major tasks in understanding continuous speech is segmenting the speech signal into separate words. When listening conditions are difficult, speakers can help listeners by deliberately speaking more clearly. In three experiments, we examined how word boundaries are produced in deliberately clear speech. We found that speakers do indeed attempt to mark word boundaries; moreover, they differentiate between word boundaries in a way which suggests they are sensitive to listener needs. Application of heuristic segmentation strategies makes word boundaries before strong syllables easiest for listeners to perceive; but under difficult listening conditions speakers pay more attention to marking word boundaries before weak syllables, i.e. they mark those boundaries which are otherwise particularly hard to perceive.
  • Cutler, A., Treiman, R., & Van Ooijen, B. (1998). Orthografik inkoncistensy ephekts in foneme detektion? In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2783-2786). Sydney: ICSLP.

    Abstract

    The phoneme detection task is widely used in spoken word recognition research. Alphabetically literate participants, however, are more used to explicit representations of letters than of phonemes. The present study explored whether phoneme detection is sensitive to how target phonemes are, or may be, orthographically realised. Listeners detected the target sounds [b,m,t,f,s,k] in word-initial position in sequences of isolated English words. Response times were faster to the targets [b,m,t], which have consistent word-initial spelling, than to the targets [f,s,k], which are inconsistently spelled, but only when listeners’ attention was drawn to spelling by the presence in the experiment of many irregularly spelled fillers. Within the inconsistent targets [f,s,k], there was no significant difference between responses to targets in words with majority and minority spellings. We conclude that performance in the phoneme detection task is not necessarily sensitive to orthographic effects, but that salient orthographic manipulation can induce such sensitivity.
  • Cutler, A., & Chen, H.-C. (1995). Phonological similarity effects in Cantonese word recognition. In K. Elenius, & P. Branderud (Eds.), Proceedings of the Thirteenth International Congress of Phonetic Sciences: Vol. 1 (pp. 106-109). Stockholm: Stockholm University.

    Abstract

    Two lexical decision experiments in Cantonese are described in which the recognition of spoken target words as a function of phonological similarity to a preceding prime is investigated. Phonological similaritv in first syllables produced inhibition, while similarity in second syllables led to facilitation. Differences between syllables in tonal and segmental structure had generally similar effects.
  • Cutler, A. (1980). Productivity in word formation. In J. Kreiman, & A. E. Ojeda (Eds.), Papers from the Sixteenth Regional Meeting, Chicago Linguistic Society (pp. 45-51). Chicago, Ill.: CLS.
  • Cutler, A. (1996). The comparative study of spoken-language processing. In H. T. Bunnell (Ed.), Proceedings of the Fourth International Conference on Spoken Language Processing: Vol. 1 (pp. 1). New York: Institute of Electrical and Electronics Engineers.

    Abstract

    Psycholinguists are saddled with a paradox. Their aim is to construct a model of human language processing, which will hold equally well for the processing of any language, but this aim cannot be achieved just by doing experiments in any language. They have to compare processing of many languages, and actively search for effects which are specific to a single language, even though a model which is itself specific to a single language is really the last thing they want.
  • Cutler, A., & Otake, T. (1996). The processing of word prosody in Japanese. In P. McCormack, & A. Russell (Eds.), Proceedings of the 6th Australian International Conference on Speech Science and Technology (pp. 599-604). Canberra: Australian Speech Science and Technology Association.
  • Cutler, A. (1998). The recognition of spoken words with variable representations. In D. Duez (Ed.), Proceedings of the ESCA Workshop on Sound Patterns of Spontaneous Speech (pp. 83-92). Aix-en-Provence: Université de Aix-en-Provence.
  • Cutler, A., Van Ooijen, B., & Norris, D. (1999). Vowels, consonants, and lexical activation. In J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. Bailey (Eds.), Proceedings of the Fourteenth International Congress of Phonetic Sciences: Vol. 3 (pp. 2053-2056). Berkeley: University of California.

    Abstract

    Two lexical decision studies examined the effects of single-phoneme mismatches on lexical activation in spoken-word recognition. One study was carried out in English, and involved spoken primes and visually presented lexical decision targets. The other study was carried out in Dutch, and primes and targets were both presented auditorily. Facilitation was found only for spoken targets preceded immediately by spoken primes; no facilitation occurred when targets were presented visually, or when intervening input occurred between prime and target. The effects of vowel mismatches and consonant mismatches were equivalent.
  • Cutler, A. (1995). Universal and Language-Specific in the Development of Speech. Biology International, (Special Issue 33).
  • Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mhmm). In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 160-167). Nijmegen: Joint Conference on Language Evolution (JCoLE). doi:10.31234/osf.io/65c79.

    Abstract

    Continuers —words like mm, mmhm, uhum and the like— are among the most frequent types of responses in conversation. They play a key role in joint action coordination by showing positive evidence of understanding and scaffolding narrative delivery. Here we investigate the hypothesis that their functional importance along with their conversational ecology places selective pressures on their form and may lead to cross-linguistic similarities through convergent cultural evolution. We compare continuer tokens in linguistically diverse conversational corpora and find languages make available highly similar forms. We then approach the causal mechanism of convergent cultural evolution using exemplar modelling, simulating the process by which a combination of effort minimization and functional specialization may push continuers to a particular region of phonological possibility space. By combining comparative linguistics and computational modelling we shed new light on the question of how language structure is shaped by and for social interaction.
  • Dingemanse, M., & Liesenfeld, A. (2022). From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022) (pp. 5614 -5633). Dublin, Ireland: Association for Computational Linguistics.

    Abstract

    Informal social interaction is the primordial home of human language. Linguistically diverse conversational corpora are an important and largely untapped resource for computational linguistics and language technology. Through the efforts of a worldwide language documentation movement, such corpora are increasingly becoming available. We show how interactional data from 63 languages (26 families) harbours insights about turn-taking, timing, sequential structure and social action, with implications for language technology, natural language understanding, and the design of conversational interfaces. Harnessing linguistically diverse conversational corpora will provide the empirical foundations for flexible, localizable, humane language technologies of the future.
  • Dona, L., & Schouwstra, M. (2022). The Role of Structural Priming, Semantics and Population Structure in Word Order Conventionalization: A Computational Model. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 171-173). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Doumas, L. A. A., Martin, A. E., & Hummel, J. E. (2020). Relation learning in a neurocomputational architecture supports cross-domain transfer. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Virtual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 932-937). Montreal, QB: Cognitive Science Society.

    Abstract

    Humans readily generalize, applying prior knowledge to novel situations and stimuli. Advances in machine learning have begun to approximate and even surpass human performance, but these systems struggle to generalize what they have learned to untrained situations. We present a model based on wellestablished neurocomputational principles that demonstrates human-level generalisation. This model is trained to play one video game (Breakout) and performs one-shot generalisation to a new game (Pong) with different characteristics. The model
    generalizes because it learns structured representations that are functionally symbolic (viz., a role-filler binding calculus) from unstructured training data. It does so without feedback, and without requiring that structured representations are specified a priori. Specifically, the model uses neural co-activation to discover which characteristics of the input are invariant and to learn relational predicates, and oscillatory regularities in network firing to bind predicates to arguments. To our knowledge,
    this is the first demonstration of human-like generalisation in a machine system that does not assume structured representa-
    tions to begin with.
  • Drexler, H., Verbunt, A., & Wittenburg, P. (1996). Max Planck Electronic Information Desk. In B. den Brinker, J. Beek, A. Hollander, & R. Nieuwboer (Eds.), Zesde workshop computers in de psychologie: Programma en uitgebreide samenvattingen (pp. 64-66). Amsterdam: Vrije Universiteit Amsterdam, IFKB.
  • Drozd, K. F. (1998). No as a determiner in child English: A summary of categorical evidence. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the Gala '97 Conference on Language Acquisition (pp. 34-39). Edinburgh, UK: Edinburgh University Press,.

    Abstract

    This paper summarizes the results of a descriptive syntactic category analysis of child English no which reveals that young children use and represent no as a determiner and negatives like no pen as NPs, contra standard analyses.
  • Ergin, R., Raviv, L., Senghas, A., Padden, C., & Sandler, W. (2020). Community structure affects convergence on uniform word orders: Evidence from emerging sign languages. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 84-86). Nijmegen: The Evolution of Language Conferences.
  • Fletcher, J., Kidd, E., Stoakes, H., & Nordlinger, R. (2022). Prosodic phrasing, pitch range, and word order variation in Murrinhpatha. In R. Billington (Ed.), Proceedings of the 18th Australasian International Conference on Speech Science and Technology (pp. 201-205). Canberra: Australasian Speech Science and Technology Association.

    Abstract

    Like many Indigenous Australian languages, Murrinhpatha has flexible word order with no apparent configurational syntax. We analyzed an experimental corpus of Murrinhpatha utterances for associations between different thematic role orders, intonational phrasing patterns and pitch downtrends. We found that initial constituents (Agents or Patients) tend to carry the highest pitch targets (HiF0), followed by patterns of downstep and declination. Sentence-final verbs always have lower Hif0 values than either initial or medial Agents or Patients. Thematic role order does not influence intonational
    patterns, with the results suggesting that Murrinhpatha has positional prosody, although final nominals can disrupt global
    pitch downtrends regardless of thematic role.
  • Galke, L., & Scherp, A. (2022). Bag-of-words vs. graph vs. sequence in text classification: Questioning the necessity of text-graphs and the surprising strength of a wide MLP. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (pp. 4038-4051). Dublin: Association for Computational Linguistics. doi:10.18653/v1/2022.acl-long.279.
  • Galke, L., Cuber, I., Meyer, C., Nölscher, H. F., Sonderecker, A., & Scherp, A. (2022). General cross-architecture distillation of pretrained language models into matrix embedding. In Proceedings of the IEEE Joint Conference on Neural Networks (IJCNN 2022), part of the IEEE World Congress on Computational Intelligence (WCCI 2022). doi:10.1109/IJCNN55064.2022.9892144.

    Abstract

    Large pretrained language models (PreLMs) are rev-olutionizing natural language processing across all benchmarks. However, their sheer size is prohibitive for small laboratories or for deployment on mobile devices. Approaches like pruning and distillation reduce the model size but typically retain the same model architecture. In contrast, we explore distilling PreLMs into a different, more efficient architecture, Continual Multiplication of Words (CMOW), which embeds each word as a matrix and uses matrix multiplication to encode sequences. We extend the CMOW architecture and its CMOW/CBOW-Hybrid variant with a bidirectional component for more expressive power, per-token representations for a general (task-agnostic) distillation during pretraining, and a two-sequence encoding scheme that facilitates downstream tasks on sentence pairs, such as sentence similarity and natural language inference. Our matrix-based bidirectional CMOW/CBOW-Hybrid model is competitive to DistilBERT on question similarity and recognizing textual entailment, but uses only half of the number of parameters and is three times faster in terms of inference speed. We match or exceed the scores of ELMo for all tasks of the GLUE benchmark except for the sentiment analysis task SST-2 and the linguistic acceptability task CoLA. However, compared to previous cross-architecture distillation approaches, we demonstrate a doubling of the scores on detecting linguistic acceptability. This shows that matrix-based embeddings can be used to distill large PreLM into competitive models and motivates further research in this direction.
  • Gamba, M., De Gregorio, C., Valente, D., Raimondi, T., Torti, V., Miaretsoa, L., Carugati, F., Friard, O., Giacoma, C., & Ravignani, A. (2022). Primate rhythmic categories analyzed on an individual basis. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 229-236). Nijmegen: Joint Conference on Language Evolution (JCoLE).

    Abstract

    Rhythm is a fundamental feature characterizing communicative displays, and recent studies showed that primate songs encompass categorical rhythms falling on small integer ratios observed in humans. We individually assessed the presence and sexual dimorphism of rhythmic categories, analyzing songs emitted by 39 wild indris. Considering the intervals between the units given during each song, we extracted 13556 interval ratios and found three peaks (at around 0.33, 0.47, and 0.70). Two peaks indicated rhythmic categories corresponding to small integer ratios (1:1, 2:1). All individuals showed a peak at 0.70, and
    most showed those at 0.47 and 0.33. In addition, we found sex differences in the peak at 0.47 only, with males showing lower values than females. This work investigates the presence of individual rhythmic categories in a non-human species; further research may highlight the significance of rhythmicity and untie selective pressures that guided its evolution across species, including humans.
  • Harmon, Z., & Kapatsinski, V. (2020). The best-laid plan of mice and men: Competition between top-down and preceding-item cues in plan execution. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 1674-1680). Montreal, QB: Cognitive Science Society.

    Abstract

    There is evidence that the process of executing a planned utterance involves the use of both preceding-context and top-down cues. Utterance-initial words are cued only by the top-down plan. In contrast, non-initial words are cued both by top-down cues and preceding-context cues. Co-existence of both cue types raises the question of how they interact during learning. We argue that this interaction is competitive: items that tend to be preceded by predictive preceding-context cues are harder to activate from the plan without this predictive context. A novel computational model of this competition is developed. The model is tested on a corpus of repetition disfluencies and shown to account for the influences on patterns of restarts during production. In particular, this model predicts a novel Initiation Effect: following an interruption, speakers re-initiate production from words that tend to occur in utterance-initial position, even when they are not initial in the interrupted utterance.
  • Hashemzadeh, M., Kaufeld, G., White, M., Martin, A. E., & Fyshe, A. (2020). From language to language-ish: How brain-like is an LSTM representation of nonsensical language stimuli? In T. Cohn, Y. He, & Y. Liu (Eds.), Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 645-655). Association for Computational Linguistics.

    Abstract

    The representations generated by many mod-
    els of language (word embeddings, recurrent
    neural networks and transformers) correlate
    to brain activity recorded while people read.
    However, these decoding results are usually
    based on the brain’s reaction to syntactically
    and semantically sound language stimuli. In
    this study, we asked: how does an LSTM (long
    short term memory) language model, trained
    (by and large) on semantically and syntac-
    tically intact language, represent a language
    sample with degraded semantic or syntactic
    information? Does the LSTM representation
    still resemble the brain’s reaction? We found
    that, even for some kinds of nonsensical lan-
    guage, there is a statistically significant rela-
    tionship between the brain’s activity and the
    representations of an LSTM. This indicates
    that, at least in some instances, LSTMs and the
    human brain handle nonsensical data similarly.
  • De Heer Kloots, M., Carlson, D., Garcia, M., Kotz, S., Lowry, A., Poli-Nardi, L., de Reus, K., Rubio-García, A., Sroka, M., Varola, M., & Ravignani, A. (2020). Rhythmic perception, production and interactivity in harbour and grey seals. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 59-62). Nijmegen: The Evolution of Language Conferences.
  • Hintz, F., Voeten, C. C., McQueen, J. M., & Meyer, A. S. (2022). Quantifying the relationships between linguistic experience, general cognitive skills and linguistic processing skills. In J. Culbertson, A. Perfors, H. Rabagliati, & V. Ramenzoni (Eds.), Proceedings of the 44th Annual Conference of the Cognitive Science Society (CogSci 2022) (pp. 2491-2496). Toronto, Canada: Cognitive Science Society.

    Abstract

    Humans differ greatly in their ability to use language. Contemporary psycholinguistic theories assume that individual differences in language skills arise from variability in linguistic experience and in general cognitive skills. While much previous research has tested the involvement of select verbal and non-verbal variables in select domains of linguistic processing, comprehensive characterizations of the relationships among the skills underlying language use are rare. We contribute to such a research program by re-analyzing a publicly available set of data from 112 young adults tested on 35 behavioral tests. The tests assessed nine key constructs reflecting linguistic processing skills, linguistic experience and general cognitive skills. Correlation and hierarchical clustering analyses of the test scores showed that most of the tests assumed to measure the same construct correlated moderately to strongly and largely clustered together. Furthermore, the results suggest important roles of processing speed in comprehension, and of linguistic experience in production.
  • Hoeksema, N., Hagoort, P., & Vernes, S. C. (2022). Piecing together the building blocks of the vocal learning bat brain. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 294-296). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Hoeksema, N., Villanueva, S., Mengede, J., Salazar-Casals, A., Rubio-García, A., Curcic-Blake, B., Vernes, S. C., & Ravignani, A. (2020). Neuroanatomy of the grey seal brain: Bringing pinnipeds into the neurobiological study of vocal learning. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 162-164). Nijmegen: The Evolution of Language Conferences.
  • Hoeksema, N., Wiesmann, M., Kiliaan, A., Hagoort, P., & Vernes, S. C. (2020). Bats and the comparative neurobiology of vocal learning. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 165-167). Nijmegen: The Evolution of Language Conferences.
  • Janse, E., & Quené, H. (1999). On the suitability of the cross-modal semantic priming task. In Proceedings of the XIVth International Congress of Phonetic Sciences (pp. 1937-1940).
  • Kan, U., Gökgöz, K., Sumer, B., Tamyürek, E., & Özyürek, A. (2022). Emergence of negation in a Turkish homesign system: Insights from the family context. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 387-389). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Kempen, G., & Harbusch, K. (1998). A 'tree adjoining' grammar without adjoining: The case of scrambling in German. In Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4).
  • Kempen, G. (1996). Human language technology can modernize writing and grammar instruction. In COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2 (pp. 1005-1006). Stroudsburg, PA: Association for Computational Linguistics.
  • Kempen, G., & Hoenkamp, E. (1982). Incremental sentence generation: Implications for the structure of a syntactic processor. In J. Horecký (Ed.), COLING 82. Proceedings of the Ninth International Conference on Computational Linguistics, Prague, July 5-10, 1982 (pp. 151-156). Amsterdam: North-Holland.

    Abstract

    Human speakers often produce sentences incrementally. They can start speaking having in mind only a fragmentary idea of what they want to say, and while saying this they refine the contents underlying subsequent parts of the utterance. This capability imposes a number of constraints on the design of a syntactic processor. This paper explores these constraints and evaluates some recent computational sentence generators from the perspective of incremental production.
  • Kempen, G., & Janssen, S. (1996). Omspellen: Reuze(n)karwei of peule(n)schil? In H. Croll, & J. Creutzberg (Eds.), Proceedings of the 5e Dag van het Document (pp. 143-146). Projectbureau Croll en Creutzberg.
  • Khoe, Y. H., Tsoukala, C., Kootstra, G. J., & Frank, S. L. (2020). Modeling cross-language structural priming in sentence production. In T. C. Stewart (Ed.), Proceedings of the 18th Annual Meeting of the International Conference on Cognitive Modeling (pp. 131-137). University Park, PA, USA: The Penn State Applied Cognitive Science Lab.

    Abstract

    A central question in the psycholinguistic study of multilingualism is how syntax is shared across languages. We implement a model to investigate whether error-based implicit learning can provide an account of cross-language structural priming. The model is based on the Dual-path model of
    sentence-production (Chang, 2002). We implement our model using the Bilingual version of Dual-path (Tsoukala, Frank, & Broersma, 2017). We answer two main questions: (1) Can structural priming of active and passive constructions occur between English and Spanish in a bilingual version of the Dual-
    path model? (2) Does cross-language priming differ quantitatively from within-language priming in this model? Our results show that cross-language priming does occur in the model. This finding adds to the viability of implicit learning as an account of structural priming in general and cross-language
    structural priming specifically. Furthermore, we find that the within-language priming effect is somewhat stronger than the cross-language effect. In the context of mixed results from
    behavioral studies, we interpret the latter finding as an indication that the difference between cross-language and within-
    language priming is small and difficult to detect statistically.
  • Kita, S., van Gijn, I., & van der Hulst, H. (1998). Movement phases in signs and co-speech gestures, and their transcription by human coders. In Gesture and Sign-Language in Human-Computer Interaction (Lecture Notes in Artificial Intelligence - LNCS Subseries, Vol. 1371) (pp. 23-35). Berlin, Germany: Springer-Verlag.

    Abstract

    The previous literature has suggested that the hand movement in co-speech gestures and signs consists of a series of phases with qualitatively different dynamic characteristics. In this paper, we propose a syntagmatic rule system for movement phases that applies to both co-speech gestures and signs. Descriptive criteria for the rule system were developed for the analysis video-recorded continuous production of signs and gesture. It involves segmenting a stream of body movement into phases and identifying different phase types. Two human coders used the criteria to analyze signs and cospeech gestures that are produced in natural discourse. It was found that the criteria yielded good inter-coder reliability. These criteria can be used for the technology of automatic recognition of signs and co-speech gestures in order to segment continuous production and identify the potentially meaningbearing phase.
  • Klein, W. (1995). A simplest analysis of the English tense-aspect system. In W. Riehle, & H. Keiper (Eds.), Proceedings of the Anglistentag 1994 (pp. 139-151). Tübingen: Niemeyer.
  • Klein, W., & Musan, R. (Eds.). (1999). Das deutsche Perfekt [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (113).
  • Klein, W. (Ed.). (1995). Epoche [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (100).
  • Klein, W. (Ed.). (1980). Argumentation [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (38/39).
  • Klein, W. (Ed.). (1989). Kindersprache [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (73).
  • Klein, W. (Ed.). (1998). Kaleidoskop [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (112).
  • Klein, W. (Ed.). (1984). Textverständlichkeit - Textverstehen [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (55).
  • Klein, W., & Schlieben-Lange, B. (Eds.). (1996). Sprache und Subjektivität I [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (101).
  • Klein, W., & Schlieben-Lange, B. (Eds.). (1996). Sprache und Subjektivität II [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (102).
  • Klein, W. (Ed.). (1985). Schriftlichkeit [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (59).
  • Klein, W. (Ed.). (1996). Zweitspracherwerb [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (104).
  • Klein, W. (Ed.). (1982). Zweitspracherwerb [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (45).
  • Kohatsu, T., Akamine, S., Sato, M., & Niikuni, K. (2022). Individual differences in empathy affect perspective adoption in language comprehension. In Proceedings of the 39th Annual Meeting of Japanese Cognitive Science Society (pp. 652-656). Tokyo: Japanese Cognitive Science Society.
  • Kuijpers, C., Van Donselaar, W., & Cutler, A. (1996). Phonological variation: Epenthesis and deletion of schwa in Dutch. In H. T. Bunnell (Ed.), Proceedings of the Fourth International Conference on Spoken Language Processing: Vol. 1 (pp. 94-97). New York: Institute of Electrical and Electronics Engineers.

    Abstract

    Two types of phonological variation in Dutch, resulting from optional rules, are schwa epenthesis and schwa deletion. In a lexical decision experiment it was investigated whether the phonological variants were processed similarly to the standard forms. It was found that the two types of variation patterned differently. Words with schwa epenthesis were processed faster and more accurately than the standard forms, whereas words with schwa deletion led to less fast and less accurate responses. The results are discussed in relation to the role of consonant-vowel alternations in speech processing and the perceptual integrity of onset clusters.
  • Lattenkamp, E. Z., Linnenschmidt, M., Mardus, E., Vernes, S. C., Wiegrebe, L., & Schutte, M. (2020). Impact of auditory feedback on bat vocal development. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 249-251). Nijmegen: The Evolution of Language Conferences.
  • Lei, L., Raviv, L., & Alday, P. M. (2020). Using spatial visualizations and real-world social networks to understand language evolution and change. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 252-254). Nijmegen: The Evolution of Language Conferences.
  • Levelt, W. J. M., & Plomp, R. (1962). Musical consonance and critical bandwidth. In Proceedings of the 4th International Congress Acoustics (pp. 55-55).
  • Levelt, W. J. M. (1984). Spontaneous self-repairs in speech: Processes and representations. In M. P. R. Van den Broecke, & A. Cohen (Eds.), Proceedings of the 10th International Congress of Phonetic Sciences (pp. 105-117). Dordrecht: Foris.
  • Levshina, N. (2020). How tight is your language? A semantic typology based on Mutual Information. In K. Evang, L. Kallmeyer, R. Ehren, S. Petitjean, E. Seyffarth, & D. Seddah (Eds.), Proceedings of the 19th International Workshop on Treebanks and Linguistic Theories (pp. 70-78). Düsseldorf, Germany: Association for Computational Linguistics. doi:10.18653/v1/2020.tlt-1.7.

    Abstract

    Languages differ in the degree of semantic flexibility of their syntactic roles. For example, Eng-
    lish and Indonesian are considered more flexible with regard to the semantics of subjects,
    whereas German and Japanese are less flexible. In Hawkins’ classification, more flexible lan-
    guages are said to have a loose fit, and less flexible ones are those that have a tight fit. This
    classification has been based on manual inspection of example sentences. The present paper
    proposes a new, quantitative approach to deriving the measures of looseness and tightness from
    corpora. We use corpora of online news from the Leipzig Corpora Collection in thirty typolog-
    ically and genealogically diverse languages and parse them syntactically with the help of the
    Universal Dependencies annotation software. Next, we compute Mutual Information scores for
    each language using the matrices of lexical lemmas and four syntactic dependencies (intransi-
    tive subjects, transitive subject, objects and obliques). The new approach allows us not only to
    reproduce the results of previous investigations, but also to extend the typology to new lan-
    guages. We also demonstrate that verb-final languages tend to have a tighter relationship be-
    tween lexemes and syntactic roles, which helps language users to recognize thematic roles early
    during comprehension.

    Additional information

    full text via ACL website
  • Liesenfeld, A., & Dingemanse, M. (2022). Bottom-up discovery of structure and variation in response tokens (‘backchannels’) across diverse languages. In Proceedings of Interspeech 2022 (pp. 1126-1130).

    Abstract

    Response tokens (also known as backchannels, continuers, or feedback) are a frequent feature of human interaction, where they serve to display understanding and streamline turn-taking. We propose a bottom-up method to study responsive behaviour across 16 languages (8 language families). We use sequential context and recurrence of turns formats to identify candidate response tokens in a language-agnostic way across diverse conversational corpora. We then use UMAP clustering directly on speech signals to represent structure and variation. We find that (i) written orthographic annotations underrepresent the attested variation, (ii) distinctions between formats can be gradient rather than discrete, (iii) most languages appear to make available a broad distinction between a minimal nasal format `mm' and a fuller `yeah’-like format. Charting this aspect of human interaction contributes to our understanding of interactional infrastructure across languages and can inform the design of speech technologies.
  • Liesenfeld, A., & Dingemanse, M. (2022). Building and curating conversational corpora for diversity-aware language science and technology. In F. Béchet, P. Blache, K. Choukri, C. Cieri, T. DeClerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, & J. Odijk (Eds.), Proceedings of the 13th Language Resources and Evaluation Conference (LREC 2022) (pp. 1178-1192). Marseille, France: European Language Resources Association.

    Abstract

    We present an analysis pipeline and best practice guidelines for building and curating corpora of everyday conversation in diverse languages. Surveying language documentation corpora and other resources that cover 67 languages and varieties from 28 phyla, we describe the compilation and curation process, specify minimal properties of a unified format for interactional data, and develop methods for quality control that take into account turn-taking and timing. Two case studies show the broad utility of conversational data for (i) charting human interactional infrastructure and (ii) tracing challenges and opportunities for current ASR solutions. Linguistically diverse conversational corpora can provide new insights for the language sciences and stronger empirical foundations for language technology.
  • MacDonald, K., Räsänen, O., Casillas, M., & Warlaumont, A. S. (2020). Measuring prosodic predictability in children’s home language environments. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Virtual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 695-701). Montreal, QB: Cognitive Science Society.

    Abstract

    Children learn language from the speech in their home environment. Recent work shows that more infant-directed speech
    (IDS) leads to stronger lexical development. But what makes IDS a particularly useful learning signal? Here, we expand on an attention-based account first proposed by Räsänen et al. (2018): that prosodic modifications make IDS less predictable, and thus more interesting. First, we reproduce the critical finding from Räsänen et al.: that lab-recorded IDS pitch is less predictable compared to adult-directed speech (ADS). Next, we show that this result generalizes to the home language environment, finding that IDS in daylong recordings is also less predictable than ADS but that this pattern is much less robust than for IDS recorded in the lab. These results link experimental work on attention and prosodic modifications of IDS to real-world language-learning environments, highlighting some challenges of scaling up analyses of IDS to larger datasets that better capture children’s actual input.
  • Yu, J., Mailhammer, R., & Cutler, A. (2020). Vocabulary structure affects word recognition: Evidence from German listeners. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 474-478). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-97.

    Abstract

    Lexical stress is realised similarly in English, German, and
    Dutch. On a suprasegmental level, stressed syllables tend to be
    longer and more acoustically salient than unstressed syllables;
    segmentally, vowels in unstressed syllables are often reduced.
    The frequency of unreduced unstressed syllables (where only
    the suprasegmental cues indicate lack of stress) however,
    differs across the languages. The present studies test whether
    listener behaviour is affected by these vocabulary differences,
    by investigating German listeners’ use of suprasegmental cues
    to lexical stress in German and English word recognition. In a
    forced-choice identification task, German listeners correctly
    assigned single-syllable fragments (e.g., Kon-) to one of two
    words differing in stress (KONto, konZEPT). Thus, German
    listeners can exploit suprasegmental information for
    identifying words. German listeners also performed above
    chance in a similar task in English (with, e.g., DIver, diVERT),
    i.e., their sensitivity to these cues also transferred to a nonnative
    language. An English listener group, in contrast, failed
    in the English fragment task. These findings mirror vocabulary
    patterns: German has more words with unreduced unstressed
    syllables than English does.
  • McQueen, J. M., & Cutler, A. (1998). Spotting (different kinds of) words in (different kinds of) context. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2791-2794). Sydney: ICSLP.

    Abstract

    The results of a word-spotting experiment are presented in which Dutch listeners tried to spot different types of bisyllabic Dutch words embedded in different types of nonsense contexts. Embedded verbs were not reliably harder to spot than embedded nouns; this suggests that nouns and verbs are recognised via the same basic processes. Iambic words were no harder to spot than trochaic words, suggesting that trochaic words are not in principle easier to recognise than iambic words. Words were harder to spot in consonantal contexts (i.e., contexts which themselves could not be words) than in longer contexts which contained at least one vowel (i.e., contexts which, though not words, were possible words of Dutch). A control experiment showed that this difference was not due to acoustic differences between the words in each context. The results support the claim that spoken-word recognition is sensitive to the viability of sound sequences as possible words.
  • Mengede, J., Devanna, P., Hörpel, S. G., Firzla, U., & Vernes, S. C. (2020). Studying the genetic bases of vocal learning in bats. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 280-282). Nijmegen: The Evolution of Language Conferences.
  • Merkx, D., Frank, S. L., & Ernestus, M. (2022). Seeing the advantage: Visually grounding word embeddings to better capture human semantic knowledge. In E. Chersoni, N. Hollenstein, C. Jacobs, Y. Oseki, L. Prévot, & E. Santus (Eds.), Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (CMCL 2022) (pp. 1-11). Stroudsburg, PA, USA: Association for Computational Linguistics (ACL).

    Abstract

    Distributional semantic models capture word-level meaning that is useful in many natural language processing tasks and have even been shown to capture cognitive aspects of word meaning. The majority of these models are purely text based, even though the human sensory experience is much richer. In this paper we create visually grounded word embeddings by combining English text and images and compare them to popular text-based methods, to see if visual information allows our model to better capture cognitive aspects of word meaning. Our analysis shows that visually grounded embedding similarities are more predictive of the human reaction times in a large priming experiment than the purely text-based embeddings. The visually grounded embeddings also correlate well with human word similarity ratings.Importantly, in both experiments we show that he grounded embeddings account for a unique portion of explained variance, even when we include text-based embeddings trained on huge corpora. This shows that visual grounding allows our model to capture information that cannot be extracted using text as the only source of information.
  • Mishra, C., & Skantze, G. (2022). Knowing where to look: A planning-based architecture to automate the gaze behavior of social robots. In Proceedings of the 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN) (pp. 1201-1208). doi:10.1109/RO-MAN53752.2022.9900740.

    Abstract

    Gaze cues play an important role in human communication and are used to coordinate turn-taking and joint attention, as well as to regulate intimacy. In order to have fluent conversations with people, social robots need to exhibit humanlike gaze behavior. Previous Gaze Control Systems (GCS) in HRI have automated robot gaze using data-driven or heuristic approaches. However, these systems tend to be mainly reactive in nature. Planning the robot gaze ahead of time could help in achieving more realistic gaze behavior and better eye-head coordination. In this paper, we propose and implement a novel planning-based GCS. We evaluate our system in a comparative within-subjects user study (N=26) between a reactive system and our proposed system. The results show that the users preferred the proposed system and that it was significantly more interpretable and better at regulating intimacy.
  • Mudd, K., Lutzenberger, H., De Vos, C., Fikkert, P., Crasborn, O., & De Boer, B. (2020). How does social structure shape language variation? A case study of the Kata Kolok lexicon. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 302-304). Nijmegen: The Evolution of Language Conferences.
  • Otake, T., Davis, S. M., & Cutler, A. (1995). Listeners’ representations of within-word structure: A cross-linguistic and cross-dialectal investigation. In J. Pardo (Ed.), Proceedings of EUROSPEECH 95: Vol. 3 (pp. 1703-1706). Madrid: European Speech Communication Association.

    Abstract

    Japanese, British English and American English listeners were presented with spoken words in their native language, and asked to mark on a written transcript of each word the first natural division point in the word. The results showed clear and strong patterns of consensus, indicating that listeners have available to them conscious representations of within-word structure. Orthography did not play a strongly deciding role in the results. The patterns of response were at variance with results from on-line studies of speech segmentation, suggesting that the present task taps not those representations used in on-line listening, but levels of representation which may involve much richer knowledge of word-internal structure.
  • Ozyurek, A. (1998). An analysis of the basic meaning of Turkish demonstratives in face-to-face conversational interaction. In S. Santi, I. Guaitella, C. Cave, & G. Konopczynski (Eds.), Oralite et gestualite: Communication multimodale, interaction: actes du colloque ORAGE 98 (pp. 609-614). Paris: L'Harmattan.
  • Ozyurek, A. (2020). From hands to brains: How does human body talk, think and interact in face-to-face language use? In K. Truong, D. Heylen, & M. Czerwinski (Eds.), ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction (pp. 1-2). New York, NY, USA: Association for Computing Machinery. doi:10.1145/3382507.3419442.
  • Ozyurek, A., & Kita, S. (1999). Expressing manner and path in English and Turkish: Differences in speech, gesture, and conceptualization. In M. Hahn, & S. C. Stoness (Eds.), Proceedings of the Twenty-first Annual Conference of the Cognitive Science Society (pp. 507-512). London: Erlbaum.
  • Paplu, S. H., Mishra, C., & Berns, K. (2020). Pseudo-randomization in automating robot behaviour during human-robot interaction. In 2020 Joint IEEE 10th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob) (pp. 1-6). Institute of Electrical and Electronics Engineers. doi:10.1109/ICDL-EpiRob48136.2020.9278115.

    Abstract

    Automating robot behavior in a specific situation is an active area of research. There are several approaches available in the literature of robotics to cater for the automatic behavior of a robot. However, when it comes to humanoids or human-robot interaction in general, the area has been less explored. In this paper, a pseudo-randomization approach has been introduced to automatize the gestures and facial expressions of an interactive humanoid robot called ROBIN based on its mental state. A significant number of gestures and facial expressions have been implemented to allow the robot more options to perform a relevant action or reaction based on visual stimuli. There is a display of noticeable differences in the behaviour of the robot for the same stimuli perceived from an interaction partner. This slight autonomous behavioural change in the robot clearly shows a notion of automation in behaviour. The results from experimental scenarios and human-centered evaluation of the system help validate the approach.

    Files private

    Request files
  • Rasenberg, M., Dingemanse, M., & Ozyurek, A. (2020). Lexical and gestural alignment in interaction and the emergence of novel shared symbols. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 356-358). Nijmegen: The Evolution of Language Conferences.
  • Raviv, L., Meyer, A. S., & Lev-Ari, S. (2020). Network structure and the cultural evolution of linguistic structure: A group communication experiment. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 359-361). Nijmegen: The Evolution of Language Conferences.
  • Raviv, L., Jacobson, S. L., Plotnik, J. M., Bowman, J., Lynch, V., & Benítez-Burraco, A. (2022). Elephants as a new animal model for studying the evolution of language as a result of self-domestication. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 606-608). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • de Reus, K., Carlson, D., Jadoul, Y., Lowry, A., Gross, S., Garcia, M., Salazar-Casals, A., Rubio-García, A., Haas, C. E., De Boer, B., & Ravignani, A. (2020). Relationships between vocal ontogeny and vocal tract anatomy in harbour seals (Phoca vitulina). In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 63-66). Nijmegen: The Evolution of Language Conferences.
  • de Reus, K., Carlson, D., Lowry, A., Gross, S., Garcia, M., Rubio-García, A., Salazar-Casals, A., & Ravignani, A. (2022). Body size predicts vocal tract size in a mammalian vocal learner. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 154-156). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Scholman, M., Tianai, D., Yung, F., & Demberg, V. (2022). DiscoGeM: A crowdsourced corpus of genre-mixed implicit discourse relations. In N. Calzolari, F. Béchet, P. Blache, K. Choukri, C. Cieri, T. DeClerck, S. Goggi, H. Isahara, B. Maegaard, J. Mariani, H. Mazo, J. Odijk, & S. Piperidis (Eds.), Proceedings of the 13th Language Resources and Evaluation Conference (LREC 2022) (pp. 3281-3290). Marseille, France: European Language Resources Association.

    Abstract

    We present DiscoGeM, a crowdsourced corpus of 6,505 implicit discourse relations from three genres: political speech,
    literature, and encyclopedic texts. Each instance was annotated by 10 crowd workers. Various label aggregation methods
    were explored to evaluate how to obtain a label that best captures the meaning inferred by the crowd annotators. The results
    show that a significant proportion of discourse relations in DiscoGeM are ambiguous and can express multiple relation senses.
    Probability distribution labels better capture these interpretations than single labels. Further, the results emphasize that text
    genre crucially affects the distribution of discourse relations, suggesting that genre should be included as a factor in automatic
    relation classification. We make available the newly created DiscoGeM corpus, as well as the dataset with all annotator-level
    labels. Both the corpus and the dataset can facilitate a multitude of applications and research purposes, for example to
    function as training data to improve the performance of automatic discourse relation parsers, as well as facilitate research into
    non-connective signals of discourse relations.
  • Scott, D. R., & Cutler, A. (1982). Segmental cues to syntactic structure. In Proceedings of the Institute of Acoustics 'Spectral Analysis and its Use in Underwater Acoustics' (pp. E3.1-E3.4). London: Institute of Acoustics.
  • Seidlmayer, E., Voß, J., Melnychuk, T., Galke, L., Tochtermann, K., Schultz, C., & Förstner, K. U. (2020). ORCID for Wikidata. Data enrichment for scientometric applications. In L.-A. Kaffee, O. Tifrea-Marciuska, E. Simperl, & D. Vrandečić (Eds.), Proceedings of the 1st Wikidata Workshop (Wikidata 2020). Aachen, Germany: CEUR Workshop Proceedings.

    Abstract

    Due to its numerous bibliometric entries of scholarly articles and connected information Wikidata can serve as an open and rich
    source for deep scientometrical analyses. However, there are currently certain limitations: While 31.5% of all Wikidata entries represent scientific articles, only 8.9% are entries describing a person and the number
    of entries researcher is accordingly even lower. Another issue is the frequent absence of established relations between the scholarly article item and the author item although the author is already listed in Wikidata.
    To fill this gap and to improve the content of Wikidata in general, we established a workflow for matching authors and scholarly publications by integrating data from the ORCID (Open Researcher and Contributor ID) database. By this approach we were able to extend Wikidata by more than 12k author-publication relations and the method can be
    transferred to other enrichments based on ORCID data. This is extension is beneficial for Wikidata users performing bibliometrical analyses or using such metadata for other purposes.
  • Seuren, P. A. M. (1984). Logic and truth-values in language. In F. Landman, & F. Veltman (Eds.), Varieties of formal semantics: Proceedings of the fourth Amsterdam colloquium (pp. 343-364). Dordrecht: Foris.
  • Seuren, P. A. M. (1982). Riorientamenti metodologici nello studio della variabilità linguistica. In D. Gambarara, & A. D'Atri (Eds.), Ideologia, filosofia e linguistica: Atti del Convegno Internazionale di Studi, Rende (CS) 15-17 Settembre 1978 ( (pp. 499-515). Roma: Bulzoni.
  • Seuren, P. A. M. (1985). Predicate raising and semantic transparency in Mauritian Creole. In N. Boretzky, W. Enninger, & T. Stolz (Eds.), Akten des 2. Essener Kolloquiums über "Kreolsprachen und Sprachkontakte", 29-30 Nov. 1985 (pp. 203-229). Bochum: Brockmeyer.
  • Seuren, P. A. M. (1996). What a universal semantic interlingua can do. In A. Zamulin (Ed.), Perspectives of System Informatics. Proceedings of the Andrei Ershov Second International Memorial Conference, Novosibirsk, Akademgorodok, June 25-28,1996 (pp. 41-42). Novosibirsk: A.P. Ershov Institute of Informatics Systems.
  • Seuren, P. A. M. (1980). Variabele competentie: Linguïstiek en sociolinguïstiek anno 1980. In Handelingen van het 36e Nederlands Filologencongres: Gehouden te Groningen op woensdag 9, donderdag 10 en vrijdag 11 April 1980 (pp. 41-56). Amsterdam: Holland University Press.
  • Severijnen, G. G., Bosker, H. R., & McQueen, J. M. (2022). Acoustic correlates of Dutch lexical stress re-examined: Spectral tilt is not always more reliable than intensity. In S. Frota, M. Cruz, & M. Vigário (Eds.), Proceedings of Speech Prosody 2022 (pp. 278-282). doi:10.21437/SpeechProsody.2022-57.

    Abstract

    The present study examined two acoustic cues in the production
    of lexical stress in Dutch: spectral tilt and overall intensity.
    Sluijter and Van Heuven (1996) reported that spectral tilt is a
    more reliable cue to stress than intensity. However, that study
    included only a small number of talkers (10) and only syllables
    with the vowels /aː/ and /ɔ/.
    The present study re-examined this issue in a larger and
    more variable dataset. We recorded 38 native speakers of Dutch
    (20 females) producing 744 tokens of Dutch segmentally
    overlapping words (e.g., VOORnaam vs. voorNAAM, “first
    name” vs. “respectable”), targeting 10 different vowels, in
    variable sentence contexts. For each syllable, we measured
    overall intensity and spectral tilt following Sluijter and Van
    Heuven (1996).
    Results from Linear Discriminant Analyses showed that,
    for the vowel /aː/ alone, spectral tilt showed an advantage over
    intensity, as evidenced by higher stressed/unstressed syllable
    classification accuracy scores for spectral tilt. However, when
    all vowels were included in the analysis, the advantage
    disappeared.
    These findings confirm that spectral tilt plays a larger role
    in signaling stress in Dutch /aː/ but show that, for a larger
    sample of Dutch vowels, overall intensity and spectral tilt are
    equally important.
  • Shattuck-Hufnagel, S., & Cutler, A. (1999). The prosody of speech error corrections revisited. In J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, & A. Bailey (Eds.), Proceedings of the Fourteenth International Congress of Phonetic Sciences: Vol. 2 (pp. 1483-1486). Berkely: University of California.

    Abstract

    A corpus of digitized speech errors is used to compare the prosody of correction patterns for word-level vs. sound-level errors. Results for both peak F0 and perceived prosodic markedness confirm that speakers are more likely to mark corrections of word-level errors than corrections of sound-level errors, and that errors ambiguous between word-level and soundlevel (such as boat for moat) show correction patterns like those for sound level errors. This finding increases the plausibility of the claim that word-sound-ambiguous errors arise at the same level of processing as sound errors that do not form words.
  • Slonimska, A., Özyürek, A., & Capirci, O. (2022). Simultaneity as an emergent property of sign languages. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 678-680). Nijmegen: Joint Conference on Language Evolution (JCoLE).

Share this page