Publications

Displaying 1 - 100 of 254
  • Akamine, S., Kohatsu, T., Niikuni, K., Schafer, A. J., & Sato, M. (2022). Emotions in language processing: Affective priming in embodied cognition. In Proceedings of the 39th Annual Meeting of Japanese Cognitive Science Society (pp. 326-332). Tokyo: Japanese Cognitive Science Society.
  • Alhama, R. G., Siegelman, N., Frost, R., & Armstrong, B. C. (2019). The role of information in visual word recognition: A perceptually-constrained connectionist account. In A. Goel, C. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 83-89). Austin, TX: Cognitive Science Society.

    Abstract

    Proficient readers typically fixate near the center of a word, with a slight bias towards word onset. We explore a novel account of this phenomenon based on combining information-theory with visual perceptual constraints in a connectionist model of visual word recognition. This account posits that the amount of information-content available for word identification varies across fixation locations and across languages, thereby explaining the overall fixation location bias in different languages, making the novel prediction that certain words are more readily identified when fixating at an atypical fixation location, and predicting specific cross-linguistic differences. We tested these predictions across several simulations in English and Hebrew, and in a pilot behavioral experiment. Results confirmed that the bias to fixate closer to word onset aligns with maximizing information in the visual signal, that some words are more readily identified at atypical fixation locations, and that these effects vary to some degree across languages.
  • Amatuni, A., Schroer, S. E., Zhang, Y., Peters, R. E., Reza, M. A., Crandall, D., & Yu, C. (2021). In-the-moment visual information from the infant's egocentric view determines the success of infant word learning: A computational study. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 265-271). Vienna: Cognitive Science Society.

    Abstract

    Infants learn the meaning of words from accumulated experiences of real-time interactions with their caregivers. To study the effects of visual sensory input on word learning, we recorded infant's view of the world using head-mounted eye trackers during free-flowing play with a caregiver. While playing, infants were exposed to novel label-object mappings and later learning outcomes for these items were tested after the play session. In this study we use a classification based approach to link properties of infants' visual scenes during naturalistic labeling moments to their word learning outcomes. We find that a model which integrates both highly informative and ambiguous sensory evidence is a better fit to infants' individual learning outcomes than models where either type of evidence is taken alone, and that raw labeling frequency is unable to account for the word learning differences we observe. Here we demonstrate how a computational model, using only raw pixels taken from the egocentric scene image, can derive insights on human language learning.
  • Anastasopoulos, A., Lekakou, M., Quer, J., Zimianiti, E., DeBenedetto, J., & Chiang, D. (2018). Part-of-speech tagging on an endangered language: a parallel Griko-Italian Resource. In Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018) (pp. 2529-2539).

    Abstract

    Most work on part-of-speech (POS) tagging is focused on high resource languages, or examines low-resource and active learning settings through simulated studies. We evaluate POS tagging techniques on an actual endangered language, Griko. We present a resource that contains 114 narratives in Griko, along with sentence-level translations in Italian, and provides gold annotations for the test set. Based on a previously collected small corpus, we investigate several traditional methods, as well as methods that take advantage of monolingual data or project cross-lingual POS tags. We show that the combination of a semi-supervised method with cross-lingual transfer is more appropriate for this extremely challenging setting, with the best tagger achieving an accuracy of 72.9%. With an applied active learning scheme, which we use to collect sentence-level annotations over the test set, we achieve improvements of more than 21 percentage points
  • Badimala, P., Mishra, C., Venkataramana, R. K. M., Bukhari, S. S., & Dengel, A. (2019). A Study of Various Text Augmentation Techniques for Relation Classification in Free Text. In Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods (pp. 360-367). Setúbal, Portugal: SciTePress Digital Library. doi:10.5220/0007311003600367.

    Abstract

    Data augmentation techniques have been widely used in visual recognition tasks as it is easy to generate new
    data by simple and straight forward image transformations. However, when it comes to text data augmen-
    tations, it is difficult to find appropriate transformation techniques which also preserve the contextual and
    grammatical structure of language texts. In this paper, we explore various text data augmentation techniques
    in text space and word embedding space. We study the effect of various augmented datasets on the efficiency
    of different deep learning models for relation classification in text.
  • Bauer, B. L. M. (2022). Finite verb + infinite + object in later Latin: Early brace constructions? In G. V. M. Haverling (Ed.), Studies on Late and Vulgar Latin in the Early 21st Century: Acts of the 12th International Colloquium "Latin vulgaire – Latin tardif (pp. 166-181). Uppsala: Acta Universitatis Upsaliensis.
  • Bentum, M., Ten Bosch, L., Van den Bosch, A., & Ernestus, M. (2019). Listening with great expectations: An investigation of word form anticipations in naturalistic speech. In Proceedings of Interspeech 2019 (pp. 2265-2269). doi:10.21437/Interspeech.2019-2741.

    Abstract

    The event-related potential (ERP) component named phonological mismatch negativity (PMN) arises when listeners hear an unexpected word form in a spoken sentence [1]. The PMN is thought to reflect the mismatch between expected and perceived auditory speech input. In this paper, we use the PMN to test a central premise in the predictive coding framework [2], namely that the mismatch between prior expectations and sensory input is an important mechanism of perception. We test this with natural speech materials containing approximately 50,000 word tokens. The corresponding EEG-signal was recorded while participants (n = 48) listened to these materials. Following [3], we quantify the mismatch with two word probability distributions (WPD): a WPD based on preceding context, and a WPD that is additionally updated based on the incoming audio of the current word. We use the between-WPD cross entropy for each word in the utterances and show that a higher cross entropy correlates with a more negative PMN. Our results show that listeners anticipate auditory input while processing each word in naturalistic speech. Moreover, complementing previous research, we show that predictive language processing occurs across the whole probability spectrum.
  • Bentum, M., Ten Bosch, L., Van den Bosch, A., & Ernestus, M. (2019). Quantifying expectation modulation in human speech processing. In Proceedings of Interspeech 2019 (pp. 2270-2274). doi:10.21437/Interspeech.2019-2685.

    Abstract

    The mismatch between top-down predicted and bottom-up perceptual input is an important mechanism of perception according to the predictive coding framework (Friston, [1]). In this paper we develop and validate a new information-theoretic measure that quantifies the mismatch between expected and observed auditory input during speech processing. We argue that such a mismatch measure is useful for the study of speech processing. To compute the mismatch measure, we use naturalistic speech materials containing approximately 50,000 word tokens. For each word token we first estimate the prior word probability distribution with the aid of statistical language modelling, and next use automatic speech recognition to update this word probability distribution based on the unfolding speech signal. We validate the mismatch measure with multiple analyses, and show that the auditory-based update improves the probability of the correct word and lowers the uncertainty of the word probability distribution. Based on these results, we argue that it is possible to explicitly estimate the mismatch between predicted and perceived speech input with the cross entropy between word expectations computed before and after an auditory update.
  • Bentz, C., Dediu, D., Verkerk, A., & Jäger, G. (2018). Language family trees reflect geography and demography beyond neutral drift. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 38-40). Toruń, Poland: NCU Press. doi:10.12775/3991-1.006.
  • Berck, P., Bibiko, H.-J., Kemps-Snijders, M., Russel, A., & Wittenburg, P. (2006). Ontology-based language archive utilization. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 2295-2298).
  • Bodur, K., Branje, S., Peirolo, M., Tiscareno, I., & German, J. S. (2021). Domain-initial strengthening in Turkish: Acoustic cues to prosodic hierarchy in stop consonants. In Proceedings of Interspeech 2021 (pp. 1459-1463). doi:10.21437/Interspeech.2021-2230.

    Abstract

    Studies have shown that cross-linguistically, consonants at the left edge of higher-level prosodic boundaries tend to be more forcefully articulated than those at lower-level boundaries, a phenomenon known as domain-initial strengthening. This study tests whether similar effects occur in Turkish, using the Autosegmental-Metrical model proposed by Ipek & Jun [1, 2] as the basis for assessing boundary strength. Productions of /t/ and /d/ were elicited in four domain-initial prosodic positions corresponding to progressively higher-level boundaries: syllable, word, intermediate phrase, and Intonational Phrase. A fifth position, nuclear word, was included in order to better situate it within the prosodic hierarchy. Acoustic correlates of articulatory strength were measured, including closure duration for /d/ and /t/, as well as voice onset time and burst energy for /t/. Our results show that closure duration increases cumulatively from syllable to intermediate phrase, while voice onset time and burst energy are not influenced by boundary strength. These findings provide corroborating evidence for Ipek & Jun’s model, particularly for the distinction between word and intermediate phrase boundaries. Additionally, articulatory strength at the left edge of the nuclear word patterned closely with word-initial position, supporting the view that the nuclear word is not associated with a distinct phrasing domain
  • Bögels, S., Barr, D., Garrod, S., & Kessler, K. (2013). "Are we still talking about the same thing?" MEG reveals perspective-taking in response to pragmatic violations, but not in anticipation. In M. Knauff, N. Pauen, I. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 215-220). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0066/index.html.

    Abstract

    The current study investigates whether mentalizing, or taking the perspective of your interlocutor, plays an essential role throughout a conversation or whether it is mostly used in reaction to misunderstandings. This study is the first to use a brain-imaging method, MEG, to answer this question. In a first phase of the experiment, MEG participants interacted "live" with a confederate who set naming precedents for certain pictures. In a later phase, these precedents were sometimes broken by a speaker who named the same picture in a different way. This could be done by the same speaker, who set the precedent, or by a different speaker. Source analysis of MEG data showed that in the 800 ms before the naming, when the picture was already on the screen, episodic memory and language areas were activated, but no mentalizing areas, suggesting that the speaker's naming intentions were not anticipated by the listener on the basis of shared experiences. Mentalizing areas only became activated after the same speaker had broken a precedent, which we interpret as a reaction to the violation of conversational pragmatics.
  • Bone, D., Ramanarayanan, V., Narayanan, S., Hoedemaker, R. S., & Gordon, P. C. (2013). Analyzing eye-voice coordination in rapid automatized naming. In F. Bimbot, C. Cerisara, G. Fougeron, L. Gravier, L. Lamel, F. Pelligrino, & P. Perrier (Eds.), INTERSPEECH-2013: 14thAnnual Conference of the International Speech Communication Association (pp. 2425-2429). ISCA Archive. Retrieved from http://www.isca-speech.org/archive/interspeech_2013/i13_2425.html.

    Abstract

    Rapid Automatized Naming (RAN) is a powerful tool for pre- dicting future reading skill. A person’s ability to quickly name symbols as they scan a table is related to higher-level reading proficiency in adults and is predictive of future literacy gains in children. However, noticeable differences are present in the strategies or patterns within groups having similar task comple- tion times. Thus, a further stratification of RAN dynamics may lead to better characterization and later intervention to support reading skill acquisition. In this work, we analyze the dynamics of the eyes, voice, and the coordination between the two during performance. It is shown that fast performers are more similar to each other than to slow performers in their patterns, but not vice versa. Further insights are provided about the patterns of more proficient subjects. For instance, fast performers tended to exhibit smoother behavior contours, suggesting a more sta- ble perception-production process.
  • Brand, J., Monaghan, P., & Walker, P. (2018). Changing Signs: Testing How Sound-Symbolism Supports Early Word Learning. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1398-1403). Austin, TX: Cognitive Science Society.

    Abstract

    Learning a language involves learning how to map specific forms onto their associated meanings. Such mappings can utilise arbitrariness and non-arbitrariness, yet, our understanding of how these two systems operate at different stages of vocabulary development is still not fully understood. The Sound-Symbolism Bootstrapping Hypothesis (SSBH) proposes that sound-symbolism is essential for word learning to commence, but empirical evidence of exactly how sound-symbolism influences language learning is still sparse. It may be the case that sound-symbolism supports acquisition of categories of meaning, or that it enables acquisition of individualized word meanings. In two Experiments where participants learned form-meaning mappings from either sound-symbolic or arbitrary languages, we demonstrate the changing roles of sound-symbolism and arbitrariness for different vocabulary sizes, showing that sound-symbolism provides an advantage for learning of broad categories, which may then transfer to support learning individual words, whereas an arbitrary language impedes acquisition of categories of sound to meaning.
  • Brehm, L., Jackson, C. N., & Miller, K. L. (2019). Incremental interpretation in the first and second language. In M. Brown, & B. Dailey (Eds.), BUCLD 43: Proceedings of the 43rd annual Boston University Conference on Language Development (pp. 109-122). Sommerville, MA: Cascadilla Press.
  • Broeder, D., Offenga, F., Wittenburg, P., Van de Kamp, P., Nathan, D., & Strömqvist, S. (2006). Technologies for a federation of language resource archive. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 2291-2294).
  • Broeder, D., Van Veenendaal, R., Nathan, D., & Strömqvist, S. (2006). A grid of language resource repositories. In Proceedings of the 2nd IEEE International Conference on e-Science and Grid Computing.
  • Broeder, D., Claus, A., Offenga, F., Skiba, R., Trilsbeek, P., & Wittenburg, P. (2006). LAMUS: The Language Archive Management and Upload System. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 2291-2294).
  • Broersma, M. (2006). Nonnative listeners rely less on phonetic information for phonetic categorization than native listeners. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 109-110).
  • Broersma, M. (2006). Accident - execute: Increased activation in nonnative listening. In Proceedings of Interspeech 2006 (pp. 1519-1522).

    Abstract

    Dutch and English listeners’ perception of English words with partially overlapping onsets (e.g., accident- execute) was investigated. Partially overlapping words remained active longer for nonnative listeners, causing an increase of lexical competition in nonnative compared with native listening.
  • Bruggeman, L., & Cutler, A. (2019). The dynamics of lexical activation and competition in bilinguals’ first versus second language. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1342-1346). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Speech input causes listeners to activate multiple
    candidate words which then compete with one
    another. These include onset competitors, that share a
    beginning (bumper, butter), but also, counterintuitively,
    rhyme competitors, sharing an ending
    (bumper, jumper). In L1, competition is typically
    stronger for onset than for rhyme. In L2, onset
    competition has been attested but rhyme competition
    has heretofore remained largely unexamined. We
    assessed L1 (Dutch) and L2 (English) word
    recognition by the same late-bilingual individuals. In
    each language, eye gaze was recorded as listeners
    heard sentences and viewed sets of drawings: three
    unrelated, one depicting an onset or rhyme competitor
    of a word in the input. Activation patterns revealed
    substantial onset competition but no significant
    rhyme competition in either L1 or L2. Rhyme
    competition may thus be a “luxury” feature of
    maximally efficient listening, to be abandoned when
    resources are scarcer, as in listening by late
    bilinguals, in either language.
  • Bruggeman, L., Yu, J., & Cutler, A. (2022). Listener adjustment of stress cue use to fit language vocabulary structure. In S. Frota, M. Cruz, & M. Vigário (Eds.), Proceedings of Speech Prosody 2022 (pp. 264-267). doi:10.21437/SpeechProsody.2022-54.

    Abstract

    In lexical stress languages, phonemically identical syllables can differ suprasegmentally (in duration, amplitude, F0). Such stress
    cues allow listeners to speed spoken-word recognition by rejecting mismatching competitors (e.g., unstressed set- in settee
    rules out stressed set- in setting, setter, settle). Such processing effects have indeed been observed in Spanish, Dutch and German, but English listeners are known to largely ignore stress cues. Dutch and German listeners even outdo English listeners in distinguishing stressed versus unstressed English syllables. This has been attributed to the relative frequency across the stress languages of unstressed syllables with full vowels; in English most unstressed syllables contain schwa, instead, and stress cues on full vowels are thus least often informative in this language. If only informativeness matters, would English listeners who encounter situations where such cues would pay off for them (e.g., learning one of those other stress languages) then shift to using stress cues? Likewise, would stress cue users with English as L2, if mainly using English, shift away from
    using the cues in English? Here we report tests of these two questions, with each receiving a yes answer. We propose that
    English listeners’ disregard of stress cues is purely pragmatic.
  • Brugman, H., Malaisé, V., & Gazendam, L. (2006). A web based general thesaurus browser to support indexing of television and radio programs. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1488-1491).
  • Bujok, R., Meyer, A. S., & Bosker, H. R. (2022). Visible lexical stress cues on the face do not influence audiovisual speech perception. In S. Frota, M. Cruz, & M. Vigário (Eds.), Proceedings of Speech Prosody 2022 (pp. 259-263). doi:10.21437/SpeechProsody.2022-53.

    Abstract

    Producing lexical stress leads to visible changes on the face, such as longer duration and greater size of the opening of the mouth. Research suggests that these visual cues alone can inform participants about which syllable carries stress (i.e., lip-reading silent videos). This study aims to determine the influence of visual articulatory cues on lexical stress perception in more naturalistic audiovisual settings. Participants were presented with seven disyllabic, Dutch minimal stress pairs (e.g., VOORnaam [first name] & voorNAAM [respectable]) in audio-only (phonetic lexical stress continua without video), video-only (lip-reading silent videos), and audiovisual trials (e.g., phonetic lexical stress continua with video of talker saying VOORnaam or voorNAAM). Categorization data from video-only trials revealed that participants could distinguish the minimal pairs above chance from seeing the silent videos alone. However, responses in the audiovisual condition did not differ from the audio-only condition. We thus conclude that visual lexical stress information on the face, while clearly perceivable, does not play a major role in audiovisual speech perception. This study demonstrates that clear unimodal effects do not always generalize to more naturalistic multimodal communication, advocating that speech prosody is best considered in multimodal settings.
  • Byun, K.-S., De Vos, C., Roberts, S. G., & Levinson, S. C. (2018). Interactive sequences modulate the selection of expressive forms in cross-signing. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 67-69). Toruń, Poland: NCU Press. doi:10.12775/3991-1.012.
  • Cambier, N., Miletitch, R., Burraco, A. B., & Raviv, L. (2022). Prosociality in swarm robotics: A model to study self-domestication and language evolution. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 98-100). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Casillas, M., & Frank, M. C. (2013). The development of predictive processes in children’s discourse understanding. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society. (pp. 299-304). Austin,TX: Cognitive Society.

    Abstract

    We investigate children’s online predictive processing as it occurs naturally, in conversation. We showed 1–7 year-olds short videos of improvised conversation between puppets, controlling for available linguistic information through phonetic manipulation. Even one- and two-year-old children made accurate and spontaneous predictions about when a turn-switch would occur: they gazed at the upcoming speaker before they heard a response begin. This predictive skill relies on both lexical and prosodic information together, and is not tied to either type of information alone. We suggest that children integrate prosodic, lexical, and visual information to effectively predict upcoming linguistic material in conversation.
  • Chen, Y., & Braun, B. (2006). Prosodic realization in information structure categories in standard Chinese. In R. Hoffmann, & H. Mixdorff (Eds.), Speech Prosody 2006. Dresden: TUD Press.

    Abstract

    This paper investigates the prosodic realization of information
    structure categories in Standard Chinese. A number of proper
    names with different tonal combinations were elicited as a
    grammatical subject in five pragmatic contexts. Results show
    that both duration and F0 range of the tonal realizations were
    adjusted to signal the information structure categories (i.e.
    theme vs. rheme and background vs. focus). Rhemes
    consistently induced a longer duration and a more expanded F0
    range than themes. Focus, compared to background, generally
    induced lengthening and F0 range expansion (the presence and
    magnitude of which, however, are dependent on the tonal
    structure of the proper names). Within the rheme focus
    condition, corrective rheme focus induced more expanded F0
    range than normal rheme focus.
  • Chen, A. (2006). Variations in the marking of focus in child language. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 113-114).
  • Chen, A. (2006). Interface between information structure and intonation in Dutch wh-questions. In R. Hoffmann, & H. Mixdorff (Eds.), Speech Prosody 2006. Dresden: TUD Press.

    Abstract

    This study set out to investigate how accent placement is pragmatically governed in WH-questions. Central to this issue are questions such as whether the intonation of the WH-word depends on the information structure of the non-WH word part, whether topical constituents can be accented, and whether constituents in the non-WH word part can be non-topical and accented. Previous approaches, based either on carefully composed examples or on read speech, differ in their treatments of these questions and consequently make opposing claims on the intonation of WH-questions. We addressed these questions by examining a corpus of 90 naturally occurring WH-questions, selected from the Spoken Dutch Corpus. Results show that the intonation of the WH-word is related to the information structure of the non-WH word part. Further, topical constituents can get accented and the accents are not necessarily phonetically reduced. Additionally, certain adverbs, which have no topical relation to the presupposition of the WH-questions, also get accented. They appear to function as a device for enhancing speaker engagement.
  • Cheung, C.-Y., Yakpo, K., & Coupé, C. (2022). A computational simulation of the genesis and spread of lexical items in situations of abrupt language contact. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 115-122). Nijmegen: Joint Conference on Language Evolution (JCoLE).

    Abstract

    The current study presents an agent-based model which simulates the innovation and
    competition among lexical items in cases of language contact. It is inspired by relatively
    recent historical cases in which the linguistic ecology and sociohistorical context are highly complex. Pidgin and creole genesis offers an opportunity to obtain linguistic facts, social dynamics, and historical demography in a highly segregated society. This provides a solid ground for researching the interaction of populations with different pre-existing language systems, and how different factors contribute to the genesis of the lexicon of a newly generated mixed language. We take into consideration the population dynamics and structures, as well as a distribution of word frequencies related to language use, in order to study how social factors may affect the developmental trajectory of languages. Focusing on the case of Sranan in Suriname, our study shows that it is possible to account for the
    composition of its core lexicon in relation to different social groups, contact patterns, and
    large population movements.
  • Coopmans, C. W., De Hoop, H., Kaushik, K., Hagoort, P., & Martin, A. E. (2021). Structure-(in)dependent interpretation of phrases in humans and LSTMs. In Proceedings of the Society for Computation in Linguistics (SCiL 2021) (pp. 459-463).

    Abstract

    In this study, we compared the performance of a long short-term memory (LSTM) neural network to the behavior of human participants on a language task that requires hierarchically structured knowledge. We show that humans interpret ambiguous noun phrases, such as second blue ball, in line with their hierarchical constituent structure. LSTMs, instead, only do
    so after unambiguous training, and they do not systematically generalize to novel items. Overall, the results of our simulations indicate that a model can behave hierarchically without relying on hierarchical constituent structure.
  • Crasborn, O., Sloetjes, H., Auer, E., & Wittenburg, P. (2006). Combining video and numeric data in the analysis of sign languages with the ELAN annotation software. In C. Vetoori (Ed.), Proceedings of the 2nd Workshop on the Representation and Processing of Sign languages: Lexicographic matters and didactic scenarios (pp. 82-87). Paris: ELRA.

    Abstract

    This paper describes hardware and software that can be used for the phonetic study of sign languages. The field of sign language phonetics is characterised, and the hardware that is currently in use is described. The paper focuses on the software that was developed to enable the recording of finger and hand movement data, and the additions to the ELAN annotation software that facilitate the further visualisation and analysis of the data.
  • Cristia, A., Ganesh, S., Casillas, M., & Ganapathy, S. (2018). Talker diarization in the wild: The case of child-centered daylong audio-recordings. In Proceedings of Interspeech 2018 (pp. 2583-2587). doi:10.21437/Interspeech.2018-2078.

    Abstract

    Speaker diarization (answering 'who spoke when') is a widely researched subject within speech technology. Numerous experiments have been run on datasets built from broadcast news, meeting data, and call centers—the task sometimes appears close to being solved. Much less work has begun to tackle the hardest diarization task of all: spontaneous conversations in real-world settings. Such diarization would be particularly useful for studies of language acquisition, where researchers investigate the speech children produce and hear in their daily lives. In this paper, we study audio gathered with a recorder worn by small children as they went about their normal days. As a result, each child was exposed to different acoustic environments with a multitude of background noises and a varying number of adults and peers. The inconsistency of speech and noise within and across samples poses a challenging task for speaker diarization systems, which we tackled via retraining and data augmentation techniques. We further studied sources of structured variation across raw audio files, including the impact of speaker type distribution, proportion of speech from children, and child age on diarization performance. We discuss the extent to which these findings might generalize to other samples of speech in the wild.
  • Cutler, A., Kim, J., & Otake, T. (2006). On the limits of L1 influence on non-L1 listening: Evidence from Japanese perception of Korean. In P. Warren, & C. I. Watson (Eds.), Proceedings of the 11th Australian International Conference on Speech Science & Technology (pp. 106-111).

    Abstract

    Language-specific procedures which are efficient for listening to the L1 may be applied to non-native spoken input, often to the detriment of successful listening. However, such misapplications of L1-based listening do not always happen. We propose, based on the results from two experiments in which Japanese listeners detected target sequences in spoken Korean, that an L1 procedure is only triggered if requisite L1 features are present in the input.
  • Cutler, A., Burchfield, A., & Antoniou, M. (2019). A criterial interlocutor tally for successful talker adaptation? In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1485-1489). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Part of the remarkable efficiency of listening is
    accommodation to unfamiliar talkers’ specific
    pronunciations by retuning of phonemic intercategory
    boundaries. Such retuning occurs in second
    (L2) as well as first language (L1); however, recent
    research with emigrés revealed successful adaptation
    in the environmental L2 but, unprecedentedly, not in
    L1 despite continuing L1 use. A possible explanation
    involving relative exposure to novel talkers is here
    tested in heritage language users with Mandarin as
    family L1 and English as environmental language. In
    English, exposure to an ambiguous sound in
    disambiguating word contexts prompted the expected
    adjustment of phonemic boundaries in subsequent
    categorisation. However, no adjustment occurred in
    Mandarin, again despite regular use. Participants
    reported highly asymmetric interlocutor counts in the
    two languages. We conclude that successful retuning
    ability requires regular exposure to novel talkers in
    the language in question, a criterion not met for the
    emigrés’ or for these heritage users’ L1.
  • Cutler, A., Aslin, R. N., Gervain, J., & Nespor, M. (Eds.). (2021). Special issue in honor of Jacques Mehler, Cognition's founding editor [Special Issue]. Cognition, 213.
  • Ip, M. H. K., & Cutler, A. (2018). Asymmetric efficiency of juncture perception in L1 and L2. In K. Klessa, J. Bachan, A. Wagner, M. Karpiński, & D. Śledziński (Eds.), Proceedings of Speech Prosody 2018 (pp. 289-296). Baixas, France: ISCA. doi:10.21437/SpeechProsody.2018-59.

    Abstract

    In two experiments, Mandarin listeners resolved potential syntactic ambiguities in spoken utterances in (a) their native language (L1) and (b) English which they had learned as a second language (L2). A new disambiguation task was used, requiring speeded responses to select the correct meaning for structurally ambiguous sentences. Importantly, the ambiguities used in the study are identical in Mandarin and in English, and production data show that prosodic disambiguation of this type of ambiguity is also realised very similarly in the two languages. The perceptual results here showed however that listeners’ response patterns differed for L1 and L2, although there was a significant increase in similarity between the two response patterns with increasing exposure to the L2. Thus identical ambiguity and comparable disambiguation patterns in L1 and L2 do not lead to immediate application of the appropriate L1 listening strategy to L2; instead, it appears that such a strategy may have to be learned anew for the L2.
  • Ip, M. H. K., & Cutler, A. (2018). Cue equivalence in prosodic entrainment for focus detection. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 153-156).

    Abstract

    Using a phoneme detection task, the present series of
    experiments examines whether listeners can entrain to
    different combinations of prosodic cues to predict where focus
    will fall in an utterance. The stimuli were recorded by four
    female native speakers of Australian English who happened to
    have used different prosodic cues to produce sentences with
    prosodic focus: a combination of duration cues, mean and
    maximum F0, F0 range, and longer pre-target interval before
    the focused word onset, only mean F0 cues, only pre-target
    interval, and only duration cues. Results revealed that listeners
    can entrain in almost every condition except for where
    duration was the only reliable cue. Our findings suggest that
    listeners are flexible in the cues they use for focus processing.
  • Cutler, A., & Pasveer, D. (2006). Explaining cross-linguistic differences in effects of lexical stress on spoken-word recognition. In R. Hoffmann, & H. Mixdorff (Eds.), Speech Prosody 2006. Dresden: TUD press.

    Abstract

    Experiments have revealed differences across languages in listeners’ use of stress information in recognising spoken words. Previous comparisons of the vocabulary of Spanish and English had suggested that the explanation of this asymmetry might lie in the extent to which considering stress in spokenword recognition allows rejection of unwanted competition from words embedded in other words. This hypothesis was tested on the vocabularies of Dutch and German, for which word recognition results resemble those from Spanish more than those from English. The vocabulary statistics likewise revealed that in each language, the reduction of embeddings resulting from taking stress into account is more similar to the reduction achieved in Spanish than in English.
  • Cutler, A., Eisner, F., McQueen, J. M., & Norris, D. (2006). Coping with speaker-related variation via abstract phonemic categories. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 31-32).
  • Cutler, A., Burchfield, L. A., & Antoniou, M. (2018). Factors affecting talker adaptation in a second language. In J. Epps, J. Wolfe, J. Smith, & C. Jones (Eds.), Proceedings of the 17th Australasian International Conference on Speech Science and Technology (pp. 33-36).

    Abstract

    Listeners adapt rapidly to previously unheard talkers by
    adjusting phoneme categories using lexical knowledge, in a
    process termed lexically-guided perceptual learning. Although
    this is firmly established for listening in the native language
    (L1), perceptual flexibility in second languages (L2) is as yet
    less well understood. We report two experiments examining L1
    and L2 perceptual learning, the first in Mandarin-English late
    bilinguals, the second in Australian learners of Mandarin. Both
    studies showed stronger learning in L1; in L2, however,
    learning appeared for the English-L1 group but not for the
    Mandarin-L1 group. Phonological mapping differences from
    the L1 to the L2 are suggested as the reason for this result.
  • Cutler, A. (1994). How human speech recognition is affected by phonological diversity among languages. In R. Togneri (Ed.), Proceedings of the fifth Australian International Conference on Speech Science and Technology: Vol. 1 (pp. 285-288). Canberra: Australian Speech Science and Technology Association.

    Abstract

    Listeners process spoken language in ways which are adapted to the phonological structure of their native language. As a consequence, non-native speakers do not listen to a language in the same way as native speakers; moreover, listeners may use their native language listening procedures inappropriately with foreign input. With sufficient experience, however, it may be possible to inhibit this latter (counter-productive) behavior.
  • Cutler, A., & Young, D. (1994). Rhythmic structure of word blends in English. In Proceedings of the Third International Conference on Spoken Language Processing (pp. 1407-1410). Kobe: Acoustical Society of Japan.

    Abstract

    Word blends combine fragments from two words, either in speech errors or when a new word is created. Previous work has demonstrated that in Japanese, such blends preserve moraic structure; in English they do not. A similar effect of moraic structure is observed in perceptual research on segmentation of continuous speech in Japanese; English listeners, by contrast, exploit stress units in segmentation, suggesting that a general rhythmic constraint may underlie both findings. The present study examined whether mis parallel would also hold for word blends. In spontaneous English polysyllabic blends, the source words were significantly more likely to be split before a strong than before a weak (unstressed) syllable, i.e. to be split at a stress unit boundary. In an experiment in which listeners were asked to identify the source words of blends, significantly more correct detections resulted when splits had been made before strong syllables. Word blending, like speech segmentation, appears to be constrained by language rhythm.
  • Cutler, A., McQueen, J. M., Baayen, R. H., & Drexler, H. (1994). Words within words in a real-speech corpus. In R. Togneri (Ed.), Proceedings of the 5th Australian International Conference on Speech Science and Technology: Vol. 1 (pp. 362-367). Canberra: Australian Speech Science and Technology Association.

    Abstract

    In a 50,000-word corpus of spoken British English the occurrence of words embedded within other words is reported. Within-word embedding in this real speech sample is common, and analogous to the extent of embedding observed in the vocabulary. Imposition of a syllable boundary matching constraint reduces but by no means eliminates spurious embedding. Embedded words are most likely to overlap with the beginning of matrix words, and thus may pose serious problems for speech recognisers.
  • Cutler, A., & Bruggeman, L. (2013). Vocabulary structure and spoken-word recognition: Evidence from French reveals the source of embedding asymmetry. In Proceedings of INTERSPEECH: 14th Annual Conference of the International Speech Communication Association (pp. 2812-2816).

    Abstract

    Vocabularies contain hundreds of thousands of words built from only a handful of phonemes, so that inevitably longer words tend to contain shorter ones. In many languages (but not all) such embedded words occur more often word-initially than word-finally, and this asymmetry, if present, has farreaching consequences for spoken-word recognition. Prior research had ascribed the asymmetry to suffixing or to effects of stress (in particular, final syllables containing the vowel schwa). Analyses of the standard French vocabulary here reveal an effect of suffixing, as predicted by this account, and further analyses of an artificial variety of French reveal that extensive final schwa has an independent and additive effect in promoting the embedding asymmetry.
  • Dediu, D. (2006). Mostly out of Africa, but what did the others have to say? In A. Cangelosi, A. D. Smith, & K. Smith (Eds.), The evolution of language: proceedings of the 6th International Conference (EVOLANG6) (pp. 59-66). World Scientific.

    Abstract

    The Recent Out-of-Africa human evolutionary model seems to be generally accepted. This impression is very prevalent outside palaeoanthropological circles (including studies of language evolution), but proves to be unwarranted. This paper offers a short review of the main challenges facing ROA and concludes that alternative models based on the concept of metapopulation must be also considered. The implications of such a model for language evolution and diversity are briefly reviewed.
  • Delgado, T., Ravignani, A., Verhoef, T., Thompson, B., Grossi, T., & Kirby, S. (2018). Cultural transmission of melodic and rhythmic universals: Four experiments and a model. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 89-91). Toruń, Poland: NCU Press. doi:10.12775/3991-1.019.
  • Dideriksen, C., Fusaroli, R., Tylén, K., Dingemanse, M., & Christiansen, M. H. (2019). Contextualizing Conversational Strategies: Backchannel, Repair and Linguistic Alignment in Spontaneous and Task-Oriented Conversations. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci 2019) (pp. 261-267). Montreal, QB: Cognitive Science Society.

    Abstract

    Do interlocutors adjust their conversational strategies to the specific contextual demands of a given situation? Prior studies have yielded conflicting results, making it unclear how strategies vary with demands. We combine insights from qualitative and quantitative approaches in a within-participant experimental design involving two different contexts: spontaneously occurring conversations (SOC) and task-oriented conversations (TOC). We systematically assess backchanneling, other-repair and linguistic alignment. We find that SOC exhibit a higher number of backchannels, a reduced and more generic repair format and higher rates of lexical and syntactic alignment. TOC are characterized by a high number of specific repairs and a lower rate of lexical and syntactic alignment. However, when alignment occurs, more linguistic forms are aligned. The findings show that conversational strategies adapt to specific contextual demands.
  • Dieuleveut, A., Van Dooren, A., Cournane, A., & Hacquard, V. (2019). Acquiring the force of modals: Sig you guess what sig means? In M. Brown, & B. Dailey (Eds.), BUCLD 43: Proceedings of the 43rd annual Boston University Conference on Language Development (pp. 189-202). Sommerville, MA: Cascadilla Press.
  • Dimitriadis, A., Kemps-Snijders, M., Wittenburg, P., Everaert, M., & Levinson, S. C. (2006). Towards a linguist's workbench supporting eScience methods. In Proceedings of the 2nd IEEE International Conference on e-Science and Grid Computing.
  • Dingemanse, M., Liesenfeld, A., & Woensdregt, M. (2022). Convergent cultural evolution of continuers (mhmm). In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The Evolution of Language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 160-167). Nijmegen: Joint Conference on Language Evolution (JCoLE). doi:10.31234/osf.io/65c79.

    Abstract

    Continuers —words like mm, mmhm, uhum and the like— are among the most frequent types of responses in conversation. They play a key role in joint action coordination by showing positive evidence of understanding and scaffolding narrative delivery. Here we investigate the hypothesis that their functional importance along with their conversational ecology places selective pressures on their form and may lead to cross-linguistic similarities through convergent cultural evolution. We compare continuer tokens in linguistically diverse conversational corpora and find languages make available highly similar forms. We then approach the causal mechanism of convergent cultural evolution using exemplar modelling, simulating the process by which a combination of effort minimization and functional specialization may push continuers to a particular region of phonological possibility space. By combining comparative linguistics and computational modelling we shed new light on the question of how language structure is shaped by and for social interaction.
  • Dingemanse, M., & Liesenfeld, A. (2022). From text to talk: Harnessing conversational corpora for humane and diversity-aware language technology. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (ACL 2022) (pp. 5614 -5633). Dublin, Ireland: Association for Computational Linguistics.

    Abstract

    Informal social interaction is the primordial home of human language. Linguistically diverse conversational corpora are an important and largely untapped resource for computational linguistics and language technology. Through the efforts of a worldwide language documentation movement, such corpora are increasingly becoming available. We show how interactional data from 63 languages (26 families) harbours insights about turn-taking, timing, sequential structure and social action, with implications for language technology, natural language understanding, and the design of conversational interfaces. Harnessing linguistically diverse conversational corpora will provide the empirical foundations for flexible, localizable, humane language technologies of the future.
  • Dolscheid, S., Graver, C., & Casasanto, D. (2013). Spatial congruity effects reveal metaphors, not markedness. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2213-2218). Austin,TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0405/index.html.

    Abstract

    Spatial congruity effects have often been interpreted as evidence for metaphorical thinking, but an alternative markedness-based account challenges this view. In two experiments, we directly compared metaphor and markedness explanations for spatial congruity effects, using musical pitch as a testbed. English speakers who talk about pitch in terms of spatial height were tested in speeded space-pitch compatibility tasks. To determine whether space-pitch congruency effects could be elicited by any marked spatial continuum, participants were asked to classify high- and low-frequency pitches as 'high' and 'low' or as 'front' and 'back' (both pairs of terms constitute cases of marked continuums). We found congruency effects in high/low conditions but not in front/back conditions, indicating that markedness is not sufficient to account for congruity effects (Experiment 1). A second experiment showed that congruency effects were specific to spatial words that cued a vertical schema (tall/short), and that congruity effects were not an artifact of polysemy (e.g., 'high' referring both to space and pitch). Together, these results suggest that congruency effects reveal metaphorical uses of spatial schemas, not markedness effects.
  • Dona, L., & Schouwstra, M. (2022). The Role of Structural Priming, Semantics and Population Structure in Word Order Conventionalization: A Computational Model. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 171-173). Nijmegen: Joint Conference on Language Evolution (JCoLE).
  • Duarte, R., Uhlmann, M., Van den Broek, D., Fitz, H., Petersson, K. M., & Morrison, A. (2018). Encoding symbolic sequences with spiking neural reservoirs. In Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN). doi:10.1109/IJCNN.2018.8489114.

    Abstract

    Biologically inspired spiking networks are an important tool to study the nature of computation and cognition in neural systems. In this work, we investigate the representational capacity of spiking networks engaged in an identity mapping task. We compare two schemes for encoding symbolic input, one in which input is injected as a direct current and one where input is delivered as a spatio-temporal spike pattern. We test the ability of networks to discriminate their input as a function of the number of distinct input symbols. We also compare performance using either membrane potentials or filtered spike trains as state variable. Furthermore, we investigate how the circuit behavior depends on the balance between excitation and inhibition, and the degree of synchrony and regularity in its internal dynamics. Finally, we compare different linear methods of decoding population activity onto desired target labels. Overall, our results suggest that even this simple mapping task is strongly influenced by design choices on input encoding, state-variables, circuit characteristics and decoding methods, and these factors can interact in complex ways. This work highlights the importance of constraining computational network models of behavior by available neurobiological evidence.
  • Durco, M., & Windhouwer, M. (2013). Semantic Mapping in CLARIN Component Metadata. In Proceedings of MTSR 2013, the 7th Metadata and Semantics Research Conference (pp. 163-168). New York: Springer.

    Abstract

    In recent years, large scale initiatives like CLARIN set out to overcome the notorious heterogeneity of metadata formats in the domain of language resource. The CLARIN Component Metadata Infrastructure established means for flexible resouce descriptions for the domain of language resources. The Data Category Registry ISOcat and the accompanying Relation Registry foster semantic interoperability within the growing heterogeneous collection of metadata records. This paper describes the CMD Infrastructure focusing on the facilities for semantic mapping, and gives also an overview of the current status in the joint component metadata domain.
  • Eijk, L., Ernestus, M., & Schriefers, H. (2019). Alignment of pitch and articulation rate. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 2690-2694). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Previous studies have shown that speakers align their speech to each other at multiple linguistic levels. This study investigates whether alignment is mostly the result of priming from the immediately preceding
    speech materials, focussing on pitch and articulation rate (AR). Native Dutch speakers completed sentences, first by themselves (pre-test), then in alternation with Confederate 1 (Round 1), with Confederate 2 (Round 2), with Confederate 1 again
    (Round 3), and lastly by themselves again (post-test). Results indicate that participants aligned to the confederates and that this alignment lasted during the post-test. The confederates’ directly preceding sentences were not good predictors for the participants’ pitch and AR. Overall, the results indicate that alignment is more of a global effect than a local priming effect.
  • Enfield, N. J. (2006). Social consequences of common ground. In N. J. Enfield, & S. C. Levinson (Eds.), Roots of human sociality: Culture, cognition and interaction (pp. 399-430). Oxford: Berg.
  • Enfield, N. J., & Levinson, S. C. (2006). Introduction: Human sociality as a new interdisciplinary field. In N. J. Enfield, & S. C. Levinson (Eds.), Roots of human sociality: Culture, cognition and interaction (pp. 1-35). Oxford: Berg.
  • Ergin, R., Senghas, A., Jackendoff, R., & Gleitman, L. (2018). Structural cues for symmetry, asymmetry, and non-symmetry in Central Taurus Sign Language. In C. Cuskley, M. Flaherty, H. Little, L. McCrohon, A. Ravignani, & T. Verhoef (Eds.), Proceedings of the 12th International Conference on the Evolution of Language (EVOLANG XII) (pp. 104-106). Toruń, Poland: NCU Press. doi:10.12775/3991-1.025.
  • Evans, N., Levinson, S. C., & Sterelny, K. (Eds.). (2021). Thematic issue on evolution of kinship systems [Special Issue]. Biological theory, 16.
  • Eviatar, Z., & Huettig, F. (Eds.). (2021). Literacy and writing systems [Special Issue]. Journal of Cultural Cognitive Science.
  • Falk, J. J., Zhang, Y., Scheutz, M., & Yu, C. (2021). Parents adaptively use anaphora during parent-child social interaction. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 1472-1478). Vienna: Cognitive Science Society.

    Abstract

    Anaphora, a ubiquitous feature of natural language, poses a particular challenge to young children as they first learn language due to its referential ambiguity. In spite of this, parents and caregivers use anaphora frequently in child-directed speech, potentially presenting a risk to effective communication if children do not yet have the linguistic capabilities of resolving anaphora successfully. Through an eye-tracking study in a naturalistic free-play context, we examine the strategies that parents employ to calibrate their use of anaphora to their child's linguistic development level. We show that, in this way, parents are able to intuitively scaffold the complexity of their speech such that greater referential ambiguity does not hurt overall communication success.
  • Felker, E. R., Ernestus, M., & Broersma, M. (2019). Evaluating dictation task measures for the study of speech perception. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 2019) (pp. 383-387). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    This paper shows that the dictation task, a well-
    known testing instrument in language education, has
    untapped potential as a research tool for studying
    speech perception. We describe how transcriptions
    can be scored on measures of lexical, orthographic,
    phonological, and semantic similarity to target
    phrases to provide comprehensive information about
    accuracy at different processing levels. The former
    three measures are automatically extractable,
    increasing objectivity, and the middle two are
    gradient, providing finer-grained information than
    traditionally used. We evaluate the measures in an
    English dictation task featuring phonetically reduced
    continuous speech. Whereas the lexical and
    orthographic measures emphasize listeners’ word
    identification difficulties, the phonological measure
    demonstrates that listeners can often still recover
    phonological features, and the semantic measure
    captures their ability to get the gist of the utterances.
    Correlational analyses and a discussion of practical
    and theoretical considerations show that combining
    multiple measures improves the dictation task’s
    utility as a research tool.
  • Felker, E. R., Ernestus, M., & Broersma, M. (2019). Lexically guided perceptual learning of a vowel shift in an interactive L2 listening context. In Proceedings of Interspeech 2019 (pp. 3123-3127). doi:10.21437/Interspeech.2019-1414.

    Abstract

    Lexically guided perceptual learning has traditionally been studied with ambiguous consonant sounds to which native listeners are exposed in a purely receptive listening context. To extend previous research, we investigate whether lexically guided learning applies to a vowel shift encountered by non-native listeners in an interactive dialogue. Dutch participants played a two-player game in English in either a control condition, which contained no evidence for a vowel shift, or a lexically constraining condition, in which onscreen lexical information required them to re-interpret their interlocutor’s /ɪ/ pronunciations as representing /ε/. A phonetic categorization pre-test and post-test were used to assess whether the game shifted listeners’ phonemic boundaries such that more of the /ε/-/ɪ/ continuum came to be perceived as /ε/. Both listener groups showed an overall post-test shift toward /ɪ/, suggesting that vowel perception may be sensitive to directional biases related to properties of the speaker’s vowel space. Importantly, listeners in the lexically constraining condition made relatively more post-test /ε/ responses than the control group, thereby exhibiting an effect of lexically guided adaptation. The results thus demonstrate that non-native listeners can adjust their phonemic boundaries on the basis of lexical information to accommodate a vowel shift learned in interactive conversation.
  • Fisher, S. E., & Tilot, A. K. (Eds.). (2019). Bridging senses: Novel insights from synaesthesia [Special Issue]. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 374.
  • Flecken, M., & Gerwien, J. (2013). Grammatical aspect modulates event duration estimations: findings from Dutch. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th annual meeting of the Cognitive Science Society (CogSci 2013) (pp. 2309-2314). Austin,TX: Cognitive Science Society.
  • Fletcher, J., Kidd, E., Stoakes, H., & Nordlinger, R. (2022). Prosodic phrasing, pitch range, and word order variation in Murrinhpatha. In R. Billington (Ed.), Proceedings of the 18th Australasian International Conference on Speech Science and Technology (pp. 201-205). Canberra: Australasian Speech Science and Technology Association.

    Abstract

    Like many Indigenous Australian languages, Murrinhpatha has flexible word order with no apparent configurational syntax. We analyzed an experimental corpus of Murrinhpatha utterances for associations between different thematic role orders, intonational phrasing patterns and pitch downtrends. We found that initial constituents (Agents or Patients) tend to carry the highest pitch targets (HiF0), followed by patterns of downstep and declination. Sentence-final verbs always have lower Hif0 values than either initial or medial Agents or Patients. Thematic role order does not influence intonational
    patterns, with the results suggesting that Murrinhpatha has positional prosody, although final nominals can disrupt global
    pitch downtrends regardless of thematic role.
  • Floyd, S. (2006). The cash value of style in the Andean market. In E.-X. Lee, K. M. Markman, V. Newdick, & T. Sakuma (Eds.), SALSA 13: Texas Linguistic Forum vol. 49. Austin, TX: Texas Linguistics Forum.

    Abstract

    This paper examines code and style shifting during sales transactions based on two market case studies from highland Ecuador. Bringing together ideas of linguistic economy with work on stylistic variation and ethnohistorical research on Andean markets, I study bartering, market calls and sales pitches to show how sellers create stylistic performances distinguished by contrasts of code, register and poetic features. The interaction of the symbolic value of language with the economic values of the market presents a place to examine the relationship between discourse and the material world.
  • Frost, R. L. A., Isbilen, E. S., Christiansen, M. H., & Monaghan, P. (2019). Testing the limits of non-adjacent dependency learning: Statistical segmentation and generalisation across domains. In A. K. Goel, C. M. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 1787-1793). Montreal, QB: Cognitive Science Society.

    Abstract

    Achieving linguistic proficiency requires identifying words from speech, and discovering the constraints that govern the way those words are used. In a recent study of non-adjacent dependency learning, Frost and Monaghan (2016) demonstrated that learners may perform these tasks together, using similar statistical processes - contrary to prior suggestions. However, in their study, non-adjacent dependencies were marked by phonological cues (plosive-continuant-plosive structure), which may have influenced learning. Here, we test the necessity of these cues by comparing learning across three conditions; fixed phonology, which contains these cues, varied phonology, which omits them, and shapes, which uses visual shape sequences to assess the generality of statistical processing for these tasks. Participants segmented the sequences and generalized the structure in both auditory conditions, but learning was best when phonological cues were present. Learning was around chance on both tasks for the visual shapes group, indicating statistical processing may critically differ across domains.
  • Furman, R., Ozyurek, A., & Allen, S. E. M. (2006). Learning to express causal events across languages: What do speech and gesture patterns reveal? In D. Bamman, T. Magnitskaia, & C. Zaller (Eds.), Proceedings of the 30th Annual Boston University Conference on Language Development (pp. 190-201). Somerville, Mass: Cascadilla Press.
  • Galke, L., Franke, B., Zielke, T., & Scherp, A. (2021). Lifelong learning of graph neural networks for open-world node classification. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN). Piscataway, NJ: IEEE. doi:10.1109/IJCNN52387.2021.9533412.

    Abstract

    Graph neural networks (GNNs) have emerged as the standard method for numerous tasks on graph-structured data such as node classification. However, real-world graphs are often evolving over time and even new classes may arise. We model these challenges as an instance of lifelong learning, in which a learner faces a sequence of tasks and may take over knowledge acquired in past tasks. Such knowledge may be stored explicitly as historic data or implicitly within model parameters. In this work, we systematically analyze the influence of implicit and explicit knowledge. Therefore, we present an incremental training method for lifelong learning on graphs and introduce a new measure based on k-neighborhood time differences to address variances in the historic data. We apply our training method to five representative GNN architectures and evaluate them on three new lifelong node classification datasets. Our results show that no more than 50% of the GNN's receptive field is necessary to retain at least 95% accuracy compared to training over the complete history of the graph data. Furthermore, our experiments confirm that implicit knowledge becomes more important when fewer explicit knowledge is available.
  • Galke, L., Gerstenkorn, G., & Scherp, A. (2018). A case study of closed-domain response suggestion with limited training data. In M. Elloumi, M. Granitzer, A. Hameurlain, C. Seifert, B. Stein, A. Min Tjoa, & R. Wagner (Eds.), Database and Expert Systems Applications: DEXA 2018 International Workshops, BDMICS, BIOKDD, and TIR, Regensburg, Germany, September 3–6, 2018, Proceedings (pp. 218-229). Cham, Switzerland: Springer.

    Abstract

    We analyze the problem of response suggestion in a closed domain along a real-world scenario of a digital library. We present a text-processing pipeline to generate question-answer pairs from chat transcripts. On this limited amount of training data, we compare retrieval-based, conditioned-generation, and dedicated representation learning approaches for response suggestion. Our results show that retrieval-based methods that strive to find similar, known contexts are preferable over parametric approaches from the conditioned-generation family, when the training data is limited. We, however, identify a specific representation learning approach that is competitive to the retrieval-based approaches despite the training data limitation.
  • Galke, L., Vagliano, I., & Scherp, A. (2019). Can graph neural networks go „online“? An analysis of pretraining and inference. In Proceedings of the Representation Learning on Graphs and Manifolds: ICLR2019 Workshop.

    Abstract

    Large-scale graph data in real-world applications is often not static but dynamic,
    i. e., new nodes and edges appear over time. Current graph convolution approaches
    are promising, especially, when all the graph’s nodes and edges are available dur-
    ing training. When unseen nodes and edges are inserted after training, it is not
    yet evaluated whether up-training or re-training from scratch is preferable. We
    construct an experimental setup, in which we insert previously unseen nodes and
    edges after training and conduct a limited amount of inference epochs. In this
    setup, we compare adapting pretrained graph neural networks against retraining
    from scratch. Our results show that pretrained models yield high accuracy scores
    on the unseen nodes and that pretraining is preferable over retraining from scratch.
    Our experiments represent a first step to evaluate and develop truly online variants
    of graph neural networks.
  • Galke, L., Seidlmayer, E., Lüdemann, G., Langnickel, L., Melnychuk, T., Förstner, K. U., Tochtermann, K., & Schultz, C. (2021). COVID-19++: A citation-aware Covid-19 dataset for the analysis of research dynamics. In Y. Chen, H. Ludwig, Y. Tu, U. Fayyad, X. Zhu, X. Hu, S. Byna, X. Liu, J. Zhang, S. Pan, V. Papalexakis, J. Wang, A. Cuzzocrea, & C. Ordonez (Eds.), Proceedings of the 2021 IEEE International Conference on Big Data (pp. 4350-4355). Piscataway, NJ: IEEE.

    Abstract

    COVID-19 research datasets are crucial for analyzing research dynamics. Most collections of COVID-19 research items do not to include cited works and do not have annotations
    from a controlled vocabulary. Starting with ZB MED KE data on COVID-19, which comprises CORD-19, we assemble a new dataset that includes cited work and MeSH annotations for all records. Furthermore, we conduct experiments on the analysis of research dynamics, in which we investigate predicting links in a co-annotation graph created on the basis of the new dataset. Surprisingly, we find that simple heuristic methods are better at
    predicting future links than more sophisticated approaches such as graph neural networks.
  • Galke, L., Melnychuk, T., Seidlmayer, E., Trog, S., Foerstner, K., Schultz, C., & Tochtermann, K. (2019). Inductive learning of concept representations from library-scale bibliographic corpora. In K. David, K. Geihs, M. Lange, & G. Stumme (Eds.), Informatik 2019: 50 Jahre Gesellschaft für Informatik - Informatik für Gesellschaft (pp. 219-232). Bonn: Gesellschaft für Informatik e.V. doi:10.18420/inf2019_26.
  • Galke, L., Mai, F., & Vagliano, I. (2018). Multi-modal adversarial autoencoders for recommendations of citations and subject labels. In T. Mitrovic, J. Zhang, L. Chen, & D. Chin (Eds.), UMAP '18: Proceedings of the 26th Conference on User Modeling, Adaptation and Personalization (pp. 197-205). New York: ACM. doi:10.1145/3209219.3209236.

    Abstract

    We present multi-modal adversarial autoencoders for recommendation and evaluate them on two different tasks: citation recommendation and subject label recommendation. We analyze the effects of adversarial regularization, sparsity, and different input modalities. By conducting 408 experiments, we show that adversarial regularization consistently improves the performance of autoencoders for recommendation. We demonstrate, however, that the two tasks differ in the semantics of item co-occurrence in the sense that item co-occurrence resembles relatedness in case of citations, yet implies diversity in case of subject labels. Our results reveal that supplying the partial item set as input is only helpful, when item co-occurrence resembles relatedness. When facing a new recommendation task it is therefore crucial to consider the semantics of item co-occurrence for the choice of an appropriate model.
  • Galke, L., & Scherp, A. (2022). Bag-of-words vs. graph vs. sequence in text classification: Questioning the necessity of text-graphs and the surprising strength of a wide MLP. In S. Muresan, P. Nakov, & A. Villavicencio (Eds.), Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (pp. 4038-4051). Dublin: Association for Computational Linguistics. doi:10.18653/v1/2022.acl-long.279.
  • Galke, L., Cuber, I., Meyer, C., Nölscher, H. F., Sonderecker, A., & Scherp, A. (2022). General cross-architecture distillation of pretrained language models into matrix embedding. In Proceedings of the IEEE Joint Conference on Neural Networks (IJCNN 2022), part of the IEEE World Congress on Computational Intelligence (WCCI 2022). doi:10.1109/IJCNN55064.2022.9892144.

    Abstract

    Large pretrained language models (PreLMs) are rev-olutionizing natural language processing across all benchmarks. However, their sheer size is prohibitive for small laboratories or for deployment on mobile devices. Approaches like pruning and distillation reduce the model size but typically retain the same model architecture. In contrast, we explore distilling PreLMs into a different, more efficient architecture, Continual Multiplication of Words (CMOW), which embeds each word as a matrix and uses matrix multiplication to encode sequences. We extend the CMOW architecture and its CMOW/CBOW-Hybrid variant with a bidirectional component for more expressive power, per-token representations for a general (task-agnostic) distillation during pretraining, and a two-sequence encoding scheme that facilitates downstream tasks on sentence pairs, such as sentence similarity and natural language inference. Our matrix-based bidirectional CMOW/CBOW-Hybrid model is competitive to DistilBERT on question similarity and recognizing textual entailment, but uses only half of the number of parameters and is three times faster in terms of inference speed. We match or exceed the scores of ELMo for all tasks of the GLUE benchmark except for the sentiment analysis task SST-2 and the linguistic acceptability task CoLA. However, compared to previous cross-architecture distillation approaches, we demonstrate a doubling of the scores on detecting linguistic acceptability. This shows that matrix-based embeddings can be used to distill large PreLM into competitive models and motivates further research in this direction.
  • Gamba, M., De Gregorio, C., Valente, D., Raimondi, T., Torti, V., Miaretsoa, L., Carugati, F., Friard, O., Giacoma, C., & Ravignani, A. (2022). Primate rhythmic categories analyzed on an individual basis. In A. Ravignani, R. Asano, D. Valente, F. Ferretti, S. Hartmann, M. Hayashi, Y. Jadoul, M. Martins, Y. Oseki, E. D. Rodrigues, O. Vasileva, & S. Wacewicz (Eds.), The evolution of language: Proceedings of the Joint Conference on Language Evolution (JCoLE) (pp. 229-236). Nijmegen: Joint Conference on Language Evolution (JCoLE).

    Abstract

    Rhythm is a fundamental feature characterizing communicative displays, and recent studies showed that primate songs encompass categorical rhythms falling on small integer ratios observed in humans. We individually assessed the presence and sexual dimorphism of rhythmic categories, analyzing songs emitted by 39 wild indris. Considering the intervals between the units given during each song, we extracted 13556 interval ratios and found three peaks (at around 0.33, 0.47, and 0.70). Two peaks indicated rhythmic categories corresponding to small integer ratios (1:1, 2:1). All individuals showed a peak at 0.70, and
    most showed those at 0.47 and 0.33. In addition, we found sex differences in the peak at 0.47 only, with males showing lower values than females. This work investigates the presence of individual rhythmic categories in a non-human species; further research may highlight the significance of rhythmicity and untie selective pressures that guided its evolution across species, including humans.
  • Gazendam, L., Malaisé, V., Schreiber, G., & Brugman, H. (2006). Deriving semantic annotations of an audiovisual program from contextual texts. In First International Workshop on Semantic Web Annotations for Multimedia (SWAMM 2006).

    Abstract

    The aim of this paper is to explore whether indexing terms for an audiovisual program can be derived from contextual texts automatically. For this we apply natural-language processing techniques to contextual texts of two Dutch TV-programs. We use a Dutch domain thesaurus to derive possible metadata. This possible metadata is ranked by an algorithm which uses the relations of the thesaurus. We evaluate the results by comparing them to human made descriptions.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). Automatic sign language identification. In Proceeding of the 20th IEEE International Conference on Image Processing (ICIP) (pp. 2626-2630).

    Abstract

    We propose a Random-Forest based sign language identification system. The system uses low-level visual features and is based on the hypothesis that sign languages have varying distributions of phonemes (hand-shapes, locations and movements). We evaluated the system on two sign languages -- British SL and Greek SL, both taken from a publicly available corpus, called Dicta Sign Corpus. Achieved average F1 scores are about 95% - indicating that sign languages can be identified with high accuracy using only low-level visual features.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). Automatic signer diarization - the mover is the signer approach. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2013 IEEE Conference on (pp. 283-287). doi:10.1109/CVPRW.2013.49.

    Abstract

    We present a vision-based method for signer diarization -- the task of automatically determining "who signed when?" in a video. This task has similar motivations and applications as speaker diarization but has received little attention in the literature. In this paper, we motivate the problem and propose a method for solving it. The method is based on the hypothesis that signers make more movements than their interlocutors. Experiments on four videos (a total of 1.4 hours and each consisting of two signers) show the applicability of the method. The best diarization error rate (DER) obtained is 0.16.
  • Gebre, B. G., Zampieri, M., Wittenburg, P., & Heskes, T. (2013). Improving Native Language Identification with TF-IDF weighting. In Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 216-223).

    Abstract

    This paper presents a Native Language Identification (NLI) system based on TF-IDF weighting schemes and using linear classifiers - support vector machines, logistic regressions and perceptrons. The system was one of the participants of the 2013 NLI Shared Task in the closed-training track, achieving 0.814 overall accuracy for a set of 11 native languages. This accuracy was only 2.2 percentage points lower than the winner's performance. Furthermore, with subsequent evaluations using 10-fold cross-validation (as given by the organizers) on the combined training and development data, the best average accuracy obtained is 0.8455 and the features that contributed to this accuracy are the TF-IDF of the combined unigrams and bigrams of words.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). The gesturer is the speaker. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013) (pp. 3751-3755).

    Abstract

    We present and solve the speaker diarization problem in a novel way. We hypothesize that the gesturer is the speaker and that identifying the gesturer can be taken as identifying the active speaker. We provide evidence in support of the hypothesis from gesture literature and audio-visual synchrony studies. We also present a vision-only diarization algorithm that relies on gestures (i.e. upper body movements). Experiments carried out on 8.9 hours of a publicly available dataset (the AMI meeting data) show that diarization error rates as low as 15% can be achieved.
  • Gijssels, T., Bottini, R., Rueschemeyer, S.-A., & Casasanto, D. (2013). Space and time in the parietal cortex: fMRI Evidence for a meural asymmetry. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 495-500). Austin,TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0113/index.html.

    Abstract

    How are space and time related in the brain? This study contrasts two proposals that make different predictions about the interaction between spatial and temporal magnitudes. Whereas ATOM implies that space and time are symmetrically related, Metaphor Theory claims they are asymmetrically related. Here we investigated whether space and time activate the same neural structures in the inferior parietal cortex (IPC) and whether the activation is symmetric or asymmetric across domains. We measured participants’ neural activity while they made temporal and spatial judgments on the same visual stimuli. The behavioral results replicated earlier observations of a space-time asymmetry: Temporal judgments were more strongly influenced by irrelevant spatial information than vice versa. The BOLD fMRI data indicated that space and time activated overlapping clusters in the IPC and that, consistent with Metaphor Theory, this activation was asymmetric: The shared region of IPC was activated more strongly during temporal judgments than during spatial judgments. We consider three possible interpretations of this neural asymmetry, based on 3 possible functions of IPC.
  • Goldrick, M., Brehm, L., Pyeong Whan, C., & Smolensky, P. (2019). Transient blend states and discrete agreement-driven errors in sentence production. In G. J. Snover, M. Nelson, B. O'Connor, & J. Pater (Eds.), Proceedings of the Society for Computation in Linguistics (SCiL 2019) (pp. 375-376). doi:10.7275/n0b2-5305.
  • Goudbeek, M., & Swingley, D. (2006). Saliency effects in distributional learning. In Proceedings of the 11th Australasian International Conference on Speech Science and Technology (pp. 478-482). Auckland: Australasian Speech Science and Technology Association.

    Abstract

    Acquiring the sounds of a language involves learning to recognize distributional patterns present in the input. We show that among adult learners, this distributional learning of auditory categories (which are conceived of here as probability density functions in a multidimensional space) is constrained by the salience of the dimensions that form the axes of this perceptual space. Only with a particular ratio of variation in the perceptual dimensions was category learning driven by the distributional properties of the input.
  • Greenfield, M. D., Honing, H., Kotz, S. A., & Ravignani, A. (Eds.). (2021). Synchrony and rhythm interaction: From the brain to behavioural ecology [Special Issue]. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 376.
  • Gullberg, M., & Indefrey, P. (Eds.). (2006). The cognitive neuroscience of second language acquisition [Special Issue]. Language Learning, 56(suppl. 1).
  • Gullberg, M. (Ed.). (2006). Gestures and second language acquisition [Special Issue]. International Review of Applied Linguistics, 44(2).
  • Gussenhoven, C., & Zhou, W. (2013). Revisiting pitch slope and height effects on perceived duration. In Proceedings of INTERSPEECH 2013: 14th Annual Conference of the International Speech Communication Association (pp. 1365-1369).

    Abstract

    The shape of pitch contours has been shown to have an effect on the perceived duration of vowels. For instance, vowels with high level pitch and vowels with falling contours sound longer than vowels with low level pitch. Depending on whether the
    comparison is between level pitches or between level and dynamic contours, these findings have been interpreted in two ways. For inter-level comparisons, where the duration results are the reverse of production results, a hypercorrection strategy in production has been proposed [1]. By contrast, for comparisons between level pitches and dynamic contours, the
    longer production data for dynamic contours have been held responsible. We report an experiment with Dutch and Chinese listeners which aimed to show that production data and perception data are each other’s opposites for high, low, falling and rising contours. We explain the results, which are consistent with earlier findings, in terms of the compensatory listening strategy of [2], arguing that the perception effects are due to a perceptual compensation of articulatory strategies and
    constraints, rather than that differences in production compensate for psycho-acoustic perception effects.
  • Hahn, L. E., Ten Buuren, M., De Nijs, M., Snijders, T. M., & Fikkert, P. (2019). Acquiring novel words in a second language through mutual play with child songs - The Noplica Energy Center. In L. Nijs, H. Van Regenmortel, & C. Arculus (Eds.), MERYC19 Counterpoints of the senses: Bodily experiences in musical learning (pp. 78-87). Ghent, Belgium: EuNet MERYC 2019.

    Abstract

    Child songs are a great source for linguistic learning. Here we explore whether children can acquire novel words in a second language by playing a game featuring child songs in a playhouse. We present data from three studies that serve as scientific proof for the functionality of one game of the playhouse: the Energy Center. For this game, three hand-bikes were mounted on a panel. When children start moving the hand-bikes, child songs start playing simultaneously. Once the children produce enough energy with the hand-bikes, the songs are additionally accompanied with the sounds of musical instruments. In our studies, children executed a picture-selection task to evaluate whether they acquired new vocabulary from the songs presented during the game. Two of our studies were run in the field, one at a Dutch and one at an Indian pre-school. The third study features data from a more controlled laboratory setting. Our results partly confirm that the Energy Center is a successful means to support vocabulary acquisition in a second language. More research with larger sample sizes and longer access to the Energy Center is needed to evaluate the overall functionality of the game. Based on informal observations at our test sites, however, we are certain that children do pick up linguistic content from the songs during play, as many of the children repeat words and phrases from songs they heard. We will pick up upon these promising observations during future studies
  • Harbusch, K., & Kempen, G. (2006). ELLEIPO: A module that computes coordinative ellipsis for language generators that don't. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2006) (pp. 115-118).

    Abstract

    Many current sentence generators lack the ability to compute elliptical versions of coordinated clauses in accordance with the rules for Gapping, Forward and Backward Conjunction Reduction, and SGF (Subject Gap in clauses with Finite/ Fronted verb). We describe a module (implemented in JAVA, with German and Dutch as target languages) that takes non-elliptical coordinated clauses as input and returns all reduced versions licensed by coordinative ellipsis. It is loosely based on a new psycholinguistic theory of coordinative ellipsis proposed by Kempen. In this theory, coordinative ellipsis is not supposed to result from the application of declarative grammar rules for clause formation but from a procedural component that interacts with the sentence generator and may block the overt expression of certain constituents.
  • Harbusch, K., Kempen, G., Van Breugel, C., & Koch, U. (2006). A generation-oriented workbench for performance grammar: Capturing linear order variability in German and Dutch. In Proceedings of the 4th International Natural Language Generation Conference (pp. 9-11).

    Abstract

    We describe a generation-oriented workbench for the Performance Grammar (PG) formalism, highlighting the treatment of certain word order and movement constraints in Dutch and German. PG enables a simple and uniform treatment of a heterogeneous collection of linear order phenomena in the domain of verb constructions (variably known as Cross-serial Dependencies, Verb Raising, Clause Union, Extraposition, Third Construction, Particle Hopping, etc.). The central data structures enabling this feature are clausal “topologies”: one-dimensional arrays associated with clauses, whose cells (“slots”) provide landing sites for the constituents of the clause. Movement operations are enabled by unification of lateral slots of topologies at adjacent levels of the clause hierarchy. The PGW generator assists the grammar developer in testing whether the implemented syntactic knowledge allows all and only the well-formed permutations of constituents.
  • Harmon, Z., Barak, L., Shafto, P., Edwards, J., & Feldman, N. H. (2021). Making heads or tails of it: A competition–compensation account of morphological deficits in language impairment. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 1872-1878). Vienna: Cognitive Science Society.

    Abstract

    Children with developmental language disorder (DLD) regularly use the base form of verbs (e.g., dance) instead of inflected forms (e.g., danced). We propose an account of this behavior in which children with DLD have difficulty processing novel inflected verbs in their input. This leads the inflected form to face stronger competition from alternatives. Competition is resolved by the production of a more accessible alternative with high semantic overlap with the inflected form: in English, the bare form. We test our account computationally by training a nonparametric Bayesian model that infers the productivity of the inflectional suffix (-ed). We systematically vary the number of novel types of inflected verbs in the input to simulate the input as processed by children with and without DLD. Modeling results are consistent with our hypothesis, suggesting that children’s inconsistent use of inflectional morphemes could stem from inferences they make on the basis of impoverished data.
  • Heilbron, M., Ehinger, B., Hagoort, P., & De Lange, F. P. (2019). Tracking naturalistic linguistic predictions with deep neural language models. In Proceedings of the 2019 Conference on Cognitive Computational Neuroscience (pp. 424-427). doi:10.32470/CCN.2019.1096-0.

    Abstract

    Prediction in language has traditionally been studied using
    simple designs in which neural responses to expected
    and unexpected words are compared in a categorical
    fashion. However, these designs have been contested
    as being ‘prediction encouraging’, potentially exaggerating
    the importance of prediction in language understanding.
    A few recent studies have begun to address
    these worries by using model-based approaches to probe
    the effects of linguistic predictability in naturalistic stimuli
    (e.g. continuous narrative). However, these studies
    so far only looked at very local forms of prediction, using
    models that take no more than the prior two words into
    account when computing a word’s predictability. Here,
    we extend this approach using a state-of-the-art neural
    language model that can take roughly 500 times longer
    linguistic contexts into account. Predictability estimates
    fromthe neural network offer amuch better fit to EEG data
    from subjects listening to naturalistic narrative than simpler
    models, and reveal strong surprise responses akin to
    the P200 and N400. These results show that predictability
    effects in language are not a side-effect of simple designs,
    and demonstrate the practical use of recent advances
    in AI for the cognitive neuroscience of language.
  • Herbst, L. E. (2006). The influence of language dominance on bilingual VOT: A case study. In Proceedings of the 4th University of Cambridge Postgraduate Conference on Language Research (CamLing 2006) (pp. 91-98). Cambridge: Cambridge University Press.

    Abstract

    Longitudinally collected VOT data from an early English-Italian bilingual who became increasingly English-dominant was analyzed. Stops in English were always produced with significantly longer VOT than in Italian. However, the speaker did not show any significant change in the VOT production in either language over time, despite the clear dominance of English in his every day language use later in his life. The results indicate that – unlike L2 learners – early bilinguals may remain unaffected by language use with respect to phonetic realization.
  • Hintz, F., Voeten, C. C., McQueen, J. M., & Meyer, A. S. (2022). Quantifying the relationships between linguistic experience, general cognitive skills and linguistic processing skills. In J. Culbertson, A. Perfors, H. Rabagliati, & V. Ramenzoni (Eds.), Proceedings of the 44th Annual Conference of the Cognitive Science Society (CogSci 2022) (pp. 2491-2496). Toronto, Canada: Cognitive Science Society.

    Abstract

    Humans differ greatly in their ability to use language. Contemporary psycholinguistic theories assume that individual differences in language skills arise from variability in linguistic experience and in general cognitive skills. While much previous research has tested the involvement of select verbal and non-verbal variables in select domains of linguistic processing, comprehensive characterizations of the relationships among the skills underlying language use are rare. We contribute to such a research program by re-analyzing a publicly available set of data from 112 young adults tested on 35 behavioral tests. The tests assessed nine key constructs reflecting linguistic processing skills, linguistic experience and general cognitive skills. Correlation and hierarchical clustering analyses of the test scores showed that most of the tests assumed to measure the same construct correlated moderately to strongly and largely clustered together. Furthermore, the results suggest important roles of processing speed in comprehension, and of linguistic experience in production.

Share this page