Publications

Displaying 1 - 100 of 474
  • Agirrezabal, M., Paggio, P., Navarretta, C., & Jongejan, B. (2023). Multimodal detection and classification of head movements in face-to-face conversations: Exploring models, features and their interaction. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527200.

    Abstract

    In this work we perform multimodal detection and classification
    of head movements from face to face video conversation data.
    We have experimented with different models and feature sets
    and provided some insight on the effect of independent features,
    but also how their interaction can enhance a head movement
    classifier. Used features include nose, neck and mid hip position
    coordinates and their derivatives together with acoustic features,
    namely, intensity and pitch of the speaker on focus. Results
    show that when input features are sufficiently processed by in-
    teracting with each other, a linear classifier can reach a similar
    performance to a more complex non-linear neural model with
    several hidden layers. Our best models achieve state-of-the-art
    performance in the detection task, measured by macro-averaged
    F1 score.
  • Akita, K., & Dingemanse, M. (2019). Ideophones (Mimetics, Expressives). In Oxford Research Encyclopedia for Linguistics. Oxford: Oxford University Press. doi:10.1093/acrefore/9780199384655.013.477.

    Abstract

    Ideophones, also termed “mimetics” or “expressives,” are marked words that depict sensory imagery. They are found in many of the world’s languages, and sizable lexical classes of ideophones are particularly well-documented in languages of Asia, Africa, and the Americas. Ideophones are not limited to onomatopoeia like meow and smack, but cover a wide range of sensory domains, such as manner of motion (e.g., plisti plasta ‘splish-splash’ in Basque), texture (e.g., tsaklii ‘rough’ in Ewe), and psychological states (e.g., wakuwaku ‘excited’ in Japanese). Across languages, ideophones stand out as marked words due to special phonotactics, expressive morphology including certain types of reduplication, and relative syntactic independence, in addition to production features like prosodic foregrounding and common co-occurrence with iconic gestures.

    Three intertwined issues have been repeatedly debated in the century-long literature on ideophones. (a) Definition: Isolated descriptive traditions and cross-linguistic variation have sometimes obscured a typologically unified view of ideophones, but recent advances show the promise of a prototype definition of ideophones as conventionalised depictions in speech, with room for language-specific nuances. (b) Integration: The variable integration of ideophones across linguistic levels reveals an interaction between expressiveness and grammatical integration, and has important implications for how to conceive of dependencies between linguistic systems. (c) Iconicity: Ideophones form a natural laboratory for the study of iconic form-meaning associations in natural languages, and converging evidence from corpus and experimental studies suggests important developmental, evolutionary, and communicative advantages of ideophones.
  • Alhama, R. G., Rowland, C. F., & Kidd, E. (2020). Evaluating word embeddings for language acquisition. In E. Chersoni, C. Jacobs, Y. Oseki, L. Prévot, & E. Santus (Eds.), Proceedings of the Workshop on Cognitive Modeling and Computational Linguistics (pp. 38-42). Stroudsburg, PA, USA: Association for Computational Linguistics (ACL). doi:10.18653/v1/2020.cmcl-1.4.

    Abstract

    Continuous vector word representations (or
    word embeddings) have shown success in cap-turing semantic relations between words, as evidenced by evaluation against behavioral data of adult performance on semantic tasks (Pereira et al., 2016). Adult semantic knowl-edge is the endpoint of a language acquisition process; thus, a relevant question is whether these models can also capture emerging word
    representations of young language learners. However, the data for children’s semantic knowledge across development is scarce. In this paper, we propose to bridge this gap by using Age of Acquisition norms to evaluate word embeddings learnt from child-directed input. We present two methods that evaluate word embeddings in terms of (a) the semantic neighbourhood density of learnt words, and (b) con-
    vergence to adult word associations. We apply our methods to bag-of-words models, and find that (1) children acquire words with fewer semantic neighbours earlier, and (2) young learners only attend to very local context. These findings provide converging evidence for validity of our methods in understanding the prerequisite features for a distributional model of word learning.
  • Alhama, R. G., & Zuidema, W. (2017). Segmentation as Retention and Recognition: the R&R model. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 1531-1536). Austin, TX: Cognitive Science Society.

    Abstract

    We present the Retention and Recognition model (R&R), a probabilistic exemplar model that accounts for segmentation in Artificial Language Learning experiments. We show that R&R provides an excellent fit to human responses in three segmentation experiments with adults (Frank et al., 2010), outperforming existing models. Additionally, we analyze the results of the simulations and propose alternative explanations for the experimental findings.
  • Alhama, R. G., Siegelman, N., Frost, R., & Armstrong, B. C. (2019). The role of information in visual word recognition: A perceptually-constrained connectionist account. In A. Goel, C. Seifert, & C. Freksa (Eds.), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 83-89). Austin, TX: Cognitive Science Society.

    Abstract

    Proficient readers typically fixate near the center of a word, with a slight bias towards word onset. We explore a novel account of this phenomenon based on combining information-theory with visual perceptual constraints in a connectionist model of visual word recognition. This account posits that the amount of information-content available for word identification varies across fixation locations and across languages, thereby explaining the overall fixation location bias in different languages, making the novel prediction that certain words are more readily identified when fixating at an atypical fixation location, and predicting specific cross-linguistic differences. We tested these predictions across several simulations in English and Hebrew, and in a pilot behavioral experiment. Results confirmed that the bias to fixate closer to word onset aligns with maximizing information in the visual signal, that some words are more readily identified at atypical fixation locations, and that these effects vary to some degree across languages.
  • Allen, S. E. M. (1998). A discourse-pragmatic explanation for the subject-object asymmetry in early null arguments. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the GALA '97 Conference on Language Acquisition (pp. 10-15). Edinburgh, UK: Edinburgh University Press.

    Abstract

    The present paper assesses discourse-pragmatic factors as a potential explanation for the subject-object assymetry in early child language. It identifies a set of factors which characterize typical situations of informativeness (Greenfield & Smith, 1976), and uses these factors to identify informative arguments in data from four children aged 2;0 through 3;6 learning Inuktitut as a first language. In addition, it assesses the extent of the links between features of informativeness on one hand and lexical vs. null and subject vs. object arguments on the other. Results suggest that a pragmatics account of the subject-object asymmetry can be upheld to a greater extent than previous research indicates, and that several of the factors characterizing informativeness are good indicators of those arguments which tend to be omitted in early child language.
  • Amatuni, A., Schroer, S. E., Zhang, Y., Peters, R. E., Reza, M. A., Crandall, D., & Yu, C. (2021). In-the-moment visual information from the infant's egocentric view determines the success of infant word learning: A computational study. In T. Fitch, C. Lamm, H. Leder, & K. Teßmar-Raible (Eds.), Proceedings of the 43rd Annual Conference of the Cognitive Science Society (CogSci 2021) (pp. 265-271). Vienna: Cognitive Science Society.

    Abstract

    Infants learn the meaning of words from accumulated experiences of real-time interactions with their caregivers. To study the effects of visual sensory input on word learning, we recorded infant's view of the world using head-mounted eye trackers during free-flowing play with a caregiver. While playing, infants were exposed to novel label-object mappings and later learning outcomes for these items were tested after the play session. In this study we use a classification based approach to link properties of infants' visual scenes during naturalistic labeling moments to their word learning outcomes. We find that a model which integrates both highly informative and ambiguous sensory evidence is a better fit to infants' individual learning outcomes than models where either type of evidence is taken alone, and that raw labeling frequency is unable to account for the word learning differences we observe. Here we demonstrate how a computational model, using only raw pixels taken from the egocentric scene image, can derive insights on human language learning.
  • Ambridge, B., Rowland, C. F., Theakston, A. L., & Twomey, K. E. (2020). Introduction. In C. F. Rowland, A. L. Theakston, B. Ambridge, & K. E. Twomey (Eds.), Current Perspectives on Child Language Acquisition: How children use their environment to learn (pp. 1-7). Amsterdam: John Benjamins. doi:10.1075/tilar.27.int.
  • Ameka, F. K. (2006). Ewe serial verb constructions in their grammatical context. In A. Y. Aikhenvald, & R. M. W. Dixon (Eds.), Serial verb constructions: A cross-linguistic typology (pp. 124-143). Oxford: Oxford University Press.
  • Ameka, F. K. (2006). Elements of the grammar of space in Ewe. In S. C. Levinson, & D. P. Wilkins (Eds.), Grammars of space: Explorations in cognitive diversity (pp. 359-399). Cambridge: Cambridge University Press.
  • Ameka, F. K., & Wilkins, D. P. (2006). Interjections. In J.-O. Ostman, & J. Verschueren (Eds.), Handbook of pragmatics (pp. 1-22). Amsterdam: Benjamins.
  • Ameka, F. K. (2006). Grammars in contact in the Volta Basin (West Africa): On contact induced grammatical change in Likpe. In A. Y. Aikhenvald, & R. M. W. Dixon (Eds.), Grammars in contact: A crosslinguistic typology (pp. 114-142). Oxford: Oxford University Press.
  • Ameka, F. K. (2006). Interjections. In K. Brown (Ed.), Encyclopedia of language & linguistics (2nd ed., pp. 743-746). Oxford: Elsevier.
  • Ameka, F. K. (2006). Real descriptions: Reflections on native speaker and non-native speaker descriptions of a language. In F. K. Ameka, A. Dench, & N. Evans (Eds.), Catching language: The standing challenge of grammar writing (pp. 69-112). Berlin: Mouton de Gruyter.
  • Ameka, F. K. (2017). The Uselessness of the Useful: Language Standardisation and Variation in Multilingual Context. In I. Tieken-Boon van Ostade, & C. Percy (Eds.), Prescription and tradition in language: Establishing standards across the time and space (pp. 71-87). Bristol: Multilingual Matters.
  • Amora, K. K., Garcia, R., & Gagarina, N. (2020). Tagalog adaptation of the Multilingual Assessment Instrument for Narratives: History, process and preliminary results. In N. Gagarina, & J. Lindgren (Eds.), New language versions of MAIN: Multilingual Assessment Instrument for Narratives – Revised (pp. 221-233).

    Abstract

    This paper briefly presents the current situation of bilingualism in the Philippines,
    specifically that of Tagalog-English bilingualism. More importantly, it describes the process of adapting the Multilingual Assessment Instrument for Narratives (LITMUS-MAIN) to Tagalog, the basis of Filipino, which is the country’s national language.
    Finally, the results of a pilot study conducted on Tagalog-English bilingual children and
    adults (N=27) are presented. The results showed that Story Structure is similar across the
    two languages and that it develops significantly with age.
  • Asano, Y., Yuan, C., Grohe, A.-K., Weber, A., Antoniou, M., & Cutler, A. (2020). Uptalk interpretation as a function of listening experience. In N. Minematsu, M. Kondo, T. Arai, & R. Hayashi (Eds.), Proceedings of Speech Prosody 2020 (pp. 735-739). Tokyo: ISCA. doi:10.21437/SpeechProsody.2020-150.

    Abstract

    The term “uptalk” describes utterance-final pitch rises that carry no sentence-structural information. Uptalk is usually dialectal or sociolectal, and Australian English (AusEng) is particularly known for this attribute. We ask here whether experience with an uptalk variety affects listeners’ ability to categorise rising pitch contours on the basis of the timing and height of their onset and offset. Listeners were two groups of English-speakers (AusEng, and American English), and three groups of listeners with L2 English: one group with Mandarin as L1 and experience of listening to AusEng, one with German as L1 and experience of listening to AusEng, and one with German as L1 but no AusEng experience. They heard nouns (e.g. flower, piano) in the framework “Got a NOUN”, each ending with a pitch rise artificially manipulated on three contrasts: low vs. high rise onset, low vs. high rise offset and early vs. late rise onset. Their task was to categorise the tokens as “question” or “statement”, and we analysed the effect of the pitch contrasts on their judgements. Only the native AusEng listeners were able to use the pitch contrasts systematically in making these categorisations.
  • Azar, Z., Backus, A., & Ozyurek, A. (2017). Highly proficient bilinguals maintain language-specific pragmatic constraints on pronouns: Evidence from speech and gesture. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 81-86). Austin, TX: Cognitive Science Society.

    Abstract

    The use of subject pronouns by bilingual speakers using both a pro-drop and a non-pro-drop language (e.g. Spanish heritage speakers in the USA) is a well-studied topic in research on cross-linguistic influence in language contact situations. Previous studies looking at bilinguals with different proficiency levels have yielded conflicting results on whether there is transfer from the non-pro-drop patterns to the pro-drop language. Additionally, previous research has focused on speech patterns only. In this paper, we study the two modalities of language, speech and gesture, and ask whether and how they reveal cross-linguistic influence on the use of subject pronouns in discourse. We focus on elicited narratives from heritage speakers of Turkish in the Netherlands, in both Turkish (pro-drop) and Dutch (non-pro-drop), as well as from monolingual control groups. The use of pronouns was not very common in monolingual Turkish narratives and was constrained by the pragmatic contexts, unlike in Dutch. Furthermore, Turkish pronouns were more likely to be accompanied by localized gestures than Dutch pronouns, presumably because pronouns in Turkish are pragmatically marked forms. We did not find any cross-linguistic influence in bilingual speech or gesture patterns, in line with studies (speech only) of highly proficient bilinguals. We therefore suggest that speech and gesture parallel each other not only in monolingual but also in bilingual production. Highly proficient heritage speakers who have been exposed to diverse linguistic and gestural patterns of each language from early on maintain monolingual patterns of pragmatic constraints on the use of pronouns multimodally.
  • Badimala, P., Mishra, C., Venkataramana, R. K. M., Bukhari, S. S., & Dengel, A. (2019). A Study of Various Text Augmentation Techniques for Relation Classification in Free Text. In Proceedings of the 8th International Conference on Pattern Recognition Applications and Methods (pp. 360-367). Setúbal, Portugal: SciTePress Digital Library. doi:10.5220/0007311003600367.

    Abstract

    Data augmentation techniques have been widely used in visual recognition tasks as it is easy to generate new
    data by simple and straight forward image transformations. However, when it comes to text data augmen-
    tations, it is difficult to find appropriate transformation techniques which also preserve the contextual and
    grammatical structure of language texts. In this paper, we explore various text data augmentation techniques
    in text space and word embedding space. We study the effect of various augmented datasets on the efficiency
    of different deep learning models for relation classification in text.
  • Barbiers, S., & Van Dooren, A. (2017). Modal Auxiliaries. In M. Everaert, & H. C. Van Riemsdijk (Eds.), The Wiley Blackwell Companion to Syntax (2nd ed.). Hoboken, NJ, USA: Wiley.

    Abstract

    In many languages modal auxiliaries such as English can, must, may, need, will, ought, want are ambiguous between two types of interpretations: epistemic and root interpretations. In the epistemic interpretation the modal expresses how likely it is that a proposition is true (for example, necessarily, possibly, probably true) while in the root interpretations the modal expresses the obligatoriness, permissibility, desirability, or possibility of a state or event. A central question in much syntactic research on modal auxiliaries has been whether this systematic semantic ambiguity corresponds to a syntactic distinction. A commonly accepted answer has been that in epistemic interpretations the modal verb is a monadic predicate while in root interpretations it is a dyadic predicate, typically a relation between a subject and an infinitival verb. This distinction between monadic and dyadic modal predicates has been modeled syntactically in various ways: (i) in terms of lexical argument structure, that is, as the distinction between raising and control verbs; (ii) in terms of different base positions in the array of functional heads making up the clausal spine, with epistemic modals being higher than root modals; (iii) in terms of a higher syntactic position for epistemically interpreted modals after raising at the level of semantic interpretation (LF raising); (iv) in terms of the nature of the complement of the modal. This chapter evaluates these proposals, drawing on data from, among others, English, Dutch, Icelandic, German, and Catalan and taking into account cross-linguistic differences in the modal systems. One important conclusion is that the alleged correspondence between the epistemic/root distinction and the raising/control distinction is too simple, as there are sentences with root interpretations but a raising syntax. The chapter ends with a list of questions for future research.
  • Bastiaansen, M. C. M., & Hagoort, P. (2006). Oscillatory neuronal dynamics during language comprehension. In C. Neuper, & W. Klimesch (Eds.), Event-related dynamics of brain oscillations (pp. 179-196). Amsterdam: Elsevier.

    Abstract

    Language comprehension involves two basic operations: the retrieval of lexical information (such as phonologic, syntactic, and semantic information) from long-term memory, and the unification of this information into a coherent representation of the overall utterance. Neuroimaging studies using hemo¬dynamic measures such as PET and fMRI have provided detailed information on which areas of the brain are involved in these language-related memory and unification operations. However, much less is known about the dynamics of the brain's language network. This chapter presents a literature review of the oscillatory neuronal dynamics of EEG and MEG data that can be observed during language comprehen¬sion tasks. From a detailed review of this (rapidly growing) literature the following picture emerges: memory retrieval operations are mostly accompanied by increased neuronal synchronization in the theta frequency range (4-7 Hz). Unification operations, in contrast, induce high-frequency neuronal synchro¬nization in the beta (12-30 Hz) and gamma (above 30 Hz) frequency bands. A desynchronization in the (upper) alpha frequency band is found for those studies that use secondary tasks, and seems to correspond with attentional processes, and with the behavioral consequences of the language comprehension process. We conclude that it is possible to capture the dynamics of the brain's language network by a careful analysis of the event-related changes in power and coherence of EEG and MEG data in a wide range of frequencies, in combination with subtle experimental manipulations in a range of language comprehension tasks. It appears then that neuronal synchrony is a mechanism by which the brain integrates the different types of information about language (such as phonological, orthographic, semantic, and syntactic infor¬mation) represented in different brain areas.
  • Bauer, B. L. M. (2020). Appositive compounds in dialectal and sociolinguistic varieties of French. In M. Maiden, & S. Wolfe (Eds.), Variation and change in Gallo-Romance (pp. 326-346). Oxford: Oxford University Press.
  • Bauer, B. L. M. (2021). Formation of numerals in the romance languages. In Oxford Research Encyclopedia of Linguistics. Oxford: Oxford University Press. doi:10.1093/acrefore/9780199384655.013.685.

    Abstract

    The Romance languages have a rich numeral system that includes cardinals—providing the bases on which the other types of numeral series are built—ordinals, fractions, collectives, approximatives, distributives, and multiplicatives. Latin plays a decisive and continued role in their formation, both as the language to which many numerals go back directly and as an ongoing source for lexemes and formatives. While the Latin numeral system was synthetic, with a distinct ending for each type of numeral, the Romance numerals often feature more than one (unevenly distributed) marker or structure per series, which feature varying degrees of inherited, borrowed, or innovative elements. Formal consistency is strongest in cardinals, followed by ordinals and then the other types of numeral, which also tend to be more analytic or periphrastic. From a morphological perspective, Romance numerals overall have moved away from the inherited syntheticity, but several series continue to be synthetic formations—at least in part—with morphological markers drawn from Latin that may have undergone functional change (e.g. distributive > ordinal > collective). The underlying syntax of Romance numerals is in line with the overall grammatical patterns of Romance languages, as reflected in the prevalence of word order (with arithmetical correlates), connectors, (partial) loss of agreement, and analyticity. Innovation is prominent in the formation of higher numerals with bases beyond ‘thousand’, of teens and decads in Romanian, and of vigesimals in numerous Romance varieties.
  • Bauer, B. L. M. (2006). ‘Synthetic’ vs. ‘analytic’ in Romance: The importance of varieties. In R. Gess, & D. Arteaga (Eds.), Historical Romance linguistics: Retrospective and perspectives (pp. 287-304). Amsterdam: Benjamins.
  • Bauer, B. L. M. (1993). L’auxiliaire roman: Le cas d'une double évolution. In R. Lorenzo (Ed.), Actas do XIX Congreso Internacional de Lingüística e Filoloxía Románicas. V. Gramática Histórica e Historia da Lingua (pp. 421-433). Coruna: Fundación Pedro Barrié de la Maza.
  • Bauer, B. L. M. (1993). The coalescence of the participle and the gerund/gerundive: An integrated change. In H. Aertsen, & R. J. Jeffers (Eds.), Historical Linguistics 1989. Papers of the 9th International Conference on Historical Linguistics (pp. 59-71). Amsterdam: Benjamins.
  • Bauer, B. L. M. (1993). The tendency towards right-branching in the development and acquisition of Latin and French. In J. Van Marle (Ed.), Historical Linguistics 1991. Papers of the 10th International Conference on Historical Linguistics (pp. 1-17). Amsterdam: Benjamins.
  • Bentum, M., Ten Bosch, L., Van den Bosch, A., & Ernestus, M. (2019). Listening with great expectations: An investigation of word form anticipations in naturalistic speech. In Proceedings of Interspeech 2019 (pp. 2265-2269). doi:10.21437/Interspeech.2019-2741.

    Abstract

    The event-related potential (ERP) component named phonological mismatch negativity (PMN) arises when listeners hear an unexpected word form in a spoken sentence [1]. The PMN is thought to reflect the mismatch between expected and perceived auditory speech input. In this paper, we use the PMN to test a central premise in the predictive coding framework [2], namely that the mismatch between prior expectations and sensory input is an important mechanism of perception. We test this with natural speech materials containing approximately 50,000 word tokens. The corresponding EEG-signal was recorded while participants (n = 48) listened to these materials. Following [3], we quantify the mismatch with two word probability distributions (WPD): a WPD based on preceding context, and a WPD that is additionally updated based on the incoming audio of the current word. We use the between-WPD cross entropy for each word in the utterances and show that a higher cross entropy correlates with a more negative PMN. Our results show that listeners anticipate auditory input while processing each word in naturalistic speech. Moreover, complementing previous research, we show that predictive language processing occurs across the whole probability spectrum.
  • Bentum, M., Ten Bosch, L., Van den Bosch, A., & Ernestus, M. (2019). Quantifying expectation modulation in human speech processing. In Proceedings of Interspeech 2019 (pp. 2270-2274). doi:10.21437/Interspeech.2019-2685.

    Abstract

    The mismatch between top-down predicted and bottom-up perceptual input is an important mechanism of perception according to the predictive coding framework (Friston, [1]). In this paper we develop and validate a new information-theoretic measure that quantifies the mismatch between expected and observed auditory input during speech processing. We argue that such a mismatch measure is useful for the study of speech processing. To compute the mismatch measure, we use naturalistic speech materials containing approximately 50,000 word tokens. For each word token we first estimate the prior word probability distribution with the aid of statistical language modelling, and next use automatic speech recognition to update this word probability distribution based on the unfolding speech signal. We validate the mismatch measure with multiple analyses, and show that the auditory-based update improves the probability of the correct word and lowers the uncertainty of the word probability distribution. Based on these results, we argue that it is possible to explicitly estimate the mismatch between predicted and perceived speech input with the cross entropy between word expectations computed before and after an auditory update.
  • Berck, P., Bibiko, H.-J., Kemps-Snijders, M., Russel, A., & Wittenburg, P. (2006). Ontology-based language archive utilization. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 2295-2298).
  • Bergmann, C., Tsuji, S., & Cristia, A. (2017). Top-down versus bottom-up theories of phonological acquisition: A big data approach. In Proceedings of Interspeech 2017 (pp. 2103-2107).

    Abstract

    Recent work has made available a number of standardized meta- analyses bearing on various aspects of infant language processing. We utilize data from two such meta-analyses (discrimination of vowel contrasts and word segmentation, i.e., recognition of word forms extracted from running speech) to assess whether the published body of empirical evidence supports a bottom-up versus a top-down theory of early phonological development by leveling the power of results from thousands of infants. We predicted that if infants can rely purely on auditory experience to develop their phonological categories, then vowel discrimination and word segmentation should develop in parallel, with the latter being potentially lagged compared to the former. However, if infants crucially rely on word form information to build their phonological categories, then development at the word level must precede the acquisition of native sound categories. Our results do not support the latter prediction. We discuss potential implications and limitations, most saliently that word forms are only one top-down level proposed to affect phonological development, with other proposals suggesting that top-down pressures emerge from lexical (i.e., word-meaning pairs) development. This investigation also highlights general procedures by which standardized meta-analyses may be reused to answer theoretical questions spanning across phenomena.

    Additional information

    Scripts and data
  • Black, A., & Bergmann, C. (2017). Quantifying infants' statistical word segmentation: A meta-analysis. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Meeting of the Cognitive Science Society (pp. 124-129). Austin, TX: Cognitive Science Society.

    Abstract

    Theories of language acquisition and perceptual learning increasingly rely on statistical learning mechanisms. The current meta-analysis aims to clarify the robustness of this capacity in infancy within the word segmentation literature. Our analysis reveals a significant, small effect size for conceptual replications of Saffran, Aslin, & Newport (1996), and a nonsignificant effect across all studies that incorporate transitional probabilities to segment words. In both conceptual replications and the broader literature, however, statistical learning is moderated by whether stimuli are naturally produced or synthesized. These findings invite deeper questions about the complex factors that influence statistical learning, and the role of statistical learning in language acquisition.
  • Bodur, K., Branje, S., Peirolo, M., Tiscareno, I., & German, J. S. (2021). Domain-initial strengthening in Turkish: Acoustic cues to prosodic hierarchy in stop consonants. In Proceedings of Interspeech 2021 (pp. 1459-1463). doi:10.21437/Interspeech.2021-2230.

    Abstract

    Studies have shown that cross-linguistically, consonants at the left edge of higher-level prosodic boundaries tend to be more forcefully articulated than those at lower-level boundaries, a phenomenon known as domain-initial strengthening. This study tests whether similar effects occur in Turkish, using the Autosegmental-Metrical model proposed by Ipek & Jun [1, 2] as the basis for assessing boundary strength. Productions of /t/ and /d/ were elicited in four domain-initial prosodic positions corresponding to progressively higher-level boundaries: syllable, word, intermediate phrase, and Intonational Phrase. A fifth position, nuclear word, was included in order to better situate it within the prosodic hierarchy. Acoustic correlates of articulatory strength were measured, including closure duration for /d/ and /t/, as well as voice onset time and burst energy for /t/. Our results show that closure duration increases cumulatively from syllable to intermediate phrase, while voice onset time and burst energy are not influenced by boundary strength. These findings provide corroborating evidence for Ipek & Jun’s model, particularly for the distinction between word and intermediate phrase boundaries. Additionally, articulatory strength at the left edge of the nuclear word patterned closely with word-initial position, supporting the view that the nuclear word is not associated with a distinct phrasing domain
  • De Boer, B., Thompson, B., Ravignani, A., & Boeckx, C. (2020). Analysis of mutation and fixation for language. In A. Ravignani, C. Barbieri, M. Flaherty, Y. Jadoul, E. Lattenkamp, H. Little, M. Martins, K. Mudd, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 13th International Conference (Evolang13) (pp. 56-58). Nijmegen: The Evolution of Language Conferences.
  • Bohnemeyer, J. (1998). Temporale Relatoren im Hispano-Yukatekischen Sprachkontakt. In A. Koechert, & T. Stolz (Eds.), Convergencia e Individualidad - Las lenguas Mayas entre hispanización e indigenismo (pp. 195-241). Hannover, Germany: Verlag für Ethnologie.
  • Bohnemeyer, J. (1998). Sententiale Topics im Yukatekischen. In Z. Dietmar (Ed.), Deskriptive Grammatik und allgemeiner Sprachvergleich (pp. 55-85). Tübingen, Germany: Max-Niemeyer-Verlag.
  • Bosker, H. R., & Kösem, A. (2017). An entrained rhythm's frequency, not phase, influences temporal sampling of speech. In Proceedings of Interspeech 2017 (pp. 2416-2420). doi:10.21437/Interspeech.2017-73.

    Abstract

    Brain oscillations have been shown to track the slow amplitude fluctuations in speech during comprehension. Moreover, there is evidence that these stimulus-induced cortical rhythms may persist even after the driving stimulus has ceased. However, how exactly this neural entrainment shapes speech perception remains debated. This behavioral study investigated whether and how the frequency and phase of an entrained rhythm would influence the temporal sampling of subsequent speech. In two behavioral experiments, participants were presented with slow and fast isochronous tone sequences, followed by Dutch target words ambiguous between as /ɑs/ “ash” (with a short vowel) and aas /a:s/ “bait” (with a long vowel). Target words were presented at various phases of the entrained rhythm. Both experiments revealed effects of the frequency of the tone sequence on target word perception: fast sequences biased listeners to more long /a:s/ responses. However, no evidence for phase effects could be discerned. These findings show that an entrained rhythm’s frequency, but not phase, influences the temporal sampling of subsequent speech. These outcomes are compatible with theories suggesting that sensory timing is evaluated relative to entrained frequency. Furthermore, they suggest that phase tracking of (syllabic) rhythms by theta oscillations plays a limited role in speech parsing.
  • Bosker, H. R. (2021). The contribution of amplitude modulations in speech to perceived charisma. In B. Weiss, J. Trouvain, M. Barkat-Defradas, & J. J. Ohala (Eds.), Voice attractiveness: Prosody, phonology and phonetics (pp. 165-181). Singapore: Springer. doi:10.1007/978-981-15-6627-1_10.

    Abstract

    Speech contains pronounced amplitude modulations in the 1–9 Hz range, correlating with the syllabic rate of speech. Recent models of speech perception propose that this rhythmic nature of speech is central to speech recognition and has beneficial effects on language processing. Here, we investigated the contribution of amplitude modulations to the subjective impression listeners have of public speakers. The speech from US presidential candidates Hillary Clinton and Donald Trump in the three TV debates of 2016 was acoustically analyzed by means of modulation spectra. These indicated that Clinton’s speech had more pronounced amplitude modulations than Trump’s speech, particularly in the 1–9 Hz range. A subsequent perception experiment, with listeners rating the perceived charisma of (low-pass filtered versions of) Clinton’s and Trump’s speech, showed that more pronounced amplitude modulations (i.e., more ‘rhythmic’ speech) increased perceived charisma ratings. These outcomes highlight the important contribution of speech rhythm to charisma perception.
  • Bosker, H. R. (2017). The role of temporal amplitude modulations in the political arena: Hillary Clinton vs. Donald Trump. In Proceedings of Interspeech 2017 (pp. 2228-2232). doi:10.21437/Interspeech.2017-142.

    Abstract

    Speech is an acoustic signal with inherent amplitude modulations in the 1-9 Hz range. Recent models of speech perception propose that this rhythmic nature of speech is central to speech recognition. Moreover, rhythmic amplitude modulations have been shown to have beneficial effects on language processing and the subjective impression listeners have of the speaker. This study investigated the role of amplitude modulations in the political arena by comparing the speech produced by Hillary Clinton and Donald Trump in the three presidential debates of 2016. Inspection of the modulation spectra, revealing the spectral content of the two speakers’ amplitude envelopes after matching for overall intensity, showed considerably greater power in Clinton’s modulation spectra (compared to Trump’s) across the three debates, particularly in the 1-9 Hz range. The findings suggest that Clinton’s speech had a more pronounced temporal envelope with rhythmic amplitude modulations below 9 Hz, with a preference for modulations around 3 Hz. This may be taken as evidence for a more structured temporal organization of syllables in Clinton’s speech, potentially due to more frequent use of preplanned utterances. Outcomes are interpreted in light of the potential beneficial effects of a rhythmic temporal envelope on intelligibility and speaker perception.
  • Botelho da Silva, T., & Cutler, A. (1993). Ill-formedness and transformability in Portuguese idioms. In C. Cacciari, & P. Tabossi (Eds.), Idioms: Processing, structure and interpretation (pp. 129-143). Hillsdale, NJ: Erlbaum.
  • Bowerman, M. (1982). Reorganizational processes in lexical and syntactic development. In E. Wanner, & L. Gleitman (Eds.), Language acquisition: The state of the art (pp. 319-346). New York: Academic Press.
  • Bowerman, M. (1982). Starting to talk worse: Clues to language acquisition from children's late speech errors. In S. Strauss (Ed.), U shaped behavioral growth (pp. 101-145). New York: Academic Press.
  • Bowerman, M. (1993). Typological perspectives on language acquisition: Do crosslinguistic patterns predict development? In E. V. Clark (Ed.), Proceedings of the Twenty-fifth Annual Child Language Research Forum (pp. 7-15). Stanford, CA: Center for the Study of Language and Information.
  • Brehm, L., Jackson, C. N., & Miller, K. L. (2019). Incremental interpretation in the first and second language. In M. Brown, & B. Dailey (Eds.), BUCLD 43: Proceedings of the 43rd annual Boston University Conference on Language Development (pp. 109-122). Sommerville, MA: Cascadilla Press.
  • Broeder, D., Offenga, F., Wittenburg, P., Van de Kamp, P., Nathan, D., & Strömqvist, S. (2006). Technologies for a federation of language resource archive. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 2291-2294).
  • Broeder, D., Van Veenendaal, R., Nathan, D., & Strömqvist, S. (2006). A grid of language resource repositories. In Proceedings of the 2nd IEEE International Conference on e-Science and Grid Computing.
  • Broeder, D., Claus, A., Offenga, F., Skiba, R., Trilsbeek, P., & Wittenburg, P. (2006). LAMUS: The Language Archive Management and Upload System. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 2291-2294).
  • Broersma, M. (2006). Nonnative listeners rely less on phonetic information for phonetic categorization than native listeners. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 109-110).
  • Broersma, M. (2006). Accident - execute: Increased activation in nonnative listening. In Proceedings of Interspeech 2006 (pp. 1519-1522).

    Abstract

    Dutch and English listeners’ perception of English words with partially overlapping onsets (e.g., accident- execute) was investigated. Partially overlapping words remained active longer for nonnative listeners, causing an increase of lexical competition in nonnative compared with native listening.
  • Brown, P. (1998). Early Tzeltal verbs: Argument structure and argument representation. In E. Clark (Ed.), Proceedings of the 29th Annual Stanford Child Language Research Forum (pp. 129-140). Stanford: CSLI Publications.

    Abstract

    The surge of research activity focussing on children's acquisition of verbs (e.g., Tomasello and Merriman 1996) addresses some fundamental questions: Just how variable across languages, and across individual children, is the process of verb learning? How specific are arguments to particular verbs in early child language? How does the grammatical category 'Verb' develop? The position of Universal Grammar, that a verb category is early, contrasts with that of Tomasello (1992), Pine and Lieven and their colleagues (1996, in press), and many others, that children develop a verb category slowly, gradually building up subcategorizations of verbs around pragmatic, syntactic, and semantic properties of the language they are exposed to. On this latter view, one would expect the language which the child is learning, the cultural milieu and the nature of the interactions in which the child is engaged, to influence the process of acquiring verb argument structures. This paper explores these issues by examining the development of argument representation in the Mayan language Tzeltal, in both its lexical and verbal cross-referencing forms, and analyzing the semantic and pragmatic factors influencing the form argument representation takes. Certain facts about Tzeltal (the ergative/ absolutive marking, the semantic specificity of transitive and positional verbs) are proposed to affect the representation of arguments. The first 500 multimorpheme combinations of 3 children (aged between 1;8 and 2;4) are examined. It is argued that there is no evidence of semantically light 'pathbreaking' verbs (Ninio 1996) leading the way into word combinations. There is early productivity of cross-referencing affixes marking A, S, and O arguments (although there are systematic omissions). The paper assesses the respective contributions of three kinds of factors to these results - structural (regular morphology), semantic (verb specificity) and pragmatic (the nature of Tzeltal conversational interaction).
  • Brown, P. (2006). Cognitive anthropology. In C. Jourdan, & K. Tuite (Eds.), Language, culture and society: Key topics in linguistic anthropology (pp. 96-114). Cambridge University Press.

    Abstract

    This is an appropriate moment to review the state of the art in cognitive anthropology, construed broadly as the comparative study of human cognition in its linguistic and cultural context. In reaction to the dominance of universalism in the 1970s and '80s, there have recently been a number of reappraisals of the relation between language and cognition, and the field of cognitive anthropology is flourishing in several new directions in both America and Europe. This is partly due to a renewal and re-evaluation of approaches to the question of linguistic relativity associated with Whorf, and partly to the inspiration of modern developments in cognitive science. This review briefly sketches the history of cognitive anthropology and surveys current research on both sides of the Atlantic. The focus is on assessing current directions, considering in particular, by way of illustration, recent work in cultural models and on spatial language and cognition. The review concludes with an assessment of how cognitive anthropology could contribute directly both to the broader project of cognitive science and to the anthropological study of how cultural ideas and practices relate to structures and processes of human cognition.
  • Brown, P. (2006). A sketch of the grammar of space in Tzeltal. In S. C. Levinson, & D. P. Wilkins (Eds.), Grammars of space: Explorations in cognitive diversity (pp. 230-272). Cambridge: Cambridge University Press.

    Abstract

    This paper surveys the lexical and grammatical resources for talking about spatial relations in the Mayan language Tzeltal - for describing where things are located, where they are moving, and how they are distributed in space. Six basic sets of spatial vocabulary are presented: i. existential locative expressions with ay ‘exist’, ii. deictics (demonstratives, adverbs, presentationals), iii. dispositional adjectives, often in combination with (iv) and (v), iv. body part relational noun locatives, v. absolute (‘cardinal’) directions, and vi. motion verbs, directionals and auxiliaries. The first two are used in minimal locative descriptions, while the others constitute the core resources for specifying in detail the location, disposition, orientation, or motion of a Figure in relation to a Ground. We find that Tzeltal displays a relative de-emphasis on deixis and left/right asymmetry, and a detailed attention to the spatial properties of objects.
  • Brown, P. (1993). Gender, politeness and confrontation in Tenejapa [reprint]. In D. Tannen (Ed.), Gender and conversational interaction (pp. 144-164). New York: Oxford University Press.

    Abstract

    This is a reprint of Brown 1990.
  • Brown, P. (1998). How and why are women more polite: Some evidence from a Mayan community. In J. Coates (Ed.), Language and gender (pp. 81-99). Oxford: Blackwell.
  • Brown, P. (2017). Politeness and impoliteness. In Y. Huang (Ed.), Oxford handbook of pragmatics (pp. 383-399). Oxford: Oxford University Press. doi:10.1093/oxfordhb/9780199697960.013.16.

    Abstract

    This article selectively reviews the literature on politeness across different disciplines—linguistics, anthropology, communications, conversation analysis, social psychology, and sociology—and critically assesses how both theoretical approaches to politeness and research on linguistic politeness phenomena have evolved over the past forty years. Major new developments include a shift from predominantly linguistic approaches to those examining politeness and impoliteness as processes that are embedded and negotiated in interactional and cultural contexts, as well as a greater focus on how both politeness and interactional confrontation and conflict fit into our developing understanding of human cooperation and universal aspects of human social interaction.

    Files private

    Request files
  • Brown, P., & Levinson, S. C. (1998). Politeness, introduction to the reissue: A review of recent work. In A. Kasher (Ed.), Pragmatics: Vol. 6 Grammar, psychology and sociology (pp. 488-554). London: Routledge.

    Abstract

    This article is a reprint of chapter 1, the introduction to Brown and Levinson, 1987, Politeness: Some universals in language usage (Cambridge University Press).
  • Brown, P. (1993). The role of shape in the acquisition of Tzeltal (Mayan) locatives. In E. V. Clark (Ed.), Proceedings of the 25th Annual Child Language Research Forum (pp. 211-220). Stanford, CA: CSLI/University of Chicago Press.

    Abstract

    In a critique of the current state of theories of language acquisition, Bowerman (1985) has argued forcibly for the need to take crosslinguistic variation in semantic structure seriously, in order to understand children's acquisition of semantic categories in the process of learning their language. The semantics of locative expressions in the Mayan language Tzeltal exemplifies this point, for no existing theory of spatial expressions provides an adequate basis for capturing the semantic structure of spatial description in this Mayan language. In this paper I describe some of the characteristics of Tzeltal locative descriptions, as a contribution to the growing body of data on crosslinguistic variation in this domain and as a prod to ideas about acquisition processes, confining myself to the topological notions of 'on' and 'in', and asking whether, and how, these notions are involved in the semantic distinctions underlying Tzeltal locatives.
  • Bruggeman, L., & Cutler, A. (2019). The dynamics of lexical activation and competition in bilinguals’ first versus second language. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1342-1346). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Speech input causes listeners to activate multiple
    candidate words which then compete with one
    another. These include onset competitors, that share a
    beginning (bumper, butter), but also, counterintuitively,
    rhyme competitors, sharing an ending
    (bumper, jumper). In L1, competition is typically
    stronger for onset than for rhyme. In L2, onset
    competition has been attested but rhyme competition
    has heretofore remained largely unexamined. We
    assessed L1 (Dutch) and L2 (English) word
    recognition by the same late-bilingual individuals. In
    each language, eye gaze was recorded as listeners
    heard sentences and viewed sets of drawings: three
    unrelated, one depicting an onset or rhyme competitor
    of a word in the input. Activation patterns revealed
    substantial onset competition but no significant
    rhyme competition in either L1 or L2. Rhyme
    competition may thus be a “luxury” feature of
    maximally efficient listening, to be abandoned when
    resources are scarcer, as in listening by late
    bilinguals, in either language.
  • Brugman, H., Malaisé, V., & Gazendam, L. (2006). A web based general thesaurus browser to support indexing of television and radio programs. In Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC 2006) (pp. 1488-1491).
  • Budwig, N., Narasimhan, B., & Srivastava, S. (2006). Interim solutions: The acquisition of early constructions in Hindi. In E. Clark, & B. Kelly (Eds.), Constructions in acquisition (pp. 163-185). Stanford: CSLI Publications.
  • Wu, D. H., & Bulut, T. (2020). The contribution of statistical learning to language and literacy acquisition. In K. D. Federmeier, & H. W. Huang (Eds.), Psychology of Learning and Motivation (pp. 283-318). doi:10.1016/bs.plm.2020.02.001.

    Abstract

    Acquisition and processing of written and spoken language is an impressive cognitive accomplishment considering the complexity of the tasks. While only humans seem to have evolved to the fullest extent the capacity that underpins these remarkable feats of development and civilization, the exact nature of such capacity has been subject to ongoing research. In this chapter, we focus on language competence and what makes it unique among the communication systems of different species. We then elaborate on the classical debate between nativist and environmentalist accounts of language acquisition, with reference to evidence for and against the critical period hypothesis. After introducing the regularity embedded in different languages and particularly in drastically different orthographies, we present behavioral and neurophysiological evidence for the sensitivity to systematic mapping between orthography and phonology. Because learning to read is to master such mapping, we assume that the ability to use statistical learning to appreciate the dependency among items would contribute to literacy acquisition. Empirical results from behavioral and neuroimaging experiments conducted in our and other laboratories provide support for the close link between statistical learning and literacy acquisition in native and foreign language. Such findings highlight the significance of domain-general statistical learning to domain-specific language acquisition, and point to an important direction for theories and practices of language education.

    Files private

    Request files
  • Burchfield, L. A., Luk, S.-.-H.-K., Antoniou, M., & Cutler, A. (2017). Lexically guided perceptual learning in Mandarin Chinese. In Proceedings of Interspeech 2017 (pp. 576-580). doi:10.21437/Interspeech.2017-618.

    Abstract

    Lexically guided perceptual learni ng refers to the use of lexical knowledge to retune sp eech categories and thereby adapt to a novel talker’s pronunciation. This adaptation has been extensively documented, but primarily for segmental-based learning in English and Dutch. In languages with lexical tone, such as Mandarin Chinese, tonal categories can also be retuned in this way, but segmental category retuning had not been studied. We report two experiment s in which Mandarin Chinese listeners were exposed to an ambiguous mixture of [f] and [s] in lexical contexts favoring an interpretation as either [f] or [s]. Listeners were subsequently more likely to identify sounds along a continuum between [f] and [s], and to interpret minimal word pairs, in a manner consistent with this exposure. Thus lexically guided perceptual learning of segmental categories had indeed taken place, consistent with suggestions that such learning may be a universally available adaptation process
  • Burenhult, N. (2020). Foraging and the history of languages in the Malay Peninsula. In T. Güldemann, P. McConvell, & R. Rhodes (Eds.), The language of Hunter-Gatherers (pp. 164-197). Cambridge: Cambridge University Press.
  • Burenkova, O. V., & Fisher, S. E. (2019). Genetic insights into the neurobiology of speech and language. In E. Grigorenko, Y. Shtyrov, & P. McCardle (Eds.), All About Language: Science, Theory, and Practice. Baltimore, MD: Paul Brookes Publishing, Inc.
  • Cabrelli, J., Chaouch-Orozco, A., González Alonso, J., Pereira Soares, S. M., Puig-Mayenco, E., & Rothman, J. (2023). Introduction - Multilingualism: Language, brain, and cognition. In J. Cabrelli, A. Chaouch-Orozco, J. González Alonso, S. M. Pereira Soares, E. Puig-Mayenco, & J. Rothman (Eds.), The Cambridge handbook of third language acquisition (pp. 1-20). Cambridge: Cambridge University Press. doi:10.1017/9781108957823.001.

    Abstract

    This chapter provides an introduction to the handbook. It succintly overviews the key questions in the field of L3/Ln acquisition and summarizes the scope of all the chapters included. The chapter ends by raising some outstanding questions that the field needs to address.
  • Caplan, S., Peng, M. Z., Zhang, Y., & Yu, C. (2023). Using an Egocentric Human Simulation Paradigm to quantify referential and semantic ambiguity in early word learning. In M. Goldwater, F. K. Anggoro, B. K. Hayes, & D. C. Ong (Eds.), Proceedings of the 45th Annual Meeting of the Cognitive Science Society (CogSci 2023) (pp. 1043-1049).

    Abstract

    In order to understand early word learning we need to better understand and quantify properties of the input that young children receive. We extended the human simulation paradigm (HSP) using egocentric videos taken from infant head-mounted cameras. The videos were further annotated with gaze information indicating in-the-moment visual attention from the infant. Our new HSP prompted participants for two types of responses, thus differentiating referential from semantic ambiguity in the learning input. Consistent with findings on visual attention in word learning, we find a strongly bimodal distribution over HSP accuracy. Even in this open-ended task, most videos only lead to a small handful of common responses. What's more, referential ambiguity was the key bottleneck to performance: participants can nearly always recover the exact word that was said if they identify the correct referent. Finally, analysis shows that adult learners relied on particular, multimodal behavioral cues to infer those target referents.
  • Casillas, M., Bergelson, E., Warlaumont, A. S., Cristia, A., Soderstrom, M., VanDam, M., & Sloetjes, H. (2017). A New Workflow for Semi-automatized Annotations: Tests with Long-Form Naturalistic Recordings of Childrens Language Environments. In Proceedings of Interspeech 2017 (pp. 2098-2102). doi:10.21437/Interspeech.2017-1418.

    Abstract

    Interoperable annotation formats are fundamental to the utility, expansion, and sustainability of collective data repositories.In language development research, shared annotation schemes have been critical to facilitating the transition from raw acoustic data to searchable, structured corpora. Current schemes typically require comprehensive and manual annotation of utterance boundaries and orthographic speech content, with an additional, optional range of tags of interest. These schemes have been enormously successful for datasets on the scale of dozens of recording hours but are untenable for long-format recording corpora, which routinely contain hundreds to thousands of audio hours. Long-format corpora would benefit greatly from (semi-)automated analyses, both on the earliest steps of annotation—voice activity detection, utterance segmentation, and speaker diarization—as well as later steps—e.g., classification-based codes such as child-vs-adult-directed speech, and speech recognition to produce phonetic/orthographic representations. We present an annotation workflow specifically designed for long-format corpora which can be tailored by individual researchers and which interfaces with the current dominant scheme for short-format recordings. The workflow allows semi-automated annotation and analyses at higher linguistic levels. We give one example of how the workflow has been successfully implemented in a large cross-database project.
  • Casillas, M., & Hilbrink, E. (2020). Communicative act development. In K. P. Schneider, & E. Ifantidou (Eds.), Developmental and Clinical Pragmatics (pp. 61-88). Berlin: De Gruyter Mouton.

    Abstract

    How do children learn to map linguistic forms onto their intended meanings? This chapter begins with an introduction to some theoretical and analytical tools used to study communicative acts. It then turns to communicative act development in spoken and signed language acquisition, including both the early scaffolding and production of communicative acts (both non-verbal and verbal) as well as their later links to linguistic development and Theory of Mind. The chapter wraps up by linking research on communicative act development to the acquisition of conversational skills, cross-linguistic and individual differences in communicative experience during development, and human evolution. Along the way, it also poses a few open questions for future research in this domain.
  • Casillas, M., Amatuni, A., Seidl, A., Soderstrom, M., Warlaumont, A., & Bergelson, E. (2017). What do Babies hear? Analyses of Child- and Adult-Directed Speech. In Proceedings of Interspeech 2017 (pp. 2093-2097). doi:10.21437/Interspeech.2017-1409.

    Abstract

    Child-directed speech is argued to facilitate language development, and is found cross-linguistically and cross-culturally to varying degrees. However, previous research has generally focused on short samples of child-caregiver interaction, often in the lab or with experimenters present. We test the generalizability of this phenomenon with an initial descriptive analysis of the speech heard by young children in a large, unique collection of naturalistic, daylong home recordings. Trained annotators coded automatically-detected adult speech 'utterances' from 61 homes across 4 North American cities, gathered from children (age 2-24 months) wearing audio recorders during a typical day. Coders marked the speaker gender (male/female) and intended addressee (child/adult), yielding 10,886 addressee and gender tags from 2,523 minutes of audio (cf. HB-CHAAC Interspeech ComParE challenge; Schuller et al., in press). Automated speaker-diarization (LENA) incorrectly gender-tagged 30% of male adult utterances, compared to manually-coded consensus. Furthermore, we find effects of SES and gender on child-directed and overall speech, increasing child-directed speech with child age, and interactions of speaker gender, child gender, and child age: female caretakers increased their child-directed speech more with age than male caretakers did, but only for male infants. Implications for language acquisition and existing classification algorithms are discussed.
  • Chen, J. (2006). The acquisition of verb compounding in Mandarin. In E. V. Clark, & B. F. Kelly (Eds.), Constructions in acquisition (pp. 111-136). Stanford: CSLI Publications.
  • Chen, Y., & Braun, B. (2006). Prosodic realization in information structure categories in standard Chinese. In R. Hoffmann, & H. Mixdorff (Eds.), Speech Prosody 2006. Dresden: TUD Press.

    Abstract

    This paper investigates the prosodic realization of information
    structure categories in Standard Chinese. A number of proper
    names with different tonal combinations were elicited as a
    grammatical subject in five pragmatic contexts. Results show
    that both duration and F0 range of the tonal realizations were
    adjusted to signal the information structure categories (i.e.
    theme vs. rheme and background vs. focus). Rhemes
    consistently induced a longer duration and a more expanded F0
    range than themes. Focus, compared to background, generally
    induced lengthening and F0 range expansion (the presence and
    magnitude of which, however, are dependent on the tonal
    structure of the proper names). Within the rheme focus
    condition, corrective rheme focus induced more expanded F0
    range than normal rheme focus.
  • Chen, A. (2006). Variations in the marking of focus in child language. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 113-114).
  • Chen, A. (2006). Interface between information structure and intonation in Dutch wh-questions. In R. Hoffmann, & H. Mixdorff (Eds.), Speech Prosody 2006. Dresden: TUD Press.

    Abstract

    This study set out to investigate how accent placement is pragmatically governed in WH-questions. Central to this issue are questions such as whether the intonation of the WH-word depends on the information structure of the non-WH word part, whether topical constituents can be accented, and whether constituents in the non-WH word part can be non-topical and accented. Previous approaches, based either on carefully composed examples or on read speech, differ in their treatments of these questions and consequently make opposing claims on the intonation of WH-questions. We addressed these questions by examining a corpus of 90 naturally occurring WH-questions, selected from the Spoken Dutch Corpus. Results show that the intonation of the WH-word is related to the information structure of the non-WH word part. Further, topical constituents can get accented and the accents are not necessarily phonetically reduced. Additionally, certain adverbs, which have no topical relation to the presupposition of the WH-questions, also get accented. They appear to function as a device for enhancing speaker engagement.
  • Chevrefils, L., Morgenstern, A., Beaupoil-Hourdel, P., Bedoin, D., Caët, S., Danet, C., Danino, C., De Pontonx, S., & Parisse, C. (2023). Coordinating eating and languaging: The choreography of speech, sign, gesture and action in family dinners. In W. Pouw, J. Trujillo, H. R. Bosker, L. Drijvers, M. Hoetjes, J. Holler, S. Kadava, L. Van Maastricht, E. Mamus, & A. Ozyurek (Eds.), Gesture and Speech in Interaction (GeSpIn) Conference. doi:10.17617/2.3527183.

    Abstract

    In this study, we analyze one French signing and one French speaking family’s interaction during dinner. The families composed of two parents and two children aged 3 to 11 were filmed with three cameras to capture all family members’ behaviors. The three videos per dinner were synchronized and coded on ELAN. We annotated all participants’ acting, and languaging.
    Our quantitative analyses show how family members collaboratively manage multiple streams of activity through the embodied performances of dining and interacting. We uncover different profiles according to participants’ modality of expression and status (focusing on the mother and the younger child). The hearing participants’ co-activity management illustrates their monitoring of dining and conversing and how they progressively master the affordances of the visual and vocal channels to maintain the simultaneity of the two activities. The deaf mother skillfully manages to alternate smoothly between dining and interacting. The deaf younger child manifests how she is in the process of developing her skills to manage multi-activity. Our qualitative analyses focus on the ecology of visual-gestural and audio-vocal languaging in the context of co-activity according to language and participant. We open new perspectives on the management of gaze and body parts in multimodal languaging.
  • Collins, J. (2017). Real and spurious correlations involving tonal languages. In N. J. Enfield (Ed.), Dependencies in language: On the causal ontology of linguistics systems (pp. 129-139). Berlin: Language Science Press.
  • Coopmans, C. W., De Hoop, H., Kaushik, K., Hagoort, P., & Martin, A. E. (2021). Structure-(in)dependent interpretation of phrases in humans and LSTMs. In Proceedings of the Society for Computation in Linguistics (SCiL 2021) (pp. 459-463).

    Abstract

    In this study, we compared the performance of a long short-term memory (LSTM) neural network to the behavior of human participants on a language task that requires hierarchically structured knowledge. We show that humans interpret ambiguous noun phrases, such as second blue ball, in line with their hierarchical constituent structure. LSTMs, instead, only do
    so after unambiguous training, and they do not systematically generalize to novel items. Overall, the results of our simulations indicate that a model can behave hierarchically without relying on hierarchical constituent structure.
  • Corps, R. E. (2023). What do we know about the mechanisms of response planning in dialog? In Psychology of Learning and Motivation (pp. 41-81). doi:10.1016/bs.plm.2023.02.002.

    Abstract

    During dialog, interlocutors take turns at speaking with little gap or overlap between their contributions. But language production in monolog is comparatively slow. Theories of dialog tend to agree that interlocutors manage these timing demands by planning a response early, before the current speaker reaches the end of their turn. In the first half of this chapter, I review experimental research supporting these theories. But this research also suggests that planning a response early, while simultaneously comprehending, is difficult. Does response planning need to be this difficult during dialog? In other words, is early-planning always necessary? In the second half of this chapter, I discuss research that suggests the answer to this question is no. In particular, corpora of natural conversation demonstrate that speakers do not directly respond to the immediately preceding utterance of their partner—instead, they continue an utterance they produced earlier. This parallel talk likely occurs because speakers are highly incremental and plan only part of their utterance before speaking, leading to pauses, hesitations, and disfluencies. As a result, speakers do not need to engage in extensive advance planning. Thus, laboratory studies do not provide a full picture of language production in dialog, and further research using naturalistic tasks is needed.
  • Crago, M. B., & Allen, S. E. M. (1998). Acquiring Inuktitut. In O. L. Taylor, & L. Leonard (Eds.), Language Acquisition Across North America: Cross-Cultural And Cross-Linguistic Perspectives (pp. 245-279). San Diego, CA, USA: Singular Publishing Group, Inc.
  • Crago, M. B., Allen, S. E. M., & Pesco, D. (1998). Issues of Complexity in Inuktitut and English Child Directed Speech. In Proceedings of the twenty-ninth Annual Stanford Child Language Research Forum (pp. 37-46).
  • Crasborn, O., Sloetjes, H., Auer, E., & Wittenburg, P. (2006). Combining video and numeric data in the analysis of sign languages with the ELAN annotation software. In C. Vetoori (Ed.), Proceedings of the 2nd Workshop on the Representation and Processing of Sign languages: Lexicographic matters and didactic scenarios (pp. 82-87). Paris: ELRA.

    Abstract

    This paper describes hardware and software that can be used for the phonetic study of sign languages. The field of sign language phonetics is characterised, and the hardware that is currently in use is described. The paper focuses on the software that was developed to enable the recording of finger and hand movement data, and the additions to the ELAN annotation software that facilitate the further visualisation and analysis of the data.
  • Creemers, A. (2023). Morphological processing in spoken-word recognition. In D. Crepaldi (Ed.), Linguistic morphology in the mind and brain (pp. 50-64). New York: Routledge.

    Abstract

    Most psycholinguistic studies on morphological processing have examined the role of morphological structure in the visual modality. This chapter discusses morphological processing in the auditory modality, which is an area of research that has only recently received more attention. It first discusses why results in the visual modality cannot straightforwardly be applied to the processing of spoken words, stressing the importance of acknowledging potential modality effects. It then gives a brief overview of the existing research on the role of morphology in the auditory modality, for which an increasing number of studies report that listeners show sensitivity to morphological structure. Finally, the chapter highlights insights gained by looking at morphological processing not only in reading, but also in listening, and it discusses directions for future research
  • Cutler, A. (2006). Rudolf Meringer. In K. Brown (Ed.), Encyclopedia of Language and Linguistics (vol. 8) (pp. 12-13). Amsterdam: Elsevier.

    Abstract

    Rudolf Meringer (1859–1931), Indo-European philologist, published two collections of slips of the tongue, annotated and interpreted. From 1909, he was the founding editor of the cultural morphology movement's journal Wörter und Sachen. Meringer was the first to note the linguistic significance of speech errors, and his interpretations have stood the test of time. This work, rather than his mainstream philological research, has proven his most lasting linguistic contribution
  • Cutler, A., Kim, J., & Otake, T. (2006). On the limits of L1 influence on non-L1 listening: Evidence from Japanese perception of Korean. In P. Warren, & C. I. Watson (Eds.), Proceedings of the 11th Australian International Conference on Speech Science & Technology (pp. 106-111).

    Abstract

    Language-specific procedures which are efficient for listening to the L1 may be applied to non-native spoken input, often to the detriment of successful listening. However, such misapplications of L1-based listening do not always happen. We propose, based on the results from two experiments in which Japanese listeners detected target sequences in spoken Korean, that an L1 procedure is only triggered if requisite L1 features are present in the input.
  • Cutler, A. (2006). Van spraak naar woorden in een tweede taal. In J. Morais, & G. d'Ydewalle (Eds.), Bilingualism and Second Language Acquisition (pp. 39-54). Brussels: Koninklijke Vlaamse Academie van België voor Wetenschappen en Kunsten.
  • Cutler, A., Burchfield, A., & Antoniou, M. (2019). A criterial interlocutor tally for successful talker adaptation? In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (Eds.), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1485-1489). Canberra, Australia: Australasian Speech Science and Technology Association Inc.

    Abstract

    Part of the remarkable efficiency of listening is
    accommodation to unfamiliar talkers’ specific
    pronunciations by retuning of phonemic intercategory
    boundaries. Such retuning occurs in second
    (L2) as well as first language (L1); however, recent
    research with emigrés revealed successful adaptation
    in the environmental L2 but, unprecedentedly, not in
    L1 despite continuing L1 use. A possible explanation
    involving relative exposure to novel talkers is here
    tested in heritage language users with Mandarin as
    family L1 and English as environmental language. In
    English, exposure to an ambiguous sound in
    disambiguating word contexts prompted the expected
    adjustment of phonemic boundaries in subsequent
    categorisation. However, no adjustment occurred in
    Mandarin, again despite regular use. Participants
    reported highly asymmetric interlocutor counts in the
    two languages. We conclude that successful retuning
    ability requires regular exposure to novel talkers in
    the language in question, a criterion not met for the
    emigrés’ or for these heritage users’ L1.
  • Cutler, A., & Jesse, A. (2021). Word stress in speech perception. In J. S. Pardo, L. C. Nygaard, & D. B. Pisoni (Eds.), The handbook of speech perception (2nd ed., pp. 239-265). Chichester: Wiley.
  • Cutler, A., & Otake, T. (1998). Assimilation of place in Japanese and Dutch. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: vol. 5 (pp. 1751-1754). Sydney: ICLSP.

    Abstract

    Assimilation of place of articulation across a nasal and a following stop consonant is obligatory in Japanese, but not in Dutch. In four experiments the processing of assimilated forms by speakers of Japanese and Dutch was compared, using a task in which listeners blended pseudo-word pairs such as ranga-serupa. An assimilated blend of this pair would be rampa, an unassimilated blend rangpa. Japanese listeners produced significantly more assimilated than unassimilated forms, both with pseudo-Japanese and pseudo-Dutch materials, while Dutch listeners produced significantly more unassimilated than assimilated forms in each materials set. This suggests that Japanese listeners, whose native-language phonology involves obligatory assimilation constraints, represent the assimilated nasals in nasal-stop sequences as unmarked for place of articulation, while Dutch listeners, who are accustomed to hearing unassimilated forms, represent the same nasal segments as marked for place of articulation.
  • Cutler, A. (2017). Converging evidence for abstract phonological knowledge in speech processing. In G. Gunzelmann, A. Howes, T. Tenbrink, & E. Davelaar (Eds.), Proceedings of the 39th Annual Conference of the Cognitive Science Society (CogSci 2017) (pp. 1447-1448). Austin, TX: Cognitive Science Society.

    Abstract

    The perceptual processing of speech is a constant interplay of multiple competing albeit convergent processes: acoustic input vs. higher-level representations, universal mechanisms vs. language-specific, veridical traces of speech experience vs. construction and activation of abstract representations. The present summary concerns the third of these issues. The ability to generalise across experience and to deal with resulting abstractions is the hallmark of human cognition, visible even in early infancy. In speech processing, abstract representations play a necessary role in both production and perception. New sorts of evidence are now informing our understanding of the breadth of this role.
  • Cutler, A., & Pasveer, D. (2006). Explaining cross-linguistic differences in effects of lexical stress on spoken-word recognition. In R. Hoffmann, & H. Mixdorff (Eds.), Speech Prosody 2006. Dresden: TUD press.

    Abstract

    Experiments have revealed differences across languages in listeners’ use of stress information in recognising spoken words. Previous comparisons of the vocabulary of Spanish and English had suggested that the explanation of this asymmetry might lie in the extent to which considering stress in spokenword recognition allows rejection of unwanted competition from words embedded in other words. This hypothesis was tested on the vocabularies of Dutch and German, for which word recognition results resemble those from Spanish more than those from English. The vocabulary statistics likewise revealed that in each language, the reduction of embeddings resulting from taking stress into account is more similar to the reduction achieved in Spanish than in English.
  • Cutler, A., Eisner, F., McQueen, J. M., & Norris, D. (2006). Coping with speaker-related variation via abstract phonemic categories. In Variation, detail and representation: 10th Conference on Laboratory Phonology (pp. 31-32).
  • Ip, M. H. K., & Cutler, A. (2017). Intonation facilitates prediction of focus even in the presence of lexical tones. In Proceedings of Interspeech 2017 (pp. 1218-1222). doi:10.21437/Interspeech.2017-264.

    Abstract

    In English and Dutch, listeners entrain to prosodic contours to predict where focus will fall in an utterance. However, is this strategy universally available, even in languages with different phonological systems? In a phoneme detection experiment, we examined whether prosodic entrainment is also found in Mandarin Chinese, a tone language, where in principle the use of pitch for lexical identity may take precedence over the use of pitch cues to salience. Consistent with the results from Germanic languages, response times were facilitated when preceding intonation predicted accent on the target-bearing word. Acoustic analyses revealed greater F0 range in the preceding intonation of the predicted-accent sentences. These findings have implications for how universal and language-specific mechanisms interact in the processing of salience.
  • Cutler, A. (1998). How listeners find the right words. In Proceedings of the Sixteenth International Congress on Acoustics: Vol. 2 (pp. 1377-1380). Melville, NY: Acoustical Society of America.

    Abstract

    Languages contain tens of thousands of words, but these are constructed from a tiny handful of phonetic elements. Consequently, words resemble one another, or can be embedded within one another, a coup stick snot with standing. me process of spoken-word recognition by human listeners involves activation of multiple word candidates consistent with the input, and direct competition between activated candidate words. Further, human listeners are sensitive, at an early, prelexical, stage of speeeh processing, to constraints on what could potentially be a word of the language.
  • Cutler, A. (1993). Language-specific processing: Does the evidence converge? In G. T. Altmann, & R. C. Shillcock (Eds.), Cognitive models of speech processing: The Sperlonga Meeting II (pp. 115-123). Hillsdale, NJ: Erlbaum.
  • Cutler, A., Treiman, R., & Van Ooijen, B. (1998). Orthografik inkoncistensy ephekts in foneme detektion? In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2783-2786). Sydney: ICSLP.

    Abstract

    The phoneme detection task is widely used in spoken word recognition research. Alphabetically literate participants, however, are more used to explicit representations of letters than of phonemes. The present study explored whether phoneme detection is sensitive to how target phonemes are, or may be, orthographically realised. Listeners detected the target sounds [b,m,t,f,s,k] in word-initial position in sequences of isolated English words. Response times were faster to the targets [b,m,t], which have consistent word-initial spelling, than to the targets [f,s,k], which are inconsistently spelled, but only when listeners’ attention was drawn to spelling by the presence in the experiment of many irregularly spelled fillers. Within the inconsistent targets [f,s,k], there was no significant difference between responses to targets in words with majority and minority spellings. We conclude that performance in the phoneme detection task is not necessarily sensitive to orthographic effects, but that salient orthographic manipulation can induce such sensitivity.
  • Cutler, A. (1998). Prosodic structure and word recognition. In A. D. Friederici (Ed.), Language comprehension: A biological perspective (pp. 41-70). Heidelberg: Springer.
  • Cutler, A. (1982). Prosody and sentence perception in English. In J. Mehler, E. C. Walker, & M. Garrett (Eds.), Perspectives on mental representation: Experimental and theoretical studies of cognitive processes and capacities (pp. 201-216). Hillsdale, N.J: Erlbaum.
  • Cutler, A. (1998). The recognition of spoken words with variable representations. In D. Duez (Ed.), Proceedings of the ESCA Workshop on Sound Patterns of Spontaneous Speech (pp. 83-92). Aix-en-Provence: Université de Aix-en-Provence.
  • Danziger, E., & Gaskins, S. (1993). Exploring the Intrinsic Frame of Reference. In S. C. Levinson (Ed.), Cognition and space kit 1.0 (pp. 53-64). Nijmegen: Max Planck Institute for Psycholinguistics. doi:10.17617/2.3513136.

    Abstract

    We can describe the position of one item with respect to another using a number of different ‘frames of reference’. For example, I can use a ‘deictic’ frame that involves the speaker’s viewpoint (The chair is on the far side of the room), or an ‘intrinsic’ frame that involves a feature of one of the items (The chair is at the back of the room). Where more than one frame of reference is available in a language, what motivates the speaker’s choice? This elicitation task is designed to explore when and why people select intrinsic frames of reference, and how these choices interact with non-linguistic problem-solving strategies.
  • Dediu, D. (2017). From biology to language change and diversity. In N. J. Enfield (Ed.), Dependencies in language: On the causal ontology of linguistics systems (pp. 39-52). Berlin: Language Science Press.
  • Dediu, D. (2006). Mostly out of Africa, but what did the others have to say? In A. Cangelosi, A. D. Smith, & K. Smith (Eds.), The evolution of language: proceedings of the 6th International Conference (EVOLANG6) (pp. 59-66). World Scientific.

    Abstract

    The Recent Out-of-Africa human evolutionary model seems to be generally accepted. This impression is very prevalent outside palaeoanthropological circles (including studies of language evolution), but proves to be unwarranted. This paper offers a short review of the main challenges facing ROA and concludes that alternative models based on the concept of metapopulation must be also considered. The implications of such a model for language evolution and diversity are briefly reviewed.

Share this page