Marisa Casillas

Publications

Displaying 1 - 31 of 31
  • Brown, P., & Casillas, M. (in press). Childrearing through social interaction on Rossel Island, PNG. In A. J. Fentiman, & M. Goody (Eds.), Esther Goody revisited: Exploring the legacy of an original inter-disciplinarian. New York, NY: Berghahn.

    Abstract

    This paper describes childrearing practices, beliefs, and attitudes in a Papua New Guinea society - that of the Rossel Islanders - and shows, through analysis of interactions with infants and small children, how these are instantiated in everyday life. Drawing on data collected during research on Rossel Island spanning 14 years, including parental interviews, videotaped naturally-occurring interactions with babies and children, structured elicitations, and time sampling of activities involving children, we investigate the daily lives of Rossel children and consider how these influence their development of prosociality and their socialization into culturally shaped roles and characters. We relate the findings to other work on child socialization in small-scale societies, with special attention to the Tzeltal Maya of southern Mexico, and argue that detailed attention to the local socio-cultural contexts of childrearing is an important antidote to the tendency to emphasize universals of child development.
  • Casillas, M., & Hilbrink, E. (in press). Communicative act development. In K. P. Schneider, & E. Ifantidou (Eds.), Developmental and Clinical Pragmatics. DE GRUYTER MOUTON.

    Abstract

    How do children learn to map linguistic forms onto their intended meanings? This chapter begins with an introduction to some theoretical and analytical tools used to study communicative acts. It then turns to communicative act development in spoken and signed language acquisition, including both the early scaffolding and production of communicative acts (both non-verbal and verbal) as well as their later links to linguistic development and Theory of Mind. The chapter wraps up by linking research on communicative act development to the acquisition of conversational skills, cross-linguistic and individual differences in communicative experience during development, and human evolution. Along the way, it also poses a few open questions for future research in this domain.
  • Casillas, M., Brown, P., & Levinson, S. C. (in press). Early language experience in a Papuan community. Journal of Child Language.

    Abstract

    The rate at which young children are directly spoken to varies due to many factors, including (a) caregiver ideas about children as conversational partners and (b) the organization of everyday life. Prior work suggests cross-cultural variation in rates of child-directed speech is due to the former factor, but has been fraught with confounds in comparing postindustrial and subsistence farming communities. We investigate the daylong language environments of children (0;0–3;0) on Rossel Island, Papua New Guinea, a small-scale traditional community where prior ethnographic study demonstrated contingency-seeking child interaction styles. In fact, children were infrequently directly addressed and linguistic input rate was primarily affected by situational factors, though children’s vocalization maturity showed no developmental delay. We compare the input characteristics between this community and a Tseltal Mayan one in which near-parallel methods produced comparable results, then briefly discuss the models and mechanisms for learning best supported by our findings.
  • Casillas, M., Brown, P., & Levinson, S. C. (2020). Early language experience in a Tseltal Mayan village. Child Development, 91(5), 1819-1835. doi:10.1111/cdev.13349.

    Abstract

    Daylong at-home audio recordings from 10 Tseltal Mayan children (0;2–3;0; Southern Mexico) were analyzed for how often children engaged in verbal interaction with others and whether their speech environment changed with age, time of day, household size, and number of speakers present. Children were infrequently directly spoken to, with most directed speech coming from adults, and no increase with age. Most directed speech came in the mornings, and interactional peaks contained nearly four times the baseline rate of directed speech. Coarse indicators of children's language development (babbling, first words, first word combinations) suggest that Tseltal children manage to extract the linguistic information they need despite minimal directed speech. Multiple proposals for how they might do so are discussed.

    Supplementary material

    Tseltal-CLE-SuppMat.pdf
  • Cychosz, M., Romeo, R., Soderstrom, M., Scaff, C., Ganek, H., Cristia, A., Casillas, M., De Barbaro, K., Bang, J. Y., & Weisleder, A. (2020). Longform recordings of everyday life: Ethics for best practices. Behavior Research Methods, 52, 1951-1969. doi:10.3758/s13428-020-01365-9.

    Abstract

    Recent advances in large-scale data storage and processing offer unprecedented opportunities for behavioral scientists to collect and analyze naturalistic data, including from under-represented groups. Audio data, particularly real-world audio recordings, are of particular interest to behavioral scientists because they provide high-fidelity access to subtle aspects of daily life and social interactions. However, these methodological advances pose novel risks to research participants and communities. In this article, we outline the benefits and challenges associated with collecting, analyzing, and sharing multi-hour audio recording data. Guided by the principles of autonomy, privacy, beneficence, and justice, we propose a set of ethical guidelines for the use of longform audio recordings in behavioral research. This article is also accompanied by an Open Science Framework Ethics Repository that includes informed consent resources such as frequent participant concerns and sample consent forms.
  • MacDonald, K., Räsänen, O., Casillas, M., & Warlaumont, A. S. (2020). Measuring prosodic predictability in children’s home language environments. In S. Denison, M. Mack, Y. Xu, & B. C. Armstrong (Eds.), Proceedings of the 42nd Annual Virtual Meeting of the Cognitive Science Society (CogSci 2020) (pp. 695-701). Montreal, QB: Cognitive Science Society.

    Abstract

    Children learn language from the speech in their home environment. Recent work shows that more infant-directed speech (IDS) leads to stronger lexical development. But what makes IDS a particularly useful learning signal? Here, we expand on an attention-based account first proposed by Räsänen et al. (2018): that prosodic modifications make IDS less predictable, and thus more interesting. First, we reproduce the critical finding from Räsänen et al.: that lab-recorded IDS pitch is less predictable compared to adult-directed speech (ADS). Next, we show that this result generalizes to the home language environment, finding that IDS in daylong recordings is also less predictable than ADS but that this pattern is much less robust than for IDS recorded in the lab. These results link experimental work on attention and prosodic modifications of IDS to real-world language-learning environments, highlighting some challenges of scaling up analyses of IDS to larger datasets that better capture children’s actual input.
  • Räsänen, O., Seshadri, S., Lavechin, M., Cristia, A., & Casillas, M. (2020). ALICE: An open-source tool for automatic measurement of phoneme, syllable, and word counts from child-centered daylong recordings. Behavior Research Methods. Advance online publication. doi:10.3758/s13428-020-01460-x.

    Abstract

    Recordings captured by wearable microphones are a standard method for investigating young children’s language environments. A key measure to quantify from such data is the amount of speech present in children’s home environments. To this end, the LENA recorder and software—a popular system for measuring linguistic input—estimates the number of adult words that children may hear over the course of a recording. However, word count estimation is challenging to do in a language-independent manner; the relationship between observable acoustic patterns and language-specific lexical entities is far from uniform across human languages. In this paper, we ask whether some alternative linguistic units, namely phone(me)s or syllables, could be measured instead of, or in parallel with, words in order to achieve improved cross-linguistic applicability and comparability of an automated system for measuring child language input. We discuss the advantages and disadvantages of measuring different units from theoretical and technical points of view. We also investigate the practical applicability of measuring such units using a novel system called Automatic LInguistic unit Count Estimator (ALICE) together with audio from seven child-centered daylong audio corpora from diverse cultural and linguistic environments. We show that language-independent measurement of phoneme counts is somewhat more accurate than syllables or words, but all three are highly correlated with human annotations on the same data. We share an open-source implementation of ALICE for use by the language research community, allowing automatic phoneme, syllable, and word count estimation from child-centered audio recordings.
  • Bergelson*, E., Casillas*, M., Soderstrom, M., Seidl, A., Warlaumont, A. S., & Amatuni, A. (2019). What Do North American Babies Hear? A large-scale cross-corpus analysis. Developmental Science, 22(1): e12724. doi:10.1111/desc.12724.

    Abstract

    - * indicates joint first authorship - Abstract: A range of demographic variables influence how much speech young children hear. However, because studies have used vastly different sampling methods, quantitative comparison of interlocking demographic effects has been nearly impossible, across or within studies. We harnessed a unique collection of existing naturalistic, day-long recordings from 61 homes across four North American cities to examine language input as a function of age, gender, and maternal education. We analyzed adult speech heard by 3- to 20-month-olds who wore audio recorders for an entire day. We annotated speaker gender and speech register (child-directed or adult-directed) for 10,861 utterances from female and male adults in these recordings. Examining age, gender, and maternal education collectively in this ecologically-valid dataset, we find several key results. First, the speaker gender imbalance in the input is striking: children heard 2--3x more speech from females than males. Second, children in higher-maternal-education homes heard more child-directed speech than those in lower-maternal education homes. Finally, our analyses revealed a previously unreported effect: the proportion of child-directed speech in the input increases with age, due to a decrease in adult-directed speech with age. This large-scale analysis is an important step forward in collectively examining demographic variables that influence early development, made possible by pooled, comparable, day-long recordings of children's language environments. The audio recordings, annotations, and annotation software are readily available for re-use and re-analysis by other researchers.

    Supplementary material

    desc12724-sup-0001-supinfo.pdf
  • Casillas, M., & Cristia, A. (2019). A step-by-step guide to collecting and analyzing long-format speech environment (LFSE) recordings. Collabra, 5(1): 24. doi:10.1525/collabra.209.

    Abstract

    Recent years have seen rapid technological development of devices that can record communicative behavior as participants go about daily life. This paper is intended as an end-to-end methodological guidebook for potential users of these technologies, including researchers who want to study children’s or adults’ communicative behavior in everyday contexts. We explain how long-format speech environment (LFSE) recordings provide a unique view on language use and how they can be used to complement other measures at the individual and group level. We aim to help potential users of these technologies make informed decisions regarding research design, hardware, software, and archiving. We also provide information regarding ethics and implementation, issues that are difficult to navigate for those new to this technology, and on which little or no resources are available. This guidebook offers a concise summary of information for new users and points to sources of more detailed information for more advanced users. Links to discussion groups and community-augmented databases are also provided to help readers stay up-to-date on the latest developments.
  • Casillas, M., Rafiee, A., & Majid, A. (2019). Iranian herbalists, but not cooks, are better at naming odors than laypeople. Cognitive Science, 43(6): e12763. doi:10.1111/cogs.12763.

    Abstract

    Odor naming is enhanced in communities where communication about odors is a central part of daily life (e.g., wine experts, flavorists, and some hunter‐gatherer groups). In this study, we investigated how expert knowledge and daily experience affect the ability to name odors in a group of experts that has not previously been investigated in this context—Iranian herbalists; also called attars—as well as cooks and laypeople. We assessed naming accuracy and consistency for 16 herb and spice odors, collected judgments of odor perception, and evaluated participants' odor meta‐awareness. Participants' responses were overall more consistent and accurate for more frequent and familiar odors. Moreover, attars were more accurate than both cooks and laypeople at naming odors, although cooks did not perform significantly better than laypeople. Attars' perceptual ratings of odors and their overall odor meta‐awareness suggest they are also more attuned to odors than the other two groups. To conclude, Iranian attars—but not cooks—are better odor namers than laypeople. They also have greater meta‐awareness and differential perceptual responses to odors. These findings further highlight the critical role that expertise and type of experience have on olfactory functions.

    Supplementary material

    Supplementary Materials
  • Räsänen, O., Seshadri, S., Karadayi, J., Riebling, E., Bunce, J., Cristia, A., Metze, F., Casillas, M., Rosemberg, C., Bergelson, E., & Soderstrom, M. (2019). Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech. Speech Communication, 113, 63-80. doi:10.1016/j.specom.2019.08.005.

    Abstract

    Automatic word count estimation (WCE) from audio recordings can be used to quantify the amount of verbal communication in a recording environment. One key application of WCE is to measure language input heard by infants and toddlers in their natural environments, as captured by daylong recordings from microphones worn by the infants. Although WCE is nearly trivial for high-quality signals in high-resource languages, daylong recordings are substantially more challenging due to the unconstrained acoustic environments and the presence of near- and far-field speech. Moreover, many use cases of interest involve languages for which reliable ASR systems or even well-defined lexicons are not available. A good WCE system should also perform similarly for low- and high-resource languages in order to enable unbiased comparisons across different cultures and environments. Unfortunately, the current state-of-the-art solution, the LENA system, is based on proprietary software and has only been optimized for American English, limiting its applicability. In this paper, we build on existing work on WCE and present the steps we have taken towards a freely available system for WCE that can be adapted to different languages or dialects with a limited amount of orthographically transcribed speech data. Our system is based on language-independent syllabification of speech, followed by a language-dependent mapping from syllable counts (and a number of other acoustic features) to the corresponding word count estimates. We evaluate our system on samples from daylong infant recordings from six different corpora consisting of several languages and socioeconomic environments, all manually annotated with the same protocol to allow direct comparison. We compare a number of alternative techniques for the two key components in our system: speech activity detection and automatic syllabification of speech. As a result, we show that our system can reach relatively consistent WCE accuracy across multiple corpora and languages (with some limitations). In addition, the system outperforms LENA on three of the four corpora consisting of different varieties of English. We also demonstrate how an automatic neural network-based syllabifier, when trained on multiple languages, generalizes well to novel languages beyond the training data, outperforming two previously proposed unsupervised syllabifiers as a feature extractor for WCE.
  • Bögels, S., Casillas, M., & Levinson, S. C. (2018). Planning versus comprehension in turn-taking: Fast responders show reduced anticipatory processing of the question. Neuropsychologia, 109, 295-310. doi:10.1016/j.neuropsychologia.2017.12.028.

    Abstract

    Rapid response latencies in conversation suggest that responders start planning before the ongoing turn is finished. Indeed, an earlier EEG study suggests that listeners start planning their responses to questions as soon as they can (Bögels, S., Magyari, L., & Levinson, S. C. (2015). Neural signatures of response planning occur midway through an incoming question in conversation. Scientific Reports, 5, 12881). The present study aimed to (1) replicate this early planning effect and (2) investigate whether such early response planning incurs a cost on participants’ concurrent comprehension of the ongoing turn. During the experiment participants answered questions from a confederate partner. To address aim (1), the questions were designed such that response planning could start either early or late in the turn. Our results largely replicate Bögels et al. (2015) showing a large positive ERP effect and an oscillatory alpha/beta reduction right after participants could have first started planning their verbal response, again suggesting an early start of response planning. To address aim (2), the confederate's questions also contained either an expected word or an unexpected one to elicit a differential N400 effect, either before or after the start of response planning. We hypothesized an attenuated N400 effect after response planning had started. In contrast, the N400 effects before and after planning did not differ. There was, however, a positive correlation between participants' response time and their N400 effect size after planning had started; quick responders showed a smaller N400 effect, suggesting reduced attention to comprehension and possibly reduced anticipatory processing. We conclude that early response planning can indeed impact comprehension processing.

    Supplementary material

    mmc1.pdf
  • Cristia, A., Ganesh, S., Casillas, M., & Ganapathy, S. (2018). Talker diarization in the wild: The case of child-centered daylong audio-recordings. In Proceedings of Interspeech 2018 (pp. 2583-2587). doi:10.21437/Interspeech.2018-2078.

    Abstract

    Speaker diarization (answering 'who spoke when') is a widely researched subject within speech technology. Numerous experiments have been run on datasets built from broadcast news, meeting data, and call centers—the task sometimes appears close to being solved. Much less work has begun to tackle the hardest diarization task of all: spontaneous conversations in real-world settings. Such diarization would be particularly useful for studies of language acquisition, where researchers investigate the speech children produce and hear in their daily lives. In this paper, we study audio gathered with a recorder worn by small children as they went about their normal days. As a result, each child was exposed to different acoustic environments with a multitude of background noises and a varying number of adults and peers. The inconsistency of speech and noise within and across samples poses a challenging task for speaker diarization systems, which we tackled via retraining and data augmentation techniques. We further studied sources of structured variation across raw audio files, including the impact of speaker type distribution, proportion of speech from children, and child age on diarization performance. We discuss the extent to which these findings might generalize to other samples of speech in the wild.
  • Räsänen, O., Seshadri, S., & Casillas, M. (2018). Comparison of syllabification algorithms and training strategies for robust word count estimation across different languages and recording conditions. In Proceedings of Interspeech 2018 (pp. 1200-1204). doi:10.21437/Interspeech.2018-1047.

    Abstract

    Word count estimation (WCE) from audio recordings has a number of applications, including quantifying the amount of speech that language-learning infants hear in their natural environments, as captured by daylong recordings made with devices worn by infants. To be applicable in a wide range of scenarios and also low-resource domains, WCE tools should be extremely robust against varying signal conditions and require minimal access to labeled training data in the target domain. For this purpose, earlier work has used automatic syllabification of speech, followed by a least-squares-mapping of syllables to word counts. This paper compares a number of previously proposed syllabifiers in the WCE task, including a supervised bi-directional long short-term memory (BLSTM) network that is trained on a language for which high quality syllable annotations are available (a “high resource language”), and reports how the alternative methods compare on different languages and signal conditions. We also explore additive noise and varying-channel data augmentation strategies for BLSTM training, and show how they improve performance in both matching and mismatching languages. Intriguingly, we also find that even though the BLSTM works on languages beyond its training data, the unsupervised algorithms can still outperform it in challenging signal conditions on novel languages.
  • Casillas, M., Bergelson, E., Warlaumont, A. S., Cristia, A., Soderstrom, M., VanDam, M., & Sloetjes, H. (2017). A New Workflow for Semi-automatized Annotations: Tests with Long-Form Naturalistic Recordings of Childrens Language Environments. In Proceedings of Interspeech 2017 (pp. 2098-2102). doi:10.21437/Interspeech.2017-1418.

    Abstract

    Interoperable annotation formats are fundamental to the utility, expansion, and sustainability of collective data repositories.In language development research, shared annotation schemes have been critical to facilitating the transition from raw acoustic data to searchable, structured corpora. Current schemes typically require comprehensive and manual annotation of utterance boundaries and orthographic speech content, with an additional, optional range of tags of interest. These schemes have been enormously successful for datasets on the scale of dozens of recording hours but are untenable for long-format recording corpora, which routinely contain hundreds to thousands of audio hours. Long-format corpora would benefit greatly from (semi-)automated analyses, both on the earliest steps of annotation—voice activity detection, utterance segmentation, and speaker diarization—as well as later steps—e.g., classification-based codes such as child-vs-adult-directed speech, and speech recognition to produce phonetic/orthographic representations. We present an annotation workflow specifically designed for long-format corpora which can be tailored by individual researchers and which interfaces with the current dominant scheme for short-format recordings. The workflow allows semi-automated annotation and analyses at higher linguistic levels. We give one example of how the workflow has been successfully implemented in a large cross-database project.
  • Casillas, M., Amatuni, A., Seidl, A., Soderstrom, M., Warlaumont, A., & Bergelson, E. (2017). What do Babies hear? Analyses of Child- and Adult-Directed Speech. In Proceedings of Interspeech 2017 (pp. 2093-2097). doi:10.21437/Interspeech.2017-1409.

    Abstract

    Child-directed speech is argued to facilitate language development, and is found cross-linguistically and cross-culturally to varying degrees. However, previous research has generally focused on short samples of child-caregiver interaction, often in the lab or with experimenters present. We test the generalizability of this phenomenon with an initial descriptive analysis of the speech heard by young children in a large, unique collection of naturalistic, daylong home recordings. Trained annotators coded automatically-detected adult speech 'utterances' from 61 homes across 4 North American cities, gathered from children (age 2-24 months) wearing audio recorders during a typical day. Coders marked the speaker gender (male/female) and intended addressee (child/adult), yielding 10,886 addressee and gender tags from 2,523 minutes of audio (cf. HB-CHAAC Interspeech ComParE challenge; Schuller et al., in press). Automated speaker-diarization (LENA) incorrectly gender-tagged 30% of male adult utterances, compared to manually-coded consensus. Furthermore, we find effects of SES and gender on child-directed and overall speech, increasing child-directed speech with child age, and interactions of speaker gender, child gender, and child age: female caretakers increased their child-directed speech more with age than male caretakers did, but only for male infants. Implications for language acquisition and existing classification algorithms are discussed.
  • Casillas, M., & Frank, M. C. (2017). The development of children's ability to track and predict turn structure in conversation. Journal of Memory and Language, 92, 234-253. doi:10.1016/j.jml.2016.06.013.

    Abstract

    Children begin developing turn-taking skills in infancy but take several years to fluidly integrate their growing knowledge of language into their turn-taking behavior. In two eye-tracking experiments, we measured children’s anticipatory gaze to upcoming responders while controlling linguistic cues to turn structure. In Experiment 1, we showed English and non-English conversations to English-speaking adults and children. In Experiment 2, we phonetically controlled lexicosyntactic and prosodic cues in English-only speech. Children spontaneously made anticipatory gaze switches by age two and continued improving through age six. In both experiments, children and adults made more anticipatory switches after hearing questions. Consistent with prior findings on adult turn prediction, prosodic information alone did not increase children’s anticipatory gaze shifts. But, unlike prior work with adults, lexical information alone was not sucient either—children’s performance was best overall with lexicosyntax and prosody together. Our findings support an account in which turn tracking and turn prediction emerge in infancy and then gradually become integrated with children’s online linguistic processing.
  • Schuller, B., Steidl, S., Batliner, A., Bergelson, E., Krajewski, J., Janott, C., Amatuni, A., Casillas, M., Seidl, A., Soderstrom, M., Warlaumont, A. S., Hidalgo, G., Schnieder, S., Heiser, C., Hohenhorst, W., Herzog, M., Schmitt, M., Qian, K., Zhang, Y., Trigeorgis, G., Tzirakis, P., & Zafeiriou, S. (2017). The INTERSPEECH 2017 computational paralinguistics challenge: Addressee, cold & snoring. In Proceedings of Interspeech 2017 (pp. 3442-3446). doi:10.21437/Interspeech.2017-43.

    Abstract

    The INTERSPEECH 2017 Computational Paralinguistics Challenge addresses three different problems for the first time in research competition under well-defined conditions: In the Addressee sub-challenge, it has to be determined whether speech produced by an adult is directed towards another adult or towards a child; in the Cold sub-challenge, speech under cold has to be told apart from ‘healthy’ speech; and in the Snoring subchallenge, four different types of snoring have to be classified. In this paper, we describe these sub-challenges, their conditions, and the baseline feature extraction and classifiers, which include data-learnt feature representations by end-to-end learning with convolutional and recurrent neural networks, and bag-of-audiowords for the first time in the challenge series
  • Casillas, M., Bobb, S. C., & Clark, E. V. (2016). Turn taking, timing, and planning in early language acquisition. Journal of Child Language, 43, 1310-1337. doi:10.1017/S0305000915000689.

    Abstract

    Young children answer questions with longer delays than adults do, and they don't reach typical adult response times until several years later. We hypothesized that this prolonged pattern of delay in children's timing results from competing demands: to give an answer, children must understand a question while simultaneously planning and initiating their response. Even as children get older and more efficient in this process, the demands on them increase because their verbal responses become more complex. We analyzed conversational question-answer sequences between caregivers and their children from ages 1;8 to 3;5, finding that children (1) initiate simple answers more quickly than complex ones, (2) initiate simple answers quickly from an early age, and (3) initiate complex answers more quickly as they grow older. Our results suggest that children aim to respond quickly from the start, improving on earlier-acquired answer types while they begin to practice later-acquired, slower ones.

    Supplementary material

    S0305000915000689sup001.docx
  • Clark, E. V., & Casillas, M. (2016). First language acquisition. In K. Allen (Ed.), The Routledge Handbook of Linguistics (pp. 311-328). New York: Routledge.
  • Holler, J., Kendrick, K. H., Casillas, M., & Levinson, S. C. (Eds.). (2016). Turn-Taking in Human Communicative Interaction. Lausanne: Frontiers Media. doi:10.3389/978-2-88919-825-2.

    Abstract

    The core use of language is in face-to-face conversation. This is characterized by rapid turn-taking. This turn-taking poses a number central puzzles for the psychology of language. Consider, for example, that in large corpora the gap between turns is on the order of 100 to 300 ms, but the latencies involved in language production require minimally between 600ms (for a single word) or 1500 ms (for as simple sentence). This implies that participants in conversation are predicting the ends of the incoming turn and preparing in advance. But how is this done? What aspects of this prediction are done when? What happens when the prediction is wrong? What stops participants coming in too early? If the system is running on prediction, why is there consistently a mode of 100 to 300 ms in response time? The timing puzzle raises further puzzles: it seems that comprehension must run parallel with the preparation for production, but it has been presumed that there are strict cognitive limitations on more than one central process running at a time. How is this bottleneck overcome? Far from being 'easy' as some psychologists have suggested, conversation may be one of the most demanding cognitive tasks in our everyday lives. Further questions naturally arise: how do children learn to master this demanding task, and what is the developmental trajectory in this domain? Research shows that aspects of turn-taking such as its timing are remarkably stable across languages and cultures, but the word order of languages varies enormously. How then does prediction of the incoming turn work when the verb (often the informational nugget in a clause) is at the end? Conversely, how can production work fast enough in languages that have the verb at the beginning, thereby requiring early planning of the whole clause? What happens when one changes modality, as in sign languages -- with the loss of channel constraints is turn-taking much freer? And what about face-to-face communication amongst hearing individuals -- do gestures, gaze, and other body behaviors facilitate turn-taking? One can also ask the phylogenetic question: how did such a system evolve? There seem to be parallels (analogies) in duetting bird species, and in a variety of monkey species, but there is little evidence of anything like this among the great apes. All this constitutes a neglected set of problems at the heart of the psychology of language and of the language sciences. This research topic welcomes contributions from right across the board, for example from psycholinguists, developmental psychologists, students of dialogue and conversation analysis, linguists interested in the use of language, phoneticians, corpus analysts and comparative ethologists or psychologists. We welcome contributions of all sorts, for example original research papers, opinion pieces, and reviews of work in subfields that may not be fully understood in other subfields.
  • Casillas, M., De Vos, C., Crasborn, O., & Levinson, S. C. (2015). The perception of stroke-to-stroke turn boundaries in signed conversation. In D. C. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. D. Jennings, & P. R. Maglio (Eds.), Proceedings of the 37th Annual Meeting of the Cognitive Science Society (CogSci 2015) (pp. 315-320). Austin, TX: Cognitive Science Society.

    Abstract

    Speaker transitions in conversation are often brief, with minimal vocal overlap. Signed languages appear to defy this pattern with frequent, long spans of simultaneous signing. But recent evidence suggests that turn boundaries in signed language may only include the content-bearing parts of the turn (from the first stroke to the last), and not all turn-related movement (from first preparation to final retraction). We tested whether signers were able to anticipate “stroke-to-stroke” turn boundaries with only minimal conversational context. We found that, indeed, signers anticipated turn boundaries at the ends of turn-final strokes. Signers often responded early, especially when the turn was long or contained multiple possible end points. Early responses for long turns were especially apparent for interrogatives—long interrogative turns showed much greater anticipation compared to short ones.
  • Holler, J., Kendrick, K. H., Casillas, M., & Levinson, S. C. (2015). Editorial: Turn-taking in human communicative interaction. Frontiers in Psychology, 6: 1919. doi:10.3389/fpsyg.2015.01919.
  • Lammertink, I., Casillas, M., Benders, T., Post, B., & Fikkert, P. (2015). Dutch and English toddlers' use of linguistic cues in predicting upcoming turn transitions. Frontiers in Psychology, 6: 495. doi:10.3389/fpsyg.2015.00495.
  • Arnon, I., Casillas, M., Kurumada, C., & Estigarribia, B. (Eds.). (2014). Language in interaction: Studies in honor of Eve V. Clark. Amsterdam: Benjamins.

    Abstract

    Understanding how communicative goals impact and drive the learning process has been a long-standing issue in the field of language acquisition. Recent years have seen renewed interest in the social and pragmatic aspects of language learning: the way interaction shapes what and how children learn. In this volume, we bring together researchers working on interaction in different domains to present a cohesive overview of ongoing interactional research. The studies address the diversity of the environments children learn in; the role of para-linguistic information; the pragmatic forces driving language learning; and the way communicative pressures impact language use and change. Using observational, empirical and computational findings, this volume highlights the effect of interpersonal communication on what children hear and what they learn. This anthology is inspired by and dedicated to Prof. Eve V. Clark – a pioneer in all matters related to language acquisition – and a major force in establishing interaction and communication as crucial aspects of language learning.
  • Casillas, M. (2014). Taking the floor on time: Delay and deferral in children’s turn taking. In I. Arnon, M. Casillas, C. Kurumada, & B. Estigarribia (Eds.), Language in Interaction: Studies in honor of Eve V. Clark (pp. 101-114). Amsterdam: Benjamins.

    Abstract

    A key part of learning to speak with others is figuring out when to start talking and how to hold the floor in conversation. For young children, the challenge of planning a linguistic response can slow down their response latencies, making misunderstanding, repair, and loss of the floor more likely. Like adults, children can mitigate their delays by using fillers (e.g., uh and um) at the start of their turns. In this chapter I analyze the onset and development of fillers in five children’s spontaneous speech from ages 1;6–3;6. My findings suggest that children start using fillers by 2;0, and use them to effectively mitigate delay in making a response.
  • Casillas, M. (2014). Turn-taking. In D. Matthews (Ed.), Pragmatic development in first language acquisition (pp. 53-70). Amsterdam: Benjamins.

    Abstract

    Conversation is a structured, joint action for which children need to learn a specialized set skills and conventions. Because conversation is a primary source of linguistic input, we can better grasp how children become active agents in their own linguistic development by studying their acquisition of conversational skills. In this chapter I review research on children’s turn-taking. This fundamental skill of human interaction allows children to gain feedback, make clarifications, and test hypotheses at every stage of development. I broadly review children’s conversational experiences, the types of turn-based contingency they must acquire, how they ask and answer questions, and when they manage to make timely responses
  • Casillas, M., & Frank, M. C. (2013). The development of predictive processes in children’s discourse understanding. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society. (pp. 299-304). Austin,TX: Cognitive Society.

    Abstract

    We investigate children’s online predictive processing as it occurs naturally, in conversation. We showed 1–7 year-olds short videos of improvised conversation between puppets, controlling for available linguistic information through phonetic manipulation. Even one- and two-year-old children made accurate and spontaneous predictions about when a turn-switch would occur: they gazed at the upcoming speaker before they heard a response begin. This predictive skill relies on both lexical and prosodic information together, and is not tied to either type of information alone. We suggest that children integrate prosodic, lexical, and visual information to effectively predict upcoming linguistic material in conversation.
  • Sumner, M., Kurumada, C., Gafter, R., & Casillas, M. (2013). Phonetic variation and the recognition of words with pronunciation variants. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 3486-3492). Austin, TX: Cognitive Science Society.
  • Casillas, M., & Frank, M. C. (2012). Cues to turn boundary prediction in adults and preschoolers. In S. Brown-Schmidt, J. Ginzburg, & S. Larsson (Eds.), Proceedings of SemDial 2012 (SeineDial): The 16th Workshop on the Semantics and Pragmatics of Dialogue (pp. 61-69). Paris: Université Paris-Diderot.

    Abstract

    Conversational turns often proceed with very brief pauses between speakers. In order to maintain “no gap, no overlap” turntaking, we must be able to anticipate when an ongoing utterance will end, tracking the current speaker for upcoming points of potential floor exchange. The precise set of cues that listeners use for turn-end boundary anticipation is not yet established. We used an eyetracking paradigm to measure adults’ and children’s online turn processing as they watched videos of conversations in their native language (English) and a range of other languages they did not speak. Both adults and children anticipated speaker transitions effectively. In addition, we observed evidence of turn-boundary anticipation for questions even in languages that were unknown to participants, suggesting that listeners’ success in turn-end anticipation does not rely solely on lexical information.
  • Casillas, M., & Amaral, P. (2011). Learning cues to category membership: Patterns in children’s acquisition of hedges. In C. Cathcart, I.-H. Chen, G. Finley, S. Kang, C. S. Sandy, & E. Stickles (Eds.), Proceedings of the Berkeley Linguistics Society 37th Annual Meeting (pp. 33-45). Linguistic Society of America, eLanguage.

    Abstract

    When we think of children acquiring language, we often think of their acquisition of linguistic structure as separate from their acquisition of knowledge about the world. But it is clear that in the process of learning about language, children consult what they know about the world; and that in learning about the world, children use linguistic cues to discover how items are related to one another. This interaction between the acquisition of linguistic structure and the acquisition of category structure is especially clear in word learning.

Share this page