Publications

  • Alday, P. M. (2016). Towards a rigorous motivation for Ziph's law. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/178.html.

    Abstract

    Language evolution can be viewed from two viewpoints: the development of a communicative system and the biological adaptations necessary for producing and perceiving said system. The communicative-system vantage point has enjoyed a wealth of mathematical models based on simple distributional properties of language, often formulated as empirical laws. However, beyond vague psychological notions of “least effort”, no principled explanation has been proposed for the existence and success of such laws. Meanwhile, psychological and neurobiological models have focused largely on the computational constraints presented by incremental, real-time processing. In the following, we show that information-theoretic entropy underpins successful models of both types and provides a more principled motivation for Zipf’s Law.
  • Alhama, R. G., & Zuidema, W. (2016). Generalization in Artificial Language Learning: Modelling the Propensity to Generalize. In Proceedings of the 7th Workshop on Cognitive Aspects of Computational Language Learning (pp. 64-72). Association for Computational Linguistics. doi:10.18653/v1/W16-1909.

    Abstract

    Experiments in Artificial Language Learning have revealed much about the cognitive mechanisms underlying sequence and language learning in human adults, in infants and in non-human animals. This paper focuses on their ability to generalize to novel grammatical instances (i.e., instances consistent with a familiarization pattern). Notably, the propensity to generalize appears to be negatively correlated with the amount of exposure to the artificial language, a fact that has been claimed to be contrary to the predictions of statistical models (Peña et al. (2002); Endress and Bonatti (2007)). In this paper, we propose to model generalization as a three-step process, and we demonstrate that the use of statistical models for the first two steps, contrary to widespread intuitions in the ALL-field, can explain the observed decrease of the propensity to generalize with exposure time.
  • Alhama, R. G., & Zuidema, W. (2016). Pre-Wiring and Pre-Training: What does a neural network need to learn truly general identity rules? In T. R. Besold, A. Bordes, & A. D'Avila Garcez (Eds.), CoCo 2016 Cognitive Computation: Proceedings of the Workshop on Cognitive Computation: Integrating neural and symbolic approaches 2016. CEUR Workshop Proceedings.

    Abstract

    In an influential paper, Marcus et al. [1999] claimed that connectionist models cannot account for human success at learning tasks that involved generalization of abstract knowledge such as grammatical rules. This claim triggered a heated debate, centered mostly around variants of the Simple Recurrent Network model [Elman, 1990]. In our work, we revisit this unresolved debate and analyze the underlying issues from a different perspective. We argue that, in order to simulate human-like learning of grammatical rules, a neural network model should not be used as a tabula rasa, but rather, the initial wiring of the neural connections and the experience acquired prior to the actual task should be incorporated into the model. We present two methods that aim to provide such initial state: a manipulation of the initial connections of the network in a cognitively plausible manner (concretely, by implementing a “delay-line” memory), and a pre-training algorithm that incrementally challenges the network with novel stimuli. We implement such techniques in an Echo State Network [Jaeger, 2001], and we show that only when combining both techniques the ESN is able to learn truly general identity rules.
  • Allen, S. E. M. (1998). A discourse-pragmatic explanation for the subject-object asymmetry in early null arguments. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the GALA '97 Conference on Language Acquisition (pp. 10-15). Edinburgh, UK: Edinburgh University Press.

    Abstract

    The present paper assesses discourse-pragmatic factors as a potential explanation for the subject-object asymmetry in early child language. It identifies a set of factors which characterize typical situations of informativeness (Greenfield & Smith, 1976), and uses these factors to identify informative arguments in data from four children aged 2;0 through 3;6 learning Inuktitut as a first language. In addition, it assesses the extent of the links between features of informativeness on one hand and lexical vs. null and subject vs. object arguments on the other. Results suggest that a pragmatics account of the subject-object asymmetry can be upheld to a greater extent than previous research indicates, and that several of the factors characterizing informativeness are good indicators of those arguments which tend to be omitted in early child language.
  • Ameka, F. K. (2013). Possessive constructions in Likpe (Sɛkpɛlé). In A. Aikhenvald, & R. Dixon (Eds.), Possession and ownership: A crosslinguistic typology (pp. 224-242). Oxford: Oxford University Press.
  • Azar, Z., Backus, A., & Ozyurek, A. (2016). Pragmatic relativity: Gender and context affect the use of personal pronouns in discourse differentially across languages. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 1295-1300). Austin, TX: Cognitive Science Society.

    Abstract

    Speakers use differential referring expressions (REs) in pragmatically appropriate ways to produce coherent narratives. Languages, however, differ in a) whether REs as arguments can be dropped and b) whether personal pronouns encode gender. We examine two languages that differ from each other in these two aspects and ask whether the co-reference context and the gender encoding options affect the use of REs differentially. We elicited narratives from Dutch and Turkish speakers about two types of three-person events, one including people of the same and the other of mixed-gender. Speakers re-introduced referents into the discourse with fuller forms (NPs) and maintained them with reduced forms (overt or null pronoun). Turkish speakers used pronouns mainly to mark emphasis and only Dutch speakers used pronouns differentially across the two types of videos. We argue that linguistic possibilities available in languages tune speakers into taking different principles into account to produce pragmatically coherent narratives.
  • Bauer, B. L. M. (2013). Impersonal verbs. In G. K. Giannakis (Ed.), Encyclopedia of Ancient Greek Language and Linguistics Online (pp. 197-198). Leiden: Brill. doi:10.1163/2214-448X_eagll_SIM_00000481.

    Abstract

    Impersonal verbs in Greek ‒ as in the other Indo-European languages ‒ exclusively feature 3rd person singular finite forms and convey one of three types of meaning: (a) meteorological conditions; (b) emotional and physical state/experience; (c) modality. In Greek, impersonal verbs predominantly convey meteorological conditions and modality.

  • Bauer, B. L. M. (2016). The development of the comparative in Latin texts. In J. N. Adams, & N. Vincent (Eds.), Early and late Latin. Continuity or change? (pp. 313-339). Cambridge: Cambridge University Press.

  • Bergmann, C., Cristia, A., & Dupoux, E. (2016). Discriminability of sound contrasts in the face of speaker variation quantified. In Proceedings of the 38th Annual Conference of the Cognitive Science Society (pp. 1331-1336). Austin, TX: Cognitive Science Society.

    Abstract

    How does a naive language learner deal with speaker variation irrelevant to distinguishing word meanings? Experimental data is contradictory, and incompatible models have been proposed. Here, we examine basic assumptions regarding the acoustic signal the learner deals with: Is speaker variability a hurdle in discriminating sounds or can it easily be ignored? To this end, we summarize existing infant data. We then present machine-based discriminability scores of sound pairs obtained without any language knowledge. Our results show that speaker variability decreases sound contrast discriminability, and that some contrasts are affected more than others. However, chance performance is rare; most contrasts remain discriminable in the face of speaker variation. We take our results to mean that speaker variation is not a uniform hurdle to discriminating sound contrasts, and careful examination is necessary when planning and interpreting studies testing whether and to what extent infants (and adults) are sensitive to speaker differences.

  • Bögels, S., Barr, D., Garrod, S., & Kessler, K. (2013). "Are we still talking about the same thing?" MEG reveals perspective-taking in response to pragmatic violations, but not in anticipation. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 215-220). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0066/index.html.

    Abstract

    The current study investigates whether mentalizing, or taking the perspective of your interlocutor, plays an essential role throughout a conversation or whether it is mostly used in reaction to misunderstandings. This study is the first to use a brain-imaging method, MEG, to answer this question. In a first phase of the experiment, MEG participants interacted "live" with a confederate who set naming precedents for certain pictures. In a later phase, these precedents were sometimes broken by a speaker who named the same picture in a different way. This could be done by the same speaker, who set the precedent, or by a different speaker. Source analysis of MEG data showed that in the 800 ms before the naming, when the picture was already on the screen, episodic memory and language areas were activated, but no mentalizing areas, suggesting that the speaker's naming intentions were not anticipated by the listener on the basis of shared experiences. Mentalizing areas only became activated after the same speaker had broken a precedent, which we interpret as a reaction to the violation of conversational pragmatics.
  • Bohnemeyer, J. (1998). Sententiale Topics im Yukatekischen. In D. Zaefferer (Ed.), Deskriptive Grammatik und allgemeiner Sprachvergleich (pp. 55-85). Tübingen, Germany: Max-Niemeyer-Verlag.

  • Bohnemeyer, J. (1998). Temporale Relatoren im Hispano-Yukatekischen Sprachkontakt. In A. Koechert, & T. Stolz (Eds.), Convergencia e Individualidad - Las lenguas Mayas entre hispanización e indigenismo (pp. 195-241). Hannover, Germany: Verlag für Ethnologie.
  • Bone, D., Ramanarayanan, V., Narayanan, S., Hoedemaker, R. S., & Gordon, P. C. (2013). Analyzing eye-voice coordination in rapid automatized naming. In F. Bimbot, C. Cerisara, G. Fougeron, L. Gravier, L. Lamel, F. Pelligrino, & P. Perrier (Eds.), INTERSPEECH-2013: 14th Annual Conference of the International Speech Communication Association (pp. 2425-2429). ISCA Archive. Retrieved from http://www.isca-speech.org/archive/interspeech_2013/i13_2425.html.

    Abstract

    Rapid Automatized Naming (RAN) is a powerful tool for predicting future reading skill. A person’s ability to quickly name symbols as they scan a table is related to higher-level reading proficiency in adults and is predictive of future literacy gains in children. However, noticeable differences are present in the strategies or patterns within groups having similar task completion times. Thus, a further stratification of RAN dynamics may lead to better characterization and later intervention to support reading skill acquisition. In this work, we analyze the dynamics of the eyes, voice, and the coordination between the two during performance. It is shown that fast performers are more similar to each other than to slow performers in their patterns, but not vice versa. Further insights are provided about the patterns of more proficient subjects. For instance, fast performers tended to exhibit smoother behavior contours, suggesting a more stable perception-production process.
  • Bosker, H. R. (2013). Juncture (prosodic). In G. Khan (Ed.), Encyclopedia of Hebrew Language and Linguistics (pp. 432-434). Leiden: Brill.

    Abstract

    Prosodic juncture concerns the compartmentalization and partitioning of syntactic entities in spoken discourse by means of prosody. It has been argued that the Intonation Unit, defined by internal criteria and prosodic boundary phenomena (e.g., final lengthening, pitch reset, pauses), encapsulates the basic structural unit of spoken Modern Hebrew.
  • Bosker, H. R., Reinisch, E., & Sjerps, M. J. (2016). Listening under cognitive load makes speech sound fast. In H. van den Heuvel, B. Cranen, & S. Mattys (Eds.), Proceedings of the Speech Processing in Realistic Environments [SPIRE] Workshop (pp. 23-24). Groningen.
  • Bosker, H. R. (2016). Our own speech rate influences speech perception. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 227-231).

    Abstract

    During conversation, spoken utterances occur in rich acoustic contexts, including speech produced by our interlocutor(s) and speech we produced ourselves. Prosodic characteristics of the acoustic context have been known to influence speech perception in a contrastive fashion: for instance, a vowel presented in a fast context is perceived to have a longer duration than the same vowel in a slow context. Given the ubiquity of the sound of our own voice, it may be that our own speech rate - a common source of acoustic context - also influences our perception of the speech of others. Two experiments were designed to test this hypothesis. Experiment 1 replicated earlier contextual rate effects by showing that hearing pre-recorded fast or slow context sentences alters the perception of ambiguous Dutch target words. Experiment 2 then extended this finding by showing that talking at a fast or slow rate prior to the presentation of the target words also altered the perception of those words. These results suggest that between-talker variation in speech rate production may induce between-talker variation in speech perception, thus potentially explaining why interlocutors tend to converge on speech rate in dialogue settings.

  • Bosker, H. R. (2013). Sibilant consonants. In G. Khan (Ed.), Encyclopedia of Hebrew Language and Linguistics (pp. 557-561). Leiden: Brill.

    Abstract

    Fricative consonants in Hebrew can be divided into bgdkpt and sibilants (ז, ס, צ, שׁ, שׂ). Hebrew sibilants have been argued to stem from Proto-Semitic affricates, laterals, interdentals and /s/. In standard Israeli Hebrew the sibilants are pronounced as [s] (ס and שׂ), [ʃ] (שׁ), [z] (ז), [ʦ] (צ).
  • Brown, P. (1998). Early Tzeltal verbs: Argument structure and argument representation. In E. Clark (Ed.), Proceedings of the 29th Annual Stanford Child Language Research Forum (pp. 129-140). Stanford: CSLI Publications.

    Abstract

    The surge of research activity focussing on children's acquisition of verbs (e.g., Tomasello and Merriman 1996) addresses some fundamental questions: Just how variable across languages, and across individual children, is the process of verb learning? How specific are arguments to particular verbs in early child language? How does the grammatical category 'Verb' develop? The position of Universal Grammar, that a verb category is early, contrasts with that of Tomasello (1992), Pine and Lieven and their colleagues (1996, in press), and many others, that children develop a verb category slowly, gradually building up subcategorizations of verbs around pragmatic, syntactic, and semantic properties of the language they are exposed to. On this latter view, one would expect the language which the child is learning, the cultural milieu and the nature of the interactions in which the child is engaged, to influence the process of acquiring verb argument structures. This paper explores these issues by examining the development of argument representation in the Mayan language Tzeltal, in both its lexical and verbal cross-referencing forms, and analyzing the semantic and pragmatic factors influencing the form argument representation takes. Certain facts about Tzeltal (the ergative/absolutive marking, the semantic specificity of transitive and positional verbs) are proposed to affect the representation of arguments. The first 500 multimorpheme combinations of 3 children (aged between 1;8 and 2;4) are examined. It is argued that there is no evidence of semantically light 'pathbreaking' verbs (Ninio 1996) leading the way into word combinations. There is early productivity of cross-referencing affixes marking A, S, and O arguments (although there are systematic omissions).

    The paper assesses the respective contributions of three kinds of factors to these results - structural (regular morphology), semantic (verb specificity) and pragmatic (the nature of Tzeltal conversational interaction).
  • Brown, P. (1998). How and why are women more polite: Some evidence from a Mayan community. In J. Coates (Ed.), Language and gender (pp. 81-99). Oxford: Blackwell.
  • Brown, P., & Levinson, S. C. (1998). Politeness, introduction to the reissue: A review of recent work. In A. Kasher (Ed.), Pragmatics: Vol. 6 Grammar, psychology and sociology (pp. 488-554). London: Routledge.

    Abstract

    This article is a reprint of chapter 1, the introduction to Brown and Levinson, 1987, Politeness: Some universals in language usage (Cambridge University Press).
  • Brown, P. (2013). La estructura conversacional y la adquisición del lenguaje: El papel de la repetición en el habla de los adultos y niños tzeltales. In L. de León Pasquel (Ed.), Nuevos senderos en el estudio de la adquisición de lenguas mesoamericanas: Estructura, narrativa y socialización (pp. 35-82). Mexico: CIESAS-UNAM.

    Abstract

    This is a translation of the Brown 1998 article in Journal of Linguistic Anthropology, 'Conversational structure and language acquisition: The role of repetition in Tzeltal adult and child speech'.

  • Brown, P., Pfeiler, B., de León, L., & Pye, C. (2013). The acquisition of agreement in four Mayan languages. In E. Bavin, & S. Stoll (Eds.), The acquisition of ergativity (pp. 271-306). Amsterdam: Benjamins.

    Abstract

    This paper presents results of a comparative project documenting the development of verbal agreement inflections in children learning four different Mayan languages: K’iche’, Tzeltal, Tzotzil, and Yukatek. These languages have similar inflectional paradigms: they have a generally agglutinative morphology, with transitive verbs obligatorily marked with separate cross-referencing inflections for the two core arguments (‘ergative’ and ‘absolutive’). Verbs are also inflected for aspect and mood, and they carry a ‘status suffix’ which generally marks verb transitivity and mood. At a more detailed level, the four languages differ strikingly in the realization of cross-reference marking. For each language, we examined longitudinal language production data from two children at around 2;0, 2;6, 3;0, and 3;6 years of age. We relate differences in the acquisition patterns of verbal morphology in the languages to 1) the placement of affixes, 2) phonological and prosodic prominence, 3) language-specific constraints on the various forms of the affixes, and 4) consistent vs. split ergativity, and conclude that prosodic salience accounts provide the best explanation for the acquisition patterns in these four languages.

  • Bruggeman, L., & Cutler, A. (2016). Lexical manipulation as a discovery tool for psycholinguistic research. In C. Carignan, & M. D. Tyler (Eds.), Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016) (pp. 313-316).
  • Burenhult, N., & Kruspe, N. (2016). The language of eating and drinking: A window on Orang Asli meaning-making. In K. Endicott (Ed.), Malaysia’s original people: Past, present and future of the Orang Asli (pp. 175-199). Singapore: National University of Singapore Press.
  • Casillas, M., & Frank, M. C. (2013). The development of predictive processes in children’s discourse understanding. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (pp. 299-304). Austin, TX: Cognitive Science Society.

    Abstract

    We investigate children’s online predictive processing as it occurs naturally, in conversation. We showed 1–7 year-olds short videos of improvised conversation between puppets, controlling for available linguistic information through phonetic manipulation. Even one- and two-year-old children made accurate and spontaneous predictions about when a turn-switch would occur: they gazed at the upcoming speaker before they heard a response begin. This predictive skill relies on both lexical and prosodic information together, and is not tied to either type of information alone. We suggest that children integrate prosodic, lexical, and visual information to effectively predict upcoming linguistic material in conversation.
  • Clark, E. V., & Casillas, M. (2016). First language acquisition. In K. Allen (Ed.), The Routledge Handbook of Linguistics (pp. 311-328). New York: Routledge.
  • Clifton, C. J., Meyer, A. S., Wurm, L. H., & Treiman, R. (2013). Language comprehension and production. In A. F. Healy, & R. W. Proctor (Eds.), Handbook of Psychology, Volume 4, Experimental Psychology. 2nd Edition (pp. 523-547). Hoboken, NJ: Wiley.

    Abstract

    In this chapter, we survey the processes of recognizing and producing words and of understanding and creating sentences. Theory and research on these topics have been shaped by debates about how various sources of information are integrated in these processes, and about the role of language structure, as analyzed in the discipline of linguistics. In this chapter, we describe current views of fluent language users' comprehension of spoken and written language and their production of spoken language. We review what we consider to be the most important findings and theories in psycholinguistics, returning again and again to the questions of modularity and the importance of linguistic knowledge. Although we acknowledge the importance of social factors in language use, our focus is on core processes such as parsing and word retrieval that are not necessarily affected by such factors. We do not have space to say much about the important fields of developmental psycholinguistics, which deals with the acquisition of language by children, or applied psycholinguistics, which encompasses such topics as language disorders and language teaching. Although we recognize that there is burgeoning interest in the measurement of brain activity during language processing and how language is represented in the brain, space permits only occasional pointers to work in neuropsychology and the cognitive neuroscience of language. For treatment of these topics, and others, the interested reader could begin with two recent handbooks of psycholinguistics (Gaskell, 2007; Traxler & Gernsbacher, 2006) and a handbook of cognitive neuroscience (Gazzaniga, 2004).
  • Crago, M. B., & Allen, S. E. M. (1998). Acquiring Inuktitut. In O. L. Taylor, & L. Leonard (Eds.), Language Acquisition Across North America: Cross-Cultural And Cross-Linguistic Perspectives (pp. 245-279). San Diego, CA, USA: Singular Publishing Group, Inc.
  • Crago, M. B., Allen, S. E. M., & Pesco, D. (1998). Issues of Complexity in Inuktitut and English Child Directed Speech. In Proceedings of the twenty-ninth Annual Stanford Child Language Research Forum (pp. 37-46).
  • Croijmans, I., & Majid, A. (2016). Language does not explain the wine-specific memory advantage of wine experts. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 141-146). Austin, TX: Cognitive Science Society.

    Abstract

    Although people are poor at naming odors, naming a smell helps to remember that odor. Previous studies show wine experts have better memory for smells, and they also name smells differently than novices. Is wine experts’ odor memory verbally mediated? And is the odor memory advantage that experts have over novices restricted to odors in their domain of expertise, or does it generalize? Twenty-four wine experts and 24 novices smelled wines, wine-related odors and common odors, and remembered these. Half the participants also named the smells. Wine experts had better memory for wines, but not for the other odors, indicating their memory advantage is restricted to wine. Wine experts named odors better than novices, but there was no relationship between experts’ ability to name odors and their memory for odors. This suggests experts’ odor memory advantage is not linguistically mediated, but may be the result of differential perceptual learning.
  • Cutler, A., & Otake, T. (1998). Assimilation of place in Japanese and Dutch. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 5 (pp. 1751-1754). Sydney: ICSLP.

    Abstract

    Assimilation of place of articulation across a nasal and a following stop consonant is obligatory in Japanese, but not in Dutch. In four experiments the processing of assimilated forms by speakers of Japanese and Dutch was compared, using a task in which listeners blended pseudo-word pairs such as ranga-serupa. An assimilated blend of this pair would be rampa, an unassimilated blend rangpa. Japanese listeners produced significantly more assimilated than unassimilated forms, both with pseudo-Japanese and pseudo-Dutch materials, while Dutch listeners produced significantly more unassimilated than assimilated forms in each materials set. This suggests that Japanese listeners, whose native-language phonology involves obligatory assimilation constraints, represent the assimilated nasals in nasal-stop sequences as unmarked for place of articulation, while Dutch listeners, who are accustomed to hearing unassimilated forms, represent the same nasal segments as marked for place of articulation.
  • Ip, M., & Cutler, A. (2016). Cross-language data on five types of prosodic focus. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 330-334).

    Abstract

    To examine the relative roles of language-specific and language-universal mechanisms in the production of prosodic focus, we compared production of five different types of focus by native speakers of English and Mandarin. Two comparable dialogues were constructed for each language, with the same words appearing in focused and unfocused position; 24 speakers recorded each dialogue in each language. Duration, F0 (mean, maximum, range), and rms-intensity (mean, maximum) of all critical word tokens were measured. Across the different types of focus, cross-language differences were observed in the degree to which English versus Mandarin speakers use the different prosodic parameters to mark focus, suggesting that while prosody may be universally available for expressing focus, the means of its employment may be considerably language-specific.
  • Cutler, A. (1998). How listeners find the right words. In Proceedings of the Sixteenth International Congress on Acoustics: Vol. 2 (pp. 1377-1380). Melville, NY: Acoustical Society of America.

    Abstract

    Languages contain tens of thousands of words, but these are constructed from a tiny handful of phonetic elements. Consequently, words resemble one another, or can be embedded within one another, a coup stick snot with standing. The process of spoken-word recognition by human listeners involves activation of multiple word candidates consistent with the input, and direct competition between activated candidate words. Further, human listeners are sensitive, at an early, prelexical, stage of speech processing, to constraints on what could potentially be a word of the language.
  • Cutler, A., Treiman, R., & Van Ooijen, B. (1998). Orthografik inkoncistensy ephekts in foneme detektion? In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2783-2786). Sydney: ICSLP.

    Abstract

    The phoneme detection task is widely used in spoken word recognition research. Alphabetically literate participants, however, are more used to explicit representations of letters than of phonemes. The present study explored whether phoneme detection is sensitive to how target phonemes are, or may be, orthographically realised. Listeners detected the target sounds [b,m,t,f,s,k] in word-initial position in sequences of isolated English words. Response times were faster to the targets [b,m,t], which have consistent word-initial spelling, than to the targets [f,s,k], which are inconsistently spelled, but only when listeners’ attention was drawn to spelling by the presence in the experiment of many irregularly spelled fillers. Within the inconsistent targets [f,s,k], there was no significant difference between responses to targets in words with majority and minority spellings. We conclude that performance in the phoneme detection task is not necessarily sensitive to orthographic effects, but that salient orthographic manipulation can induce such sensitivity.
  • Cutler, A. (1998). Prosodic structure and word recognition. In A. D. Friederici (Ed.), Language comprehension: A biological perspective (pp. 41-70). Heidelberg: Springer.
  • Cutler, A. (1998). The recognition of spoken words with variable representations. In D. Duez (Ed.), Proceedings of the ESCA Workshop on Sound Patterns of Spontaneous Speech (pp. 83-92). Aix-en-Provence: Université de Aix-en-Provence.
  • Cutler, A., & Bruggeman, L. (2013). Vocabulary structure and spoken-word recognition: Evidence from French reveals the source of embedding asymmetry. In Proceedings of INTERSPEECH: 14th Annual Conference of the International Speech Communication Association (pp. 2812-2816).

    Abstract

    Vocabularies contain hundreds of thousands of words built from only a handful of phonemes, so that inevitably longer words tend to contain shorter ones. In many languages (but not all) such embedded words occur more often word-initially than word-finally, and this asymmetry, if present, has far-reaching consequences for spoken-word recognition. Prior research had ascribed the asymmetry to suffixing or to effects of stress (in particular, final syllables containing the vowel schwa). Analyses of the standard French vocabulary here reveal an effect of suffixing, as predicted by this account, and further analyses of an artificial variety of French reveal that extensive final schwa has an independent and additive effect in promoting the embedding asymmetry.
  • Dediu, D., & Moisik, S. R. (2016). Anatomical biasing of click learning and production: An MRI and 3d palate imaging study. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/57.html.

    Abstract

    The current paper presents results for data on click learning obtained from a larger imaging study (using MRI and 3D intraoral scanning) designed to quantify and characterize intra- and inter-population variation of vocal tract structures and the relation of this to speech production. The aim of the click study was to ascertain whether and to what extent vocal tract morphology influences (1) the ability to learn to produce clicks and (2) the productions of those that successfully learn to produce these sounds. The results indicate that the presence of an alveolar ridge certainly does not prevent an individual from learning to produce click sounds (1). However, the subtle details of how clicks are produced may indeed be driven by palate shape (2).
  • Dediu, D., Cysouw, M., Levinson, S. C., Baronchelli, A., Christiansen, M. H., Croft, W., Evans, N., Garrod, S., Gray, R., Kandler, A., & Lieven, E. (2013). Cultural evolution of language. In P. J. Richerson, & M. H. Christiansen (Eds.), Cultural evolution: Society, technology, language, and religion. Strüngmann Forum Reports, vol. 12 (pp. 303-332). Cambridge, Mass: MIT Press.

    Abstract

    This chapter argues that an evolutionary cultural approach to language not only has already proven fruitful, but it probably holds the key to understanding many puzzling aspects of language, its change and origins. The chapter begins by highlighting several still common misconceptions about language that might seem to call into question a cultural evolutionary approach. It explores the antiquity of language and sketches a general evolutionary approach discussing the aspects of function, fitness, replication, and selection, as well as the relevant units of linguistic evolution. In this context, the chapter looks at some fundamental aspects of linguistic diversity such as the nature of the design space, the mechanisms generating it, and the shape and fabric of language. Given that biology is another evolutionary system, its complex coevolution with language needs to be understood in order to have a proper theory of language. Throughout the chapter, various challenges are identified and discussed, sketching promising directions for future research. The chapter ends by listing the necessary data, methods, and theoretical developments required for a grounded evolutionary approach to language.
  • Dediu, D., & Moisik, S. (2016). Defining and counting phonological classes in cross-linguistic segment databases. In N. Calzolari, K. Choukri, T. Declerck, S. Goggi, M. Grobelnik, B. Maegaard, J. Mariani, H. Mazo, A. Moreno, J. Odijk, & S. Piperidis (Eds.), Proceedings of LREC 2016: 10th International Conference on Language Resources and Evaluation (pp. 1955-1962). Paris: European Language Resources Association (ELRA).

    Abstract

    Recently, there has been an explosion in the availability of large, good-quality cross-linguistic databases such as WALS (Dryer & Haspelmath, 2013), Glottolog (Hammarstrom et al., 2015) and Phoible (Moran & McCloy, 2014). Databases such as Phoible contain the actual segments used by various languages as they are given in the primary language descriptions. However, this segment-level representation cannot be used directly for analyses that require generalizations over classes of segments that share theoretically interesting features. Here we present a method and the associated R (R Core Team, 2014) code that allows the flexible definition of such meaningful classes and that can identify the sets of segments falling into such a class for any language inventory. The method and its results are important for those interested in exploring cross-linguistic patterns of phonetic and phonological diversity and their relationship to extra-linguistic factors and processes such as climate, economics, history or human genetics.
  • Dediu, D. (2013). Genes: Interactions with language on three levels — Inter-individual variation, historical correlations and genetic biasing. In P.-M. Binder, & K. Smith (Eds.), The language phenomenon: Human communication from milliseconds to millennia (pp. 139-161). Berlin: Springer. doi:10.1007/978-3-642-36086-2_7.

    Abstract

    The complex inter-relationships between genetics and linguistics encompass all four scales highlighted by the contributions to this book and, together with cultural transmission, the genetics of language holds the promise to offer a unitary understanding of this fascinating phenomenon. There are inter-individual differences in genetic makeup which contribute to the obvious fact that we are not identical in the way we understand and use language and, by studying them, we will be able to both better treat and enhance ourselves. There are correlations between the genetic configuration of human groups and their languages, reflecting the historical processes shaping them, and there also seem to exist genes which can influence some characteristics of language, biasing it towards or against certain states by altering the way language is transmitted across generations. Besides the joys of pure knowledge, the understanding of these three aspects of genetics relevant to language will potentially trigger advances in medicine, linguistics, psychology or the understanding of our own past and, last but not least, a profound change in the way we regard one of the emblems of being human: our capacity for language.
  • Dingemanse, M. (2013). Wie wir mit Sprache malen - How to paint with language. Forschungsbericht 2013 - Max-Planck-Institut für Psycholinguistik. In Max-Planck-Gesellschaft Jahrbuch 2013. München: Max Planck Society for the Advancement of Science. Retrieved from http://www.mpg.de/6683977/Psycholinguistik_JB_2013.

    Abstract

    Words evolve not as blobs of ink on paper but in face to face interaction. The nature of language as fundamentally interactive and multimodal is shown by the study of ideophones, vivid sensory words that thrive in conversations around the world. The ways in which these Lautbilder enable precise communication about sensory knowledge have for the first time been studied in detail. It turns out that we can paint with language, and that the onomatopoeia we sometimes classify as childish might be a subset of a much richer toolkit for depiction in speech, available to us all.
  • Dolscheid, S., Graver, C., & Casasanto, D. (2013). Spatial congruity effects reveal metaphors, not markedness. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2213-2218). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0405/index.html.

    Abstract

    Spatial congruity effects have often been interpreted as evidence for metaphorical thinking, but an alternative markedness-based account challenges this view. In two experiments, we directly compared metaphor and markedness explanations for spatial congruity effects, using musical pitch as a testbed. English speakers who talk about pitch in terms of spatial height were tested in speeded space-pitch compatibility tasks. To determine whether space-pitch congruency effects could be elicited by any marked spatial continuum, participants were asked to classify high- and low-frequency pitches as 'high' and 'low' or as 'front' and 'back' (both pairs of terms constitute cases of marked continuums). We found congruency effects in high/low conditions but not in front/back conditions, indicating that markedness is not sufficient to account for congruity effects (Experiment 1). A second experiment showed that congruency effects were specific to spatial words that cued a vertical schema (tall/short), and that congruity effects were not an artifact of polysemy (e.g., 'high' referring both to space and pitch). Together, these results suggest that congruency effects reveal metaphorical uses of spatial schemas, not markedness effects.
  • Doumas, L. A., & Martin, A. E. (2016). Abstraction in time: Finding hierarchical linguistic structure in a model of relational processing. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 2279-2284). Austin, TX: Cognitive Science Society.

    Abstract

    Abstract mental representation is fundamental for human cognition. Forming such representations in time, especially from dynamic and noisy perceptual input, is a challenge for any processing modality, but perhaps none so acutely as for language processing. We show that LISA (Hummel & Holyoak, 1997) and DORA (Doumas, Hummel, & Sandhofer, 2008), models built to process and to learn structured (i.e., symbolic) representations of conceptual properties and relations from unstructured inputs, show oscillatory activation during processing that is highly similar to the cortical activity elicited by the linguistic stimuli from Ding et al. (2016). We argue, as Ding et al. (2016), that this activation reflects formation of hierarchical linguistic representation, and furthermore, that the kind of computational mechanisms in LISA/DORA (e.g., temporal binding by systematic asynchrony of firing) may underlie formation of abstract linguistic representations in the human brain. It may be this repurposing that allowed for the generation or emergence of hierarchical linguistic structure, and therefore, human language, from extant cognitive and neural systems. We conclude that models of thinking and reasoning and models of language processing must be integrated, not only for increased plausibility, but in order to advance both fields towards a larger integrative model of human cognition.
  • Drozd, K. F. (1998). No as a determiner in child English: A summary of categorical evidence. In A. Sorace, C. Heycock, & R. Shillcock (Eds.), Proceedings of the Gala '97 Conference on Language Acquisition (pp. 34-39). Edinburgh, UK: Edinburgh University Press.

    Abstract

    This paper summarizes the results of a descriptive syntactic category analysis of child English no which reveals that young children use and represent no as a determiner and negatives like no pen as NPs, contra standard analyses.
  • Drozdova, P., Van Hout, R., & Scharenborg, O. (2016). Processing and adaptation to ambiguous sounds during the course of perceptual learning. In Proceedings of Interspeech 2016: The 17th Annual Conference of the International Speech Communication Association (pp. 2811-2815). doi:10.21437/Interspeech.2016-814.

    Abstract

    Listeners use their lexical knowledge to interpret ambiguous sounds, and retune their phonetic categories to include this ambiguous sound. Although there is ample evidence for lexically-guided retuning, the adaptation process is not fully understood. Using a lexical decision task with an embedded auditory semantic priming task, the present study investigates whether words containing an ambiguous sound are processed in the same way as “natural” words and whether adaptation to the ambiguous sound tends to equalize the processing of “ambiguous” and natural words. Analyses of the yes/no responses and reaction times to natural and “ambiguous” words showed that words containing an ambiguous sound were accepted as words less often and were processed slower than the same words without ambiguity. The difference in acceptance disappeared after exposure to approximately 15 ambiguous items. Interestingly, lower acceptance rates and slower processing did not have an effect on the processing of semantic information of the following word. However, lower acceptance rates of ambiguous primes predict slower reaction times of these primes, suggesting an important role of stimulus-specific characteristics in triggering lexically-guided perceptual learning.
  • Durco, M., & Windhouwer, M. (2013). Semantic Mapping in CLARIN Component Metadata. In Proceedings of MTSR 2013, the 7th Metadata and Semantics Research Conference (pp. 163-168). New York: Springer.

    Abstract

    In recent years, large scale initiatives like CLARIN set out to overcome the notorious heterogeneity of metadata formats in the domain of language resources. The CLARIN Component Metadata Infrastructure established means for flexible resource descriptions for the domain of language resources. The Data Category Registry ISOcat and the accompanying Relation Registry foster semantic interoperability within the growing heterogeneous collection of metadata records. This paper describes the CMD Infrastructure focusing on the facilities for semantic mapping, and gives also an overview of the current status in the joint component metadata domain.
  • Eibl-Eibesfeldt, I., Senft, B., & Senft, G. (1998). Trobriander (Ost-Neuguinea, Trobriand Inseln, Kaile'una) Fadenspiele 'ninikula'. In Ethnologie - Humanethologische Begleitpublikationen von I. Eibl-Eibesfeldt und Mitarbeitern. Sammelband I, 1985-1987. Göttingen: Institut für den Wissenschaftlichen Film.
  • Enfield, N. J. (2013). A ‘Composite Utterances’ approach to meaning. In C. Müller, E. Fricke, S. Ladewig, A. Cienki, D. McNeill, & S. Teßendorf (Eds.), Handbook Body – Language – Communication. Volume 1 (pp. 689-706). Berlin: Mouton de Gruyter.
  • Enfield, N. J. (2013). Doing fieldwork on the body, language, and communication. In C. Müller, E. Fricke, S. Ladewig, A. Cienki, D. McNeill, & S. Teßendorf (Eds.), Handbook Body – Language – Communication. Volume 1 (pp. 974-981). Berlin: Mouton de Gruyter.
  • Enfield, N. J. (2013). Hippie, interrupted. In J. Barker, & J. Lindquist (Eds.), Figures of Southeast Asian modernity (pp. 101-103). Honolulu: University of Hawaii Press.
  • Enfield, N. J., Dingemanse, M., Baranova, J., Blythe, J., Brown, P., Dirksmeyer, T., Drew, P., Floyd, S., Gipper, S., Gisladottir, R. S., Hoymann, G., Kendrick, K. H., Levinson, S. C., Magyari, L., Manrique, E., Rossi, G., San Roque, L., & Torreira, F. (2013). Huh? What? – A first survey in 21 languages. In M. Hayashi, G. Raymond, & J. Sidnell (Eds.), Conversational repair and human understanding (pp. 343-380). New York: Cambridge University Press.

    Abstract

    A comparison of conversation in twenty-one languages from around the world reveals commonalities and differences in the way that people do open-class other-initiation of repair (Schegloff, Jefferson, and Sacks, 1977; Drew, 1997). We find that speakers of all of the spoken languages in the sample make use of a primary interjection strategy (in English it is Huh?), where the phonetic form of the interjection is strikingly similar across the languages: a monosyllable featuring an open non-back vowel [a, æ, ə, ʌ], often nasalized, usually with rising intonation and sometimes an [h-] onset. We also find that most of the languages have another strategy for open-class other-initiation of repair, namely the use of a question word (usually “what”). Here we find significantly more variation across the languages. The phonetic form of the question word involved is completely different from language to language: e.g., English [wɑt] versus Cha'palaa [ti] versus Duna [aki]. Furthermore, the grammatical structure in which the repair-initiating question word can or must be expressed varies within and across languages. In this chapter we present data on these two strategies – primary interjections like Huh? and question words like What? – with discussion of possible reasons for the similarities and differences across the languages. We explore some implications for the notion of repair as a system, in the context of research on the typology of language use. The general outline of this chapter is as follows. We first discuss repair as a system across languages and then introduce the focus of the chapter: open-class other-initiation of repair. A discussion of the main findings follows, where we identify two alternative strategies in the data: an interjection strategy (Huh?) and a question word strategy (What?). Formal features and possible motivations are discussed for the interjection strategy and the question word strategy in order. A final section discusses bodily behavior including posture, eyebrow movements and eye gaze, both in spoken languages and in a sign language.
  • Enfield, N. J. (2013). Reference in conversation. In J. Sidnell, & T. Stivers (Eds.), The handbook of conversation analysis (pp. 433-454). Malden, MA: Wiley-Blackwell. doi:10.1002/9781118325001.ch21.

    Abstract

    This chapter contains sections titled: Introduction; Lexical Selection in Reference: Introductory Examples of Reference to Times; Multiple “Preferences”; Future Directions; Conclusion.
  • Ernestus, M. (2016). L'utilisation des corpus oraux pour la recherche en (psycho)linguistique. In M. Kilani-Schoch, C. Surcouf, & A. Xanthos (Eds.), Nouvelles technologies et standards méthodologiques en linguistique (pp. 65-93). Lausanne: Université de Lausanne.
  • Eryilmaz, K., Little, H., & De Boer, B. (2016). Using HMMs To Attribute Structure To Artificial Languages. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/125.html.

    Abstract

    We investigated the use of Hidden Markov Models (HMMs) as a way of representing repertoires of continuous signals in order to infer their building blocks. We tested the idea on a dataset from an artificial language experiment. The study demonstrates using HMMs for this purpose is viable, but also that there is a lot of room for refinement such as explicit duration modeling, incorporation of autoregressive elements and relaxing the Markovian assumption, in order to accommodate specific details.
  • Filippi, P., Congdon, J. V., Hoang, J., Bowling, D. L., Reber, S., Pašukonis, A., Hoeschele, M., Ocklenburg, S., de Boer, B., Sturdy, C. B., Newen, A., & Güntürkün, O. (2016). Humans Recognize Vocal Expressions Of Emotional States Universally Across Species. In The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/91.html.

    Abstract

    The perception of danger in the environment can induce physiological responses (such as a heightened state of arousal) in animals, which may cause measurable changes in the prosodic modulation of the voice (Briefer, 2012). The ability to interpret the prosodic features of animal calls as an indicator of emotional arousal may have provided the first hominins with an adaptive advantage, enabling, for instance, the recognition of a threat in the surroundings. This ability might have paved the way for the ability to process meaningful prosodic modulations in the emerging linguistic utterances.
  • Filippi, P., Ocklenburg, S., Bowling, D. L., Heege, L., Newen, A., Güntürkün, O., & de Boer, B. (2016). Multimodal Processing Of Emotional Meanings: A Hypothesis On The Adaptive Value Of Prosody. In The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/90.html.

    Abstract

    Humans combine multiple sources of information to comprehend meanings. These sources can be characterized as linguistic (i.e., lexical units and/or sentences) or paralinguistic (e.g. body posture, facial expression, voice intonation, pragmatic context). Emotion communication is a special case in which linguistic and paralinguistic dimensions can simultaneously denote the same, or multiple incongruous referential meanings. Think, for instance, about when someone says “I’m sad!”, but does so with happy intonation and a happy facial expression. Here, the communicative channels express very specific (although conflicting) emotional states as denotations. In such cases of intermodal incongruence, are we involuntarily biased to respond to information in one channel over the other? We hypothesize that humans are involuntarily biased to respond to prosody over verbal content and facial expression, since the ability to communicate socially relevant information such as basic emotional states through prosodic modulation of the voice might have provided early hominins with an adaptive advantage that preceded the emergence of segmental speech (Darwin, 1871; Mithen, 2005). To address this hypothesis, we examined the interaction between multiple communicative channels in recruiting attentional resources, within a Stroop interference task (i.e. a task in which different channels give conflicting information; Stroop, 1935). In experiment 1, we used synonyms of “happy” and “sad” spoken with happy and sad prosody. Participants were asked to identify the emotion expressed by the verbal content while ignoring prosody (Word task) or vice versa (Prosody task). Participants responded faster and more accurately in the Prosody task. Within the Word task, incongruent stimuli were responded to more slowly and less accurately than congruent stimuli. In experiment 2, we adopted synonyms of “happy” and “sad” spoken in happy and sad prosody, while a happy or sad face was displayed. Participants were asked to identify the emotion expressed by the verbal content while ignoring prosody and face (Word task), to identify the emotion expressed by prosody while ignoring verbal content and face (Prosody task), or to identify the emotion expressed by the face while ignoring prosody and verbal content (Face task). Participants responded faster in the Face task and less accurately when the two non-focused channels were expressing an emotion that was incongruent with the focused one, as compared with the condition where all the channels were congruent. In addition, in the Word task, accuracy was lower when prosody was incongruent to verbal content and face, as compared with the condition where all the channels were congruent. Our data suggest that prosody interferes with emotion word processing, eliciting automatic responses even when conflicting with both verbal content and facial expressions at the same time. In contrast, although processed significantly faster than prosody and verbal content, faces alone are not sufficient to interfere in emotion processing within a three-dimensional Stroop task. Our findings align with the hypothesis that the ability to communicate emotions through prosodic modulation of the voice – which seems to be dominant over verbal content – is evolutionarily older than the emergence of segmental articulation (Mithen, 2005; Fitch, 2010). This hypothesis fits with quantitative data suggesting that prosody has a vital role in the perception of well-formed words (Johnson & Jusczyk, 2001), in the ability to map sounds to referential meanings (Filippi et al., 2014), and in syntactic disambiguation (Soderstrom et al., 2003). This research could complement studies on iconic communication within visual and auditory domains, providing new insights for models of language evolution. Further work aimed at how emotional cues from different modalities are simultaneously integrated will improve our understanding of how humans interpret multimodal emotional meanings in real life interactions.
  • Fisher, S. E. (2016). A molecular genetic perspective on speech and language. In G. Hickok, & S. Small (Eds.), Neurobiology of Language (pp. 13-24). Amsterdam: Elsevier. doi:10.1016/B978-0-12-407794-2.00002-X.

    Abstract

    The rise of genomic technologies has yielded exciting new routes for studying the biological foundations of language. Researchers have begun to identify genes implicated in neurodevelopmental disorders that disrupt speech and language skills. This chapter illustrates how such work can provide powerful entry points into the critical neural pathways using FOXP2 as an example. Rare mutations of this gene cause problems with learning to sequence mouth movements during speech, accompanied by wide-ranging impairments in language production and comprehension. FOXP2 encodes a regulatory protein, a hub in a network of other genes, several of which have also been associated with language-related impairments. Versions of FOXP2 are found in similar form in many vertebrate species; indeed, studies of animals and birds suggest conserved roles in the development and plasticity of certain sets of neural circuits. Thus, the contributions of this gene to human speech and language involve modifications of evolutionarily ancient functions.
  • Fisher, S. E. (2013). Building bridges between genes, brains and language. In J. J. Bolhuis, & M. Everaert (Eds.), Birdsong, speech and language: Exploring the evolution of mind and brain (pp. 425-454). Cambridge, Mass: MIT Press.
  • Flecken, M., & Gerwien, J. (2013). Grammatical aspect modulates event duration estimations: findings from Dutch. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th annual meeting of the Cognitive Science Society (CogSci 2013) (pp. 2309-2314). Austin, TX: Cognitive Science Society.
  • Floyd, S. (2016). Insubordination in Interaction: The Cha’palaa counter-assertive. In N. Evans, & H. Watanabe (Eds.), Dynamics of Insubordination (pp. 341-366). Amsterdam: John Benjamins.

    Abstract

    In the Cha’palaa language of Ecuador the main-clause use of the otherwise non-finite morpheme -ba can be accounted for by a specific interactive practice: the ‘counter-assertion’ of statement or implicature of a previous conversational turn. Attention to the ways in which different constructions are deployed in such recurrent conversational contexts reveals a plausible account for how this type of dependent clause has come to be one of the options for finite clauses. After giving some background on Cha’palaa and placing ba clauses within a larger ecology of insubordination constructions in the language, this chapter uses examples from a video corpus of informal conversation to illustrate how interactive data provides answers that may otherwise be elusive for understanding how the different grammatical options for Cha’palaa finite verb constructions have been structured by insubordination.
  • Floyd, S. (2013). Semantic transparency and cultural calquing in the Northwest Amazon. In P. Epps, & K. Stenzel (Eds.), Upper Rio Negro: Cultural and linguistic interaction in northwestern Amazonia (pp. 271-308). Rio de Janeiro: Museu do Indio. Retrieved from http://www.museunacional.ufrj.br/ppgas/livros_ele.html.

    Abstract

    The ethnographic literature has sometimes described parts of the northwest Amazon as areas of shared culture across linguistic groups. This paper illustrates how a principle of semantic transparency across languages is a key means of establishing elements of a common regional culture through practices like the calquing of ethnonyms and toponyms so that they are semantically, but not phonologically, equivalent across languages. It places the upper Rio Negro area of the northwest Amazon in a general discussion of cross-linguistic naming practices in South America and considers the extent to which a preference for semantic transparency can be linked to cases of widespread cultural ‘calquing’, in which culturally-important meanings are kept similar across different linguistic systems. It also addresses the principle of semantic transparency beyond specific referential phrases and into larger discourse structures. It concludes that an attention to semiotic practices in multilingual settings can provide new and more complex ways of thinking about the idea of shared culture.
  • Floyd, S., & Norcliffe, E. (2016). Switch reference systems in the Barbacoan languages and their neighbors. In R. Van Gijn, & J. Hammond (Eds.), Switch Reference 2.0 (pp. 207-230). Amsterdam: Benjamins.

    Abstract

    This chapter surveys the available data on Barbacoan languages and their neighbors to explore a case study of switch reference within a single language family and in a situation of areal contact. To the extent possible given the available data, we weigh accounts appealing to common inheritance and areal convergence to ask what combination of factors led to the current state of these languages. We discuss the areal distribution of switch reference systems in the northwest Andean region, the different types of systems and degrees of complexity observed, and scenarios of contact and convergence, particularly in the case of Barbacoan and Ecuadorian Quechua. We then cover each of the Barbacoan languages’ systems (with the exception of Totoró, represented by its close relative Guambiano), identifying limited formal cognates, primarily between closely-related Tsafiki and Cha’palaa, as well as broader functional similarities, particularly in terms of interactions with topic/focus markers. The chapter then accounts for the current state of affairs with a complex scenario of areal prevalence of switch reference combined with deep structural family inheritance and formal re-structuring of the systems over time.
  • Frost, R. L. A., Monaghan, P., & Christiansen, M. H. (2016). Using Statistics to Learn Words and Grammatical Categories: How High Frequency Words Assist Language Acquisition. In A. Papafragou, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 81-86). Austin, TX: Cognitive Science Society. Retrieved from https://mindmodeling.org/cogsci2016/papers/0027/index.html.

    Abstract

    Recent studies suggest that high-frequency words may benefit speech segmentation (Bortfeld, Morgan, Golinkoff, & Rathbun, 2005) and grammatical categorisation (Monaghan, Christiansen, & Chater, 2007). To date, these tasks have been examined separately, but not together. We familiarised adults with continuous speech comprising repetitions of target words, and compared learning to a language in which targets appeared alongside high-frequency marker words. Marker words reliably preceded targets, and distinguished them into two otherwise unidentifiable categories. Participants completed a 2AFC segmentation test, and a similarity judgement categorisation test. We tested transfer to a word-picture mapping task, where words from each category were used either consistently or inconsistently to label actions/objects. Participants segmented the speech successfully, but only demonstrated effective categorisation when speech contained high-frequency marker words. The advantage of marker words extended to the early stages of the transfer task. Findings indicate the same high-frequency words may assist speech segmentation and grammatical categorisation.
  • Gannon, E., He, J., Gao, X., & Chaparro, B. (2016). RSVP Reading on a Smart Watch. In Proceedings of the Human Factors and Ergonomics Society 2016 Annual Meeting (pp. 1130-1134).

    Abstract

    Reading with Rapid Serial Visual Presentation (RSVP) has shown promise for optimizing screen space and increasing reading speed without compromising comprehension. Given the wide use of small-screen devices, the present study compared RSVP and traditional reading on three types of reading comprehension, reading speed, and subjective measures on a smart watch. Results confirm previous studies that show faster reading speed with RSVP without detracting from comprehension. Subjective data indicate that Traditional is strongly preferred to RSVP as a primary reading method. Given the optimal use of screen space, increased speed and comparable comprehension, future studies should focus on making RSVP a more comfortable format.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). Automatic sign language identification. In Proceedings of the 20th IEEE International Conference on Image Processing (ICIP) (pp. 2626-2630).

    Abstract

    We propose a Random-Forest based sign language identification system. The system uses low-level visual features and is based on the hypothesis that sign languages have varying distributions of phonemes (hand-shapes, locations and movements). We evaluated the system on two sign languages -- British SL and Greek SL, both taken from a publicly available corpus, called Dicta Sign Corpus. Achieved average F1 scores are about 95% - indicating that sign languages can be identified with high accuracy using only low-level visual features.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). Automatic signer diarization - the mover is the signer approach. In Computer Vision and Pattern Recognition Workshops (CVPRW), 2013 IEEE Conference on (pp. 283-287). doi:10.1109/CVPRW.2013.49.

    Abstract

    We present a vision-based method for signer diarization -- the task of automatically determining "who signed when?" in a video. This task has similar motivations and applications as speaker diarization but has received little attention in the literature. In this paper, we motivate the problem and propose a method for solving it. The method is based on the hypothesis that signers make more movements than their interlocutors. Experiments on four videos (a total of 1.4 hours and each consisting of two signers) show the applicability of the method. The best diarization error rate (DER) obtained is 0.16.
  • Gebre, B. G., Zampieri, M., Wittenburg, P., & Heskes, T. (2013). Improving Native Language Identification with TF-IDF weighting. In Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications (pp. 216-223).

    Abstract

    This paper presents a Native Language Identification (NLI) system based on TF-IDF weighting schemes and using linear classifiers - support vector machines, logistic regressions and perceptrons. The system was one of the participants of the 2013 NLI Shared Task in the closed-training track, achieving 0.814 overall accuracy for a set of 11 native languages. This accuracy was only 2.2 percentage points lower than the winner's performance. Furthermore, with subsequent evaluations using 10-fold cross-validation (as given by the organizers) on the combined training and development data, the best average accuracy obtained is 0.8455 and the features that contributed to this accuracy are the TF-IDF of the combined unigrams and bigrams of words.
  • Gebre, B. G., Wittenburg, P., & Heskes, T. (2013). The gesturer is the speaker. In Proceedings of the 38th International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013) (pp. 3751-3755).

    Abstract

    We present and solve the speaker diarization problem in a novel way. We hypothesize that the gesturer is the speaker and that identifying the gesturer can be taken as identifying the active speaker. We provide evidence in support of the hypothesis from gesture literature and audio-visual synchrony studies. We also present a vision-only diarization algorithm that relies on gestures (i.e. upper body movements). Experiments carried out on 8.9 hours of a publicly available dataset (the AMI meeting data) show that diarization error rates as low as 15% can be achieved.
  • Gerwien, J., & Flecken, M. (2016). First things first? Top-down influences on event apprehension. In A. Papafragou, D. Grodner, D. Mirman, & J. Trueswell (Eds.), Proceedings of the 38th Annual Meeting of the Cognitive Science Society (CogSci 2016) (pp. 2633-2638). Austin, TX: Cognitive Science Society.

    Abstract

    Not much is known about event apprehension, the earliest stage of information processing in elicited language production studies, using pictorial stimuli. A reason for our lack of knowledge on this process is that apprehension happens very rapidly (<350 ms after stimulus onset, Griffin & Bock 2000), making it difficult to measure the process directly. To broaden our understanding of apprehension, we analyzed landing positions and onset latencies of first fixations on visual stimuli (pictures of real-world events) given short stimulus presentation times, presupposing that the first fixation directly results from information processing during apprehension.
  • Gijssels, T., Bottini, R., Rueschemeyer, S.-A., & Casasanto, D. (2013). Space and time in the parietal cortex: fMRI Evidence for a neural asymmetry. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 495-500). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0113/index.html.

    Abstract

    How are space and time related in the brain? This study contrasts two proposals that make different predictions about the interaction between spatial and temporal magnitudes. Whereas ATOM implies that space and time are symmetrically related, Metaphor Theory claims they are asymmetrically related. Here we investigated whether space and time activate the same neural structures in the inferior parietal cortex (IPC) and whether the activation is symmetric or asymmetric across domains. We measured participants’ neural activity while they made temporal and spatial judgments on the same visual stimuli. The behavioral results replicated earlier observations of a space-time asymmetry: Temporal judgments were more strongly influenced by irrelevant spatial information than vice versa. The BOLD fMRI data indicated that space and time activated overlapping clusters in the IPC and that, consistent with Metaphor Theory, this activation was asymmetric: The shared region of IPC was activated more strongly during temporal judgments than during spatial judgments. We consider three possible interpretations of this neural asymmetry, based on 3 possible functions of IPC.
  • Gordon, P. C., Lowder, M. W., & Hoedemaker, R. S. (2016). Reading in normally aging adults. In H. Wright (Ed.), Cognitive-Linguistic Processes and Aging (pp. 165-192). Amsterdam: Benjamins. doi:10.1075/z.200.07gor.

    Abstract

    The activity of reading raises fundamental theoretical and practical questions about healthy cognitive aging. Reading relies greatly on knowledge of patterns of language and of meaning at the level of words and topics of text. Further, this knowledge must be rapidly accessed so that it can be coordinated with processes of perception, attention, memory and motor control that sustain skilled reading at rates of four-to-five words a second. As such, reading depends both on crystallized semantic intelligence which grows or is maintained through healthy aging, and on components of fluid intelligence which decline with age. Reading is important to older adults because it facilitates completion of everyday tasks that are essential to independent living. In addition, it entails the kind of active mental engagement that can preserve and deepen the cognitive reserve that may mitigate the negative consequences of age-related changes in the brain. This chapter reviews research on the front end of reading (word recognition) and on the back end of reading (text memory) because both of these abilities are surprisingly robust to declines associated with cognitive aging. For word recognition, that robustness is surprising because rapid processing of the sort found in reading is usually impaired by aging; for text memory, it is surprising because other types of episodic memory performance (e.g., paired associates) substantially decline in aging. These two otherwise quite different levels of reading comprehension remain robust because they draw on the knowledge of language that older adults gain through a life-time of experience with language.
  • Gussenhoven, C., & Zhou, W. (2013). Revisiting pitch slope and height effects on perceived duration. In Proceedings of INTERSPEECH 2013: 14th Annual Conference of the International Speech Communication Association (pp. 1365-1369).

    Abstract

    The shape of pitch contours has been shown to have an effect on the perceived duration of vowels. For instance, vowels with high level pitch and vowels with falling contours sound longer than vowels with low level pitch. Depending on whether the comparison is between level pitches or between level and dynamic contours, these findings have been interpreted in two ways. For inter-level comparisons, where the duration results are the reverse of production results, a hypercorrection strategy in production has been proposed [1]. By contrast, for comparisons between level pitches and dynamic contours, the longer production data for dynamic contours have been held responsible. We report an experiment with Dutch and Chinese listeners which aimed to show that production data and perception data are each other’s opposites for high, low, falling and rising contours. We explain the results, which are consistent with earlier findings, in terms of the compensatory listening strategy of [2], arguing that the perception effects are due to a perceptual compensation of articulatory strategies and constraints, rather than that differences in production compensate for psycho-acoustic perception effects.
  • Hagoort, P. (2016). MUC (Memory, Unification, Control): A Model on the Neurobiology of Language Beyond Single Word Processing. In G. Hickok, & S. Small (Eds.), Neurobiology of language (pp. 339-347). Amsterdam: Elsevier. doi:10.1016/B978-0-12-407794-2.00028-6.

    Abstract

    A neurobiological model of language is discussed that overcomes the shortcomings of the classical Wernicke-Lichtheim-Geschwind model. It is based on a subdivision of language processing into three components: Memory, Unification, and Control. The functional components as well as the neurobiological underpinnings of the model are discussed. In addition, the need for extension beyond the classical core regions for language is shown. Attentional networks as well as networks for inferential processing are crucial to realize language comprehension beyond single word processing and beyond decoding propositional content.
  • Hagoort, P., & Poeppel, D. (2013). The infrastructure of the language-ready brain. In M. A. Arbib (Ed.), Language, music, and the brain: A mysterious relationship (pp. 233-255). Cambridge, MA: MIT Press.

    Abstract

    This chapter sketches in very general terms the cognitive architecture of both language comprehension and production, as well as the neurobiological infrastructure that makes the human brain ready for language. Focus is on spoken language, since that compares most directly to processing music. It is worth bearing in mind that humans can also interface with language as a cognitive system using sign and text (visual) as well as Braille (tactile); that is to say, the system can connect with input/output processes in any sensory modality. Language processing consists of a complex and nested set of subroutines to get from sound to meaning (in comprehension) or meaning to sound (in production), with remarkable speed and accuracy. The first section outlines a selection of the major constituent operations, from fractionating the input into manageable units to combining and unifying information in the construction of meaning. The next section addresses the neurobiological infrastructure hypothesized to form the basis for language processing. Principal insights are summarized by building on the notion of “brain networks” for speech–sound processing, syntactic processing, and the construction of meaning, bearing in mind that such a neat three-way subdivision overlooks important overlap and shared mechanisms in the neural architecture subserving language processing. Finally, in keeping with the spirit of the volume, some possible relations are highlighted between language and music that arise from the infrastructure developed here. Our characterization of language and its neurobiological foundations is necessarily selective and brief. Our aim is to identify for the reader critical questions that require an answer to have a plausible cognitive neuroscience of language processing.
  • Hagoort, P. (1998). The shadows of lexical meaning in patients with semantic impairments. In B. Stemmer, & H. Whitaker (Eds.), Handbook of neurolinguistics (pp. 235-248). New York: Academic Press.
  • Hagoort, P. (2016). Zij zijn ons brein. In J. Brockman (Ed.), Machines die denken: Invloedrijke denkers over de komst van kunstmatige intelligentie (pp. 184-186). Amsterdam: Maven Publishing.
  • Hammarström, H., & O'Connor, L. (2013). Dependency sensitive typological distance. In L. Borin, & A. Saxena (Eds.), Approaches to measuring linguistic differences (pp. 337-360). Berlin: Mouton de Gruyter.
  • Hammarström, H. (2013). Noun class parallels in Kordofanian and Niger-Congo: Evidence of genealogical inheritance? In T. C. Schadeberg, & R. M. Blench (Eds.), Nuba Mountain Language Studies (pp. 549-570). Köln: Köppe.
  • Haun, D. B. M., & Over, H. (2013). Like me: A homophily-based account of human culture. In P. J. Richerson, & M. H. Christiansen (Eds.), Cultural Evolution: Society, technology, language, and religion (pp. 75-85). Cambridge, MA: MIT Press.
  • Hayano, K. (2013). Question design in conversation. In J. Sidnell, & T. Stivers (Eds.), The handbook of conversation analysis (pp. 395-414). Malden, MA: Wiley-Blackwell. doi:10.1002/9781118325001.ch19.

    Abstract

    This chapter contains sections titled: Introduction; Questions; Questioning and the Epistemic Gradient; Presuppositions, Agenda Setting and Preferences; Social Actions Implemented by Questions; Questions as Building Blocks of Institutional Activities; Future Directions.
  • Hendricks, I., Lefever, E., Croijmans, I., Majid, A., & Van den Bosch, A. (2016). Very quaffable and great fun: Applying NLP to wine reviews. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics: Vol 2 (pp. 306-312). Stroudsburg, PA: Association for Computational Linguistics.

    Abstract

    We automatically predict properties of wines on the basis of smell and flavor descriptions from experts’ wine reviews. We show wine experts are capable of describing their smell and flavor experiences in wine reviews in a sufficiently consistent manner, such that we can use their descriptions to predict properties of a wine based solely on language. The experimental results show promising F-scores when using lexical and semantic information to predict the color, grape variety, country of origin, and price of a wine. This demonstrates, contrary to popular opinion, that wine experts’ reviews really are informative.
  • Hintz, F., & Scharenborg, O. (2016). Neighbourhood density influences word recognition in native and non-native speech recognition in noise. In H. Van den Heuvel, B. Cranen, & S. Mattys (Eds.), Proceedings of the Speech Processing in Realistic Environments (SPIRE) workshop (pp. 46-47). Groningen.
  • Hintz, F., & Scharenborg, O. (2016). The effect of background noise on the activation of phonological and semantic information during spoken-word recognition. In Proceedings of Interspeech 2016: The 17th Annual Conference of the International Speech Communication Association (pp. 2816-2820).

    Abstract

    During spoken-word recognition, listeners experience phonological competition between multiple word candidates, which increases, relative to optimal listening conditions, when speech is masked by noise. Moreover, listeners activate semantic word knowledge during the word’s unfolding. Here, we replicated the effect of background noise on phonological competition and investigated to which extent noise affects the activation of semantic information in phonological competitors. Participants’ eye movements were recorded when they listened to sentences containing a target word and looked at three types of displays. The displays either contained a picture of the target word, or a picture of a phonological onset competitor, or a picture of a word semantically related to the onset competitor, each along with three unrelated distractors. The analyses revealed that, in noise, fixations to the target and to the phonological onset competitor were delayed and smaller in magnitude compared to the clean listening condition, most likely reflecting enhanced phonological competition. No evidence for the activation of semantic information in the phonological competitors was observed in noise and, surprisingly, also not in the clear. We discuss the implications of the lack of an effect and differences between the present and earlier studies.
  • Hofmeister, P., & Norcliffe, E. (2013). Does resumption facilitate sentence comprehension? In P. Hofmeister, & E. Norcliffe (Eds.), The core and the periphery: Data-driven perspectives on syntax inspired by Ivan A. Sag (pp. 225-246). Stanford, CA: CSLI Publications.
  • Holler, J., Schubotz, L., Kelly, S., Schuetze, M., Hagoort, P., & Ozyurek, A. (2013). Here's not looking at you, kid! Unaddressed recipients benefit from co-speech gestures when speech processing suffers. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2560-2565). Austin, TX: Cognitive Science Society. Retrieved from http://mindmodeling.org/cogsci2013/papers/0463/index.html.

    Abstract

    In human face-to-face communication, language comprehension is a multi-modal, situated activity. However, little is known about how we combine information from these different modalities, and how perceived communicative intentions, often signaled through visual signals, such as eye gaze, may influence this processing. We address this question by simulating a triadic communication context in which a speaker alternated her gaze between two different recipients. Participants thus viewed speech-only or speech+gesture object-related utterances when being addressed (direct gaze) or unaddressed (averted gaze). Two object images followed each message and participants’ task was to choose the object that matched the message. Unaddressed recipients responded significantly slower than addressees for speech-only utterances. However, perceiving the same speech accompanied by gestures sped them up to a level identical to that of addressees. That is, when speech processing suffers due to not being addressed, gesture processing remains intact and enhances the comprehension of a speaker’s message.
  • Huettig, F. (2013). Young children’s use of color information during language-vision mapping. In B. R. Kar (Ed.), Cognition and brain development: Converging evidence from various methodologies (pp. 368-391). Washington, DC: American Psychological Association Press.
  • Irvine, E., & Roberts, S. G. (2016). Deictic tools can limit the emergence of referential symbol systems. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/99.html.

    Abstract

    Previous experiments and models show that the pressure to communicate can lead to the emergence of symbols in specific tasks. The experiment presented here suggests that the ability to use deictic gestures can reduce the pressure for symbols to emerge in co-operative tasks. In the 'gesture-only' condition, pairs built a structure together in 'Minecraft', and could only communicate using a small range of gestures. In the 'gesture-plus' condition, pairs could also use sound to develop a symbol system if they wished. All pairs were taught a pointing convention. None of the pairs we tested developed a symbol system, and performance was no different across the two conditions. We therefore suggest that deictic gestures, and non-referential means of organising activity sequences, are often sufficient for communication. This suggests that the emergence of linguistic symbols in early hominids may have been late and patchy, with symbols only emerging in contexts where they could significantly improve task success or efficiency. Given the communicative power of pointing, however, these contexts may be fewer than usually supposed. An approach for identifying these situations is outlined.
  • Irvine, L., Roberts, S. G., & Kirby, S. (2013). A robustness approach to theory building: A case study of language evolution. In M. Knauff, M. Pauen, N. Sebanz, & I. Wachsmuth (Eds.), Proceedings of the 35th Annual Meeting of the Cognitive Science Society (CogSci 2013) (pp. 2614-2619). Retrieved from http://mindmodeling.org/cogsci2013/papers/0472/index.html.

    Abstract

    Models of cognitive processes often include simplifications, idealisations, and fictionalisations, so how should we learn about cognitive processes from such models? Particularly in cognitive science, when many features of the target system are unknown, it is not always clear which simplifications, idealisations, and so on, are appropriate for a research question, and which are highly misleading. Here we use a case-study from studies of language evolution, and ideas from philosophy of science, to illustrate a robustness approach to learning from models. Robust properties are those that arise across a range of models, simulations and experiments, and can be used to identify key causal structures in the models, and the phenomenon, under investigation. For example, in studies of language evolution, the emergence of compositional structure is a robust property across models, simulations and experiments of cultural transmission, but only under pressures for learnability and expressivity. This arguably illustrates the principles underlying real cases of language evolution. We provide an outline of the robustness approach, including its limitations, and suggest that this methodology can be productively used throughout cognitive science. Perhaps of most importance, it suggests that different modelling frameworks should be used as tools to identify the abstract properties of a system, rather than being definitive expressions of theories.
  • Janssen, R., Winter, B., Dediu, D., Moisik, S. R., & Roberts, S. G. (2016). Nonlinear biases in articulation constrain the design space of language. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/86.html.

    Abstract

    In Iterated Learning (IL) experiments, a participant’s learned output serves as the next participant’s learning input (Kirby et al., 2014). IL can be used to model cultural transmission and has indicated that weak biases can be amplified through repeated cultural transmission (Kirby et al., 2007). So, for example, structural language properties can emerge over time because languages come to reflect the cognitive constraints in the individuals that learn and produce the language. Similarly, we propose that languages may also reflect certain anatomical biases. Do sound systems adapt to the affordances of the articulation space induced by the vocal tract? The human vocal tract has inherent nonlinearities which might derive from acoustics and aerodynamics (cf. quantal theory, see Stevens, 1989) or biomechanics (cf. Gick & Moisik, 2015). For instance, moving the tongue anteriorly along the hard palate to produce a fricative does not result in large changes in acoustics in most cases, but for a small range there is an abrupt change from a perceived palato-alveolar [ʃ] to alveolar [s] sound (Perkell, 2012). Nonlinearities such as these might bias all human speakers to converge on a very limited set of phonetic categories, and might even be a basis for combinatoriality or phonemic ‘universals’. While IL typically uses discrete symbols, Verhoef et al. (2014) have used slide whistles to produce a continuous signal. We conducted an IL experiment with human subjects who communicated using a digital slide whistle for which the degree of nonlinearity is controlled. A single parameter (α) changes the mapping from slide whistle position (the ‘articulator’) to the acoustics. With α=0, the position of the slide whistle maps Bark-linearly to the acoustics. As α approaches 1, the mapping gets more double-sigmoidal, creating three plateaus where large ranges of positions map to similar frequencies. 
In more abstract terms, α represents the strength of a nonlinear (anatomical) bias in the vocal tract. Six chains (138 participants) of dyads were tested, each chain with a different, fixed α. Participants had to communicate four meanings by producing a continuous signal using the slide-whistle in a ‘director-matcher’ game, alternating roles (cf. Garrod et al., 2007). Results show that for high αs, subjects quickly converged on the plateaus. This quick convergence is indicative of a strong bias, repelling subjects away from unstable regions already within-subject. Furthermore, high αs lead to the emergence of signals that oscillate between two (out of three) plateaus. Because the sigmoidal spaces are spatially constrained, participants increasingly used the sequential/temporal dimension. As a result of this, the average duration of signals with high α was ~100ms longer than with low α. These oscillations could be an expression of a basis for phonemic combinatoriality. We have shown that it is possible to manipulate the magnitude of an articulator-induced non-linear bias in a slide whistle IL framework. The results suggest that anatomical biases might indeed constrain the design space of language. In particular, the signaling systems in our study quickly converged (within-subject) on the use of stable regions. While these conclusions were drawn from experiments using slide whistles with a relatively strong bias, weaker biases could possibly be amplified over time by repeated cultural transmission, and likely lead to similar outcomes.
  • Janssen, R., Dediu, D., & Moisik, S. R. (2016). Simple agents are able to replicate speech sounds using 3d vocal tract model. In S. G. Roberts, C. Cuskley, L. McCrohon, L. Barceló-Coblijn, O. Feher, & T. Verhoef (Eds.), The Evolution of Language: Proceedings of the 11th International Conference (EVOLANG11). Retrieved from http://evolang.org/neworleans/papers/97.html.

    Abstract

    Many factors have been proposed to explain why groups of people use different speech sounds in their language. These range from cultural, cognitive, environmental (e.g., Everett et al., 2015) to anatomical (e.g., vocal tract (VT) morphology). How could such anatomical properties have led to the similarities and differences in speech sound distributions between human languages? It is known that hard palate profile variation can induce different articulatory strategies in speakers (e.g., Brunner et al., 2009). That is, different hard palate profiles might induce a kind of bias on speech sound production, easing some types of sounds while impeding others. With a population of speakers (with a proportion of individuals) that share certain anatomical properties, even subtle VT biases might become expressed at a population-level (through e.g., bias amplification, Kirby et al., 2007). However, before we look into population-level effects, we should first look at within-individual anatomical factors. For that, we have developed a computer-simulated analogue for a human speaker: an agent. Our agent is designed to replicate speech sounds using a production and cognition module in a computationally tractable manner. Previous agent models have often used more abstract (e.g., symbolic) signals (e.g., Kirby et al., 2007). We have equipped our agent with a three-dimensional model of the VT (the production module, based on Birkholz, 2005) to which we made numerous adjustments. Specifically, we used a 4th-order Bezier curve that is able to capture hard palate variation on the mid-sagittal plane (XXX, 2015). Using an evolutionary algorithm, we were able to fit the model to human hard palate MRI tracings, yielding high accuracy fits and using as few as two parameters. Finally, we show that the samples map well-dispersed to the parameter-space, demonstrating that the model cannot generate unrealistic profiles.
We can thus use this procedure to import palate measurements into our agent’s production module to investigate the effects on acoustics. We can also exaggerate/introduce novel biases. Our agent is able to control the VT model using the cognition module. Previous research has focused on detailed neurocomputation (e.g., Kröger et al., 2014) that highlights e.g., neurobiological principles or speech recognition performance. However, the brain is not the focus of our current study. Furthermore, present-day computing throughput likely does not allow for large-scale deployment of these architectures, as required by the population model we are developing. Thus, the question whether a very simple cognition module is able to replicate sounds in a computationally tractable manner, and even generalize over novel stimuli, is one worthy of attention in its own right. Our agent’s cognition module is based on running an evolutionary algorithm on a large population of feed-forward neural networks (NNs). As such, (anatomical) bias strength can be thought of as an attractor basin area within the parameter-space the agent has to explore. The NN we used consists of a triple-layered (fully-connected), directed graph. The input layer (three neurons) receives the formant frequencies of a target-sound. The output layer (12 neurons) projects to the articulators in the production module. A hidden layer (seven neurons) enables the network to deal with nonlinear dependencies. The Euclidean distance (first three formants) between target and replication is used as fitness measure. Results show that sound replication is indeed possible, with Euclidean distance quickly approaching a close-to-zero asymptote. Statistical analysis should reveal if the agent can also: a) Generalize: Can it replicate sounds it was not exposed to during learning? b) Replicate consistently: Do different, isolated agents always converge on the same sounds? c) Deal with consolidation: Can it still learn new sounds after an extended learning phase (‘infancy’) has been terminated? Finally, a comparison with more complex models will be used to demonstrate robustness.
  • Jeske, J., Kember, H., & Cutler, A. (2016). Native and non-native English speakers' use of prosody to predict sentence endings. In Proceedings of the 16th Australasian International Conference on Speech Science and Technology (SST2016).
  • De Jong, N. H., & Bosker, H. R. (2013). Choosing a threshold for silent pauses to measure second language fluency. In R. Eklund (Ed.), Proceedings of the 6th Workshop on Disfluency in Spontaneous Speech (DiSS) (pp. 17-20).

    Abstract

    Second language (L2) research often involves analyses of acoustic measures of fluency. The studies investigating fluency, however, have been difficult to compare because the measures of fluency that were used differed widely. One of the differences between studies concerns the lower cut-off point for silent pauses, which has been set anywhere between 100 ms and 1000 ms. The goal of this paper is to find an optimal cut-off point. We calculate acoustic measures of fluency using different pause thresholds and then relate these measures to a measure of L2 proficiency and to ratings on fluency.
  • Jordan, F. (2013). Comparative phylogenetic methods and the study of pattern and process in kinship. In P. McConvell, I. Keen, & R. Hendery (Eds.), Kinship systems: Change and reconstruction (pp. 43-58). Salt Lake City, UT: University of Utah Press.

    Abstract

    Anthropology began by comparing aspects of kinship across cultures, while linguists interested in semantic domains such as kinship necessarily compare across languages. In this chapter I show how phylogenetic comparative methods from evolutionary biology can be used to study evolutionary processes relating to kinship and kinship terminologies across language and culture.
  • Jordan, F. M., van Schaik, C. P., Francois, P., Gintis, H., Haun, D. B. M., Hruschka, D. H., Janssen, M. A., Kitts, J. A., Lehmann, L., Mathew, S., Richerson, P. J., Turchin, P., & Wiessner, P. (2013). Cultural evolution of the structure of human groups. In P. J. Richerson, & M. H. Christiansen (Eds.), Cultural Evolution: Society, technology, language, and religion (pp. 87-116). Cambridge, MA: MIT Press.
  • Jordens, P. (1998). Defaultformen des Präteritums. Zum Erwerb der Vergangenheitsmorphologie im Niederländischen. In H. Wegener (Ed.), Eine zweite Sprache lernen (pp. 61-88). Tübingen, Germany: Verlag Gunter Narr.
  • Jordens, P. (2013). Dummies and auxiliaries in the acquisition of L1 and L2 Dutch. In E. Blom, I. Van de Craats, & J. Verhagen (Eds.), Dummy Auxiliaries in First and Second Language Acquisition (pp. 341-368). Berlin: Mouton de Gruyter.
  • Kallmeyer, L., Osswald, R., & Van Valin Jr., R. D. (2013). Tree wrapping for Role and Reference Grammar. In G. Morrill, & M.-J. Nederhof (Eds.), Formal grammar: 17th and 18th International Conferences, FG 2012/2013, Opole, Poland, August 2012: revised Selected Papers, Düsseldorf, Germany, August 2013: proceedings (pp. 175-190). Heidelberg: Springer.
  • Kember, H., Choi, J., & Cutler, A. (2016). Processing advantages for focused words in Korean. In J. Barnes, A. Brugos, S. Shattuck-Hufnagel, & N. Veilleux (Eds.), Proceedings of Speech Prosody 2016 (pp. 702-705).

    Abstract

    In Korean, focus is expressed in accentual phrasing. To ascertain whether words focused in this manner enjoy a processing advantage analogous to that conferred by focus as expressed in, e.g., English and Dutch, we devised sentences with target words in one of four conditions: prosodic focus, syntactic focus, prosodic + syntactic focus, and no focus as a control. 32 native speakers of Korean listened to blocks of 10 sentences, then were presented visually with words and asked whether or not they had heard them. Overall, words with focus were recognised significantly faster and more accurately than unfocused words. In addition, words with syntactic focus or syntactic + prosodic focus were recognised faster than words with prosodic focus alone. As for other languages, Korean focus confers processing advantage on the words carrying it. While prosodic focus does provide an advantage, however, syntactic focus appears to provide the greater beneficial effect for recognition memory.
  • Kempen, G., & Harbusch, K. (1998). A 'tree adjoining' grammar without adjoining: The case of scrambling in German. In Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4).
