Publications

Displaying 101 - 200 of 251
  • Hanulikova, A., & Davidson, D. (2009). Inflectional entropy in Slovak. In J. Levicka, & R. Garabik (Eds.), Slovko 2009, NLP, Corpus Linguistics, Corpus Based Grammar Research (pp. 145-151). Bratislava, Slovakia: Slovak Academy of Sciences.
  • Hanulikova, A., & Weber, A. (2009). Experience with foreign accent influences non-native (L2) word recognition: The case of th-substitutions [Abstract]. Journal of the Acoustical Society of America, 125(4), 2762-2762.
  • Harbusch, K., & Kempen, G. (2009). Clausal coordinate ellipsis and its varieties in spoken German: A study with the TüBa-D/S Treebank of the VERBMOBIL corpus. In M. Passarotti, A. Przepiórkowski, S. Raynaud, & F. Van Eynde (Eds.), Proceedings of the The Eighth International Workshop on Treebanks and Linguistic Theories (pp. 83-94). Milano: EDUCatt.
  • Harbusch, K., Kempen, G., & Vosse, T. (2008). A natural-language paraphrase generator for on-line monitoring and commenting incremental sentence construction by L2 learners of German. In Proceedings of WorldCALL 2008.

    Abstract

    Certain categories of language learners need feedback on the grammatical structure of sentences they wish to produce. In contrast with the usual NLP approach to this problem—parsing student-generated texts—we propose a generation-based approach aiming at preventing errors (“scaffolding”). In our ICALL system, students construct sentences by composing syntactic trees out of lexically anchored “treelets” via a graphical drag&drop user interface. A natural-language generator computes all possible grammatically well-formed sentences entailed by the student-composed tree, and intervenes immediately when the latter tree does not belong to the set of well-formed alternatives. Feedback is based on comparisons between the student-composed tree and the well-formed set. Frequently occurring errors are handled in terms of “malrules.” The system (implemented in JAVA and C++) currently focuses constituent order in German as L2.
  • Harbusch, K., & Kempen, G. (2009). Generating clausal coordinate ellipsis multilingually: A uniform approach based on postediting. In 12th European Workshop on Natural Language Generation: Proceedings of the Workshop (pp. 138-145). The Association for Computational Linguistics.

    Abstract

    Present-day sentence generators are often in-capable of producing a wide variety of well-formed elliptical versions of coordinated clauses, in particular, of combined elliptical phenomena (Gapping, Forward and Back-ward Conjunction Reduction, etc.). The ap-plicability of the various types of clausal co-ordinate ellipsis (CCE) presupposes detailed comparisons of the syntactic properties of the coordinated clauses. These nonlocal comparisons argue against approaches based on local rules that treat CCE structures as special cases of clausal coordination. We advocate an alternative approach where CCE rules take the form of postediting rules ap-plicable to nonelliptical structures. The ad-vantage is not only a higher level of modu-larity but also applicability to languages be-longing to different language families. We describe a language-neutral module (called Elleipo; implemented in JAVA) that gener-ates as output all major CCE versions of co-ordinated clauses. Elleipo takes as input linearly ordered nonelliptical coordinated clauses annotated with lexical identity and coreferentiality relationships between words and word groups in the conjuncts. We dem-onstrate the feasibility of a single set of postediting rules that attains multilingual coverage.
  • Harmon, Z., & Kapatsinski, V. (2015). Studying the dynamics of lexical access using disfluencies. In R. Lickley, & R. Eklund (Eds.), Proceedings of the 7th International Workshop on Disfluency in Spontaneous Speech (DiSS 2015) (pp. 41-44).

    Abstract

    Faced with planning problems related to lexical access, speakers take advantage of a major function of disfluencies: buying time. It is reasonable, then, to expect that the structure of disfluencies sheds light on the mechanisms underlying lexical access. Using data from the Switchboard Corpus, we investigated the effect of semantic competition during lexical access on repetition disfluencies. We hypothesized that the more time the speaker needs to access the following unit, the longer the repetition. We examined the repetitions preceding verbs and nouns and tested predictors influencing the accessibility of these items. Results suggest that speed of lexical access negatively correlates with the length of repetition and that the main determinants of lexical access speed differ for verbs and nouns. Longer disfluencies before verbs appear to be due to significant paradigmatic competition from semantically similar verbs. For nouns, they occur when the noun is relatively unpredictable given the preceding context.
  • Indefrey, P., & Gullberg, M. (Eds.). (2008). Time to speak: Cognitive and neural prerequisites for time in language [Special Issue]. Language Learning, 58(suppl. 1).

    Abstract

    Time is a fundamental aspect of human cognition and action. All languages have developed rich means to express various facets of time, such as bare time spans, their position on the time line, or their duration. The articles in this volume give an overview of what we know about the neural and cognitive representations of time that speakers can draw on in language. Starting with an overview of the main devices used to encode time in natural language, such as lexical elements, tense and aspect, the research presented in this volume addresses the relationship between temporal language, culture, and thought, the relationship between verb aspect and mental simulations of events, the development of temporal concepts, time perception, the storage and retrieval of temporal information in autobiographical memory, and neural correlates of tense processing and sequence planning. The psychological and neurobiological findings presented here will provide important insights to inform and extend current studies of time in language and in language acquisition.
  • Isaac, A., Matthezing, H., Van der Meij, L., Schlobach, S., Wang, S., & Zinn, C. (2008). Putting ontology alignment in context: Usage, scenarios, deployment and evaluation in a library case. In S. Bechhofer, M. Hauswirth, J. Hoffmann, & M. Koubarakis (Eds.), The semantic web: Research and applications (pp. 402-417). Berlin: Springer.

    Abstract

    Thesaurus alignment plays an important role in realising efficient access to heterogeneous Cultural Heritage data. Current ontology alignment techniques, however, provide only limited value for such access as they consider little if any requirements from realistic use cases or application scenarios. In this paper, we focus on two real-world scenarios in a library context: thesaurus merging and book re-indexing. We identify their particular requirements and describe our approach of deploying and evaluating thesaurus alignment techniques in this context. We have applied our approach for the Ontology Alignment Evaluation Initiative, and report on the performance evaluation of participants’ tools wrt. the application scenario at hand. It shows that evaluations of tools requires significant effort, but when done carefully, brings many benefits.
  • Janse, E. (2009). Hearing and cognitive measures predict elderly listeners' difficulty ignoring competing speech. In M. Boone (Ed.), Proceedings of the International Conference on Acoustics (pp. 1532-1535).
  • Janssen, R., Moisik, S. R., & Dediu, D. (2015). Bézier modelling and high accuracy curve fitting to capture hard palate variation. In Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). Glasgow, UK: University of Glasgow.

    Abstract

    The human hard palate shows between-subject variation
    that is known to influence articulatory strategies.
    In order to link such variation to human speech, we
    are conducting a cross-sectional MRI study on multiple
    populations. A model based on Bezier curves
    using only three parameters was fitted to hard palate
    MRI tracings using evolutionary computation. The
    fits produced consistently yield high accuracies. For
    future research, this new method may be used to classify
    our MRI data on ethnic origins using e.g., cluster
    analyses. Furthermore, we may integrate our model
    into three-dimensional representations of the vocal
    tract in order to investigate its effect on acoustics and
    cultural transmission.
  • Janzen, G., & Weststeijn, C. (2004). Neural representation of object location and route direction: An fMRI study. NeuroImage, 22(Supplement 1), e634-e635.
  • Janzen, G., & Van Turennout, M. (2004). Neuronale Markierung navigationsrelevanter Objekte im räumlichen Gedächtnis: Ein fMRT Experiment. In D. Kerzel (Ed.), Beiträge zur 46. Tagung experimentell arbeitender Psychologen (pp. 125-125). Lengerich: Pabst Science Publishers.
  • Jesse, A., & Johnson, E. K. (2008). Audiovisual alignment in child-directed speech facilitates word learning. In Proceedings of the International Conference on Auditory-Visual Speech Processing (pp. 101-106). Adelaide, Aust: Causal Productions.

    Abstract

    Adult-to-child interactions are often characterized by prosodically-exaggerated speech accompanied by visually captivating co-speech gestures. In a series of adult studies, we have shown that these gestures are linked in a sophisticated manner to the prosodic structure of adults' utterances. In the current study, we use the Preferential Looking Paradigm to demonstrate that two-year-olds can use the alignment of these gestures to speech to deduce the meaning of words.
  • Jesse, A., & Janse, E. (2009). Visual speech information aids elderly adults in stream segregation. In B.-J. Theobald, & R. Harvey (Eds.), Proceedings of the International Conference on Auditory-Visual Speech Processing 2009 (pp. 22-27). Norwich, UK: School of Computing Sciences, University of East Anglia.

    Abstract

    Listening to a speaker while hearing another speaker talks is a challenging task for elderly listeners. We show that elderly listeners over the age of 65 with various degrees of age-related hearing loss benefit in this situation from also seeing the speaker they intend to listen to. In a phoneme monitoring task, listeners monitored the speech of a target speaker for either the phoneme /p/ or /k/ while simultaneously hearing a competing speaker. Critically, on some trials, the target speaker was also visible. Elderly listeners benefited in their response times and accuracy levels from seeing the target speaker when monitoring for the less visible /k/, but more so when monitoring for the highly visible /p/. Visual speech therefore aids elderly listeners not only by providing segmental information about the target phoneme, but also by providing more global information that allows for better performance in this adverse listening situation.
  • Johns, T. G., Perera, R. M., Vitali, A. A., Vernes, S. C., & Scott, A. (2004). Phosphorylation of a glioma-specific mutation of the EGFR [Abstract]. Neuro-Oncology, 6, 317.

    Abstract

    Mutations of the epidermal growth factor receptor (EGFR) gene are found at a relatively high frequency in glioma, with the most common being the de2-7 EGFR (or EGFRvIII). This mutation arises from an in-frame deletion of exons 2-7, which removes 267 amino acids from the extracellular domain of the receptor. Despite being unable to bind ligand, the de2-7 EGFR is constitutively active at a low level. Transfection of human glioma cells with the de2-7 EGFR has little effect in vitro, but when grown as tumor xenografts this mutated receptor imparts a dramatic growth advantage. We mapped the phosphorylation pattern of de2-7 EGFR, both in vivo and in vitro, using a panel of antibodies specific for different phosphorylated tyrosine residues. Phosphorylation of de2-7 EGFR was detected constitutively at all tyrosine sites surveyed in vitro and in vivo, including tyrosine 845, a known target in the wild-type EGFR for src kinase. There was a substantial upregulation of phosphorylation at every yrosine residue of the de2-7 EGFR when cells were grown in vivo compared to the receptor isolated from cells cultured in vitro. Upregulation of phosphorylation at tyrosine 845 could be stimulated in vitro by the addition of specific components of the ECM via an integrindependent mechanism. These observations may partially explain why the growth enhancement mediated by de2-7 EGFR is largely restricted to the in vivo environment
  • Kempen, G., & Harbusch, K. (1998). A 'tree adjoining' grammar without adjoining: The case of scrambling in German. In Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4).
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses? A corpus study revealing unexpected rigidity. In S. Kepser, & M. Reis (Eds.), Pre-Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: Niemeyer.
  • Kempen, G. (2004). Interactive visualization of syntactic structure assembly for grammar-intensive first- and second-language instruction. In R. Delmonte, P. Delcloque, & S. Tonelli (Eds.), Proceedings of InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced language learning systems (pp. 183-186). Venice: University of Venice.
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses?: A corpus study revealing unexpected rigidity. In Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: University of Tübingen.
  • Kempen, G. (2004). Human grammatical coding: Shared structure formation resources for grammatical encoding and decoding. In Cuny 2004 - The 17th Annual CUNY Conference on Human Sentence Processing. March 25-27, 2004. University of Maryland (pp. 66).
  • Kemps-Snijders, M., Klassmann, A., Zinn, C., Berck, P., Russel, A., & Wittenburg, P. (2008). Exploring and enriching a language resource archive via the web. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    The ”download first, then process paradigm” is still the predominant working method amongst the research community. The web-based paradigm, however, offers many advantages from a tool development and data management perspective as they allow a quick adaptation to changing research environments. Moreover, new ways of combining tools and data are increasingly becoming available and will eventually enable a true web-based workflow approach, thus challenging the ”download first, then process” paradigm. The necessary infrastructure for managing, exploring and enriching language resources via the Web will need to be delivered by projects like CLARIN and DARIAH
  • Kemps-Snijders, M., Zinn, C., Ringersma, J., & Windhouwer, M. (2008). Ensuring semantic interoperability on lexical resources. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    In this paper, we describe a unifying approach to tackle data heterogeneity issues for lexica and related resources. We present LEXUS, our software that implements the Lexical Markup Framework (LMF) to uniformly describe and manage lexica of different structures. LEXUS also makes use of a central Data Category Registry (DCR) to address terminological issues with regard to linguistic concepts as well as the handling of working and object languages. Finally, we report on ViCoS, a LEXUS extension, providing support for the definition of arbitrary semantic relations between lexical entries or parts thereof.
  • Kemps-Snijders, M., Windhouwer, M., Wittenburg, P., & Wright, S. E. (2008). ISOcat: Corralling data categories in the wild. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    To achieve true interoperability for valuable linguistic resources different levels of variation need to be addressed. ISO Technical Committee 37, Terminology and other language and content resources, is developing a Data Category Registry. This registry will provide a reusable set of data categories. A new implementation, dubbed ISOcat, of the registry is currently under construction. This paper shortly describes the new data model for data categories that will be introduced in this implementation. It goes on with a sketch of the standardization process. Completed data categories can be reused by the community. This is done by either making a selection of data categories using the ISOcat web interface, or by other tools which interact with the ISOcat system using one of its various Application Programming Interfaces. Linguistic resources that use data categories from the registry should include persistent references, e.g. in the metadata or schemata of the resource, which point back to their origin. These data category references can then be used to determine if two or more resources share common semantics, thus providing a level of interoperability close to the source data and a promising layer for semantic alignment on higher levels
  • Khetarpal, N., Majid, A., & Regier, T. (2009). Spatial terms reflect near-optimal spatial categories. In N. Taatgen, & H. Van Rijn (Eds.), Proceedings of the Thirty-First Annual Conference of the Cognitive Science Society (pp. 2396-2401). Austin, TX: Cognitive Science Society.

    Abstract

    Spatial terms in the world’s languages appear to reflect both universal conceptual tendencies and linguistic convention. A similarly mixed picture in the case of color naming has been accounted for in terms of near-optimal partitions of color space. Here, we demonstrate that this account generalizes to spatial terms. We show that the spatial terms of 9 diverse languages near-optimally partition a similarity space of spatial meanings, just as color terms near-optimally partition color space. This account accommodates both universal tendencies and cross-language differences in spatial category extension, and identifies general structuring principles that appear to operate across different semantic domains.
  • Kita, S., van Gijn, I., & van der Hulst, H. (1998). Movement phases in signs and co-speech gestures, and their transcription by human coders. In Gesture and Sign-Language in Human-Computer Interaction (Lecture Notes in Artificial Intelligence - LNCS Subseries, Vol. 1371) (pp. 23-35). Berlin, Germany: Springer-Verlag.

    Abstract

    The previous literature has suggested that the hand movement in co-speech gestures and signs consists of a series of phases with qualitatively different dynamic characteristics. In this paper, we propose a syntagmatic rule system for movement phases that applies to both co-speech gestures and signs. Descriptive criteria for the rule system were developed for the analysis video-recorded continuous production of signs and gesture. It involves segmenting a stream of body movement into phases and identifying different phase types. Two human coders used the criteria to analyze signs and cospeech gestures that are produced in natural discourse. It was found that the criteria yielded good inter-coder reliability. These criteria can be used for the technology of automatic recognition of signs and co-speech gestures in order to segment continuous production and identify the potentially meaningbearing phase.
  • Klein, W. (Ed.). (2004). Philologie auf neuen Wegen [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 136.
  • Klein, W. (Ed.). (2004). Universitas [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik (LiLi), 134.
  • Klein, W., & Schnell, R. (Eds.). (2008). Literaturwissenschaft und Linguistik [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (150).
  • Klein, W. (Ed.). (1983). Intonation [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (49).
  • Klein, W. (Ed.). (2008). Ist Schönheit messbar? [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 152.
  • Klein, W. (Ed.). (1998). Kaleidoskop [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (112).
  • Klein, W. (Ed.). (1984). Textverständlichkeit - Textverstehen [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (55).
  • Klein, W. (Ed.). (1985). Schriftlichkeit [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (59).
  • Klein, W., & Dimroth, C. (Eds.). (2009). Worauf kann sich der Sprachunterricht stützen? [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, 153.
  • Koch, X., & Janse, E. (2015). Effects of age and hearing loss on articulatory precision for sibilants. In M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). London: International Phonetic Association.

    Abstract

    This study investigates the effects of adult age and speaker abilities on articulatory precision for sibilant productions. Normal-hearing young adults with
    better sibilant discrimination have been shown to produce greater spectral sibilant contrasts. As reduced auditory feedback may gradually impact on feedforward
    commands, we investigate whether articulatory precision as indexed by spectral mean for [s] and [S] decreases with age, and more particularly with agerelated
    hearing loss. Younger, middle-aged and older adults read aloud words starting with the sibilants [s] or [S]. Possible effects of cognitive, perceptual, linguistic and sociolinguistic background variables
    on the sibilants’ acoustics were also investigated. Sibilant contrasts were less pronounced for male than female speakers. Most importantly, for the fricative
    [s], the spectral mean was modulated by individual high-frequency hearing loss, but not age. These results underscore that even mild hearing loss already affects articulatory precision.
  • Koenig, A., Ringersma, J., & Trilsbeek, P. (2009). The Language Archiving Technology domain. In Z. Vetulani (Ed.), Human Language Technologies as a Challenge for Computer Science and Linguistics (pp. 295-299).

    Abstract

    The Max Planck Institute for Psycholinguistics (MPI) manages an archive of linguistic research data with a current size of almost 20 Terabytes. Apart from in-house researchers other projects also store their data in the archive, most notably the Documentation of Endangered Languages (DoBeS) projects. The archive is available online and can be accessed by anybody with Internet access. To be able to manage this large amount of data the MPI's technical group has developed a software suite called Language Archiving Technology (LAT) that on the one hand helps researchers and archive managers to manage the data and on the other hand helps users in enriching their primary data with additional layers. All the MPI software is Java-based and developed according to open source principles (GNU, 2007). All three major operating systems (Windows, Linux, MacOS) are supported and the software works similarly on all of them. As the archive is online, many of the tools, especially the ones for accessing the data, are browser based. Some of these browser-based tools make use of Adobe Flex to create nice-looking GUIs. The LAT suite is a complete set of management and enrichment tools, and given the interaction between the tools the result is a complete LAT software domain. Over the last 10 years, this domain has proven its functionality and use, and is being deployed to servers in other institutions. This deployment is an important step in getting the archived resources back to the members of the speech communities whose languages are documented. In the paper we give an overview of the tools of the LAT suite and we describe their functionality and role in the integrated process of archiving, management and enrichment of linguistic data.
  • Kreuzer, H. (Ed.). (1971). Methodische Perspektiven [Special Issue]. Zeitschrift für Literaturwissenschaft und Linguistik, (1/2).
  • Lausberg, H., & Sloetjes, H. (2009). NGCS/ELAN - Coding movement behaviour in psychotherapy [Meeting abstract]. PPmP - Psychotherapie · Psychosomatik · Medizinische Psychologie, 59: A113, 103.

    Abstract

    Individual and interactive movement behaviour (non-verbal behaviour / communication) specifically reflects implicit processes in psychotherapy [1,4,11]. However, thus far, the registration of movement behaviour has been a methodological challenge. We will present a coding system combined with an annotation tool for the analysis of movement behaviour during psychotherapy interviews [9]. The NGCS coding system enables to classify body movements based on their kinetic features alone [5,7]. The theoretical assumption behind the NGCS is that its main kinetic and functional movement categories are differentially associated with specific psychological functions and thus, have different neurobiological correlates [5-8]. ELAN is a multimodal annotation tool for digital video media [2,3,12]. The NGCS / ELAN template enables to link any movie to the same coding system and to have different raters independently work on the same file. The potential of movement behaviour analysis as an objective tool for psychotherapy research and for supervision in the psychosomatic practice is discussed by giving examples of the NGCS/ELAN analyses of psychotherapy sessions. While the quality of kinetic turn-taking and the therapistrsquor;s (implicit) adoption of the patientrsquor;s movements may predict therapy outcome, changes in the patientrsquor;s movement behaviour pattern may indicate changes in cognitive concepts and emotional states and thus, may help to identify therapeutically relevant processes [10].
  • Lenkiewicz, P., Pereira, M., Freire, M. M., & Fernandes, J. (2009). A new 3D image segmentation method for parallel architectures. In Proceedings of the 2009 IEEE International Conference on Multimedia and Expo [ICME 2009] June 28 – July 3, 2009, New York (pp. 1813-1816).

    Abstract

    This paper presents a novel model for 3D image segmentation and reconstruction. It has been designed with the aim to be implemented over a computer cluster or a multi-core platform. The required features include a nearly absolute independence between the processes participating in the segmentation task and providing amount of work as equal as possible for all the participants. As a result, it is avoid many drawbacks often encountered when performing a parallelization of an algorithm that was constructed to operate in a sequential manner. Furthermore, the proposed algorithm based on the new segmentation model is efficient and shows a very good, nearly linear performance growth along with the growing number of processing units.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2008). Accelerating 3D medical image segmentation with high performance computing. In Proceedings of the IEEE International Workshops on Image Processing Theory, Tools and Applications - IPT (pp. 1-8).

    Abstract

    Digital processing of medical images has helped physicians and patients during past years by allowing examination and diagnosis on a very precise level. Nowadays possibly the biggest deal of support it can offer for modern healthcare is the use of high performance computing architectures to treat the huge amounts of data that can be collected by modern acquisition devices. This paper presents a parallel processing implementation of an image segmentation algorithm that operates on a computer cluster equipped with 10 processing units. Thanks to well-organized distribution of the workload we manage to significantly shorten the execution time of the developed algorithm and reach a performance gain very close to linear.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2009). The dynamic topology changes model for unsupervised image segmentation. In Proceedings of the 11th IEEE International Workshop on Multimedia Signal Processing (MMSP'09) (pp. 1-5).

    Abstract

    Deformable models are a popular family of image segmentation techniques, which has been gaining significant focus in the last two decades, serving both for real-world applications as well as the base for research work. One of the features that the deformable models offer and that is considered a much desired one, is the ability to change their topology during the segmentation process. Using this characteristic it is possible to perform segmentation of objects with discontinuities in their bodies or to detect an undefined number of objects in the scene. In this paper we present our model for handling the topology changes in image segmentation methods based on the Active Volumes solution. The said model is capable of performing the changes in the structure of objects while the segmentation progresses, what makes it efficient and suitable for implementations over powerful execution environment, like multi-core architectures or computer clusters.
  • Lenkiewicz, P., Pereira, M., Freire, M., & Fernandes, J. (2009). The whole mesh Deformation Model for 2D and 3D image segmentation. In Proceedings of the 2009 IEEE International Conference on Image Processing (ICIP 2009) (pp. 4045-4048).

    Abstract

    In this paper we present a novel approach for image segmentation using Active Nets and Active Volumes. Those solutions are based on the Deformable Models, with slight difference in the method for describing the shapes of interests - instead of using a contour or a surface they represented the segmented objects with a mesh structure, which allows to describe not only the surface of the objects but also to model their interiors. This is obtained by dividing the nodes of the mesh in two categories, namely internal and external ones, which will be responsible for two different tasks. In our new approach we propose to negate this separation and use only one type of nodes. Using that assumption we manage to significantly shorten the time of segmentation while maintaining its quality.
  • Levelt, W. J. M. (1991). Lexical access in speech production: Stages versus cascading. In H. Peters, W. Hulstijn, & C. Starkweather (Eds.), Speech motor control and stuttering (pp. 3-10). Amsterdam: Excerpta Medica.
  • Levelt, W. J. M. (1984). Spontaneous self-repairs in speech: Processes and representations. In M. P. R. Van den Broecke, & A. Cohen (Eds.), Proceedings of the 10th International Congress of Phonetic Sciences (pp. 105-117). Dordrecht: Foris.
  • Levelt, W. J. M. (1983). The speaker's organization of discourse. In Proceedings of the XIIIth International Congress of Linguists (pp. 278-290).
  • Little, H., Eryılmaz, K., & de Boer, B. (2015). A new artificial sign-space proxy for investigating the emergence of structure and categories in speech. In The Scottish Consortium for ICPhS 2015 (Ed.), The proceedings of the 18th International Congress of Phonetic Sciences. (ICPhS 2015).
  • Little, H., Eryılmaz, K., & de Boer, B. (2015). Linguistic modality affects the creation of structure and iconicity in signals. In D. C. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. Jennings, & P. Maglio (Eds.), The 37th annual meeting of the Cognitive Science Society (CogSci 2015) (pp. 1392-1398). Austin, TX: Cognitive Science Society.

    Abstract

    Different linguistic modalities (speech or sign) offer different levels at which signals can iconically represent the world. One hypothesis argues that this iconicity has an effect on how linguistic structure emerges. However, exactly how and why these effects might come about is in need of empirical investigation. In this contribution, we present a signal creation experiment in which both the signalling space and the meaning space are manipulated so that different levels and types of iconicity are available between the signals and meanings. Signals are produced using an infrared sensor that detects the hand position of participants to generate auditory feedback. We find evidence that iconicity may be maladaptive for the discrimination of created signals. Further, we implemented Hidden Markov Models to characterise the structure within signals, which was also used to inform a metric for iconicity.
  • Lucas, C., Griffiths, T., Xu, F., & Fawcett, C. (2008). A rational model of preference learning and choice prediction by children. In D. Koller, Y. Bengio, D. Schuurmans, L. Bottou, & A. Culotta (Eds.), Advances in Neural Information Processing Systems.

    Abstract

    Young children demonstrate the ability to make inferences about the preferences of other agents based on their choices. However, there exists no overarching account of what children are doing when they learn about preferences or how they use that knowledge. We use a rational model of preference learning, drawing on ideas from economics and computer science, to explain the behavior of children in several recent experiments. Specifically, we show how a simple econometric model can be extended to capture two- to four-year-olds’ use of statistical information in inferring preferences, and their generalization of these preferences.
  • Magyari, L., & De Ruiter, J. P. (2008). Timing in conversation: The anticipation of turn endings. In J. Ginzburg, P. Healey, & Y. Sato (Eds.), Proceedings of the 12th Workshop on the Semantics and Pragmatics Dialogue (pp. 139-146). London: King's college.

    Abstract

    We examined how communicators can switch between speaker and listener role with such accurate timing. During conversations, the majority of role transitions happens with a gap or overlap of only a few hundred milliseconds. This suggests that listeners can predict when the turn of the current speaker is going to end. Our hypothesis is that listeners know when a turn ends because they know how it ends. Anticipating the last words of a turn can help the next speaker in predicting when the turn will end, and also in anticipating the content of the turn, so that an appropriate response can be prepared in advance. We used the stimuli material of an earlier experiment (De Ruiter, Mitterer & Enfield, 2006), in which subjects were listening to turns from natural conversations and had to press a button exactly when the turn they were listening to ended. In the present experiment, we investigated if the subjects can complete those turns when only an initial fragment of the turn is presented to them. We found that the subjects made better predictions about the last words of those turns that had more accurate responses in the earlier button press experiment.
  • Majid, A., Van Staden, M., & Enfield, N. J. (2004). The human body in cognition, brain, and typology. In K. Hovie (Ed.), Forum Handbook, 4th International Forum on Language, Brain, and Cognition - Cognition, Brain, and Typology: Toward a Synthesis (pp. 31-35). Sendai: Tohoku University.

    Abstract

    The human body is unique: it is both an object of perception and the source of human experience. Its universality makes it a perfect resource for asking questions about how cognition, brain and typology relate to one another. For example, we can ask how speakers of different languages segment and categorize the human body. A dominant view is that body parts are “given” by visual perceptual discontinuities, and that words are merely labels for these visually determined parts (e.g., Andersen, 1978; Brown, 1976; Lakoff, 1987). However, there are problems with this view. First it ignores other perceptual information, such as somatosensory and motoric representations. By looking at the neural representations of sesnsory representations, we can test how much of the categorization of the human body can be done through perception alone. Second, we can look at language typology to see how much universality and variation there is in body-part categories. A comparison of a range of typologically, genetically and areally diverse languages shows that the perceptual view has only limited applicability (Majid, Enfield & van Staden, in press). For example, using a “coloring-in” task, where speakers of seven different languages were given a line drawing of a human body and asked to color in various body parts, Majid & van Staden (in prep) show that languages vary substantially in body part segmentation. For example, Jahai (Mon-Khmer) makes a lexical distinction between upper arm, lower arm, and hand, but Lavukaleve (Papuan Isolate) has just one word to refer to arm, hand, and leg. This shows that body part categorization is not a straightforward mapping of words to visually determined perceptual parts.
  • Majid, A., Van Staden, M., Boster, J. S., & Bowerman, M. (2004). Event categorization: A cross-linguistic perspective. In K. Forbus, D. Gentner, & T. Tegier (Eds.), Proceedings of the 26th Annual Meeting of the Cognitive Science Society (pp. 885-890). Mahwah, NJ: Erlbaum.

    Abstract

    Many studies in cognitive science address how people categorize objects, but there has been comparatively little research on event categorization. This study investigated the categorization of events involving material destruction, such as “cutting” and “breaking”. Speakers of 28 typologically, genetically, and areally diverse languages described events shown in a set of video-clips. There was considerable cross-linguistic agreement in the dimensions along which the events were distinguished, but there was variation in the number of categories and the placement of their boundaries.
  • Majid, A., Jordan, F., & Dunn, M. (Eds.). (2015). Semantic systems in closely related languages [Special Issue]. Language Sciences, 49.
  • Matsuo, A. (2004). Young children's understanding of ongoing vs. completion in present and perfective participles. In J. v. Kampen, & S. Baauw (Eds.), Proceedings of GALA 2003 (pp. 305-316). Utrecht: Netherlands Graduate School of Linguistics (LOT).
  • McCafferty, S. G., & Gullberg, M. (Eds.). (2008). Gesture and SLA: Toward an integrated approach [Special Issue]. Studies in Second Language Acquisition, 30(2).
  • McDonough, J., Lehnert-LeHouillier, H., & Bardhan, N. P. (2009). The perception of nasalized vowels in American English: An investigation of on-line use of vowel nasalization in lexical access. In Nasal 2009.

    Abstract

    The goal of the presented study was to investigate the use of coarticulatory vowel nasalization in lexical access by native speakers of American English. In particular, we compare the use of coart culatory place of articulation cues to that of coarticulatory vowel nasalization. Previous research on lexical access has shown that listeners use cues to the place of articulation of a postvocalic stop in the preceding vowel. However, vowel nasalization as cue to an upcoming nasal consonant has been argued to be a more complex phenomenon. In order to establish whether coarticulatory vowel nasalization aides in the process of lexical access in the same way as place of articulation cues do, we conducted two perception experiments: an off-line 2AFC discrimination task and an on-line eyetracking study using the visual world paradigm. The results of our study suggest that listeners are indeed able to use vowel nasalization in similar ways to place of articulation information, and that both types of cues aide in lexical access.
  • McQueen, J. M., & Cutler, A. (1998). Spotting (different kinds of) words in (different kinds of) context. In R. Mannell, & J. Robert-Ribes (Eds.), Proceedings of the Fifth International Conference on Spoken Language Processing: Vol. 6 (pp. 2791-2794). Sydney: ICSLP.

    Abstract

    The results of a word-spotting experiment are presented in which Dutch listeners tried to spot different types of bisyllabic Dutch words embedded in different types of nonsense contexts. Embedded verbs were not reliably harder to spot than embedded nouns; this suggests that nouns and verbs are recognised via the same basic processes. Iambic words were no harder to spot than trochaic words, suggesting that trochaic words are not in principle easier to recognise than iambic words. Words were harder to spot in consonantal contexts (i.e., contexts which themselves could not be words) than in longer contexts which contained at least one vowel (i.e., contexts which, though not words, were possible words of Dutch). A control experiment showed that this difference was not due to acoustic differences between the words in each context. The results support the claim that spoken-word recognition is sensitive to the viability of sound sequences as possible words.
  • Mitterer, H. (2008). How are words reduced in spontaneous speech? In A. Botonis (Ed.), Proceedings of ISCA Tutorial and Research Workshop On Experimental Linguistics (pp. 165-168). Athens: University of Athens.

    Abstract

    Words are reduced in spontaneous speech. If reductions are constrained by functional (i.e., perception and production) constraints, they should not be arbitrary. This hypothesis was tested by examing the pronunciations of high- to mid-frequency words in a Dutch and a German spontaneous speech corpus. In logistic-regression models the "reduction likelihood" of a phoneme was predicted by fixed-effect predictors such as position within the word, word length, word frequency, and stress, as well as random effects such as phoneme identity and word. The models for Dutch and German show many communalities. This is in line with the assumption that similar functional constraints influence reductions in both languages.
  • Moers, C., Janse, E., & Meyer, A. S. (2015). Probabilistic reduction in reading aloud: A comparison of younger and older adults. In M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). London: International Phonetics Association.

    Abstract

    Frequent and predictable words are generally pronounced with less effort and are therefore acoustically more reduced than less frequent or unpredictable words. Local predictability can be operationalised by Transitional Probability (TP), which indicates how likely a word is to occur given its immediate context. We investigated whether and how probabilistic reduction effects on word durations change with adult age when reading aloud content words embedded in sentences. The results showed equally large frequency effects on verb and noun durations for both younger (Mage = 20 years) and older (Mage = 68 years) adults. Backward TP also affected word duration for younger and older adults alike. ForwardTP, however, had no significant effect on word duration in either age group. Our results resemble earlier findings of more robust BackwardTP effects compared to ForwardTP effects. Furthermore, unlike often reported decline in predictive processing with aging, probabilistic reduction effects remain stable across adulthood.
  • Moisik, S. R., & Dediu, D. (2015). Anatomical biasing and clicks: Preliminary biomechanical modelling. In H. Little (Ed.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015) Satellite Event: The Evolution of Phonetic Capabilities: Causes constraints, consequences (pp. 8-13). Glasgow: ICPhS.

    Abstract

    It has been observed by several researchers that the Khoisan palate tends to lack a prominent alveolar ridge. A preliminary biomechanical model of click production was created to examine if these sounds might be subject to an anatomical bias associated with alveolar ridge size. Results suggest the bias is plausible, taking the form of decreased articulatory effort and improved volume change characteristics, however, further modelling and experimental research is required to solidify the claim.
  • Morano, L., Ernestus, M., & Ten Bosch, L. (2015). Schwa reduction in low-proficiency L2 speakers: Learning and generalization. In Scottish consortium for ICPhS, M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). Glasgow: University of Glasgow.

    Abstract

    This paper investigated the learnability and generalizability of French schwa alternation by Dutch low-proficiency second language learners. We trained 40 participants on 24 new schwa words by exposing them equally often to the reduced and full forms of these words. We then assessed participants' accuracy and reaction times to these newly learnt words as well as 24 previously encountered schwa words with an auditory lexical decision task. Our results show learning of the new words in both forms. This suggests that lack of exposure is probably the main cause of learners' difficulties with reduced forms. Nevertheless, the full forms were slightly better recognized than the reduced ones, possibly due to phonetic and phonological properties of the reduced forms. We also observed no generalization to previously encountered words, suggesting that our participants stored both of the learnt word forms and did not create a rule that applies to all schwa words.
  • Mulder, K., Brekelmans, G., & Ernestus, M. (2015). The processing of schwa reduced cognates and noncognates in non-native listeners of English. In Scottish consortium for ICPhS, M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). Glasgow: University of Glasgow.

    Abstract

    In speech, words are often reduced rather than fully pronounced (e.g., (/ˈsʌmri/ for /ˈsʌməri/, summary). Non-native listeners may have problems in processing these reduced forms, because they have encountered them less often. This paper addresses the question whether this also holds for highly proficient non-natives and for words with similar forms and meanings in the non-natives' mother tongue (i.e., cognates). In an English auditory lexical decision task, natives and highly proficient Dutch non-natives of English listened to cognates and non-cognates that were presented in full or without their post-stress schwa. The data show that highly proficient learners are affected by reduction as much as native speakers. Nevertheless, the two listener groups appear to process reduced forms differently, because non-natives produce more errors on reduced cognates than on non-cognates. While listening to reduced forms, non-natives appear to be hindered by the co-activated lexical representations of cognate forms in their native language.
  • Musgrave, S., & Cutfield, S. (2009). Language documentation and an Australian National Corpus. In M. Haugh, K. Burridge, J. Mulder, & P. Peters (Eds.), Selected proceedings of the 2008 HCSNet Workshop on Designing the Australian National Corpus: Mustering Languages (pp. 10-18). Somerville: Cascadilla Proceedings Project.

    Abstract

    Corpus linguistics and language documentation are usually considered separate subdisciplines within linguistics, having developed from different traditions and often operating on different scales, but the authors will suggest that there are commonalities to the two: both aim to represent language use in a community, and both are concerned with managing digital data. The authors propose that the development of the Australian National Corpus (AusNC) be guided by the experience of language documentation in the management of multimodal digital data and its annotation, and in ethical issues pertaining to making the data accessible. This would allow an AusNC that is distributed, multimodal, and multilingual, with holdings of text, audio, and video data distributed across multiple institutions; and including Indigenous, sign, and migrant community languages. An audit of language material held by Australian institutions and individuals is necessary to gauge the diversity and volume of possible content, and to inform common technical standards.
  • Neger, T. M., Rietveld, T., & Janse, E. (2015). Adult age effects in auditory statistical learning. In M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). London: International Phonetic Association.

    Abstract

    Statistical learning plays a key role in language processing, e.g., for speech segmentation. Older adults have been reported to show less statistical learning on the basis of visual input than younger adults. Given age-related changes in perception and cognition, we investigated whether statistical learning is also impaired in the auditory modality in older compared to younger adults and whether individual learning ability is associated with measures of perceptual (i.e., hearing sensitivity) and cognitive functioning in both age groups. Thirty younger and thirty older adults performed an auditory artificial-grammar-learning task to assess their statistical learning ability. In younger adults, perceptual effort came at the cost of processing resources required for learning. Inhibitory control (as indexed by Stroop colornaming performance) did not predict auditory learning. Overall, younger and older adults showed the same amount of auditory learning, indicating that statistical learning ability is preserved over the adult life span.
  • Nijland, L., & Janse, E. (Eds.). (2009). Auditory processing in speakers with acquired or developmental language disorders [Special Issue]. Clinical Linguistics and Phonetics, 23(3).
  • Nijveld, A., Ten Bosch, L., & Ernestus, M. (2015). Exemplar effects arise in a lexical decision task, but only under adverse listening conditions. In Scottish consortium for ICPhS, M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). Glasgow: University of Glasgow.

    Abstract

    This paper studies the influence of adverse listening conditions on exemplar effects in priming experiments that do not instruct participants to use their episodic memories. We conducted two lexical decision experiments, in which a prime and a target represented the same word type and could be spoken by the same or a different speaker. In Experiment 1, participants listened to clear speech, and showed no exemplar effects: they recognised repetitions by the same speaker as quickly as different speaker repetitions. In Experiment 2, the stimuli contained noise, and exemplar effects did arise. Importantly, Experiment 1 elicited longer average RTs than Experiment 2, a result that contradicts the time-course hypothesis, according to which exemplars only play a role when processing is slow. Instead, our findings support the hypothesis that exemplar effects arise under adverse listening conditions, when participants are stimulated to use their episodic memories in addition to their mental lexicons.
  • Ozturk, O., & Papafragou, A. (2008). Acquisition of evidentiality and source monitoring. In H. Chan, H. Jacob, & E. Kapia (Eds.), Proceedings from the 32nd Annual Boston University Conference on Language Development [BUCLD 32] (pp. 368-377). Somerville, Mass.: Cascadilla Press.
  • Ozyurek, A. (1998). An analysis of the basic meaning of Turkish demonstratives in face-to-face conversational interaction. In S. Santi, I. Guaitella, C. Cave, & G. Konopczynski (Eds.), Oralite et gestualite: Communication multimodale, interaction: actes du colloque ORAGE 98 (pp. 609-614). Paris: L'Harmattan.
  • Pacheco, A., Araújo, S., Faísca, L., Petersson, K. M., & Reis, A. (2009). Profiling dislexic children: Phonology and visual naming skills. In Abstracts presented at the International Neuropsychological Society, Finnish Neuropsychological Society, Joint Mid-Year Meeting July 29-August 1, 2009. Helsinki, Finland & Tallinn, Estonia (pp. 40). Retrieved from http://www.neuropsykologia.fi/ins2009/INS_MY09_Abstract.pdf.
  • Peeters, D., Snijders, T. M., Hagoort, P., & Ozyurek, A. (2015). The role of left inferior frontal Gyrus in the integration of point- ing gestures and speech. In G. Ferré, & M. Tutton (Eds.), Proceedings of the4th GESPIN - Gesture & Speech in Interaction Conference. Nantes: Université de Nantes.

    Abstract

    Comprehension of pointing gestures is fundamental to human communication. However, the neural mechanisms
    that subserve the integration of pointing gestures and speech in visual contexts in comprehension
    are unclear. Here we present the results of an fMRI study in which participants watched images of an
    actor pointing at an object while they listened to her referential speech. The use of a mismatch paradigm
    revealed that the semantic unication of pointing gesture and speech in a triadic context recruits left
    inferior frontal gyrus. Complementing previous ndings, this suggests that left inferior frontal gyrus
    semantically integrates information across modalities and semiotic domains.
  • Perlman, M., Paul, J., & Lupyan, G. (2015). Congenitally deaf children generate iconic vocalizations to communicate magnitude. In D. C. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. D. Jennings, & P. R. Maglio (Eds.), Proceedings of the 37th Annual Cognitive Science Society Meeting (CogSci 2015) (pp. 315-320). Austin, TX: Cognitive Science Society.

    Abstract

    From an early age, people exhibit strong links between certain visual (e.g. size) and acoustic (e.g. duration) dimensions. Do people instinctively extend these crossmodal correspondences to vocalization? We examine the ability of congenitally deaf Chinese children and young adults (age M = 12.4 years, SD = 3.7 years) to generate iconic vocalizations to distinguish items with contrasting magnitude (e.g., big vs. small ball). Both deaf and hearing (M = 10.1 years, SD = 0.83 years) participants produced longer, louder vocalizations for greater magnitude items. However, only hearing participants used pitch—higher pitch for greater magnitude – which counters the hypothesized, innate size “frequency code”, but fits with Mandarin language and culture. Thus our results show that the translation of visible magnitude into the duration and intensity of vocalization transcends auditory experience, whereas the use of pitch appears more malleable to linguistic and cultural influence.
  • Perniss, P. M., Ozyurek, A., & Morgan, G. (Eds.). (2015). The influence of the visual modality on language structure and conventionalization: Insights from sign language and gesture [Special Issue]. Topics in Cognitive Science, 7(1). doi:10.1111/tops.12113.
  • Perry, L., Perlman, M., & Lupyan, G. (2015). Iconicity in English vocabulary and its relation to toddlers’ word learning. In D. C. Noelle, R. Dale, A. S. Warlaumont, J. Yoshimi, T. Matlock, C. D. Jennings, & P. R. Maglio (Eds.), Proceedings of the 37th Annual Cognitive Science Society Meeting (CogSci 2015) (pp. 315-320). Austin, TX: Cognitive Science Society.

    Abstract

    Scholars have documented substantial classes of iconic vocabulary in many non-Indo-European languages. In comparison, Indo-European languages like English are assumed to be arbitrary outside of a small number of onomatopoeic words. In three experiments, we asked English speakers to rate the iconicity of words from the MacArthur-Bates Communicative Developmental Inventory. We found English—contrary to common belief—exhibits iconicity that correlates with age of acquisition and differs across lexical classes. Words judged as most iconic are learned earlier, in accord with findings that iconic words are easier to learn. We also find that adjectives and verbs are more iconic than nouns, supporting the idea that iconicity provides an extra cue in learning more difficult abstract meanings. Our results provide new evidence for a relationship between iconicity and word learning and suggest iconicity may be a more pervasive property of spoken languages than previously thought.
  • Petersson, K. M. (2008). On cognition, structured sequence processing, and adaptive dynamical systems. American Institute of Physics Conference Proceedings, 1060(1), 195-200.

    Abstract

    Cognitive neuroscience approaches the brain as a cognitive system: a system that functionally is conceptualized in terms of information processing. We outline some aspects of this concept and consider a physical system to be an information processing device when a subclass of its physical states can be viewed as representational/cognitive and transitions between these can be conceptualized as a process operating on these states by implementing operations on the corresponding representational structures. We identify a generic and fundamental problem in cognition: sequentially organized structured processing. Structured sequence processing provides the brain, in an essential sense, with its processing logic. In an approach addressing this problem, we illustrate how to integrate levels of analysis within a framework of adaptive dynamical systems. We note that the dynamical system framework lends itself to a description of asynchronous event-driven devices, which is likely to be important in cognition because the brain appears to be an asynchronous processing system. We use the human language faculty and natural language processing as a concrete example through out.
  • Reinisch, E., Jesse, A., & McQueen, J. M. (2008). The strength of stress-related lexical competition depends on the presence of first-syllable stress. In Proceedings of Interspeech 2008 (pp. 1954-1954).

    Abstract

    Dutch listeners' looks to printed words were tracked while they listened to instructions to click with their mouse on one of them. When presented with targets from word pairs where the first two syllables were segmentally identical but differed in stress location, listeners used stress information to recognize the target before segmental information disambiguated the words. Furthermore, the amount of lexical competition was influenced by the presence or absence of word-initial stress.
  • Reinisch, E., Jesse, A., & McQueen, J. M. (2008). Lexical stress information modulates the time-course of spoken-word recognition. In Proceedings of Acoustics' 08 (pp. 3183-3188).

    Abstract

    Segmental as well as suprasegmental information is used by Dutch listeners to recognize words. The time-course of the effect of suprasegmental stress information on spoken-word recognition was investigated in a previous study, in which we tracked Dutch listeners' looks to arrays of four printed words as they listened to spoken sentences. Each target was displayed along with a competitor that did not differ segmentally in its first two syllables but differed in stress placement (e.g., 'CENtimeter' and 'sentiMENT'). The listeners' eye-movements showed that stress information is used to recognize the target before distinct segmental information is available. Here, we examine the role of durational information in this effect. Two experiments showed that initial-syllable duration, as a cue to lexical stress, is not interpreted dependent on the speaking rate of the preceding carrier sentence. This still held when other stress cues like pitch and amplitude were removed. Rather, the speaking rate of the preceding carrier affected the speed of word recognition globally, even though the rate of the target itself was not altered. Stress information modulated lexical competition, but did so independently of the rate of the preceding carrier, even if duration was the only stress cue present.
  • Ringersma, J., Zinn, C., & Kemps-Snijders, M. (2009). LEXUS & ViCoS From lexical to conceptual spaces. In 1st International Conference on Language Documentation and Conservation (ICLDC).

    Abstract

    LEXUS and ViCoS: from lexicon to conceptual spaces LEXUS is a web-based lexicon tool and the knowledge space software ViCoS is an extension of LEXUS, allowing users to create relations between objects in and across lexica. LEXUS and ViCoS are part of the Language Archiving Technology software, developed at the MPI for Psycholinguistics to archive and enrich linguistic resources collected in the framework of language documentation projects. LEXUS is of primary interest for language documentation, offering the possibility to not just create a digital dictionary, but additionally it allows the creation of multi-media encyclopedic lexica. ViCoS provides an interface between the lexical space and the ontological space. Its approach permits users to model a world of concepts and their interrelations based on categorization patterns made by the speech community. We describe the LEXUS and ViCoS functionalities using three cases from DoBeS language documentation projects: (1) Marquesan The Marquesan lexicon was initially created in Toolbox and imported into LEXUS using the Toolbox import functionality. The lexicon is enriched with multi-media to illustrate the meaning of the words in its cultural environment. Members of the speech community consider words as keys to access and describe relevant parts of their life and traditions. Their understanding of words is best described by the various associations they evoke rather than in terms of any formal theory of meaning. Using ViCoS a knowledge space of related concepts is being created. (2) Kola-Sámi Two lexica are being created in LEXUS: RuSaDic lexicon is a Russian-Kildin wordlist in which the entries are of relative limited structure and content. SaRuDiC is a more complex structured lexicon with much richer content, including multi-media fragments and derivations. Using ViCoS we have created a connection between the two lexica, so that speakers who are familiair with Russian and wish to revitalize their Kildin can enter the lexicon through the RuSaDic and from there approach the informative SaRuDic. Similary we will create relations from the two lexica to external open databases, like e.g. Álgu. (3) Beaver A speaker database including kinship relations has been created and the database has been imported into LEXUS. In the LEXUS views the relations for individual speakers are being displayed. Using ViCoS the relational information from the database will be extracted to form a kisnhip relation space with specific relation types, like e.g 'mother-of'. The whole set of relations from the database can be displayed in one ViCoS relation window, and zoom functionality is available.
  • Roberts, S. G., Everett, C., & Blasi, D. (2015). Exploring potential climate effects on the evolution of human sound systems. In H. Little (Ed.), Proceedings of the 18th International Congress of Phonetic Sciences [ICPhS 2015] Satellite Event: The Evolution of Phonetic Capabilities: Causes constraints, consequences (pp. 14-19). Glasgow: ICPHS.

    Abstract

    We suggest that it is now possible to conduct research on a topic which might be called evolutionary geophonetics. The main question is how the climate influences the evolution of language. This involves biological adaptations to the climate that may affect biases in production and perception; cultural evolutionary adaptations of the sounds of a language to climatic conditions; and influences of the climate on language diversity and contact. We discuss these ideas with special reference to a recent hypothesis that lexical tone is not adaptive in dry climates (Everett, Blasi & Roberts, 2015).
  • Robotham, L., Trinkler, I., & Sauter, D. (2008). The power of positives: Evidence for an overall emotional recognition deficit in Huntington's disease [Abstract]. Journal of Neurology, Neurosurgery & Psychiatry, 79, A12.

    Abstract

    The recognition of emotions of disgust, anger and fear have been shown to be significantly impaired in Huntington’s disease (eg,Sprengelmeyer et al, 1997, 2006; Gray et al, 1997; Milders et al, 2003,Montagne et al, 2006; Johnson et al, 2007; De Gelder et al, 2008). The relative impairment of these emotions might have implied a recognition impairment specific to negative emotions. Could the asymmetric recognition deficits be due not to the complexity of the emotion but rather reflect the complexity of the task? In the current study, 15 Huntington’s patients and 16 control subjects were presented with negative and positive non-speech emotional vocalisations that were to be identified as anger, fear, sadness, disgust, achievement, pleasure and amusement in a forced-choice paradigm. This experiment more accurately matched the negative emotions with positive emotions in a homogeneous modality. The resulting dually impaired ability of Huntington’s patients to identify negative and positive non-speech emotional vocalisations correctly provides evidence for an overall emotional recognition deficit in the disease. These results indicate that previous findings of a specificity in emotional recognition deficits might instead be due to the limitations of the visual modality. Previous experiments may have found an effect of emotional specificy due to the presence of a single positive emotion, happiness, in the midst of multiple negative emotions. In contrast with the previous literature, the study presented here points to a global deficit in the recognition of emotional sounds.
  • De Ruiter, J. P. (2004). On the primacy of language in multimodal communication. In Workshop Proceedings on Multimodal Corpora: Models of Human Behaviour for the Specification and Evaluation of Multimodal Input and Output Interfaces.(LREC2004) (pp. 38-41). Paris: ELRA - European Language Resources Association (CD-ROM).

    Abstract

    In this paper, I will argue that although the study of multimodal interaction offers exciting new prospects for Human Computer Interaction and human-human communication research, language is the primary form of communication, even in multimodal systems. I will support this claim with theoretical and empirical arguments, mainly drawn from human-human communication research, and will discuss the implications for multimodal communication research and Human-Computer Interaction.
  • De Ruiter, L. E. (2008). How useful are polynomials for analyzing intonation? In Proceedings of Interspeech 2008 (pp. 785-789).

    Abstract

    This paper presents the first application of polynomial modeling as a means for validating phonological pitch accent labels to German data. It is compared to traditional phonetic analysis (measuring minima, maxima, alignment). The traditional method fares better in classification, but results are comparable in statistical accent pair testing. Robustness tests show that pitch correction is necessary in both cases. The approaches are discussed in terms of their practicability, applicability to other domains of research and interpretability of their results.
  • San Roque, L., & Bergvist, H. (Eds.). (2015). Epistemic marking in typological perspective [Special Issue]. STUF -Language typology and universals, 68(2).
  • Sauter, D., Scott, S., & Calder, A. (2004). Categorisation of vocally expressed positive emotion: A first step towards basic positive emotions? [Abstract]. Proceedings of the British Psychological Society, 12, 111.

    Abstract

    Most of the study of basic emotion expressions has focused on facial expressions and little work has been done to specifically investigate happiness, the only positive of the basic emotions (Ekman & Friesen, 1971). However, a theoretical suggestion has been made that happiness could be broken down into discrete positive emotions, which each fulfil the criteria of basic emotions, and that these would be expressed vocally (Ekman, 1992). To empirically test this hypothesis, 20 participants categorised 80 paralinguistic sounds using the labels achievement, amusement, contentment, pleasure and relief. The results suggest that achievement, amusement and relief are perceived as distinct categories, which subjects accurately identify. In contrast, the categories of contentment and pleasure were systematically confused with other responses, although performance was still well above chance levels. These findings are initial evidence that the positive emotions engage distinct vocal expressions and may be considered to be distinct emotion categories.
  • Sauter, D., Eisner, F., Rosen, S., & Scott, S. K. (2008). The role of source and filter cues in emotion recognition in speech [Abstract]. Journal of the Acoustical Society of America, 123, 3739-3740.

    Abstract

    In the context of the source-filter theory of speech, it is well established that intelligibility is heavily reliant on information carried by the filter, that is, spectral cues (e.g., Faulkner et al., 2001; Shannon et al., 1995). However, the extraction of other types of information in the speech signal, such as emotion and identity, is less well understood. In this study we investigated the extent to which emotion recognition in speech depends on filterdependent cues, using a forced-choice emotion identification task at ten levels of noise-vocoding ranging between one and 32 channels. In addition, participants performed a speech intelligibility task with the same stimuli. Our results indicate that compared to speech intelligibility, emotion recognition relies less on spectral information and more on cues typically signaled by source variations, such as voice pitch, voice quality, and intensity. We suggest that, while the reliance on spectral dynamics is likely a unique aspect of human speech, greater phylogenetic continuity across species may be found in the communication of affect in vocalizations.
  • Sauter, D. (2008). The time-course of emotional voice processing [Abstract]. Neurocase, 14, 455-455.

    Abstract

    Research using event-related brain potentials (ERPs) has demonstrated an early differential effect in fronto-central regions when processing emotional, as compared to affectively neutral facial stimuli (e.g., Eimer & Holmes, 2002). In this talk, data demonstrating a similar effect in the auditory domain will be presented. ERPs were recorded in a one-back task where participants had to identify immediate repetitions of emotion category, such as a fearful sound followed by another fearful sound. The stimulus set consisted of non-verbal emotional vocalisations communicating positive and negative sounds, as well as neutral baseline conditions. Similarly to the facial domain, fear sounds as compared to acoustically controlled neutral sounds, elicited a frontally distributed positivity with an onset latency of about 150 ms after stimulus onset. These data suggest the existence of a rapid multi-modal frontocentral mechanism discriminating emotional from non-emotional human signals.
  • Sauter, D., Eisner, F., Ekman, P., & Scott, S. K. (2009). Universal vocal signals of emotion. In N. Taatgen, & H. Van Rijn (Eds.), Proceedings of the 31st Annual Meeting of the Cognitive Science Society (CogSci 2009) (pp. 2251-2255). Cognitive Science Society.

    Abstract

    Emotional signals allow for the sharing of important information with conspecifics, for example to warn them of danger. Humans use a range of different cues to communicate to others how they feel, including facial, vocal, and gestural signals. Although much is known about facial expressions of emotion, less research has focused on affect in the voice. We compare British listeners to individuals from remote Namibian villages who have had no exposure to Western culture, and examine recognition of non-verbal emotional vocalizations, such as screams and laughs. We show that a number of emotions can be universally recognized from non-verbal vocal signals. In addition we demonstrate the specificity of this pattern, with a set of additional emotions only recognized within, but not across these cultural groups. Our findings indicate that a small set of primarily negative emotions have evolved signals across several modalities, while most positive emotions are communicated with culture-specific signals.
  • Scharenborg, O., & Cooke, M. P. (2008). Comparing human and machine recognition performance on a VCV corpus. In ISCA Tutorial and Research Workshop (ITRW) on "Speech Analysis and Processing for Knowledge Discovery".

    Abstract

    Listeners outperform ASR systems in every speech recognition task. However, what is not clear is where this human advantage originates. This paper investigates the role of acoustic feature representations. We test four (MFCCs, PLPs, Mel Filterbanks, Rate Maps) acoustic representations, with and without ‘pitch’ information, using the same backend. The results are compared with listener results at the level of articulatory feature classification. While no acoustic feature representation reached the levels of human performance, both MFCCs and Rate maps achieved good scores, with Rate maps nearing human performance on the classification of voicing. Comparing the results on the most difficult articulatory features to classify showed similarities between the humans and the SVMs: e.g., ‘dental’ was by far the least well identified by both groups. Overall, adding pitch information seemed to hamper classification performance.
  • Scharenborg, O., Boves, L., & Ten Bosch, L. (2004). ‘On-line early recognition’ of polysyllabic words in continuous speech. In S. Cassidy, F. Cox, R. Mannell, & P. Sallyanne (Eds.), Proceedings of the Tenth Australian International Conference on Speech Science & Technology (pp. 387-392). Canberra: Australian Speech Science and Technology Association Inc.

    Abstract

    In this paper, we investigate the ability of SpeM, our recognition system based on the combination of an automatic phone recogniser and a wordsearch module, to determine as early as possible during the word recognition process whether a word is likely to be recognised correctly (this we refer to as ‘on-line’ early word recognition). We present two measures that can be used to predict whether a word is correctly recognised: the Bayesian word activation and the amount of available (acoustic) information for a word. SpeM was tested on 1,463 polysyllabic words in 885 continuous speech utterances. The investigated predictors indicated that a word activation that is 1) high (but not too high) and 2) based on more phones is more reliable to predict the correctness of a word than a similarly high value based on a small number of phones or a lower value of the word activation.
  • Scharenborg, O. (2008). Modelling fine-phonetic detail in a computational model of word recognition. In INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association (pp. 1473-1476). ISCA Archive.

    Abstract

    There is now considerable evidence that fine-grained acoustic-phonetic detail in the speech signal helps listeners to segment a speech signal into syllables and words. In this paper, we compare two computational models of word recognition on their ability to capture and use this finephonetic detail during speech recognition. One model, SpeM, is phoneme-based, whereas the other, newly developed Fine- Tracker, is based on articulatory features. Simulations dealt with modelling the ability of listeners to distinguish short words (e.g., ‘ham’) from the longer words in which they are embedded (e.g., ‘hamster’). The simulations with Fine- Tracker showed that it was, like human listeners, able to distinguish between short words from the longer words in which they are embedded. This suggests that it is possible to extract this fine-phonetic detail from the speech signal and use it during word recognition.
  • Scharenborg, O., & Okolowski, S. (2009). Lexical embedding in spoken Dutch. In INTERSPEECH 2009 - 10th Annual Conference of the International Speech Communication Association (pp. 1879-1882). ISCA Archive.

    Abstract

    A stretch of speech is often consistent with multiple words, e.g., the sequence /hæm/ is consistent with ‘ham’ but also with the first syllable of ‘hamster’, resulting in temporary ambiguity. However, to what degree does this lexical embedding occur? Analyses on two corpora of spoken Dutch showed that 11.9%-19.5% of polysyllabic word tokens have word-initial embedding, while 4.1%-7.5% of monosyllabic word tokens can appear word-initially embedded. This is much lower than suggested by an analysis of a large dictionary of Dutch. Speech processing thus appears to be simpler than one might expect on the basis of statistics on a dictionary.
  • Scharenborg, O. (2009). Using durational cues in a computational model of spoken-word recognition. In INTERSPEECH 2009 - 10th Annual Conference of the International Speech Communication Association (pp. 1675-1678). ISCA Archive.

    Abstract

    Evidence that listeners use durational cues to help resolve temporarily ambiguous speech input has accumulated over the past few years. In this paper, we investigate whether durational cues are also beneficial for word recognition in a computational model of spoken-word recognition. Two sets of simulations were carried out using the acoustic signal as input. The simulations showed that the computational model, like humans, takes benefit from durational cues during word recognition, and uses these to disambiguate the speech signal. These results thus provide support for the theory that durational cues play a role in spoken-word recognition.
  • Schmidt, T., Duncan, S., Ehmer, O., Hoyt, J., Kipp, M., Loehr, D., Magnusson, M., Rose, T., & Sloetjes, H. (2008). An exchange format for multimodal annotations. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008).

    Abstract

    This paper presents the results of a joint effort of a group of multimodality researchers and tool developers to improve the interoperability between several tools used for the annotation of multimodality. We propose a multimodal annotation exchange format, based on the annotation graph formalism, which is supported by import and export routines in the respective tools
  • Schmidt, J., Scharenborg, O., & Janse, E. (2015). Semantic processing of spoken words under cognitive load in older listeners. In M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congress of Phonetic Sciences (ICPhS 2015). London: International Phonetic Association.

    Abstract

    Processing of semantic information in language comprehension has been suggested to be modulated by attentional resources. Consequently, cognitive load would be expected to reduce semantic priming, but studies have yielded inconsistent results. This study investigated whether cognitive load affects semantic activation in speech processing in older adults, and whether this is modulated by individual differences in cognitive and hearing abilities. Older adults participated in an auditory continuous lexical decision task in a low-load and high-load condition. The group analysis showed only a marginally significant reduction of semantic priming in the high-load condition compared to the low-load condition. The individual differences analysis showed that semantic priming was significantly reduced under increased load in participants with poorer attention-switching control. Hence, a resource-demanding secondary task may affect the integration of spoken words into a coherent semantic representation for listeners with poorer attentional skills.
  • Schubotz, L., Holler, J., & Ozyurek, A. (2015). Age-related differences in multi-modal audience design: Young, but not old speakers, adapt speech and gestures to their addressee's knowledge. In G. Ferré, & M. Tutton (Eds.), Proceedings of the 4th GESPIN - Gesture & Speech in Interaction Conference (pp. 211-216). Nantes: Université of Nantes.

    Abstract

    Speakers can adapt their speech and co-speech gestures for
    addressees. Here, we investigate whether this ability is
    modulated by age. Younger and older adults participated in a
    comic narration task in which one participant (the speaker)
    narrated six short comic stories to another participant (the
    addressee). One half of each story was known to both participants, the other half only to the speaker. Younger but
    not older speakers used more words and gestures when narrating novel story content as opposed to known content.
    We discuss cognitive and pragmatic explanations of these findings and relate them to theories of gesture production.
  • Schuerman, W. L., Nagarajan, S., & Houde, J. (2015). Changes in consonant perception driven by adaptation of vowel production to altered auditory feedback. In M. Wolters, J. Livingstone, B. Beattie, R. Smith, M. MacMahon, J. Stuart-Smith, & J. Scobbie (Eds.), Proceedings of the 18th International Congresses of Phonetic Sciences (ICPhS 2015). London: International Phonetic Association.

    Abstract

    Adaptation to altered auditory feedback has been shown to induce subsequent shifts in perception. However, it is uncertain whether these perceptual changes may generalize to other speech sounds. In this experiment, we tested whether exposing the production of a vowel to altered auditory feedback affects perceptual categorization of a consonant distinction. In two sessions, participants produced CVC words containing the vowel /i/, while intermittently categorizing stimuli drawn from a continuum between "see" and "she." In the first session feedback was unaltered, while in the second session the formants of the vowel were shifted 20% towards /u/. Adaptation to the altered vowel was found to reduce the proportion of perceived /S/ stimuli. We suggest that this reflects an alteration to the sensorimotor mapping that is shared between vowels and consonants.
  • Schuppler, B., Ernestus, M., Scharenborg, O., & Boves, L. (2008). Preparing a corpus of Dutch spontaneous dialogues for automatic phonetic analysis. In INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association (pp. 1638-1641). ISCA Archive.

    Abstract

    This paper presents the steps needed to make a corpus of Dutch spontaneous dialogues accessible for automatic phonetic research aimed at increasing our understanding of reduction phenomena and the role of fine phonetic detail. Since the corpus was not created with automatic processing in mind, it needed to be reshaped. The first part of this paper describes the actions needed for this reshaping in some detail. The second part reports the results of a preliminary analysis of the reduction phenomena in the corpus. For this purpose a phonemic transcription of the corpus was created by means of a forced alignment, first with a lexicon of canonical pronunciations and then with multiple pronunciation variants per word. In this study pronunciation variants were generated by applying a large set of phonetic processes that have been implicated in reduction to the canonical pronunciations of the words. This relatively straightforward procedure allows us to produce plausible pronunciation variants and to verify and extend the results of previous reduction studies reported in the literature.
  • Schuppler, B., Van Dommelen, W., Koreman, J., & Ernestus, M. (2009). Word-final [t]-deletion: An analysis on the segmental and sub-segmental level. In Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) (pp. 2275-2278). Causal Productions Pty Ltd.

    Abstract

    This paper presents a study on the reduction of word-final [t]s in conversational standard Dutch. Based on a large amount of tokens annotated on the segmental level, we show that the bigram frequency and the segmental context are the main predictors for the absence of [t]s. In a second study, we present an analysis of the detailed acoustic properties of word-final [t]s and we show that bigram frequency and context also play a role on the subsegmental level. This paper extends research on the realization of /t/ in spontaneous speech and shows the importance of incorporating sub-segmental properties in models of speech.
  • Scott, S., & Sauter, D. (2004). Vocal expressions of emotion and positive and negative basic emotions [Abstract]. Proceedings of the British Psychological Society, 12, 156.

    Abstract

    Previous studies have indicated that vocal and facial expressions of the ‘basic’ emotions share aspects of processing. Thus amygdala damage compromises the perception of fear and anger from the face and from the voice. In the current study we tested the hypothesis that there exist positive basic emotions, expressed mainly in the voice (Ekman, 1992). Vocal stimuli were produced to express the specific positive emotions of amusement, achievement, pleasure, contentment and relief.
  • Senft, G. (1991). Bakavilisi Biga - we can 'turn' the language - or: What happens to English words in Kilivila language? In W. Bahner, J. Schildt, & D. Viehwegger (Eds.), Proceedings of the XIVth International Congress of Linguists (pp. 1743-1746). Berlin: Akademie Verlag.
  • Seuren, P. A. M. (1966). Het probleem van de woorddefinitie. In Handelingen van het 29ste Nederlands Filologencongres (pp. 103-108).
  • Seuren, P. A. M. (1984). Logic and truth-values in language. In F. Landman, & F. Veltman (Eds.), Varieties of formal semantics: Proceedings of the fourth Amsterdam colloquium (pp. 343-364). Dordrecht: Foris.

Share this page