Gerard Kempen

Publications

Displaying 1 - 15 of 15
  • Kempen, G., & Harbusch, K. (2016). Verb-second word order after German weil ‘because’: psycholinguistic theory from corpus-linguistic data. Glossa: a journal of general linguistics, 1(1): 3. doi:10.5334/gjgl.46.

    Abstract

    In present-day spoken German, subordinate clauses introduced by the connector weil ‘because’ occur with two orders of subject, finite verb, and object(s). In addition to weil clauses with verb-final word order (“VF”; standard in subordinate clauses) one often hears weil clauses with SVO, the standard order of main clauses (“verb-second”, V2). The “weil-V2” phenomenon is restricted to sentences where the weil clause follows the main clause, and is virtually absent from formal (written, edited) German, occurring only in extemporaneous speech. Extant accounts of weil-V2 focus on the interpretation of weil-V2 clauses by the hearer, in particular on the type of discourse relation licensed by weil-V2 vs. weil-VF: causal/propositional or inferential/epistemic. Focusing instead on the production of weil clauses by the speaker, we examine a collection of about 1,000 sentences featuring a causal connector (weil, da or denn) after the main clause, all extracted from a corpus of spoken German dialogues and annotated with tags denoting major prosodic and syntactic boundaries, and various types of disfluencies (pauses, hesitations). Based on the observed frequency patterns and on known linguistic properties of the connectors, we propose that weil-V2 is caused by miscoordination between the mechanisms for lexical retrieval and grammatical encoding: Due to its high frequency, the lexical item weil is often selected prematurely, while the grammatical encoder is still working on the syntactic shape of the weil clause. Weil-V2 arises when pragmatic and processing factors drive the encoder to discontinue the current sentence, and to plan the clause following weil in the form of the main clause of an independent, new sentence. Thus, the speaker continues with a V2 clause, seemingly in violation of the VF constraint imposed by the preceding weil. We also explore implications of the model regarding the interpretation of sentences containing causal connectors.
  • Harbusch, K., & Kempen, G. (2009). Clausal coordinate ellipsis and its varieties in spoken German: A study with the TüBa-D/S Treebank of the VERBMOBIL corpus. In M. Passarotti, A. Przepiórkowski, S. Raynaud, & F. Van Eynde (Eds.), Proceedings of the The Eighth International Workshop on Treebanks and Linguistic Theories (pp. 83-94). Milano: EDUCatt.
  • Harbusch, K., & Kempen, G. (2009). Generating clausal coordinate ellipsis multilingually: A uniform approach based on postediting. In 12th European Workshop on Natural Language Generation: Proceedings of the Workshop (pp. 138-145). The Association for Computational Linguistics.

    Abstract

    Present-day sentence generators are often in-capable of producing a wide variety of well-formed elliptical versions of coordinated clauses, in particular, of combined elliptical phenomena (Gapping, Forward and Back-ward Conjunction Reduction, etc.). The ap-plicability of the various types of clausal co-ordinate ellipsis (CCE) presupposes detailed comparisons of the syntactic properties of the coordinated clauses. These nonlocal comparisons argue against approaches based on local rules that treat CCE structures as special cases of clausal coordination. We advocate an alternative approach where CCE rules take the form of postediting rules ap-plicable to nonelliptical structures. The ad-vantage is not only a higher level of modu-larity but also applicability to languages be-longing to different language families. We describe a language-neutral module (called Elleipo; implemented in JAVA) that gener-ates as output all major CCE versions of co-ordinated clauses. Elleipo takes as input linearly ordered nonelliptical coordinated clauses annotated with lexical identity and coreferentiality relationships between words and word groups in the conjuncts. We dem-onstrate the feasibility of a single set of postediting rules that attains multilingual coverage.
  • Kempen, G. (2009). Clausal coordination and coordinative ellipsis in a model of the speaker. Linguistics, 47(3), 653-696. doi:10.1515/LING.2009.022.

    Abstract

    This article presents a psycholinguistically inspired approach to the syntax of clause-level coordination and coordinate ellipsis. It departs from the assumption that coordinations are structurally similar to so-called appropriateness repairs — an important type of self-repairs in spontaneous speech. Coordinate structures and appropriateness repairs can both be viewed as “update” constructions. Updating is defined as a special sentence production mode that efficiently revises or augments existing sentential structure in response to modifications in the speaker's communicative intention. This perspective is shown to offer an empirically satisfactory and theoretically parsimonious account of two prominent types of coordinate ellipsis, in particular “forward conjunction reduction” (FCR) and “gapping” (including “long-distance gapping” and “subgapping”). They are analyzed as different manifestations of “incremental updating” — efficient updating of only part of the existing sentential structure. Based on empirical data from Dutch and German, novel treatments are proposed for both types of clausal coordinate ellipsis. The coordination-as-updating perspective appears to explain some general properties of coordinate structure: the existence of the well-known “coordinate structure constraint”, and the attractiveness of three-dimensional representations of coordination. Moreover, two other forms of coordinate ellipsis — SGF (“subject gap in finite clauses with fronted verb”), and “backward conjunction reduction” (BCR) (also known as “right node raising” or RNR) — are shown to be incompatible with the notion of incremental updating. Alternative theoretical interpretations of these phenomena are proposed. The four types of clausal coordinate ellipsis — SGF, gapping, FCR and BCR — are argued to originate in four different stages of sentence production: Intending (i.e., preparing the communicative intention), conceptualization, grammatical encoding, and phonological encoding, respectively.
  • Snijders, T. M., Vosse, T., Kempen, G., Van Berkum, J. J. A., Petersson, K. M., & Hagoort, P. (2009). Retrieval and unification of syntactic structure in sentence comprehension: An fMRI study using word-category ambiguity. Cerebral Cortex, 19, 1493-1503. doi:10.1093/cercor/bhn187.

    Abstract

    Sentence comprehension requires the retrieval of single word information from long-term memory, and the integration of this information into multiword representations. The current functional magnetic resonance imaging study explored the hypothesis that the left posterior temporal gyrus supports the retrieval of lexical-syntactic information, whereas left inferior frontal gyrus (LIFG) contributes to syntactic unification. Twenty-eight subjects read sentences and word sequences containing word-category (noun–verb) ambiguous words at critical positions. Regions contributing to the syntactic unification process should show enhanced activation for sentences compared to words, and only within sentences display a larger signal for ambiguous than unambiguous conditions. The posterior LIFG showed exactly this predicted pattern, confirming our hypothesis that LIFG contributes to syntactic unification. The left posterior middle temporal gyrus was activated more for ambiguous than unambiguous conditions (main effect over both sentences and word sequences), as predicted for regions subserving the retrieval of lexical-syntactic information from memory. We conclude that understanding language involves the dynamic interplay between left inferior frontal and left posterior temporal regions.

    Additional information

    suppl1.pdf suppl2_dutch_stimulus.pdf
  • Vosse, T., & Kempen, G. (2009). In defense of competition during syntactic ambiguity resolution. Journal of Psycholinguistic Research, 38(1), 1-9. doi:10.1007/s10936-008-9075-1.

    Abstract

    In a recent series of publications (Traxler et al. J Mem Lang 39:558–592, 1998; Van Gompel et al. J Mem Lang 52:284–307, 2005; see also Van Gompel et al. (In: Kennedy, et al.(eds) Reading as a perceptual process, Oxford, Elsevier pp 621–648, 2000); Van Gompel et al. J Mem Lang 45:225–258, 2001) eye tracking data are reported showing that globally ambiguous (GA) sentences are read faster than locally ambiguous (LA) counterparts. They argue that these data rule out “constraint-based” models where syntactic and conceptual processors operate concurrently and syntactic ambiguity resolution is accomplished by competition. Such models predict the opposite pattern of reading times. However, this argument against competition is valid only in conjunction with two standard assumptions in current constraint-based models of sentence comprehension: (1) that syntactic competitions (e.g., Which is the best attachment site of the incoming constituent?) are pooled together with conceptual competitions (e.g., Which attachment site entails the most plausible meaning?), and (2) that the duration of a competition is a function of the overall (pooled) quality score obtained by each competitor. We argue that it is not necessary to abandon competition as a successful basis for explaining parsing phenomena and that the above-mentioned reading time data can be accounted for by a parallel-interactive model with conceptual and syntactic processors that do not pool their quality scores together. Within the individual linguistic modules, decision-making can very well be competition-based.
  • Vosse, T., & Kempen, G. (2009). The Unification Space implemented as a localist neural net: Predictions and error-tolerance in a constraint-based parser. Cognitive Neurodynamics, 3, 331-346. doi:10.1007/s11571-009-9094-0.

    Abstract

    We introduce a novel computer implementation of the Unification-Space parser (Vosse & Kempen 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen & Harbusch 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least in a qualitative and rudimentary sense, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described.
  • Kempen, G., & Vosse, T. (1992). A language-sensitive text editor for Dutch. In P. O’Brian Holt, & N. Williams (Eds.), Computers and writing: State of the art (pp. 68-77). Dordrecht: Kluwer Academic Publishers.

    Abstract

    Modern word processors begin to offer a range of facilities for spelling, grammar and style checking in English. For the Dutch language hardly anything is available as yet. Many commercial word processing packages do include a hyphenation routine and a lexicon-based spelling checker but the practical usefulness of these tools is limited due to certain properties of Dutch orthography, as we will explain below. In this chapter we describe a text editor which incorporates a great deal of lexical, morphological and syntactic knowledge of Dutch and monitors the orthographical quality of Dutch texts. Section 1 deals with those aspects of Dutch orthography which pose problems to human authors as well as to computational language sensitive text editing tools. In section 2 we describe the design and the implementation of the text editor we have built. Section 3 is mainly devoted to a provisional evaluation of the system.
  • Kempen, G. (1992). Generation. In W. Bright (Ed.), International encyclopedia of linguistics (pp. 59-61). New York: Oxford University Press.
  • Kempen, G. (1992). Language technology and language instruction: Computational diagnosis of word level errors. In M. Swartz, & M. Yazdani (Eds.), Intelligent tutoring systems for foreign language learning: The bridge to international communication (pp. 191-198). Berlin: Springer.
  • Kempen, G. (1992). Grammar based text processing. Document Management: Nieuwsbrief voor Documentaire Informatiekunde, 1(2), 8-10.
  • Kempen, G. (1992). Second language acquisition as a hybrid learning process. In F. Engel, D. Bouwhuis, T. Bösser, & G. d'Ydewalle (Eds.), Cognitive modelling and interactive environments in language learning (pp. 139-144). Berlin: Springer.
  • Kempen, G. (1975). De taalgebruiker in de mens: Schets van zijn bouw en funktie, toepassingen op moedertaal en vreemde taal verwerving. Forum der Letteren, 16, 132-158.
  • Kempen, G. (1975). Theoretiseren en experimenteren in de cognitieve psychologie. Gedrag: Tijdschrift voor Psychologie, 6, 341-347.
  • Levelt, W. J. M., & Kempen, G. (1975). Semantic and syntactic aspects of remembering sentences: A review of some recent continental research. In A. Kennedy, & W. Wilkes (Eds.), Studies in long term memory (pp. 201-216). New York: Wiley.

Share this page