Gerard Kempen

Publications

  • Kempen, G., & Harbusch, K. (2019). Mutual attraction between high-frequency verbs and clause types with finite verbs in early positions: Corpus evidence from spoken English, Dutch, and German. Language, Cognition and Neuroscience, 34(9), 1140-1151. doi:10.1080/23273798.2019.1642498.

    Abstract

    We report a hitherto unknown statistical relationship between the corpus frequency of finite verbs and their fixed linear positions (early vs. late) in finite clauses of English, Dutch, and German. Compared to the overall frequency distribution of verb lemmas in the corpora, high-frequency finite verbs are overused in main clauses, at the expense of nonfinite verbs. This finite versus nonfinite split of high-frequency verbs is basically absent from subordinate clauses. Furthermore, this “main-clause bias” (MCB) of high-frequency verbs is more prominent in German and Dutch (SOV languages) than in English (an SVO language). We attribute the MCB and its varying effect sizes to faster accessibility of high-frequency finite verbs, which (1) increases the probability for these verbs to land in clauses mandating early verb placement, and (2) boosts the activation of clause plans that assign verbs to early linear positions (in casu: clauses with SVO as opposed to SOV order).

    Supplementary material

    plcp_a_1642498_sm1530.pdf
  • Kempen, G., & Harbusch, K. (2018). A competitive mechanism selecting verb-second versus verb-final word order in causative and argumentative clauses of spoken Dutch: A corpus-linguistic study. Language Sciences, 69, 30-42. doi:10.1016/j.langsci.2018.05.005.

    Abstract

    In Dutch and German, the canonical order of subject, object(s) and finite verb is ‘verb-second’ (V2) in main but ‘verb-final’ (VF) in subordinate clauses. This occasionally leads to the production of noncanonical word orders. Familiar examples are causative and argumentative clauses introduced by a subordinating conjunction (Du. omdat, Ger. weil ‘because’): the omdat/weil-V2 phenomenon. Such clauses may also be introduced by coordinating conjunctions (Du. want, Ger. denn), which license V2 exclusively. However, want/denn-VF structures are unknown. We present the results of a corpus study on the incidence of omdat-V2 in spoken Dutch, and compare them to published data on weil-V2 in spoken German. Basic findings: omdat-V2 is much less frequent than weil-V2 (ratio almost 1:8); and the frequency relations between coordinating and subordinating conjunctions are opposite (want >> omdat; denn << weil). We propose that conjunction selection and V2/VF selection proceed partly independently, and sometimes miscommunicate—e.g. yielding omdat/weil paired with V2. Want/denn-VF pairs do not occur because want/denn clauses are planned as autonomous sentences, which take V2 by default. We sketch a simple feedforward neural network with two layers of nodes (representing conjunctions and word orders, respectively) that can simulate the observed data pattern through inhibition-based competition of the alternative choices within the node layers.
  • Kempen, G., & Harbusch, K. (2017). Frequential test of (S)OV as unmarked word order in Dutch and German clauses: A serendipitous corpus-linguistic experiment. In H. Reckman, L. L. S. Cheng, M. Hijzelendoorn, & R. Sybesma (Eds.), Crossroads semantics: Computation, experiment and grammar (pp. 107-123). Amsterdam: Benjamins.

    Abstract

    In a paper entitled “Against markedness (and what to replace it with)”, Haspelmath argues “that the term ‘markedness’ is superfluous”, and that frequency asymmetries often explain structural (un)markedness asymmetries (Haspelmath 2006). We investigate whether this argument applies to Object and Verb orders in main (VO, marked) and subordinate (OV, unmarked) clauses of spoken and written German and Dutch, using English (without VO/OV alternation) as control. Frequency counts from six treebanks (three languages, two output modalities) do not support Haspelmath’s proposal. However, they reveal an unexpected phenomenon, most prominently in spoken Dutch and German: a small set of extremely high-frequent finite verbs with unspecific meanings populates main clauses much more densely than subordinate clauses. We suggest these verbs accelerate the start-up of grammatical encoding, thus facilitating sentence-initial output fluency.
  • Kuiper, K., Bimesl, N., Kempen, G., & Ogino, M. (2017). Initial vs. non-initial placement of agent constructions in spoken clauses: A corpus-based study of language production under time pressure. Language Sciences, 64, 16-33. doi:10.1016/j.langsci.2017.06.001.

    Abstract

    In this exploratory study we test the hypothesis that the retrieval from memory of proper noun Agents (PNAs) under processing pressure causes a greater proportion of such semantic arguments to be placed to the right of the initial position in a clause than would be the case if such retrieval from memory were not necessary. This effect is manifest in sports commentary. Processing pressure on sports commentators is modulated by the speed at which the sport is played and reported. Non-initial placement is also facilitated by formulae which have slots in non-initial position. It follows that the non-initial placement of PNAs is not always semantically or pragmatically motivated. This finding therefore runs counter to a strong form of the functionalist hypothesis that syntactic choices available in the systemic structure of the syntax of a language offer solely semantic or pragmatic choices. It is an open question in a weak functionalist account of language and language use how processing and communicative functions interact in general.
  • Kempen, G., & Harbusch, K. (2016). Verb-second word order after German weil ‘because’: psycholinguistic theory from corpus-linguistic data. Glossa: a journal of general linguistics, 1(1): 3. doi:10.5334/gjgl.46.

    Abstract

    In present-day spoken German, subordinate clauses introduced by the connector weil ‘because’ occur with two orders of subject, finite verb, and object(s). In addition to weil clauses with verb-final word order (“VF”; standard in subordinate clauses) one often hears weil clauses with SVO, the standard order of main clauses (“verb-second”, V2). The “weil-V2” phenomenon is restricted to sentences where the weil clause follows the main clause, and is virtually absent from formal (written, edited) German, occurring only in extemporaneous speech. Extant accounts of weil-V2 focus on the interpretation of weil-V2 clauses by the hearer, in particular on the type of discourse relation licensed by weil-V2 vs. weil-VF: causal/propositional or inferential/epistemic. Focusing instead on the production of weil clauses by the speaker, we examine a collection of about 1,000 sentences featuring a causal connector (weil, da or denn) after the main clause, all extracted from a corpus of spoken German dialogues and annotated with tags denoting major prosodic and syntactic boundaries, and various types of disfluencies (pauses, hesitations). Based on the observed frequency patterns and on known linguistic properties of the connectors, we propose that weil-V2 is caused by miscoordination between the mechanisms for lexical retrieval and grammatical encoding: Due to its high frequency, the lexical item weil is often selected prematurely, while the grammatical encoder is still working on the syntactic shape of the weil clause. Weil-V2 arises when pragmatic and processing factors drive the encoder to discontinue the current sentence, and to plan the clause following weil in the form of the main clause of an independent, new sentence. Thus, the speaker continues with a V2 clause, seemingly in violation of the VF constraint imposed by the preceding weil. We also explore implications of the model regarding the interpretation of sentences containing causal connectors.
  • Van de Velde, M., Kempen, G., & Harbusch, K. (2015). Dative alternation and planning scope in spoken language: A corpus study on effects of verb bias in VO and OV clauses of Dutch. Lingua, 165, 92-108. doi:10.1016/j.lingua.2015.07.006.

    Abstract

    The syntactic structure of main and subordinate clauses is determined to a considerable extent by verb biases. For example, some English and Dutch ditransitive verbs have a preference for the prepositional object dative, whereas others are typically used with the double object dative. In this study, we compare the effect of these biases on structure selection in (S)VO and (S)OV dative clauses in the Corpus of Spoken Dutch (CGN). This comparison allowed us to make inferences about the size of the advance planning scope during spontaneous speaking: If the verb is an obligatory component of clause-level advance planning scope, as is claimed by the hypothesis of hierarchical incrementality, then biases should exert their influence on structure choices, regardless of early (VO) or late (OV) position of the verb in the clause. Conversely, if planning proceeds in a piecemeal fashion, strictly guided by lexical availability, as claimed by linear incrementality, then the verb and its associated biases can only influence structure choices in VO sentences. We tested these predictions by analyzing structure choices in the CGN, using mixed logit models. Our results support a combination of linear and hierarchical incrementality, showing a significant influence of verb bias on structure choices in VO, and a weaker (but still significant) effect in OV clauses.
  • Kempen, G. (2014). Prolegomena to a neurocomputational architecture for human grammatical encoding and decoding. Neuroinformatics, 12, 111-142. doi:10.1007/s12021-013-9191-4.

    Abstract

    The study develops a neurocomputational architecture for grammatical processing in language production and language comprehension (grammatical encoding and decoding, respectively). It seeks to answer two questions. First, how is online syntactic structure formation of the complexity required by natural-language grammars possible in a fixed, preexisting neural network without the need for online creation of new connections or associations? Second, is it realistic to assume that the seemingly disparate instantiations of syntactic structure formation in grammatical encoding and grammatical decoding can run on the same neural infrastructure? This issue is prompted by accumulating experimental evidence for the hypothesis that the mechanisms for grammatical decoding overlap with those for grammatical encoding to a considerable extent, thus inviting the hypothesis of a single “grammatical coder.” The paper answers both questions by providing the blueprint for a syntactic structure formation mechanism that is entirely based on prewired circuitry (except for referential processing, which relies on the rapid learning capacity of the hippocampal complex), and can subserve decoding as well as encoding tasks. The model builds on the “Unification Space” model of syntactic parsing developed by Vosse & Kempen (2000, 2008, 2009). The design includes a neurocomputational mechanism for the treatment of an important class of grammatical movement phenomena.
  • Segaert, K., Kempen, G., Petersson, K. M., & Hagoort, P. (2013). Syntactic priming and the lexical boost effect during sentence production and sentence comprehension: An fMRI study. Brain and Language, 124, 174-183. doi:10.1016/j.bandl.2012.12.003.

    Abstract

    Behavioral syntactic priming effects during sentence comprehension are typically observed only if both the syntactic structure and lexical head are repeated. In contrast, during production syntactic priming occurs with structure repetition alone, but the effect is boosted by repetition of the lexical head. We used fMRI to investigate the neuronal correlates of syntactic priming and lexical boost effects during sentence production and comprehension. The critical measure was the magnitude of fMRI adaptation to repetition of sentences in active or passive voice, with or without verb repetition. In conditions with repeated verbs, we observed adaptation to structure repetition in the left IFG and MTG, for active and passive voice. However, in the absence of repeated verbs, adaptation occurred only for passive sentences. None of the fMRI adaptation effects yielded differential effects for production versus comprehension, suggesting that sentence comprehension and production are subserved by the same neuronal infrastructure for syntactic processing.

    Supplementary material

    Segaert_Supplementary_data_2013.docx
  • Kempen, G., Olsthoorn, N., & Sprenger, S. (2012). Grammatical workspace sharing during language production and language comprehension: Evidence from grammatical multitasking. Language and Cognitive Processes, 27, 345-380. doi:10.1080/01690965.2010.544583.

    Abstract

    Grammatical encoding and grammatical decoding (in sentence production and comprehension, respectively) are often portrayed as independent modalities of grammatical performance that only share declarative resources: lexicon and grammar. The processing resources subserving these modalities are supposed to be distinct. In particular, one assumes the existence of two workspaces where grammatical structures are assembled and temporarily maintained—one for each modality. An alternative theory holds that the two modalities share many of their processing resources and postulates a single mechanism for the online assemblage and short-term storage of grammatical structures: a shared workspace. We report two experiments with a novel “grammatical multitasking” paradigm: the participants had to read (i.e., decode) and to paraphrase (encode) sentences presented in fragments, responding to each input fragment as fast as possible with a fragment of the paraphrase. The main finding was that grammatical constraints with respect to upcoming input that emanate from decoded sentence fragments are immediately replaced by grammatical expectations emanating from the structure of the corresponding paraphrase fragments. This evidences that the two modalities have direct access to, and operate upon, the same (i.e., token-identical) grammatical structures. This is possible only if the grammatical encoding and decoding processes command the same, shared grammatical workspace. Theoretical implications for important forms of grammatical multitasking—self-monitoring, turn-taking in dialogue, speech shadowing, and simultaneous translation—are explored.
  • Harbusch, K., & Kempen, G. (2011). Automatic online writing support for L2 learners of German through output monitoring by a natural-language paraphrase generator. In M. Levy, F. Blin, C. Bradin Siskin, & O. Takeuchi (Eds.), WorldCALL: International perspectives on computer-assisted language learning (pp. 128-143). New York: Routledge.

    Abstract

    Students who are learning to write in a foreign language often want feedback on the grammatical quality of the sentences they produce. The usual NLP approach to this problem is based on parsing student-generated text. Here, we propose a generation-based approach aiming at preventing errors ("scaffolding"). In our ICALL system, the student constructs sentences by composing syntactic trees out of lexically anchored "treelets" via a graphical drag & drop user interface. A natural-language generator computes all possible grammatically well-formed sentences entailed by the student-composed tree. It provides positive feedback if the student-composed tree belongs to the well-formed set, and negative feedback otherwise. If so requested by the student, it can substantiate the positive or negative feedback based on a comparison between the student-composed tree and its own trees (informative feedback on demand). In case of negative feedback, the system refuses to build the structure attempted by the student. Frequently occurring errors are handled in terms of "malrules." The system we describe is a prototype (implemented in JAVA and C++) which can be parameterized with respect to L1 and L2, the size of the lexicon, and the level of detail of the visually presented grammatical structures.
  • Harbusch, K., & Kempen, G. (2009). Clausal coordinate ellipsis and its varieties in spoken German: A study with the TüBa-D/S Treebank of the VERBMOBIL corpus. In M. Passarotti, A. Przepiórkowski, S. Raynaud, & F. Van Eynde (Eds.), Proceedings of the Eighth International Workshop on Treebanks and Linguistic Theories (pp. 83-94). Milano: EDUCatt.
  • Harbusch, K., & Kempen, G. (2009). Generating clausal coordinate ellipsis multilingually: A uniform approach based on postediting. In 12th European Workshop on Natural Language Generation: Proceedings of the Workshop (pp. 138-145). The Association for Computational Linguistics.

    Abstract

    Present-day sentence generators are often incapable of producing a wide variety of well-formed elliptical versions of coordinated clauses, in particular, of combined elliptical phenomena (Gapping, Forward and Backward Conjunction Reduction, etc.). The applicability of the various types of clausal coordinate ellipsis (CCE) presupposes detailed comparisons of the syntactic properties of the coordinated clauses. These nonlocal comparisons argue against approaches based on local rules that treat CCE structures as special cases of clausal coordination. We advocate an alternative approach where CCE rules take the form of postediting rules applicable to nonelliptical structures. The advantage is not only a higher level of modularity but also applicability to languages belonging to different language families. We describe a language-neutral module (called Elleipo; implemented in JAVA) that generates as output all major CCE versions of coordinated clauses. Elleipo takes as input linearly ordered nonelliptical coordinated clauses annotated with lexical identity and coreferentiality relationships between words and word groups in the conjuncts. We demonstrate the feasibility of a single set of postediting rules that attains multilingual coverage.
  • Kempen, G. (2009). Clausal coordination and coordinative ellipsis in a model of the speaker. Linguistics, 47(3), 653-696. doi:10.1515/LING.2009.022.

    Abstract

    This article presents a psycholinguistically inspired approach to the syntax of clause-level coordination and coordinate ellipsis. It departs from the assumption that coordinations are structurally similar to so-called appropriateness repairs — an important type of self-repairs in spontaneous speech. Coordinate structures and appropriateness repairs can both be viewed as “update” constructions. Updating is defined as a special sentence production mode that efficiently revises or augments existing sentential structure in response to modifications in the speaker's communicative intention. This perspective is shown to offer an empirically satisfactory and theoretically parsimonious account of two prominent types of coordinate ellipsis, in particular “forward conjunction reduction” (FCR) and “gapping” (including “long-distance gapping” and “subgapping”). They are analyzed as different manifestations of “incremental updating” — efficient updating of only part of the existing sentential structure. Based on empirical data from Dutch and German, novel treatments are proposed for both types of clausal coordinate ellipsis. The coordination-as-updating perspective appears to explain some general properties of coordinate structure: the existence of the well-known “coordinate structure constraint”, and the attractiveness of three-dimensional representations of coordination. Moreover, two other forms of coordinate ellipsis — SGF (“subject gap in finite clauses with fronted verb”), and “backward conjunction reduction” (BCR) (also known as “right node raising” or RNR) — are shown to be incompatible with the notion of incremental updating. Alternative theoretical interpretations of these phenomena are proposed. 
    The four types of clausal coordinate ellipsis (SGF, gapping, FCR and BCR) are argued to originate in four different stages of sentence production: Intending (i.e., preparing the communicative intention), conceptualization, grammatical encoding, and phonological encoding, respectively.
  • Snijders, T. M., Vosse, T., Kempen, G., Van Berkum, J. J. A., Petersson, K. M., & Hagoort, P. (2009). Retrieval and unification of syntactic structure in sentence comprehension: An fMRI study using word-category ambiguity. Cerebral Cortex, 19, 1493-1503. doi:10.1093/cercor/bhn187.

    Abstract

    Sentence comprehension requires the retrieval of single word information from long-term memory, and the integration of this information into multiword representations. The current functional magnetic resonance imaging study explored the hypothesis that the left posterior temporal gyrus supports the retrieval of lexical-syntactic information, whereas left inferior frontal gyrus (LIFG) contributes to syntactic unification. Twenty-eight subjects read sentences and word sequences containing word-category (noun–verb) ambiguous words at critical positions. Regions contributing to the syntactic unification process should show enhanced activation for sentences compared to words, and only within sentences display a larger signal for ambiguous than unambiguous conditions. The posterior LIFG showed exactly this predicted pattern, confirming our hypothesis that LIFG contributes to syntactic unification. The left posterior middle temporal gyrus was activated more for ambiguous than unambiguous conditions (main effect over both sentences and word sequences), as predicted for regions subserving the retrieval of lexical-syntactic information from memory. We conclude that understanding language involves the dynamic interplay between left inferior frontal and left posterior temporal regions.

    Supplementary material

    suppl1.pdf suppl2_dutch_stimulus.pdf
  • Vosse, T., & Kempen, G. (2009). In defense of competition during syntactic ambiguity resolution. Journal of Psycholinguistic Research, 38(1), 1-9. doi:10.1007/s10936-008-9075-1.

    Abstract

    In a recent series of publications (Traxler et al. J Mem Lang 39:558–592, 1998; Van Gompel et al. J Mem Lang 52:284–307, 2005; see also Van Gompel et al. (In: Kennedy, et al.(eds) Reading as a perceptual process, Oxford, Elsevier pp 621–648, 2000); Van Gompel et al. J Mem Lang 45:225–258, 2001) eye tracking data are reported showing that globally ambiguous (GA) sentences are read faster than locally ambiguous (LA) counterparts. They argue that these data rule out “constraint-based” models where syntactic and conceptual processors operate concurrently and syntactic ambiguity resolution is accomplished by competition. Such models predict the opposite pattern of reading times. However, this argument against competition is valid only in conjunction with two standard assumptions in current constraint-based models of sentence comprehension: (1) that syntactic competitions (e.g., Which is the best attachment site of the incoming constituent?) are pooled together with conceptual competitions (e.g., Which attachment site entails the most plausible meaning?), and (2) that the duration of a competition is a function of the overall (pooled) quality score obtained by each competitor. We argue that it is not necessary to abandon competition as a successful basis for explaining parsing phenomena and that the above-mentioned reading time data can be accounted for by a parallel-interactive model with conceptual and syntactic processors that do not pool their quality scores together. Within the individual linguistic modules, decision-making can very well be competition-based.
  • Vosse, T., & Kempen, G. (2009). The Unification Space implemented as a localist neural net: Predictions and error-tolerance in a constraint-based parser. Cognitive Neurodynamics, 3, 331-346. doi:10.1007/s11571-009-9094-0.

    Abstract

    We introduce a novel computer implementation of the Unification-Space parser (Vosse & Kempen 2000) in the form of a localist neural network whose dynamics is based on interactive activation and inhibition. The wiring of the network is determined by Performance Grammar (Kempen & Harbusch 2003), a lexicalist formalism with feature unification as binding operation. While the network is processing input word strings incrementally, the evolving shape of parse trees is represented in the form of changing patterns of activation in nodes that code for syntactic properties of words and phrases, and for the grammatical functions they fulfill. The system is capable, at least in a qualitative and rudimentary sense, of simulating several important dynamic aspects of human syntactic parsing, including garden-path phenomena and reanalysis, effects of complexity (various types of clause embeddings), fault-tolerance in case of unification failures and unknown words, and predictive parsing (expectation-based analysis, surprisal effects). English is the target language of the parser described.
  • Harbusch, K., Kempen, G., & Vosse, T. (2008). A natural-language paraphrase generator for on-line monitoring and commenting incremental sentence construction by L2 learners of German. In Proceedings of WorldCALL 2008.

    Abstract

    Certain categories of language learners need feedback on the grammatical structure of sentences they wish to produce. In contrast with the usual NLP approach to this problem (parsing student-generated texts), we propose a generation-based approach aiming at preventing errors (“scaffolding”). In our ICALL system, students construct sentences by composing syntactic trees out of lexically anchored “treelets” via a graphical drag&drop user interface. A natural-language generator computes all possible grammatically well-formed sentences entailed by the student-composed tree, and intervenes immediately when the latter tree does not belong to the set of well-formed alternatives. Feedback is based on comparisons between the student-composed tree and the well-formed set. Frequently occurring errors are handled in terms of “malrules.” The system (implemented in JAVA and C++) currently focuses on constituent order in German as L2.
  • Kempen, G., & Harbusch, K. (2008). Comparing linguistic judgments and corpus frequencies as windows on grammatical competence: A study of argument linearization in German clauses. In A. Steube (Ed.), The discourse potential of underspecified structures (pp. 179-192). Berlin: Walter de Gruyter.

    Abstract

    We present an overview of several corpus studies we carried out into the frequencies of argument NP orderings in the midfield of subordinate and main clauses of German. Comparing the corpus frequencies with grammaticality ratings published by Keller (2000), we observe a “grammaticality–frequency gap”: Quite a few argument orderings with zero corpus frequency are nevertheless assigned medium-range grammaticality ratings. We propose an explanation in terms of a two-factor theory. First, we hypothesize that the grammatical induction component needs a sufficient number of exposures to a syntactic pattern to incorporate it into its repertoire of more or less stable rules of grammar. Moderately to highly frequent argument NP orderings are likely to have attained this status, but not their zero-frequency counterparts. This is why the latter argument sequences cannot be produced by the grammatical encoder and are absent from the corpora. Secondly, we assume that an extraneous (nonlinguistic) judgment process biases the ratings of moderately grammatical linear order patterns: Confronted with such structures, the informants produce their own “ideal delivery” variant of the to-be-rated target sentence and evaluate the similarity between the two versions. A high similarity score yielded by this judgment then exerts a positive bias on the grammaticality rating, a score that should not be mistaken for an authentic grammaticality rating. We conclude that, at least in the linearization domain studied here, the goal of gaining a clear view of the internal grammar of language users is best served by a combined strategy in which grammar rules are founded on structures that elicit moderate to high grammaticality ratings and attain at least moderate usage frequencies.
  • Vosse, T. G., & Kempen, G. (2008). Parsing verb-final clauses in German: Garden-path and ERP effects modeled by a parallel dynamic parser. In B. Love, K. McRae, & V. Sloutsky (Eds.), Proceedings of the 30th Annual Conference of the Cognitive Science Society (pp. 261-266). Washington: Cognitive Science Society.

    Abstract

    Experimental sentence comprehension studies have shown that superficially similar German clauses with verb-final word order elicit very different garden-path and ERP effects. We show that a computer implementation of the Unification Space parser (Vosse & Kempen, 2000) in the form of a localist-connectionist network can model the observed differences, at least qualitatively. The model embodies a parallel dynamic parser that, in contrast with existing models, does not distinguish between consecutive first-pass and reanalysis stages, and does not use semantic or thematic roles. It does use structural frequency data and animacy information.
  • Harbusch, K., & Kempen, G. (2007). Clausal coordinate ellipsis in German: The TIGER treebank as a source of evidence. In J. Nivre, H. J. Kaalep, M. Kadri, & M. Koit (Eds.), Proceedings of the 16th Nordic Conference of Computational Linguistics (NODALIDA 2007) (pp. 81-88). Tartu: University of Tartu.

    Abstract

    Syntactic parsers and generators need high-quality grammars of coordination and coordinate ellipsis, structures that occur very frequently but are much less well understood theoretically than many other domains of grammar. Modern grammars of coordinate ellipsis are based nearly exclusively on linguistic judgments (intuitions). The extent to which grammar rules based on this type of empirical evidence generate all and only the structures in text corpora is unknown. As part of a project on the development of a grammar and a generator for coordinate ellipsis in German, we undertook an extensive exploration of the TIGER treebank, a syntactically annotated corpus of about 50,000 newspaper sentences. We report (1) frequency data for the various patterns of coordinate ellipsis, and (2) several rarely (but regularly) occurring ‘fringe deviations’ from the intuition-based rules for several ellipsis types. This information can help improve parser and generator performance.
  • Harbusch, K., Breugel, C., Koch, U., & Kempen, G. (2007). Interactive sentence combining and paraphrasing in support of integrated writing and grammar instruction: A new application area for natural language sentence generators. In S. Busemann (Ed.), Proceedings of the 11th European Workshop in Natural Language Generation (ENLG07) (pp. 65-68). ACL Anthology.

    Abstract

    The potential of sentence generators as engines in Intelligent Computer-Assisted Language Learning and teaching (ICALL) software has hardly been explored. We sketch the prototype of COMPASS, a system that supports integrated writing and grammar curricula for 10 to 14 year old elementary or secondary schoolers. The system enables first- or second-language teachers to design controlled writing exercises, in particular of the “sentence combining” variety. The system includes facilities for error diagnosis and on-line feedback. Syntactic structures built by students or system can be displayed as easily understood phrase-structure or dependency trees, adapted to the student’s level of grammatical knowledge. The heart of the system is a specially designed generator capable of lexically guided sentence generation, of generating syntactic paraphrases, and displaying syntactic structures visually.
  • Kempen, G. (2007). De kunst van het weglaten: Elliptische nevenschikking in een model van de spreker. In F. Moerdijk, A. van Santen, & R. Tempelaars (Eds.), Leven met woorden: Afscheidsbundel voor Piet van Sterkenburg (pp. 397-407). Leiden: Brill.

    Abstract

    This paper is an abridged version (in Dutch) of an in-press article by the same author (Kempen, G. (2008/9). Clausal coordination and coordinate ellipsis in a model of the speaker. To be published in: Linguistics). The two papers present a psycholinguistically inspired approach to the syntax of clause-level coordination and coordinate ellipsis. It departs from the assumption that coordinations are structurally similar to so-called appropriateness repairs — an important type of self-repairs in spontaneous speech. Coordinate structures and appropriateness repairs can both be viewed as “update” constructions. Updating is defined as a special sentence production mode that efficiently revises or augments existing sentential structure in response to modifications in the speaker’s communicative intention. This perspective is shown to offer an empirically satisfactory and theoretically parsimonious account of two prominent types of coordinate ellipsis, in particular Forward Conjunction Reduction (FCR) and Gapping (including Long-Distance Gapping and Subgapping). They are analyzed as different manifestations of “incremental updating” — efficient updating of only part of the existing sentential structure. Based on empirical data from Dutch and German, novel treatments are proposed for both types of clausal coordinate ellipsis. Two other forms of coordinate ellipsis — SGF (“Subject Gap in Finite clauses with fronted verb”), and Backward Conjunction Reduction (BCR; also known as Right Node Raising or RNR) — are shown to be incompatible with the notion of incremental updating. Alternative theoretical interpretations of these phenomena are proposed. The four types of clausal coordinate ellipsis — SGF, Gapping, FCR and BCR — are argued to originate in four different stages of sentence production: Intending (i.e. preparing the communicative intention), Conceptualization, Grammatical Encoding, and Phonological Encoding, respectively.
  • Kuiper, K., Van Egmond, M.-E., Kempen, G., & Sprenger, S. A. (2007). Slipping on superlemmas: Multiword lexical items in speech production. The Mental Lexicon, 2(3), 313-357.

    Abstract

    Only relatively recently have theories of speech production concerned themselves with the part idioms and other multi-word lexical items (MLIs) play in the processes of speech production. Two theories of speech production which attempt to account for the accessing of idioms in speech production are those of Cutting and Bock (1997) and superlemma theory (Sprenger, 2003; Sprenger, Levelt, & Kempen, 2006). Much of the data supporting theories of speech production comes either from time course experiments or from slips of the tongue (Bock & Levelt, 1994). The latter are of two kinds: experimentally induced (Baars, 1992) or naturally observed (Fromkin, 1980). Cutting and Bock use experimentally induced speech errors while Sprenger et al. use time course experiments. The missing data type that has a bearing on speech production involving MLIs is that of naturally occurring slips. In this study the impact of data taken from naturally observed slips involving English and Dutch MLIs are brought to bear on these theories. The data are taken initially from a corpus of just over 1000 naturally observed English slips involving MLIs (the Tuggy corpus). Our argument proceeds as follows. First we show that slips occur independent of whether or not there are MLIs involved. In other words, speech production proceeds in certain of its aspects as though there were no MLI present. We illustrate these slips from the Tuggy data. Second we investigate the predictions of superlemma theory. Superlemma theory (Sprenger et al., 2006) accounts for the selection of MLIs and how their properties enter processes of speech production. It predicts certain activation patterns dependent on a MLI being selected. Each such pattern might give rise to slips of the tongue. This set of predictions is tested against the Tuggy data. Each of the predicted activation patterns yields a significant number of slips. 
    These findings are therefore compatible with a view of MLIs as single units in so far as their activation by lexical concepts goes. However, the theory also predicts that some slips are likely not to occur. We confirm that such slips are not present in the data. These findings are further corroborated by reference to a second, smaller dataset of slips involving Dutch MLIs (the Kempen corpus). We then use slips involving irreversible binomials to distinguish between the predictions of superlemma theory, which are supported by slips involving irreversible binomials, and the Cutting and Bock model's predictions for slips involving these MLIs, which are not.
  • Harbusch, K., Kempen, G., Van Breugel, C., & Koch, U. (2006). A generation-oriented workbench for performance grammar: Capturing linear order variability in German and Dutch. In Proceedings of the 4th International Natural Language Generation Conference (pp. 9-11).

    Abstract

    We describe a generation-oriented workbench for the Performance Grammar (PG) formalism, highlighting the treatment of certain word order and movement constraints in Dutch and German. PG enables a simple and uniform treatment of a heterogeneous collection of linear order phenomena in the domain of verb constructions (variably known as Cross-serial Dependencies, Verb Raising, Clause Union, Extraposition, Third Construction, Particle Hopping, etc.). The central data structures enabling this feature are clausal “topologies”: one-dimensional arrays associated with clauses, whose cells (“slots”) provide landing sites for the constituents of the clause. Movement operations are enabled by unification of lateral slots of topologies at adjacent levels of the clause hierarchy. The PGW generator assists the grammar developer in testing whether the implemented syntactic knowledge allows all and only the well-formed permutations of constituents.
  • Harbusch, K., & Kempen, G. (2006). ELLEIPO: A module that computes coordinative ellipsis for language generators that don't. In Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL-2006) (pp. 115-118).

    Abstract

    Many current sentence generators lack the ability to compute elliptical versions of coordinated clauses in accordance with the rules for Gapping, Forward and Backward Conjunction Reduction, and SGF (Subject Gap in clauses with Finite/Fronted verb). We describe a module (implemented in Java, with German and Dutch as target languages) that takes non-elliptical coordinated clauses as input and returns all reduced versions licensed by coordinative ellipsis. It is loosely based on a new psycholinguistic theory of coordinative ellipsis proposed by Kempen. In this theory, coordinative ellipsis is not supposed to result from the application of declarative grammar rules for clause formation but from a procedural component that interacts with the sentence generator and may block the overt expression of certain constituents.
  • Sprenger, S. A., Levelt, W. J. M., & Kempen, G. (2006). Lexical access during the production of idiomatic phrases. Journal of Memory and Language, 54(2), 161-184. doi:10.1016/j.jml.2005.11.001.

    Abstract

    In three experiments we test the assumption that idioms have their own lexical entry, which is linked to its constituent lemmas (Cutting & Bock, 1997). Speakers produced idioms or literal phrases (Experiment 1), completed idioms (Experiment 2), or switched between idiom completion and naming (Experiment 3). The results of Experiment 1 show that identity priming speeds up idiom production more effectively than literal phrase production, indicating a hybrid representation of idioms. In Experiment 2, we find effects of both phonological and semantic priming. Thus, elements of an idiom can not only be primed via their wordform, but also via the conceptual level. The results of Experiment 3 show that preparing the last word of an idiom primes naming of both phonologically and semantically related targets, indicating that literal word meanings become active during idiom production. The results are discussed within the framework of the hybrid model of idiom representation.
  • Kempen, G., & Olsthoorn, N. (2005). Non-parallelism of grammatical encoding and decoding due to shared working memory [Abstract]. In AMLaP-2005 11th Annual Conference on Architectures and Mechanisms for Language Processing September 5-7, 2005 Ghent, Belgium (p. 24).
  • Kempen, G., & Harbusch, K. (2005). The relationship between grammaticality ratings and corpus frequencies: A case study into word order variability in the midfield of German clauses. In S. Kepser, & M. Reis (Eds.), Linguistic evidence - empirical, theoretical, and computational perspectives (pp. 329-349). Berlin: Mouton de Gruyter.
  • Kempen, G., & Harbusch, K. (2004). A corpus study into word order variation in German subordinate clauses: Animacy affects linearization independently of grammatical function assignment. In T. Pechmann, & C. Habel (Eds.), Multidisciplinary approaches to language production (pp. 173-181). Berlin: Mouton de Gruyter.
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses? A corpus study revealing unexpected rigidity. In S. Kepser, & M. Reis (Eds.), Pre-Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: Niemeyer.
  • Kempen, G., & Harbusch, K. (2004). How flexible is constituent order in the midfield of German subordinate clauses?: A corpus study revealing unexpected rigidity. In Proceedings of the International Conference on Linguistic Evidence (pp. 81-85). Tübingen: University of Tübingen.
  • Kempen, G. (2004). Human grammatical coding: Shared structure formation resources for grammatical encoding and decoding. In Cuny 2004 - The 17th Annual CUNY Conference on Human Sentence Processing. March 25-27, 2004. University of Maryland (p. 66).
  • Kempen, G. (2004). Interactive visualization of syntactic structure assembly for grammar-intensive first- and second-language instruction. In R. Delmonte, P. Delcloque, & S. Tonelli (Eds.), Proceedings of InSTIL/ICALL2004 Symposium on NLP and speech technologies in advanced language learning systems (pp. 183-186). Venice: University of Venice.
  • Kempen, G., & Harbusch, K. (2004). Generating natural word orders in a semi-free word order language: Treebank-based linearization preferences for German. In A. Gelbukh (Ed.), Computational Linguistics and Intelligent Text Processing (pp. 350-354). Berlin: Springer.

    Abstract

    We outline an algorithm capable of generating varied but natural sounding sequences of argument NPs in subordinate clauses of German, a semi-free word order language. In order to attain the right level of output flexibility, the algorithm considers (1) the relevant lexical properties of the head verb (not only transitivity type but also reflexivity, thematic relations expressed by the NPs, etc.), and (2) the animacy and definiteness values of the arguments, and their length. The relevant statistical data were extracted from the NEGRA–II treebank and from hand-coded features for animacy and definiteness. The algorithm maps the relevant properties onto “primary” versus “secondary” placement options in the generator. The algorithm is restricted in that it does not take into account linear order determinants related to the sentence’s information structure and its discourse context (e.g. contrastiveness). These factors may modulate the above preferences or license “tertiary” linear orders beyond the primary and secondary options considered here.
  • Kempen, G. (2004). Terug naar Wundt: Pleidooi voor integraal onderzoek van taal, taalkennis en taalgedrag. In Koninklijke Nederlandse Akademie van Wetenschappen (Ed.), 'Gij letterdames en gij letterheren': Nieuwe mogelijkheden voor taalkundig en letterkundig onderzoek in Nederland (pp. 174-188). Amsterdam: Koninklijke Nederlandse Akademie van Wetenschappen.
  • Kempen, G., & Harbusch, K. (2003). A corpus study into word order variation in German subordinate clauses: Animacy affects linearization independently of function assignment. In Proceedings of AMLaP 2003 (pp. 153-154). Glasgow: Glasgow University.
  • Kempen, G., & Harbusch, K. (2003). An artificial opposition between grammaticality and frequency: Comment on Bornkessel, Schlesewsky & Friederici (2002). Cognition, 90(2), 205-210 [Rectification on p. 215]. doi:10.1016/S0010-0277(03)00145-8.

    Abstract

    In a recent Cognition paper (Cognition 85 (2002) B21), Bornkessel, Schlesewsky, and Friederici report ERP data that they claim “show that online processing difficulties induced by word order variations in German cannot be attributed to the relative infrequency of the constructions in question, but rather appear to reflect the application of grammatical principles during parsing” (p. B21). In this commentary we demonstrate that the posited contrast between grammatical principles and construction (in)frequency as sources of parsing problems is artificial because it is based on factually incorrect assumptions about the grammar of German and on inaccurate corpus frequency data concerning the German constructions involved.
  • Kempen, G. (2003). Language generation. In W. Frawley (Ed.), International encyclopedia of linguistics (pp. 362-364). New York: Oxford University Press.
  • Kempen, G., & Harbusch, K. (2003). Dutch and German verb clusters in performance grammar. In P. A. Seuren, & G. Kempen (Eds.), Verb constructions in German and Dutch (pp. 185-221). Amsterdam: Benjamins.
  • Kempen, G., & Harbusch, K. (2003). Word order scrambling as a consequence of incremental sentence production. In H. Härtl, & H. Tappe (Eds.), Mediating between concepts and grammar (pp. 141-164). Berlin: Mouton de Gruyter.
  • Seuren, P. A. M., & Kempen, G. (Eds.). (2003). Verb constructions in German and Dutch. Amsterdam: Benjamins.
  • Harbusch, K., & Kempen, G. (2002). A quantitative model of word order and movement in English, Dutch and German complement constructions. In Proceedings of the 19th international conference on Computational linguistics. San Francisco: Morgan Kaufmann.

    Abstract

    We present a quantitative model of word order and movement constraints that enables a simple and uniform treatment of a seemingly heterogeneous collection of linear order phenomena in English, Dutch and German complement constructions (Wh-extraction, clause union, extraposition, verb clustering, particle movement, etc.). Underlying the scheme are central assumptions of the psycholinguistically motivated Performance Grammar (PG). Here we describe this formalism in declarative terms based on typed feature unification. PG allows a homogenous treatment of both the within- and between-language variations of the ordering phenomena under discussion, which reduce to different settings of a small number of quantitative parameters.
  • Kempen, G., & Van Breugel, C. (2002). A workbench for visual-interactive grammar instruction at the secondary education level. In Proceedings of the 10th International CALL Conference (pp. 157-158). Antwerp: University of Antwerp.
  • Kempen, G., & Harbusch, K. (2002). Performance Grammar: A declarative definition. In A. Nijholt, M. Theune, & H. Hondorp (Eds.), Computational linguistics in the Netherlands 2001: Selected papers from the Twelfth CLIN Meeting (pp. 148-162). Amsterdam: Rodopi.

    Abstract

    In this paper we present a definition of Performance Grammar (PG), a psycholinguistically motivated syntax formalism, in declarative terms. PG aims not only at describing and explaining intuitive judgments and other data concerning the well-formedness of sentences of a language, but also at contributing to accounts of syntactic processing phenomena observable in language comprehension and language production. We highlight two general properties of human sentence generation, incrementality and late linearization, which make special demands on the design of grammar formalisms claiming psychological plausibility. In order to meet these demands, PG generates syntactic structures in a two-stage process. In the first and most important ‘hierarchical’ stage, unordered hierarchical structures (‘mobiles’) are assembled out of lexical building blocks. The key operation at work here is typed feature unification, which also delimits the positional options of the syntactic constituents in terms of so-called topological features. The second, much simpler stage takes care of arranging the branches of the mobile from left to right by ‘reading-out’ one positional option of every constituent. In this paper we concentrate on the structure assembly formalism in PG’s hierarchical component. We provide a declarative definition couched in an HPSG-style notation based on typed feature unification. Our emphasis throughout is on linear order constraints.
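    As a rough illustration of the second, ‘read-out’ stage described in this abstract, the following Python sketch orders the branches of an already assembled mobile by the slot each constituent has been assigned. It is not the PG formalism itself: the slot labels in `SLOTS`, the function name `read_out`, and the slot assignments in the example are hypothetical, and the typed feature unification of the first stage is not modelled at all.

```python
# Hypothetical left-to-right slot inventory of a clausal topology.
# The labels are assumptions made for this sketch only.
SLOTS = ["F1", "M1", "M2", "M3", "E1"]


def read_out(mobile, slots=SLOTS):
    """Arrange the branches of an unordered 'mobile' from left to right.

    `mobile` is a list of (constituent, slot_label) pairs, where the slot
    label is the single positional option chosen for that constituent.
    """
    position = {label: index for index, label in enumerate(slots)}
    # sorted() is stable: constituents sharing a slot keep their input order.
    ordered = sorted(mobile, key=lambda branch: position[branch[1]])
    return [constituent for constituent, _ in ordered]


if __name__ == "__main__":
    # Unordered branches of a Dutch subordinate clause (assignments assumed).
    mobile = [("leest", "E1"), ("Jan", "M1"), ("het boek", "M2"), ("dat", "F1")]
    print(" ".join(read_out(mobile)))
```

    The point of the sketch is only that linear order is computed late and cheaply: once every constituent carries one positional option, ordering reduces to a sort over slot indices.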
  • Kempen, G., & Harbusch, K. (2002). Rethinking the architecture of human syntactic processing: The relationship between grammatical encoding and decoding. In Proceedings of the 35th Meeting of the Societas Linguistica Europaea. University of Potsdam.
  • Harbusch, K., & Kempen, G. (2000). Complexity of linear order computation in Performance Grammar, TAG and HPSG. In Proceedings of Fifth International Workshop on Tree Adjoining Grammars and Related Formalisms (TAG+5) (pp. 101-106).

    Abstract

    This paper investigates the time and space complexity of word order computation in the psycholinguistically motivated grammar formalism of Performance Grammar (PG). In PG, the first stage of syntax assembly yields an unordered tree ('mobile') consisting of a hierarchy of lexical frames (lexically anchored elementary trees). Associated with each lexical frame is a linearizer—a Finite-State Automaton that locally computes the left-to-right order of the branches of the frame. Linearization takes place after the promotion component may have raised certain constituents (e.g. Wh- or focused phrases) into the domain of lexical frames higher up in the syntactic mobile. We show that the worst-case time and space complexity of analyzing input strings of length n is O(n^5) and O(n^4), respectively. This result compares favorably with the time complexity of word-order computations in Tree Adjoining Grammar (TAG). A comparison with Head-Driven Phrase Structure Grammar (HPSG) reveals that PG yields a more declarative linearization method, provided that the FSA is rewritten as an equivalent regular expression.
  • Kempen, G. (2000). Could grammatical encoding and grammatical decoding be subserved by the same processing module? Behavioral and Brain Sciences, 23, 38-39.
  • Vosse, T., & Kempen, G. (2000). Syntactic structure assembly in human parsing: A computational model based on competitive inhibition and a lexicalist grammar. Cognition, 75, 105-143.

    Abstract

    We present the design, implementation and simulation results of a psycholinguistic model of human syntactic processing that meets major empirical criteria. The parser operates in conjunction with a lexicalist grammar and is driven by syntactic information associated with heads of phrases. The dynamics of the model are based on competition by lateral inhibition ('competitive inhibition'). Input words activate lexical frames (i.e. elementary trees anchored to input words) in the mental lexicon, and a network of candidate 'unification links' is set up between frame nodes. These links represent tentative attachments that are graded rather than all-or-none. Candidate links that, due to grammatical or 'treehood' constraints, are incompatible, compete for inclusion in the final syntactic tree by sending each other inhibitory signals that reduce the competitor's attachment strength. The outcome of these local and simultaneous competitions is controlled by dynamic parameters, in particular by the Entry Activation and the Activation Decay rate of syntactic nodes, and by the Strength and Strength Build-up rate of Unification links. In case of a successful parse, a single syntactic tree is returned that covers the whole input string and consists of lexical frames connected by winning Unification links. 
Simulations are reported of a significant range of psycholinguistic parsing phenomena in both normal and aphasic speakers of English: (i) various effects of linguistic complexity (single versus double, center versus right-hand self-embeddings of relative clauses; the difference between relative clauses with subject and object extraction; the contrast between a complement clause embedded within a relative clause versus a relative clause embedded within a complement clause); (ii) effects of local and global ambiguity, and of word-class and syntactic ambiguity (including recency and length effects); (iii) certain difficulty-of-reanalysis effects (contrasts between local ambiguities that are easy to resolve versus ones that lead to serious garden-path effects); (iv) effects of agrammatism on parsing performance, in particular the performance of various groups of aphasic patients on several sentence types.
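    The competition by lateral inhibition described in this abstract can be sketched in a few lines of Python. This is a toy under stated assumptions, not the model's actual equations: the update rule, the parameter names `buildup`, `inhibition` and `start`, and the `support` values are all invented for illustration. Each candidate link gains strength in proportion to its support and loses strength in proportion to the summed strength of its incompatible rivals.

```python
def compete(support, buildup=0.2, inhibition=0.3, steps=200, start=0.1):
    """Return the final strengths of mutually incompatible candidate links.

    `support` holds one hypothetical support value per link. The update
    rule below is an assumption of this sketch, not the published model.
    """
    strengths = [start] * len(support)
    for _ in range(steps):
        total = sum(strengths)
        updated = []
        for strength, amount in zip(strengths, support):
            rivals = total - strength  # summed strength of the competitors
            # Growth toward 1.0 driven by support, minus inhibition from rivals.
            delta = buildup * amount * (1.0 - strength) - inhibition * rivals
            # Keep strengths graded but bounded between 0 and 1.
            updated.append(min(1.0, max(0.0, strength + delta)))
        strengths = updated
    return strengths


if __name__ == "__main__":
    # Two incompatible attachments; the first has more support.
    print(compete([1.0, 0.6]))
```

    With these assumed settings the better-supported link ends close to full strength while its rival is driven to zero, which mirrors the idea that a single winning link per attachment site survives in a successful parse.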
  • Kempen, G. (1999). Fiets en (centri)fuge. Onze Taal, 68, 88.
  • Kempen, G. (1999). Visual Grammar: Multimedia for grammar and spelling instruction in primary education. In K. Cameron (Ed.), CALL: Media, design, and applications (pp. 223-238). Lisse: Swets & Zeitlinger.
  • Kempen, G., & Harbusch, K. (1998). A 'tree adjoining' grammar without adjoining: The case of scrambling in German. In Fourth International Workshop on Tree Adjoining Grammars and Related Frameworks (TAG+4).
  • Kempen, G. (1998). Comparing and explaining the trajectories of first and second language acquisition: In search of the right mix of psychological and linguistic factors [Commentary]. Bilingualism: Language and Cognition, 1, 29-30. doi:10.1017/S1366728998000066.

    Abstract

    When you compare the behavior of two different age groups which are trying to master the same sensori-motor or cognitive skill, you are likely to discover varying learning routes: different stages, different intervals between stages, or even different orderings of stages. Such heterogeneous learning trajectories may be caused by at least six different types of factors: (1) Initial state: the kinds and levels of skills the learners have available at the onset of the learning episode. (2) Learning mechanisms: rule-based, inductive, connectionist, parameter setting, and so on. (3) Input and feedback characteristics: learning stimuli, information about success and failure. (4) Information processing mechanisms: capacity limitations, attentional biases, response preferences. (5) Energetic variables: motivation, emotional reactions. (6) Final state: the fine-structure of kinds and levels of subskills at the end of the learning episode. This applies to language acquisition as well. First and second language learners probably differ on all six factors. Nevertheless, the debate between advocates and opponents of the Fundamental Difference Hypothesis concerning L1 and L2 acquisition has looked almost exclusively at the first two factors. Those who believe that L1 learners have access to Universal Grammar whereas L2 learners rely on language processing strategies, postulate different learning mechanisms (UG parameter setting in L1, more general inductive strategies in L2 learning). Pienemann opposes this view and, based on his Processability Theory, argues that L1 and L2 learners start out from different initial states: they come to the grammar learning task with different structural hypotheses (SOV versus SVO as basic word order of German).
  • Kempen, G. (1998). Sentence parsing. In A. D. Friederici (Ed.), Language comprehension: A biological perspective (pp. 213-228). Berlin: Springer.
  • Dijkstra, T., & Kempen, G. (1997). Het taalgebruikersmodel. In H. Hulshof, & T. Hendrix (Eds.), De taalcentrale. Amsterdam: Bulkboek.
  • Kempen, G. (1997). De ontdubbelde taalgebruiker: Maken taalproductie en taalperceptie gebruik van één en dezelfde syntactische processor? [Abstract]. In 6e Winter Congres NvP. Programma and abstracts (pp. 31-32). Nederlandse Vereniging voor Psychonomie.
  • Kempen, G., Kooij, A., & Van Leeuwen, T. (1997). Do skilled readers exploit inflectional spelling cues that do not mirror pronunciation? An eye movement study of morpho-syntactic parsing in Dutch. In Abstracts of the Orthography Workshop "What spelling changes". Nijmegen: Max Planck Institute for Psycholinguistics.
  • Kempen, G. (1997). Taalpsychologie week. In Wetenschappelijke Scheurkalender 1998. Beek: Natuur & Techniek.

    Abstract

    [Seven one-page psycholinguistic sketches]
  • Kempen, G. (1997). Van taalbarrières naar linguïstische snelwegen: Inrichting van een technische taalinfrastructuur voor het Nederlands. Grenzen aan veeltaligheid: Taalgebruik en bestuurlijke doeltreffendheid in de instellingen van de Europese Unie, 43-48.
  • Kempen, G. (1996). “De zwoele groei van de zinsbouw”: De wonderlijke levende grammatica van Jac. van Ginneken uit 'De roman van een kleuter' (1917). In A. Foolen, & J. Noordegraaf (Eds.), De taal is kennis van de ziel. Opstellen over Jac. van Ginneken (1877–1945) (pp. 173-216). Münster: Nodus.

  • Kempen, G. (1996). "De zwoele groei van den zinsbouw": De wonderlijke levende grammatica van Jac. van Ginneken uit De Roman van een Kleuter (1917). Bezorgd en van een nawoord voorzien door Gerard Kempen. In A. Foolen, & J. Noordegraaf (Eds.), De taal is kennis van de ziel: Opstellen over Jac. van Ginneken (1877-1945) (pp. 173-216). Münster: Nodus Publikationen.
  • Kempen, G. (1996). Human language technology can modernize writing and grammar instruction. In COLING '96 Proceedings of the 16th conference on Computational linguistics - Volume 2 (pp. 1005-1006). Stroudsburg, PA: Association for Computational Linguistics.
  • Kempen, G., & Janssen, S. (1996). Omspellen: Reuze(n)karwei of peule(n)schil? In H. Croll, & J. Creutzberg (Eds.), Proceedings of the 5e Dag van het Document (pp. 143-146). Projectbureau Croll en Creutzberg.
  • Kempen, G. (1996). Lezen, leren lezen, dyslexie: De auditieve basis van visuele woordherkenning. Nederlands Tijdschrift voor de Psychologie, 51, 91-100.
  • Kempen, G. (1996). Computational models of syntactic processing in human language comprehension. In T. Dijkstra, & K. De Smedt (Eds.), Computational psycholinguistics: Symbolic and subsymbolic models of language processing (pp. 192-220). London: Taylor & Francis.
  • Kempen, G. (1996). Wetenschap op internet: Een voorstel voor de Nederlandse Psychonomie. Nieuwsbrief Nederlandse Vereniging voor Psychonomie, 3, 5-8.
  • De Smedt, K., & Kempen, G. (1996). Discontinuous constituency in Segment Grammar. In H. C. Bunt, & A. Van Horck (Eds.), Discontinuous constituency (pp. 141-163). Berlin: Mouton de Gruyter.
  • Kempen, G. (1995). 'Hier spreekt men Nederlands'. EMNET: Nieuwsbrief Elektronische Media, 22, 1.
  • Kempen, G. (1995). IJ of Y? Onze Taal, 64(9), 205-206.
  • Kempen, G. (1995). Processing discontinuous lexical items: A reply to Frazier. Cognition, 55, 219-221. doi:10.1016/0010-0277(94)00657-7.

    Abstract

    Comments on a study by Frazier and others on Dutch-language lexical processing. Claims that the control condition in the experiment was inadequate and that an assumption made by Frazier about closed class verbal items is inaccurate, and proposes an alternative account of a subset of the data from the experiment.
  • Kempen, G. (1995). Processing separable complex verbs in Dutch: Comments on Frazier, Flores d'Arcais, and Coolen (1993). Cognition, 54, 353-356. doi:10.1016/0010-0277(94)00649-6.

    Abstract

    Raises objections to L. Frazier et al's (see record 1994-32229-001) report of an experimental study intended to test Schreuder's (1990) Morphological Integration (MI) model concerning the processing of separable and inseparable verbs and shows that the logic of the experiment is flawed. The problem is rooted in the notion of a separable complex verb. The conclusion is drawn that Frazier et al's experimental data cannot be taken as evidence for the theoretical propositions they develop about the MI model.
  • Kempen, G. (1995). De mythe van het woordbeeld: Spellingherziening taalpsychologisch doorgelicht. Onze Taal, 64(11), 275-277.
  • Kempen, G. (1995). Drinken eten mij Nim. Intermediair, 31(19), 41-45.
  • Kempen, G. (1995). Van leescultuur en beeldcultuur naar internetcultuur. De Psycholoog, 30, 315-319.
  • Kempen, G. (1994). Nederlands als computertaal. EMNET: Nieuwsbrief Elektronische Media, 2, 9-12.
  • Kempen, G. (1994). Klare taal: Zicht op zinsbouw. Natuur en Techniek, 62, 380-391.
  • Kempen, G. (1994). De mythe van het woordbeeld: Spellingherziening taalpsychologisch doorgelicht. Spektator, tijdschrift voor Neerlandistiek, 23, 292-301.
  • Kempen, G. (1994). In de grammaticadiscussie is de empirie aan zet. Levende Talen, 486, 27-28.
  • Kempen, G. (1994). Innovative language checking software for Dutch. In J. Van Gent, & E. Peeters (Eds.), Proceedings of the 2e Dag van het Document (pp. 99-100). Delft: TNO Technisch Physische Dienst.
  • Kempen, G. (1994). The unification space: A hybrid model of human syntactic processing [Abstract]. In Cuny 1994 - The 7th Annual CUNY Conference on Human Sentence Processing. March 17-19, 1994. CUNY Graduate Center, New York.
  • Kempen, G., & Dijkstra, A. (1994). Toward an integrated system for grammar, writing and spelling instruction. In L. Appelo, & F. De Jong (Eds.), Computer-Assisted Language Learning: Proceedings of the Seventh Twente Workshop on Language Technology (pp. 41-46). Enschede: University of Twente.
  • Diesveld, P., & Kempen, G. (1993). Zinnen als bouwwerken: Computerprogramma's voor grammatica-oefeningen. MOER, Tijdschrift voor onderwijs in het Nederlands, 1993(4), 130-138.
  • Dijkstra, T., & Kempen, G. (Eds.). (1993). Einführung in die Psycholinguistik. München: Hans Huber.
  • Dijkstra, T. (1993). Taalpsychologie (G. Kempen, Ed.). Groningen: Wolters-Noordhoff.
  • Kempen, G. (1993). A cognitive architecture for incremental syntactic processing in sentence understanding and sentence production [Abstract]. In Abstracts of the International Conference on the Psychology of Language and Communication. Glasgow: University of Glasgow.
  • Kempen, G. (1993). Mensentaal als computertaal. Onze Taal, 62, 275-277.
  • Kempen, G. (1993). Naar geautomatiseerde Nederlandstalige informatiediensten. In N. Van Willigen (Ed.), RABIN uitGELUID: Tien persoonlijke bijdragen na zes jaar advisering over bibliotheken en informatie (pp. 42-51). Den Haag: RABIN.
  • Kempen, G. (1993). Die Architektur des Sprechens [Abstract]. In O. Herzog, T. Christaller, & D. Schütt (Eds.), Grundlagen und Anwendungen der Künstlichen Intelligenz: 17. Fachtagung für Künstliche Intelligenz, Humboldt-Universität zu Berlin, 13.-16. September 1993 (pp. 201-202). Berlin: Springer Verlag.
  • Kempen, G. (1993). Zinsontleding kan een exact vak worden. Levende Talen, 483, 459-462.
  • Kempen, G. (1993). Spraakkunst als bouwkunst [Inaugural lecture]. Leiden: University of Leiden.
  • Kempen, G., & Vosse, T. (1992). A language-sensitive text editor for Dutch. In P. O’Brian Holt, & N. Williams (Eds.), Computers and writing: State of the art (pp. 68-77). Dordrecht: Kluwer Academic Publishers.

    Abstract

    Modern word processors begin to offer a range of facilities for spelling, grammar and style checking in English. For the Dutch language hardly anything is available as yet. Many commercial word processing packages do include a hyphenation routine and a lexicon-based spelling checker but the practical usefulness of these tools is limited due to certain properties of Dutch orthography, as we will explain below. In this chapter we describe a text editor which incorporates a great deal of lexical, morphological and syntactic knowledge of Dutch and monitors the orthographical quality of Dutch texts. Section 1 deals with those aspects of Dutch orthography which pose problems to human authors as well as to computational language sensitive text editing tools. In section 2 we describe the design and the implementation of the text editor we have built. Section 3 is mainly devoted to a provisional evaluation of the system.
  • Kempen, G. (1992). Grammar based text processing. Document Management: Nieuwsbrief voor Documentaire Informatiekunde, 1(2), 8-10.
  • Kempen, G. (1992). Language technology and language instruction: Computational diagnosis of word level errors. In M. Swartz, & M. Yazdani (Eds.), Intelligent tutoring systems for foreign language learning: The bridge to international communication (pp. 191-198). Berlin: Springer.
  • Kempen, G. (1992). Generation. In W. Bright (Ed.), International encyclopedia of linguistics (pp. 59-61). New York: Oxford University Press.
  • Kempen, G. (1992). Second language acquisition as a hybrid learning process. In F. Engel, D. Bouwhuis, T. Bösser, & G. d'Ydewalle (Eds.), Cognitive modelling and interactive environments in language learning (pp. 139-144). Berlin: Springer.
  • Kempen, G., & De Vroomen, P. (Eds.). (1991). Informatiewetenschap 1991: Wetenschappelijke bijdragen aan de eerste STINFON-conferentie. Leiden: STINFON.
  • Kempen, G. (1991). Conjunction reduction and gapping in clause-level coordination: An inheritance-based approach. Computational Intelligence, 7, 357-360. doi:10.1111/j.1467-8640.1991.tb00406.x.
  • De Smedt, K., & Kempen, G. (1991). Segment Grammar: A formalism for incremental sentence generation. In C. Paris, W. Swartout, & W. Mann (Eds.), Natural language generation and computational linguistics (pp. 329-349). Dordrecht: Kluwer Academic Publishers.

    Abstract

    Incremental sentence generation imposes special constraints on the representation of the grammar and the design of the formulator (the module which is responsible for constructing the syntactic and morphological structure). In the model of natural speech production presented here, a formalism called Segment Grammar is used for the representation of linguistic knowledge. We give a definition of this formalism and present a formulator design which relies on it. Next, we present an object-oriented implementation of Segment Grammar. Finally, we compare Segment Grammar with other formalisms.
  • Van der Veer, G. C., Bagnara, S., & Kempen, G. (1991). Preface. Acta Psychologica, 78, ix. doi:10.1016/0001-6918(91)90002-H.
  • Vosse, T., & Kempen, G. (1991). A hybrid model of human sentence processing: Parsing right-branching, center-embedded and cross-serial dependencies. In M. Tomita (Ed.), Proceedings of the Second International Workshop on Parsing Technologies.
  • Jongen-Janner, E., Pijls, F., & Kempen, G. (1990). Intelligente programma's voor grammatica- en spellingonderwijs. In Q. De Kort, & G. Leerdam (Eds.), Computertoepassingen in de Neerlandistiek. Almere: Landelijke Vereniging van Neerlandici.
