Neural syntax

Fitz, H. (2009). Neural syntax. PhD Thesis, Universiteit van Amsterdam, Institute for Logic, Language, and Computation.
Children learn their mother tongue spontaneously and effortlessly through communicative interaction with their environment; they do not have to be taught explicitly or learn how to learn first. The ambient language to which children are exposed, however, is highly variable and arguably deficient with regard to the learning target. Nonetheless, most normally developing children learn their native language rapidly and with ease. To explain this accomplishment, many theories of acquisition posit innate constraints on learning, or even a biological endowment that is specific to language. Usage-based theories, on the other hand, place more emphasis on the role of experience and domain-general learning mechanisms than on innate language-specific knowledge. But languages are lexically open and combinatorial in structure, so no amount of experience covers their expressivity. Usage-based theories therefore have to explain how children can generalize the properties of their linguistic input to an adult-like grammar.
In this thesis I provide an explicit computational mechanism with which usage-based theories of language can be tested and evaluated. The focus of my work lies on complex syntax and the human ability to form sentences which express more than one proposition by means of relativization. This 'capacity for recursion' is a hallmark of an adult grammar and, as some have argued, the human language faculty itself.
The manuscript is organized as follows. In the second chapter, I give an overview of results that characterize the properties of neural networks as mathematical objects and review previous attempts at modelling the acquisition of complex syntax with such networks. The chapter introduces the conceptual landscape in which the current work is located.
In the third chapter, I argue that the construction and use of meaning is essential in child language acquisition and adult processing. Neural network models need to incorporate this dimension of human linguistic behavior. I introduce the Dual-path model of sentence production and syntactic development, which is able to represent semantics and learns from exposure to sentences paired with their meaning (cf. Chang et al. 2006). I explain the architecture of this model, motivate critical assumptions behind its design, and discuss existing research using this model.
The fourth chapter describes and compares several extensions of the basic architecture to accommodate the processing of multi-clause utterances. These extensions are evaluated against computational desiderata, such as good learning and generalization performance and the parsimony of input representations. A single best solution for encoding the meaning of complex sentences with restrictive relative clauses is identified, which forms the basis for all subsequent simulations.
Chapter five analyzes the learning dynamics in more detail. I first examine the model's behavior for different relative clause types. Syntactic alternations prove to be particularly difficult to learn because they complicate the meaning-to-form mapping the model has to acquire. In the second part, I probe the internal representations the model has developed during learning. It is argued that the model acquires the argument structure of the construction types in its input language and represents the hierarchical organization of distinct multi-clause utterances.
The core of this thesis is contained in chapters six to eight. In chapter six, I test the Dual-path model's generalization capacities in a variety of tasks. I show that its syntactic representations are sufficiently transparent to allow structural generalization to novel complex utterances. Semantic similarities between novel and familiar sentence types play a critical role in this task. The Dual-path model also has a capacity for generalizing familiar words to novel slots in novel constructions (strong semantic systematicity). Moreover, I identify learning conditions under which the model displays recursive productivity. It is argued that the model's behavior is consistent with human behavior in that production accuracy degrades with depth of embedding, and right-branching recursion is learned faster than center-embedding recursion.
In chapter seven, I address the issue of learning complex polar interrogatives in the absence of positive exemplars in the input. I show that the Dual-path model can acquire the syntax of these questions from simpler and similar structures which are warranted in a child's linguistic environment. The model's errors closely match children's errors, and it is suggested that children might not require an innate learning bias to acquire auxiliary fronting. Since the model does not implement a traditional kind of language-specific universal grammar, these results are relevant to the poverty-of-the-stimulus debate.
English relative clause constructions give rise to similar performance orderings in adult processing and child language acquisition. This pattern matches the typological universal called the noun phrase accessibility hierarchy. I propose an input-based explanation of these data in chapter eight. The Dual-path model displays this ordering in syntactic development when exposed to plausible input distributions. But it is possible to manipulate and completely remove the ordering by varying properties of the input from which the model learns. This indicates, I argue, that patterns of interference and facilitation among input structures can explain the hierarchy when all structures are simultaneously learned and represented over a single set of connection weights.
Finally, I draw conclusions from this work, address some unanswered questions, and give a brief outlook on how this research might be continued.