Generating natural word orders in a semi-free word order language: Treebank-based linearization preferences for German
We outline an algorithm that generates varied but natural-sounding sequences of argument NPs in subordinate clauses of German, a semi-free word order language. To attain the right level of output flexibility, the algorithm considers (1) the relevant lexical properties of the head verb (not only transitivity type but also reflexivity, the thematic relations expressed by the NPs, etc.), and (2) the animacy, definiteness, and length of the arguments. The relevant statistical data were extracted from the NEGRA–II treebank and from hand-coded features for animacy and definiteness. The algorithm maps these properties onto “primary” versus “secondary” placement options in the generator. It is restricted in that it does not take into account determinants of linear order related to the sentence’s information structure and its discourse context (e.g. contrastiveness). These factors may modulate the above preferences or license “tertiary” linear orders beyond the primary and secondary options considered here.
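To make the idea of mapping argument properties onto ranked placement options concrete, the following is a minimal toy sketch and not the algorithm described here. It assumes that each candidate order of the argument NPs is scored pairwise, using a verb-specific base order plus animacy, definiteness, and length, and that the two best-scoring orders are labelled "primary" and "secondary". The names `Argument`, `WEIGHTS`, `score`, and `placement_options`, the role labels, and all weight values are hypothetical; treebank-derived statistics would take the place of the hand-set weights.

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass(frozen=True)
class Argument:
    form: str       # surface string of the NP
    role: str       # grammatical function label, e.g. "subj" or "dobj" (hypothetical)
    animate: bool
    definite: bool

    @property
    def length(self) -> int:
        # Length measured in whitespace-separated words.
        return len(self.form.split())


# Hypothetical hand-set weights; these are illustrative assumptions,
# not values taken from any treebank.
WEIGHTS = {"base": 2.0, "animate": 1.0, "definite": 1.0, "short": 0.5}


def score(order, base_order):
    """Sum pairwise preferences for every pair (a, b) with a placed before b."""
    total = 0.0
    for i, a in enumerate(order):
        for b in order[i + 1:]:
            # Verb-specific base order, e.g. subject before direct object.
            if base_order.index(a.role) < base_order.index(b.role):
                total += WEIGHTS["base"]
            if a.animate and not b.animate:
                total += WEIGHTS["animate"]
            if a.definite and not b.definite:
                total += WEIGHTS["definite"]
            if a.length < b.length:
                total += WEIGHTS["short"]
    return total


def placement_options(args, base_order):
    """Label the two best-scoring argument orders as primary and secondary."""
    ranked = sorted(
        permutations(args),
        key=lambda order: score(order, base_order),
        reverse=True,
    )
    labels = ["primary", "secondary"]
    return {label: [a.form for a in order] for label, order in zip(labels, ranked)}


if __name__ == "__main__":
    subj = Argument("der Mann", "subj", animate=True, definite=True)
    obj = Argument("ein sehr altes Buch", "dobj", animate=False, definite=False)
    print(placement_options([subj, obj], base_order=["subj", "dobj"]))
```

In this toy setting, a short, animate, definite subject is ranked before a long, inanimate, indefinite object, and the reverse order is kept as the secondary option. Information-structural factors such as contrastiveness are deliberately left out, mirroring the restriction stated above.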