Second example: TAGML

TAGML (Tree Adjoining Grammars Markup Language) is a general standard for encoding and exchanging the resources used with Lexicalized Tree Adjoining Grammars (LTAG). A working group in France brings together people (mainly from TALaNa, ENST, INRIA Rocquencourt and LORIA) who work on this formalism and aim to define standards for common grammars and grammar exchange, parsers, and tool development. TAGML is an example of the high level of complexity of the resources to be encoded. An LTAG grammar is defined by a morphological lexicon, a syntactic lexicon and a set of schemas (non-lexicalized elementary tree patterns). The schemas are grouped into tree families in order to capture generalizations over the lexicalizations given by the syntactic lexicon. Improving LTAG parsers and tools depends on how this huge amount of data can be factorized so that computation is shared.

The previous RROM model for the morphological lexicon is extended to the other resources needed at the syntactic level. An inflection (a lemma and a set of morphological features, including verb mood for example) corresponds to a set of schemas. This lexicalization relation can include the instantiation of co-anchors (a lemma and a set of possibly underspecified morphological features) and of some additional syntactic features in the schema. Each syntactic instantiation gives a complete elementary tree. If we assume that the linguistic principles given in [Abeillé et al.1990] and [Candito1999] are fulfilled by the grammar, each syntactic instantiation corresponds to exactly one semantic instantiation (semantic consistency principle). This model allows an incremental view of the lexical resources.
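The relations described above can be illustrated with a minimal Python sketch. It is not part of TAGML or of the RROM itself: the class names, the schema names ("n0Vn1", "n0VPn1"), the feature names and the sample entries are all hypothetical, and a schema is reduced to its name rather than a full tree pattern. The sketch only shows how one inflection can map to several schemas, how a lexicalization may carry co-anchors and extra syntactic features, and how the semantic consistency principle could be checked.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Features are kept as (name, value) pairs; the names used here are hypothetical.
Features = Tuple[Tuple[str, str], ...]


@dataclass(frozen=True)
class Inflection:
    """A lemma plus a (possibly underspecified) set of morphological features."""
    lemma: str
    features: Features = ()


@dataclass(frozen=True)
class Schema:
    """A non-lexicalized elementary tree pattern, reduced here to its name."""
    name: str


@dataclass(frozen=True)
class Lexicalization:
    """Relation between an inflection and one schema (assumed layout)."""
    anchor: Inflection
    schema: Schema
    co_anchors: Tuple[Inflection, ...] = ()
    syntactic_features: Features = ()


@dataclass(frozen=True)
class ElementaryTree:
    """Result of a syntactic instantiation: a fully lexicalized tree."""
    schema: str
    anchor: str
    co_anchors: Tuple[str, ...]
    features: Features


def instantiate(lex: Lexicalization) -> ElementaryTree:
    """Build the elementary tree described by one lexicalization."""
    return ElementaryTree(
        schema=lex.schema.name,
        anchor=lex.anchor.lemma,
        co_anchors=tuple(c.lemma for c in lex.co_anchors),
        features=lex.anchor.features + lex.syntactic_features,
    )


def schemas_for(inflection: Inflection, lexicon: List[Lexicalization]) -> List[str]:
    """List the names of the schemas that an inflection can anchor."""
    return [lex.schema.name for lex in lexicon if lex.anchor == inflection]


def check_semantic_consistency(pairs: List[Tuple[Lexicalization, str]]) -> bool:
    """Return True if no syntactic instantiation has two different semantics."""
    seen = {}
    for lex, semantics in pairs:
        if seen.setdefault(lex, semantics) != semantics:
            return False
    return True


# Hypothetical sample data: one inflection anchoring two schemas.
TAKES = Inflection("take", (("mode", "ind"),))
LEXICON = [
    Lexicalization(TAKES, Schema("n0Vn1")),
    Lexicalization(
        TAKES,
        Schema("n0VPn1"),
        co_anchors=(Inflection("into"),),
        syntactic_features=(("passive", "-"),),
    ),
]
```

Calling `schemas_for(TAKES, LEXICON)` lists both schemas for the same inflection, and `instantiate(LEXICON[1])` yields a tree that carries the co-anchor together with the merged features. Making the records immutable lets a lexicalization serve directly as a dictionary key in the consistency check.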

Figure: Simplified RROM for LTAG resources.

Figure 2 presents the corresponding RROM. For simplicity, tree families and the structuring of features are not included in this example.

Figure: RROM for multilevel annotated textual corpus.

Patrice Lopez
Thu Apr 13 09:23:20 MET DST 2000