All Data is Metadata: Rich Architectures for Rich Resources

Henry S. Thompson
HCRC Language Technology Group
Division of Informatics
University of Edinburgh

Data-intensive research and development in the area of human language is no more than 25 years old, but the pace of change in the underlying paradigms is so great that the field has already changed almost beyond recognition. If the changes in the first few decades have largely been quantitative - the size and speed of computers themselves and the related exponential growth in the amount of material available to us - more recently qualitative change has come to the fore, first in the emergence of consensus standards for structure and markup, then in the explosion of web-based delivery and exploitation mechanisms. The next step is already upon us: from single to multiple modalities. In this talk I will survey the annotation technologies emerging on the World Wide Web, and go on to explore how we can extend our methodologies to productively embrace multiple-modality resources.