Meta-Descriptions for Language Resources
An increasing number of language resources are being created worldwide. They are treasure troves of information for a variety of communities, including researchers and engineers working on language and human interaction. As these resources grow more complex, it becomes progressively harder to know what kind of information they contain and what format they use to encode it. People searching such resources, for example in order to carry out research or train a speech recognizer, want to be able to find those that hold relevant information quickly and easily.
Today the Internet supports networks that let us keep each other informed about available resources. But the Internet contains so much information that it is almost impossible to find a specific item unless one already knows where it is. For this reason many communities are discussing the creation of web portals and the structuring of community- or domain-specific information by means of so-called meta-descriptions. A collection of meta-descriptions can form an easily browsed search space. It is proposed that the language resources community should set up web portals to organize and present information about language resources.
The meta-descriptions would contain meta-data that characterizes individual language resources in a way that is meaningful to members of the language resources community. They would have to be generated in a standard format so that specialised search engines and browsing tools could make use of them. It remains to be seen how much of the work being done for the meta-data initiatives in other communities can be adapted to the requirements of the users of language resources, and to what extent existing meta-element definitions can be re-used, but the ongoing work of the World Wide Web Consortium and similar groups appears to be highly relevant.
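As a rough illustration of the idea, the following Python sketch shows how a handful of meta-descriptions in a uniform format could be filtered by a simple search tool. The element names (title, languages, resource_type, annotation_format, keywords), the catalogue entries and the search function are all hypothetical and chosen only for illustration; they are not taken from any existing meta-data standard or from the proposal itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class MetaDescription:
    """One record describing a language resource (hypothetical element set)."""
    title: str
    languages: List[str]
    resource_type: str
    annotation_format: str
    keywords: List[str] = field(default_factory=list)


def search(records, **criteria):
    """Return the records that satisfy every criterion.

    For list-valued elements the wanted value must be a member of the list;
    for all other elements it must be equal to the stored value.
    """
    hits = []
    for record in records:
        matches = True
        for name, wanted in criteria.items():
            value = getattr(record, name)
            if isinstance(value, list):
                matches = wanted in value
            else:
                matches = value == wanted
            if not matches:
                break
        if matches:
            hits.append(record)
    return hits


# Invented example catalogue; the entries do not describe real resources.
catalogue = [
    MetaDescription("Dialogue corpus A", ["nl"], "speech", "XML", ["dialogue"]),
    MetaDescription("Lexicon B", ["en", "de"], "lexicon", "plain text"),
    MetaDescription("Broadcast corpus C", ["en"], "speech", "XML", ["news"]),
]

english_speech = search(catalogue, languages="en", resource_type="speech")
print([record.title for record in english_speech])
```

Because every record uses the same elements, one generic search routine serves the whole catalogue; this is the practical benefit of generating the descriptions in a standard format.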
The talk will introduce the concept of meta-descriptions and present examples to illustrate the applications for such descriptions on the Internet.