Appendix A. CMDI ecosystem

CMDI (“Component Metadata Infrastructure”) is a metadata framework in which metadata blueprints can be described and reused.

Metadata is data about data: it records various kinds of information about the data it describes, e.g. the protagonists, the places, and the date on which the data was gathered. To manage this metadata carefully, ready-made description options can be used, which make it easier to produce well-ordered information.

In addition to creating metadata components from scratch, CMDI allows you to use existing description options: blueprints in the form of ‘profiles and components’ (see Figure A.1 below). If necessary, these existing blueprints can be edited in the Component Registry according to individual needs.

Furthermore, Arbil allows you to edit the content of metadata descriptions.

Figure A.1.  Building profiles from components according to the projects’ individual demands with CMDI in the component browser.


CLARIN has established this infrastructure of ready-made building blocks (components) to overcome the “dispersion of metadata in diverse formats” (see the CLARIN website for more information about CMDI). When standardized building blocks are used, metadata is no longer subject to arbitrary, highly diverse, individually preferred ways of describing data. All the necessary information can be extracted easily from the standardized metadata descriptions, without having to study the data themselves or to first work out how individual authors have set up their descriptions. Standardization thus makes working with metadata easier for both authors and users. Overall, the ability to store self-designed components and profiles and to share them for reuse by others supports flexible but structured metadata creation, as well as searching across various language resources.

Another advantage of CMDI is that elements in components and profiles can be linked to the ISOcat concept registry. This registry is a central framework for gathering concepts from various domains and has special profiles for metadata-related categories. By creating links to this registry, authors enable users who search for certain kinds of data in ISOcat to find matching data from different authors, thanks to the standardized metadata descriptions and the central registry. It is therefore highly recommended to browse ISOcat and to create links to it when working with CMDI, which can be done from inside the Component Registry.
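
The idea behind this linking can be illustrated with a small sketch. The Python code below is not part of CMDI, ISOcat, or Arbil; it is a simplified model under stated assumptions. The class names (Element, Component, Profile), the helper elements_by_concept, the example profile and element names, and the identifier "concept:language-name" are all hypothetical and chosen only for illustration. The sketch shows two profiles, built from different components, in which differently named elements point to the same concept and can therefore be found together.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Element:
    name: str
    # Hypothetical identifier standing in for a link to a concept registry entry.
    concept_link: Optional[str] = None


@dataclass
class Component:
    # A reusable building block that groups elements.
    name: str
    elements: List[Element] = field(default_factory=list)


@dataclass
class Profile:
    # A blueprint assembled from one or more components.
    name: str
    components: List[Component] = field(default_factory=list)


def elements_by_concept(profiles: List[Profile]) -> Dict[str, List[str]]:
    """Group element paths from all profiles by the concept they link to."""
    index: Dict[str, List[str]] = {}
    for profile in profiles:
        for component in profile.components:
            for element in component.elements:
                if element.concept_link is None:
                    continue  # unlinked elements cannot be matched across profiles
                path = f"{profile.name}/{component.name}/{element.name}"
                index.setdefault(element.concept_link, []).append(path)
    return index


# Placeholder value, not a real ISOcat identifier.
LANGUAGE_CONCEPT = "concept:language-name"

# Two authors name the same notion differently but link it to one concept.
actor = Component("Actor", [Element("Name"), Element("Language", LANGUAGE_CONCEPT)])
speaker = Component("Speaker", [Element("MotherTongue", LANGUAGE_CONCEPT)])

profiles = [
    Profile("FieldworkSession", [actor]),
    Profile("InterviewCorpus", [speaker]),
]

index = elements_by_concept(profiles)
for path in index[LANGUAGE_CONCEPT]:
    print(path)
```

In this model the element "Name" carries no link and is therefore absent from the index, which mirrors why creating links is recommended: only linked elements can be matched across descriptions by different authors.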

The last significant CLARIN-related instrument to be mentioned here is the VLO (Virtual Language Observatory). This website gives an overview of metadata from a number of registries, covering a large number of institutions that make use of CMDI. Additionally, all metadata that has been created with the help of CMDI is available in the records provided by the institutions. This service helps research communities in the field to search for and exchange data with the help of metadata.