Appendix A. CMDI ecosystem

CMDI (“Component Metadata Infrastructure”) is a metadata framework in which metadata blueprints can be described and reused. Metadata is data about data: it gives various kinds of information about the data it describes, e.g. the protagonists, the places, or the date on which the data was gathered. Ready-made description options make it easier to manage this metadata carefully and to produce well-ordered information about the data. Instead of always creating metadata components from scratch, CMDI allows you to use existing description options: blueprints in the form of ‘profiles and components’ (see Figure A.1 below). If necessary, these existing blueprints can be edited in the Component Registry according to individual needs. As a next step, the content of metadata descriptions can be edited in Arbil.

Figure A.1.  Building profiles from components according to the projects’ individual demands with CMDI in the component browser.
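
The idea of assembling profiles from reusable components can be sketched in a few lines of Python. This is an illustrative model only, not the CMDI schema or the Component Registry interface: the classes `Component` and `Profile`, and all component, element, and profile names, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    """A reusable building block: a named group of metadata elements (hypothetical model)."""
    name: str
    elements: tuple


@dataclass
class Profile:
    """A blueprint for one kind of resource, assembled from components."""
    name: str
    components: list = field(default_factory=list)

    def element_paths(self):
        """List every element as 'Component/element', in declaration order."""
        return [f"{c.name}/{e}" for c in self.components for e in c.elements]


# Hypothetical components; in practice they would be chosen from a registry.
actor = Component("Actor", ("name", "role"))
location = Component("Location", ("country", "region"))

# Two profiles reuse the same Actor component instead of redefining it.
interview = Profile("InterviewSession", [actor, location])
songbook = Profile("SongCollection", [actor])
```

Because both profiles refer to the same `Actor` component, any metadata written against either profile describes actors in the same way.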


CLARIN has established this infrastructure of ready-made building blocks (components) to overcome the “dispersion of metadata in diverse formats” (see the CLARIN website about CMDI). With standardized building blocks, metadata is no longer subject to arbitrary, widely varying, individually preferred ways of describing data. All necessary information can be extracted easily from the standardized metadata descriptions, without having to study the data themselves or first work out how individual authors have set up their descriptions. Standardization thus makes working with metadata easier for both authors and users. All in all, the ability to store and share self-designed components and profiles for reuse by others facilitates flexible but structured metadata creation, as well as search across various language resources.

Another advantage of CMDI is the possibility of linking elements in components and profiles to the ISOcat concept registry. This registry is a central framework for gathering concepts in various domains and has a special profile for metadata-related categories. Because the metadata descriptions are standardized and linked to this central registry, users searching ISOcat for certain kinds of data can find the relevant data of different authors. It is therefore highly recommended to browse ISOcat and to create links to it when working with CMDI; both can be done from inside the Component Registry.
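
How a shared concept link makes differently named elements findable together can be sketched as follows. This is a toy model under stated assumptions: the record layout, the `find_by_concept` helper, and the concept identifiers (`concept:person-name`, `concept:location`) are hypothetical placeholders, not real ISOcat entries or a real registry interface.

```python
# Each record maps an element name to a (concept link, value) pair.
# Authors A and B use different element names for the same concept.
records = [
    {"author": "A", "elements": {"speakerName": ("concept:person-name", "Maria")}},
    {"author": "B", "elements": {"Informant": ("concept:person-name", "Juan")}},
    {"author": "C", "elements": {"Place": ("concept:location", "Lima")}},
]


def find_by_concept(records, concept):
    """Return (author, element name, value) for every element linked to `concept`."""
    hits = []
    for record in records:
        for name, (link, value) in record["elements"].items():
            if link == concept:
                hits.append((record["author"], name, value))
    return hits


# Searching by concept finds both authors' elements despite the differing names.
matches = find_by_concept(records, "concept:person-name")
```

The search never looks at the element names themselves, which is why individually chosen names do not prevent the records from being found together.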

The last significant related CLARIN instrument to be mentioned here is the VLO (Virtual Language Observatory). This website gives an overview of metadata from a number of registries, covering a large number of institutions that make use of CMDI. In addition, all metadata created with the help of CMDI is available in the records provided by the institutions. This service facilitates the search for and exchange of data, by means of metadata, among research communities in the field.