In ELAN, users are free to invent their own tier setup and labelling method. This flexibility is often necessary because of the nature of the data to be transcribed. Moreover, the people involved in the transcription process may not be fluent in English, in which case an international (English) annotation scheme is not applicable. In those cases a controlled vocabulary (see Section 5.6) and templates (see Section 4.2.10) are convenient tools to help annotators.
The downside of all this flexibility is the amount of work needed to make language resources interoperable. With only a few resources the data can remain manageable, but as the number of resources grows, a convenient way to make them interoperable becomes more important. The ISO Data Category Registry was developed for this purpose.
The Data Category Registry (or DCR) is a list of linguistic concepts covering a range of linguistic domains. The concepts in the DCR can be referenced from all sorts of tools and resources. The DCR therefore acts as an intermediary between those tools and resources.
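The intermediary role can be illustrated with a minimal sketch. It is not ELAN code: the two projects, their local labels ("PoS", "woordsoort") and the function name are hypothetical, and the only value taken from this manual is the data category partOfSpeech with ID 1345. The point is that two resources with different local labels become comparable once both refer to the same data category.

```python
# Minimal sketch, not ELAN's implementation.
# The data category ID comes from this manual (partOfSpeech, ID 1345);
# the project names and local labels below are hypothetical.
PART_OF_SPEECH = 1345

# Each project maps its own local label to a data category ID.
project_a = {"PoS": PART_OF_SPEECH}
project_b = {"woordsoort": PART_OF_SPEECH}


def same_concept(label_a, label_b):
    """Return True if both local labels refer to the same data category."""
    id_a = project_a.get(label_a)
    id_b = project_b.get(label_b)
    return id_a is not None and id_a == id_b


print(same_concept("PoS", "woordsoort"))  # True
```

Labels that are not linked to any data category stay local to their project; in this sketch they simply never compare as equal.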
Referencing a data category is implemented in ELAN as follows. Depending on the type of data you are referencing from (a linguistic type (Section 5.3.6), a controlled vocabulary entry (Section 5.6.3) or an annotation (Section 5.9.21)), the following or a similar window is displayed.
The left panel shows the categories stored on your local system. Since there are
none in the left panel, the right panel does not display any name or description. To add
categories, click on . The following window appears:
This window displays the DCR on a remote server. It includes all profiles and the
data categories of those profiles. To select one or more data categories for local storage,
first click a profile in the left panel. All data categories of the selected profile are
displayed in the middle panel, ordered alphabetically, by ID or by Broader Concept. If you select a
data category, information about that category is displayed in the right panel. For instance,
the data category partOfSpeech has ID 1345, as can be seen below.
Holding the CTRL key while clicking multiple lines in the middle panel
enables you to select more than one data category. Similarly, use the
SHIFT key to select a range and CTRL+A to
select all categories in the list. Click on to store
the selected data categories on your local system.
More data categories, including those from other profiles, can be selected and stored on your local system in the same way. Afterwards, you can highlight a category and associate it with a CV or linguistic type by clicking Apply:
The original purpose of this system is to associate (parts of) your data with a common labelling system, improving interoperability between resources and tools. To do so, select a data category and click on
. This associates the selected data category with an annotation, a controlled vocabulary entry or a linguistic type, depending on the point from which you entered the Local Data Category Selection.
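The workflow described in this section (store categories locally, then associate one with an annotation, controlled vocabulary entry or linguistic type) can be modelled roughly as follows. This is an illustrative sketch under stated assumptions, not a description of how ELAN stores the data: the class names, the method names and the linguistic type name "pos-type" are all hypothetical. Only partOfSpeech and its ID 1345 come from the text.

```python
from dataclasses import dataclass, field

# The three kinds of target named in this section.
TARGET_KINDS = {"annotation", "cv_entry", "linguistic_type"}


@dataclass(frozen=True)
class DataCategory:
    """A data category as shown in the selection window (ID and name only)."""
    dc_id: int
    name: str


@dataclass
class LocalSelection:
    """Hypothetical stand-in for the locally stored data categories."""
    stored: dict = field(default_factory=dict)  # dc_id -> DataCategory
    links: dict = field(default_factory=dict)   # (kind, name) -> DataCategory

    def store(self, category):
        """Keep a category chosen from the remote registry on the local system."""
        self.stored[category.dc_id] = category

    def apply(self, target_kind, target_name, dc_id):
        """Associate a locally stored category with one target."""
        if target_kind not in TARGET_KINDS:
            raise ValueError(f"unknown target kind: {target_kind}")
        if dc_id not in self.stored:
            raise KeyError(f"data category {dc_id} is not stored locally")
        self.links[(target_kind, target_name)] = self.stored[dc_id]


selection = LocalSelection()
selection.store(DataCategory(1345, "partOfSpeech"))
# "pos-type" is a made-up linguistic type name used for illustration.
selection.apply("linguistic_type", "pos-type", 1345)
print(selection.links[("linguistic_type", "pos-type")].name)  # partOfSpeech
```

The sketch refuses to associate a category that has not been stored locally first, which mirrors the order of the steps above: categories are copied from the remote registry before they can be applied.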