How to select multiple files

In the dialog window below (which pops up as the first dialog for most of the multiple file operations), do one of the following:

Figure 1.142. Multiple file selection dialog

Editing multiple files and analysis of multiple files

Normally ELAN allows you to edit only a single file at a time, but there are situations in which it is convenient to edit multiple files at once. The menu item File > Multiple File Processing offers a number of options to do just this. When you select one of them, you are warned that you should have copies of the files you are going to work on in case you want to restore them (there is no Undo for multiple file edits).

Create transcription files for media files

When you choose the option Create Transcription Files for Media Files..., you see the following dialog.

Figure 1.143. Create transcription files for multiple media files Dialog

Options :

Click on Start to create the transcriptions based on the options set.

Edit Multiple Files

The option Edit Multiple Files... shows, after clicking Yes in the warning dialog mentioned above, the Multi File Editor. The first thing to do here is to load a domain by clicking Load domain. Loading a domain is the same as for the Scrub Transcriptions... option. To be able to load a domain you must of course have created one beforehand (see How to select multiple files). After loading a domain, the data is shown in the table. In this table you can edit tiers on the Tiers tab and tier types on the Tier Types tab.

To edit the name, annotator or participant of a tier, double click the corresponding table cell or select it and start typing. To change the tier type of a tier, select one from the drop down menu. You can add a tier by clicking Add tier, add a dependent tier by clicking Add dependent tier, and remove one by clicking Remove tier.

Note

If there are hierarchy inconsistencies (e.g. if a tier in one file does have a parent while a tier with the same name in another file does not) removing tiers is not possible. The button Remove tier is therefore greyed out.
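
The kind of inconsistency mentioned in the note can be pictured with a small Python sketch. It is an illustration only, not ELAN's own check: the function name `inconsistent_tiers` and the input layout (file name mapped to tier name mapped to parent tier name, with `None` for a tier without a parent) are assumptions, and it covers only the example case given in the note.

```python
def inconsistent_tiers(files):
    """Return tier names that have a parent in some files but not in others.

    files: hypothetical mapping {file name: {tier name: parent name or None}}.
    """
    has_parent = {}
    for tiers in files.values():
        for name, parent in tiers.items():
            # Record, per tier name, whether a parent was seen (True/False).
            has_parent.setdefault(name, set()).add(parent is not None)
    # A tier is inconsistent when both True and False were recorded.
    return sorted(name for name, flags in has_parent.items() if len(flags) > 1)


domain = {
    "a.eaf": {"Gloss": None, "Translation": "Gloss"},
    "b.eaf": {"Gloss": None, "Translation": None},
}
print(inconsistent_tiers(domain))
```

In this made-up domain, "Translation" has a parent in one file and none in the other, which is the situation in which the Remove tier button would be greyed out.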

Figure 1.144. Multiple File Editor

On the Tier Types tab, the name of a tier type can be changed by double clicking the corresponding table cell in the Type Name column.

Changes made in the Tiers and Types tabs are applied to all the files in the domain after clicking the Save changes to domain files button.

Scrub Transcriptions

When you choose Scrub Transcriptions..., you first need to specify a new domain or select an existing domain. This option helps you to "clean" the annotation files (*.eaf) of possible tabs or white space characters which are often overlooked by the user but are still saved in the file. To select, create or delete a domain see How to select multiple files.

In the next dialog, you can specify which characters to delete (new line characters, tab characters and/or white space characters) and in which position these characters have to occur. Click Start to start the scrubbing process. The progress of the scrubbing is shown in the progress bar.
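
As a rough picture of what scrubbing does to a single annotation value, consider the following Python sketch. It is not ELAN's implementation: the function `scrub` and its parameters are hypothetical, and it only handles characters at the start and end of a value, which is an assumption about the positions the dialog offers.

```python
def scrub(value, chars=" \t\n", at_start=True, at_end=True):
    """Remove the given characters from the start and/or end of a value."""
    if at_start:
        value = value.lstrip(chars)
    if at_end:
        value = value.rstrip(chars)
    return value


# Leading spaces and a trailing tab and newline are removed.
print(repr(scrub("  hello world\t\n")))
# Only the start is cleaned; trailing spaces are kept.
print(repr(scrub("\tkeep trailing  ", at_end=False)))
```

White space inside the value ("hello world") is left alone in this sketch.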

Figure 1.145. Transcription Scrubber

Annotations from overlaps

The option Annotations From Overlaps... for multiple files is the same function as annotations from overlaps in the currently open file (see Creating annotations from overlaps), but applied to a selection of files. The first step allows you to select a custom set of files in a file browser or to load a set of files that have been stored as a domain. For loading or creating a new domain, see How to select multiple files. The list of tiers is the union of all tier names encountered in the selected files. The options in the next steps are the same as for a single file. After clicking the Finish button in the last step, the new tier is created and populated with annotations in all files of the domain.
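
The combined tier list can be reproduced outside ELAN, for example to check which tiers a set of files contains before running the function. The Python sketch below is an illustration under assumptions: it relies on EAF files being XML with TIER elements that carry a TIER_ID attribute, the function names are made up, and two short inline strings stand in for real files.

```python
import xml.etree.ElementTree as ET


def tier_names(eaf_xml):
    """Return the set of TIER_ID values found in one EAF document string."""
    root = ET.fromstring(eaf_xml)
    return {tier.get("TIER_ID") for tier in root.iter("TIER")}


def all_tier_names(documents):
    """Return the sorted union of tier names over several documents."""
    names = set()
    for document in documents:
        names |= tier_names(document)
    return sorted(names)


# Minimal stand-ins for two .eaf files (real files contain much more).
file_a = '<ANNOTATION_DOCUMENT><TIER TIER_ID="LeftHand"/><TIER TIER_ID="Gloss"/></ANNOTATION_DOCUMENT>'
file_b = '<ANNOTATION_DOCUMENT><TIER TIER_ID="Gloss"/><TIER TIER_ID="RightHand"/></ANNOTATION_DOCUMENT>'
print(all_tier_names([file_a, file_b]))
```

For files on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`.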

Calculate inter-annotator reliability

In some projects two or more annotators annotate the recordings and the need exists to assess the level of agreement, in order to be able to improve the training of the annotators and, in the end, the annotation quality. Several algorithms have been implemented and are being applied in research projects, but there doesn’t seem to be consensus on what the best approach is for this type of data (time aligned annotations). The option Calculate inter-annotator reliability... allows you to calculate an agreement measure for annotations on multiple tiers in multiple files, providing three different methods based on existing algorithms. Even though multiple tiers can be selected, the comparison is always performed on pairs of tiers. These methods are provided “as is”. Implementation of other algorithms might be added later, if time allows. The three methods are:

Using the Calculate inter-annotator reliability... function requires a few steps to be taken. Some of the steps differ depending on the choices made; other steps are common to all methods. The steps and their availability depending on the choices made are described below.

The following sections will describe each step of the process in more detail.

Step 1. Method selection

Figure 1.146. 1. Method selection

The first step in the process allows you to select a method of comparing. Depending on what method is selected, the steps after that will differ. The options to choose from are:

  • By calculating modified Cohen’s kappa:

    This option implements (part of) the Holle & Rein algorithm as described in this publication: Holle, H., & Rein, R. (2014). EasyDIAg: A tool for easy determination of interrater agreement. Behavior Research Methods, August 2014. The manual of EasyDIAg can be consulted for a detailed description and explanation of the algorithm.

  • By calculating modified Fleiss' kappa:

    This option provides a modified implementation of Fleiss' kappa. The modification concerns (as is the case for the modified Cohen's kappa implementation) the matching of annotations to determine the "subjects" or "units" and the introduction of the "Unmatched" or "Void" category for annotations/events that are not identified by all raters. If the raters only have to apply labels to pre-existing segments, the problem of matching annotations does not exist. Fleiss' kappa works for two or more raters (the other options are limited to two raters).

  • By calculating the ratio of overlap and total extent:

    This is a simplified version of the function that used to be under Tier > Compare annotators.... It calculates a raw agreement value for the segmentation; it doesn't take into account chance agreement and it doesn't compare annotation values. The current implementation only includes in the output the average agreement value for all annotation pairs of each set of tiers (whereas previously the ratio per annotation pair was listed as well).

  • By applying the Staccato algorithm:

    This will compare the annotations (the segmentations) of 2 annotators using the Staccato algorithm. See this article for more information on the Staccato algorithm: Luecking, A., Ptock, S., & Bergmann, K. (2011). Staccato: Segmentation Agreement Calculator according to Thomann. In E. Efthimiou & G. Kouroupetroglou (Eds.), Proceedings of the 9th International Gesture Workshop: Gestures in Embodied Communication and Human-Computer Interaction (pp. 50-53).

Once you have chosen a method, click Next to continue. Note that when one of the kappa methods or Staccato is chosen, the next step will be 'Customize compare method'. Otherwise the next step is 'Document & tier configuration'.
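
To make the "ratio of overlap and total extent" option more concrete, here is a minimal Python sketch for a single pair of annotations. It is an interpretation of the option's name rather than ELAN's code: the function name is made up, annotations are assumed to be (begin, end) pairs in milliseconds, and the way ELAN pairs annotations and averages the ratios is not shown.

```python
def overlap_ratio(a, b):
    """Ratio of the overlap to the total extent of two (begin, end) intervals."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    extent = max(a[1], b[1]) - min(a[0], b[0])
    return overlap / extent if extent > 0 else 0.0


# 500 ms of overlap within a total extent of 1500 ms.
print(overlap_ratio((1000, 2000), (1500, 2500)))
```

Identical intervals give a ratio of 1.0 and non-overlapping intervals give 0.0, so the value is a raw agreement score for the segmentation only.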

Step 2. Customize compare method

Figure 1.147. a. Customize compare method for modified Cohen's kappa

When the modified Cohen's kappa method is chosen, this step allows you to specify the minimal required percentage of overlap. This is the amount of overlap as a percentage of the duration of the longer of the two annotations. The higher the percentage, the more the annotations have to overlap to match.
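
The overlap criterion described above can be written down as a short Python sketch. The function names are hypothetical, annotations are assumed to be (begin, end) pairs in milliseconds, and treating the threshold as inclusive is an assumption; the full matching procedure is described in the EasyDIAg manual.

```python
def overlap_percentage(a, b):
    """Overlap as a percentage of the duration of the longer annotation."""
    overlap = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    longer = max(a[1] - a[0], b[1] - b[0])
    return 100.0 * overlap / longer if longer > 0 else 0.0


def is_match(a, b, min_percentage):
    """True when the two annotations overlap enough to count as a match."""
    return overlap_percentage(a, b) >= min_percentage


first = (0, 1000)     # 1000 ms long
second = (400, 1000)  # 600 ms long, fully inside the first
print(overlap_percentage(first, second))
print(is_match(first, second, 60), is_match(first, second, 61))
```

With these values the overlap is 600 ms against a longer duration of 1000 ms, so the pair matches at a setting of 60 but no longer at 61.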

You can choose to generate and export agreement values per pair of tiers, in addition to the overall values. Since this algorithm compares annotation values as well, it is best to select tiers that share the same (controlled) vocabulary. When done, click Next.

Figure 1.148. b. Customize compare method for modified Fleiss' kappa

The options for Fleiss' kappa are slightly different; the slider here allows you to specify a percentage between 1 and 100. Since there can be any number of raters, the annotation matching algorithm tries to create clusters of as many overlapping annotations as possible, taking into account the required average of the percentages of the overlap and each involved annotation's duration. The figures below illustrate the problem.

Figure 1.149. Six raters, two possible clusters of four and six matching annotations, the overlap in light blue

Figure 1.150. Six raters, four of the possible clusters of matching annotations are marked in blue and green

The algorithm gives preference to clusters with more annotations, as long as the required average percentage of overlap is met. If not, a cluster with fewer annotations is selected. Each annotation can only be part of one cluster.

The Also export matrices checkbox allows you to skip saving the tables of values (see the worked example at Wikipedia), but it is recommended to accept the default.
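
For readers who want to see how such a table of values leads to a kappa, the following Python sketch computes the standard, unmodified Fleiss' kappa from a subjects-by-categories table of counts. It is not ELAN's modified implementation (there is no annotation matching and no "Unmatched" category here), the function name is made up, and the small table is invented for illustration.

```python
def fleiss_kappa(table):
    """Standard Fleiss' kappa; table[i][j] = raters placing subject i in category j.

    Assumes every row sums to the same number of raters (at least 2) and
    that the ratings are not all in one single category.
    """
    n_subjects = len(table)
    n_raters = sum(table[0])
    n_categories = len(table[0])
    total = n_subjects * n_raters
    # Share of all ratings that fall into each category.
    p_category = [sum(row[j] for row in table) / total for j in range(n_categories)]
    # Extent of agreement among the raters for each subject.
    p_subject = [
        (sum(count * count for count in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in table
    ]
    p_observed = sum(p_subject) / n_subjects
    p_expected = sum(p * p for p in p_category)
    return (p_observed - p_expected) / (1 - p_expected)


# Invented example: 4 subjects, 3 raters, 2 categories.
ratings = [[3, 0], [0, 3], [2, 1], [1, 2]]
print(round(fleiss_kappa(ratings), 3))
```

Each row of `ratings` corresponds to one "subject" or "unit", which in ELAN's case is a cluster of matched annotations.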

Figure 1.151. c. Customize compare method for Staccato

When you've selected the Staccato algorithm as the compare method, the settings as shown above will appear. You can customize the settings for the Staccato algorithm here. This algorithm takes chance into account by comparing the segmentation with a series of randomly generated segmentations, the Monte Carlo simulation. The nomination length granularity determines how many memory slots for segments of different length will be used internally. For more in-depth information regarding these settings, please see the reference article mentioned before. When done, click Next.

Step 3. Document & tier configuration

Figure 1.152. 3. Document and tier configuration

The next step is to configure where the tiers that you want to use for comparison are located and how they should be paired. In the upper part of the dialog, you select the location of the tiers:

  • In the current document (available when you have an .eaf file open, otherwise this option will be greyed out)
  • In the same file (choose a single file in the next step)
  • In different files (choose multiple files in the next step)

In the lower part of the dialog, you can select in what way the pairing of tiers to compare is done:

  • Based on manual selection (select tiers from a list in the next step)
  • Based on prefix or suffix (customize in the next step)
  • Based on same tier name (only available when the option 'in different files' is chosen)

When done, click Next to continue.

Step 4. Select files & matching

Next, you will select which files you want to use for comparison and how to match tiers.

Figure 1.153. 4. Select files & matching

The screen above shows all possible options. The options available for this step will differ depending on the configuration you made in the previous step. It will not be available when the option 'in current document' together with 'based on manual selection' was chosen in the previous step.

  • Select files from file browser: Browse for one or more files that you wish to use.
  • Select files from domain: Choose or create a domain of files to use, see How to select multiple files.
  • Combine files based on different suffix/prefix: When using multiple files, choose how they should be combined. For example, if there is a naming convention for the files such that the annotations of the first annotator are in files like "Recording4_R1.eaf" and those of the second annotator in files like "Recording4_R2.eaf", and suffix has been selected, these files will be combined automatically.
  • Combine tiers based on different suffix/prefix: As with files, when a naming convention has been used for the tiers of different annotators, they can be combined on the basis of a prefix or suffix, e.g. A_LeftHand and B_LeftHand in case of prefix-based matching.
  • If Fleiss' kappa was selected and the tiers to compare are in different files, an option is available to create and store new EAF files containing the matching tiers (experimental).
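
The suffix-based combination of files mentioned in the list above can be sketched in Python as follows. This only illustrates the naming convention, using the file names from the example; the function name and the fixed suffixes `_R1` and `_R2` are assumptions, and ELAN's own matching rules may differ.

```python
from pathlib import PurePath


def pair_by_suffix(paths, suffix_a="_R1", suffix_b="_R2"):
    """Pair files whose names differ only in the annotator suffix."""
    first, second = {}, {}
    for path in paths:
        stem = PurePath(path).stem  # file name without the .eaf extension
        if stem.endswith(suffix_a):
            first[stem[: -len(suffix_a)]] = path
        elif stem.endswith(suffix_b):
            second[stem[: -len(suffix_b)]] = path
    # Keep only recordings for which both annotators' files are present.
    return [(first[key], second[key]) for key in sorted(first) if key in second]


files = ["Recording4_R1.eaf", "Recording4_R2.eaf", "Recording5_R1.eaf"]
print(pair_by_suffix(files))
```

"Recording5_R1.eaf" has no counterpart and is left out. Prefix-based tier matching (A_LeftHand and B_LeftHand) would follow the same idea with `startswith`.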

Step 5. Tier selection

Figure 1.154. 5. Tier selection (current document / manual selection)

In this last dialog, you will select the tiers used for comparison. The layout will be different, based on what you selected in previous steps. The screen above displays the dialog when you've chosen the option 'In the current document' & 'based on manual selection' in step 3. You can manually select which tiers to compare.

Figure 1.155. Tier selection (in the same file / based on suffix)

The dialog above will appear when 'based on prefix or suffix' was chosen in step 3. Marking a tier from an annotator will result in a highlighted corresponding tier in the lower part of the dialog. When 'based on same tier name' was chosen, you can only select tier names; corresponding tiers will not be visible in the dialog.

Finally, click Next or Finish to save the output text file to a location on your computer.

Annotations from subtraction

The option Annotations From Subtraction... for multiple files is the same function as annotations from subtraction in the currently open file (see Create annotation by subtraction), but applied to a selection of files. The first step allows you to select a custom set of files in a file browser or to load a set of files that have been stored as a domain. For loading or creating a new domain, see How to select multiple files. The list of tiers is the union of all tier names encountered in the selected files. The options in the next steps are the same as for a single file. After clicking the Finish button in the last step, the new tier is created and populated with annotations in all files of the domain.

Modify boundaries of all annotations

The multiple file function Modify Boundaries of All Annotations is similar to the variant for the current document (see How to modify the boundaries of all annotations of selected tiers), except that there is no Undo, so you should have good backup copies of the files. The files to process can be selected in a file browser or from a domain (see How to select multiple files). The options in the second step are the same and the Finish button starts processing of all files. At the end of the process a short report lists which tiers in which files have been altered.

Statistics for multiple files

The function Statistics for Multiple Files is similar to annotation statistics for the current file (see Annotations Statistics). The main difference is that, after selecting the files in the domain, it is possible to select which tiers to include in the calculations. The tables in the tabs do not have the column showing the total annotation duration as a percentage of the media duration, but most do have a column for the number of files in which a certain value (tier or type name etc.) has been encountered. After changes in the selection of files or tiers, the Update Statistics button needs to be clicked to start the new calculations.

N-Gram analysis

Since ELAN 4.7, you are able to do an N-gram analysis over multiple eaf files. This functionality has been developed by Larwan Berke. You can find an extensive PDF document about this implementation on the third-party resources page of ELAN: https://archive.mpi.nl/tla/elan/thirdparty

When you first open the N-gram analysis, a new dialog window will pop up that contains the various options for the search and the resulting table showcasing a few statistics.

Figure 1.156. N-Gram Analysis dialog

The first step is to select the search domain, see How to select multiple files. Once that is done, a list of tiers found in the domain will be shown. A note of caution: the code assumes that all files in the domain contain the same tiers, so it loads only the first file in the domain to extract the tiers and display them in the window. Check the tiers you want to analyse.

Figure 1.157. N-Gram Analysis dialog 2

Next, define the N-gram size in the text box. The software can handle any positive size greater than 1. When set, clicking the “Update Statistics” button will start the search and will calculate the statistics. The annotations are extracted from the files, N-grams created from them, and finally collated into groupings of same N-grams for statistical analysis. When done, you will see a pop-up window with a process report. If there were any errors, they will also be displayed here.

Figure 1.158. N-Gram Analysis process report

When the search is done, the result table will be displayed in the main window. Only a small subset of the columns in the data is displayed, to avoid overcrowding the GUI. The visible columns are: N-gram, Occurrences, Average Duration, Minimal Duration, Maximal Duration, Average Annotation Time, and Average Interval Time.

Figure 1.159. N-Gram Analysis dialog 3

The first column shows the N-gram. The vertical marker “|” separates the annotations contained in the N-gram. For example, if a trigram size was selected, it would show something similar to “FINISH|READ|BOOK”, and so on for larger N-gram sizes.
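
The extraction and collation of N-grams can be pictured with a short Python sketch. It covers only the Occurrences column and leaves out all duration statistics; the function names are hypothetical, and building N-grams separately per tier (one list of annotation values per tier) is an assumption.

```python
from collections import Counter


def ngrams(values, n):
    """Join every run of n consecutive annotation values with '|'."""
    return ["|".join(values[i:i + n]) for i in range(len(values) - n + 1)]


def count_ngrams(tiers, n):
    """Count identical N-grams over several tiers (lists of values)."""
    counts = Counter()
    for values in tiers:
        counts.update(ngrams(values, n))
    return counts


tier = ["FINISH", "READ", "BOOK", "FINISH", "READ", "BOOK"]
counts = count_ngrams([tier], 3)
print(counts["FINISH|READ|BOOK"])
```

In this made-up tier the trigram "FINISH|READ|BOOK" starts at the first and at the fourth annotation, so it is counted twice.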

Finally, in order to see the entire data that was produced it is necessary to export the results into a text file for further processing. This is done by clicking on the “Save” button and a dialog will pop up asking the user where to save the data. It is exported in a CSV-like format (Comma-Separated Values). The CSV file uses tabs “\t” as the delimiters and newlines “\n” as the record separators to avoid ambiguity with the values. A sample row is: “HOLD|IX-1p\t7.9934\t0.348\t0.13754 ...” and contains numerous columns.

Furthermore, it is possible to export the individual N-grams in order to process them outside ELAN. The data is exported by clicking the “Raw Data” button in the GUI. After you supply a file name, the data will be exported in the same CSV format as discussed above. For more in-depth information about the N-Gram analysis function and the resulting data, please consult the PDF mentioned earlier on the ELAN website, or consult it here: https://parasol.tamu.edu/dreu2013/Berke/images/DREU_Final_Report.pdf (this link may become outdated at some point).
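
Because the export uses tabs as delimiters and newlines as record separators, it can be read with standard tools. The Python sketch below is a minimal example under assumptions: the function name is made up, the meaning of the numeric columns is not interpreted, and only the fields visible in the sample row above are used (an in-memory string stands in for the exported file).

```python
import csv
import io


def read_ngram_export(handle):
    """Read tab-delimited rows; split the first column into its annotations."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    rows = []
    for record in reader:
        if not record:
            continue  # skip empty lines
        rows.append((record[0].split("|"), record[1:]))
    return rows


# Stand-in for an exported file, limited to the visible part of the sample row.
sample = io.StringIO("HOLD|IX-1p\t7.9934\t0.348\t0.13754\n")
print(read_ngram_export(sample))
```

For a real export, a file opened with `open(path, newline="", encoding="utf-8")` could be passed instead; the UTF-8 encoding is an assumption here.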

Create multiple media clips

Apart from exporting part of a clip in a project (see Media clip using script), you can also export multiple clips from the same project or from multiple projects. The clips are cut based on the annotation times on a specified tier.

As with exporting part of a clip, Windows users will need to put a copy of ffmpeg.exe or ffmbc.exe in the program folder of ELAN (see Media clip using script for more info).

To use this function, you first need to create a tab-delimited text file. Go to File > Export Multiple Files As... > Tab-delimited Text.... (Note that this will not work with the single file tab-delimited export, as it has no option to include the video file path.) Choose a domain, or create a new one (if you want to create multiple clips from only one .eaf, create a new domain containing just that .eaf). Select the tiers you want to include in the tab-delimited text file (each annotation on a tier will result in a clip).

Make sure you check the following options:

Under the time column and format options, you will need to check:

The other options have to remain unchecked. Next, click OK and the file will be exported to a text file.

Go to File > Multiple File Processing... > Create Multiple Media Clips.... Choose the exported tab delimited text file you just created, and specify a folder to save the clipped videos to. Click OK to start the process. When done, a process report dialog will appear with information about the clipping process.

Merge tiers

Similar to merging tiers within a single project (see Merging tiers), this function allows you to merge tiers in multiple projects. This means the merged tiers will be added to each project you select in the process.

To start, click File > Multiple File Processing... > Merge Tiers.... You will be presented with a dialog in which you either select the eaf files from the file browser, or select files from a domain. When done, select the tiers to use for the merging process.

Figure 1.160. Merge Tiers Multiple Files 1

Next, select the merge criteria, either regardless of the annotation values or according to specified constraints within the annotations of a chosen tier. When the option Only process overlapping annotations is checked, ELAN only merges annotations that have the same value. In this case, the values of both annotations are not concatenated, so the created annotation contains the value only once.

Figure 1.161. Merge Tiers Multiple Files 2

In the next step, set a name for the destination tier and decide whether this tier will be a root tier or a child tier. Also select or create a tier type for the new tier.

Figure 1.162. Merge Tiers Multiple Files 3

Lastly, specify the value for the destination tier. You can choose a time format, which puts the corresponding time values into the annotation units on the new tier. You can also set a specific value to be filled into the annotation units. The final choice is to concatenate the values of the annotations from the tiers you have selected for merging.

Figure 1.163. Merge Tiers Multiple Files 4

After clicking Finish, the tiers will be merged and inserted into each eaf you chose at the start. A process report will show an overview of what has been done.

Update transcriptions after changes in ECV's

If one or more of the tier types of a transcription are linked to ECV's (External Controlled Vocabularies, see Using an External CV) and annotations have been created using those ECV's, it might be necessary to update existing transcriptions after changes have been made to those ECV's. This function allows you to update an entire corpus or a selection of files.

If you choose the option Update Transcriptions for ECVs, you'll see the following dialog:

Figure 1.164. Update transcriptions based on ECV's

First a source folder containing the transcription files should be selected, specifying whether or not sub-folders should be processed too. For the destination it is possible to choose to overwrite the existing source files (this should only be done if there are recent back up copies of the files) or to select a folder where the updated transcriptions should be stored. Furthermore it is possible to specify which content language to use; this is only useful if the ECV's are multilingual.

The default behavior of this update action is to change annotation values after changes in an external controlled vocabulary, based on the reference of the annotation to (the id of) an entry in the controlled vocabulary. The Don't change the annotation value... option allows you to change this default behavior: instead of updating an annotation value based on a reference to a CV entry, ELAN checks whether the annotation value is still in the controlled vocabulary and, if so, updates the reference or, otherwise, removes the reference.
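
The difference between the two update modes can be summarised in a small Python sketch. The data layout is invented for illustration (an annotation as a dictionary with `value` and `ref`, a vocabulary as a mapping from entry id to value), the function name is hypothetical, and leaving an annotation untouched when its reference is not found in default mode is an assumption.

```python
def update_annotation(annotation, cv_entries, keep_value=False):
    """Return an updated copy of an annotation after a vocabulary change."""
    updated = dict(annotation)
    if not keep_value:
        # Default mode: follow the entry id and take over its current value.
        if annotation["ref"] in cv_entries:
            updated["value"] = cv_entries[annotation["ref"]]
        return updated
    # "Don't change the annotation value" mode: look the value up instead.
    for entry_id, entry_value in cv_entries.items():
        if entry_value == annotation["value"]:
            updated["ref"] = entry_id  # value still exists: update the reference
            return updated
    updated["ref"] = None  # value no longer in the vocabulary: drop the reference
    return updated


vocabulary = {"e1": "WAVE", "e2": "POINT"}
old = {"value": "WAVING", "ref": "e1"}
print(update_annotation(old, vocabulary))
print(update_annotation(old, vocabulary, keep_value=True))
```

In default mode the value follows the entry (it becomes "WAVE"); with `keep_value=True` the value "WAVING" is kept and, since it is no longer in the vocabulary, the reference is removed.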

Update Transcriptions with elements from a Template

When the files of a corpus are based on a template, the need can arise during the project to update all files with new tiers, tier types and/or new controlled vocabularies. To achieve this, the template can be updated with the new elements and the changes can then be applied to the files. It is also possible to create a new template with only the new tiers, types etc. and use that one for the updating, because new elements in the template will be added, but existing tiers, types, CV's etc. in the .eaf files will not be deleted if they are not (or no longer) in the template.

To start the process of updating a set of files, choose Update Transcriptions with Template... A message will be shown, warning that there is no undo for the changes that are going to be made to the files. Then this dialog will be shown:

Figure 1.165. Update transcriptions with a template

The following options are available:

Apart from adding new elements to the existing files, this process also allows you to update some properties of existing elements, as long as these changes can't result in data loss. E.g. the Annotator and Participant properties of a Tier can be updated, but not the Parent Tier. The Tier Type of a Tier can only be changed to a Type with the same overall constraints (Stereotype). The Controlled Vocabulary property of a Tier Type can be changed, but not its Stereotype. Controlled Vocabularies can be converted from internal to external or the other way round, etc.

Multiple file export options

ELAN offers the possibility to export multiple annotation files as one file. To do so click on File > Export Multiple Files As... and one of the options.

Toolbox file(UTF-8)

To export multiple files as Toolbox files, click on File > Export Multiple Files As... > Toolbox file(UTF-8).... This process involves 3 steps.

  1. Step 1/3: File and Tier Selection

    Figure 1.166. Export as Toolbox file step 1

    1. First select the files that are to be exported. To select multiple files, choose one of the options below:
      • Select files from file browser: This will open a multiple file selection dialog in which you can select multiple files, or choose a directory to export all the files in that directory.
      • Select files from domain: see How to select multiple files.
    2. Next select the tiers which are to be used for the export process. Using the arrow buttons, you can sort the order of the tiers.
    3. Insert blank line after this marker (see Toolbox file(UTF-8))

    From the drop down list select the tiers to use in the overlaps computation. You can select all the tiers displayed in the list by clicking Select All, or deselect them by clicking Select None. Once you have chosen the tiers for which the overlaps should be found, click Next to go to the next step.

  2. Step 2/3: Export Settings

    Figure 1.167. Export as Toolbox file step 2

    In this step you can define output settings and the Toolbox options. The options are described in more detail in Toolbox file(UTF-8).

  3. Step 3/3: Save as Settings

    Figure 1.168. Save as Settings

FLEx file

To export multiple files as FLEx files, click on File > Export Multiple Files As... > FLEx File.... This process involves 4 steps.

  1. Step 1/4: File selection and element mapping

    Figure 1.169. Export as FLEx file step 1

    1. First select the files that are to be exported. To select multiple files, choose one of the options below:
      • Select files from file browser: This will open a multiple file selection dialog in which you can select multiple files, or choose a directory to export all the files in that directory.
      • Select files from domain: see How to select multiple files.
    2. Next, select whether you want to export the interlinear-text and paragraph tier. You can set the correct tier type to use as element type and paragraph type in the dialog below that.

  2. Step 2/4: Export Settings

    Figure 1.170. Export as FLEx file step 2

    In this step you can select a tier type to use for the 'morph-type' tiers. It's also possible to uncheck this, if not needed. From the dialog, you can also map the tier types to the different items, which are listed on top.

  3. Step 3/4: Element-item 'type' and language attribute

    Figure 1.171. Element-item type and language attribute.

    In the next dialog, you can specify the element-item tier type and set a language for it. ELAN can try to extract that from the tier name (if the box is checked), but it is also possible to add (or remove) a value for a language or type. To do so, enter a value ('en' in this example) and click Add. Then you can select the added value from the drop-down menu under 'language'. You need to set a type and language for every Tier Type Name in order to be able to go to the final step. For more information on the structure of FLEx, see FLEx to ELAN structure.

  4. Step 4/4: Save as settings

    Figure 1.172. Save as Settings

Praat TextGrid

To export multiple files as Praat TextGrid, click on File > Export Multiple Files As... > Praat TextGrid.... This process involves 2 steps. See Praat TextGrid file for more details.

  1. Step 1/2: File and Tier Selection (see Export as Toolbox file step 1)
  2. Step 2/2: Export Settings (see Save as Settings)

Tab-delimited Text

To export multiple files as Tab-delimited Text, click File > Export Multiple Files As... > Tab-delimited Text....

  1. A multiple file selection dialog appears (see How to select multiple files). Select or create a domain and click OK to continue.
  2. In the next dialog that appears, select tiers and options as you would do when exporting a single Tab-delimited Text file (see Tab-delimited text file).
  3. You can also choose to include a column for the file name and file path to the exported text file. To do so, check or uncheck the appropriate boxes. Instead of adding a column containing the file name and/or path, you can also choose to put the name and path in a row preceding the annotations of each file. When both file name and path are unchecked, but the row option is checked, the file path will be exported in a row.

Figure 1.173. File name & path options for Multiple Tab-delimited text export.

List of annotations

To export multiple files as List of annotations, click File > Export Multiple Files As... > List of annotations....

  1. A multiple file selection dialog appears (see How to select multiple files). Select or create a domain and click OK to continue.
  2. In the next dialog that appears, select the tiers (see How to select tiers) from which the annotations are to be exported. Note that the annotations are not separated into words. Check Count occurrences if you want the list to include the number of occurrences for each annotation.

List of Words

To export multiple files as a List of Words, click File > Export Multiple Files As... > List of Words....

  1. A multiple file selection dialog appears (see How to select multiple files). Select or create a domain and click OK to continue.
  2. In the next dialog that appears, select the tiers and other options as you would do when exporting a single list of words (see Alphabetical list of words).
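
The difference between the List of annotations export (whole annotation values, optionally with occurrence counts) and the List of Words export can be illustrated with a short Python sketch. The function names are made up, and splitting words on white space is an assumption; the options ELAN offers for delimiting words are described in Alphabetical list of words.

```python
from collections import Counter


def annotation_list(values):
    """Count whole annotation values, without splitting them into words."""
    return Counter(values)


def word_list(values):
    """Count the individual words found in the annotation values."""
    counts = Counter()
    for value in values:
        counts.update(value.split())  # assumption: words are separated by white space
    return counts


values = ["the book", "read the book", "the book"]
print(annotation_list(values)["the book"])
print(word_list(values)["the"])
```

Here the annotation "the book" occurs twice as a whole value, while the word "the" occurs three times across all annotations.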

Export tiers as EAF

There could be situations in which you want to discard or select tiers from multiple .eaf files, for instance if you want to present a third party with a limited number of tiers. To do so, select File > Export Multiple Files As... > Selected Tiers as EAF.... In the first dialog (see How to select multiple files) you can select the files from which you want to export a selection of tiers.

Once you have selected your files, Export Tiers from Multiple Files dialog appears.

Figure 1.174. Exporting by selecting tiers

Exporting by selecting tiers
To export, do the following:
  1. Select tiers (see How to select tiers) for the export.
  2. In the Output Options section you can specify...
    1. whether to Export parent tiers of the selected dependent tiers automatically or to Only export dependent tiers if their parent tiers are selected.
    2. whether to Save files with original names or to Make use of suffixes. In the latter case, you can specify whether to save the files with their original name followed by a suffix, or with a new base name followed by a suffix number.
    3. whether the files should be saved in the original directory, in a (possibly new) directory local to each file, or together in the same directory.
    4. whether or not ELAN should export files that would end up having no tiers.
  3. Finally, click Export to export the .eaf files containing only the selected tiers.

Annotation Overlaps Information

This function is available via menu File > Export Multiple Files As... > Annotation Overlaps Information....

This function allows the user to select one “reference” tier and multiple other tiers that will be compared (sequentially) with the reference tier. The comparison is done on the level of the annotations.  

The following information will be present in the resulting tab-delimited text file:

  1. The Header line with column names

    Column 1- 4:

    Begin time End time Duration Reference Tier Name

    These columns will contain information for all annotations of the reference tier. The annotation values are in the column with the tier name as the header, the time info in the first 3 columns.

    Next for each “comparing” tier there will be 11 columns, the header of which consists of the tier name and a suffix and the column contains the following information:

    1 Name-ov 0 or 1,  whether there is an overlapping annotation or not  (0=no, 1=yes)
    2 Name-same 0 or 1, whether the overlapping annotation has the same value. If there is more than one overlapping annotation the value is 0 (0=no, 1=yes)
    3 Name-ov-dur The duration of the overlap,  the total overlap duration in case of more than one overlapping annotation
    4 Name-no-ann The number of overlapping annotations
    5 Name-value The value of the overlapping annotation, concatenated, comma separated, in case of multiple overlaps
    6 Name-bt-to-bt-After The amount of time from the beginning of the reference annotation to the beginning of the first non overlapping annotation
    7 Name-et-to-bt-After The amount of time from the end of the reference annotation to the beginning of  the first non overlapping annotation
    8 Name-et-to-et-After The amount of time from the end of the reference annotation to the end of  the first non overlapping annotation
    9 Name-bt-After The begin time of the annotation on the comparing tier after the reference annotation
    10 Name-et-After The end time of the annotation on the comparing tier after the reference annotation
    11 Name-value-After The value of the first annotation on the comparing tier after the reference annotation

  2. The content per file

    After the header, for each file there will be the following information/data:

    • one line containing the file name
    • a number of rows equal to the number of annotations of the reference tier
    • each cell filled with the information corresponding to the header description (above) 

    All time values are in milliseconds.
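
To make the first five per-tier columns concrete, here is a minimal Python sketch of how they could be derived for one reference annotation. It is an illustration only, not ELAN's code: the function name, the tuple layout and the strict overlap test (touching boundaries do not count as overlap) are assumptions.

```python
def compare_with_reference(ref, others):
    """Compute the -ov, -same, -ov-dur, -no-ann and -value cells.

    ref:    (begin_ms, end_ms, value) of the reference annotation
    others: list of (begin_ms, end_ms, value) on the comparing tier
    """
    ref_begin, ref_end, ref_value = ref
    # Assumption: two annotations overlap when they share more than a boundary.
    hits = [(b, e, v) for b, e, v in others if b < ref_end and e > ref_begin]
    return {
        "ov": 1 if hits else 0,
        # With more than one overlapping annotation the value is 0.
        "same": 1 if len(hits) == 1 and hits[0][2] == ref_value else 0,
        # Total overlap duration, summed over all overlapping annotations.
        "ov-dur": sum(min(e, ref_end) - max(b, ref_begin) for b, e, _ in hits),
        "no-ann": len(hits),
        # Values concatenated, comma separated.
        "value": ",".join(v for _, _, v in hits),
    }

row = compare_with_reference(
    (1000, 2000, "a"),
    [(500, 1200, "a"), (1800, 2500, "b"), (3000, 3500, "c")],
)
print(row)
```

In this example two annotations overlap the reference annotation for 200 ms each, so the total overlap duration is 400 ms and the "same" cell is 0.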

Time Co-reference Tab-delimited text

This function is available via menu File > Export Multiple Files As... > Time Co-reference Tab-delimited text....

A common requirement is to export data in a tabular format that can be opened in a spreadsheet program or used in another environment such as R or Python. ELAN has a standard export option for this task called ‘Tab-delimited Text’. However, where the focus is on capturing data which co-occurs, temporally, with annotations on a specific tier, the arrangement of the data is unsuitable.

The variant ‘Time Co-reference Tab-delimited text’ export option addresses this issue. It produces a file where each row is an instance of an annotation in a specific reference tier, and each column provides data related to, or co-occurring with, that annotation, according to the timecode.

In short, this option is intended for analysis, qualitative and quantitative, of temporal co-occurrence of phenomena with regard to a particular focus. For example, a research question might be: “What is the range of the elements of ‘Tier A’ that are associated with annotations on ‘Tier X’?”

This function allows the user to select either a single file or the collection of files within a domain. The dialog box presents the following steps:

Figure 1.175. Select files, tiers and Concatenation separator

Select files, tiers and Concatenation separator
  1. First a reference tier is chosen that reflects the focus of the data enquiry.
  2. Then, other tiers are selected according to what supplementary, co-occurring, information is required.
  3. Next a ‘Concatenation separator’ is chosen. (The default is a semi-colon, but almost anything is possible – including multi-character strings – using the ‘Custom’ option.) The purpose of this is to indicate the boundaries between individual annotations in the cases where multiple annotations on a particular tier are captured.
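
When the exported file is processed in Python, a cell that holds several co-occurring annotations can be split again on the chosen separator. The sketch below is a hypothetical helper, not part of ELAN; it assumes the default semi-colon separator and that whitespace around the separator can be discarded.

```python
def split_cell(cell, separator=";"):
    """Split one exported cell into its individual annotation values."""
    if not cell:
        return []  # no co-occurring annotation on this tier
    # Assumption: padding around the separator is not meaningful.
    return [part.strip() for part in cell.split(separator)]

print(split_cell("nod; smile"))
print(split_cell("a||b", separator="||"))
```

Choosing a separator that never occurs inside annotation values keeps this split unambiguous, which is the reason the Custom option accepts multi-character strings.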

The output file: After these steps, the output is saved to a tab-delimited text file with a ‘.txt’ extension. (This format is commonly referred to as ‘tab-separated values’, or TSV. Often it is useful to rename the extension to ‘.tsv’ since this can help other programs to parse the format.) The output file comprises:

Theme Data Files

Theme© (www.patternvision.com) is an application for detection and analysis of hidden patterns in time (so called t-patterns) in behavioral data. It is possible to export annotations of selected tiers such that they can be imported into a Theme project. A Theme project requires at least two text files:

When exporting, ELAN creates the “vvt.vvt” file and for each transcription file it creates a raw data file, converting each annotation into two records for the raw data, one for the begin time and one for the end time of the annotation.
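
The conversion of one annotation into a begin record and an end record can be sketched as follows. This Python fragment only illustrates the principle: the tuple layout and the "b" and "e" markers are assumptions, and the sketch does not reproduce the actual layout of a Theme raw data file.

```python
def to_begin_end_records(annotations):
    """Turn (begin_ms, end_ms, value) annotations into time-ordered records."""
    records = []
    for begin, end, value in annotations:
        records.append((begin, "b", value))  # record for the begin time
        records.append((end, "e", value))    # record for the end time
    # Order all records by time so that events of different tiers interleave.
    return sorted(records, key=lambda record: record[0])

for record in to_begin_end_records([(0, 500, "wave"), (300, 800, "nod")]):
    print(record)
```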

To start, click File > Export Multiple Files As... > Theme Data Files....

  1. A file and tier selection dialog appears. Select the appropriate files or a domain, then choose the tiers you would like to use for the export.

    Note

    If one of the selected .eaf files is currently open, you will be presented with a warning. Make sure you have saved your current transcriptions before starting this export process as local changes will be overwritten.

  2. In the next dialog that appears, set the various export options. There are two specific configuration options for the export:
    • When tiers are connected to a Controlled Vocabulary there is an option to include the entire CV in the .vvt file, otherwise ELAN will just add all values that are actually present in the annotations.
    • For the Actor it is possible to use either the tier name or the participant label of a tier (if it is there).
  3. When done, click Finish to start the export. Afterwards, you can find the exported vvt.vvt file and the text file(s) in the directory you specified in step 2.

Annotated Part Overview (preliminary)

This export function is available via File > Export Multiple Files As... > Annotated Part Overview.... It creates a tabular text file containing information on how much of a recording has been annotated on at least one of the selected tiers. The export window allows you to select a domain or a custom set of files and lists the tiers found in the files. After selection of the tiers to include in the processing, clicking the Next or Finish button starts the processing and shows a Save As dialog. The output can be opened in a spreadsheet application; it contains a row for each file and the following columns (summarized):

Finally there is a row with the totals of all rows, giving an overview of all the files in the domain.
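
"Annotated on at least one of the selected tiers" amounts to the union of the annotation intervals of those tiers. The following Python sketch shows one way to compute such a union; it is an assumption about the underlying idea, not a description of how ELAN calculates the exported figures.

```python
def annotated_duration(intervals):
    """Total time (ms) covered by at least one (begin_ms, end_ms) interval."""
    total = 0
    cur_begin = cur_end = None
    for begin, end in sorted(intervals):
        if cur_end is None or begin > cur_end:
            # A gap: close the current merged stretch and start a new one.
            if cur_end is not None:
                total += cur_end - cur_begin
            cur_begin, cur_end = begin, end
        else:
            # Overlapping or adjacent: extend the current stretch.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_begin
    return total

# Intervals pooled from all selected tiers of one file.
print(annotated_duration([(0, 1000), (500, 1500), (3000, 4000)]))
```

Overlapping annotations on different tiers are counted once, so the result never exceeds the duration of the recording.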

Export to a REFI project file

The REFI-QDA Standard is a file format for exchanging data between qualitative data analysis (QDA) software applications. The approach to annotation and analysis in these applications seems to be somewhat different from that in ELAN and similar tier-based annotation applications. This is reflected in the concepts registered in the REFI format; some map quite well to ELAN concepts, others don't have an obvious counterpart in the EAF format.

One of the concepts that does map quite well is that of Transcript, in ELAN usually represented by one transcription tier per speaker. Similarly the concepts of Source, User and Selection (or Segment) translate relatively easily. On the other hand, the important REFI concepts of Codebook and Code don't seem to map to ELAN elements in a straightforward way. Closest equivalents seem to be Controlled Vocabulary and Controlled Vocabulary Entry, even though the way Codes are used and applied in QDA applications seems to differ from how CV's are usually used in ELAN.

Despite these differences, there can be cases where export of an ELAN project (i.e. multiple EAF files and the linked media files) to the REFI format makes sense. Therefore an option has been added to export ELAN files to the REFI .qdpx project exchange file format. The implementation is based on best guesses of how concepts should be mapped, while providing some means to configure the output. The results have been tested in trial versions of a few of the involved QDA applications (most of them require a paid license), with varying degrees of success. This export is only of interest to users who have access to a QDA application and are familiar with the concepts mentioned above.

A REFI .qdpx file is a zip file with a predefined structure. Central in the .qdpx file is a project.qde file, an XML file based on the REFI-QDA standard's XML schema file. Next to this file is a folder named Sources, which can contain different types of source files, such as plain text .txt transcript files and possibly audio and/or video files etc. The .qde file contains the Codebook and entries for each audio/video file in the project, each entry with links to the media file and to the produced transcript file and possibly containing Syncpoint and/or Coding elements.
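
Because a .qdpx file is a regular zip file, its structure can be inspected with standard tools. The Python sketch below uses the standard zipfile module; the function name is hypothetical, and it assumes that project.qde sits at the top level of the archive next to the Sources folder.

```python
import zipfile

def summarize_qdpx(path_or_file):
    """Report whether project.qde is present and list the files in Sources."""
    with zipfile.ZipFile(path_or_file) as archive:
        names = archive.namelist()
    return {
        "has_project_file": "project.qde" in names,
        # Skip directory entries, which end with a slash.
        "sources": [n for n in names
                    if n.startswith("Sources/") and not n.endswith("/")],
    }

# Usage (the file name is an example):
# print(summarize_qdpx("my_export.qdpx"))
```

A quick check like this can help to verify an export before trying to import it into a QDA application.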

To start the export, click File > Export Multiple Files As... > REFI Project File.... This opens a three-step export window:

  1. Step 1/3: Selection of files and of Transcript and/or Coding tiers

    Figure 1.176. Export as REFI project file, step 1

    Export as REFI project file, step 1

    1. Select EAF files from the file browser or from a domain. The files will be loaded and two tables will be filled with the tier names.
    2. The first table allows you to select the tiers that should be exported as transcription text. If there are multiple speakers (in multiple tiers), these will be exported in a single Transcript text file (i.e. one Transcript text file per EAF file). More details on this in step 2.
    3. The second table allows you to select tiers that should be exported as Coding tiers. If a selected tier in this table has an ancestor tier in the Transcript table, it is assumed the annotations on this tier add Codes to annotations on that tier and ultimately to that part of the exported transcription text that originates from that ancestor annotation. If a Coding tier does not have an ancestor tier in the Transcript table, it is assumed the annotations add codes directly to segments of the media file.

      In both cases the annotation values are added as Codes to the Codebook in the output. If a Coding tier is linked to a Controlled Vocabulary, all entries of that CV are added to the Codebook, regardless of whether they are used or not.

    At least one tier should be selected, in either the Transcript or the Coding tier table, in order to be able to proceed to the next step.

  2. Step 2/3: Transcript configuration and media file handling settings

    Figure 1.177. Export as REFI project file, step 2

    Export as REFI project file, step 2

    1. Settings for Transcript texts
      • Include speaker labels: if this is selected a label based on the participant attribute or the tier name will be printed in front of the text. The label has a fixed maximum length of 3 characters.
      • Repeat the label of the same speaker: sets whether or not the label should be printed if the speaker of the current utterance is the same as the one of the previous utterance.
      • Merge annotations of the same speaker into paragraph: if not selected, each annotation will be on a new line in the output. If this option is selected, consecutive annotations of one speaker will be on the same line in the output, separated by a whitespace; an utterance of a different speaker always starts on a new line.

        Note

        A special case are subdivision tiers; if these are selected as transcript tiers, annotations under the same parent will always be merged in the export (separated by whitespaces), regardless of this setting. If a depending tier of a subdivision tier is selected as a coding tier, the resulting code of each depending annotation will still be linked to that part of the transcript text that corresponds to its parent annotation.

      • Include silence duration indicators: if this is selected an indication of the duration of the gap between utterances is printed in the output text, with a (currently hardcoded) minimum of 200 milliseconds. There won't be silence duration indicators within a paragraph.
      • Include begin (and/or end) time stamps in the text: if selected, a formatted time stamp will be printed at the beginning (or end) of a line or paragraph. Importing QDA applications may or may not show these time stamps in their text view.

        Note

        Regardless of these time stamp settings, there will be Syncpoint elements in the REFI project XML file, ensuring that the time links between segments of the text and the media file are available to and in the importing QDA application.

    2. Settings for media files
      • Copy the media files into the export file: if this is selected the linked media files are copied into the Sources folder of the .qdpx file. In the REFI project's .qde file, the media files are then referenced with internal:// URL's. The big disadvantage is that this increases the size of the .qdpx file considerably. Unfortunately, it seems that the best chance of successful import into a QDA application, including the media files and links between text and media, is with this option selected.

        Note

        This export function does not check possible maximum sizes of the files to add or of the resulting file.

      • Set a base path and use relative paths: with this option a base path for the media files should be specified. The base path should be (the path to) a folder containing the media files and/or the sub-folders containing the media files. The media files will not be copied into the .qdpx file, in the project's .qde file they will be referenced with relative:// URL's. Even though this could be sufficient for importing QDA applications to find the media files, if importing takes place on the same computer, this often doesn't work. The QDA application might prompt the user to locate or select the base path folder (this can be a different folder e.g. if importing happens on a different computer).
      • Use absolute paths of the media files: with this option the media files are not included in the .qdpx file and are referenced in the project's .qde file with absolute:// URL's, their current absolute path. When the export from ELAN and the import into a QDA application is performed on the same computer, that could theoretically give a good chance of the media files being found and imported. But the three tested applications did not support this.

      It will be a matter of just trying out which option works (best), depending on the target QDA application and possibly the operating system. More information on media file handling and on the structure of a .qde file in general, can be obtained from the www.qdasoftware.org website.

  3. Step 3/3: Saving the file and showing the export progress. A Save As dialog asks to specify a location and enter a name for the .qdpx file, after which processing of the .eaf files starts. After completion of the export process, the .qdpx file can be imported or opened in a QDA application (possibly after transfer of the file to a different computer). If the export is cancelled before the end of the process, the .qdpx file may or may not be usable.
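
The interplay of the Transcript text settings in step 2 (speaker labels, merging into paragraphs and silence duration indicators) can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not ELAN's implementation: the label separator, the format of the silence indicator and the use of "greater than or equal to" for the 200 ms minimum are all hypothetical, and the Repeat the label option is not modelled.

```python
MIN_GAP_MS = 200  # minimum gap for a silence indicator, as stated above

def build_transcript(utterances, merge=True):
    """utterances: time-ordered list of (begin_ms, end_ms, speaker, text)."""
    lines = []
    prev_speaker, prev_end = None, None
    for begin, end, speaker, text in utterances:
        if merge and speaker == prev_speaker:
            # Same speaker: continue the paragraph, no silence indicator.
            lines[-1] += " " + text
        else:
            gap = begin - prev_end if prev_end is not None else 0
            if gap >= MIN_GAP_MS:
                lines.append(f"({gap / 1000:.1f})")  # hypothetical format
            # Labels have a fixed maximum length of 3 characters.
            lines.append(f"{speaker[:3]}: {text}")
        prev_speaker, prev_end = speaker, end
    return "\n".join(lines)

print(build_transcript([
    (0, 1000, "Anna", "hi"),
    (1100, 2000, "Anna", "there"),
    (2500, 3000, "Bob", "hello"),
]))
```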

Multiple file import options

ELAN offers the possibility to import multiple files at once and save them as *.eaf files. To do so click on File > Import Multiple Files As... and one of the options.

Toolbox file (UTF-8)

To import multiple Toolbox files for conversion to *.eaf, click File > Import Multiple Files As... > Toolbox file (UTF-8).... This operation consists of 3 steps.

Figure 1.178. File selection

File selection

  1. Select the files you would like to import by clicking browse and adding the files to the list.
  2. Next you will have to select what settings to import. Either select a *.typ file or use the 'Set field markers' option. See Shoebox file for working with *.typ files and field markers.
  3. Lastly, you will have to configure the Save as settings.

Figure 1.179. Save as Settings

Save as Settings

When the operation has completed, you will be presented with a process report. The multiple *.eaf files are now ready to be used in ELAN.

Praat TextGrid

Multiple TextGrid files created in Praat can be imported and converted to *.eaf files. This process involves 3 steps.

  1. Choose the *.TextGrid files that will be imported for conversion to *.eaf. You can also set the encoding (default, UTF-8, UTF-16).
  2. In the next dialog, you can define the settings to be used for importing:

    Figure 1.180. Import Settings

    Import Settings

    In this dialog, you can choose whether to include Praat PointTiers, whether empty annotations or intervals should be skipped, and whether a corresponding .wav file should be added to the linked files for each converted .TextGrid file.

  3. Lastly, you will be asked how the files should be saved and in what location.

Figure 1.181. Save as Settings

Save as Settings

When the operation has completed, you will be presented with a process report. The multiple *.eaf files are now ready to be used in ELAN.

Flex files

To import multiple FLExtext files for conversion to *.eaf, click File > Import Multiple Files As... > FLEx file.... This operation consists of 3 steps.

  1. Select the FLEx files you want to use for conversion to *.eaf. Do so by clicking the Browse... button in the dialog and choose the proper files.
  2. In the next dialog, you can define the settings to be used for importing:

    Figure 1.182. Import Settings

    Import Settings

    You can select whether to use the 'interlinear-text' and 'paragraph' elements in FLEx, whether to import the participant information, and what the smallest time-alignable element should be: 'phrase' or 'word'. Choose on what level you want to create tier types and set a duration per phrase element (required).

  3. Finally, you can choose how to save the files and in what directory to save them:

    Figure 1.183. Save as Settings

    Save as Settings

    You can choose to save with an .XML or .flextext extension, and you can skip files that would result in having no tiers.