A CSV (Comma Separated Values) or Tab-delimited Text file is a text file in which one can identify rows and columns. Rows are represented by the lines in the file and the columns are created by separating the values on each line by a specific character, like a comma or a tab. CSV or Tab-delimited Text files can be compared to spreadsheets like the ones in Microsoft Excel in that they also have rows and columns. Note that .csv files can be created by Excel.
Take a look at Figure 4.32. The first row represents the event of a person saying 'so from here'. The first value (and therefore the first column of the complete file) represents the tier name, the second and third represent the begin time in two different formats, the fourth and fifth represent the end time, the sixth and seventh represent the duration, and the last value represents the annotation.
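The column layout described above can be sketched in Python. This is only an illustration: the sample row, its time values, the two time formats (hh:mm:ss.ms and seconds) and the field names in FIELDS are assumptions made for this sketch and are not taken from Figure 4.32.

```python
import csv
import io

# Hypothetical sample row. The tier name, the time values and the two time
# formats (hh:mm:ss.ms and seconds) are invented for illustration.
SAMPLE = (
    "K-Spch\t00:00:01.500\t1.5\t00:00:02.750\t2.75"
    "\t00:00:01.250\t1.25\tso from here\n"
)

# Illustrative names for the eight columns described in the text.
FIELDS = [
    "tier",
    "begin_hms", "begin_sec",
    "end_hms", "end_sec",
    "duration_hms", "duration_sec",
    "annotation",
]


def read_rows(text, delimiter="\t"):
    """Yield one dict per line, mapping the column names above to values."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    for row in reader:
        yield dict(zip(FIELDS, row))


rows = list(read_rows(SAMPLE))
print(rows[0]["tier"], "|", rows[0]["annotation"])
```

Each line becomes one record, so a row in the file corresponds to exactly one annotation on one tier.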
You can import CSV or Tab-delimited Text files into ELAN:
In the dialog window, browse to and select a file that contains CSV or tab-delimited data and confirm your selection.
The second dialog window contains two sections (see Figure 4.33). The upper section shows a sample table containing data from the selected file. Both rows and columns are numbered. The lower section enables you to specify which columns to include and what data type each of them represents. This means that the format of the files is flexible: neither the kind of data nor its layout is prescribed. The numbers of the columns in the Import Options section correspond to the numbers of the columns in the sample table. The data types you can select are:
Annotation
Tier
Begin time
End time
Duration
Select at least one column with data type 'Annotation'. If you select columns for begin time, end time and duration, the duration column is ignored during import.
The option Specify first row of data enables you to exclude a header by skipping the first few lines. The option Specify delimiter lets you set the delimiter yourself if ELAN did not guess the correct one. The delimiters supported by ELAN are comma, tab, colon and semicolon.
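How a delimiter might be guessed can be sketched as follows. This is a naive heuristic written for illustration, not ELAN's actual detection logic: the function name guess_delimiter and the rule "same non-zero count on every line, highest count wins" are assumptions. Only the four candidate delimiters come from the text above.

```python
# The four delimiters named in the text: comma, tab, colon, semicolon.
CANDIDATES = [",", "\t", ":", ";"]


def guess_delimiter(lines):
    """Pick the candidate that occurs equally often (and at least once)
    on every non-empty line, preferring the one with the most occurrences.
    On a tie the candidate listed first wins. Returns None if no
    candidate qualifies. Quoted fields are not taken into account."""
    best, best_count = None, 0
    for delim in CANDIDATES:
        counts = {line.count(delim) for line in lines if line.strip()}
        if len(counts) == 1:
            count = counts.pop()
            if count > best_count:
                best, best_count = delim, count
    return best


print(repr(guess_delimiter(["a\tb\tc", "d\te\tf"])))
```

A heuristic like this can fail, for example when time values containing colons appear in every row, which is why a manual override such as Specify delimiter is useful.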
If you enable the option Default annotation duration, ELAN creates all annotations from the selected file with a duration equal to the specified number of milliseconds. This option only takes effect if the file contains no time data, or only begin times or only end times.
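The interplay between the time columns and the default duration can be sketched as a small Python function. The function name resolve_interval and its parameters are hypothetical. Only two rules come from the text: duration is ignored when both begin and end time are present, and the default duration applies when only a begin time or only an end time is available. Deriving a missing time from a duration column is an assumption, and the case without any time data is left undecided here.

```python
def resolve_interval(begin=None, end=None, duration=None, default_duration=None):
    """Return (begin_ms, end_ms) for one annotation, or None if the
    interval cannot be determined from the given values.
    All values are in milliseconds."""
    if begin is not None and end is not None:
        # Rule from the text: with begin and end present, duration is ignored.
        return begin, end
    if begin is not None and duration is not None:
        # Assumption: a duration column fills in the missing end time.
        return begin, begin + duration
    if end is not None and duration is not None:
        # Assumption: a duration column fills in the missing begin time.
        return end - duration, end
    if default_duration is not None:
        # Rule from the text: the default applies with only begin or only end.
        if begin is not None:
            return begin, begin + default_duration
        if end is not None:
            return end - default_duration, end
    # No time data at all: placement is not covered by this sketch.
    return None


print(resolve_interval(begin=1000, end=2000, duration=500))
print(resolve_interval(begin=1000, default_duration=300))
```

Checking the begin-and-end case first is what makes the duration column irrelevant whenever both times are known.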
Finally, confirm the dialog to import the data. A new transcription document is created with the imported annotations as its contents.
To demonstrate that the format of the imported file can be flexible, take a look at the following tab-delimited text:
In this example each column represents a tier, with the tier names in the first row and the annotations in the other rows. This file can be imported by selecting the following import options:
Note that the Specify first row of data option is set to 2. As a consequence, ELAN starts importing annotations from row 2 instead of row 1. Furthermore, ELAN tries to extract tier names from the first line of the file if the column they are part of is specified as 'Annotation'. In this example, this results in two tiers: K-Spch and W-Spch.
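The column-per-tier layout can be sketched in Python as follows. This is a minimal illustration of the idea, not ELAN's implementation. The sample text is invented, and only the tier names K-Spch and W-Spch come from the example above. The function name read_tier_columns and the parameter first_data_row are hypothetical, and skipping empty cells is an assumption.

```python
import csv
import io

# Hypothetical tab-delimited sample: tier names in row 1, annotations below.
SAMPLE = "K-Spch\tW-Spch\nso from here\tokay\nto there\tright\n"


def read_tier_columns(text, first_data_row=2, delimiter="\t"):
    """Treat every column as a tier named by the first line and collect
    the values from first_data_row onward (1-based) as its annotations."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header = rows[0]
    tiers = {name: [] for name in header}
    for row in rows[first_data_row - 1:]:
        for name, value in zip(header, row):
            if value:  # assumption: empty cells produce no annotation
                tiers[name].append(value)
    return tiers


tiers = read_tier_columns(SAMPLE)
print(sorted(tiers))
```

With first_data_row set to 2, the header line supplies the tier names but is not itself imported as an annotation.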