ELAN supports importing files from:
Transcriber file (the section called “Transcriber files”)
CHAT file (the section called “CHAT file”)
Shoebox file (the section called “Shoebox file”)
Toolbox file (the section called “Toolbox file”)
Fieldworks Language Explorer (FLEx) file (the section called “Fieldworks Language Explorer (FLEx) file”)
CSV / Tab-delimited Text Files (the section called “CSV / Tab-delimited Text Files”)
Praat TextGrid file (the section called “Praat TextGrid file”)
The feature to import Transcriber annotation files into ELAN works as follows:
Choose
Select the Transcriber file (*.trs) and click on Open.
If the associated sound file cannot be found, a dialog will be shown asking you to locate it. When this request is cancelled, one can choose to open the annotation file without the sound, or to stop the whole import process.
The Transcriber tiers will be mapped onto their ELAN equivalents:
Section becomes an independent tier and turn becomes a referring tier of section (see also the section called “Basic Information: Annotations, tiers and linguistic types”).
Events are embedded into the annotation text.
It is possible to import CHAT files (used, e.g., in the CHILDES project) into ELAN:
Select
Select the Chat file
Click on
Some remarks about this import feature:
supported are old CHAT files and CHAT-UTF8, not XML CHAT
existing media alignment in %snd tiers is maintained in ELAN:
when no media alignment is present at all, each CHAT utterance gets a default interval of 1 second assigned
when partial media alignment is present, the time interval is equally distributed over preceding unaligned utterances
overlapping utterances of the same participant are corrected as well as possible
CHAT dependent tier names are mapped to ELAN Linguistic Types
ELAN tier names are either CHAT participant labels or CHAT tier names, followed by '@participantName'
Remaining issues:
'<' and '>' characters in CHAT cause parsing errors when the imported file is saved as EAF file
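The default timing for completely unaligned CHAT files can be illustrated with a short Python sketch. It only shows the "one second per utterance" rule stated above; the function name and the assumption that the intervals are laid out back to back starting at 0 ms are hypothetical and not taken from ELAN itself.

```python
DEFAULT_INTERVAL_MS = 1000  # one second per utterance, as stated in the remarks above


def default_intervals(utterances, interval_ms=DEFAULT_INTERVAL_MS):
    """Return (begin_ms, end_ms, text) tuples for utterances without any alignment.

    Assumption: intervals follow each other directly, starting at 0 ms.
    """
    return [
        (index * interval_ms, (index + 1) * interval_ms, text)
        for index, text in enumerate(utterances)
    ]


if __name__ == "__main__":
    # Hypothetical utterances, for illustration only.
    for begin, end, text in default_intervals(["hello .", "more cookie ."]):
        print(begin, end, text)
```

The partial-alignment case is left out on purpose, because the remarks above do not specify exactly how the interval is divided.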
ELAN supports the import of documents from Shoebox, thereby allowing you to link transcribed and/or interlinearized documents to the time axis of media files. In order to import from Shoebox, you need at least the following two files:
the Shoebox file (*.txt, *.sht, *.tbt);
the media file(s) (*.mpg, *.mov, *.wav etc.);
Optionally you can use the corresponding Shoebox database type file (*.typ). If this is not available, one has to provide a list with field markers (= tier names).
If you do not know the Shoebox database type file, do the following:
Open the Shoebox *.txt | *.sht | *.tbt file in Shoebox. Make sure it is the active window (click on it to activate it).
Click on the Database menu.
Click on
dialog box appears. The name of the database type is displayed in the header, e.g.:
Locate the directory of the database type file (e.g., “texts.typ” in the above illustration). It is probably located in the directory “My Shoebox Settings”.
To import a Shoebox file into ELAN, do the following:
Click on
. The dialog box appears.
Specify the name and directory of the two files, e.g.:
Like *.eaf documents, the Shoebox file and the media file(s) do not necessarily need to have the same name, and they do not need to be in the same directory (see the section called “Basic Information: Media Files and Annotation Files”).
If the Shoebox file contains both aligned (i.e. containing time information) and non-aligned records, the aligned ones will maintain the timing, whereas the location of the non-aligned records will be interpolated automatically.
Click
to import the file; otherwise click to exit the dialog box without importing the file.
An ELAN window containing the imported Shoebox file appears.
Instead of relying on a Shoebox database type file, there is also an option in ELAN to define the field markers yourself when importing a Shoebox *.txt | *.sht | *.tbt file.
1. Select Set field markers and click on the button in the import dialog. The following window appears:
2. Now fill in a field marker as used in the Shoebox *.txt | *.sht | *.tbt file.
3. Optionally select a parent marker (see the section called “Basic Information: Annotations, tiers and linguistic types”).
4. Optionally select a stereotype (symbolic subdivision or association, see the section called “Basic Information: Annotations, tiers and linguistic types”).
5. Choose a character set (Latin-1, SIL IPA or UTF-8) for the tier.
6. Click on Add.
7. Repeat steps 2 to 6 for all field markers.
If the selected marker designates a participant, check the
checkbox. If you don’t want the selected marker to be imported, tick . Finally choose
and click on in the import Shoebox file dialog.
Once you have manually created a set of field markers, you might want to reuse them later on. ELAN provides support for this:
To save a set of field markers, select the
button. This will display a save dialog. Enter a filename, and press save.
In the same way, you can open a stored field marker set by clicking on
Once the import has succeeded, you can add a reference to a media file via the menu, as described in the section called “Changing the links to media files”. If the imported Shoebox file was exported from ELAN before, you won’t need to establish the link to the media file(s) again, as in that case the location information is stored in the file.
ELAN imports Shoebox files according to the following conventions:
The Shoebox field markers are imported as ELAN tiers. The tier label is identical to that of the field marker, except for the added extension @‘Speaker-ID’.
This addition is necessary because ELAN and Shoebox differ in how they code information about multiple speakers:
In ELAN, each speaker is coded on a separate tier.
In Shoebox, all speakers are coded using the same field, and their identity is specified in a separate field.
When importing texts by multiple speakers, ELAN splits each Shoebox field into several ELAN tiers (one for each speaker) and adds the speaker-ID to the tier label.
If speaker information is not specified in the Shoebox file, the extension @unknown is added.
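The naming convention can be summarised in a few lines of Python. This is only a sketch of the rule described above (field marker plus @speaker-ID, with @unknown as the fallback); the function names and the sample markers "ref" and "tx" are hypothetical.

```python
def elan_tier_label(field_marker, speaker_id=None):
    """Build an ELAN tier label from a Shoebox field marker and a speaker ID."""
    # The fallback "unknown" is used when no speaker information is available.
    speaker = speaker_id if speaker_id else "unknown"
    return f"{field_marker}@{speaker}"


def tier_labels(records):
    """Collect the tier labels needed for a list of (speaker_id, markers) records."""
    labels = set()
    for speaker_id, markers in records:
        for marker in markers:
            labels.add(elan_tier_label(marker, speaker_id))
    return sorted(labels)


if __name__ == "__main__":
    # Two hypothetical records by different speakers, each with the same fields.
    print(tier_labels([("A", ["ref", "tx"]), ("B", ["ref", "tx"])]))
```

Each Shoebox field thus yields one tier per speaker, which is why a file with two speakers ends up with twice as many tiers as field markers.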
The following screenshot illustrates how ELAN treats texts by multiple speakers:
Note that ELAN can only read speaker information if:
A marker is defined as a Participant marker in the Set field marker dialog (see Importing Shoebox files without a TYP file above), or if:
It is coded in a Shoebox field labelled \EUDICOp or \ELANParticipant (see illustration above). If this field is not present, or if speaker information is coded in a different field, ELAN will assume that there is only one speaker. I.e., if you have multiple speakers and if you want ELAN to assign them to separate tiers, do the following:
For every Shoebox record, add the field marker \EUDICOp.
For every Shoebox record, enter the relevant speaker-ID into this field.
When the file is exported back to Shoebox (see the section called “Shoebox file”), the extension @‘Speaker-ID’ is automatically dropped from the field marker, and the Shoebox records are sorted according to their record marker (e.g., in the above illustration, “test 001” is sorted before “test 002” etc.)
Based on the information contained in the Shoebox database type file, the tiers are brought into a hierarchical relationship and are assigned to linguistic types (see the section called “Basic Information: Annotations, tiers and linguistic types” for details of tier hierarchies and linguistic types). For every tier name a corresponding linguistic type with the same name is created. All of these linguistic types are connected with a stereotype in such a way that it fits with the original Shoebox structure.
The Shoebox record marker is assigned to the stereotype None, i.e., it is an independent, time-alignable parent tier.
The transcription and parsing fields of Shoebox are assigned to the stereotype Symbolic Subdivision, i.e., they are referring tiers that can be subdivided into smaller units.
All other fields are assigned to the stereotype Symbolic Association, i.e., they are referring tiers that cannot be subdivided into smaller units.
If you define the markers yourself, then there also is the possibility to choose the Time Subdivision stereotype. For example:
All SIL IPA characters are converted into Unicode characters during import. If you export the file back into Shoebox (see the section called “Shoebox file”), the Unicode characters will be converted back into SIL IPA characters.
Unless it already contains time code information, the imported Shoebox file initially carries no information about timing. Instead, ELAN automatically assigns each Shoebox record to a three second time interval, as in the following illustration:
The time alignment has to be done manually for each Shoebox record. Do the following:
1. Activate the Bulldozer mode: Click on
(see the section called “Activating and deactivating the Bulldozer mode or Shift mode” for the three available modes).
If you do not activate the Bulldozer mode, you will inadvertently overwrite and thereby delete existing annotations. Make sure that
is enabled in the menu.
2. Click on the first annotation on the parent tier (i.e., the first Shoebox record). It appears in a dark blue frame.
3. Modify the boundaries of that annotation, so that they are aligned with the correct time interval (see the section called “Changing the boundaries of an existing selection and annotation” for ways of modifying boundaries).
4. Press CTRL+ENTER to apply the new time interval.
The parent annotation (together with all its referring annotations) is assigned to the new time interval. All other parent annotations are moved to the right.
5. Repeat steps 2 to 4 for each parent annotation.
The following screenshot illustrates steps 1 to 4:
After you have done the time-alignment, you can export the file back to Shoebox – in this case, the time code information will be kept (see the section called “Shoebox file”). If you then re-import the file back into ELAN, ELAN automatically assigns the Shoebox records to their correct time intervals.
An imported Shoebox file can be saved as an ELAN file (see the section called “Re-open recently accessed files”), exported back into Shoebox (see the section called “Shoebox file”), or exported as a tab-delimited text (see the section called “Tab-delimited text file”).
Importing a document from Toolbox is very much the same as importing a document from Shoebox (see the section called “Shoebox file”). The Toolbox import assumes that all markers in the file are Unicode (although it still allows importing files in which all markers are in ISO-Latin if you uncheck All markers are Unicode). This alternative to the Shoebox import attempts to allow more flexibility in terms of tier relations and tries to prevent words from being cut up in case of misalignment. As with the Shoebox import, information about the tier relations can be provided by means of a .typ file or by using a marker file.
When reconstructing the vertical alignment of words on interlinearized markers, the position is recalculated based on the number of bytes per character. In some files, however, this leads to incorrect alignment; the recalculation can therefore be turned off by unchecking Correct alignment based on the number of bytes per character. This import also tries to take non-spacing characters into account.
ELAN can import documents from the SIL Fieldworks Language Explorer (FLEx). To do so click
. Select the FLEx file and relevant media files by clicking the -buttons. Determine whether you want to import the "interlinear-text" and "paragraph" elements and what the smallest time-alignable element should be. Finally enter the duration of the whole file or the initial duration of the smallest time-alignable element you have chosen. Start the import by clicking .
A CSV (Comma Separated Values) or Tab-delimited Text file is a text file in which one can identify rows and columns. Rows are represented by the lines in the file and the columns are created by separating the values on each line by a specific character, like a comma or a tab. CSV or Tab-delimited Text files can be compared to spreadsheets like the ones in Microsoft Excel in that they also have rows and columns. Note that .csv files can be created by Excel.
Take a look at Figure 4.35, “Tab-delimited Text”. The first row represents the event of a person saying 'so from here'. The first value (as well as the first column of the complete file) represents the tier name, the second and third represent the begin time in different formats, the fourth and fifth represent the end time, the sixth and seventh represent the duration and the last value represents the annotation.
You are able to import CSV or Tab-delimited Text files in ELAN:
. In the dialog window browse to and select a file that contains CSV or Tab-delimited data and click .
The second dialog window contains two sections (see Figure 4.36, “Import CSV / Tab-delimited Text”). The upper section shows a sample table containing data from the selected file. Both rows and columns are numbered. The lower section enables you to specify which columns to include and what data type they represent. This means that the format of the files is flexible: it is not prescribed what data is expected nor how it is formatted. The numbers of the columns in the Import Options section correspond to the numbers of the columns in the sample table. The data types you can select are:
Annotation
Tier
Begin time
End time
Duration
Select at least one column with data type 'Annotation'. If you select a column for begin time, end time and duration, the latter will be ignored in the import process.
The option Specify first row of data enables you to exclude a header by excluding the first few lines. The option Specify delimiter lets you specify the delimiter if ELAN did not guess the correct delimiter. The delimiters supported by ELAN are comma, tab, colon and semi-colon.
If you enable the option Default annotation duration, ELAN creates all annotations from the selected file with durations equal to the number of milliseconds specified. This option works only if there is no time data or only the begin or end times.
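How the time columns combine can be sketched in Python. The first branch (duration ignored when begin and end are both present) and the default-duration branches follow the description above; deriving the missing boundary from begin plus duration or end minus duration is an assumption, as is the function name. All values are in milliseconds.

```python
def resolve_interval(begin=None, end=None, duration=None, default_duration=None):
    """Return (begin_ms, end_ms) for one imported row, or None if undetermined."""
    if begin is not None and end is not None:
        return begin, end  # a duration column, if present, is ignored
    # Assumption: one boundary plus a duration determines the other boundary.
    if begin is not None and duration is not None:
        return begin, begin + duration
    if end is not None and duration is not None:
        return end - duration, end
    # Default annotation duration: only used when a boundary is still missing.
    if default_duration is not None:
        if begin is not None:
            return begin, begin + default_duration
        if end is not None:
            return end - default_duration, end
    # No time data at all: where the annotation is placed is not covered here.
    return None


if __name__ == "__main__":
    print(resolve_interval(begin=1000, end=2500, duration=9999))
    print(resolve_interval(begin=1000, default_duration=400))
```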
Finally click
to import the data. A new transcription document is created with the imported annotations as its contents.
To demonstrate that the format of the imported file can be flexible, take a look at the following tab-delimited text:
In this example each column represents a tier, with the tier names in the first row and the annotations in the other rows. This file can be imported by selecting the following import options:
Note that the Specify first row of data option is set to 2. As a consequence ELAN starts importing annotations from row 2 instead of row 1. Furthermore, ELAN tries to extract tier names from the first line of the file if the column they are part of is specified as 'annotation'. In this example this results in two tiers: K-Spch and W-Spch.
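The column-per-tier layout can be read with Python's standard csv module. The sketch below mimics the behaviour described above (tier names taken from the first row, data starting at row 2); the sample annotation texts and the function name are made up for illustration, and timing is left out entirely.

```python
import csv
import io


def read_column_tiers(text, first_data_row=2, delimiter="\t"):
    """Map each column header (tier name) to the non-empty cells below it."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    header = rows[0]
    tiers = {name: [] for name in header}
    # Row numbers are 1-based, as in the import dialog.
    for row in rows[first_data_row - 1:]:
        for name, cell in zip(header, row):
            if cell:
                tiers[name].append(cell)
    return tiers


if __name__ == "__main__":
    # Hypothetical file content with the two tier names from the example.
    sample = "K-Spch\tW-Spch\nhello\thi there\nyeah\t\n"
    print(read_column_tiers(sample))
```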
ELAN offers the possibility to import a Praat TextGrid file: click on
. In the dialog window that now appears, you can browse to the file you wish to import. You are also able to include Praat PointTiers. When selecting this option, specify the default PointTiers annotation duration in milliseconds. Finally, check Skip empty intervals / annotations if you want to do so.
If there is already an annotation document opened in ELAN, the imported TextGrid is added to the document in one or more new tiers. If there is no annotation document opened, a new document consisting of the TextGrid data is generated.
In addition to TextGrid files in the default encoding for the operating system, ELAN supports Praat TextGrid files with UTF-8 and UTF-16 encoding.
Importing tiers from recognizers places the tiers in a new file if no file is currently open in ELAN. If a file is open, the tiers are added to the currently open file. To import the tiers from recognizers, go to
> > . Selecting this option first prompts for the import file. If no file is open, the tiers are imported directly into the new file. If a file is already open, a 'Create tiers from segments' dialog appears. For more information about this dialog see Figure 5.15, “Audio Recognizer”.
ELAN offers many export options. To export, click on
and one of the options.
Shoebox file (the section called “Shoebox file”)
Toolbox file (the section called “Toolbox file(UTF-8)”)
Tab-delimited text file (the section called “Tab-delimited text file”)
Tiger XML (the section called “Tiger XML”)
CHAT file (the section called “CHAT files”)
Traditional transcript file (the section called “Traditional transcript files”)
Praat TextGrid file (the section called “Praat TextGrid file”)
Alphabetical list of words (the section called “Alphabetical list of words”)
Clip of video file (the section called “Clip of video file”)
Media clip using script (the section called “Media clip using script”)
SMIL clip (the section called “SMIL clip”)
QuickTime text (the section called “QuickTime Text”)
Subtitle text (the section called “Subtitle Text”)
ELAN's document view (the section called “ELAN’s document view”)
Interlinear text file (the section called “Interlinear text file”)
HTML file (the section called “HTML file”)
Filmstrip Image (the section called “Filmstrip Image”)
Tiers for recognizers (the section called “Tiers for recognizers”)
Different ways to select tiers:
By Tier Names
Select the tiers by checking the boxes before each tier name.
By Type
This tab shows a list of the linguistic types available in the current transcription. Select the types by checking the boxes before each type name. Selecting a type selects all the tiers of that type. To modify the selected tiers, switch back to By Tier Names.
By Participant
This tab has a list of all the participants in the transcription. Select the participants by checking the boxes before each participant name. Selecting a participant selects all the tiers of that participant. To modify the selected tiers, switch back to By Tier Names.
By Annotators
This tab has a list of all the annotators in the transcription. Select the annotators by checking the boxes before each annotator name. Selecting an annotator selects all the tiers of that annotator. To modify the selected tiers, switch back to By Tier Names.
To select multiple tiers, press Shift and click on the successive tiers, or click and drag the mouse along the tiers to select them.
Other options:
To sort the selected tiers, use the and buttons to move the tiers up and down in the table.
Show only root tiers : Check this option to show only the root tiers in the transcription.
: click this button to select all the boxes in the current tab.
: click this button to de-select all the boxes in the current tab.
: click on Ok to select the tiers
: click to close the dialog or cancel the changes
All Shoebox files that were imported into ELAN (see the section called “Shoebox file”) can be exported back into Shoebox. In this case, the time code information is kept.
To export a file into Shoebox, do the following:
Click on
menu.
Click on
.
The
dialog box appears. Make a choice and click on to continue.
By selecting
you can let ELAN wrap a whole block if one of the lines in a block is longer than a specified number of characters (default is 80 characters).
By selecting
you can add to the annotation times the time offset from the master media that originated from the synchronization of media files (see the section called “Synchronizing video files”).
Specify the name and directory of the exported file, e.g.:
Click
to export the file; otherwise click to exit the dialog box without exporting the file.
The file is exported as a *.txt | *.sht | *.tbt file.
If there already exists a file of the same name, ELAN will ask you whether or not it should overwrite the existing file, e.g.:
Open the exported file in Shoebox.
It contains the following information:
All tiers and annotations.
Each ELAN parent annotation (including all its referring annotations) corresponds to one Shoebox record. E.g., in the illustration below, the ELAN parent annotation “Ligya-001” corresponds to the Shoebox record “Ligya-001”.
The time code information for each parent annotation.
Each ELAN parent annotation (i.e., each Shoebox record) contains the additional field markers \ELANBegin and \ELANEnd (i.e., the begin and end time of the parent annotation).
This time code information allows you to import the Shoebox file back into ELAN, without having to manually re-align the file (see the section called “Shoebox file”).
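To see what this looks like from the outside, the following Python sketch reads the fields of a single exported record and picks out the two time markers. It treats the time values as opaque strings, because their exact format is not specified here; the record marker \ref, the sample time values and the function name are assumptions made for illustration.

```python
def parse_record(record_text):
    """Return a dict mapping field markers (without backslash) to their values."""
    fields = {}
    for line in record_text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and continuation text
        marker, _, value = line[1:].partition(" ")
        fields[marker] = value.strip()
    return fields


if __name__ == "__main__":
    # Hypothetical record; only \ELANBegin and \ELANEnd are named in the text above.
    sample = "\\ref Ligya-001\n\\ELANBegin 00:00:01.500\n\\ELANEnd 00:00:04.400\n"
    record = parse_record(sample)
    print(record["ELANBegin"], record["ELANEnd"])
```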
Similar to exporting a document to Shoebox (see the section called “Shoebox file”), ELAN data can be exported to a Toolbox document with UTF-8 encoding. This export provides more options for output customization.
To export a file into Toolbox, do the following:
Click on
menu.
Click on
The
dialog box appears:
Only the left part of ELAN tier names containing an @ is identified as the tier marker for Toolbox. These markers form a block in the exported file. The right part of the ELAN tier names is identified as the participant name. Participant names are exported with the marker ELANParticipant (see the figure below):
If you use a Shoebox *.typ file to specify the Toolbox database type, ELAN extracts the database type name from the first line of the type file (e.g. the database type name Text in \+DatabaseType Text) and puts it in the first line of the exported file (e.g. \_sh v3.0 400 Text).
When there is only one root tier (tier without a parent tier) in the transcription (e.g. ref) this will be used as the record marker by default. When there are multiple root tiers "\block" will be added as record marker. In both cases it is possible to specify a custom record marker instead.
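The two naming rules above (splitting tier names at the @ sign and choosing a default record marker) can be sketched as follows in Python. The function names and sample tier names are hypothetical, and the sketch only covers the default behaviour described in the text.

```python
def split_tier_name(tier_name):
    """Split 'marker@participant' into (marker, participant or None)."""
    marker, separator, participant = tier_name.partition("@")
    return marker, (participant if separator else None)


def default_record_marker(root_tiers, custom=None):
    """Pick the record marker: custom if given, the single root tier, else 'block'."""
    if custom:
        return custom
    if len(root_tiers) == 1:
        return split_tier_name(root_tiers[0])[0]
    return "block"


if __name__ == "__main__":
    print(split_tier_name("tx@SpeakerA"))
    print(default_record_marker(["ref"]))
    print(default_record_marker(["ref@A", "ref@B"]))
```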
Some options not touched upon in Figure 4.44, “Toolbox Export dialog window”:
By first selecting a tier (see the section called “How to select tiers”) and then selecting you insert a blank line after the selected marker every time the marker is printed in the exported file. The tier name is colored blue in the dialog box.
By selecting
you can let ELAN wrap a whole block if one of the lines in a block is longer than a specified number of characters (default is 80 characters). A block in this context refers to the markers that are part of the interlinearization.
When
is selected it is also possible to select . This applies to long marker lines that are not part of the interlinearization. There are two variants: when Wrap to next line is selected, the line is split into two or more lines that immediately follow each other, regardless of their position in the record. When Wrap to end of block is selected, everything beyond the first wrap is placed at the end of the record. Note that wrapped interlinearization blocks are grouped as much as possible.
When
is selected all markers will be printed in each record, whether there is content or not. When this option is not selected, a marker will not be printed in a record when it has no content.
By selecting
you can add to the annotation times the time offset from the master media that originated from the synchronization of media files (see the section called “Synchronizing video files”).
Make a choice and click on
to continue.
Specify the name and directory of the exported file.
Click
to export the file; otherwise click to exit the dialog box without exporting the file.
The file is exported as a *.txt | *.sht | *.tbt file.
If there already exists a file of the same name, ELAN will ask you whether or not it should overwrite the existing file.
Open the exported file in Toolbox.
It contains the following information:
All tiers and annotations.
Each ELAN parent annotation (including all its referring annotations) corresponds to one Toolbox record. E.g., in the illustration below, the ELAN parent annotation “CLLDCh3R02S01.001” corresponds to the Toolbox record “CLLDCh3R02S01.001”.
The time code information for each parent annotation.
Each ELAN parent annotation (i.e., each Toolbox record) contains the additional field markers \ELANBegin and \ELANEnd (i.e., the begin and end time of the parent annotation).
This time code information allows you to import the Toolbox file back into ELAN, without having to manually re-align the file (see the section called “Shoebox file”).
All documents can be exported into a tabular format for purposes of further analysis and/or printing. This includes documents that were created by ELAN itself (see the section called “Creating a new document” and the section called “Opening an existing document”) as well as documents that were imported into ELAN from Shoebox (see the section called “Shoebox file”). Do the following:
Click on
menu.
Click on
.
The
dialog window is displayed, e.g.:
Figure 4.46. Export as tab-delimited text dialog window
Select the tiers to be exported (see the section called “How to select tiers”).
Select to export a selected time interval only.
Add time offset from the master media to the annotation times.
Select to exclude the tier names from the output file.
Annotations sharing the same begin and end time are exported in the same row.
Select to include the description of the controlled vocabulary.
Select time information and format.
Add extra time format expressed in hours, minutes, seconds and frames.
By default, ELAN exports all annotations, but it is possible to restrict the export process to selected annotations. The following three options are available:
Export only those annotations that correspond to a selected time interval. Do the following:
In the ELAN window, select the desired time interval (see the section called “Making a selection on an independent tier”).
In the
dialog window, click in the box to the left of . A checkmark appears indicating that this option has been selected.
Export only those annotations that are contained on particular tiers. Do the following:
In the
dialog window, select those tiers that you want to export. A checkmark appears next to any selected tier.
Export only those annotations that (a) correspond to a particular time interval and (b) are contained on particular tiers. To do this, combine the two steps under (a) and (b) above.
By selecting
you can add to the annotation times the time offset from the master media that originated from the synchronization of media files (see the section called “Synchronizing video files”).
The option
gives each tier its own column in the export file. Annotations that have the same begin time and the same end time are exported to the same row, i.e. the same tab-delimited line.
If you check Repeat values of annotations spanning other annotations, the spanning annotation is put in each row containing an annotation it spans. The spanning annotation is not in a row by itself.
The option Only repeat within annotation hierarchies limits the previous option. An annotation is only repeated if it is on one of the ancestor tiers in the annotation hierarchy.
Select the time markers you want to export (begin time, end time and/or duration of every annotation unit).
Choose the time format (hh:mm:ss.ms, ss.msec, milliseconds and/or SMPTE time code)
If you choose the SMPTE (hh:mm:ss.ff) format, the selected video standard (PAL or NTSC) just indicates the way seconds and milliseconds are converted to frame numbers. This is independent of the actual video standard of the associated video(s).
Click
to start the export process; otherwise click to exit the dialog box without exporting the annotations.
Finally you will see a save dialog window. In the Encoding drop down box a text encoding can be selected (either iso-latin, UTF-8 or UTF-16). Make an appropriate choice and click on
.
Some Mac applications, like TextEdit, have difficulty loading UTF-8 encoded files. This is most noticeable for “special” characters, e.g. IPA. Using UTF-16 is recommended in that case.
A message appears to inform you that the file has been exported. The exported file has the extension *.txt.
The exported file contains the following information: participant, begin time of each annotation, end time, total length, content, and tier. It can be opened with any program that can handle tab-delimited texts, e.g., Microsoft Excel.
Some versions of Excel seem to have problems importing tab-separated files (white rectangles are shown instead of the column borders). As a workaround you can open the text file first in a text editor (e.g. Notepad) and copy and paste the content into Excel.
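The time formats offered in the export dialog can be related to each other with a short Python sketch. It converts a time in milliseconds to hh:mm:ss.ms, ss.msec and a frame-based time code. The frame-based variant assumes PAL at 25 frames per second and a colon before the frame field; NTSC would need a different frame rate and is not covered. The function names are hypothetical and the code is not ELAN's own conversion.

```python
def split_ms(ms):
    """Split milliseconds into (hours, minutes, seconds, milliseconds)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return hours, minutes, seconds, millis


def to_hhmmss_ms(ms):
    h, m, s, millis = split_ms(ms)
    return f"{h:02d}:{m:02d}:{s:02d}.{millis:03d}"


def to_ss_msec(ms):
    return f"{ms // 1000}.{ms % 1000:03d}"


def to_smpte_pal(ms, fps=25):
    # Assumption: the millisecond remainder is truncated to a whole frame number.
    h, m, s, millis = split_ms(ms)
    frame = millis * fps // 1000
    return f"{h:02d}:{m:02d}:{s:02d}:{frame:02d}"


if __name__ == "__main__":
    for value in (4600, 3_723_004):
        print(value, to_hhmmss_ms(value), to_ss_msec(value), to_smpte_pal(value))
```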
If your ELAN annotations contain syntactic elements, it is possible to export these to Synpathy[4] (see http://www.lat-mpi.eu/tools/synpathy/). This function is available via
First select out of the candidate tiers the one you want to be exported. Afterwards, map the tiers onto the correct description ("word" or "pos"). Finally enter the name of the file (*.tig).
Choosing
will give you the following screen:
Fill in the necessary fields.
Chat labels must be preceded by * (for root tiers) or % (for dependent tiers). While root tiers have to contain exactly 3 characters, dependent tier names can have up to 7 characters.
Click on
Fill in a chat file name and choose
In some situations a straightforward list of the annotation units, one after another, can be handy. For that purpose an export option to a “traditional transcript text” has been added to ELAN. In its simplest form it will just create a text file containing the successive annotations of several tiers, in chronological order. This feature can be found under
.
As can be seen in the figure, one of the options enables you to include silences with a minimal duration. In the figure there is a silence of 0.2 seconds between 'yeah' on the tier K-Spch and 'and the you go the other ...' on the tier W-Spch. The first annotation ends at 00:00:04.400 and the second begins at 00:00:04.600, resulting in a silence of 0.2 seconds. If this silence were shorter than the minimal silence duration entered in the export dialog window (20 ms in the figure), the silence would not be included in the exported file.
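The silence rule can be made concrete with a small Python sketch. It merges annotations from several tiers in chronological order and inserts a silence entry whenever the gap to the previous annotation is at least the minimal silence duration. The output layout, the function name and the begin and end times not mentioned above (4000 and 6000 ms) are assumptions for illustration.

```python
def transcript_with_silences(annotations, min_silence_ms):
    """annotations: list of (tier, begin_ms, end_ms, text). Returns transcript lines."""
    lines = []
    previous_end = None
    for tier, begin, end, text in sorted(annotations, key=lambda a: a[1]):
        if previous_end is not None:
            gap = begin - previous_end
            if gap >= min_silence_ms:
                lines.append(f"({gap / 1000:.1f})")  # silence in seconds
        lines.append(f"{tier}: {text}")
        # Keep the latest end time seen so far, so overlaps do not create gaps.
        previous_end = end if previous_end is None else max(previous_end, end)
    return lines


if __name__ == "__main__":
    annotations = [
        ("K-Spch", 4000, 4400, "yeah"),
        ("W-Spch", 4600, 6000, "and the you go the other ..."),
    ]
    for line in transcript_with_silences(annotations, min_silence_ms=20):
        print(line)
```

With a minimal silence duration of 20 ms the 200 ms gap is reported; with a larger threshold, such as 500 ms, it would be skipped.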
When you wish to work with your annotations in Praat, ELAN enables you to export your annotations to a Praat TextGrid. To do this, click
. In the dialog window that appears you can select the tiers you wish to export (see the section called “How to select tiers”) and specify whether you want to restrict the output to the selected interval.
After clicking
, you can enter a filename and select an encoding. In addition to TextGrid files in the default encoding for the operating system, ELAN supports Praat TextGrid files with UTF-8 and UTF-16 encoding. Finally click on .
Sometimes it can be very useful to have an alphabetical list of (unique) words from one or more tiers. ELAN offers a way to generate such lists. Go to
and select the tiers (see the section called “How to select tiers”) from which you want to extract the words. The annotations of the selected tiers will be tokenized (split into words) using either a default set of delimiters or a user definable set. Check Count occurrences if you want the list to include the number of occurrences for each token. After selecting tiers (or better, deselecting unwanted tiers) you can click OK and choose a filename. Clicking will save the word list.
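The tokenizing and counting step can be pictured with the following Python sketch. The delimiter set shown here is an assumption (ELAN's default set is not listed above), as are the function name and the sample annotations.

```python
import re
from collections import Counter

# Assumed delimiter set for illustration; not ELAN's documented default.
DEFAULT_DELIMITERS = " \t\n.,!?;:"


def word_list(annotations, delimiters=DEFAULT_DELIMITERS, count=False):
    """Return the sorted unique tokens, optionally with their occurrence counts."""
    pattern = "[" + re.escape(delimiters) + "]+"
    tokens = [
        token
        for text in annotations
        for token in re.split(pattern, text)
        if token
    ]
    counts = Counter(tokens)
    words = sorted(counts)
    if count:
        return [(word, counts[word]) for word in words]
    return words


if __name__ == "__main__":
    sample = ["so from here", "yeah so"]
    print(word_list(sample))
    print(word_list(sample, count=True))
```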
When a command line tool for extracting clips from video files is installed, ELAN is able to use that tool. At this moment only M2-edit-cl[5] from Mediaware Solutions is supported. If the edit tool is in the user path and a selection is made, there is a menu item to export a video clip of the current selection for each linked video. In that case, follow these steps:
Select the part of the video(s) you want to export as (a) clip(s)
Choose
Enter a filename and press
ELAN now supports any command line tool to extract clips from the video file. ELAN uses a script file named "clip-media.txt", which can be found in the folder where ELAN is installed. To clip a video file, the script first has to be modified according to the specifications of the command line tool used.
For example, the syntax for M2-edit-cl is: M2-edit-cl/ in:$begin(fr) / out:$end(fr) $in_file $out_file
M2-edit-cl : the path of the application
in:$begin(fr) : the begin time of the clip, in frames
out:$end(fr) : the end time of the clip, in frames
$in_file : the input file
$out_file : the output file
A few examples for command line tools are:
M2-edit-cl (Windows): M2-edit-cl/ in:$begin(fr) / out:$end(fr) $in_file $out_file
ffmpegX (Mac and Windows): /Applications/ffmpegX/ffmpeg -sameg -ss $begin(sec.ms) -t -$duration(sec.ms) -i $in_file $out_file
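To make the role of the placeholders concrete, the following Python sketch fills a command template of the kind shown above with values for one selection. It is not ELAN's own implementation: the function name `fill_clip_command`, the frame rate of 25 fps and the three-decimal format used for `sec.ms` values are assumptions made for this example, and file names containing spaces would still need quoting appropriate to your shell.

```python
def fill_clip_command(template, in_file, out_file, begin_ms, end_ms, fps=25.0):
    """Replace the clip placeholders in a command template string.

    begin_ms and end_ms are the selection boundaries in milliseconds.
    fps is an assumed frame rate, used only for the (fr) placeholders.
    """
    begin_s = begin_ms / 1000.0
    end_s = end_ms / 1000.0
    values = {
        "$begin(fr)": str(round(begin_s * fps)),
        "$end(fr)": str(round(end_s * fps)),
        # Assumed format: seconds with millisecond precision.
        "$begin(sec.ms)": f"{begin_s:.3f}",
        "$duration(sec.ms)": f"{end_s - begin_s:.3f}",
        "$in_file": in_file,
        "$out_file": out_file,
    }
    command = template
    for placeholder, value in values.items():
        command = command.replace(placeholder, value)
    return command


if __name__ == "__main__":
    template = "M2-edit-cl/ in:$begin(fr) / out:$end(fr) $in_file $out_file"
    print(fill_clip_command(template, "in.mpg", "out.mpg", 2000, 5000))
```

For a selection from 2 to 5 seconds at the assumed 25 fps, the frame placeholders become 50 and 125.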
The syntax depends on the command line tool you are using. Look in the script file for more explanation and examples. To clip the video file, first make a selection and then choose the clip export menu item. The exported clip will be saved in the same location as the video file. The name of the exported clip consists of the original video file name along with the start and end time of the selection.
ELAN supports export to SMIL[6]-compliant clips. With a suitable player this enables you to view media files and the associated annotations as a subtitled movie.
Select the SMIL export item in the menu. This will bring up a dialog box.
Select the tiers you want to export (see the section called “How to select tiers”).
Check the corresponding option if you only want to export the current selection. Otherwise the whole media file and associated annotations will be exported.
Check the corresponding option if you want the times of the current selection to be recalculated to start from zero.
Check the minimal duration option to specify the minimal display duration of a subtitle. For instance, if an annotation is only 0.3 seconds long, but you want to display a subtitle for at least 0.5 seconds, enter 500 (ms).
Click on the button that opens the subtitle text settings dialog.
Click on the respective button and select a color from the dialog displayed to set the background color and text color of the subtitle text.
To set the font of the text, click on the respective button.
Font size and alignment of the subtitle text can be selected from their respective lists.
Click the button for default settings to restore the defaults.
Click the button for applying the settings to apply the new settings.
Confirm the export dialog to export the clip.
Click on the suggested filename to change the location where the SMIL clip will be saved.
Exporting SMIL for QuickTime is very similar to exporting SMIL for Real Player (see the section called “Export SMIL for Real Player”). To export SMIL for QuickTime, choose the corresponding export item in the menu. This will bring up a dialog box very similar to the one for Real Player. The only extra option, which is not available for Real Player, controls merging: if selected, all tiers are merged into one file; if not selected, a separate text file will be generated for each tier. It is also possible to set a transparent background for the subtitles. This is done by selecting Transparent background in the dialog (see Figure 4.51, “Change subtitle text settings”) that pops up when you click the subtitle text settings button. Finally, confirm the dialog to export.
Another format you can export to from ELAN is QuickTime subtitle Text. To do this, choose the corresponding export item in the menu. Select the tiers (see the section called “How to select tiers”) you want to be included in the subtitles. Optionally specify the following options:
Selection option: restrict the subtitles to the current selection.
Recalculation option: recalculate the time of the current selection to start from zero.
Minimal duration option: specify the minimal display duration of a subtitle. For instance, if an annotation is only 0.3 seconds long, but you want to display a subtitle for at least 0.5 seconds, enter 500 (ms).
Merging option: if not selected, a separate text file will be generated for each tier.
Finally, confirm the dialog to export.
Besides the QuickTime subtitle Text (see the section called “QuickTime Text”) there is another subtitle format ELAN
can export annotations to: SubRip, with file extension .srt. Open the SubRip export dialog and select the tiers (see the section called “How to select tiers”) you want to include in the subtitle file. Specify whether the subtitles should be restricted to annotations in the selected time interval, whether the time of the selected interval should be recalculated from zero and whether the master media time offset should be added to the annotation times. A further option lets you specify the minimal display duration of a subtitle. For instance, if an annotation is only 0.3 seconds long, but you want to display a subtitle for at least 0.5 seconds, enter 500 (ms).
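How these options interact can be sketched in Python. The sketch below is an illustration, not ELAN's export code: the function names, the tuple layout of the annotations and the rule that only annotations lying entirely inside the selection are kept are assumptions of this example. It also does not resolve overlaps that may arise when a short subtitle is extended to the minimal duration.

```python
def srt_timestamp(ms):
    """Format a time in milliseconds as a SubRip timestamp (HH:MM:SS,mmm)."""
    hours, rest = divmod(ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(annotations, selection=None, from_zero=False,
           offset_ms=0, min_duration_ms=0):
    """Build SubRip text from (begin_ms, end_ms, text) tuples.

    selection: optional (begin_ms, end_ms) interval to restrict the output to.
    from_zero: shift times so that the selection starts at zero.
    offset_ms: media time offset added to every time value.
    min_duration_ms: minimal display duration of each subtitle.
    """
    blocks = []
    for begin, end, text in annotations:
        if selection is not None:
            sel_begin, sel_end = selection
            # Assumption: keep only annotations entirely inside the selection.
            if begin < sel_begin or end > sel_end:
                continue
            if from_zero:
                begin -= sel_begin
                end -= sel_begin
        begin += offset_ms
        end += offset_ms
        # Extend subtitles that are shorter than the minimal duration.
        end = max(end, begin + min_duration_ms)
        index = len(blocks) + 1
        blocks.append(
            f"{index}\n{srt_timestamp(begin)} --> {srt_timestamp(end)}\n{text}"
        )
    return "\n\n".join(blocks) + "\n" if blocks else ""


if __name__ == "__main__":
    tier = [(1000, 1300, "hi"), (2000, 3000, "there")]
    print(to_srt(tier, min_duration_ms=500))
```

With a minimal duration of 500 ms, the 0.3 second annotation in the usage example is displayed from 00:00:01,000 to 00:00:01,500, matching the example given in the text.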
After you have selected tiers and specified the options, confirm the dialog. Enter a filename in the next window and save the file.
To export ELAN’s document view (i.e. to make a screenshot):
Choose the screenshot export item in the menu.
Enter a filename and an extension (*.jpg, *.jpeg, *.png or *.bmp).
Confirm to save the image.
If you are using Windows, it sometimes happens that ELAN’s video window is black in the picture created using this function. This can be solved by temporarily disabling the hardware video acceleration:
Right-click on the desktop
Choose Properties
Select the Settings tab
Click on the Advanced… button
Select the Troubleshooting tab
Move the Hardware Acceleration slider to None
Don’t forget to re-enable the hardware acceleration afterwards, because this has a strong effect on the system’s graphical performance.
This function is very similar to ELAN’s printing system. Therefore more information can be found in the section called “Previewing the printed pages”. The main difference is that the width of the exported text depends in this case on the number of characters that fit on one line.
After selecting an appropriate layout, click on Save as and choose a location and file name. These files can afterwards easily be edited with any text editor (preferably using a fixed-width font). Optionally tick the corresponding box if you prefer to have the whitespace between annotations filled with tabs instead of spaces (especially useful when importing a text file into Word). If that option is selected, you can also choose to have a single tab instead of multiple whitespace characters by ticking the corresponding box.
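The difference between space-filled and tab-separated output can be illustrated with a small Python sketch. It is a simplified model, not ELAN's layout algorithm: the function name `interlinear_lines` and the representation of tiers as rows of equally many cells are assumptions of this example. Padding is based on character counts, which is why the space-filled variant only lines up in a fixed-width font.

```python
def interlinear_lines(rows, use_tabs=False):
    """Lay out rows of annotation cells as aligned text lines.

    rows: list of lists of strings; cell i of every row belongs to column i.
    use_tabs: separate cells with a single tab instead of padding with spaces.
    """
    if use_tabs:
        return ["\t".join(row) for row in rows]
    n_cols = max(len(row) for row in rows)
    # Column width is the longest cell in that column, counted in characters.
    widths = [
        max(len(row[i]) for row in rows if i < len(row))
        for i in range(n_cols)
    ]
    lines = []
    for row in rows:
        cells = [cell.ljust(widths[i]) for i, cell in enumerate(row)]
        lines.append(" ".join(cells).rstrip())
    return lines


if __name__ == "__main__":
    tiers = [["ik", "loop"], ["I", "walk"]]
    print("\n".join(interlinear_lines(tiers)))
    print("\n".join(interlinear_lines(tiers, use_tabs=True)))
```

The tab-separated variant is the one that converts cleanly into table columns when the text file is imported into a word processor.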
Similarly to the export to interlinear text (see the section called “Interlinear text file”), you can also export annotations to an HTML file through the menu.
The only extra option for the HTML export is:
Play media: check this option to play the media file in the exported HTML file.
Playing the media requires HTML5. The exported HTML file must be placed in the same location as the media file in order to play the media from the HTML export.
To export a filmstrip image, first select the time segment you want the filmstrip of. Then choose the filmstrip export item in the menu. In the dialog window (see Figure 4.56, “Exporting to a filmstrip image”) you can define the width of each video frame, which frames to include and whether ELAN must add a time code to each frame. Moreover, ELAN can add the waveform, with or without a ruler, and you can specify its height. You can also specify whether the stereo channels should be displayed separately, merged or blended. Confirm the dialog to generate the image. Finally select a destination folder, enter a filename and save the file.
An example of an exported filmstrip image can be seen in Figure 4.55, “An exported filmstrip image”.
Tiers for the recognizers are exported in the AVATech tier format. For more information on the AVATech tier format see http://www.mpi.nl/avatech
Select the tier export item in the menu. This will bring up a dialog box.
Check the corresponding option to show only the top level tiers.
Select the tiers you want to export. Keep CTRL pressed and click to select multiple tiers; press Shift and click to select multiple successive tiers.
Check the corresponding option if you want to export only the current selection. Otherwise the whole media file and associated annotations will be exported.
Confirm to export the tiers and enter a filename for the file to which the tiers are exported.
[3] From here on, every appearance of Shoebox can also be read as Toolbox, i.e. the newer version of what was formerly known as Shoebox.
[4] Synpathy is a tool for annotating, analyzing, and graphically editing the syntactic structure of sentences (e.g. linguistically annotated text corpora), developed at the Max Planck Institute for Psycholinguistics. The application is based on the SyntaxViewer from the TIGER search project developed by the IMS (Institut für Maschinelle Sprachverarbeitung, University of Stuttgart).
[6] For a description of this standard and players see http://www.w3.org/AudioVideo/