European Language Resources Association Catalog (ELRA)

Last update: 25-Sep-2000

 

Introduction

ELRA (European Language Resources Association) is an organization that promotes the creation, verification, and distribution of language resources in Europe. Its catalog covers a wide range of corpora, including speech, written, and terminology corpora.

 

References

Information for this overview comes from the "written corpus" description form found in the European Language Resources Association Catalog.

 

Catalog Structure

The catalog is an access structure layered on top of the corpora: its metadata describes the corpora it lists. Corpora are divided into categories according to the type of data they contain (speech, written, terminology), and each category contains a set of corpora.
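
As a rough illustration of this structure, the sketch below models the catalog as a mapping from category to corpus entries. The class names, attribute names, and sample entries are hypothetical; they are not taken from the ELRA catalog itself, which is only described here in prose.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# The three categories named in the text above.
CATEGORIES = ("speech", "written", "terminology")


@dataclass
class CorpusEntry:
    """One corpus in the catalog, described by its metadata (hypothetical shape)."""
    short_name: str
    metadata: dict = field(default_factory=dict)


class Catalog:
    """Access structure: corpora grouped by the type of data they contain."""

    def __init__(self):
        self._by_category = defaultdict(list)

    def add(self, category, entry):
        # Reject categories the catalog does not define.
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        self._by_category[category].append(entry)

    def corpora(self, category):
        """Return a copy of the list of corpora filed under one category."""
        return list(self._by_category.get(category, []))


# Made-up sample entries, for illustration only.
catalog = Catalog()
catalog.add("written", CorpusEntry("EXAMPLE-W1", {"Language(s)": ["French"]}))
catalog.add("speech", CorpusEntry("EXAMPLE-S1"))
print([c.short_name for c in catalog.corpora("written")])  # prints ['EXAMPLE-W1']
```

Keeping the metadata in a plain dictionary per entry reflects the point that the catalog stores descriptions of corpora, not the corpora themselves.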

 

Metadata Overview

Producer / Provider  
Organisation
Department
Representative
Representative Position
Contact Person
Contact Person Position
Address
Postal Code
City
Country
Telephone
Fax
E-mail
Copyright Holder  
Organisation
Department
Representative
Representative Position
Contact Person
Contact Person Position
Address
Postal Code
City
Country
Telephone
Fax
E-mail
General Information  
Full name of data collection
Short name of data collection
Type of resource: Monolingual, Multilingual, Parallel, Other (the types offered on the description form)
Source
Creation date
Update frequency
Last update
Document information (?)  
Language(s)
Details of the source
Domain(s)
Size (in words, sentences, etc.)
Time span
Description of the corpus  
ELRA Ref.
Specific Information  
Linguistic Annotation: type of annotation (Phonemic, Orthographic, Morphological, Syntactical, Semantic, Other)
Text level of annotation (tagging)
Description of the tagging system
Technical Information  
File format: Text, Word for PC, Word for Mac, Other
Standard in use: ISO, SGML, TEI, Other
Character set: ISO 8859-1, 7-bit ASCII, 8-bit ASCII, UNICODE, Other
Distribution media: CD-ROM, Floppy disk, Cartridge, Other
Additional Information  
Documentation
On-line documentation (www, ftp)
Related tools
Application purposes
Availability  
Date of availability
Price for research use
Price for commercial use
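
To make the field list more concrete, here is a minimal sketch of how a small subset of these fields might be held in a record and checked against the value lists given on the form. The class name, attribute names, defaults, and sample values are assumptions made for illustration; only the permitted values are copied from the field list above.

```python
from dataclasses import dataclass, field

# Value lists copied from the description form fields above.
RESOURCE_TYPES = {"Monolingual", "Multilingual", "Parallel", "Other"}
CHARACTER_SETS = {"ISO 8859-1", "7-bit ASCII", "8-bit ASCII", "UNICODE", "Other"}
DISTRIBUTION_MEDIA = {"CD-ROM", "Floppy disk", "Cartridge", "Other"}


@dataclass
class WrittenCorpusRecord:
    """Subset of the form's fields; attribute names are illustrative only."""
    full_name: str
    short_name: str
    resource_type: str
    languages: list = field(default_factory=list)
    character_set: str = "Other"
    distribution_media: str = "Other"

    def problems(self):
        """Return the names of fields whose values are not on the form's lists."""
        issues = []
        if self.resource_type not in RESOURCE_TYPES:
            issues.append("resource_type")
        if self.character_set not in CHARACTER_SETS:
            issues.append("character_set")
        if self.distribution_media not in DISTRIBUTION_MEDIA:
            issues.append("distribution_media")
        return issues


# Made-up sample record, for illustration only.
record = WrittenCorpusRecord(
    full_name="Example Written Corpus",
    short_name="EWC",
    resource_type="Parallel",
    languages=["English", "German"],
    character_set="Latin-1",  # deliberately not one of the listed values
)
print(record.problems())  # prints ['character_set']
```

Because every value list on the form ends in "Other", a check like this can only flag spellings that differ from the form, not resources the form cannot describe.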