Browsable Corpus (BC)


Last update: 30-Aug-2000

 

Introduction

The Browsable Corpus concept was introduced at the Max Planck Institute for Psycholinguistics (MPI) to make resource discovery easier by defining meta-descriptions for language resources. These meta-descriptions are linked into a structure that can be browsed and searched.

 

References

A Browsable Corpus: accessing linguistic resources the easy way (Broeder, Brugman, Russel & Wittenburg). 

 

Corpus Structure

 

Metadata Overview

Corpus Groups corpora and/or sessions together
   Name Name of the grouping
   Level Not used anymore
   Description Description of the grouping
  0..N FDescription Provides for (legacy) HTML information files
  0..N Infofile Reference to a file/URL providing relevant information (usually legacy metadata) about the current grouping
   Project_Controller Responsible for the corpus data
     Name Name of the entity responsible for the corpus data
     Description Description of the responsible entity
     Infofile Reference to a file/URL providing information about the responsible entity
   Content Describes important aspects about the content of the corpus
     Keywords Keywords about the content
     Languages Groups the languages used in the conversation
       1..N Language The language used in the conversation
         Description Describes specifics about the language used in the conversation
         Infofile Reference to a file/URL providing information about the language used in the conversation
     Description Description of the content of the corpus
     0..N Infofile Reference to a file/URL providing information about the contents of the corpus
   Participants Groups together information about the persons identified in the corpus
     1..N Person Gives information about an identified person
       Name Name of the person
       Fullname Person's full name
       Sex Person's gender
       Role Person's role in the conversation
       Code Person's unique identifier
       Age Person's age
       Born Person's date of birth
       Relation ?
     Keys Contains a set of attribute-value pairs
       0..N Key Attribute-value pair
         Name Contains the attribute label name
         Value Contains the attribute's value
     Description Description of the participants
     0..N Infofile Reference to a file/URL providing information about the participants
   Keys Contains a set of attribute-value pairs
     0..N Key Attribute-value pair
       Name Contains the attribute label name
       Value Contains the attribute's value
Session ?
   Name **** see corpus ****
   Level **** see corpus ****
   Date **** see corpus ****
   Description **** see corpus ****
  0..N FDescription **** see corpus ****
   Access Contains the legal access rights to the recording
     Date Date on which the rights were established
  0..N Infofile **** see corpus ****
   Project_Controller **** see corpus ****
   Content **** see corpus ****
   Participants **** see corpus ****
  0..N Tape Contains a possible reference to the original media tape
     Description Description of the media tape
     ID Gives a unique identifier to locate the tape in the archive
     Position Indicates the position on the tape where the session resides
     Format Describes the media format on which the recording is made
   Keys **** see corpus ****
   Files Groups together all the files of the session
     0..N Transcription-File Groups information about a transcription file
       Remark Description of the transcription file
       Src Points to the location of the transcription file
       Format Indicates how the file should be interpreted
     0..N Label_File Gives a sequence of markers which relate to the media file fragment
       Remark Description of the label file
       Src Points to the location of the label file
       Format Indicates how the file should be interpreted
       Start Gives information about the start position of the media file fragment
       Duration Gives information about the duration of the media file fragment
     0..N Media-File Groups information about a media file
       Remark Description of the media file
       Src Points to the location of the media file
       Format Gives the format of the media file
       Start Indicates when the recording was started
       Duration Gives information about the duration of the recording
       Audio-Quality Gives information about the audio quality of the recording
       Video-Quality Gives information about the video quality of the recording
     0..N Infofile **** see corpus ****
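
To make the nesting in the overview above more concrete, the following is a minimal Python sketch of a small subset of the elements (Corpus, Session, Key, Media-File), with a helper that browses the linked structure and searches sessions by a Key. It is an illustration only, not the MPI implementation: the class and function names, the `corpora` and `sessions` link fields, and the sample data are all assumptions introduced here. Elements marked 0..N in the overview are modelled as lists.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Key:
    """One attribute-value pair (the Key element)."""
    name: str
    value: str


@dataclass
class MediaFile:
    """Subset of the Media-File element."""
    src: str
    format: str = ""
    remark: str = ""


@dataclass
class Session:
    """Subset of the Session element; 0..N children become lists."""
    name: str
    description: str = ""
    keys: List[Key] = field(default_factory=list)
    media_files: List[MediaFile] = field(default_factory=list)


@dataclass
class Corpus:
    """A Corpus groups corpora and/or sessions together.

    The `corpora` and `sessions` fields are assumed names for the links
    between meta-descriptions; the overview does not name them.
    """
    name: str
    description: str = ""
    keys: List[Key] = field(default_factory=list)
    corpora: List["Corpus"] = field(default_factory=list)
    sessions: List[Session] = field(default_factory=list)


def iter_sessions(
    corpus: Corpus, path: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], Session]]:
    """Yield (path, session) for every session below `corpus`.

    Sessions attached directly to a corpus come before those of its
    sub-corpora.
    """
    here = path + (corpus.name,)
    for session in corpus.sessions:
        yield here + (session.name,), session
    for sub in corpus.corpora:
        yield from iter_sessions(sub, here)


def find_sessions(corpus: Corpus, key_name: str, key_value: str) -> List[str]:
    """Return slash-joined paths of sessions carrying the given Key."""
    return [
        "/".join(path)
        for path, session in iter_sessions(corpus)
        if any(k.name == key_name and k.value == key_value for k in session.keys)
    ]


# Hypothetical sample data, invented for this sketch.
root = Corpus(
    name="Demo",
    corpora=[
        Corpus(
            name="Sub",
            sessions=[
                Session("s1", keys=[Key("genre", "interview")]),
                Session("s2", keys=[Key("genre", "narrative")]),
            ],
        )
    ],
    sessions=[Session("s0", keys=[Key("genre", "interview")])],
)

print(find_sessions(root, "genre", "interview"))
# prints ['Demo/s0', 'Demo/Sub/s1']
```

Keeping Keys as free attribute-value pairs, as the overview does, means a search like `find_sessions` needs no knowledge of which attributes a given project chose to record.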