Codes for the Human Analysis of Transcripts (CHAT)
Last update: 30-Aug-2000
CHAT is the format used for the CHILDES (Child Language Data Exchange System) project.
MacWhinney, Brian. 1991. The CHILDES Project: Tools for Analyzing Talk.
The corpus header is called the 'Documentation File' in CHAT. It is stored as a text file (00readme.doc) in the corpus directory and contains human-readable descriptions of the corpus. The metadata elements that can be extracted from these descriptions are listed under 'Corpus Header' in the metadata overview.
Each document corresponds to one file in the corpus directory.
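The corpus layout described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_corpus` is hypothetical, and it assumes that every regular file in the corpus directory other than 00readme.doc is a document and that sub-directories can be skipped.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Name of the documentation file, as given in the text above.
DOC_FILE_NAME = "00readme.doc"


def split_corpus(corpus_dir: str) -> Tuple[Optional[Path], List[Path]]:
    """Return (documentation file or None, sorted list of document files)."""
    doc_file = None
    documents = []
    for path in Path(corpus_dir).iterdir():
        if not path.is_file():
            continue  # assumption: sub-directories are not documents
        if path.name.lower() == DOC_FILE_NAME:
            doc_file = path
        else:
            documents.append(path)
    return doc_file, sorted(documents)
```

Calling `split_corpus` on a corpus directory returns the documentation file separately from the transcript files, which mirrors the distinction between corpus information and document information.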
There are three types of document headers in CHAT: obligatory, constant and changeable.
Corpus Header | A basic set of facts about the corpus | ||
Acknowledgements | A statement that asks the user to cite some particular reference when using the corpus | ||
Reference Name | The name of the person cited | ||
Reference Year | The year of the cited reference | ||
Restrictions | A description of the restrictions on the use of the corpus data | ||
Warnings | A description of the limitations on the use of the corpus data | ||
Pseudonyms | ? | ||
History | Gives detailed information about the history of the project | ||
Funding | Description of how the funding was obtained | ||
Goals | Description of the goals of the project | ||
Data collection | Description of how the data was collected | ||
Sampling procedure | Description of the sampling procedure | ||
Transcription procedure | Description of the transcription procedure | ||
Transcription ignored | Description of what was ignored in the transcription | ||
Transcribers training | Description of the transcribers' training | ||
Reliability | Description of the reliability of the data | ||
Coding | Description of the coding and the codes used ???? | ||
Computerized | Description of how the material was computerized ???? | ||
Codes | Description of project-specific codes | ||
Biographical data | Gives biographical information about the informant | ||
Informant's Age | ? | ||
Informant's Gender | ? | ||
Informant's Siblings | ? | ||
Informant's Schooling | ? | ||
Informant's Social Class | ? | ||
Informant's Occupation | ? | ||
Informant's Previous residences | ? | ||
Informant's religion | ? | ||
Informant's interest | ? | ||
Informant's friends | ? | ||
Table of contents | Gives a brief index to the contents of the corpora ? | ||
Situational description | Gives general situational descriptions ? | ||
Obligatory header | This header must be included for use with CLAN programs | ||
Participants | Lists all the 'actors' within the file | ||
1..N |
Speaker's ID | Each participant is represented by a unique three-letter ID, usually the first three letters of the speaker's name | |
Speaker's Name | The speaker's first name | ||
Speaker's Role | The speaker's relationship to the children under study. Standard roles: Target_Child, Mother, Father, Brother, Sister, Teacher, Playmate and Investigator | ||
Constant header | Contains information that is constant throughout the file. The information is unlikely to change during the course of the recording session | ||
Age | Specifies the speaker's age in years, months and days. | ||
1..N |
Speaker's ID | The unique speaker's ID which refers to the name and role of a participant | |
Speaker's Age | Age in years, months and days | ||
Birth | Gives the date of birth of the speaker | ||
1..N |
Speaker's ID | The unique speaker's ID which refers to the name and role of a participant | |
Speaker's Date of birth | Date of birth | ||
Coding | Indicates the date of the current version of CHAT. Used for updating files and new coding conventions | ||
Coder | Identifies the people who transcribed and coded the file | ||
Education | Specifies the speaker's highest grade in school | ||
1..N |
Speaker's ID | The unique speaker's ID which refers to the name and role of a participant | |
Speaker's Education | Identifies the speaker's education or years of college | ||
Filename | Gives the name of the computer file | ||
ID | Used by the program "STATFREQ" to assign a unique code to each child | ||
1..N |
Speaker's ID | The unique speaker's ID which refers to the name and role of a participant | |
Unique code | A unique code to identify the speaker throughout a corpus | ||
SES | Describes the socioeconomic status of the child's family | ||
1..N |
Speaker's ID | The unique speaker's ID which refers to the name and role of a participant | |
Speaker's SES | The speaker's socioeconomic status. The following adjectives are recommended: welfare, lower, working, lower-middle, middle, upper-middle, upper | ||
Sex | Indicates the speaker's gender | ||
1..N |
Speaker's ID | The unique speaker's ID which refers to the name and role of a participant | |
Speaker's Sex | Gender of the speaker (male or female) | ||
Warning | Describes user warnings about certain defects or peculiarities in the collection | ||
Changeable header | Contains information that can change within the file | ||
Activities | Describes the activities involved in a situation | ||
Bgd | Describes backgrounding material (????) | ||
Comment | Used for all-purpose comments | ||
Date | Indicates the date of interaction | ||
Language | Specifies the language used for the material that follows | ||
Location | Indicates the city, state and country in which the interaction took place | ||
New Episode | Indicates the end of one episode and the beginning of another | ||
Room Layout | A description of the room and its contents | ||
Situation | Describes the general setting of the interaction | ||
Stim | Indicates a particular stimulus used in an elicited production task | ||
Tape Location | Indicates the specific tape from which the transcription was made | ||
Tape ID | Gives the tape identifier | ||
Tape Side | Gives the side of the tape (a or b) | ||
Tape footage | Gives the tape footage | ||
Time Duration | Indicates the time at which the audiotaping began and the time that passed during the course of the taping | ||
Time Start | Gives the time at which the recording began | ||
Time End | Gives the time at which the recording ended | ||
Time Start | Used to "restart" the clock |
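
To show how the Participants and Age entries in the overview might be read by a program, here is a minimal Python sketch. The concrete notation is an assumption and is not specified in the overview above: it assumes a Participants value is a comma-separated list of "ID Name Role" entries (with the name optional), and that an age is written as years;months.days, for example 2;3.4. The names `Participant`, `parse_participants` and `parse_age` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Participant:
    speaker_id: str        # unique three-letter ID, e.g. "CHI"
    name: Optional[str]    # speaker's first name, if given
    role: str              # e.g. "Target_Child", "Mother"


def parse_participants(value: str) -> List[Participant]:
    """Parse a Participants value (assumed comma-separated entries)."""
    participants = []
    for entry in value.split(","):
        fields = entry.split()
        if not fields:
            continue  # skip empty entries such as a trailing comma
        if len(fields) == 2:
            speaker_id, role = fields
            name = None
        elif len(fields) == 3:
            speaker_id, name, role = fields
        else:
            raise ValueError(f"unexpected participant entry: {entry!r}")
        participants.append(Participant(speaker_id, name, role))
    return participants


def parse_age(value: str) -> Tuple[int, int, int]:
    """Parse an age written as years;months.days (assumed notation)."""
    years, _, rest = value.strip().partition(";")
    months, _, days = rest.partition(".")
    # Missing months or days are treated as zero.
    return int(years), int(months or 0), int(days or 0)


# Example values are invented for illustration only.
people = parse_participants("CHI Adam Target_Child, MOT Mother")
age = parse_age("2;3.4")
```

Keeping the speaker's ID as the key makes it straightforward to join the per-speaker entries (Age, Birth, Education, SES, Sex) back to the name and role declared under Participants.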