The effect of sampling on estimates of lexical specificity and error rates
Studies based on naturalistic data are a core tool in the field of language acquisition research and have provided thorough descriptions of children's speech. However, these descriptions are inevitably confounded by differences in the relative frequency with which children use words and language structures. The purpose of the present work was to investigate the impact of sampling constraints on estimates of the productivity of children's utterances, and on the validity of error rates. Comparisons were made between five samples of different sizes, drawn from wh-question data produced by one child aged 2;8. First, we assessed whether sampling constraints undermined the claim (e.g. Tomasello, 2000) that the restricted nature of early child speech reflects a lack of adultlike grammatical knowledge. We demonstrated that small samples were as likely to underestimate as to overestimate lexical specificity in children's speech, and that the reliability of estimates varied according to sample size. We argued that reliable analyses require a comparison with a control sample, such as that from an adult speaker. Second, we investigated the validity of estimates of error rates based on small samples. The results showed that overall error rates underestimated the incidence of error in some rarely produced parts of the system and that analyses of small samples were likely to substantially over- or underestimate error rates in infrequently produced constructions. We concluded that caution must be used when basing arguments about the scope and nature of errors in children's early multi-word productions on analyses of samples of spontaneous speech.
Publication type: Journal article