A comparison of lexeme and speech syllables in Dutch
Schiller, N. O., Meyer, A. S., Baayen, R. H., & Levelt, W. J. M.
A comparison of lexeme and speech syllables in Dutch. Journal of Quantitative Linguistics, 3
The CELEX lexical database includes a list of Dutch syllables and their frequencies, based on syllabification of isolated word forms. In connected speech, however, sentence-level phonological rules can modify the syllables and their token frequencies. To estimate the changes syllables may undergo in connected speech, an empirical investigation was carried out. A large Dutch text corpus (TROUW) was transcribed, processed by word-level rules, and syllabified. The resulting lexeme syllables were evaluated by comparing them to the CELEX lexical database for Dutch. Then additional sentence-level phonological rules were applied to the TROUW corpus, and the frequencies of the resulting connected speech syllables were compared with those of the lexeme syllables from TROUW. The overall correlation between lexeme and speech syllables was very high. However, speech syllables generally had more complex CV structures than lexeme syllables. Implications of the results for research involving syllables are discussed. With respect to the notion of a mental syllabary (a store for precompiled articulatory programs for syllables; see Levelt & Wheeldon, 1994), this study revealed an interesting statistical result. The calculation of the cumulative syllable frequencies showed that 85% of the syllable tokens in Dutch can be covered by the 500 most frequent syllable types, which makes the idea of a syllabary very attractive.
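The cumulative-frequency calculation mentioned above can be illustrated with a minimal sketch. It assumes the corpus is already available as a flat list of syllable tokens; the function name `coverage_of_top_types` and the toy data are hypothetical and are not taken from the study, TROUW, or CELEX.

```python
from collections import Counter


def coverage_of_top_types(syllable_tokens, k):
    """Return the share of tokens covered by the k most frequent syllable types."""
    counts = Counter(syllable_tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no syllable tokens supplied")
    # most_common(k) returns at most k (type, frequency) pairs, highest first.
    top = counts.most_common(k)
    return sum(freq for _, freq in top) / total


# Toy data, invented for illustration only (not real corpus counts).
toy_tokens = ["de"] * 5 + ["en"] * 3 + ["ver"] * 1 + ["straat"] * 1
print(coverage_of_top_types(toy_tokens, 2))
```

Applied to a real syllabified corpus with `k = 500`, the same function would yield the kind of coverage figure reported here; the actual value depends on the corpus and the syllabification rules used.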