The learnability of tones from the speech signal

Author(s):: Yu, Kristine Mak
Format:: Thesis
Degree granted:: Ph.D.
Publisher:: Ann Arbor : University of California, Los Angeles, 2011.
Pages:: 245
Language:: English
Abstract:: It is an unremarkable matter of course but a remarkable miracle of human cognition that children learning tonal languages learn maps from the speech signal to the abstract phonological tone concepts of their native language, which could be any tone language of the world. As an initial step for understanding how children learn tonal maps, this thesis focuses on working toward a characterization of what it is that is being learned: the class of possible maps from the speech signal to tonal categories in natural language. By studying the structure of this class of tonal maps, we can assess the learnability of the class under a mathematically precise criterion for successful feasible learning. Characterizing the learning problem as feasibly learnable is a fruitful direction for elucidating the human learning problem. Since the structure of tonal maps is conditioned on the phonetic space in which they are defined, the focus of this thesis is determining an appropriate phonetic parameterization of the speech signal for the domain of the tonal maps and for representation of the data to the learner. We do this by assessing the separability of tonal categories in different phonetic spaces. Studying the structure of the class of possible tonal maps necessitates studying tonal maps in a range of languages, so we study tonal maps using a sample of cross-linguistic tonal production data we collected in Bole, Beijing Mandarin, Cantonese, and White Hmong and with a series of perception experiments we performed in Cantonese. The bulk of the thesis motivates the inclusion of particular information from the speech signal, since the phonetic realization of linguistic tone is widely believed to be limited to a single dimension of fundamental frequency, the acoustic correlate of pitch. We show evidence from human perceptual experiments and computational modeling: (i) motivating a temporal domain from the speech signal for tonal maps beyond the span of a single syllable, and (ii) demonstrating that voice source parameters beyond f0 must be included for characterizing phonetic spaces for tonal maps in a wide range of languages. While these results indicate potential sources of complexity for tonal maps, we also show that coarse temporal resolution in sampling of the relevant parameters from the speech signal suffices for good tonal category separability, hinting at potential structure in tonal maps. Human listeners identify tones degraded to be coarsely sampled at a comparable level of accuracy to that for intact tones in Cantonese, and classification by machine with acoustic parameter spaces defined only over a few real values shows a near partition of the phonetic space in the sample of languages studied. The potential structure in tonal maps suggested by these results is consistent with feasible learnability of tonal maps.
Identifier:: HmongStudies3904