The Pitch of Speech in Two Chinese Villages

Languages such as English employ pitch to focus attention on particular words, to emphasize the grammatical structure of a phrase, and to convey emotional tone. In contrast, tone languages employ pitch (along with consonants and vowels) to convey the meaning of individual words. Just as changing the vowel of a monosyllabic word in English changes its meaning (for example, from ‘bit’ to ‘boat’), so in tone languages changing the pitch of the vowel can also change the meaning of the word. For example, the word ‘Ma’ in Mandarin means ‘mother’ when it is spoken in a high flat tone, ‘hemp’ when it is spoken in a mid-rising tone, ‘horse’ when it is spoken in a low tone that descends and then ascends in pitch, and a reproach when it is spoken in a high, rapidly descending tone. The chart below shows four Mandarin words in each of four tones, together with their meanings. Click on the audio example below to hear the words in all four tones.


Figure 1. Chart of four words in each of four tones

Because pitch is critically important to conveying meaning in tone languages, we would expect that, in such languages, the pitch range of a person's speaking voice would be particularly consistent across time. And indeed, as described in the entry on Absolute Pitch, we found that speakers of the tone languages Mandarin and Vietnamese were remarkably consistent in the pitches with which they pronounced the same list of words on different days. I further conjectured that, considering two linguistic communities, the overall pitch levels of speech should cluster within each community, but might differ across communities.


Figure 2. The mountainous region close to the villages where this experiment was carried out

My colleagues and I carried out a study to test this hypothesis by comparing the overall pitch levels of female speech in two Chinese villages [1]. The villages were in a mountainous region in a relatively remote area of China, on the border of Hubei and Chongqing, and the dialects spoken in these villages were in the same family as Standard Mandarin. The villages were less than 40 miles apart, though travel between them took several hours. We gave subjects a passage of roughly 3 minutes to read aloud, and took pitch estimates of their speech at 5 ms intervals. Then for each subject we derived the octave band that contained the largest number of pitch samples in her speech. Figure 3 shows the upper limits of the octave bands for speech in these two villages, plotted in semitone bins. As can be seen, the pitch ranges of the subjects’ speech clustered within each village, but differed overall across villages. This provided evidence that the overall pitch range of a person’s speaking voice reflects long-term exposure to the speech of others, at least in the case of tone language speakers.
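
The band-finding step described above can be sketched in Python. This is a minimal illustration rather than the procedure used in the study: the function names, the mapping from frequency to semitone numbers (the MIDI convention, with A4 = 440 Hz as note 69), the sliding of the 12-semitone band in one-semitone steps, and the treatment of the band's highest semitone as its upper limit are all assumptions made for the example.

```python
import math
from collections import Counter


def hz_to_semitone(f_hz, ref_hz=440.0, ref_note=69):
    """Map a frequency to the nearest semitone number.

    Assumption: MIDI-style numbering, with A4 (440 Hz) as note 69.
    """
    return round(ref_note + 12 * math.log2(f_hz / ref_hz))


def octave_band_upper_limit(pitches_hz):
    """Return the top semitone of the 12-semitone band holding the most samples.

    Assumptions: unvoiced frames are marked as 0 or None and are skipped;
    candidate bands are tried in one-semitone steps; ties go to the lowest band.
    """
    counts = Counter(hz_to_semitone(f) for f in pitches_hz if f)
    if not counts:
        raise ValueError("no voiced pitch samples")

    best_low, best_total = None, -1
    for low in range(min(counts), max(counts) + 1):
        # Count the samples falling in the 12 semitones starting at `low`.
        total = sum(counts[low + k] for k in range(12))
        if total > best_total:
            best_low, best_total = low, total
    return best_low + 11


def percent_per_bin(upper_limits):
    """Percentage of subjects whose band upper limit falls in each semitone bin."""
    tally = Counter(upper_limits)
    n = len(upper_limits)
    return {semitone: 100.0 * c / n for semitone, c in sorted(tally.items())}
```

At one estimate every 5 ms, a passage of roughly 3 minutes yields on the order of 36,000 frames per subject before unvoiced frames are dropped. Calling octave_band_upper_limit once per subject and passing the results for each village to percent_per_bin gives percentages of the kind plotted in Figure 3.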

Figure 3. Pitch ranges of speech in the two villages at the border of Hubei and Chongqing. Taoyuan Village is in Hubei, and Juiying Village is in Chongqing. The graphs show the percentages of subjects for whom the upper limit of the octave band for speech fell in each semitone bin.

Here are excerpts taken from the passage as spoken by six subjects in the two villages: the first three subjects were from Taoyuan Village, the next two from Juiying Village, and the last one from Taoyuan Village again. As can be heard, the pitch ranges of speech in the two villages differed clearly.


A continuous passage with excerpts spoken by six subjects in the two villages. Data from Deutsch et al. (2009).

Click here for the translation of the spoken passage.

What is the value of clustering the pitch range of speech within a linguistic community? In tone languages, an agreed-upon pitch representation could facilitate the identification of individual tones, and so the comprehension of individual words. It could also be useful for speakers of nontone languages such as English, for example by enabling listeners to identify rapidly the emotional tone of a speaker’s voice. Yet it remains to be determined whether effects similar to these also occur in speakers of nontone languages [2, 3].


1. Deutsch, D., Le, J., Shen, J., and Henthorn, T. The pitch levels of female speech in two Chinese villages. Journal of the Acoustical Society of America, 2009, April, 125, EL208.

2. Deutsch, D. Speaking in tones. Scientific American Mind, 2010, July/August, 36-43.

3. Wickelgren, I. The music of language (Audio Slideshow). Scientific American, 2010, June.

Footnote. I am grateful to Trevor Henthorn for creating the chart illustrating the tones, and to Jing Shen for choosing and pronouncing the four words in the different tones.

© 2021 Diana Deutsch