Speech to Song illusion
The Speech-to-Song Illusion was discovered by Diana Deutsch in 1995, when she was fine-tuning the opening spoken commentary on her CD ‘Musical Illusions and Paradoxes’. She had the phrase ‘Sometimes behave so strangely’ on a loop, and noticed that after a number of repetitions, the phrase sounded as though sung rather than spoken. Later she included this illusion in her CD ‘Phantom words and other curiosities’, accompanied by the following commentary:
In our final demonstration, speech is made to be heard as song, and this is achieved without transforming the sounds in any way, or by adding any musical context, but simply by repeating a phrase several times over. The demonstration is based on a sentence at the beginning of the CD Musical Illusions and Paradoxes. When you listen to this sentence in the usual way, it appears to be spoken normally - as indeed it is. However, when you play the phrase that is embedded in it: 'sometimes behave so strangely' over and over again, a curious thing happens. At some point, instead of appearing to be spoken, the words appear to be sung, rather as in the figure below.
Here is the full sentence followed by the phrase played repeatedly:
And here is the phrase as it is generally heard after it has been played repeatedly:
Now here again is the exact same sentence as you just heard. You will probably find that it begins by sounding as speech, just as before. But when you come to the phrase that had been repeated, it suddenly appears to burst into song.
The effect was investigated in detail by Deutsch, Lapidis, and Henthorn (2008), and Deutsch, Henthorn and Lapidis (2011).
In our first experiment we tested three matched groups of subjects, and presented each group with a different condition. The subjects all listened to the full sentence and then to ten presentations of the phrase. During each pause between presentations they judged on a five-point scale whether they heard the phrase as exactly like speech, like speech, like either speech or song, like song, or exactly like song.
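As a rough illustration of how such judgments might be summarized, the sketch below averages the ratings for each presentation across subjects. The numeric coding of the scale (1 to 5) and the function name are our assumptions for illustration; this is not the analysis code used in the study, and the example values are placeholders rather than data.

```python
from statistics import mean

# Assumed numeric coding of the five-point scale (not specified in the text).
SCALE = {
    1: "exactly like speech",
    2: "like speech",
    3: "like either speech or song",
    4: "like song",
    5: "exactly like song",
}

def mean_rating_per_presentation(ratings_by_subject):
    """Average the ratings for each presentation across subjects.

    ratings_by_subject: one list per subject, holding one rating per
    presentation, in presentation order.
    """
    if not ratings_by_subject:
        raise ValueError("no ratings supplied")
    n_presentations = len(ratings_by_subject[0])
    for ratings in ratings_by_subject:
        if len(ratings) != n_presentations:
            raise ValueError("every subject needs one rating per presentation")
        if any(r not in SCALE for r in ratings):
            raise ValueError("ratings must be integers from 1 to 5")
    # zip(*...) turns per-subject rows into per-presentation columns.
    return [mean(column) for column in zip(*ratings_by_subject)]

# Placeholder values for two hypothetical subjects and three presentations.
example = [[1, 2, 5], [1, 4, 5]]
summary = mean_rating_per_presentation(example)
```

Plotting `summary` against presentation number would give one curve per condition, which is the form of comparison described for the rating experiment.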
In all conditions, the first and last presentations were identical, and we examined the effects of two manipulations of the intervening presentations on the subjects' judgments. In the first condition, the intervening presentations were exactly as the original. In the second, they were transposed slightly, so that the pitches differed but the pitch relationships were preserved. In the third, the intervening presentations were not transposed, but the syllables were presented in jumbled orderings.
The above graph compares the effect of presenting the intervening repetitions exactly as the original with the effect of transposing them slightly. As can be seen, when the repetitions were exact, perception moved solidly from speech to song. When the repetitions were transposed slightly, ratings moved slightly towards song but remained solidly in the speech region.
The above graph shows the effect of having the intervening repetitions consist of the same syllables in jumbled orderings, again compared with having the repetitions exactly as the original. We can see that here there was no transformation from speech to song. So it seems that, in order for this transformation to occur, the phrase needs to be repeated exactly, without transposition, and without changing the ordering of the syllables.
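The three conditions described above can be sketched as a small sequence builder. This is only a schematic under stated assumptions: the syllable segmentation, the size of the transposition (`max_shift`, in semitones), and the function name are hypothetical, since the text says only that the intervening presentations were "transposed slightly" or "jumbled". It describes which version of the phrase occupies each slot and does not process audio.

```python
import random

# Hypothetical segmentation of the phrase into syllables.
SYLLABLES = ["some", "times", "be", "have", "so", "strange", "ly"]

def build_presentations(condition, n_presentations=10, max_shift=1.0, seed=0):
    """Return a list of (syllable_order, transposition_in_semitones) pairs.

    The first and last presentations are always the untransformed original;
    only the intervening presentations depend on the condition.
    """
    rng = random.Random(seed)
    original = (tuple(SYLLABLES), 0.0)
    presentations = [original]
    for _ in range(n_presentations - 2):
        if condition == "exact":
            presentations.append(original)
        elif condition == "transposed":
            # The whole phrase is shifted by one amount, so the pitches
            # differ but the pitch relationships are preserved.
            shift = rng.choice([-max_shift, max_shift])
            presentations.append((tuple(SYLLABLES), shift))
        elif condition == "jumbled":
            order = SYLLABLES[:]
            while order == SYLLABLES:  # make sure the order really changes
                rng.shuffle(order)
            presentations.append((tuple(order), 0.0))
        else:
            raise ValueError(f"unknown condition: {condition!r}")
    presentations.append(original)
    return presentations

exact = build_presentations("exact")
transposed = build_presentations("transposed")
jumbled = build_presentations("jumbled")
```

Keeping the first and last slots identical in every condition is what allows the final rating to be compared across groups: any difference must come from what was heard in between.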
So we can then ask: What do the subjects actually hear when they say that they are hearing song? To find out, we recruited 11 female subjects who had had experience with singing in choirs or choruses, and tested each subject in isolation from the others. We had them listen to the full sentence and then to the phrase repeated ten times, and asked them to reproduce the phrase exactly as they had heard it.
Here are the reproductions of six of the subjects played in sequence. As is evident, although the phrase was spoken, the subjects reproduced it as song.
And here are the reproductions of all 11 subjects, digitally mixed together so that they are played as a chorus. (A small amount of reverberation has been added, but otherwise the sounds are exactly as they were recorded.)
But one might then wonder whether these subjects could have heard the phrase as sung the first time they heard it. So we recruited another set of 11 subjects on the same basis, and also tested them in isolation from each other. This time we played them the full sentence followed by the phrase presented only once, and asked them to reproduce the phrase exactly as they heard it. Here are the reproductions of six of these subjects played in sequence.
And here are the reproductions of all 11 subjects, again digitally mixed together so that they are played as a chorus. This confirms our finding from the rating experiment that when the phrase is heard only once, it is perceived as speech rather than song.
To make sure that these subjects were able to repeat the pitches after a single hearing, we then had them listen only once to the phrase as sung rather than spoken, and again asked them to repeat back exactly what they had heard. Here are the reproductions of the same six subjects that you just heard, and you can see that they had no problem reproducing the sung melody.
The red line in the above graph shows the average pitch of each syllable, averaged over the 11 subjects who repeated back the spoken phrase after having heard it 10 times. The blue line shows the average pitch of each syllable, averaged over the other set of 11 subjects, who repeated back the same spoken phrase after having heard it only once. As can be seen, the reproductions of the two groups were very different.
The red line in the above graph again shows the average pitch of each syllable in the spoken phrase, averaged over the 11 subjects who repeated it back after having heard it 10 times. The green line shows the average pitch of each syllable, averaged over the other set of 11 subjects, who repeated back the sung phrase when it had been presented only once. Notice that there is a remarkable correspondence between these two plots, showing that the subjects' perceptions of the sung phrase were very similar to those of the subjects who had instead heard the spoken phrase repeated 10 times, and quite different from their own perceptions of the spoken phrase when they had heard it only once.
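The comparison of average pitch contours described above can be sketched as follows. This is an illustrative outline, not the analysis used in the study: converting to semitones before averaging, the 440 Hz reference, and the mean absolute difference as a measure of correspondence are all our assumptions, and the function names are hypothetical. No measured values are included.

```python
import math

def hz_to_semitones(f_hz, ref_hz=440.0):
    """Convert a frequency in Hz to semitones relative to ref_hz."""
    return 12.0 * math.log2(f_hz / ref_hz)

def mean_contour(pitches_by_subject):
    """Average the pitch of each syllable over subjects.

    pitches_by_subject: one list per subject, holding one pitch value
    (in semitones) per syllable, in syllable order.
    """
    n_syllables = len(pitches_by_subject[0])
    if any(len(p) != n_syllables for p in pitches_by_subject):
        raise ValueError("every subject needs one pitch per syllable")
    return [sum(col) / len(col) for col in zip(*pitches_by_subject)]

def contour_distance(contour_a, contour_b):
    """Mean absolute difference, in semitones, between two contours."""
    if len(contour_a) != len(contour_b):
        raise ValueError("contours must cover the same syllables")
    diffs = [abs(a - b) for a, b in zip(contour_a, contour_b)]
    return sum(diffs) / len(diffs)
```

Under this sketch, a close correspondence between two groups would appear as a small `contour_distance` between their mean contours, and a marked difference as a large one.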
To conclude, this illusion is in line with what philosophers and musicians have been arguing for centuries: that strong linkages must exist between speech and music. We still need to determine the neural processes that are responsible for this striking perceptual transformation. However, the present experiments show that for a phrase to be heard as spoken or as sung, it does not need to have a set of physical properties that are unique to speech, or a different set of physical properties that are unique to song. Rather, we must conclude that, assuming the neural circuitries underlying speech and song are at some point distinct and separate, they can accept the same input, but process the information in different ways so as to produce different outputs. As a further point, this illusion provides a striking example of very rapid and highly specific perceptual reorganization, and so shows an extreme form of short-term neural plasticity in the auditory system.
This illusion has been featured in numerous radio broadcasts, notably the WNYC Radio Lab interview (NPR) with Jad Abumrad and Robert Krulwich.
It has also been featured in several videos. For example, the video below features the illusion being experienced by the fifth graders of Atwater School, Shorewood, Wisconsin. Video created by their music teacher Walt Boyer, posted with permission.