A Linguistic Consequence of Music Appreciation

John A. Frantz

Music appreciation is almost impossible to explain as a product of evolution, as illustrated by this quotation from Charles Darwin in The Descent of Man: “As neither the enjoyment nor the capacity of producing musical notes are faculties of the least use to man . . . they must be ranked amongst the most mysterious with which he is endowed.” The sophisticated ability to analyze and appreciate music (even by nonmusicians) defies the direct rationalization of having survival value for its own sake. Rather, it must be piggybacking on some other biological imperative. What can this imperative be?

Recognition—Friend or Foe?

The instant recognition of faces is a hardwired ability with obvious survival value in terms of distinguishing friend from foe. This ability has been well studied by students of animal behavior. I was especially intrigued by a 2001 study of face recognition among sheep. Twenty sheep were trained to recognize pictures of sheep or human faces that were placed at junctions in a maze. Subjects received food rewards for correct choices. The sheep became very adept.

The evolutionary implication is that humans and sheep must have had a common ancestor some tens of millions of years ago already in possession of this brain circuitry. Voice recognition, our next topic, is also part of the recognition of friend or foe. Perhaps this ability also enables us to appreciate music.

In my medical-school physiology class, we did informal experiments to demonstrate our ability to localize the direction of origin of a sound. Apparently this ability is hardwired, not learned. Babies can do it very early on. It turns on detecting differential loudness in the two ears and even more on sensing the different arrival time of the sound at the two ears. Tiny differences in the sound experienced by the two ears are noted by the brain and the meaningful calculations are made automatically. However, this explanation is still a long way from identifying a biological basis for music appreciation. The following anecdote describes a possible breakthrough in thinking about this problem.

Learning the Language Doesn’t Mean Mastering the Accent

When my wife and I went through Peace Corps training in 1968, our fourteen-year-old daughter chose to take training for teaching English as a second language with the college graduate volunteers. As the Peace Corps encouraged flexibility, she was permitted to do so. Ten years later, having had no further contact with one of her fellow students, Norman, she answered the telephone in a Manhattan home where she was a houseguest. It was a wrong number, but she recognized Norman’s voice and replied to him instantly in Farsi, their mutual exotic language, and they were both astounded. I am sure that we have all had similar, if less striking, episodes of sudden recognition of voices out of context in the absence of helpful clues.

Reflecting on this event, I realized that this ability is automatic, that it is not learned, and that it has enormous survival value. It is the biological equivalent of the electronic Identification Friend or Foe (IFF) equipment in military aircraft that is programmed to recognize other aircraft from afar as friendly or otherwise. The minute details of the sound of the same words as spoken by different individuals, which we perceive so readily, could serve a similar function. Thus, this capability for voice recognition could undergo selection for survival, while music appreciation might be its accidental by-product.

The varying ways we respond to different pieces of music probably have a similar basis. Familiar music is comforting, so we feel pleased and reassured on hearing a tune that we recognize and have long enjoyed. Work songs lighten the burden of hard labor; marching music keeps the military tromping through difficult situations. A further point suggesting that music appreciation might relate to our hardwired human IFF system is the commonplace fact that songs are much easier to memorize than even the most rhythmical prose. With a given musical piece, the melody is easier to remember than the words. Learning tunes can even be inadvertent, as anyone knows who has heard a melody and then can’t get it out of his or her head.

It is of interest that trained musicians are less likely to have absolute pitch than the rest of us, and virtually all primitive, illiterate people possess this trait. Bird songs, and perhaps many other natural sounds, are identifiable not only by the tune but also by the pitch, which is absolute, thus helping primitive man to interpret the sounds of the jungle more promptly. Other animals may also have absolute pitch. Modern musicians are confronted with the transposing of many pieces to make them easier to sing for a particular group. Besides, the frequency of the “standard A” has been increased by some committee a couple of times in my lifetime. Have modern musicians shot themselves in the foot, so to speak?

In summary, music appreciation is an extremely sophisticated activity that places heavy demands on already-sophisticated neural circuitry that evolved to enhance identification of sounds, especially voices. Music appreciation, which has no innate survival benefit, is clearly piggybacking on some high-priority survival trait, and friend-or-foe voice recognition is the logical suspect. Without this preexisting ability to gain so much information from sound, music appreciation would not have developed. In many ways it resembles our human capacity to appreciate literature, which seems to have arisen accidentally out of our capacities for language and literacy without requiring any new biological evolution.

Next, I offer a speculative consequence of music appreciation that might change the way we all learn languages—if we choose to utilize it.

An International Anthem, or a Perfect Accent for Students of All Languages

This could be an “ultimate project” for some skunk-works posse. (A skunk-works posse is a group, usually of engineers, whose members pursue a problem on their own time—or, at least, on time for which they do not have to account in the ordinary way—in hopes of achieving an unorthodox solution with minimal management interference.) My idea concerns how to teach all human languages, even to adults, with a near-perfect accent. Let me offer a brief account of how it occurred to me.

I tried to learn Farsi (Persian) at age forty-four. My vocabulary was good enough for ordinary conversation, but illiterate Farsi speakers couldn’t understand me (they thought I was speaking English). My eight-year-old daughter knew fewer words than I, but she had a perfect accent. The uncanny thing was that she could be my “translator” with the locals, even when she did not know the words I was using. She had grasped the “music” of Farsi—the sounds from which its words were constructed—in ways that my adult brain could not.

The reason is simple enough. A great number of phonemes—basic speech sounds, like the hard b in ball or the trill of the double-r in Spanish—are employed across the range of human languages. All tongues use only some of them; this is surely true of English. Babies come into the world capable of forming each of these phonemes—otherwise they couldn’t form part of any language. As young children acquire their first language, they gain performative mastery of its particular phonemes (and tones, in the case of tonal languages). In many cases, as they leave early childhood they lose the capacity to form or even recognize phonemes or tone patterns not employed in the first language. This greatly complicates the project of learning new languages in later life. In addition to the challenges of mastering the new tongue’s vocabulary, grammar, and syntax, performative mastery of the language’s phonemes may forever evade the adult learner, who may find it difficult or impossible to speak (and often, even to hear) phonemes or tones not acquired in early childhood.

A few examples should make this problem clear. Consider the stereotypical difficulty many speakers of Asian languages encounter in distinguishing (and also pronouncing) the distinct r and l sounds in spoken English. Imagine the difficulty of native English speakers who master Spanish but cannot roll their r’s, however hard they try—or who try to master a language that uses tonality extensively, or one that employs the glottalized clicks present in many African languages but wholly absent from English. Their lack of performative mastery keeps many from being fluent in languages they have mastered intellectually; that was surely my problem with Farsi.

Thirty years after that humbling experience, the following idea occurred to me. It is simple to express, but may be quite difficult to carry out. I will appreciate any comments readers may offer.

My idea is to develop and popularize a children’s play song that includes all the phonemes used in human language. The hope is that children who grow up singing this song will become adults who are not hindered in acquiring any other language by “performance issues” that inhibit their speaking or recognizing the new language’s phonemes. In the end, all humans might be empowered to learn each other’s languages with a perfect accent, even starting as adults, because in childhood we would all have mastered all the phonemes the human vocal apparatus is capable of forming.

Children all over the world sing play songs. From my childhood I remember jumping rope to “’Pendicitis said the doctor, ’Pendicitis said the nurse, ’Pendicitis said the lady with the alligator purse. . . .” I remember it clearly even now, though the verse is close to nonsense. Meaning can be imposed upon nonsense, as illustrated by Lewis Carroll’s Jabberwocky—or by a joke that never stops bouncing across the Internet: “Q. How do we know that Mahatma Gandhi had bad breath? A. Otherwise, how could Mary Poppins have sung, ‘super-calloused, fragile mystic hexed by halitosis?’” (Say this out loud, in cadence.)

Writing a song that is both catchy and comprehensive would be no easy task. It would fall to a skunk-works posse, a self-assembled group of experts that in this case would include much more than engineers. Our posse would need to be composed of experts from diverse fields, including a well-known and competent coordinator (such as a recently retired editor of Science), multilingual literary people (Isabel Allende comes to mind), numerous linguists, and others to be recruited as needed, such as leaders from UNESCO and the various religions and perhaps even computer scientists. A newly composed melody compatible with all cultural traditions would be highly desirable.

A lab school could provide children’s voices for the recording. No child should attempt any phonemes that he or she could not pronounce—digital mixing would easily smooth it all out.

Though the lyrics would be surface nonsense, meaningful humanitarian, conservation-related, and even universal religious precepts could be subtly incorporated into a scaffold of languages and woven into the text. Some might approach surface comprehensibility in one or more languages. The song might be accompanied by commentaries in many languages, giving a key to the text’s meaning, as Lewis Carroll did in Through the Looking-Glass. Still, most of the song would be near-nonsense for most who would learn to sing it.

Our posse might prefer, if all the phonemes could be sufficiently and concisely included, to use proverbs from many cultures as a nonsense equivalent. Many proverbs would be needed. Those with universal messages would be parsed into their phonemes for computer processing to find a combination concisely incorporating most of the phonemes. If there are leftover phonemes that cannot be incorporated in this fashion, they could form a chorus of total nonsense.
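For the computer scientists in the posse, the proverb selection just described resembles what is known as a set-cover problem. What follows is only a minimal sketch of one common heuristic (a greedy choice), not a claim about how the posse would actually proceed. The proverb names, the phoneme symbols, and the function name greedy_phoneme_cover are all hypothetical placeholders invented for illustration; a real effort would start from a proper phonetic transcription of each proverb.

```python
from typing import Dict, List, Set, Tuple


def greedy_phoneme_cover(
    proverbs: Dict[str, Set[str]], inventory: Set[str]
) -> Tuple[List[str], Set[str]]:
    """Greedily pick proverbs until no remaining proverb adds a new phoneme.

    Returns the chosen proverb names (in the order picked) and the set of
    phonemes that no proverb supplied.
    """
    uncovered = set(inventory)
    chosen: List[str] = []
    while uncovered:
        # Pick the proverb that supplies the most still-missing phonemes.
        # Sorting the names first makes tie-breaking deterministic.
        best = max(sorted(proverbs), key=lambda name: len(proverbs[name] & uncovered))
        gain = proverbs[best] & uncovered
        if not gain:
            break  # nothing left to gain; the rest are "leftover" phonemes
        chosen.append(best)
        uncovered -= gain
    return chosen, uncovered


# Toy, made-up data: each hypothetical proverb is reduced to the set of
# phoneme symbols it contains. "!" stands in for a click consonant.
PROVERBS = {
    "proverb_a": {"b", "a", "l", "r"},
    "proverb_b": {"r", "rr", "o"},
    "proverb_c": {"a", "o"},
    "proverb_d": {"b", "l"},
}
INVENTORY = {"b", "a", "l", "r", "rr", "o", "!"}

chosen, leftover = greedy_phoneme_cover(PROVERBS, INVENTORY)
print("Proverbs chosen:", chosen)
print("Phonemes left for the nonsense chorus:", sorted(leftover))
```

In this toy run two proverbs supply every phoneme except the click, which would go to the nonsense chorus. A greedy choice is not guaranteed to find the shortest possible combination, but it is simple and usually comes close.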

Of course, these details merely illustrate how our posse might organize its task—originality would be needed in every aspect of this effort. Its coordinator might deem it desirable to arrange some grants for administrative expenses and perhaps to bring key people together in person for critical discussion or to supervise recording of the soundtrack.

Once complete, the song would be played on radio and television, made available in libraries, distributed online, and—one hopes—sung by a generation of the world’s children. Ultimately, international travelers could join children in their songs at play, enhancing the unity of our species—a truly international anthem. It would be appropriate to publish this suggestion in many places in order to recruit appropriate volunteers for the posse.

My hope is that this idea will take on a life of its own, and that I will become merely one of the volunteers while a dynamic coordinator recruits coleaders from the linguistic, literary, musical, and computer science fields, to ensure that no vital phoneme is inadvertently omitted.

Further Reading

Kendrick, Keith M., et al. “Sheep Don’t Forget a Face.” Nature, November 8, 2001. 414 (6860):165–66.

Levitin, Daniel J., and Susan E. Rogers. “Absolute Pitch: Perception, Coding, and Controversies.” Trends in Cognitive Sciences 9 (1): 29, 2005.

John A. Frantz

John A. Frantz practiced medicine from 1946 to 2006. He taught internal medicine as a Peace Corps volunteer from 1968 to 1970.

