Gibbons "sing" in the trees. You can see and hear them on YouTube.
The most traditional approach to language evolution begins by looking at animal vocalizations and imagining that some kind of extension of them led to speech. The trouble with the from-arf-arf-to-howdy-miss hypothesis is that it leaves out the insertion of meaning, syntax, joint attention, and voluntary motor control. Yet vocalization is so common in the animal world, and can be so very elaborate among primates, that vocalization is difficult to dismiss entirely as a pathway to speech. Perhaps part of the appeal of the idea that speech began as gesture lies in the way the approach throws all previous animal vocalizations aside and gives evolution a blank slate to work with. The January issue of Developmental Science proposed a more promising avenue around the impasse. A Japanese primatologist, Nobuo Masataka, proposes in an article titled “Music, evolution, and language” (here) that animal vocalizations led not to speech but to song.
This blog has noted repeatedly that a strong impediment to the rise of speech is basic Darwinian selfishness. Speech requires more than a society whose members pursue identical objectives. It takes a community whose members share a common psychological ground, and who know that they share that ground. This blog has also noted that music provides a way of holding a group in a common emotion. Thus music is a powerful tool in forging and strengthening a sense of community. It is not difficult to marry these two points and suppose that music is important to creating the communal emotion that speech requires.
Masataka suggests that this musical capacity evolved first, serving as an intermediary between vocalization and speech. I am generally suspicious of talk about intermediaries because so much of it is mere pawing at the earth, substituting one hard thing for another. But this intermediary idea does not seem to be shying away from the issue. Instead, it proposes a way that an intermediary step can move from Darwinian selfishness to a speaking community. The process would move from a society to more of an emotionally linked group and then to a speaking community.
Masataka's paper takes readers through three steps: cooing, motherese, and duets.
Cooing. Anybody who has ever watched a mother and infant interact is likely to have noticed some singing. This vocal contact is preceded by simpler interactions. At three and four months, an infant can maintain contact with its mother through cooing. The mother responds, perhaps with a simple coo herself. The infant coos again. Mutual cooing can have the rhythms of a simple conversation, except that it is meaningless (does not draw attention to a topic). Functionally, it is a powerful method of developing emotional contact.
Masataka reports having observed similar cooing interactions between Japanese macaques, baboon-like primates sometimes known as “snow monkeys.” These are highly social primates. He says the patterns of coos
... were similar to those obtained in human mother-infant dyads; after a monkey cooed spontaneously, it remained silent for a short interval, and if no response was heard from the other monkeys, then the monkey would often coo again to address the other group members. [p. 36]
It is hard to know exactly what to make of this without a richer description of what these interactions are like. Is it like two owls hooting in the night? Also, there are a couple of important differences between human and macaque cooing. Most notably, in the macaques it is observed “exclusively” between adults, suggesting that, at best, adult macaques are able to manage what comes naturally to human infants. Even so, a kind of Madonna and child level of interaction is not to be sneered at.
Motherese. Another important difference between human and macaque cooing is that only humans encourage others to join in. The peculiarly human encouragement to participate in vocal exchanges takes on some crude, singing qualities. People often use a higher pitch and a more pronounced rhythm while speaking to an infant. This style of speech is sometimes called “motherese” and, ever since it was first noted by psycholinguists, researchers have been trying to determine why it exists. The basic hypothesis has been that it helps children learn to speak their parents’ language, but it seems irrelevant to a child’s learning words and syntax. It leaves us with something of a paradox. There is a world-wide phenomenon across cultures of adults speaking to children in high, rhythmic ways and yet this universality is irrelevant.
Masataka suggests that motherese helps with mastering the “music” of the interaction rather than the meaning. The high pitch catches a child’s attention, so they listen more closely. The rhythm encourages a stronger emotional response. The result is a child who can coordinate its actions more exactly with its mother's behavior. At this point in the research the idea of a musical function for motherese is more assertion than science, but it is a very testable hypothesis as well as a promising one. The idea has the potential of being the most important one in Masataka's paper.
Duets. Serious musical interactions are not arias but duets or more. Apart from ourselves, the best examples in the primate world that Masataka comments on come from gibbons, Asian apes. Several examples of their "duets" are available on YouTube, one of which has been embedded below. I suggest you check it out.
These sounds are literally inhuman. They contain no syllables and they are "biphasic," meaning they are made by both inhaling and exhaling. Yet something recognizable is going on. These two sad animals imprisoned in a small zoo cage are plainly engaged in some sort of vocal interaction. Careful study of many such interactions has established a stereotypical, repetitive quality that seems to rule out meaningful exchanges. So they are not discussing the weather, but they are doing something together, something that seems unrelated to food, sex, or territory. In the wild, however, their duets may be more functional. Gibbons are unusual in that they are monogamous and need to be able to forge and maintain strong pair bonds.
Masataka tries to push the significance of these gibbon duets further:
It is therefore possible that the loud calls of early hominids shared the above characteristics [well-coordinated duets] with apes, providing the basis from which current human language evolved. (p. 38)
That strikes me as overly optimistic, and yet well-coordinated duets, for all their ambiguity of function, do suggest a place where an emotionally connected group could begin to make their way toward talking and listening to talk about things.
Macaque cooing and gibbon singing demonstrate that primate societies can sometimes evolve to include vocal interactions and, perhaps, the emotions that come from making rhythmic, pitched sounds together. The evolutionary links to humans are not direct. The last common ancestor shared by gibbons and humans probably lived about 30 million years ago and the last common macaque-human ancestor was even further back, so we are talking about convergent evolution, not common descent. Music-like interactions have appeared more than once in the primate line, so there is nothing particularly dumbfounding about the suggestion that they evolved among our ancestors along the human line.
The scenario would go something like this:
- cooing allows the emergence of syllables used to make emotional contact;
- babbling allows the emergence of multisyllabic sounds for even richer and more complex contact;
- singing allows the contact to spread to an entire group, permitting the sense of membership in a community greater than oneself.
However, this scenario raises an important question. If primates have evolved singing relationships several times in the past, why did they stop there and not press on? Or to reverse the question, if our ancestors evolved song-like vocalizations, why didn't they stop there? What extra factor pushed the human line beyond where macaques and gibbons ever needed to go?