8. Implications for Music Cognition, Musicology, and Computer Music

My overall thesis has been that music perception and cognition are embodied activities, depending crucially on the tangible features of our sensorimotor apparatus, and also on the sociocultural environment in which music perception, cognition, and production are situated. I have presented some specific evidence in its support, by showing how certain rhythms of African-American music may relate to such embodied processes. I have claimed that musical perception and cognition are active constructions, rather than passive experiences, of the listener. In particular, the perception of pulse and meter is not a perceptual inevitability, but is strongly dependent on the person's culturally contingent listening strategies. I have also argued that much of what we experience listening to performed music relates to an ecological recognition of, and even an empathy for, the bodily motion of which the musical sounds are a result. These sonic traces of bodily motion can be appreciated as such, and even aesthetically privileged in certain cultures, while neglected or suppressed in others.

The music-cognition community has been somewhat slow to acknowledge fully the role of culture in shaping our ways of perceiving music. Consider the search for the universals of human music cognition. In a recent, rather controversial and quite flawed lecture on possible evolutionary explanations for the existence of music (prefaced by the disclaimer, "I don't know anything about music, but..."), evolutionary biologist Steven Pinker made one compelling claim: if we wish to study the basics of music cognition, we should appeal to the musical experiences of the masses (such as hip-hop, the Eurovision song contest, disco, etc.) rather than the art music of high culture (Pinker 1997). I concur with this claim, but challenge his later assumption that the perception and cognition of music are foremost a solitary, pensive act. His model listener was an idealized, radically de-situated one, with headphones on and eyes closed -- in fact an invocation of the classic Western autonomous listener described in many European music-theoretic texts and dating back to Plato, as McClary (1991) has pointed out (see below). When Pinker was asked whether he had any ideas about how group psychology might have affected the evolution of music, he responded that if any such effects did exist, they were profoundly secondary. While it is clear that music listening involves an individual's cognitive systems, one could argue that language comprehension does as well; but nobody would claim that language -- particularly spoken language -- exists for solitary, pensive activity. Language serves as a means of interpersonal communication, not just of factual information but also of emotions, interrelationships, and imaginings. One could say the same of music. In fact, the connections between music and language are quite far-reaching, as I discuss further below.

In the same lecture, Pinker argued that music could be seen as a "pleasure technology" -- a concentrated dose of auditory patterns that happen to give pleasure for other evolutionary reasons. For example, sensory systems grant a sense of pleasure when they receive optimal input -- clear, analyzable signals -- probably because such signals were favored evolutionarily. Hence one might hypothesize that simply organized musical material would be more popular than pieces with much surface complexity. But this is not the case for many popular musics; salsa and Afro-Cuban rumba [CD-37] would be judged as quite complex by cultural outsiders. It seems as though a simplicity criterion might be one of many competing criteria in the perception of music as pleasurable or not. One can also think of other functions of music that are not clearly derived from pleasure. For example, consider a musical mnemonic such as the "Alphabet Song," which nearly all young, English-speaking children learn. It aids in the memorization of a large number of symbols by associating them with a familiar melody (that of "Twinkle, Twinkle Little Star") and parsing them into six chunks of decreasing size. Dance music provides another example; not all dance is merely pleasure-related (as in many religious, ritual, and narrative dance forms), and many social situations involving dance seem to reinforce collective unity, which itself could enhance pleasure.

Only someone who knew nothing about music could miss the ugly implications of Pinker's suggestions to study instances of "quasi-music," among which in condescending fashion he included rap music alongside train whistles and woodchopping. Blacking cautions against this "evolutionary" treatment of the development of musical styles (Blacking 1973: 55-56). There is such an absence of description or understanding of the music of so many cultures that it is impossible to judge any music to be primitive. Often a musical style appears simplistic to one culture because it is judged by false criteria. We see this in our own culture: by other musical standards, hip-hop music is judged musically derivative and dull, because (for example) it doesn't have enough chord changes or melodies. I need not point out that hip-hop culture has its own highly developed, distinctive and elaborate esthetics and standards, nor that it requires highly developed improvisational skills in a variety of domains: vocal, instrumental (turntables), and dance. What should be stressed is that Pinker's ethnocentric comment and his arguments in general betray his ignorance of these dimensions. This instance is emblematic of some of the larger problems in this field, namely the tendency to generalize from a poverty of data, to fetishize certain varieties of musical complexity, and to remain blind to non-European parameters of musical expression.

In any case, questions of complexity defer, in hip-hop as well as in almost all other cases of music and language, to questions of function and utility. Indeed, in Pinker's own book, The Language Instinct (1994), he points out that all human languages have pretty much the same degree of complexity; they seem to arise fully formed, regardless of a culture's technological level. Pinker roundly rejects the notorious Sapir-Whorf "relativism" hypothesis (Whorf 1956), which claimed that language and culture shape one another to the degree that certain cognitive abilities, like color classification, are enhanced or stunted by cultural and environmental factors. Instead, Pinker argues, humans are born with a baseline of hardwired cognitive capacities, among which ranks language. In this way, Pinker derives his ideas from Chomsky (see, for instance, Chomsky 1975), and in particular Chomsky's findings of universals or "super-rules" of human language. As an example, he cites data that show that children are able to create the requisite complexity of a full-fledged language, like a creole or American Sign Language, even if their parents speak a pidgin or a shabby version of ASL. Perhaps we could hypothesize that music contains a similar baseline of complexity and a similar set of super-rules, which is distributed among rhythmic, melodic, and other components. But in his talk, Pinker claimed that music showed extreme variation in complexity across cultures, with tonal music representing some sort of pinnacle (Pinker 1997). Blacking cautions against attaching too much significance to musical complexity:

The issue of musical complexity is irrelevant in any consideration of universal musical competence. First, within a single musical system greater surface complexity may be like an extension of vocabulary, which does not alter the basic principles of a grammar and is meaningless apart from them. Second, in comparing different systems we cannot assume that surface complexity is either musically or cognitively more complex. In any case, the mind of man is infinitely more complex than anything produced by particular men or cultures. (Blacking 1973: 34-35)

Furthermore, the research reviewed by Dowling (1988) shows that very young children produce spontaneous songs that incorporate but by no means mimic elements of adult productions. Just as in language development, song development seems to go through a set of ordered, rule-governed stages, somewhat independently of external input. However, as with language, the raw materials for these cognitive processes do not appear in a vacuum; the child requires some basic stimulation to exercise these capacities. It appears as though there does exist some hardwired baseline of musical understanding; however, this basic cognitive ability may atrophy if not nurtured sufficiently in early years, as is the case with language.

Would variations in surface complexity signal variations in fundamental structure? Or might there be other factors? For example, extreme surface complexity may serve as a kind of exclusivity, in the same way that extremely jargonistic language might delineate a certain small professional community. Might increased complexity amount to a kind of augmented vocabulary, rather than a complexified grammar? Perhaps a general musical grammar would include interpersonal factors as well as individual ones; most probably it would include the conceptual scaffolding for embodiment, dance, and collective rhythmic synchronization as well as rules governing melodic and rhythmic structure. If we are to follow this evidence, we may consider the possibility that universal musical competence exists, and consider how it might manifest.

Ultimately, however, we must question the utility of a concept of musical universals. While it has been documented by Brown (1991) (via thorough and painstaking analysis of as many documented ethnographies as possible) that every culture known to man has music and dance -- that they themselves are human universals -- we should be careful with the limits of the assertion that the same principles underlie all musics of the world. Whatever the role of musical universals, the particulars seem to matter just as much. For example, cross-cultural studies suggest that listeners experience great difficulty in intuiting the emotional content of unfamiliar music from another culture (Gregory & Varney 1996). Furthermore, even people of "the same" culture may fail to decode a given piece's emotional content in the same way. Often musical portrayals of exuberance and rage can have similar surface characteristics; for example, these similarities have led to much-contested interpretations of saxophonist John Coltrane's music as alternately angry or joyous [CD-53]. Similarly, musical depictions of sexuality and of violence can be mistaken for one another (Wessel 1998). Hence, one of music's most unarguable strengths, namely its capacity for emotional expression, appears to be the result of cultural associations rather than purely intramusical dynamics. Just as different cultures have different words for joy or sorrow, they may just as well use different sonic gestures to connote these emotions. The cultural factors that give rise to musical activity provide the richness that distinguishes one music from another, and they do so in a manner that is productive, not limiting.


I will now turn briefly to the implications of this thesis in the realm of computer music. The early days of computer music saw pieces that focused on manipulation of timbral parameters in direct reference to work in music perception and cognition. One might suppose that this work suggests perceptual issues that can be addressed through music itself. The most relevant issues along these lines relate to the body, and to its status in contemporary music.

McClary writes, "The advent of recording has been a Platonic dream come true, for with a disk one can have the pleasure of the sound without the troubling reminder of the bodies producing it. And electronic composition makes it possible to eliminate the last trace of the nonidealist element." (McClary 1991: 136) The implicit prejudices about computer music -- that it sounds inhuman, digitized, random, and so forth -- are addressed explicitly by the present work. As is the case with programs that improvise "convincing" musical output, programs that generate human-sounding rhythms focus our attention on the role of the same human performer that they might seem to replace. From the few psychological and cultural considerations discussed herein, we could construct a handful of heuristics about rhythmic expression in a groove context; these heuristics can be applied to relevant musical material in an intelligent way. To summarize what was set forth in chapter 7:

A simple understanding of such minor adjustments could help musicians working with computers to create music that is rhythmically vital and rich in texture.

Popular music of recent decades has grown quite aware of these possibilities, as its use of technology has catered to its fickle and ever-changing audience. You only need to tune in to any urban radio station to hear that rather convincing electronic tracks have replaced the drummer. Observe the trajectory from the quantized, otherworldly sounds of the Roland TR-808, an analog-synthesis drum machine popular in the early 1980s, to the plasticity of sampled recordings of real drumming manipulated by contemporary software tools. This path suggests the narrative of a popular aesthetic, born of widely available technology, whose participants attempt to make inexpensive rhythmic accompaniments that sound as funky and fresh as their human counterparts, and the counter-narrative of the role of technology in shaping those aesthetics. The uncanny, inhuman sounds of the TR-808 are now enjoying a resurgence thanks to a retro craze, and thanks to the influence of history and memory on popular taste. A contemporary song using the sound of the TR-808 [CD-54] implicitly signifies on the past. Every pop tune or "ghetto classic" exists in a universe of Signifyin(g) associations with other such songs.


Performance variation, musical expression, microtiming -- they all suggest the presence of a human body making music. Humans necessarily exhibit some deviation from rigid quantization. Hence, the absence of these deviations implies the absence of a musical body. But this absence can be as musically meaningful as its presence; the strategic use of "robotic" rhythms can suggest a disembodied, techno-fetishistic, futuristic ideal (as in contemporary electronica), or it can embody a Signifyin(g) riff on technology, history, and memory (as in contemporary hip-hop referencing the sounds of its beginnings [CD-54]).
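The notion of deviation from rigid quantization can be made concrete with a minimal sketch. It assumes a steady tempo and an evenly subdivided grid, which is only one of many ways to model a pulse, and it is not the analysis method used in this study; the function names, the sixteenth-note default, and the one-millisecond tolerance are all hypothetical choices made for illustration.

```python
def grid_deviations(onsets_sec, tempo_bpm, subdivisions_per_beat=4):
    """Return each onset's signed offset (in seconds) from its nearest grid point.

    Assumes a constant tempo and a grid that starts at time zero.
    Positive values are late relative to the grid, negative values early.
    """
    step = 60.0 / tempo_bpm / subdivisions_per_beat  # grid spacing in seconds
    deviations = []
    for t in onsets_sec:
        nearest = round(t / step) * step  # closest grid point to this onset
        deviations.append(t - nearest)
    return deviations


def is_rigidly_quantized(onsets_sec, tempo_bpm, subdivisions_per_beat=4,
                         tolerance_sec=0.001):
    """True if every onset lies within tolerance_sec of a grid point."""
    devs = grid_deviations(onsets_sec, tempo_bpm, subdivisions_per_beat)
    return all(abs(d) <= tolerance_sec for d in devs)


# A machine-like pattern and a slightly "played" one, both at 120 bpm.
machine = [0.0, 0.125, 0.25]
played = [0.0, 0.135, 0.245]
print(is_rigidly_quantized(machine, tempo_bpm=120))
print(is_rigidly_quantized(played, tempo_bpm=120))
```

Under these assumptions, a sequence whose deviations are all zero would correspond to the "robotic" case described above, while small nonzero deviations would be the kind of trace that suggests a performing body.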

Often, popular computer music plays in the gray area between bodily presence and electronic impossibility. Again, an example from electronica displays this playful ambiguity [CD-55]. A sampled "beat" -- i.e., a brief recording of a human drummer -- is sliced into small temporal units. These units are played back in rearranged orders, sped up or slowed down, multiply triggered, and otherwise manipulated electronically. Because the original sampled recording bears the microrhythmic traces of embodiment, the result sounds something like a human drummer improvising with often amusing flourishes and ample metric ambiguity. Momentarily regular, almost human-sounding pseudo-drumming devolves into inhumanly rapid sequences of rhythmic attacks, fast enough to resemble digital noise. Such electronic manipulation of familiar musical sounds serves to problematize the listener's ecologically sound image of a human drummer.
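The slicing and rearranging just described can be sketched in a few lines. This is a hedged illustration rather than a description of any particular producer's tools: the recording is stood in for by a plain list of sample values, the slices are of equal length, speeding up is done by naive decimation (which also raises pitch), and all function names are hypothetical.

```python
def slice_beat(samples, n_slices):
    """Cut a sample sequence into n_slices equal-length units.

    Any leftover samples at the end are dropped.
    """
    size = len(samples) // n_slices
    return [samples[i * size:(i + 1) * size] for i in range(n_slices)]


def rearrange(slices, order):
    """Concatenate slices in the given order; an index may repeat."""
    out = []
    for idx in order:
        out.extend(slices[idx])
    return out


def retrigger(unit, times):
    """Restart a slice 'times' times within roughly its original duration.

    Only the opening fraction of the slice is heard on each trigger.
    """
    size = len(unit) // times
    return unit[:size] * times


def speed_up(unit, factor):
    """Naive speed-up by keeping every factor-th sample (integer factor)."""
    return unit[::factor]


# Stand-in for a recorded drum break: sixteen numbered samples.
beat = list(range(16))
units = slice_beat(beat, 4)
pattern = rearrange(units, [2, 0, 0, 3]) + retrigger(units[1], 2)
print(pattern)
```

Because each slice is copied intact, whatever microtiming lies inside a slice survives the rearrangement, which is consistent with the observation that the result still carries traces of the original drummer.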

Another prime example of the play of embodiment in contemporary popular music is the hip-hop DJ, who treats the turntables as a kind of percussion meta-instrument [CD-56]. Using strategically chosen segments of a vinyl record, the DJ moves the record back and forth with one hand, while creating amplitude envelopes with a fader on a mixer in the other hand. The sound generated is of two general types: one is a percussive scratch derived from rapid motion of the record, and the other is a recognizable, meaningful fragment of recorded music or sound. The latter stroke type often hides the sophisticated, impeccably timed physical gestures involved in its creation, as these gestures are unrelated to the sonic material. The scratch sound, however, bears a direct sonic resemblance to the physical motion involved. There is an interesting continuum between these two general types. A fragment of recorded sound can be manipulated percussively in a manner that temporarily overrides its referential content, causing it to refer instead to the physical materiality of the vinyl-record medium, and more importantly to the embodiment, dexterity and skill of its manipulator.

This play with the ambiguity of embodiment also appears in some more experimental realms of computer music. Improvising kotoist Miya Masaoka [CD-57] augments the physical capacities of her wooden, stringed classical instrument with electronic sensors that drive a bank of synthesizers and samplers. The sensors track her physical gestures as well as the pitch material from each string. Her creative mapping of this data to the electronic sound sources results in a sort of electronic-acoustic hybrid instrument, which she calls the "Koto-Monster." As an improvisor, she makes use of this expanded palette with an organic seamlessness that blurs the boundary between the acoustic (physical, embodied) and the electronic (artificial, disembodied) realms. In a similar vein, Laetitia Sonami has developed a sleek lady's glove into a sensitive gestural controller that tracks dozens of dimensions of manual movement. In her performances, she transforms primal gestures of the hand into sonic elements that seem to bear the trace of these gestures. Her sound material often consists of non-melodic, sampled sounds that also reference their own physical sources (speaking voices, animal sounds, wind, etc.). The result is as fascinating to watch as to hear, as one discerns an emergent connection between the hand motions and the disembodied sound material, and one becomes aware that a versatile instrument is being played with great skill.

Also in the realm of sensitive, expressive controllers, David Wessel [CD-58] has developed a productive framework for improvising with the Buchla Thunder, a novel electronic instrument with two dimensions of continuous control (position and pressure) for each fingertip. In his setup, each pressure controller acts as a volume fader that brings up a computer-driven rhythmic process, which is always in motion. The position control of each finger is set to manipulate various rhythmic parameters of the associated process, such as density, timing, or timbre. The ten fingers can create richly variable musical gestures that act as larger constructs on top of an implied rhythmic undercurrent. The musical material thus generated has the quality of hand gestures, but the musical totality is not the simple result of these gestures. Rather, it is the novel interaction between the hand motions and the computer rhythm engine that gives rise to the hybrid musical texture.

Working in a slightly different paradigm, improvising trombonist George Lewis [CD-59] has built a computer program that improvises polyphonically, with other musicians or without them. Quite expansive in timbral scope, it gives the sense of an improvising orchestra that produces focused, flowing music. Its output relates to its human colleague's sonic input just enough to convince us that it is listening, without sounding imitative. When unstimulated by musical input, it "takes a solo," seemingly unfazed. Aesthetically it fits quite squarely into the world of improvised music of the last three or four decades. It draws its inspiration from the collective improvisations of artists like the Art Ensemble of Chicago, Anthony Braxton, and Muhal Richard Abrams, all of whom (along with Lewis himself) are associated with the Chicago-based African-American musical collective known as the Association for the Advancement of Creative Musicians. Indeed, the program makes musical choices with enough depth and wisdom that one easily forgets that it is a purely disembodied computer program. In listening to this piece of artificial intelligence, we begin to perceive what we call a sound -- the sonic traces of a creative personality. Lewis's work addresses these issues of embodiment, by creating a distinct sense of embodied artistry out of a laptop computer and some synthesizer modules.

We can imagine a multitude of further possibilities of exploration of a meaningful continuum between these two poles of absence and presence. For her recent electronica album, pop diva Madonna says that she wanted to explore the possibility of giving that music's characteristically inhuman sound "a soul" (Rule 1998). Her solution was heartfelt (if exceedingly banal) lyrics delivered by an instantly recognizable, celebrity voice [CD-60]. But now that we have begun to analyze the actual sonic trace of the human body in the microrhythmic content of instrumental music, perhaps we can further problematize the longstanding cliché that electronic music fails to provide a sense of soul. For what is soul in music, if not a powerfully embodied human presence? [CD-61]

 
