2. Defining Terms

Given that this is an interdisciplinary work with a multiperspectival audience, it is necessary to assure ourselves that we are all talking about the same things. Cognitive science, itself an interdisciplinary field, is plagued by the slippage caused by multiple simultaneous meanings of commonly used terms. Philosophers and neuroscientists can barely agree on the meaning of the word "mind," the supposed primary object of study. Similarly, the commonly used terms of music theory and music psychology are clouded by imprecise connotations. Often their scientific meanings, vague as they are, are eclipsed by their colloquial and artistic usages. We must be careful and thorough when using such terms in a scientific context, especially when attempting to describe cross-cultural phenomena. Invariably, these terms have multiple dictionary meanings, multiple meanings implied by common usage, and multiple meanings agreed upon by a community of scholars in the field of music perception and cognition.

In this section I examine a number of terms that arise in the study of rhythm perception, cognition, and production. I will focus on the last body of definitions, and will attempt to remain aware of the slippage among these three kinds of meaning. The nature of an interdisciplinary work makes such semantic mismatches inevitable, as does the nature of an interdisciplinary field such as cognitive science.

Cognitive science. The field of cognitive science consists of an interdisciplinary study of the structures of the human mind. These structures include our sensory/perceptual apparatus, such as vision, audition, olfaction; internal mental processes such as language, thinking, reasoning and problem solving; motor control and the organization of skilled behavior such as speech and musical performance; memory; consciousness; attention; and many other aspects of mind. All of these subfields are clearly intertwined. Disciplines included are psychology, biology, neuroscience, philosophy, anthropology, linguistics, sociology, and computer science; more recently, the academic music world has devoted some of its resources to the study of the cognitive science of music.

Often the claim is made that cognition is information processing: "Cognitive science is the study of information processing, and insofar as a discipline studies that, then it is part of cognitive science." (Hardcastle 1996: 8) This claim is true in the most general sense, in that the mind is continually taking in information and dealing with it in some way. In the past, the further claim was made that the mind is merely an example of a formal symbol-manipulating device that executes mental programs, where any such device would do (e.g. von Neumann 1951). As such, it was assumed that one could ignore the physical and biochemical particulars of the brain. Theories of mental processes then consisted of abstract mathematical models that had the correct input-output characteristics and various functional or causal relations. This kind of theorizing assumed a fundamental dichotomy between the body and the mind -- a problematic dichotomy that dates back to Plato.

Eventually the neurosciences began to make enough advances that it became advantageous to adopt the perspective that an organism's cognition is intimately connected with its strategy for survival within particular ecological environments. Nervous systems are not von Neumann's idealized general-purpose computers. Rather, our neural architecture evolved to fulfill a certain range of specific needs and to facilitate certain activities. For example, the general organization of the brain is best explained if we view the nervous system as a generator of motor output. The cerebellum is connected almost directly to all areas of the brain -- sensory transmissions, reticular (arousal/attention) systems, hippocampus (episodic memories), limbic system (emotions, behavior). All areas of our brain seem geared to coping with their functions as they pertain to problems of motor control. The brain is understood in enough detail that one can model it computationally at a fairly low level. (Hardcastle 1996: 6-7)

Furthermore, recent authors (e.g. Shore 1996) have viewed cognition as inextricably linked with the environment that gives rise to it. It has been suggested that "there is reason to suspect that what we call cognition is in fact a complex social phenomenon... 'Cognition' observed in everyday practice is distributed -- stretched over, not divided among -- mind, body, activity and culturally organized settings (which include other actors)... [Cognitive] 'activity' takes form differently in different situations." (Lave 1988: 1)

Overall, the field of cognitive science consists of research on parallel fronts: to name a few, we have the neuroscientific study of the structure of the brain, the psychological study of mental abstractions, and the socioanthropological study of the shaping of these mental structures by culture.

Cognition. In view of the above description, this term operates as a frame for a huge variety of activities and mental processes. It is occasionally distinguished from perception, to denote "higher"-level abstractions. The term also carries the implication that these processes can be described scientifically; i.e. the use of the term "cognition" references the discipline of cognitive science and all of its elaborations.

Perception. Perception typically refers to the activity of processing physical input (e.g. pressure waves, photons) into convenient abstractions (e.g. pitch/timbre, color). In the past, perception was seen as a set of processes distinct from cognition, but today (in part due to the research agenda of the vast field of cognitive science) it is seen as subsumed under the umbrella term cognition. The distinction is often blurred; hence we have a journal called Music Perception, and a Society for Music Perception and Cognition, both of which are concerned with the same research agenda.

Perception and cognition are often used in ways that attempt to transcend the influence of culture. However, it has been argued more recently that what is commonly called "perception" should be viewed as a practice -- an open-ended, intentional activity that is accomplished actively by the musical participants, while profoundly influenced by the perceiver's social context. (Berger 1997, Bourdieu 1977) Perception should not be seen simply as a raw, sensory inevitability, like a sensation. Nonetheless, there are aspects of perception that are universal, attributable to the human sensory apparatus alone, and can be studied in this way. In the work that follows, I appeal to both sides of this incarnation of the nature/nurture debate.

Cognitive model. A cognitive model may comprise a "circle & arrow theory" of how some aspect of cognition is structured (e.g. information processing stages), or a set of equations with the proper input-output specifications and some internal structure that is believed to represent some aspect of cognition. In studying a cognitive model, one considers issues such as predictive power and model uniqueness. In other words, one examines whether the model can foresee any traits of the aspect of cognition it claims to govern, and also whether success of the model logically excludes other possible models with the proper I/O mapping.

Representation. A representation is nothing more than some way of organizing, manipulating, and storing information. Because of the overlap between cognitive science and computer science, a cognitive representation is often discussed in the same terms as a computational data structure, or a set of such structures -- usually seen as a disembodied, symbolic abstraction, possibly shared by some group of computational or mental processes. The internal organization of a representation, i.e. its data structures and attributes, might be meant to reflect a theory of mind. In such cases, to propose a certain cognitive representation, say for rhythm, means to posit an assertion about cognition; in this sense a representation can be a kind of model. The use of representations is linked traditionally to the information-processing or "cognitivist" view of the mind.

Varela et al. (1991) describe the "cognitivist" view of cognitive science as follows: Intelligent behavior presupposes the ability to represent the world internally; an agent acts by representing relevant features from its environment; the success of the agent's behavior depends on the accuracy of its representations. While this may not seem terribly controversial, the added claim of cognitivism might: namely, that these representations are physically realized in the form of a symbolic code in the brain or in a machine. (Varela et al. 1991: 40)

More recently, Clancey (1997) has broken down this distinction by exploring the difference between a mechanism and a descriptive model. He says that "the descriptive modeling literature often equates knowledge, knowledge representations, representations, mental models, knowledge base, concepts..." (Clancey 1997: 50) but claims that instead, "we have a memory for coordinated, interacting processes, not for the descriptions of them per se. These processes correspond to the activation, recategorization, and coordination of perceptual-conceptual-motor sequences and other temporal relations, including rhythm and simultaneity." (ibid.: 68-69) The distinction is made between actual information and the description of that information, akin to the distinction between a city and a map of that city. Clancey also points out that descriptions are not the only form of representation involved in cognition, and storage is the wrong metaphor for memory. (ibid.: 221) As an alternative example he invokes the self-organization with memory of neurobiological systems, where the information lies in the physical organization of interconnected neural pathways, qualitatively different from the mechanism of storage and retrieval systems. (ibid.: 224) In such cases it is unclear where one would locate the role of a representation, other than as a purely descriptive picture of the system.

Music. A definition of music would seem to be necessary, but I will not attempt such a maneuver. However, it is enlightening to discuss problems one might encounter in constructing such a definition.

(1) Many disparate activities may be classified as musical. The two primal sound-generating activities of rhythmic motion of the limbs and melodic outpourings of the voice form a basis for many kinds of music. A whole continuum of instruments in the wind, string, and percussion families derive from, and even refer to, these two fundamental acts. Also, musical activities can appear non-musical in certain contexts, and vice-versa; furthermore, the designation "musical/non-musical" is highly culturally contingent. What passes for non-music (e.g., sine-tone sequences in psychoacoustic experiments) can still be perceived musically, i.e., perceptually organized according to a listener's culturally contingent music-listening strategies. It is not clear where in our perceptual/cognitive systems we would mark the cutoff between sociocultural contingency and psychological fact. Hence, some such experiments may not be as close to measuring cognitive universals of music as some experimenters might believe.

(2) Music possesses different status and roles in different cultures and subcultures. In the west we have many musics associated with various communities: concert music's high-culture spectacle, the colloquial pulsating functionality of an urban hit single, a Hollywood film score's emotional manipulations, the precise environmental design of elevator music. In many such cases it does not make sense to discuss "the musical object" divorced from its context. A full understanding of the perception and cognition of music must include these disparate functions of music. For one rarely attends to elevator music with the attention that one is obliged to give in a concert hall, nor does one often give concert music the level of physical engagement normally expected on a dance floor. It appears that these sociocultural boundaries often serve to delimit not only music's functionality but also our reception, attention, and understanding. When speaking of music cognition, we should also address musical functionality, and hence we should keep these social factors in mind.

Grouping. By grouping we usually mean the perceptual or cognitive unification of some series of contiguous events or stimuli, due to proximity and/or similarity (Lerdahl & Jackendoff 1983). It can also mean a unification of non-contiguous events or stimuli due to periodic or non-periodic repetition (Parncutt 1994). In music-perception studies, grouping usually refers to the temporal domain, though synchronous tones can also be perceptually grouped, to form a complex timbre or harmony.

Rhythm. I propose that we construe rhythm broadly, as any perceived or inferred temporal organization in a series of events. The organization itself need not be cognized thoroughly; it may merely be perceived to exist. The perception of rhythm occurs usually because of some kind of perceptual grouping of events. Rhythm need not evoke a sensation of pulse (e.g. circadian rhythms, conversational speech, most Western contemporary concert music), nor need rhythm be "intended" by the producer of these events (ocean waves, sewing machines); nor need it be constrained to any sensory modality, social context, or timescale. In particular, rhythm can be but need not be a raw sensory phenomenon. Note that this listener-centered definition allows that one person may perceive rhythm where another does not.

Pulse. Literally, pulse denotes any periodicity inherent or perceived in any rhythm or combination of rhythms. It also strongly connotes isochrony (i.e., a fixed tempo), often connotes some degree of perceptual "salience," and weakly connotes an approximate frequency range between 1.2 and 3.3 Hz (a fuzzy category known as the tactus range, also the range of the human heartbeat pulse, human locomotion, and the infant sucking reflex). However, from a scientific perspective, these connotations could be dispensed with entirely in the definition; they seem to derive from the stricter meaning via linguistic and cultural usage of the term itself. Beat is roughly equivalent to pulse, except that it is more variable; in some contexts it more strongly connotes the tactus range, whereas in others it functions at any timescale. Beats can also function as an abstract quantity, so that notes may last fractions of a beat. Some authors (Lerdahl & Jackendoff 1983) have insisted that beats are to be seen as points in time, rather than as intervals of time. However, common usage suggests other possible meanings. Hence a beat can mean both the discrete time point at which an interval begins (as in "start this note on the beat") and the continuous interval between such time points (as in "hold this note for three-and-a-half beats").

Tactus. The tactus has long been understood to mean the moderate-tempo pulse present in most rhythmic music. Typically when asked to tap a finger or foot to a piece of music, listeners choose a regular time period that is in the approximate range of 300 to 800 milliseconds, averaging a little slower than 2 beats per second (Fraisse 1982). As the music gets faster, a listener is inclined to find progressively slower pulses such that they fit within this range, and vice-versa. The tactus range is also the range of "spontaneous" tempo, that is, of the tempo produced by the typical person asked to tap a steady pulse. This range coincides with a moderate walking pace, a human heartbeat, the rate of jaw movement in chewing, and the infant sucking reflex. It is also a fairly comfortable rate at which to tap a foot or a finger, since it is neither too fast for motor control, nor too slow for accurate, regular timing. Hence the tactus seems to correspond to natural timescales involved with human motion; we might imagine a chipmunk to have a faster tactus.
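The way a listener's chosen pulse migrates to a slower or faster metric level as the tempo changes can be illustrated with a small sketch. It is only an illustration under stated assumptions: the 300 to 800 ms bounds are taken from the range quoted above, while the function names and the set of candidate multipliers are hypothetical choices of mine, not a model proposed in the literature.

```python
# Approximate tactus range quoted in the text (Fraisse 1982), in milliseconds.
TACTUS_MIN_MS = 300.0
TACTUS_MAX_MS = 800.0


def hz_to_period_ms(freq_hz):
    """Convert a pulse frequency in Hz to its period in milliseconds."""
    return 1000.0 / freq_hz


def tactus_candidates(base_period_ms, multipliers=(0.25, 0.5, 1, 2, 3, 4)):
    """Return the multiples of a notated pulse period that fall in the tactus range.

    The multipliers are an illustrative assumption: they stand for common
    subdivisions (0.25, 0.5) and groupings (2, 3, 4) of the notated pulse.
    """
    periods = [base_period_ms * m for m in multipliers]
    return [p for p in periods if TACTUS_MIN_MS <= p <= TACTUS_MAX_MS]


# A fast notated pulse (250 ms, i.e. 240 per minute) is too quick to tap,
# so only its groupings in twos and threes land in the tactus range.
fast = tactus_candidates(250.0)    # [500.0, 750.0]

# A slow notated pulse (1000 ms) is too slow, so its halving is preferred.
slow = tactus_candidates(1000.0)   # [500.0]
```

In both cases the candidate that survives is a different metric level from the notated one, which is the sense in which the tactus is tied to a timescale and not to a notational unit.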

Polyrhythm. This term literally means multiple rhythms appearing simultaneously; it is simply polyphony viewed in its rhythmic dimension. Polyrhythm also frequently connotes multiple cyclically recurring rhythms, but only because the term is used often in conjunction with African musics, in which cyclic rhythms are commonplace. Cyclicality itself is not inherent in polyrhythm. Contrary to how it is often discussed in the literature, the individual rhythms in musical polyrhythm are usually more complex than mere periodic pulses. (I would call the latter construct polycycle, or trivial polyrhythm, not because it is trivial to reproduce but because it is assembled from trivial individual rhythms; an example would be what we call "three-against-four.")
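To make the notion of a polycycle concrete, the following sketch lays two periodic pulses on their smallest common grid. The function names are hypothetical, and the sketch deliberately covers only the trivial case of superimposed isochronous pulses, not polyrhythm in the broader sense defined above.

```python
from math import lcm  # available in Python 3.9 and later


def polycycle(a, b):
    """Place an a-pulse cycle and a b-pulse cycle on their common grid.

    Returns the grid length and the grid positions of each pulse stream.
    """
    n = lcm(a, b)
    onsets_a = [i * (n // a) for i in range(a)]
    onsets_b = [i * (n // b) for i in range(b)]
    return n, onsets_a, onsets_b


def render(n, onsets):
    """Draw one cycle as text: 'x' marks an onset, '.' an empty grid point."""
    return "".join("x" if i in onsets else "." for i in range(n))


n, threes, fours = polycycle(3, 4)
print(render(n, threes))  # x...x...x...
print(render(n, fours))   # x..x..x..x..
```

Each stream taken alone is a bare periodic pulse; whatever complexity there is arises only from their superposition on the twelve-point grid, which is why I call the construct trivial.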

Meter. Most generally, meter is a periodic grouping of a musical time unit. Traditionally in European concert music, meter connotes a hierarchy of weak and strong beats. However, as I shall elaborate in chapter 5, meter can exist without such a hierarchy. Meter denotes a subharmonic (or grouping) of a pulse, and might also imply a higher harmonic (or subdivision) of the same pulse. That is, it can simultaneously group and subdivide pulses into regular units. For example, the time signature 6/8 denotes a cycle of two pulses each divided into three equal subunits. Note that meter is treated as a periodic grouping of pulses -- i.e. as a cognitive/perceptual phenomenon, not as an objective reality of the acoustic signal. However, this distinction is often elided, so we might speak of the meter of a piece of music. (See chapter 5 for more on this issue.)
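The simultaneous grouping and subdividing described here can be sketched as a small data structure. This is an illustration only: the function names are hypothetical, and the comparison with 3/4 is added to show that the same six time units can be grouped in two different ways.

```python
def metric_grid(pulses_per_cycle, subdivisions_per_pulse):
    """List the (pulse, subdivision) position of every time unit in one cycle."""
    return [
        (pulse, sub)
        for pulse in range(pulses_per_cycle)
        for sub in range(subdivisions_per_pulse)
    ]


def pulse_onsets(grid):
    """Return the indices of the time units on which a pulse begins."""
    return [i for i, (_, sub) in enumerate(grid) if sub == 0]


six_eight = metric_grid(2, 3)     # two pulses, each divided into three
three_four = metric_grid(3, 2)    # three pulses, each divided into two

print(len(six_eight), pulse_onsets(six_eight))    # 6 [0, 3]
print(len(three_four), pulse_onsets(three_four))  # 6 [0, 2, 4]
```

Both cycles contain six equal units, so nothing in the count of units distinguishes them; the difference lies entirely in how the units are grouped, which is consistent with treating meter as a perceptual grouping and not as a property of the acoustic signal.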

Expressive timing. Some theorists have tried to divide music into its structural and expressive components. This distinction also tends to fall along the same lines as discrete versus continuous elements. Expressivity in performance is taken to mean that which deviates from regularity; one can be expressive with intonation, with dynamics, with tempo and other kinds of timing. The regularity of a group or unit can be taken to mean, according to Seashore (1938), the norm set by the unit itself; hence the understanding of expression amounts to a kind of statistics. Expressive timing has come to mean the ways in which performers deviate from strict metronomicity.

The separation of the structural and the expressive is of course a problematic distinction, for it (1) suggests that composers cannot be expressive, and (2) presupposes a distinction between the fixed and the regular, between the composition and the performance. At worst, however, this distinction amounts merely to an unfortunate choice of wording, or an impoverished definition. But just as the term "representation" has been appropriated by the cognitive-science community, so the term "expression" has been redefined provisionally by the cognitive-musicology community to refer to a subset of what is typically seen as expressive performance. While I am aware of this issue, I will use the terminology as it is used in the literature on expressive timing. Namely, the term refers to the differences, along the temporal dimension, between a "score" and a "performance," or in the case of improvised groove-based music, between regularity of the underlying meter and fluidity of the performed rhythms.

Microtiming. Microtiming, as I have been using it, refers to expressive timing at the sub-tactus level, characterized by high-frequency activity. It is complementary to tempo modulation, which has a low-frequency emphasis. It corresponds to Bilmes's (1993) concept of deviation, but microtiming is more general since it doesn't connote an ideal metric referent (as in deviation from something). Microtiming refers to the entire range of sub-tactus, non-notatable rhythmic expression, pertaining both to music and to speech, from which much musical rhythm originates.

Groove. I believe that African and African-American dance musics and their descendant genres should be treated in terms of a "groove" -- which might be described (but not defined) as an isochronous pulse that is established collectively by an interlocking composite of rhythmic entities. A groove tends to feature a high degree of regularity but also conveys some sense of animation. Groove involves an emphasis on the process of music-making, rather than on the syntax (Keil & Feld 1994). The focus is less on coherence and the notes themselves, and more on spontaneity and how those notes are played. Groove concerns the animation and decoration of time as it is shared by musicians and audience. This relates to the functional role of African and African-American musics in their communities. It is worthwhile to point out the common observation that both African and African-American peoples exhibit a cultural tendency to treat music on a functional basis. That is, music is not merely treated as a work of art for art's sake, but as an activity that is integral to life, partially structuring everyday reality.

A salient feature of groove-based musics seems to be the attentiveness to an additional unifying rhythmic level below the level of the tactus. For example, if the quarter note is the tactus, one may also focus on the sixteenth note to heighten rhythmic precision. It has been verified experimentally that discrimination of long temporal intervals is more variable than discrimination of short ones (Weber's law), and the sum of the variances of n subdivisions is a factor of n smaller than the variance of the total interval. Hence we actually gain accuracy in timing a moderate pulse by subdividing it. According to Fraisse (1982), music listeners typically divide rhythmic intervals into two categories, long and short. These intervals are usually in the ratio of 2:1, indicating that the smaller interval is a subdivision of the larger one; also, the long interval is usually in the tactus range, whereas the short one lies in the subtactus range. Fraisse notes that these two categories have different perceptual implications. During the long intervals, we can be aware of the passage of time, whereas we do not sense temporal extent during the short intervals. However, we can have qualitative awareness of the grouping of numbers of such brief intervals (Fraisse 1956, cited in Clarke 1999), such as their "two-ness or three-ness, [or] accentedness or unaccentedness" (Brower 1993: 25). For this and other reasons, the smallest operative musical subdivision of the tactus has been referred to as a "temporal atom," which was then abbreviated by Bilmes (1993) to "tatum" in homage to the master pianist, Art Tatum [CD-1].
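The arithmetic behind the claim about subdivision can be checked with a short sketch. It rests on two assumptions that I state explicitly: that the standard deviation of a timed interval is proportional to its duration (one common reading of Weber's law), and that the n subdivisions are timed independently so that their variances add. The Weber fraction of 0.05 is an arbitrary illustrative value, and the function names are hypothetical.

```python
def interval_variance(duration_ms, weber_fraction=0.05):
    """Timing variance of one interval, assuming sd = weber_fraction * duration."""
    sd = weber_fraction * duration_ms
    return sd ** 2


def subdivided_variance(duration_ms, n, weber_fraction=0.05):
    """Variance of the same total interval when timed as n equal subdivisions.

    Assumes the subdivisions are timed independently, so variances sum.
    """
    return n * interval_variance(duration_ms / n, weber_fraction)


whole = interval_variance(600.0)         # one 600 ms tactus interval
split = subdivided_variance(600.0, 4)    # four 150 ms subdivisions
ratio = whole / split                    # equals n = 4, up to rounding
```

Under these assumptions each subdivision has variance proportional to (T/n)^2, so n of them sum to a variance proportional to T^2/n, which is the factor-of-n reduction stated above. The ratio does not depend on the Weber fraction chosen.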

Groove has no correlate in European concert music, and is therefore indescribable by models derived from it. Groove-based musics do not often feature the phrase-final lengthening, ritardandi, accelerandi, rubati, or other expressive tempo modulations of European classical music; rather, they involve minuscule, subtle microtiming deviations from rigid regularity, while maintaining overall pulse isochrony. This mode of rhythmic expression has a whole tacit grammar unto itself, with its own set of esthetics, techniques, and methods of development. To our knowledge, while much research effort has focused on the investigation of the aforementioned tempo-modulating phenomena (e.g. Longuet-Higgins 1982, Todd 1989, Repp 1990), very little attention has been devoted to expressive timing in the context of an isochronous pulse or groove. Such microtiming deviations are sometimes described as "small accelerations and decelerations" (Magill & Pressing 1997), i.e. in terms of a larger construct called tempo, as if to imply the existence of some kind of musical time independent of the musical events that shape it.

Bilmes (1993) developed a model for groove-based expressive timing that features two simultaneous isochronous pulses, one at the foot-tapping tactus level (with a period typically between 300 and 800 ms), and another, the temporal atom or tatum, at the smallest operative subdivision of that pulse (typically 80 to 150 ms). The onset time of a note occurring on a specific tatum (i.e. a specific sixteenth, twenty-fourth, or other such note) can be transformed by a continuous deviation from perfect quantization. Hence rhythmic expression can occur at the tatum level without perturbing the overall tactus or tempo. This representation is described and expanded upon in chapter 7.
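The two-level idea can be sketched in a few lines. This is my own minimal illustration of the description above, not Bilmes's implementation: the function names, the tuple layout, and the particular deviation values are all hypothetical, chosen only to show how a note's onset combines a discrete tatum position with a continuous deviation.

```python
def tatum_duration_ms(tactus_ms, tatums_per_tactus):
    """Duration of one tatum, given the tactus period and its subdivision."""
    return tactus_ms / tatums_per_tactus


def onset_times_ms(notes, tactus_ms, tatums_per_tactus):
    """Compute performed onset times from (tatum_index, deviation_ms) pairs.

    Each onset is its quantized grid time plus a continuous deviation, so
    the grid itself (and hence the tactus) is never altered.
    """
    tatum_ms = tatum_duration_ms(tactus_ms, tatums_per_tactus)
    return [index * tatum_ms + deviation for index, deviation in notes]


# A 500 ms tactus divided into four tatums of 125 ms each.
# The deviations (in ms) are made-up values for illustration.
notes = [(0, 0.0), (1, 12.0), (2, -8.0), (4, 0.0)]
print(onset_times_ms(notes, tactus_ms=500.0, tatums_per_tactus=4))
# [0.0, 137.0, 242.0, 500.0]
```

The notes on tatums 1 and 2 are played late and early respectively, yet the note on tatum 4 still falls exactly one tactus period after the first, which is the sense in which expression at the tatum level leaves the tactus unperturbed.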

Attention. Attention can be described or defined in numerous ways. As Jones & Yee (1993: 70) put it, "Ultimately, definitions of attention become theories of attention." It has been described variously as the allocating of information-processing resources to a specific source of information, frequently to the neglect of others; the differential processing of simultaneous sources of information; or simply, the mind's ability to focus and concentrate. It is believed by many that musical meter provides us with an attentive mechanism -- a temporal template against which to process information in time, reducing demands on memory. This issue is discussed in chapter 5.

Now, armed with these terms, we will turn our attention to the cognitive science of rhythm perception and cognition.

 
