Pitch Database

Database of Challenging Musical Sounds for Evaluation and Refinement of Pitch Estimators

Adrian Freed and Tristan Jehan

Introduction

Speech researchers have made the most thorough study of the performance of pitch estimation algorithms. A key to their work is the evaluation of algorithm performance against standardized databases of speech that have been "hand" analyzed. Such a database does not exist for musical signals. As a result, pitch estimation papers in the computer music community describe algorithms evaluated using short sound examples often chosen to show new work in the best light. It is thus impossible to predict performance of published algorithms in real musical situations, and difficult for researchers to identify fruitful areas for new work. We describe a publicly available database of musical sound files intended to redress these difficulties.

Musical Sound Database

Sounds in this database can be grouped into two important and hitherto poorly represented categories:

Complete musical phrases are used to evaluate the impact of estimation errors in common and realistic musical contexts.
Challenging examples are used to identify particular points of weakness from which an algorithm may suffer. Included are sounds with: pitch-synchronous and additive noise, room ambiance, cross-talk from adjacent strings, ambiguous octaves, inharmonicity, missing fundamentals, glissandi, vibrato and trills.

Access

The database will be available in early 1998 at http://www.cnmat.berkeley.edu/Research/Pitch. You will be able to submit your own files to this database by filling in a form at the site. This form represents a contract that establishes you as the owner of the rights to the submitted files and grants permission for their analysis and redistribution.

AIFF is the chosen format for the sound files and SDIF for analyses of these files. The SDIF pitch frame type allows for a weighted set of pitches, which accommodates virtual pitches for inharmonic sounds and the management of multiple pitch estimates.
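The idea of a weighted set of pitches per frame can be illustrated with a small in-memory structure. This is only a sketch: the class `PitchFrame` and its methods are hypothetical names chosen for illustration, and it does not reproduce the SDIF byte layout or any SDIF library API.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PitchFrame:
    """Hypothetical stand-in for one pitch analysis frame (not the SDIF layout)."""

    time: float  # frame time in seconds
    # Each candidate is (frequency in Hz, non-negative weight).
    candidates: List[Tuple[float, float]] = field(default_factory=list)

    def add(self, frequency_hz: float, weight: float) -> None:
        """Record one pitch candidate with its weight."""
        if frequency_hz <= 0 or weight < 0:
            raise ValueError("frequency must be positive and weight non-negative")
        self.candidates.append((frequency_hz, weight))

    def normalized(self) -> List[Tuple[float, float]]:
        """Return the candidates with weights rescaled to sum to 1."""
        total = sum(w for _, w in self.candidates)
        if total == 0:
            return []
        return [(f, w / total) for f, w in self.candidates]

    def best(self) -> Optional[Tuple[float, float]]:
        """Return the highest-weighted candidate, or None for an empty frame."""
        return max(self.candidates, key=lambda c: c[1], default=None)


# An octave-ambiguous frame: two candidates an octave apart.
frame = PitchFrame(time=0.5)
frame.add(220.0, 3.0)
frame.add(440.0, 1.0)
```

Keeping every candidate, rather than only the winner, lets a later stage resolve octave ambiguities using context from neighbouring frames.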

Database Overview

Wind

Shakuhachi
Organ Flue Pipes
Suling
Didgeridoo and Stick
Clarinet
Bass Clarinet

Singing

Indian
Bel Canto
Western Popular
Tibetan

String

For these string sounds a wide range of playing techniques was used, including: open strings, low and high stopped, low and high frequency vibrato, narrow and wide trill, timbre change, sul ponticello, glissandi, tremolo near and away from the bridge, pizzicato, pizzicato stopped, slow bow change, harmonics, damped harmonics, hammer-ons and pull-offs, picked, left and right hand damping, slaps, bottleneck slide and pops.

Cello
Violin
Guitar
Bass

Brass

Trombone
Trumpet

Percussion

Tabla
Bells
Piano

Analysis

In parallel with the archival activity of assembling this database, we are exploring automatic segmentation and parameter estimation tools to develop analyses of the sounds against which algorithms may be judged. Early results using a wavelet technique are very promising. The wavelet method identifies each pitch period and provides a "voiced/unvoiced" estimate. Combining this with energy-based techniques yields good estimates for the pitched regions of a phrase. The estimator is robust to both impulsive and continuous noise.
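One way to picture the combination of a voiced/unvoiced decision with an energy measure is the sketch below. It is not the wavelet method itself: the voicing flags are taken as given input, and the function names, the non-overlapping framing and the RMS threshold are all assumptions made for illustration.

```python
import math


def frame_rms(samples, frame_size):
    """Return the RMS energy of consecutive non-overlapping frames."""
    values = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        frame = samples[start:start + frame_size]
        values.append(math.sqrt(sum(x * x for x in frame) / frame_size))
    return values


def pitched_frames(voiced_flags, rms_values, rms_threshold):
    """Mark a frame as pitched only if it is voiced and loud enough."""
    return [bool(v) and e >= rms_threshold
            for v, e in zip(voiced_flags, rms_values)]


def regions(flags):
    """Collapse per-frame flags into (first_frame, last_frame) runs of True."""
    runs = []
    start = None
    for i, flag in enumerate(flags):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(flags) - 1))
    return runs


# Silence, a loud segment, then low-level noise; the (assumed) voicing
# detector has flagged every frame as voiced.
samples = [0.0] * 4 + [1.0, -1.0, 1.0, -1.0] + [0.01] * 4
rms = frame_rms(samples, 4)
flags = pitched_frames([True, True, True], rms, rms_threshold=0.1)
```

In this toy input the energy gate rejects the silent and low-level frames that the voicing flags alone would have accepted, leaving a single pitched region.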

Future Work

A set of artificially synthesized test signals
Psychoacoustic experimental harness to develop perceptually solid pitch estimates
Objective measures of pitch estimate accuracy
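As a rough indication of what two of these items could look like, the sketch below synthesizes a missing-fundamental test tone and scores pitch estimates against a known reference. The function names, the equal-amplitude partials and the 50 cent tolerance are assumptions chosen for illustration, not measures defined by this work.

```python
import math


def harmonic_tone(f0, harmonics, sample_rate, duration):
    """Sum equal-amplitude sine partials at the given harmonic numbers of f0."""
    n = int(round(sample_rate * duration))
    scale = 1.0 / len(harmonics)
    return [
        scale * sum(math.sin(2.0 * math.pi * h * f0 * t / sample_rate)
                    for h in harmonics)
        for t in range(n)
    ]


def cents_error(estimate_hz, reference_hz):
    """Signed deviation of an estimate from the reference, in cents."""
    return 1200.0 * math.log2(estimate_hz / reference_hz)


def gross_error_rate(estimates, references, tolerance_cents=50.0):
    """Fraction of estimates further than tolerance_cents from the reference."""
    misses = [abs(cents_error(e, r)) > tolerance_cents
              for e, r in zip(estimates, references)]
    return sum(misses) / len(misses)


# Missing fundamental: partials 2 to 5 of 110 Hz, with no energy at 110 Hz.
tone = harmonic_tone(110.0, [2, 3, 4, 5], sample_rate=8000, duration=0.1)

# Hypothetical estimator output: one correct value, an octave-up error,
# a slightly sharp value and an octave-down error.
estimates = [110.0, 220.0, 111.0, 55.0]
rate = gross_error_rate(estimates, [110.0] * 4)
```

Because the reference pitch of a synthesized signal is known by construction, octave errors appear as deviations of exactly 1200 cents and are easy to separate from small tuning deviations.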

Acknowledgement

This work assembles materials developed over many years of work with support from:

California State Dept. of Commerce
Zeta Music Inc.
Gibson Guitar Inc.
Silicon Graphics Inc.
Apple Computer Inc.