Database of Challenging Musical Sounds for Evaluation and Refinement of Pitch Estimators

Adrian Freed and Tristan Jehan

Introduction

Speech researchers have made the most thorough study of the performance of pitch estimation algorithms. A key to their work is the evaluation of algorithm performance against standardized databases of speech that have been "hand" analyzed. Such a database does not exist for musical signals. As a result, pitch estimation papers in the computer music community describe algorithms evaluated using short sound examples often chosen to show new work in the best light. It is thus impossible to predict performance of published algorithms in real musical situations, and difficult for researchers to identify fruitful areas for new work. We describe a publicly available database of musical sound files intended to redress these difficulties.

Musical Sound Database

Sounds in this database can be grouped into two important and hitherto poorly represented categories:

  1. Complete musical phrases are used to evaluate the impact of estimation errors in common and realistic musical contexts.
  2. Challenging examples are used to identify particular points of weakness from which an algorithm may suffer. Included are sounds with: pitch-synchronous and additive noise, room ambiance, cross-talk from adjacent strings, ambiguous octaves, inharmonicity, missing fundamentals, glissandi, vibrato and trills.
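
As an illustration of how estimation errors on such sounds might be tallied, the following Python sketch compares per-frame pitch estimates with reference pitches and counts gross errors and octave errors separately. The function names, the 50-cent tolerance, and the use of None for unpitched frames are assumptions made for this example; they are not part of the database or of any published evaluation protocol.

```python
import math


def cents_error(estimate_hz, reference_hz):
    """Signed distance from reference to estimate in cents (1200 per octave)."""
    return 1200.0 * math.log2(estimate_hz / reference_hz)


def score_estimates(estimates_hz, references_hz, tolerance_cents=50.0):
    """Count gross and octave errors over frames that have a reference pitch.

    Assumptions for this sketch: a reference of None marks an unpitched
    frame (skipped); an estimate of None on a pitched frame is a gross error.
    """
    pitched = gross = octave = 0
    for est, ref in zip(estimates_hz, references_hz):
        if ref is None:
            continue
        pitched += 1
        if est is None:
            gross += 1
            continue
        err = cents_error(est, ref)
        if abs(err) > tolerance_cents:
            gross += 1
            # An octave error lies within tolerance of a nonzero
            # multiple of 1200 cents.
            nearest = round(err / 1200.0)
            if nearest != 0 and abs(err - 1200.0 * nearest) <= tolerance_cents:
                octave += 1
    return {"pitched_frames": pitched, "gross_errors": gross, "octave_errors": octave}
```

Reporting octave errors apart from other gross errors is one way to expose the "ambiguous octaves" and "missing fundamentals" cases listed above, since an estimator may fail on them in a characteristic way.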

Access

The database will be available in early 1998 at http://www.cnmat.berkeley.edu/Research/Pitch. You will be able to submit your own files to this database by filling in a form at the site. This form represents a contract that establishes you as the owner of the rights to the submitted files and grants permission for their analysis and redistribution.

AIFF is the chosen format for sound file samples and SDIF for analyses of these files. The SDIF pitch frame type allows for a weighted set of pitches, which accommodates virtual pitches for inharmonic sounds and the management of multiple pitch estimates.
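
The idea of a weighted set of pitches per frame can be pictured with a small in-memory structure. The Python sketch below is only an illustration: the class and field names (PitchCandidate, PitchFrame, frequency_hz, weight, time_s) are hypothetical and do not reproduce the SDIF binary layout or its frame and matrix type definitions.

```python
from dataclasses import dataclass, field


@dataclass
class PitchCandidate:
    frequency_hz: float
    weight: float  # relative confidence, assumed non-negative


@dataclass
class PitchFrame:
    """One analysis frame holding several weighted pitch candidates.

    Hypothetical structure for illustration; not the SDIF format itself.
    """
    time_s: float
    candidates: list = field(default_factory=list)

    def normalized(self):
        """Return candidates with weights rescaled to sum to 1."""
        total = sum(c.weight for c in self.candidates)
        if total <= 0:
            return []
        return [PitchCandidate(c.frequency_hz, c.weight / total)
                for c in self.candidates]

    def best(self):
        """Return the highest-weighted candidate, or None if the frame is empty."""
        if not self.candidates:
            return None
        return max(self.candidates, key=lambda c: c.weight)
```

Keeping every candidate, rather than only the strongest, lets a later stage defer the decision between, for example, two octave-related estimates for the same frame.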

Database Overview

Wind
Singing
String

For these string sounds a wide range of playing techniques was used, including: open strings, low and high stopped, low and high frequency vibrato, narrow and wide trill, timbre change, sul ponticello, glissandi, tremolo near and away from bridge, pizzicato, pizzicato stopped, slow bow change, harmonics, damped harmonics, hammer-ons and pull-offs, picked, left and right hand damping, slaps, bottleneck slide and pops.

Brass
Percussion

Analysis

In parallel with the archival activity of assembling this database, we are exploring automatic segmentation and parameter estimation tools to develop analyses of the sounds against which algorithms may be judged. Early results using a wavelet technique are very promising. The wavelet method identifies each pitch period and provides a "voiced/unvoiced" estimate. Combining this with energy-based techniques yields good estimates for the pitched regions of a phrase. The estimator is robust to impulsive and continuous noise.
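
One way a voiced/unvoiced decision might be combined with an energy measure is sketched below in Python. This is not the wavelet method described above: the per-frame voiced flags are assumed to come from some external detector, and the non-overlapping framing, the relative RMS threshold, and the function names are all assumptions made for the example.

```python
import math


def frame_rms(samples, frame_size):
    """RMS energy of each full, non-overlapping frame."""
    count = len(samples) // frame_size
    return [
        math.sqrt(
            sum(x * x for x in samples[i * frame_size:(i + 1) * frame_size])
            / frame_size
        )
        for i in range(count)
    ]


def pitched_regions(samples, voiced_flags, frame_size, rel_threshold=0.1):
    """Return (start_frame, end_frame) pairs, end exclusive.

    A frame is kept only if it is flagged voiced (flags supplied by an
    assumed external detector) and its RMS is at least rel_threshold
    times the largest frame RMS in the signal.
    """
    rms = frame_rms(samples, frame_size)
    peak = max(rms, default=0.0)
    regions, start = [], None
    for i, energy in enumerate(rms):
        voiced = i < len(voiced_flags) and voiced_flags[i]
        keep = voiced and peak > 0 and energy >= rel_threshold * peak
        if keep and start is None:
            start = i                      # a pitched region begins
        elif not keep and start is not None:
            regions.append((start, i))     # the region ended on the previous frame
            start = None
    if start is not None:
        regions.append((start, len(rms)))  # region runs to the end of the signal
    return regions
```

The energy gate discards frames that a detector marks as voiced but that are too quiet to matter, such as silence or low-level ambiance between notes.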

Future Work

Acknowledgement

This work assembles materials developed over many years of work with support from: