Chapter Overview
This chapter gives an overview of the essential structures of the work and how it can be used.
Most of the software developed for this thesis is built as an extension of Open Inventor. Open Inventor is a powerful C++ library providing classes, building blocks, and mechanisms for handling interactive 3D graphics at a high level of abstraction. For a summary of Open Inventor see [23]. TsKit is an extension of Open Inventor providing specialized classes, building blocks, and applications for visualizing sound with animated 3D graphics and synchronized audio playback.
The central data structure of Open Inventor and tsKit is the scene graph, a graph of nodes. Nodes are containers for properties and data, which are stored in typed fields. The scene graph describes a 3D scene that Open Inventor renders using the low-level graphics API OpenGL. TsKit comes as a set of additional nodes, all having the prefix 'ts'.
See figure 2.1 for an overview of the system architecture.
Figure 2.1: The schema shows the software architecture of tsKit.
See figure 2.2 for the structure of a simple scene graph and how it is rendered in a viewer window. The scene starts with a root node (SoSeparator) and contains standard nodes and tsKit nodes: a master clock (tsTime) and its graphical control user interface (tsTimeCtrlUI), a node containing a small two-dimensional signal (tsData2RNFloat), a standard node containing a color (SoMaterial), and finally a node that renders the signal as a surface (tsSurface). Figure 2.3 in the next section shows the corresponding file in the Open Inventor file format and the C++ source code creating this scene.
Figure 2.2: A schematic view of a simple Open Inventor/tsKit scene graph (bottom) and the same scene loaded into a standard viewer with ivview helloTsKit.iv (top).
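The ordering of the nodes matters because a scene graph is traversed from the root downwards and from left to right, so a property node such as SoMaterial affects the shape nodes that follow it. The following is a minimal sketch of that idea in plain C++. It does not use the Open Inventor API: the struct Node and the functions traverse and buildHelloScene are hypothetical names introduced only for illustration, and the flat arrangement of all nodes directly below the root is an assumption about figure 2.2.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for a scene graph node. This is NOT the Open Inventor
// API; it only models a node type name and an ordered list of children.
struct Node {
    std::string type;                             // e.g. "SoSeparator", "tsTime"
    std::vector<std::unique_ptr<Node>> children;  // order defines traversal order

    explicit Node(std::string t) : type(std::move(t)) {}

    // Append a child and return this node so that calls can be chained.
    Node& add(std::string childType) {
        children.push_back(std::make_unique<Node>(std::move(childType)));
        return *this;
    }
};

// Depth-first, left-to-right traversal that records the visiting order.
void traverse(const Node& node, std::vector<std::string>& visited) {
    visited.push_back(node.type);
    for (const auto& child : node.children) {
        traverse(*child, visited);
    }
}

// Builds the node sequence described in the text for figure 2.2.
// Assumption: all nodes are direct children of the root separator.
std::unique_ptr<Node> buildHelloScene() {
    auto root = std::make_unique<Node>("SoSeparator");
    root->add("tsTime")
        .add("tsTimeCtrlUI")
        .add("tsData2RNFloat")
        .add("SoMaterial")
        .add("tsSurface");
    return root;
}
```

Traversing the result of buildHelloScene visits SoMaterial before tsSurface, which is why the color is already set when the surface is drawn.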
TsKit offers the following building blocks as nodes and applications:
tsData1[R|N]Float[File]: Store one-dimensional signal data.
tsData2RNFloat[File]: Store two-dimensional signal data.
The tsKit can be used in several ways:
Figure 2.3 (a) shows a simple Open Inventor ASCII file.
Because tsKit is still a prototype, creating and arranging tsKit objects in a C++ program has not yet been extensively tested.
Because tsKit is still a prototype, deriving new tsKit classes is not yet documented, but it is possible and encouraged.
Figures 2.2 and 2.3 show a simple layout of signal data in different representations.
Figure 2.3: The scene graph from figure 2.2 described in an ASCII Open Inventor file (right) and as C++ source code that generates this scene graph and renders it in a viewer (left).
This section shows a realistic application: how a sound is visualized with tsKit as animated 3D graphics with synchronized audio playback.
One application of tsKit is to visualize sound signals. There are at least two representations of a sound that are of interest for analysis. Figure 2.4 shows a sound in both representations. The first is the time domain signal. This is the waveform of the air pressure over time and is widely used to store audio, e.g., on a CD. The second is the time-frequency domain signal. Here a time-frequency distribution is applied to the time domain signal to generate a two-dimensional signal. This signal describes the energy/amplitude of a sound in time and frequency and is also called a spectrogram. Representing the frequency components and how they evolve over time is an essential instrument for understanding many signals and is not only used for sounds.
Figure 2.4: One sound signal of 2 seconds in the time domain (top) and the time-frequency domain (bottom).
The time-frequency domain is more interesting for studying sounds because it gives a better picture of how a sound is perceived.
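To make the step from a time domain signal to a two-dimensional time-frequency signal concrete, the following sketch computes a simple magnitude spectrogram with a naive short-time discrete Fourier transform. This is a deliberately simplified stand-in and not the constant-Q analysis used in this work: there is no windowing, the frequency bins are linearly spaced, and the names Spectrogram, spectrogram, and peakBin are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <iterator>
#include <vector>

// One row per time frame, one column per frequency bin.
using Spectrogram = std::vector<std::vector<double>>;

// Naive short-time DFT magnitude. Each frame of frameSize samples yields one
// row with frameSize / 2 + 1 bins; consecutive frames start hop samples apart.
// Assumption: rectangular frames without a window function, for brevity.
Spectrogram spectrogram(const std::vector<double>& signal,
                        std::size_t frameSize, std::size_t hop) {
    assert(frameSize > 0 && hop > 0);
    const double pi = std::acos(-1.0);
    Spectrogram rows;
    for (std::size_t start = 0; start + frameSize <= signal.size(); start += hop) {
        std::vector<double> row(frameSize / 2 + 1);
        for (std::size_t k = 0; k < row.size(); ++k) {
            double re = 0.0;
            double im = 0.0;
            for (std::size_t n = 0; n < frameSize; ++n) {
                const double phase = 2.0 * pi * static_cast<double>(k) *
                                     static_cast<double>(n) /
                                     static_cast<double>(frameSize);
                re += signal[start + n] * std::cos(phase);
                im -= signal[start + n] * std::sin(phase);
            }
            row[k] = std::sqrt(re * re + im * im);  // magnitude of bin k
        }
        rows.push_back(row);
    }
    return rows;
}

// Index of the strongest frequency bin in one spectrum (row must be non-empty).
std::size_t peakBin(const std::vector<double>& row) {
    assert(!row.empty());
    return static_cast<std::size_t>(
        std::distance(row.begin(), std::max_element(row.begin(), row.end())));
}
```

For a pure sine wave, every row has its peak in the same bin, which appears as a horizontal line in a spectrogram plot. A sound whose pitch changes over time would show the peak moving from bin to bin.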
This section describes how to prepare a sound to be visualized with tsKit.
The time domain signal can be read directly into tsKit. The node tsData1RFloatFile has a field for the file name of a sound file and reads the file the first time the data is needed for rendering. The format of the sound file can be AIFF (Audio Interchange File Format), AIFC, WAV, MPEG1 and some others. The time-frequency domain signal must first be generated from a time domain audio signal. In tsKit/bin/ there is a program cqt3 by D.P.W. Ellis that performs a constant-Q wavelet analysis of an AIFF file. See [8] and [7] for a detailed description of this analysis program. The program reads an audio file (AIFF/AIFC) and outputs a file with the extension '.aqt'. Here is how to analyze an example audio file:
cqt3 -o 8 -g .25 mySoundfile.aiff
These option settings are suggestions and can be changed.
Because the .aqt file is very big and its contents can only be accessed slowly, I wrote a converter tsKit/bin/aqt2gridcompl that takes an .aqt file and outputs the relevant data in a compact binary format with the extension '.grid'. This conversion is very slow! Here is how an .aqt file is converted into a .grid file containing <numberOfRows> spectra that can be read into a node of class tsData2RNFloatFile:
aqt2gridcompl mySoundfile.aqt mySoundfile.grid <numberOfRows>
The last parameter is the number of rows the .grid file will contain. Here is how to calculate this value: with a suggested rate of 60 spectra per second and a sound duration of <duration> seconds it is:
<numberOfRows> = <duration> sec * 60 rows/sec.
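The calculation can be written as a small helper function. This is only a sketch: the function name numberOfRows is hypothetical, and rounding to the nearest whole row is an assumption, since the text does not say how a fractional result should be handled.

```cpp
#include <cassert>
#include <cmath>

// Number of spectra (rows) for a sound of the given duration in seconds.
// spectraPerSecond defaults to the suggested 60 rows per second.
long numberOfRows(double durationSeconds, double spectraPerSecond = 60.0) {
    assert(durationSeconds >= 0.0 && spectraPerSecond > 0.0);
    // Assumption: fractional results are rounded to the nearest whole row.
    return std::lround(durationSeconds * spectraPerSecond);
}
```

For the 2 second sound of figure 2.4 this gives 2 * 60 = 120 rows, so the converter would be called as aqt2gridcompl mySoundfile.aqt mySoundfile.grid 120.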
Figure 2.5: A schematic application scenario of how tsKit is meant to be used. It shows the scene graph of the file Expl.Application.iv loaded in the tsSceneViewer (bottom, right) and as a graph of node icons (top). The nodes that are part of tsKit begin with 'ts' and are marked with a circle around their icons. Arrows denote how nodes are visualized or edited in dialogs.
Now you have to write an ASCII text file in the Open Inventor file format to view your data in a viewer. Take an example file (tsKit/data/ivfiles/*.iv) as a template and replace the references to the audio files with your own file names.
The animated 3D graphics can be rendered into a movie file in non-real-time. Here is how the example file from above (myInventorFile.iv) is rendered into a movie called myMovie.mv of length 4 seconds, at a speed of 0.5:
tsRenderer -l 4.0 -s 0.5 -o myMovie.mv myInventorFile.iv
After you have chosen a perspective, press the 'R' button to start the rendering.
The compression scheme is SGI-JPEG, the frame rate is 30 Hz and the frame size is 640x480.
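The relation between movie length, speed, and frame rate can be sketched as follows. The sketch rests on two assumptions that are not confirmed here and should be checked against the Reference Manual: that the length given with -l is the length of the resulting movie, and that the speed given with -s is the ratio of scene time to movie time. The function name frameTimes is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Scene-clock time (in seconds) at which each movie frame would be rendered.
// Assumptions: movieSeconds is the length of the movie itself, and speed is
// scene time divided by movie time. frameRateHz defaults to the 30 Hz frame
// rate stated in the text.
std::vector<double> frameTimes(double movieSeconds, double speed,
                               double frameRateHz = 30.0) {
    assert(movieSeconds >= 0.0 && speed > 0.0 && frameRateHz > 0.0);
    const long frames = std::lround(movieSeconds * frameRateHz);
    std::vector<double> times;
    times.reserve(static_cast<std::size_t>(frames));
    for (long i = 0; i < frames; ++i) {
        times.push_back(speed * static_cast<double>(i) / frameRateHz);
    }
    return times;
}
```

Under these assumptions the example call produces 4 * 30 = 120 frames, and the scene clock advances by 1/60 of a second from one frame to the next, so the movie covers about 2 seconds of the sound.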
See The tsKit Reference Manual for detailed information.