Chapter Background
Visualization aims to shift cognitive processing load to the perceptual system. It engages the primary human sensory apparatus, vision, together with the processing power of the human mind. Shifting work to the visual system takes advantage of its special features, especially its powerful cognitive performance.
The visualization of data, information and concepts is becoming increasingly important. The fast-expanding computer manufacturer Silicon Graphics Inc. has 'Visual Computing' as its main mission.
Arnheim says in [4], p. 13: ''There is no basic difference in this respect between what happens when a person looks at the world directly and when he sits with his eyes closed and 'thinks'.'' When seeing is thinking, visualization need not end in a concrete image!
Transferring a problem from one domain to another is common and useful in general. Metaphors are an example of such a shift, often into a visual domain: they are everybody's visualizations, mapping a problem domain into a pseudo-visual one. The rich use of pictures in language is an everyday example of using the visual system in its broadest sense: without getting as concrete as a real picture, it is visual thinking. Often concepts are mapped to the visual domain, using its vocabulary of shapes, colors, motion and all visually perceived attributes.
The auditory perceptual system is included in this approach by using computers to map data to sound; this area is called Auditory Display or Sonification. Here, the ability to perceive parallel structures in time is exploited.
Because of the intra-media character of this framework, it shares at least the requirements of multimedia frameworks.
In [1], pp. 11-12, Ackermann gives a list of requirements and problems of multimedia frameworks. I have added my annotations in normal typeface to show how these items relate to this project:
Software libraries that support the development of multimedia applications have to provide structures and functions that meet the following requirements:
- integration of various media
Within this work the term 'time signal' covers all time-derived media types, including time-derived data that is the product of analysis or used for control. 3D graphics is native to Open Inventor. Audio (and, in future work, video) must be integrated. Time must be integrated as a dimension for rendering, as space already is in Inventor.
- user interface interactions with direct manipulation of graphical representations and high semantic feedback
All graphical representation is done in the domain of 3D graphics. Manipulation of the properties translation, scale and rotation is a built-in feature of all Open Inventor objects. Time-specific representations and their manipulation should be possible; Open Inventor's concept of manipulators can be used for this.
- support for media compositions to define multimedia presentations by high level synchronization specifications
Media composition is needed only on a low level. Geometric composition is native to Open Inventor. Basic temporal composition (sequencing) should be possible in future work.
- real-time behavior
3D graphics and audio must be rendered and controlled in real time. Media transformations can be done off-line (non-real-time). Real-time capturing, buffering and transformation is future work.
- flexibility and portability in view of different hardware configurations and platforms
It is not a requirement for this work to be platform independent, although Open Inventor and OpenGL have been ported to many platforms. The intensive use of SGI's digital media libraries makes the software SGI-specific, but hides differences between hardware configurations.
- A requirement not mentioned above but essential here is the transformation of media signals. Transformations cover tasks like analysis, mixing and feature extraction of media signals. They are needed as building blocks for the main usage as a tool for mapping audio to video.
Additionally, there are problems that complicate the development of multimedia software systems:
- support for the data type time is often lacking in programming languages
The basic data type SbTime is available in Open Inventor. Time must be conceptually introduced as an additional dimension for rendering and control.
- different real-time constraints at low-level device interfaces for each media type
SGI's media programming API offers a device-independent interface for buffering, converting and synchronizing audio and video frames. Real-time constraints are fulfilled by playing audio (video in future work) in separate threads. 3D graphics has no real-time constraints in terms of a fixed frame rate; animation is perceived as fluent movement starting at about 15 frames per second (15 fps).
- rapid hardware evolution
SGI's digital media programming API hides most changes due to hardware evolution, although there have been important changes in the API in recent releases.
- strong hardware dependencies because of ADCs, DACs, DSPs and other special purpose processors, e.g., for 3D graphics accelerators or video compression (JPEG and MPEG chipsets)
SGI's digital media programming API and Open Inventor/OpenGL hide hardware dependencies, although knowledge about special hardware constraints is needed to achieve the best performance.
- the lack of established standard interfaces to multimedia libraries that hide hardware dependencies
SGI's digital media programming API is the same on all SGI machines. The graphics API OpenGL is platform independent.
- the lack of established standards for the representation of multimedia data (e.g., widely used multimedia file format standards are missing)
Open Inventor's file format is widely used and many converters to other formats exist. SGI's libraries for audio and movie files are independent of formats and compression/decompression algorithms. Standard file formats are supported for audio (AIFF/AIFC, WAVE, MPEG-1) and movies (MPEG-1, QuickTime).
This section gives a short overview of related systems.
The following projects are applications or extensions of Open Inventor:
ChemKit is an extension of the standard Open Inventor library on SGI workstations. It is similar to the tsKit but differs in the subject of visualization: ChemKit provides specialized nodes for storing and visualizing molecular data. ChemKit is the source of many of my ideas on how to extend Open Inventor.
Figure 3.1: A screenshot of a standard Inventor viewer showing a scene
graph with ChemKit nodes for molecular data and a dialog for editing
visualization parameters.
Modeling Organic Forms Using Soft Primitives
Thesis of Scott Peterson, Cal Poly State University, San Luis Obispo, USA.
See http://macabre.lib.calpoly.edu/projects/csc/Peterson_Scott_Brandon/contents.html
From the abstract:
''A library of tools, dubbed Softies, has been implemented for the Open Inventor 2.1 architecture. When combined with the transformations and manipulators provided by the Open Inventor architecture, these elements become powerful organic modeling tools.''
Virtual Environment Technology Laboratory at the University of Houston, USA.
See http://www.vetl.uh.edu/~lincom/VrTool/vrtool.html
From the abstract:
''VrTool is the newest Open Inventor Virtual Reality toolkit to provide a rapid prototyping capability to enable VR users to quickly get their application running with the minimum amount of effort.''
In his works [1], [2] and [3], Philipp Ackermann describes an object-oriented multimedia application framework called MET++. This framework handles the synchronization, viewing and editing of time-dependent data (audio, music, 2D and 3D graphics animation, video). The system is built on top of the application framework ET++ [9]. Although it is platform and window system independent (Silicon Graphics, Hewlett-Packard, Sun, Linux), freely available and offers interesting multimedia building blocks, there are two reasons against using this framework for this thesis. First, being a research project, the available version was not stable and far from efficient in terms of performance. Second, being an application framework rather than a visualization framework, the complete ET++ and MET++ application infrastructure (including a complete GUI and window system layer) would have to be reused, which would make my work hard to reuse and integrate in other systems. Nevertheless, the novel direct-manipulation interaction on temporal structures in the so-called Time Composition View and Event Graph (see fig. 3.2) is an interesting concept to be introduced in future work.
Figure 3.2: The MET++ event graph viewer.
MAX is a graphical programming environment for event and signal processing. The building blocks of a MAX application are objects with typed input and output connectors; each object incorporates the mapping from its input to its output connections. MAX offers an editor for patching these objects and their connections while the application is running, and schedules the processing so that the programmer can use it like a parallel machine.
MAX is heavily used in the area of musical signal processing and serves as a system control for live performances and installations. See [16] or the Web site of the manufacturer Opcode Systems.
Open Inventor's concept of engines that can be created and patched at run time is very similar to MAX, except that there is no editor for patching engines (and nodes) in Open Inventor.
VRML stands for Virtual Reality Modeling Language and is a description language for 3D worlds, much as HTML (HyperText Markup Language) is for hypertext. VRML 2.0 extends VRML 1.0 with an event model that allows scripting, e.g. with Java. VRML 1.0 was essentially a file format for describing static, non-interactive 3D scenes; in fact, it was a subset of the Open Inventor file format.
VRML 2.0's architecture has great similarity with that of Open Inventor and shares the concept of a scene graph with nodes containing the relevant data in typed members called fields, which can be connected. A VRML browser is an interpreter for VRML files, optionally containing Java scripts, and is platform independent.
VRML offers extensive support for 3D graphics, limited support for audio, and no support for video at all. Implementing parts of a VRML world as native-code extensions (C/C++, so-called plug-ins) to circumvent limitations in performance or system integration (using other libraries) forfeits its advantage of platform independence.
IRIS Explorer is an application creation system and user environment that provides visualization and analysis functionality for computational scientists, engineers, and other investigators. Internally, IRIS Explorer makes extensive use of Open Inventor. It is especially useful for those whose needs are not met by commercial software packages. Also, IRIS Explorer's graphical user interface (GUI) allows users to build custom applications without having to write a single line of code.
Explorer is a system for creating visualization maps, each of which comprises a series of small software tools, called modules. A map is a collection of modules that carries out a series of related operations on a dataset and produces a visual representation of the result.
See [13] and [14] or the IRIS Explorer Center for detailed information.
This work by Jøran Rudi, Norwegian network for Technology, Acoustics and Music (NoTAM), University of Oslo, used IRIS Explorer for visualizing sounds in an art work.
See http://www.notam.uio.no/~joranru/wtca.html for a paper and images.
Alan Wesley Peevers, Master's thesis, Department of Electrical Engineering, University of California, Berkeley.
See: http://www.CNMAT.Berkeley.EDU/~alan/MS-html/MSv2_ToC.html
From the Introduction:
This paper describes a system for audio analysis, modification, and synthesis, based on the Short Time Fourier Transform (STFT). The system is intended both as a tool for sound manipulation, and as a means to reinforce people's intuitions regarding the relationships between timbre and the harmonic structure of music and other audio signals, as conveyed via their spectrograms. This is done by creating a 3D spectrogram which shows a sound's harmonic structure in great detail as it is sampled. Similar systems in the past (for example, David Shipman's SPIRE system) have often sought to convey harmonic structure via two-dimensional spectrograms, sometimes in conjunction with wave form or other displays. By adding the third dimension (amplitude in dB mapped to surface height), it is hoped that a greater apprehension of the detailed structure can be achieved.
Figure 3.3: Image from [19], Copyright 1995, UC Regents, University of California, Berkeley. All Rights Reserved.
Peevers' work is related to this thesis in its use of 3D graphics (OpenGL) to visualize spectral audio data. His work stresses performance and is not designed as an open framework.