Motivation of Decisions Behind the SDIF Specification

9/29/99 by Matthew Wright

Overview

This document explains some of the motivation and background behind various design decisions that went into defining SDIF.

Why Isn't SDIF IFF Compatible?

In early versions of SDIF, we claimed compatibility with the IFF standard, the parent of several important standards, including AIFF and RIFF. However, we have decided to drop strict IFF compatibility because of problems that arose in trying to ensure 64-bit alignment of all data types.

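Here's the issue: All IFF files consist of one large chunk, which must begin with some header fields. AIFF files are a "FORM" chunk, whose header contains these 3 fields:

ChunkID    char[4]   'FORM'
ChunkSize  int32     The size, in bytes, of the rest of the chunk
FormType   char[4]   The form type, e.g., 'AIFF'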

This adds up to 12 bytes, which makes all subsequent chunks not 8-byte (64-bit) aligned. We could have solved this problem with a mandatory chunk whose length is 4 more than a multiple of 8 bytes, but that seemed too much like a kludge.
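The arithmetic can be checked with a short sketch. It assumes that 64-bit alignment means offsets that are multiples of 8 bytes and that each of the three FORM header fields is 4 bytes wide; the names `FORM_HEADER_FIELDS` and `misalignment` are made up for this illustration and are not part of any SDIF library.

```python
ALIGNMENT = 8  # bytes; assumption: "64-bit aligned" means a multiple of 8 bytes

# The three FORM header fields, each assumed to be 4 bytes wide.
FORM_HEADER_FIELDS = {"ChunkID": 4, "ChunkSize": 4, "FormType": 4}


def misalignment(offset, alignment=ALIGNMENT):
    """Return how many bytes `offset` lies past the previous aligned boundary."""
    return offset % alignment


header_size = sum(FORM_HEADER_FIELDS.values())

# Data that follows the FORM header starts 4 bytes past an 8-byte boundary.
print(header_size, misalignment(header_size))

# An extra chunk whose total length is 4 more than a multiple of 8
# would restore alignment, which is the rejected workaround.
print(misalignment(header_size + 4))
```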

Other IFF chunks besides FORM have similar problems; all have mandatory "headers" whose sizes are not multiples of 8 bytes.

Another problem with IFF is that its 4-byte size count, which limits a chunk to about 2 gigabytes, is not big enough for certain extremely large sets of SDIF data.

The SDIF standard still follows the "spirit" of IFF, with a series of frames each with an identifying 4-byte frame type and size count; the only practical difference is the lack of an opening "FORM" chunk.

We don't know of any commonly available utilities or libraries for manipulating IFF files in general, so it seems like we haven't given anything up with this decision.

How To Embed SDIF in IFF Files

Here's a proposal for embedding SDIF data in IFF files such as AIFF files. Embedded SDIF data can be no longer than about 2 gigabytes, because IFF chunks contain a signed 32-bit count of the data size.

Embedded SDIF data would require a special IFF form chunk to be a "wrapper" around the entire SDIF block:

Field             Type     Description
ChunkID           char[4]  'FORM', as required by IFF
ChunkSize         int32    The size, in bytes, of the entire embedded SDIF data, plus this chunk, not including the "FORM" ChunkID or this ChunkSize field.
FormType          char[4]  'SDIF' (We need to register this form type!)
PaddingChunkID    char[4]  'SPAD'
PaddingChunkSize  int32    Either 0, 2, 4, or 6
PaddingChunkData  char[n]  0 to 6 bytes of null characters, depending on PaddingChunkSize
SDIFChunkID       char[4]  'SDIF'
SDIFChunkSize     int32    The size, in bytes, of the embedded SDIF data
SDIFData          data     The SDIF data

This structure is a legal IFF chunk and can therefore be embedded inside other IFF chunks. It consists of an enclosing FORM chunk that includes two subchunks. The first subchunk is a padding chunk to ensure that the SDIF data will be aligned on an 8-byte boundary. The second subchunk contains the SDIF data to be embedded, wrapped in a legal IFF chunk.
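As a rough illustration of this layout, the wrapper can be assembled with Python's `struct` module. This is a minimal sketch, not a reference implementation: the function name `wrap_sdif` is hypothetical, big-endian byte order is assumed (as in AIFF), and the SDIF data is assumed to have an even length so that no IFF pad byte is needed after it.

```python
import struct

MAX_INT32 = 2**31 - 1  # largest value a signed 32-bit size field can hold


def wrap_sdif(sdif_data: bytes, pad_size: int) -> bytes:
    """Wrap raw SDIF bytes in the proposed FORM/'SDIF' chunk (hypothetical helper)."""
    if pad_size not in (0, 2, 4, 6):
        raise ValueError("PaddingChunkSize must be 0, 2, 4, or 6")

    # Padding subchunk: ID, size, then that many null bytes.
    padding_chunk = struct.pack(">4si", b"SPAD", pad_size) + b"\x00" * pad_size

    # Data subchunk: ID, size, then the SDIF data itself.
    sdif_chunk = struct.pack(">4si", b"SDIF", len(sdif_data)) + sdif_data

    # ChunkSize counts everything after the ChunkSize field:
    # the FormType plus both subchunks.
    body = b"SDIF" + padding_chunk + sdif_chunk
    if len(body) > MAX_INT32:
        raise ValueError("embedded SDIF data is too large for a 32-bit size")

    return struct.pack(">4si", b"FORM", len(body)) + body


wrapped = wrap_sdif(b"\x00" * 16, pad_size=4)
print(len(wrapped))  # 28 fixed bytes + 4 padding bytes + 16 data bytes
```

The seven fixed 4-byte fields account for 28 bytes of overhead, regardless of how much padding or data the wrapper carries.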

The number of padding bytes depends on the size of the data that precedes this chunk in the IFF file (which must always be a multiple of 2 bytes, per the IFF standard). For example, suppose the preceding portion of the file is 200 bytes. Those 200 bytes, plus the 28 non-padding bytes in the above wrapper, total 228, which is 4 more than a multiple of 8, so there would need to be 4 padding bytes to make the SDIF data begin on an 8-byte boundary.
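The padding calculation can be sketched as a small function. The name `padding_bytes` and the constant `WRAPPER_OVERHEAD` are hypothetical; the 28-byte overhead is the total of the seven 4-byte fields that precede the SDIF data in the wrapper.

```python
WRAPPER_OVERHEAD = 28  # seven 4-byte wrapper fields that precede the SDIF data


def padding_bytes(preceding_size, alignment=8):
    """Return the padding needed so the SDIF data starts on an aligned boundary.

    `preceding_size` is the number of bytes in the IFF file before the wrapper.
    """
    if preceding_size < 0 or preceding_size % 2 != 0:
        raise ValueError("IFF data before the wrapper must be an even number of bytes")
    # Distance from (preceding bytes + fixed overhead) up to the next boundary.
    return -(preceding_size + WRAPPER_OVERHEAD) % alignment


# The worked example: 200 + 28 = 228, which is 4 past a multiple of 8.
print(padding_bytes(200))
```

Because both the preceding size and the overhead are even, the result is always 0, 2, 4, or 6, matching the allowed values of PaddingChunkSize.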

Note that changing the contents of the earlier portion of an IFF file may require changing the number of padding bytes.

We welcome suggestions about better ways to do this.

Why Is the Opening Frame's Format Different From Other Frames?

The opening frame is the only frame that does not have a time tag, a stream ID, and matrices. Why this nonuniformity? The SDIFSpecVersion and SDIFStandardTypesVersion could be data elements in a matrix in a frame at time tag minus infinity.

Two reasons:

64-bit Alignment

Why Allow Optional Columns In Matrices?

There are many cases where it would make sense to add extra columns to a matrix. For example, the Lemur project does analysis and synthesis of "bandwidth enhanced" sinusoids with the usual amplitude, frequency, and phase fields plus a "bandwidth" field indicating the spectral width or "noisiness" of the partial. Another example is a measure of the "importance" or perceptual salience of each of a set of partials.

SDIF supports this in a way that allows programs that don't understand these extra columns to read and process the information they do understand without disturbing the extra information: Extra columns must always appear after the required columns.
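A program that knows only the required columns might treat each row as sketched below. This is an illustration under assumptions rather than the behavior of any actual SDIF library: the column order (amplitude, frequency, phase), the trailing "bandwidth" column, and the function names are all hypothetical.

```python
def split_row(row, n_required):
    """Split a matrix row into the required columns and any trailing extras."""
    if len(row) < n_required:
        raise ValueError("row is missing required columns")
    return tuple(row[:n_required]), tuple(row[n_required:])


def scale_amplitude(rows, gain, amp_col=0, n_required=3):
    """Scale the amplitude column and pass extra columns through untouched."""
    result = []
    for row in rows:
        required, extra = split_row(row, n_required)
        required = list(required)
        required[amp_col] *= gain
        # Extra columns are re-attached after the required ones, unchanged.
        result.append(tuple(required) + extra)
    return result


# Assumed layout: (amplitude, frequency, phase) plus one extra "bandwidth" column.
rows = [(0.5, 440.0, 0.0, 0.1), (0.25, 880.0, 1.5, 0.3)]
print(scale_amplitude(rows, gain=2.0))
```

Because the extras always follow the required columns, the program only needs to know how many columns it understands; it never has to interpret the rest.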

