Request for Comments

This document represents IRCAM's proposal concerning the SDIFF format. It was conceived by Xavier Rodet, Rolf Woehrmann and Dominique Virolle and written down by Rolf Woehrmann.

SDIFF as an IFF File Type

We agreed to base the SDIFF format on the IFF structure by proposing the following kinds of chunk types in the following order:
1. The IFF standard FORM chunk.
2. Zero or more Infotable chunks.
3. An optional StreamIDInfo chunk.
4. An optional TimeTable chunk.
5. The Dataframe Chunk Type.
We further allow AIFF/AIFC chunk types to be embedded for storing soundwave information. We want to keep this separate from the dataframe chunks and to reuse the AIFF/AIFC chunk types. Where these chunks are placed remains to be defined.

Datatypes

byte
- a byte without semantics used for zero padding
char
- one-byte character for ASCII arrays or strings
string
- zero-terminated ASCII string
int32
- 32-bit big-endian integer (used only for IFF chunk data size fields)
float32
- 32-bit big-endian float
float64
- 64-bit big-endian float

The FORM Chunk

The standard IFF form chunk.

ChunkID	char[4]	'FORM'
ChunkDataSize	int32
FormType	char[4]	'SDIF'
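
As an illustration, the three fields above can be decoded with a few lines of Python. This is a minimal sketch, not a reference implementation: the function name `parse_form_header` is hypothetical, and int32 is assumed to be signed, as is usual for IFF size fields.

```python
import struct

def parse_form_header(buf):
    """Return (ChunkID, ChunkDataSize, FormType) from the first 12 bytes."""
    # '>' selects big-endian, '4s' is a char[4], 'i' is a signed 32-bit integer
    # (signedness is an assumption; the text above only says int32).
    chunk_id, data_size, form_type = struct.unpack(">4si4s", buf[:12])
    if chunk_id != b"FORM" or form_type != b"SDIF":
        raise ValueError("not an SDIFF file")
    return chunk_id, data_size, form_type

# An empty FORM: the data size covers only the 4-byte FormType.
header = struct.pack(">4si4s", b"FORM", 4, b"SDIF")
print(parse_form_header(header))
```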

The Infotable Chunks

All further creation information has to be stored in name/value tables (NVtables).
Each NVtable defines a namespace for the specific application used to create the SDIFF file. Each application is free to define its own set of name/value pairs in an NVtable named after the application.
A special NVtable called 'General' is recommended for storing further information that is not application-specific. There might be a policy for common names like 'AnalyzedSoundfile', 'Author', 'LastModificationDate', 'Institution' etc.
Each NVtable is stored in its own infotable chunk. It is stored as a single zero-terminated ASCII string in which names alternate with values, both enclosed in double quotes and delimited by whitespace, like:

"Window Size" "512"
"Window Hop" "64"
"Window Type" "Hamming"

ChunkID	char[4]	'SITC'
ChunkDataSize	int32
InfotableName	string	e.g. 'General' or 'Additive'
InfotableData	string	see above
ZeroPadding	byte[n]	n bytes of padding up to a 64-bit boundary relative to the beginning of the file.
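
The InfotableData string could be turned into a lookup table roughly as follows. This is a sketch under assumptions: the helper name `parse_nvtable` is hypothetical, and names and values are assumed never to contain a double quote, since no escaping rule is defined here.

```python
import re

def parse_nvtable(data):
    """Turn a zero-terminated InfotableData string into a dict."""
    # Keep only the text before the first zero byte (terminator and padding).
    text = data.split(b"\0", 1)[0].decode("ascii")
    # Every double-quoted token is either a name or a value, in alternation.
    tokens = re.findall(r'"([^"]*)"', text)
    if len(tokens) % 2:
        raise ValueError("name without a value")
    return dict(zip(tokens[0::2], tokens[1::2]))

raw = b'"Window Size" "512"\n"Window Hop" "64"\n"Window Type" "Hamming"\0'
table = parse_nvtable(raw)
print(table["Window Type"])
```

All values come back as strings; converting "512" to a number is left to the application that owns the table.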

The Stream ID Information Chunk

This chunk type is optional.
The stream ID information chunk provides a means of grouping the ID streams into higher-level structures.
Our design approach tries to be open and flexible while still providing meaningful information.
The semantics of the grouping are oriented more toward possible synthesis programs than toward the programs that created the SDIFF file.
Formally, we are inspired by the URL concept: the URL protocol specification identifies the kind of synthesis program, and the URL path serves as a means of multi-level structuring in the sense of trees or, more generally, graph paths. Examples:
- ID1->csound:/instr1/oscA/inpB
- ID2->carl:/xavier/might/know/better
- ID3->synthadd:/bank1
The semantics of the path structure are left to the synthesis program and are not specified in the SDIFF formal specification. In this sense we try to present an open and flexible architecture. For example, the path can be used to specify a certain input of a certain unit generator in a certain instrument definition.
The specification itself is stored as a single zero-terminated ASCII string in which ID numbers alternate with URL-like specifications, delimited by whitespace, like:
```
1 csound:/instr1/oscA/inpB
2 carl:/xavier/might/know/better
3 synthadd:/bank1
```

ChunkID	char[4]	'SSIC'
ChunkDataSize	int32
StreamIDInfo	string	see above
ZeroPadding	byte[n]	n bytes of padding up to a 64-bit boundary relative to the beginning of the file.
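
A reader might split the StreamIDInfo string into a table of stream IDs as sketched below. The function name `parse_stream_ids` is hypothetical, and splitting the specification at the first colon and then at each slash is only one plausible reading, since the semantics of the path are left to the synthesis program.

```python
def parse_stream_ids(data):
    """Map each stream ID to (program, list of path components)."""
    # Keep only the text before the first zero byte (terminator and padding).
    text = data.split(b"\0", 1)[0].decode("ascii")
    tokens = text.split()
    if len(tokens) % 2:
        raise ValueError("stream ID without a specification")
    table = {}
    for id_token, spec in zip(tokens[0::2], tokens[1::2]):
        # 'csound:/instr1/oscA' -> program 'csound', path '/instr1/oscA'
        program, _, path = spec.partition(":")
        table[int(id_token)] = (program, [part for part in path.split("/") if part])
    return table

raw = b"1 csound:/instr1/oscA/inpB\n2 carl:/xavier/might/know/better\n3 synthadd:/bank1\0"
streams = parse_stream_ids(raw)
print(streams[1])
```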

The Time Table Chunk

This chunk type is optional.
It is meant to provide fast random access to the data with respect to time.
Since some studies are still needed to find the optimal algorithm and representation, we do not specify a particular structure for the first implementation of the SDIFF format.

The Dataframe Chunk

Each dataframe is composed of a dataframe header structure and one or more matrix structures.
The notion of compound and non-compound frame types is replaced by the MatrixCount field in the dataframe header. A non-compound frame is simply a frame with just one matrix; a compound frame is one with two or more matrices.

64-bit Alignment

We want to have the option to work with 64-bit data.
In order to do that efficiently, we have to require that all 64-bit data items are aligned to 64 bits relative to the beginning of the file. This is especially important when memory-mapping the complete file, for example with the 'mmap' utilities in UNIX. Our tests on SGI O2s and DEC Alphas showed that data has to be aligned to 64 bits in memory in order to work with 64-bit floats (for the test program, mail Rolf Woehrmann).
We want the time information in the dataframes to be always in 64-bit floats, so the beginnings of the dataframes always have to be aligned to 64 bits. In order to achieve that, we have to require possible byte padding at the end of each matrix's data. This also makes it possible for only specific matrices in a single dataframe to be stored in 64-bit.
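
The amount of padding follows from simple modular arithmetic. The sketch below uses a hypothetical helper name, `padding_to_64bit`, and assumes that the data being padded starts at an offset that is already a multiple of 8 bytes.

```python
def padding_to_64bit(offset):
    """Number of zero bytes needed to advance `offset` to a multiple of 8."""
    # In Python, % with a positive modulus never returns a negative result,
    # so -offset % 8 is the distance up to the next multiple of 8.
    return -offset % 8

# A 3 x 3 float32 matrix holds 36 data bytes and needs 4 padding bytes.
print(padding_to_64bit(3 * 3 * 4))
# The same matrix in float64 holds 72 data bytes and needs none.
print(padding_to_64bit(3 * 3 * 8))
```

With 32-bit and 64-bit fields only, the result is always either 0 or 4 for matrix data.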

Restricting Datatypes of the Fields in the Dataframes

For simplicity we require all fields in the dataframes to be 32 or 64 bits wide.
Also for simplicity, we require all fields except the type specifiers to be floating-point values, even when their semantics are those of integers, as for example MatrixCount.

Wrapping All Dataframes in One Big IFF Chunk

The advantage of wrapping all dataframes in one big IFF chunk, rather than giving each dataframe its own IFF chunk, is this: with the second approach every program has to decide from the chunk ID whether a chunk is a dataframe or an information chunk like the ones discussed above. While this is relatively simple for the moment, it becomes problematic when we invent new types of information chunks, such as special versions of time table chunks, new kinds of patch information chunks, embedded data chunks for AIFF/AIFC support, or even embedded MIDI chunks. Every program then has to be recompiled, because it has to recognize that these are not dataframe chunk IDs. In our approach the application simply looks for the IFF 'SDFC' chunk in order to find the dataframes. It can ignore all other additional standard or even non-standard chunk types.
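
The lookup described above can be sketched as a loop over chunk headers. The helper names are hypothetical, and the sketch assumes that ChunkDataSize counts every byte up to the next chunk header, including any zero padding, which this proposal does not state explicitly.

```python
import struct

def iter_chunks(body):
    """Yield (ChunkID, data) for each chunk that follows the FORM header."""
    pos = 0
    while pos + 8 <= len(body):
        chunk_id, size = struct.unpack_from(">4si", body, pos)
        yield chunk_id, body[pos + 8 : pos + 8 + size]
        pos += 8 + size  # assumption: size already includes any padding

def find_dataframes(body):
    """Return the data of the 'SDFC' chunk, skipping every other chunk type."""
    for chunk_id, data in iter_chunks(body):
        if chunk_id == b"SDFC":
            return data
    return None

def make_chunk(chunk_id, data):
    """Helper for the example only: prepend an 8-byte chunk header."""
    return struct.pack(">4si", chunk_id, len(data)) + data

body = (make_chunk(b"SITC", b"ignored!")
        + make_chunk(b"XXXX", b"")          # unknown chunk type, simply skipped
        + make_chunk(b"SDFC", b"frames!!"))
print(find_dataframes(body))
```

Because the loop never inspects chunk IDs other than 'SDFC', a new information chunk type does not require any change to this reader.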

Non-Defined Values

In some cases there might be a need for non-defined values in single columns of certain rows. Since we do not like the approach of giving a special interpretation to otherwise valid values like MINFLOAT or MININT, we devised another way of handling non-defined values. The idea is to use a special 'validity' column which represents a kind of flag mask in which each bit set to 1 denotes a non-defined value in the column specified by the position of that bit. The advantage is that in the normal case, where there is no non-defined value, this mask equals zero, which can be easily detected.
Since non-defined data values are only needed for special matrix types, we require the appearance of a validity column to be 'hardcoded' in the definition of a specific matrix type. So a specific matrix type has either *always* a 'validity' column or *never*. To escape from this restriction, two versions of a matrix type have to be defined.
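
Decoding such a mask could look like the sketch below. Two points are assumptions, not part of this proposal: bit 0 is taken to correspond to column 0, and the mask is taken to have been converted to an integer already (it would be stored as a float like every other field).

```python
def undefined_columns(mask, column_count):
    """List the column indices whose bit is set in the validity mask."""
    # Assumption: the least significant bit describes column 0.
    return [col for col in range(column_count) if (mask >> col) & 1]

row_mask = int(5.0)  # the value as read from the 'validity' column of one row
print(undefined_columns(row_mask, 4))

# The normal case: a zero mask means every value in the row is defined.
print(undefined_columns(0, 4))
```

Note that a float32 represents integers exactly only up to 2^24, so under these assumptions a float32 mask can cover at most 24 columns.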

The Frame Structure

DataFrameChunk := ChunkHeader ( DataFrameHeader ( Matrix )+ )+

The chunk header

ChunkID	char[4]	'SDFC'
ChunkDataSize	int32

The dataframe header

FrameType	char[4]	Four characters: the first is the ASCII version number, the remaining three are the ASCII frame type name. Ex. '1HMM' or '2CRL'
FrameSize	float32	The total size in bytes of the dataframe.
MatrixCount	float32	The number of matrices.
StreamID	float32	The stream ID number.
Time	float64	The time of the data frame.

The matrix structure

MatrixType	char[4]	Four characters: the first is the ASCII version number, the remaining three are the ASCII matrix type name. Ex. '1FFT' or '2FZ0'
MatrixDataType	float32	The type code of the matrix data. Currently 1 for float32 and 2 for float64.
RowCount	float32	The number of rows in the matrix.
ColumnCount	float32	The number of columns in the matrix.
MatrixData	float32 or float64	The matrix data itself.
OptionalBytePadding	byte[4]	Optional four padding bytes in order to align the total frame size to a multiple of 64 bits.
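
To show how the header and matrix structures fit together, here is a sketch that builds and reads one dataframe. It rests on assumptions that this proposal leaves open: FrameSize is taken to include the dataframe header, the frame is taken to start on a 64-bit boundary, and the name `read_frame` is hypothetical.

```python
import struct

# Big-endian layouts taken from the field tables above.
FRAME_HEADER = struct.Struct(">4sfffd")  # FrameType, FrameSize, MatrixCount, StreamID, Time
MATRIX_HEADER = struct.Struct(">4sfff")  # MatrixType, MatrixDataType, RowCount, ColumnCount

def read_frame(buf):
    """Decode one dataframe that starts at offset 0 of `buf`."""
    frame_type, _size, matrix_count, stream_id, time = FRAME_HEADER.unpack_from(buf, 0)
    pos = FRAME_HEADER.size
    matrices = []
    for _ in range(int(matrix_count)):
        matrix_type, data_type, rows, cols = MATRIX_HEADER.unpack_from(buf, pos)
        pos += MATRIX_HEADER.size
        rows, cols = int(rows), int(cols)
        # Type code 1 is float32 and 2 is float64, as listed above.
        fmt = ">%d%s" % (rows * cols, "f" if int(data_type) == 1 else "d")
        values = struct.unpack_from(fmt, buf, pos)
        pos += struct.calcsize(fmt)
        pos += -pos % 8  # skip the optional padding up to the next 64-bit boundary
        matrices.append((matrix_type, rows, cols, values))
    return frame_type, int(stream_id), time, matrices

# Build a one-matrix frame: 1 row x 3 columns of float32, plus 4 padding bytes.
matrix = (MATRIX_HEADER.pack(b"1FFT", 1.0, 1.0, 3.0)
          + struct.pack(">3f", 1.0, 2.5, -0.5) + b"\0" * 4)
frame = FRAME_HEADER.pack(b"1HMM", float(FRAME_HEADER.size + len(matrix)),
                          1.0, 7.0, 0.25) + matrix
print(read_frame(frame))
```

The dataframe header is 24 bytes and the matrix header 16 bytes, so both are multiples of 8 and only the matrix data can require padding.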