Last updated: 10 February 2010
Published in: Digitising analogue media | Creating new digital media | Managing your digital resources
Tags: audio | audio editing | bit depth | codec | compression | digital preservation | digital preservation policy | digitisation | file formats | metadata | resolution | sound recordings
This document summarises the main features of uncompressed audio file types, including WAV, AIFF and Broadcast WAV (BWF). It gives an overview of these 'raw' formats and a simple explanation of the common options they offer when creating or working with digital audio. It also looks at lossless compression tools, which reduce the size of these files while allowing their original audio data to be reconstructed sonically unaltered. Several sections include links to further resources covering specific topics in more detail, so this is a good place to start if you want to learn more about any of these formats.
Uncompressed audio files are the most accurate digital representation of a soundwave, but can also be the most resource-intensive method of recording and storing digital audio, both in terms of storage and management. Their accuracy makes them suitable for archiving and delivering audio at high resolution, and working with audio at a professional level, and they are the 'master' audio format of choice.
As well as being widely used master formats, uncompressed file types such as WAV and AIFF are also working formats within many audio, video and multi-media applications, and all Digital Audio Workstations (DAWs) used in professional audio production and editing. They are used extensively throughout the digital audio lifecycle, even when they are not the final target format, and familiarity with their composition and use is essential knowledge for good audio production and/or digitisation practice.
Audio interfaces and software offer a few options for the capture format and resolution of these files, which affect both sound quality and file size. Here we briefly explain those effects and suggest appropriate settings for different uses.
We'll also look at what embedded metadata (if any) each format offers.
Digital audio recording involves measuring the level of a sound wave at regular intervals (roughly every 10 to 23 microseconds at common sample rates between 96kHz and 44.1kHz) and recording this level as a discrete value, using a predefined number of binary bits (ones and zeros), before moving on to the next sample. This process generates a stream of binary values (the 'bitstream') which, when played back, can be used to reconstruct the original waveform. This bitstream is the 'raw' audio data, expressing the sound wave in its closest digital analogue.
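The sampling process described above can be sketched in a few lines of Python. This is a minimal illustration only: the function name `sample_sine` and the 1kHz test tone are assumptions made for the example, and real converters perform this step in hardware.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, bit_depth, n_samples):
    """Sample a full-scale sine wave and quantise each sample to a signed integer."""
    max_level = 2 ** (bit_depth - 1) - 1          # e.g. 32767 for 16-bit audio
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                    # time of the n-th sample, in seconds
        level = math.sin(2 * math.pi * freq_hz * t)   # continuous level in [-1, 1]
        samples.append(round(level * max_level))  # discrete value stored in the bitstream
    return samples

# Interval between samples at 44.1kHz, in microseconds
interval_us = 1_000_000 / 44_100

# First eight samples of a 1kHz tone captured at 16-bit, 44.1kHz
pcm = sample_sine(1000, 44_100, 16, 8)
print(f"Sample interval: {interval_us:.1f} microseconds")
print(pcm)
```

Each integer in the printed list is one sample; written out one after another in binary, these values form the PCM bitstream.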
Technical explanations of the sampling process, Pulse-code Modulation (PCM) and other types of bitstream recording can be found in our advice documents Introduction to Digital Audio and File Formats and Compression.
All of the following uncompressed audio file types are 'wrapper' formats which use PCM audio as their raw material, and add small amounts of additional data to it to enable compatibility with particular codec(s) and operating system(s). They are sonically identical, but offer different features and are commonly found in different audio environments.
Though some of these were developed for specific platforms, all have open-source codecs available for all standard operating systems - Windows, MacOS and Linux.
By far the most widely used uncompressed format is the Microsoft Wave format, commonly known as WAV on account of its .wav file extension. WAV is the longstanding audio format of the Windows operating system, as well as being found within many other contexts, and is therefore familiar to the majority of users, and compatible with their computers and audio peripherals. All forms of WAV are PCM wrapper formats, and store their audio data in a similar basic form. The wrapper has been changed over time to offer compatibility with alternative non-PCM audio streams and/or chunks of data related to the audio.
WAV is Microsoft's native uncompressed audio format for Windows. While a flexible format, capable of storing very high quality audio, WAV has a few limitations which have become apparent over its lifetime, and extensions to the format have been developed to address them. One shortcoming identified in WAV is its lack of a standardised way to hold metadata describing its audio contents (see BWF).
PCM bitstream audio is stored as a series of standard RIFF (Resource Interchange File Format) chunks.
File size is also limited to 4GB. This limit derives from the use of 32-bit size fields in the RIFF header, which cannot describe more than 2^32, or 4,294,967,296, bytes (4 gigabytes).
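To make the chunk structure concrete, the following sketch builds a tiny WAV file in memory with Python's standard `wave` module and then walks its RIFF chunks with `struct`. The helper names `make_wav_bytes` and `list_chunks` are hypothetical, and the walker assumes a simple, well-formed file rather than handling every variant found in practice.

```python
import io
import struct
import wave

def make_wav_bytes():
    """Write 100 frames of 16-bit stereo silence to an in-memory WAV file."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(2)        # 2 bytes per sample = 16-bit
        w.setframerate(44100)
        w.writeframes(b"\x00\x00" * 2 * 100)
    return buf.getvalue()

def list_chunks(data):
    """Return the RIFF size field and a list of (chunk id, chunk size) pairs."""
    riff_id, riff_size, form = struct.unpack("<4sI4s", data[:12])
    if riff_id != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    chunks = []
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack("<4sI", data[pos:pos + 8])
        chunks.append((chunk_id.decode("ascii"), size))
        pos += 8 + size + (size & 1)   # chunks are padded to an even length
    return riff_size, chunks

riff_size, chunks = list_chunks(make_wav_bytes())
print(riff_size, chunks)
```

The `I` in the `struct` format strings is an unsigned 32-bit integer, which is where the 4GB ceiling comes from: no chunk, and no file as a whole, can declare a size larger than that field can hold.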
Developed by the European Broadcasting Union as an improvement to the WAV format, Broadcast WAV is a form of WAV file which, though functionally identical to and cross-compatible with WAV, includes an extra chunk containing additional information (metadata) about the audio, together with synchronisation information. This is usually a BEXT chunk or (more recently) an iXML chunk.
BEXT is the format of the original 'chunk' of data added to WAV to allow metadata to be embedded with the audio. iXML is the XML metadata schema that supersedes BEXT.
BWF is the default audio format of some non-linear digital A/V workstations, and a recommended archive format. It has the same 4GB size limit as WAV, for the same reasons.
A relatively recent evolution of Broadcast WAV, MBWF was introduced in July 2009 by the European Broadcasting Union, and combines RF64 audio with a BEXT chunk to describe its attributes.
RF64 is an audio format that can contain up to 18 simultaneous streams of surround audio, a stereo 'mixdown' and other non-PCM data streams (such as MP3 or AAC). RF64 is envisioned as a long-term solution suitable for archiving uncompressed 5.1 surround sound and other X.1 multi-channel formats. The RF64 format used by MBWF uses a 64-bit address header (hence the '64' in its name), and is therefore not limited to 4GB, having in fact a size limit of over 18 billion GB.
To quote the EBU: "An RF64 file with a bext chunk becomes an MBWF (Multichannel BWF) file. The terms ‘RF64’ and ‘MBWF’ can then be considered synonymous." MBWF is backwards compatible with WAV and BWF.
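The practical effect of these size limits is easier to see as recording time. The sketch below is a rough calculation only: the function name `max_duration_hours` is hypothetical, and it ignores the small overhead of headers and metadata chunks.

```python
WAV_LIMIT_BYTES = 2 ** 32    # 32-bit size field (WAV, BWF)
RF64_LIMIT_BYTES = 2 ** 64   # 64-bit size field (RF64, MBWF)

def max_duration_hours(size_limit_bytes, bit_depth, sample_rate_hz, channels):
    """Longest uncompressed PCM recording that fits within a given size limit."""
    bytes_per_second = (bit_depth // 8) * sample_rate_hz * channels
    return size_limit_bytes / bytes_per_second / 3600

# Stereo at CD resolution, stereo at 24-bit 96kHz, and 5.1 (six channels) at 24-bit 96kHz
for label, args in [
    ("16-bit 44.1kHz stereo", (16, 44_100, 2)),
    ("24-bit 96kHz stereo", (24, 96_000, 2)),
    ("24-bit 96kHz 5.1", (24, 96_000, 6)),
]:
    hours = max_duration_hours(WAV_LIMIT_BYTES, *args)
    print(f"{label}: about {hours:.2f} hours within the WAV limit")

print(f"RF64 limit: about {RF64_LIMIT_BYTES / 10**9:,.0f} GB")
```

At high resolution and with six channels the 32-bit limit is reached in well under an hour, which is why a 64-bit format matters for multichannel archiving.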
AIFF (Audio Interchange File Format) was developed by Apple, based on the Interchange File Format (IFF) used on the Amiga. It works similarly to WAV but uses a different method of dividing the PCM data into manageable chunks. Free codecs are widely available for all platforms. AIFF is the native format for audio on Mac OS X [Note: Mac OS X is also WAV compatible at all levels].
AIFF can also store a loop point and a musical note for the audio, both of which are useful for playing back musical samples.
Technical attributes shared by all uncompressed audio file types:
Bit depth: the number of binary bits (ones and zeroes) used to record the sampled level of the waveform. Thus 8-bit sampling uses an 8-digit binary number to record the level, giving 2^8, or 256, potential values.
Bit depth determines the ratio between the quietest and loudest signals the system can record (dynamic range); 16 bit has 65,536 possible values (i.e. 2^16), and hence a far higher dynamic range (96dB) than 8-bit recording. 24-bit sampling has 16,777,216 discrete levels, giving 144dB dynamic range, which exceeds the tolerances of human hearing.
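The relationship between bit depth, the number of discrete levels and dynamic range can be checked with a short calculation. This sketch uses the common approximation that dynamic range in decibels is 20 × log10 of the number of levels (roughly 6dB per bit); the function names are chosen for illustration.

```python
import math

def quantisation_levels(bit_depth):
    """Number of discrete values available at a given bit depth."""
    return 2 ** bit_depth

def dynamic_range_db(bit_depth):
    """Approximate dynamic range in dB: 20 * log10(number of levels)."""
    return 20 * math.log10(quantisation_levels(bit_depth))

for bits in (8, 16, 24):
    print(f"{bits}-bit: {quantisation_levels(bits):,} levels, "
          f"about {dynamic_range_db(bits):.0f}dB dynamic range")
```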
Sample rate: the frequency with which the level is measured, expressed in kilohertz (kHz = thousands of times per second). It determines the frequency range which the system can record: the highest recordable frequency is half the sample rate.
See Introduction to Digital Audio for further details on the empirical effects of sample rate and bit depth.
Note: Bitrate
Though not usually a selectable option when encoding uncompressed files, many mixed libraries of compressed and uncompressed audio list the bitrates of audio files, measured in kilobits per second (kbps), as an index of each file's resolution.
Many audio compression methods (e.g. MP3 and AAC) can reduce bitrate without reducing bit depth or sampling frequency, so this is a useful additional indicator, especially when comparing sound quality to file size for different file types.
Bitrates of uncompressed stereo PCM files, compared to a representative compressed (MP3) file:
- 16-bit 44.1kHz = 1411kbps
- 24-bit 48kHz = 2304kbps
- 24-bit 96kHz = 4608kbps
- typical MP3 bitrate = 128kbps
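The bitrate of uncompressed PCM is simply bit depth multiplied by sample rate multiplied by the number of channels. A minimal sketch of that arithmetic follows, with a rough storage figure per minute added for comparison; the function names are hypothetical.

```python
def pcm_bitrate_kbps(bit_depth, sample_rate_hz, channels=2):
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return bit_depth * sample_rate_hz * channels / 1000

def megabytes_per_minute(bitrate_kbps):
    """Approximate storage needed for one minute of audio, in megabytes (10^6 bytes)."""
    return bitrate_kbps * 1000 / 8 * 60 / 1_000_000

for bits, rate in [(16, 44_100), (24, 48_000), (24, 96_000)]:
    kbps = pcm_bitrate_kbps(bits, rate)
    print(f"{bits}-bit {rate / 1000:g}kHz stereo: {kbps:.0f}kbps, "
          f"about {megabytes_per_minute(kbps):.1f}MB per minute")

print(f"128kbps MP3: about {megabytes_per_minute(128):.2f}MB per minute")
```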
BWF and MBWF files can incorporate metadata with their audio content within the wrapper, to enable description of ownership, writing and production credits, publishing etc. This metadata can be stored as a BEXT chunk, or the newer and more flexible iXML chunk.
BEXT lacked an agreed structure or specification, and was used in many different and non-standard ways. iXML is designed to supersede the original BEXT chunk and offer a suitable framework for detailing multi-channel digital audio object properties comprehensively.
AIFF supports ID3 tags which, though not as fully integrated as they are with MP3, can be stored with the audio as an IFF chunk.
Further details of metadata implementations and strategies for audio can be found in our advice document Metadata and Audio Resources, including an example of a populated iXML chunk.
As the most accurate digital representation of a sound wave, uncompressed files are the recommended format for audio archiving.
The International Association of Sound and Audiovisual Archives (IASA) recommend Broadcast WAV as a suitable archival format, for reasons of its wide compatibility and support, and its embedded metadata capability. For surround-sound or multichannel audio the MBWF format should be used.
For archive PCM audio, bit depth should be a minimum of 24-bit, and sample rate a minimum of 48kHz to comply with IASA standards.
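A simple automated check against this minimum can be written with Python's standard `wave` module. This is a sketch under stated assumptions: the thresholds are the figures given above, the function names are hypothetical, and `wave` reads only plain PCM WAV files, so other variants would need a different reader.

```python
import io
import wave

MIN_BIT_DEPTH = 24
MIN_SAMPLE_RATE_HZ = 48_000

def meets_archive_minimum(wav_file):
    """Return True if a PCM WAV file is at least 24-bit and at least 48kHz."""
    with wave.open(wav_file, "rb") as w:
        bit_depth = w.getsampwidth() * 8      # sample width is reported in bytes
        return bit_depth >= MIN_BIT_DEPTH and w.getframerate() >= MIN_SAMPLE_RATE_HZ

def make_test_wav(sample_width_bytes, sample_rate_hz):
    """Build a short silent stereo WAV in memory, for demonstration only."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(2)
        w.setsampwidth(sample_width_bytes)
        w.setframerate(sample_rate_hz)
        w.writeframes(bytes(sample_width_bytes * 2 * 10))   # 10 frames of silence
    buf.seek(0)
    return buf

print(meets_archive_minimum(make_test_wav(3, 48_000)))   # 24-bit, 48kHz
print(meets_archive_minimum(make_test_wav(2, 44_100)))   # 16-bit, 44.1kHz
```

In practice `meets_archive_minimum` can be given a file path instead of an in-memory buffer, since `wave.open` accepts either.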
[Note: The performance of the computer audio interface used for archival duties is also of great importance to quality of audio file capture from analogue sources. More details of recommendations for archive quality converters can be found in our advice document Choosing an Audio Interface: Project Requirements and Choosing an Audio Interface: Technical Considerations]
In the event that your data storage facility has insufficient capacity for your audio at full uncompressed resolution, you may need to consider some form of data compression. While sacrificing standards compliance, this may occasionally be necessary, and depending on the objectives and requirements of your project, there are several alternatives.
Lossless compression is the least destructive of these alternatives, as it uses mathematical algorithms to encode the data in such a way that it can be restored to its former state without any audio degradation at all (hence the name). The only disadvantages of this process are that it requires an additional encoding/decoding stage, and that suitable codec software must be available to the end user.
Some open-source lossless compression codecs are available, which give the best long-term outlook for compatibility and support. Of these, FLAC is the most widely used, and is available to download free of charge via Xiph.org.
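The defining property of lossless compression, that decoding returns exactly the original data, can be demonstrated with a short round trip. Note that this sketch uses Python's general-purpose `zlib` compressor as a stand-in, not FLAC itself: audio-specific codecs such as FLAC typically achieve better ratios on real recordings, and the synthetic test tone used here compresses far more readily than real audio would.

```python
import math
import struct
import zlib

def make_pcm(n_samples=44_100, freq_hz=440, sample_rate_hz=44_100):
    """One second of a 16-bit mono sine tone as raw little-endian PCM bytes."""
    samples = [
        round(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate_hz))
        for n in range(n_samples)
    ]
    return struct.pack(f"<{n_samples}h", *samples)

original = make_pcm()
compressed = zlib.compress(original, 9)    # encode stage (9 = maximum compression)
restored = zlib.decompress(compressed)     # decode stage

print(f"Original:   {len(original)} bytes")
print(f"Compressed: {len(compressed)} bytes")
print(f"Bit-for-bit identical after decoding: {restored == original}")
```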
Our document Choosing a Digital Audio File Format gives further information on FLAC and uncompressed formats.
MBWF / RF64: An extended File Format for Audio - European Broadcasting Union - July 2009
Guidelines on the Production and Preservation of Digital Audio Objects - IASA - Second Edition, March 2009