User Guide to MP3
A guide to the creation, manipulation and use of audio files in the popular MP3 format, including a close look at the many audio and metadata options which MP3 offers, and its use within teaching and learning.
- Why use MP3?
- What is MP3?
- Archival suitability
- Who is MP3 for?
- How does MP3 work?
- How to make an MP3
- How to manage and play MP3s
- Delivery tools
- ID3 - Tagging and the MP3 metadata schema
- The future for MP3
Most of us have experience of MP3 audio in some form, either through the internet, a portable audio player, a podcast, or in one of its many other incarnations. MP3 is one of the most widely used and accepted media formats in the digital world, and this advice document aims to explain in more detail what it is, how it works, and the features and options which the format offers to enable you to get the most from it.
We'll also look at why MP3 is so popular, who its audience is, and its suitability to the pedagogic needs of audio in education.
MP3's key benefit to all users is its reduced file size when compared to an uncompressed audio file. This smaller size enables faster delivery via the internet, easier sharing and portability, and reduced storage requirements. Where file transfer speed and internet bandwidth are an issue, MP3s offer a simple solution with few drawbacks.
Either in its own right as an audio podcast or download, or enhancing visual materials such as presentations, slideshows and video, audio improves accessibility, and can be a useful tool for engaging those less attuned to text-based learning styles.
MP3 is an excellent format for delivering audio in education, and for embedding into enhanced learning materials. Not only can audio recordings and archives be easily and flexibly compressed as MP3 and tagged for delivery by podcast or download, but MP3 audio can be embedded into PowerPoint presentations and used within production tools like Camtasia, GarageBand, and most non-linear video editors.
Whether MP3 is the final target format, or a working format for audio within a larger project, it provides a flexible and easily managed feature set.
As an example of an audio object, a vinyl LP carries far more information than the audio signal alone, and its labels and sleeve can provide information on copyright, production, lyrics, personnel, as well as cover art and sometimes background information on historical or musical context. In this sense it is a rich audio document.
In the same way, an MP3 can carry a large amount of extra information to inform the listener and enhance the learning experience, though it does so in digital form. For example, a recording of a poetry seminar could be accompanied by text excerpts, images, quotations, references etc, to be viewed while listening to the audio:
Screengrab from iPhone of MP3 audio playback accompanied by embedded text
See ID3 - Tagging and the MP3 metadata schema below for more ways to package additional information with MP3 audio.
Defined by the Moving Picture Experts Group (MPEG), MP3 is an open standard - ISO/IEC 11172-3 - forming the audio part of the MPEG-1 standard ISO/IEC 11172. Files in the MPEG-1 format (file extensions .mpg .mpeg) incorporate video encoded as MPEG-1 and/or up to 2 channels of audio, which is encoded as MP3.
MP3 is an abbreviation of 'MPEG1 Audio Layer 3' (not, as is sometimes wrongly assumed, MPEG-3), and is a method of compressing (reducing the size of) a digital audio signal. The MP3 compression algorithm reduces the complexity of the signal using a method called 'Perceptual Modelling' (explained below), simplifying the audio content and thus enabling it to be more succinctly expressed in digital form - the result: a smaller file with minimal audio degradation.
However, during the encoding process some of the audio information is irretrievably lost, hence MP3 is termed a 'lossy' compression format. Full range audio cannot be restored from an MP3. Depending on the factor by which the filesize is reduced, the degradation in audio quality from MP3 compression can be either subtle, almost to the point of transparency, or - at high compression ratios - clearly audible, where its effects can become intrusive.
MP3 now also allows extensive tagging of the audio file with details associated with its ownership, production and contents - a system which can be used to catalogue and manage collections of MP3 files in an intuitive and flexible fashion.
MP3 was developed between 1987 and 1991 by engineers at Fraunhofer Gesellschaft as an attempt to reduce digital audio file size with the minimum degradation of perceived audio quality. The format was adopted in 1992 by the Moving Picture Experts Group (MPEG) as part of its first standard for digital file compression - MPEG1 - and included as part of the 1993 ISO/IEC standard 11172 by the International Organization for Standardization (ISO). Fraunhofer were awarded a US Patent for MP3 in 1996, and following the release of the first free MP3 player - Winamp - in 1998, closely followed by the first commercial online sales of music in MP3 format in 1999, MP3 rapidly established itself as the format of choice for delivering compressed audio files via the internet.
The MP3 standard has since been expanded through its inclusion in the subsequent MPEG2 standard.
Also introduced in 1999, the portable MP3 player allowed the user for the first time to transfer digital files to a convenient portable device and interchange files between this mobile player and a larger library located on computer, without needing to use any physical media (disc, tape etc). Owners of CD collections could now encode or 'rip' them to much smaller MP3s for use on correspondingly smaller portable players, whilst retaining acceptable sound quality, or music could be bought online and downloaded already encoded to MP3. In the years since, this method of listening has become virtually ubiquitous, especially amongst younger generations, and has brought deep change to the music and broadcast industries.
As stated earlier, and explained in more detail below, MP3 is a destructive or 'lossy' form of audio compression, and as such is unsuitable for archiving sensitive audio data. That said, the content of some audio material destined for long term storage will not be significantly compromised by the MP3 compression process, and the benefits of reduced storage requirements and rich metadata may sometimes outweigh the concomitant loss of sound quality. This will be at the discretion of the user. If a lossy format is deemed acceptable for your archiving requirements, you should also look at the newer AAC format, offering all the benefits of MP3 but with superior audio performance.
For standards-compliant archiving, an uncompressed Pulse Code Modulation (PCM) file format should be used, preferably in accordance with the recommendations of the International Association of Sound and Audiovisual Archives (IASA), who recommend Broadcast WAV (Waveform Audio File Format) as a suitable archive format. However, some respected bodies (eg the BBC) have elected to use MP3 and AAC (Advanced Audio Coding) for some parts of their archive. This document hopes to explain some of the reasoning behind such a decision, to enable users to make an informed choice.
As an open standard with much free encoding and decoding software in existence, there is little danger of format obsolescence in the foreseeable future. With such a large amount of material already in MP3 form, support at least for playback and conversion into other formats is anticipated to be long term.
MP3 is a mature format, and has been improved and honed to meet user needs for over ten years. There are unavoidable side effects of lossy compression, but at high bit rates the differences between MP3 and uncompressed audio are subtle and often only discernible by expert listeners making direct A/B comparisons. When compared to other audio file types MP3s offer a good balance of small file size, good quality audio, rich metadata and public acceptance.
However, when auditioned on reference monitors MP3s can exhibit a slightly 'hollow' sound, lacking solidity in the stereo image. Additionally at low bit rates temporal quantisation can create phantom 'pre-echo', caused by the processing of the audio in packages or 'blocks', and MP3 is sometimes characterised as having a sharper, more abrasive and 'metallic' sound.
The MP3 standard also does not include an analysis band for frequencies above 15.3kHz, and therefore will not reproduce them. For this reason it cannot be considered 'full range' audio, even at the highest available bitrates (the human hearing range being generally accepted to be approximately 20Hz-20kHz).
MP3 has a maximum of 2 simultaneous channels (stereo), and cannot therefore be used for surround sound or any other format requiring more than two audio channels [Note - the MPEG-2 standard does have a 6-channel implementation of MP3, but this is accessible only within the MP2 container].
While low bitrate MP3s can be noticeably compromised in audio quality, at higher rates (192kbps and over) they are largely indistinguishable from uncompressed originals, or 'transparent', to the casual listener.
MP3 Key Facts
- Open standard ISO11172-3, part of MPEG1 and MPEG2 media compression standards
- Lossy compression method, based on psychoacoustic modelling
- Sample Rates: 32kHz, 44.1kHz, 48kHz [16, 22.05 and 24kHz MPEG2 standard only]. 44.1 kHz is by far the most commonly used (same as CD).
- Data Rates: 8 to 320 kbps, giving filesize compression ratios of between 176:1 and 4:1 when compared to uncompressed CD audio (1411kbps)
- Maximum 2 channels - Monophonic, Split stereo and Joint stereo modes
- ID3 v2.x tagging system (metadata)
- Very widely used and supported
- Superseded by the AAC format
As the first commercially available compressed audio filetype, MP3's popularity and acceptance have grown exponentially alongside the growth of the internet - its natural habitat, due to its smaller storage and bandwidth requirements, easy cataloguing and management. All computers and mobile media devices (including smartphones) do or can support MP3 audio - almost without exception - and the acceptance of the format is correspondingly wide.
Most users are comfortable accessing and playing back MP3s, with more advanced users happy to encode, tag and organise their MP3 collections with the files' metadata. Expert users have access to one of the most comprehensive embedded metadata schemata of any digital audio filetype - the ID3 tagging system - and can build and use rich audio objects enhanced with images, transcriptions and extensive notes.
The reduced data rate of MP3 also makes it (in common with other lossy formats) an ideal medium for audio streaming via the internet. A live audio stream can be broken down into smaller packets, transmitted via the internet, reconstructed by a webcast client and played by the listener anywhere in the world within a matter of seconds. This process - known as 'webcasting' - differs from podcasting in that it does not require that the audio is downloaded to the listener's computer before playback, but allows live playback from source at the time of webcast.
Alternatively MP3 can be streamed 'on demand' from an embedded player, which again allows playback but not download. When the player's controls are operated the audio is streamed from the web server. There are many of these players available, mostly Flash based. Streaming audio is often accompanied by a separate link to download the file, which is one of the most flexible combinations of online delivery methods currently available, and our embedded audio player is of this type.
Thanks in part to this versatility the public profile of MP3 is arguably higher even than those of long-established uncompressed formats such as AIFF and WAV, and indeed MP3 could well be one of the most globally recognised digital formats of any sort. It was - in the words of one of its key architects, Karlheinz Brandenburg - "the right technology available at the right time". As such, MP3 is an excellent delivery format for reaching the widest and most diverse audience.
MP3s are playable on the vast majority of current digital audio systems and - as one of the first popular digital audio standards - legacy support is also good. MP3 players of all generations are still able to play back current files (though some metadata included with newer files may not be viewable on some devices - see ID3 standard and versions) and there are no issues with cross-compatibility or transferability.
Much MP3 compatible software is open source or free, and while comparable hardware players do cost money, most students and final users will already own an MP3 compatible device of some sort for personal use - either a portable media player, or integrated into a 'smart' phone - and MP3 distribution can broadly be seen as an enabler to sharing and using audio.
The first action of an MP3 encoder is to divide the source audio into 'frames' for individual analysis. At the common 44.1kHz sample rate each frame covers about 0.026 seconds of audio - equivalent to approximately 38 frames per second.
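The frame rate quoted above can be checked with a little arithmetic. The sketch below assumes that each frame carries 1152 audio samples, which is the figure for MPEG-1 Layer III in the MPEG audio specification rather than something stated in this guide; the function names are purely illustrative.

```python
# Samples carried by one MPEG-1 Layer III frame (an assumption taken from
# the MPEG-1 audio specification; this guide does not state the figure).
SAMPLES_PER_FRAME = 1152


def frame_duration(sample_rate_hz: int) -> float:
    """Return the playing time of one frame in seconds."""
    return SAMPLES_PER_FRAME / sample_rate_hz


def frames_per_second(sample_rate_hz: int) -> float:
    """Return how many frames are needed for one second of audio."""
    return sample_rate_hz / SAMPLES_PER_FRAME


if __name__ == "__main__":
    for rate in (32000, 44100, 48000):
        print(f"{rate} Hz: {frame_duration(rate) * 1000:.2f} ms per frame, "
              f"{frames_per_second(rate):.2f} frames per second")
```

At 44.1kHz this gives roughly 26 milliseconds per frame, in line with the figures above; other sample rates give slightly different frame durations.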
The audio in these frames is then analysed and compressed to a target number of bits using psychoacoustic modelling (see below), followed by a further lossless data compression stage. Each frame is assigned a 32-bit header, comprising a synchronisation reference number and various other identifiers of the frame's contents (bitrate, sample rate, joint/split stereo etc). The header is then followed by the frame's audio data. This series of frames constitutes the standard MP3 file.
MP3 frame header contents:
- Frame sync reference - 11 bits
- MPEG audio version - 2 bits
- MPEG layer - 2 bits
- Protection on/off - 1 bit
- Bitrate (from look-up table) - 4 bits
- Sampling Rate (from look-up table) - 2 bits
- Padding bit - 1 bit
- Application-specific reserved bit - 1 bit
- Channel mode (mono/dual/split stereo/joint stereo) - 2 bits
- Mode extension (for joint stereo mode) - 2 bits
- Copyright (on/off) - 1 bit
- Original (on/off) - 1 bit
- Emphasis - 2 bits
Total - 32 bits
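To make the header layout above more concrete, here is a minimal Python sketch which unpacks the thirteen fields from four bytes. The field order and widths follow the list above; the look-up tables are assumed from the MPEG-1 Layer III part of the specification and cover that case only, and the names parse_frame_header and describe are illustrative rather than part of any standard library.

```python
import struct

# Field names and bit widths in the order listed above, most significant bit first.
FIELDS = [
    ("sync", 11), ("version", 2), ("layer", 2), ("protection", 1),
    ("bitrate_index", 4), ("sample_rate_index", 2), ("padding", 1),
    ("private", 1), ("channel_mode", 2), ("mode_extension", 2),
    ("copyright", 1), ("original", 1), ("emphasis", 2),
]

# Look-up tables for MPEG-1 Layer III only (assumed from the MPEG-1 audio
# specification). None marks "free format" or reserved index values.
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["split stereo", "joint stereo", "dual channel", "mono"]


def parse_frame_header(raw: bytes) -> dict:
    """Split the first four bytes of a frame into the thirteen header fields."""
    if len(raw) < 4:
        raise ValueError("a frame header needs four bytes")
    (word,) = struct.unpack(">I", raw[:4])  # big-endian 32-bit integer
    fields = {}
    shift = 32
    for name, width in FIELDS:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    if fields["sync"] != 0x7FF:  # all eleven sync bits must be set
        raise ValueError("frame sync not found")
    return fields


def describe(fields: dict) -> dict:
    """Translate the look-up indices into human-readable values."""
    return {
        "bitrate_kbps": BITRATES_KBPS[fields["bitrate_index"]],
        "sample_rate_hz": SAMPLE_RATES_HZ[fields["sample_rate_index"]],
        "channel_mode": CHANNEL_MODES[fields["channel_mode"]],
    }


if __name__ == "__main__":
    header = parse_frame_header(bytes.fromhex("fffb9064"))
    print(describe(header))
```

The example bytes FF FB 90 64 are a commonly seen header for a 128kbps, 44.1kHz joint stereo file. A real parser would also need to skip any ID3 tag at the start of the file and handle the MPEG-2 variants of the tables.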
'Psychoacoustic' is a term used to describe how we hear and perceive sounds. Audio measurement equipment, or skilled and experienced listeners may be able to discern differences between subtly altered sounds, but often the 'average' listener cannot.
For example, when we hear a loud noise (a drum, car horn etc) the brain will momentarily 'focus in' on it, drawing attention away from other background noises and effectively 'masking' them. This is an involuntary and subconscious process common to all humans. MP3 encoding takes advantage of this psychoacoustic phenomenon to make largely imperceptible changes to the audio signal, and so reduce the amount of information needed to express it in digital form, reducing its file size commensurately.
If an audio signal is split into many different frequency bands, then the fidelity of the quieter bands can be reduced and their data simplified (or in some cases removed completely) when a louder sound is present in another band to mask the effect, without hugely affecting the overall subjective sound quality to listeners. The fidelity of louder frequency bands - which the brain listens to more closely - is preserved, and when all bands are recombined these will 'mask' the reduced quality quieter signals, and fool the brain into thinking that all the signal is being played at the higher quality. This 'psychoacoustic masking' is the key principle behind MP3 and several other audio compression algorithms based on psychoacoustic encoding.
MP3 encoders split the signal into 22 frequency bands and then process each band separately (though interdependently) for storage. These signals are then decoded and recombined for playback.
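The masking idea can be illustrated with a deliberately simplified toy. This is not the MP3 psychoacoustic model: the band levels, the 20dB and 40dB thresholds, the 'reach' of two neighbouring bands and the bit counts are all invented for illustration. It only shows the principle that a quiet band sitting next to a much louder one can be given fewer bits, or none at all.

```python
def allocate_bits(levels_db, full_bits=16, reduced_bits=8,
                  reach=2, mask_gap_db=20.0, drop_gap_db=40.0):
    """Toy bit allocation: give fewer bits to bands masked by louder neighbours.

    levels_db    - loudness of each frequency band, lowest band first
    reach        - how many bands either side can mask a given band
    mask_gap_db  - a band this far below a neighbour gets reduced_bits
    drop_gap_db  - a band this far below a neighbour is dropped entirely
    All thresholds are hypothetical values chosen for the example.
    """
    allocation = []
    for index, level in enumerate(levels_db):
        lo = max(0, index - reach)
        hi = min(len(levels_db), index + reach + 1)
        loudest_nearby = max(levels_db[lo:hi])
        gap = loudest_nearby - level
        if gap >= drop_gap_db:
            allocation.append(0)             # fully masked: discard
        elif gap >= mask_gap_db:
            allocation.append(reduced_bits)  # partly masked: lower fidelity
        else:
            allocation.append(full_bits)     # audible: keep full fidelity
    return allocation


if __name__ == "__main__":
    bands = [60, 58, 20, 35, 70, 45, 10, 12]  # made-up band levels in dB
    bits = allocate_bits(bands)
    print(bits, "->", sum(bits), "bits instead of", 16 * len(bands))
```

With these invented figures the eight bands need 72 bits rather than 128, because the quiet bands close to the loud 70dB band are simplified or removed. A real encoder makes the same kind of decision with a far more detailed model of human hearing.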
Most MP3 encoder software allows you to start with any type of audio file (including another MP3), specify encoding or import options, and then render the compressed MP3 file.
Once you have selected your encoding options (bitrate, stereo mode, VBR etc. - see below), simply select your original file and choose 'Convert to MP3', 'Import Selection' or the equivalent menu command within your encoding software.
All encoders will have a set of importing options for creating MP3 files, which can be set to provide a file of suitable size and quality for your uses. These are usually accessed through the software's 'Preferences' or 'Options' menu, and can be set within the following limits:
'Bitrate' describes how many bits of binary data are used to store each second of audio. Simply put, the lower the bitrate, the smaller the file (all other factors being equal).
The target bitrate of an MP3 is the chief factor in determining its filesize and compression ratio. Bitrates of between 8 and 320kbps (kilobits per second) can be selected, with 128kbps often considered the minimum bitrate below which audio artefacts start becoming easily discernible - a good rule of thumb if you are unsure, giving roughly 11:1 filesize reduction.
As a reference point, standard CD quality audio (16bit 44.1kHz) has a data throughput rate of 1411kbps, so even at its best quality (320kbps), MP3 will offer better than 4:1 compression of CD quality audio.
- <64kbps - low quality - clearly audible compression artefacts
- 128kbps - good quality - roughly 11:1 compression ratio with little audible degradation. Default setting for many encoders
- >192kbps - high quality - barely discernible from uncompressed originals without careful critical listening
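The compression ratios and file sizes quoted in this section follow directly from the bitrates. A short sketch of the arithmetic, taking CD audio (44.1kHz, 16 bit, stereo) as the uncompressed reference; the function names are illustrative only.

```python
# Uncompressed CD audio: sample rate x bit depth x channels, in bits per second.
CD_BITRATE_BPS = 44100 * 16 * 2  # 1,411,200 bits per second


def compression_ratio(mp3_kbps: float) -> float:
    """Return how many times smaller an MP3 is than CD audio of equal length."""
    return CD_BITRATE_BPS / (mp3_kbps * 1000)


def megabytes_per_minute(mp3_kbps: float) -> float:
    """Return the size of one minute of audio in megabytes (10^6 bytes)."""
    bits_per_minute = mp3_kbps * 1000 * 60
    return bits_per_minute / 8 / 1_000_000


if __name__ == "__main__":
    for kbps in (8, 64, 128, 192, 320):
        print(f"{kbps:>3} kbps: {compression_ratio(kbps):6.1f}:1, "
              f"{megabytes_per_minute(kbps):.2f} MB per minute")
```

At 128kbps this gives a ratio of about 11:1 and just under one megabyte per minute; at 320kbps the ratio is about 4.4:1.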
Variable BitRate (VBR) encoding offers further file size reduction by lowering the bitrate during simpler or silent sections and raising it for complex passages, while aiming for a target average bitrate. Each frame of data in an MP3 file includes a header which specifies that frame's bitrate, so while the 'master' bitrate is used as the target average setting, the bitrate of individual frames can be increased or decreased according to the complexity of the material they contain. The LAME encoder is often thought to excel at VBR encoding.
Some older players can have problems playing back VBR files, but this is rare and easily overcome by installing a newer free player.
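Because every frame in a file represents the same length of audio, the overall bitrate of a VBR file is simply the mean of its per-frame bitrates. The sketch below illustrates this with an invented sequence of frame bitrates; the numbers are hypothetical and do not come from a real file.

```python
def average_bitrate_kbps(frame_bitrates_kbps):
    """Return the average bitrate of a stream given each frame's bitrate.

    Frames all last the same time, so a plain mean is sufficient.
    """
    if not frame_bitrates_kbps:
        raise ValueError("at least one frame is required")
    return sum(frame_bitrates_kbps) / len(frame_bitrates_kbps)


if __name__ == "__main__":
    # Hypothetical one-second extract: a quiet passage, ordinary material,
    # then a complex passage which needs more bits.
    frames = [32] * 10 + [128] * 20 + [256] * 8
    print(f"average bitrate: {average_bitrate_kbps(frames):.1f} kbps")
```

In this made-up case the file averages close to 128kbps, yet the complex frames were encoded at twice that rate and the quiet frames at a quarter of it.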
A stereo signal can be encoded as two discrete mono signals (Split Stereo mode), using double the bitrate of a monophonic signal. Alternatively the stereo signal can be expressed as a sum & difference representation of the stereo spread of the signal (Joint Stereo mode), similarly to the Mid/Side stereo recording technique.
Also, Joint Stereo mode will sometimes (depending on your encoding software's implementation) reduce the stereo spread of bass frequencies - again to help simplify the signal with minimal audible effect (another psychoacoustic process). Location of bass signals within the stereo image can be difficult, even for expert listeners, so essentially making the bass content monophonic is a simple way of reducing file size.
Joint Stereo can reduce the file size, particularly for signals with simpler stereo images, but may have some adverse effects for sound quality; auditioning and comparison of both Joint and Split stereo encoded files is recommended where possible.
- Joint Stereo
- Split Stereo
- Dual Channel
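The sum and difference idea behind Joint Stereo can be sketched in a few lines. This is a simplified Mid/Side conversion on a handful of made-up sample values, not the exact arithmetic an MP3 encoder uses (real encoders apply their own scaling and switch modes frame by frame).

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (sum) and side (difference) signals."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Rebuild the original left/right samples from mid and side."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


if __name__ == "__main__":
    # Illustrative samples: the first two are identical in both channels.
    left = [0.5, 0.25, -0.5, -0.25]
    right = [0.5, 0.25, -0.25, -0.5]
    mid, side = to_mid_side(left, right)
    print("mid: ", mid)
    print("side:", side)
    print("round trip ok:", from_mid_side(mid, side) == (left, right))
```

Where the two channels are similar the side signal is at or near zero, so it can be stored with very few bits. This is why Joint Stereo is most effective on material with a simple stereo image.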
As there are several stages to the encoding process, each of which can involve varying levels of detail in their analysis, some encoders offer the ability to balance between speed and quality of encoding. Unless dealing with a very large quantity of audio, or using a very slow computer, it is advisable to choose the 'best quality' option where possible.
Many pieces of software exist which are capable of encoding MP3 files, though this is a non-standardized process, and they will use different algorithms yielding different results. Some incorporate the LAME (LAME Ain't an MP3 Encoder) encoding engine, which was developed as an open source educational tool. LAME is a well evolved project offering high quality encoding (especially at higher and variable bitrates), as well as freely available and customisable source code. Some digital audio applications use their own proprietary encoding algorithms, licensed by the developers according to the various MP3 patents.
These encoder algorithms all use different implementations of the same psychoacoustic encoding principle described above, and produce (sometimes subtly) different results for MP3s compressed to the same bitrate. The starting point is the same and so is the target format and file size, but this goal can be reached by different routes.
Since the encoding method used does not however in any way affect future playback compatibility, the choice of encoder is entirely a personal and subjective one. It is also the subject of some debate, with most encoders having their own supporters and detractors. Critical users may want to research the various options and alternatives, or carry out comparative listening tests themselves.
As a general rule, any application incorporating the LAME encoder should give good results and (being open source) a relatively future-proof workflow:
LAME open source project - available as plug-in for Audacity, iTunes and many other host applications.
Unlike encoding, MP3 decoding (playback) is a standardised process, and part of MP3's official definition, so different players should not give significantly different results. Playback quality will still of course be affected by other elements of the computer/player's audio chain, including the Digital to Analogue (DA) converter and associated post-conversion volume controls (headphone level on a soundcard for example), but your choice of playback software should have no major repercussions for sound quality if chosen from the MPEG recommended player list.
[Note on digital audio playback from a computer: It is very important to present the DA converter with as large a range of digital values to convert as possible, thereby allowing it the highest resolution signal with which to work. Where possible, to avoid internal bit reduction all internal (digital) level controls should be at 0dB (ie no cut or boost to the signal - often the maximum setting), and the playback level through headphones or speakers altered only after conversion has taken place, and preferably as the last stage in reproduction. This maxim is true of all audio playback.
Think of it like taking a digital photograph of an optician's sight chart from across a large room; if the chart is large (analogous to full volume) and fills the whole frame, then the photo will not need enlargement for the letters on the chart to be seen clearly. If, however, the chart is small (low volume), then the digital photo will need to be zoomed or enlarged, and in the process pixels and other artefacts will begin to show, making the letters more indistinct and details less clear. The blank wall photographed around the chart is the equivalent of wasted digital 'headroom' (space available for recording but not used), and is similarly discarded.
In short, keep the volume control of your MP3 playback software at full, and if you want to turn down the volume then use the controls on your amplifier or headphone output (if available) to do so.]
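The cost of turning the level down in the digital domain can be estimated with a rule of thumb: each bit of resolution corresponds to about 6dB of level. The sketch below assumes a simple fixed-point volume control with no dither or extended internal precision, which will not be true of every player; the function name is illustrative.

```python
import math

# One bit doubles the number of available levels: 20 * log10(2) is about 6.02 dB.
DB_PER_BIT = 20 * math.log10(2)


def effective_bits(bit_depth: int, attenuation_db: float) -> float:
    """Estimate the resolution left after attenuating a signal digitally."""
    return max(0.0, bit_depth - attenuation_db / DB_PER_BIT)


if __name__ == "__main__":
    for cut_db in (0, 6, 12, 24, 48):
        print(f"{cut_db:>2} dB digital cut on 16-bit audio leaves "
              f"about {effective_bits(16, cut_db):.1f} bits")
```

Under these assumptions a 24dB cut made in software leaves roughly 12 of the original 16 bits for the converter to work with, whereas the same cut made on an amplifier after conversion costs no resolution at all.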
Portable MP3 players
Portable MP3 players are available in a large variety of sizes and capacities. They will usually have access to limited metadata - for example browsing files by artist or genre - and some will display cover art on small screens.
They obviously decode the MP3 data (by the standardized method mentioned earlier), but also convert the digital bitstream into analogue sound and output to headphones or an external amplifier, and as such will have an effect on playback quality.
Some players are designed to work best with specific pieces of software, so check compatibility and functionality if you have a strong preference for a particular feature or file management system.
Web Services
The reduced size of MP3 audio files makes them the ideal medium for delivering audio via the internet, and the web is - as noted earlier - MP3's natural environment. MP3 audio can be uploaded to a VLE, various online file sharing services or other 'Cloud' repositories for access and download by users. Many sites have embedded players which allow the user to listen to the audio online, without downloading it.
Similarly MP3 is ideal for audio podcasting, with a stereo 128kbps MP3 requiring less than 1MB of download per minute of audio. Furthermore the ID3 tagging system offers many useful options for organising, cataloguing and managing podcast episodes and series (though see 'Device compatibility' below for caveats).
Device compatibility (and the accompanying metadata which can be viewed thereon)
All modern operating systems and portable media players are MP3 compatible, as are practically all mobile phones. Thanks to standardized decoding, playback quality will depend largely on the compression settings chosen at the time of encoding, and the quality of the Digital to Analogue (DA) converters of the playback device or its audio interface (soundcard).
While playback of the audio file is of course a core feature of all these devices, some will offer better features for interrogating the file's tags and metadata than others. For example the iPhone, while a capable portable MP3 player, allowing the user to search by artist, genre, title etc, and even view lyrics and images, still provides no facility for viewing the complete ID3 contents of a file.
Constructing a thorough metadata schema for your files is obviously an advantage. However, the range of devices on which the material may be played back should be considered when planning what information to enter into which fields - for example, most portable players have no way of searching or viewing MP3s by Composer, Date, Record Label etc, but only core values such as Artist, Album and Genre. These core fields should therefore contain all vital file information, even if duplicated elsewhere.
Reduced file size is of course a key benefit of MP3 use, but even setting aside the convenience of smaller files MP3s offer the advantage of a larger, more accessible and well supported metadata structure than almost any other audio filetype. This is especially valuable in education for placing the audio in context (artistic, historical or otherwise) and enabling correct use and attribution of its content. Additionally, MP3s' ability to package substantial amounts of simple text and images with the audio enables access for all users by providing scope to include notes and text transcripts, as well as accompanying images.
MP3 uses the ID3 tagging system, which allows a standardized metadata package or 'container' to be attached to the MP3 audio data to allow it to be described in detail, and text and images relating to the audio to be stored with it.
ID3 was designed with MP3 tagging specifically in mind, so integration is seamless and transparent. However, ID3 containers are quite separate from the MP3 audio data, and can be used by several other audio file types - including AIFF, AAC and MP4, though with varying degrees of integration. ID3 is a de facto standard for embedded audio metadata, though newer container formats (eg MPEG-4) offer flexible and scalable alternatives.
The first ID3 tags were a small 128 byte block, appended to the end of the file, for storing descriptive track information, and were used from 1996 to 1998. These version 1 tags offered four 30-character text fields and a few other numerical fields. Information could be added describing:
- Title
- Artist
- Album
- Year
- Comment
- Genre (from a list of 80 predetermined genres)
- Track number (v1.1 only)
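Because the version 1 tag has a fixed size and layout it is straightforward to read. The sketch below assumes the conventional ID3 v1 layout: the last 128 bytes of the file, beginning with the marker 'TAG', with the v1.1 track number stored in the final byte of the comment field. The function names and the example values are illustrative.

```python
import os
import struct

# "TAG" marker, title, artist, album, year, comment, genre index = 128 bytes.
ID3V1_LAYOUT = struct.Struct("<3s30s30s30s4s30sB")


def _text(raw: bytes) -> str:
    """Decode a fixed-width field, dropping null padding and spaces."""
    return raw.split(b"\x00", 1)[0].decode("latin-1").strip()


def parse_id3v1(block: bytes):
    """Return the tag fields as a dict, or None if no v1 tag is present."""
    if len(block) != 128 or block[:3] != b"TAG":
        return None
    _, title, artist, album, year, comment, genre = ID3V1_LAYOUT.unpack(block)
    track = None
    # v1.1: a zero byte followed by a non-zero byte ends the comment field.
    if comment[28] == 0 and comment[29] != 0:
        track = comment[29]
        comment = comment[:28]
    return {"title": _text(title), "artist": _text(artist),
            "album": _text(album), "year": _text(year),
            "comment": _text(comment), "genre": genre, "track": track}


def read_id3v1(path: str):
    """Read the final 128 bytes of a file and parse them as an ID3 v1 tag."""
    with open(path, "rb") as handle:
        handle.seek(-128, os.SEEK_END)  # raises OSError if the file is shorter
        return parse_id3v1(handle.read(128))


if __name__ == "__main__":
    # Build a hypothetical tag in memory rather than relying on a real file.
    comment = b"Recorded live".ljust(28, b"\x00") + b"\x00\x05"
    example = ID3V1_LAYOUT.pack(b"TAG", b"Poetry seminar", b"A. Lecturer",
                                b"Week 3", b"2010", comment, 12)
    print(parse_id3v1(example))
```

The genre is stored only as an index into the predefined list, so a player needs its own copy of that list to show a genre name.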
Users quickly outgrew this capacity, and version 1 was replaced by the more flexible ID3 v2 system in 1998. ID3 v1 tags are no longer in general use, though most players will read them.
Over its first couple of years ID3 v2 was further revised and expanded, and ID3 v2.4 (the current standard) was introduced in November 2000. Most MP3 players are ID3 v2.4 compatible, with the notable and inexplicable exception of Windows Media Player, which can mishandle and even overwrite or damage v2.4 tags. If full cross-compatibility with WMP users is desired, it is advisable to use v2.3 tags.
The v2.4 ID3 metadata container is divided into 'frames', each of which consists of a header giving an identifier of the frame's contents (eg 'title', 'artist', 'year', 'image' etc from the list of declared frame values) and its size, followed by the frame data itself - text or a jpeg or png image, audio settings (such as equalisation), web links etc. The declared size tells the reader where the frame ends, and the identifier for the next frame follows immediately.
The ID3 v2.4 structure allows the creation and attachment of multiple standard (declared) and custom metadata frame types, each of which may be up to 16MB in size, with a maximum total container size of 256MB. Frame length is variable, and will expand to fit up to 65,536 Unicode UTF-8 or UTF-16 characters or a 16MB image per frame. Earlier versions of ID3v2 use the UCS-2 character set.
There are 83 declared frame types described within the ID3 v2.4 standard. Each of them is assigned a four character identifier - for example, COMM for comments, WPUB for a link to the publisher's official web page and APIC for an attached picture. Some of these frames can occur more than once (APIC for example) and others - which describe unique attributes - only once. These frame types cover all the common MP3 attributes such as title, artist, cover art, composer, lyrics, genre etc as well as some more esoteric and less well known values. (See 'declared frame values' link above for a full listing)
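The frame structure can be explored with a short sketch which builds a tiny v2.4 tag in memory and reads it back. It relies on details taken from the ID3 v2.4 specification rather than from this guide: a 10-byte tag header beginning 'ID3', a 10-byte header on each frame, and sizes stored as 'synchsafe' integers (seven usable bits per byte). Extended headers, unsynchronisation and frame flags are ignored, and the function names and example values are illustrative.

```python
TEXT_ENCODINGS = {0: "latin-1", 1: "utf-16", 2: "utf-16-be", 3: "utf-8"}


def synchsafe_encode(value: int) -> bytes:
    """Store a number in four bytes using only the low seven bits of each."""
    return bytes((value >> shift) & 0x7F for shift in (21, 14, 7, 0))


def synchsafe_decode(raw: bytes) -> int:
    """Reverse synchsafe_encode."""
    value = 0
    for byte in raw:
        value = (value << 7) | (byte & 0x7F)
    return value


def build_text_frame(frame_id: str, text: str) -> bytes:
    """Build one text frame: identifier, size, two flag bytes, then the data."""
    body = b"\x03" + text.encode("utf-8")  # leading 0x03 marks UTF-8 text
    return (frame_id.encode("ascii") + synchsafe_encode(len(body))
            + b"\x00\x00" + body)


def build_tag(frames: list) -> bytes:
    """Wrap frames in a tag header: 'ID3', version 2.4.0, no flags, size."""
    payload = b"".join(frames)
    return b"ID3" + bytes([4, 0, 0]) + synchsafe_encode(len(payload)) + payload


def read_frames(tag: bytes) -> list:
    """Return (identifier, value) pairs for every frame in a simple v2.4 tag."""
    if tag[:3] != b"ID3" or tag[3] != 4:
        raise ValueError("not an ID3 v2.4 tag")
    end = 10 + synchsafe_decode(tag[6:10])
    position, frames = 10, []
    while position + 10 <= end:
        frame_id = tag[position:position + 4]
        if frame_id == b"\x00\x00\x00\x00":  # padding reached, no more frames
            break
        size = synchsafe_decode(tag[position + 4:position + 8])
        data = tag[position + 10:position + 10 + size]
        if frame_id.startswith(b"T") and data:  # text frames start with 'T'
            value = data[1:].decode(TEXT_ENCODINGS[data[0]]).rstrip("\x00")
        else:
            value = data  # leave other frame types as raw bytes
        frames.append((frame_id.decode("ascii"), value))
        position += 10 + size
    return frames


if __name__ == "__main__":
    tag = build_tag([build_text_frame("TIT2", "Poetry seminar"),
                     build_text_frame("TPE1", "A. Lecturer")])
    print(read_frames(tag))
```

TIT2 and TPE1 are the declared identifiers for title and lead performer. For real files a dedicated tag editor or library is a safer choice, since it will handle the parts of the standard omitted here.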
Because ID3 is quite music orientated, there are some Dublin Core elements which are absent or are not easily represented within the ID3 schema, although conversely all ID3 values can be mapped to the more open and flexible Dublin Core. The Dublin Core is an international standard (ISO 15836:2009) for metadata architecture, used to describe digital items of all types. To ensure maximum interoperability of your resources you should be aware of their compliance (or not) with it.
If desired fields are not available within ID3 then custom frame types can be constructed within a suitable editor, but this will inevitably sacrifice an element of standards compliance and associated interoperability with other users. Alternatively a separate metadata structure can be used; refer to our advice document Metadata and Audio Resources for guidance.
Editing ID3 tags
ID3 tags can be accessed and edited in several ways. Depending on the number of files whose metadata you need to edit, and the depth of view required, these are some possible methods:
- Windows XP allows viewing and editing of a basic set of ID3 tags. Simply select the file(s) whose data you wish to edit and right click>Properties>Summary>Advanced. This will display a list of core tags (Title, Artist etc), some of which are editable.
- Most software media players can organise their library by Artist, Genre, Album etc, using the ID3 tags, and often offer more viewable tag values than Windows alone. They often also allow editing of single or multiple files' metadata.
- For in-depth editing of ID3 tags there are many dedicated tag editor applications. ID3.org provide a list of recommended ID3 tag editors and implementations.
When managing a collection of MP3s it is often necessary to assign the same attribute to multiple files, or successive values of a parameter to a series of files. This is known as 'batch editing', and is easily accomplished in several ways, via the methods just described.
Batch editing requires the selection of multiple files (usually by holding down the Ctrl or Shift key while selecting); shared fields can be viewed and edited via the 'Get Info' or 'Properties' command (or equivalent). Most ID3 editors feature batch editing tools.
Some players create custom metadata which they access to describe files in their library, but which is not written to the MP3 files themselves. For example, Apple iTunes allows a user rating of 1 to 5 stars, and stores this rating information in .itl and .xml files in its library folder; thus, if the MP3s are transferred to a different player, the ratings will not go with them.
This is also incidentally how iTunes allows extensive tagging of files which do not themselves have metadata containers (WAV for example).
Care should be taken when editing tags that one is editing the 'raw' metadata - rather than updating the player's library alone - if it is intended that these tags travel with the audio.
MP3 does not provide any way to apply a serial copy protection system or DRM (Digital Rights Management). This omission may account in some part for the eagerness with which AAC, which does allow DRM, has been embraced by the music industry (see The Future for MP3 below).
2004 saw an abortive attempt to introduce a DRM protected version of MP3, but compatibility issues brought this initiative to a swift end.
MP3 has evolved significantly to meet the global needs of the internet and users of digital audio. It has in turn had a profound influence on these groups and on the commercial, educational and creative audio industries, and continues to dominate the market for delivering and sharing digital audio, with which it is inextricably linked. However, though it has proved flexible and eminently fit for purpose for ten years, MP3 is not perfect.
MP3 compression has well documented - if sometimes marginal - detrimental effects on audio quality. The lessons of MP3, combined with the intervening years of research, have revealed better methods of psychoacoustic encoding which cannot be implemented within the MP3 standard.
The new AAC (Advanced Audio Coding) format, which is based on the same psychoacoustic approach as MP3 but with improved encoding algorithms, gives improved sound quality for the same filesize reduction and is positioned as the natural evolution and successor to MP3. Many major suppliers of MP3 (most notably the iTunes Store) switched some time ago to AAC as their chosen format, so it would seem the writing is on the wall. Both objectively and subjectively AAC is superior to MP3, and AAC forms part of the MPEG-2 and MPEG-4 standards.
Like many popular legacy audio formats, MP3 will no doubt hold on long after it has been superseded by superior formats; however, unlike the old physical format wars there is no requirement for new hardware to play back the new formats, and suitable players can be downloaded freely and immediately. If the audience are informed, and care about improved sound quality or smaller files, these new formats will logically replace MP3. Interestingly however, according to a study by Stanford Professor of Music, Jonathan Berger, some (especially younger) users may be becoming accustomed to the effects of MP3 compression to the point of actually in some cases preferring it to the sound of uncompressed audio.
Despite its inevitable obsolescence, accessing MP3 data is not likely to present problems for future users, as there are several complete open source solutions for MP3 encoding, management and playback, which provide a framework for long term support.
- MP3 and AAC Explained - Karlheinz Brandenburg, Fraunhofer Institute for Integrated Circuits
- Study by Jonathan Berger (cited by Murad Ahmed and Kaya Burgess) 2009
- The MP3 open standard and the music industry's response to Internet piracy - Robert F. Easley, John G. Michel, Sarv Devaraj 2003
- Sustainability of Digital Formats, Planning for Library of Congress Collections - Library of Congress
- ISO/IEC 11172-3 - International Organization for Standardization
- MP3: The Definitive Guide - Scot Hacker 2000
- Guidelines on the Production and Preservation of Digital Audio Objects - IASA March 2009