Posted by Joel Eaton on Friday 29 November 2013
This article discusses some of the various options available, dispels some of the myths about audio transcription and provides some tips to help make audio transcription an easier and ultimately quicker task. If you have any experiences you would like to share or questions about audio transcription then feel free to use the comments section below this article for discussion.
Image from Flickr by upchuck_norris used with a Creative Commons Attribution-ShareAlike 3.0 Unported License.
What can transcription tools do for you?
The first thing to address (as it's often the first question people have about transcription) is that automatic transcription tools are sadly not as advanced as we’d like them to be. In an ideal world there’d be a tool that, with one click, would automatically convert your audio to word-perfect formatted text. But for some people putting the transcription of their audio into the hands of a machine is a daunting prospect. See the section below that discusses what automatic audio transcription can do if you use it wisely.
Ultimately the point of using or learning any software for audio transcription is to make a laborious task more bearable. Choosing the right one can help you transcribe faster and with greater ease. The trick is to ascertain what aspects of the process you need assistance with and then base any decisions regarding software or hardware on this. Do you feel you waste time switching between applications? Then maybe an all-in-one software solution will suit you. Do you get frustrated with forever having to rewind the audio to decipher the speech and allow you to catch up? Then maybe a USB foot pedal will give you better control of this. Or is it difficult to keep track of which documents go with which audio files? Then perhaps an application that does this for you is worth looking into.
It is easy to think of audio transcription as quite a boring task. However, one of the benefits of typing up your own recordings is that it familiarises you with the information again, embedding it and allowing for a fresh perspective after the recording has taken place. This can be really useful for recapping what you have captured in order to formulate ideas and interpretations of the data.
What do you need from your recordings?
There are many methodologies available for conducting audio recordings for research, and approaches to transcription should reflect the project objectives. Asking yourself what information you require from your recordings can help streamline your transcription workflow. The first question to ask is whether or not you actually need all of the text. If the answer is no, then outline what it is you do need. Often it may be wise to transcribe every word so that evaluations can be made at a later date. The benefit of this is that future research can access the data for different evaluations or to strengthen existing ones.
Examples of information:
- Bite sized quotes relevant to project aims
In some projects the majority of speech recordings are eventually reduced to a series of shortened passages that provide direct answers to the research questions, or capture the essence of a larger recording. Instead of filtering this information after transcribing, can this be done beforehand? If notes were taken during recording, can these be used to help here?
- Capturing tone and interactions
In many cases, especially interviews or oral history recordings, there is far more content beyond the text that can be just as important to capture. How do you intend to capture more human features like indicative pauses, sarcasm, emotional expressions or other intangible elements that exist between the lines? This could be problematic to have somebody else undertake, especially if they are unfamiliar with your research area, as issues of perception arise. Furthermore, this information could be embedded within the transcription or stored separately.
- Collecting thematic information
Your research data may be linked to specific themes and it may be useful to extract the specific content and then categorise this. Are there key words or phrases that can help you link these together? Can your recordings be approximately time separated into these themes due to the way the recordings have been structured?
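As a rough illustration of linking passages to themes by key words, the sketch below tags a passage with every theme whose key words appear in it. The theme names, the key words and the `tag_themes` function are all hypothetical examples rather than part of any particular tool, and the matching is a simple case-insensitive substring check.

```python
# Hypothetical themes and key words; replace these with terms from your own project.
THEMES = {
    "childhood": ["school", "parents", "growing up"],
    "work": ["factory", "wages", "shift"],
}


def tag_themes(passage, themes=THEMES):
    """Return the sorted list of themes whose key words occur in the passage."""
    text = passage.lower()
    return sorted(
        theme
        for theme, keywords in themes.items()
        if any(keyword in text for keyword in keywords)
    )


if __name__ == "__main__":
    print(tag_themes("My parents both worked the night shift at the factory."))
```

Because this is plain substring matching, a key word such as "school" will also match "schools"; a real project would probably need a more careful approach.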
How much time does it take to transcribe?
Attempting to calculate the time needed for transcription can be tricky. It will mostly depend on how quickly people are talking in your recordings and how fast you are at typing. Using some of the tips and tools below will help bring the time down but for a benchmark allow approximately 6 hours for every hour of conversational speech. This will include listening and typing, error checking, repeatedly navigating between software applications and having natural breaks from the work. This time can be cut by up to a half if you successfully use the practice of re-speaking for automated speech to text transcription (see the section below on this).
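The benchmark above can be turned into a quick planning calculation. The sketch below is only a rough estimate: the function name and parameters are hypothetical, the default of 6 hours per hour of audio is the benchmark from the text, and the re-speaking option applies the best-case halving mentioned above.

```python
def estimate_transcription_hours(audio_hours, hours_per_audio_hour=6.0, respeaking=False):
    """Rough planning estimate of the working hours needed to transcribe a recording.

    hours_per_audio_hour defaults to the 6:1 benchmark for conversational speech.
    respeaking=True applies the best case of cutting the time by half.
    """
    estimate = audio_hours * hours_per_audio_hour
    if respeaking:
        estimate *= 0.5
    return estimate


if __name__ == "__main__":
    # Two and a half hours of interviews, typed manually and then with re-speaking.
    print(estimate_transcription_hours(2.5))
    print(estimate_transcription_hours(2.5, respeaking=True))
```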
Useful tips for transcribing
- When typing, copy frequently used words or phrases to the clipboard. For example, if your passage is about genetic algorithms it is likely that the phrase genetic algorithms will be used a lot. Being able to paste this instead of typing it out each time can save you time.
- Slowing down the audio during playback can reduce how often you need to stop and start the audio. This is especially useful for fast speech. It can even be possible to slow the sound down to a speed that matches your typing speed, eliminating the need to regularly stop and start the clips of audio.
- Integrate computer tasks into one or two dedicated software applications. Playing back audio files in a standard audio player (such as Windows Media Player or iTunes) and manually transcribing in a word processor (MS Word being the obvious choice) is sufficient for many as it ‘does the job’. But there are some free and paid-for tools available that can make the job easier and quicker. They not only save you constantly jumping between applications, but are designed specifically for transcription, something that neither Word nor iTunes is explicitly designed for. See below for more details of transcription software.
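The tip about slowing the audio to match your typing speed can be sketched as a simple ratio. Everything here is an assumption for illustration: the function name, the idea of measuring both speeds in words per minute, and the lower limit of half speed (players differ in how far they can slow audio down).

```python
def matched_playback_rate(typing_wpm, speech_wpm, min_rate=0.5):
    """Suggest a playback rate (1.0 = normal speed) so speech arrives at typing speed.

    typing_wpm: your typing speed in words per minute.
    speech_wpm: the speaker's pace in words per minute (you need to estimate this).
    min_rate:   hypothetical lower limit on how far the audio can be slowed.
    """
    if typing_wpm <= 0 or speech_wpm <= 0:
        raise ValueError("speeds must be positive")
    rate = typing_wpm / speech_wpm
    # Never speed the audio up, and never go below the slowest allowed rate.
    return max(min_rate, min(1.0, rate))


if __name__ == "__main__":
    # A 90 wpm typist listening to speech at an assumed 150 wpm.
    print(matched_playback_rate(90, 150))
```

If the suggested rate hits the lower limit, you will still need to pause occasionally, which is where a foot pedal or hotkeys help.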
Automatic speech to text transcription
This issue deserves its own section as it is so commonly misunderstood: is there software available that can automatically transcribe speech recordings? And if so, how good is it? The simple answer is yes: if your recordings are of one person (preferably yourself) and are of excellent quality, then tools do exist that can perform automatic transcription, with some extra effort from you. If your recordings are of multiple people and/or are not of excellent quality then sadly the short answer is no. The reason is that for automatic transcription to work effectively, the software needs to be trained to understand a particular voice. If there is more than one voice for it to be trained to comprehend then things become very complicated and frankly it's best avoided. However, there is a fairly simple way of using automatic transcription when you have poor quality recordings and multiple speakers: ‘re-speaking’. Having trained the software to work with your own voice, you listen to your recordings and repeat the words into a microphone, so that the software only ever has to deal with the voice it handles best. This staggered process of listening and repeating can be particularly arduous and time consuming. Furthermore, all automated text should be thoroughly checked for any transcription errors.
If the idea of automatic transcription still appeals then Dragon NaturallySpeaking, by Nuance, is highly regarded as one of the best applications for auto transcription. With a range of options for mobile devices and for Mac and PC computers the Premium Edition currently costs £60 ($99) with their educational discount.
Express Scribe by NCH
Windows/Mac OSX. Free version or Pro version (£19)
A very popular program designed not to replace your word processor but to work alongside it, Express Scribe is handy for keeping things in the ever-familiar Word. It provides a simple interface for having full control of the audio, either via a USB foot pedal or by assigning key controls, without ever having to leave Word. It also has the option for transcribing text inside it if needed. A pedal setup wizard is designed to help set things up smoothly (see below for more on pedals), and using the F keys to control the audio whilst you are in other applications (such as Word) is a brilliant alternative. One of its best features is the ability to speed up the speech, or slow it down, to help adjust to a comfortable working speed. Express Scribe can also sync Word docs with your audio files, file management being a crucial aspect of transcription, and even has a one-click function to email transcriptions. With a free version and a Pro version (currently on offer at £19 ($29.99)) offering compatibility with more proprietary audio file formats, Express Scribe is a great way to speed up the transcription process without having to learn your way around a whole new word processor.
Screenshot of ExpressScribe interface
InqScribe
Windows/Mac OSX. Free 30 day trial or full license (£80)
For a more streamlined package bursting with handy features, InqScribe offers a great alternative to the ‘Word add-on’ style of Express Scribe. Although relatively expensive (£80 - 14 day free trial), InqScribe has a good-looking interface and is really easy to get started with. It feels like it has been designed very carefully for this purpose. One of the major differences between these two programs is that InqScribe allows you to control video files for transcribing, and then export the videos with subtitles; ideal for making additional accessible resources. There is also the ability to insert media time information into text files, which can be useful when sharing recordings and text. For example, at any point in the transcription you can click to enter the corresponding time within the audio file [00:00:56.18]. This also helps you align video subtitles correctly.
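If you ever need to produce similar time markers yourself, for example when typing in a plain word processor, the sketch below formats a position in seconds in the same bracketed style as the example above. It is not InqScribe's own code: the function name is hypothetical, and treating the final field as a frame count at 30 frames per second is an assumption, so check the convention your own software uses.

```python
def format_timecode(seconds, fps=30):
    """Format a position in seconds as [hh:mm:ss.ff].

    The final field is assumed to be a frame count at `fps` frames per second;
    this is an illustrative assumption, not a documented InqScribe rule.
    """
    if seconds < 0:
        raise ValueError("seconds must not be negative")
    total_frames = round(seconds * fps)
    whole_seconds, frames = divmod(total_frames, fps)
    minutes, secs = divmod(whole_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}.{frames:02d}]"


if __name__ == "__main__":
    print(format_timecode(56.6))
```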
Screenshot of InqScribe interface
USB Foot Pedal
A USB foot pedal can really help speed things up. A pedal allows you to start, pause and rewind audio on your computer, freeing up your fingers to work solely on the typing instead of switching between programs and working the sound controls. The first port of call for getting your hands on one is your institution's media loans department, as they are likely to stock these.
If they don’t have any it’s worth picking up a second-hand one (approx. £30 on eBay), as they can easily be resold. Buying new, a Philips USB Foot Control LFH2310 will set you back £78.75. At the other end of the scale is the Infinity USB pedal for £36.99.
Paid for transcription services
Many people are happy to pay for transcription, but unfortunately this is not an option for everybody. For those who are willing to pay to outsource transcription, there are some excellent companies out there that don’t charge the earth. Local companies that come recommended are worth investigating first; ask your departmental administration team and/or colleagues for some leads. Most companies charge by the minute of audio, which gives you a good guide to how much you’ll end up paying in total. An example of prices taken from speechtotextservice.com is given below, as a guideline.
| Turnaround Time | Good Quality Audio | Bad Quality Audio |
|---|---|---|
| Standard Service up to 14 days | £0.50/min | £0.60/min |
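Because most companies charge per minute of audio, a total can be estimated with a single multiplication. The sketch below uses the two guideline rates from the table above; the function and dictionary names are hypothetical, and real quotes will vary by company and turnaround time.

```python
# Guideline rates in pounds per minute of audio, taken from the table above.
RATES_GBP_PER_MIN = {"good": 0.50, "bad": 0.60}


def estimate_cost(audio_minutes, quality="good"):
    """Estimate the outsourcing cost in pounds for a recording of the given length."""
    return round(audio_minutes * RATES_GBP_PER_MIN[quality], 2)


if __name__ == "__main__":
    # A one-hour interview at each audio quality.
    print(estimate_cost(60, "good"))
    print(estimate_cost(60, "bad"))
```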
All prices in this article accurate on 25/11/13
Posted by Karla Youngs on Friday 15 November 2013
Posted by Steve Hull on Tuesday 05 November 2013
We've had a quick look at the statistics for our helpdesk over the past few months and have a few observations to share.
Posted by Steve Hull on Tuesday 29 October 2013
Posted by Sophie Allen on Monday 14 October 2013
Announced at the beginning of October, this provides FE institutions with direct access to over 130,000 images, film and audio files licensed for educational use. As well as this there are over 1 million additional items, available in various public collections.
Posted by Sophie Allen on Wednesday 09 October 2013
The workshops are aimed at providing you with useful skills and knowledge that can be applied to using digital media in teaching and learning.
Posted by Sophie Allen on Wednesday 02 October 2013
Running for one hour, the webinar will explore searching on the web, including Jisc, Jisc Collections and new resources from the Jisc Digitisation Programme.
Posted by Sophie Allen on Monday 23 September 2013
Below is a summary of the workshops we are offering throughout October and November. Underneath the information for each workshop is a link to each workshop's page where you can find course fees and sign up.
Posted by Sophie Allen on Monday 16 September 2013
The workshop will focus on filming and editing lectures and interviews, and all equipment and facilities are supplied. Participants will undertake a variety of activities culminating in the creation of a short video. Through doing so, they will learn a variety of skills including how to plan a video project, how to competently use a video camera, and basic camera and video editing techniques.
Posted by Sophie Allen on Tuesday 10 September 2013
The webinar will explore the platforms most popular for creating MOOCs and their approaches to video.
Its launch is part of a lead-up to the commemoration of next year’s World War One Centenary. The archive was digitised by Iron Mountain and Her Majesty’s Courts and Tribunals Service (HMCTS).
Posted by Sophie Allen on Tuesday 06 August 2013
According to the official YouTube Partners & Creators Blog, this opens up live streaming to a wider audience and users are advised to check their ‘Account Features’ where there is the option to enable this feature.
Kinograph is a free, open source resource which enables anyone to build a film scanner which will allow them to digitise old 35mm, 16mm or 8mm movie film.