Casual dining options: informative sprinkles

Karla Youngs on Monday 14 February 2011

Experiments with YouTube Machine Transcription and Camtasia Speech-to-Text tools.

This screencast shows the machine transcription generated from a screencast which I recorded with Camtasia and uploaded to YouTube. While uploading the pre-prepared caption track, I noticed that YouTube had also automatically (and quite quickly) analysed the voice-over and generated its own caption track, using its built-in Speech-to-Text algorithm. As this is a feature I hadn't noticed before, I made a quick recording of the results:

Screencast URL: http://www.youtube.com/watch?feature=player_embedded&v=8SqVP_c0tf0#. There's quite a bit of dense text, so I'd suggest you watch in full screen (button at bottom right of the player).

The caption track can also be downloaded as text, so even if YouTube isn't your chosen delivery channel, you could just use it as a free online Speech-to-Text (STT) tool...
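If the downloaded track still carries cue numbers and timings, a few lines of Python can reduce it to plain text. This is only a minimal sketch: it assumes the file is in SubRip (.srt) layout, which may not be the format you are offered, and the function name `captions_to_text` and the sample cues are made up for illustration.

```python
import re

# Matches a SubRip timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}\s*-->\s*\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}"
)


def captions_to_text(srt: str) -> str:
    """Return the spoken text of a SubRip-style caption track as one string."""
    parts = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Drop the optional cue counter at the top of the cue.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line that follows it.
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        parts.extend(lines)
    return " ".join(parts)


# Hypothetical sample, not taken from a real download.
sample = """1
00:00:00,000 --> 00:00:03,500
planning is an absolutely vital stage

2
00:00:03,500 --> 00:00:06,000
of the workflow
"""

print(captions_to_text(sample))
```

The result is a single run of text that can be pasted into a word processor for correction.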

For the sake of comparison, I then removed both the original captions and those transcribed by YouTube from the Camtasia project, and used Camtasia's own automatic Speech-to-Text transcription tool to make a third set of captions, which generated the following version of the same section of speech:

"I discovered official media

in this greenhouse or by giving you a quick introduction to screen customer to work for which you can use spotted checklist when

planning any type of screen coating project

off the five key stages of the screen coating work for a similar to those for the production of other types of digital media and the for learning

objects and they are planning

reproduction production

post production and use of the

planning involves both the design of your screen cast including scripting 40 villages another content and also planning what resources you

will need to act as during production and delivery

of goods clink of design is focused on the set of learning goals we should inform your plans for the content length and formatted screen cost

target length in particular can be an important decision

which can add useful structural constraints

resources you will need to include people collaborators and technical assistance equipment access to a suitable workspace the

skills needed to capture prepare your materials for delivery

planning is an absolutely vital stage of the workflow and many of you most important decisions will be made before you press record

for the first time

once planning is complete you can start to prepare your workspace WorkStation &Materials to ensure a successful and stress free recording session

this is the preproduction stage of the workflow way the shore everything you need is to hand and the you've configured and where necessary tested your input devices your liking

want to pursue a call by UN protection

while this is the stage where you are she create much your contact you may find if you've planned and prepared successfully that can be quite a quick process

the main thing to bear in mind return of factors which can factual record and white background noise interruptions and they need to minimize or avoid them

different techniques for recording often a matter of personal style and preference of goodwill is to concentrate on a relaxed and natural performance record everything that a decisions about which version or

take to use until afterwards

if you plan to edit the work of your screen. The next stage of its production is where you will do it

this production can be as simple as Preston the export by not using outlook for Mac all he can involve complex and editing process..."

Note - I have 'trained' the Camtasia STT tool to recognise my voice, but it is still very hit-and-miss, as you can see! Particular words can also be taught (e.g. JISC Digital Media), which I have not yet done, so this may improve performance further.

The original screencast can be seen in its entirety here, and the captions made from the imported text of the script can be enabled by pressing the CC button in the transport bar.

I've had a lot of interest from workshop and online surgery attendees in production of captions and subtitles, and captions are a vital tool for providing access. While it's interesting to compare these STT transcripts, which are certainly better than nothing, their results still need heavy editing to be usable - a process which could take just as much time as manual transcription by a fast typist, if not more.
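To put a rough number on how much editing a transcript needs, one common measure is word error rate: the number of word insertions, deletions and substitutions needed to turn the STT output into the script, divided by the length of the script. The sketch below is one simple way to compute it, not a feature of either tool. The function name is made up, and the reference sentence is my guess at the scripted wording of one caption above, so treat it as an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Assumed script wording versus the Camtasia STT output quoted above.
script = "planning any type of screencasting project"
stt = "planning any type of screen coating project"
print(f"{word_error_rate(script, stt):.2f}")
```

Here two edits are needed against six script words. Running the same comparison over a whole script would give a rough guide to whether correcting the STT output is likely to be quicker than typing the captions from scratch.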

Still the most reliable method I have found (not being a touch typist) is to read the voice-over from a script where possible, and then import and sync it with the screencast in post-production.
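If your editor can import a timed caption file rather than plain text, the script can be pre-chunked with approximate timings before import. The following is a minimal sketch under several assumptions: that SubRip (.srt) is an accepted import format, that speech is evenly paced so time can be shared out by word count, and that the timings will still be adjusted by hand afterwards. The function names and the 60-character limit are illustrative choices only.

```python
import textwrap


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as a SubRip timestamp (HH:MM:SS,mmm)."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def script_to_srt(script: str, duration: float, max_chars: int = 60) -> str:
    """Split a script into captions and spread them across the duration."""
    # Collapse whitespace, then break into caption-sized chunks.
    chunks = textwrap.wrap(" ".join(script.split()), width=max_chars)
    total_words = sum(len(chunk.split()) for chunk in chunks)

    cues = []
    start = 0.0
    words_so_far = 0
    for number, chunk in enumerate(chunks, start=1):
        words_so_far += len(chunk.split())
        # Each caption ends at its proportional share of the running time.
        end = duration * words_so_far / total_words
        cues.append(
            f"{number}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{chunk}"
        )
        start = end
    return "\n\n".join(cues) + "\n"


script = (
    "Planning is an absolutely vital stage of the workflow, and many of "
    "your most important decisions will be made before you press record "
    "for the first time."
)
print(script_to_srt(script, duration=10.0))
```

Even with rough timings like these, the captions start out with the correct words, so the remaining work is nudging cue boundaries rather than correcting text.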
