Quality Assurance and Digitisation Projects
This document examines issues relating to quality assurance from a project management perspective. It proposes a four-layer model for assuring quality and looks at some of the more common reasons for problems with output quality. It is intended for collection managers planning to digitise their resources and for managers of digitisation projects.
Quality assurance (QA) is an integral part of the successful digitisation workflow. It should be established during the planning stage and implemented throughout the project.
The 'quality' of a digital resource can only be defined in terms of its proposed use. A resource that is perfect for one use may well be entirely inappropriate for another.
Many issues will affect the selection of standards for a digitisation project. Is the goal to create a near-perfect likeness of the original, or simply to convey its informational content? How will the resource, once digitised, be delivered? Unfortunately, it is impossible to define one generic standard for 'best' or 'acceptable' quality. Every project has a unique set of aims and objectives and needs to set quality standards that reflect them. Projects with similar collections and similar aims are likely to use similar standards, but even these will suit most such projects rather than all of them.
Remember, QA is not limited to the creation of the digital resource itself: you must also ensure that associated metadata is useful and accurate, and that it remains so.
Acceptable quality standards, or benchmarks, should be determined, quantified and agreed with all stakeholders during the planning stage of a digitisation project. They should be based on an assessment of user needs and on your initial workflow feasibility testing. This process should be undertaken by the project management team and may well highlight differences between users' expectations and the proposed project deliverables. Once established, benchmarks must be documented and included among the project specifications.
Many factors can affect the quality of a digital resource: the condition of the original material, the type of digitisation equipment used, the skill of the operator, the resolution, post-digitisation optimisation or re-mastering, and the choice of file format or compression algorithm. Quality benchmarks are often a trade-off between potential quality and expense. For example, higher resolutions produce larger files and consequently require more storage space, as illustrated below. Similarly, checking the quality of resources individually may produce better results than batch processing, but will add to staff time and costs.
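As a rough illustration of the resolution/storage trade-off, the sketch below estimates the uncompressed size of an A4 colour scan at two resolutions; the figures are illustrative, not a recommendation for any particular project.

```python
# Rough illustration: uncompressed size of an A4 colour scan at two resolutions.
# A4 is about 8.3 x 11.7 inches; 24-bit colour uses 3 bytes per pixel.

def uncompressed_size_mb(width_in, height_in, dpi, bytes_per_pixel=3):
    """Approximate uncompressed image size in megabytes."""
    pixels = (width_in * dpi) * (height_in * dpi)
    return pixels * bytes_per_pixel / (1024 ** 2)

for dpi in (300, 600):
    print(f"A4 scan at {dpi} dpi: ~{uncompressed_size_mb(8.3, 11.7, dpi):.0f} MB")
# Doubling the resolution quadruples the pixel count, and hence the file size.
```

At 24-bit colour, an A4 page comes to roughly 25 MB at 300 dpi and 100 MB at 600 dpi before compression, so the choice of resolution has a direct and substantial cost in storage.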
Quality assurance is better thought of as an attitude to work than as an external testing system. Everyone involved in the project needs to take responsibility for ensuring quality at all times. QA should be pervasive and can usefully be considered in four layers:
- Process QA
- Automated QA
- Personal checking
- User fault reporting
Process faults are normally outside the control of the operator and need to be addressed by the project manager or a technical manager. They typically relate to one or more of the project's documented processes. The entire digitisation project should be guided by such documented specifications; these typically include:
- Project specifications - management-level detail of what the project will deliver, drawn up after a survey of user needs
- Selection criteria - setting out how materials will be chosen for digitisation
- Workflow manual - step-by-step instructions for all workflow processes, including digitisation, file processing and metadata handling. Workflow manuals are drawn up after feasibility or 'time-and-motion' testing
- Chosen thesauri and vocabularies - setting out the terms to be used and conventions for recording names, dates and places
Although the results of your work will be experienced subjectively, a quality assurance system must aim for as much objectivity as possible. This means ensuring that all digitisation hardware is calibrated to a specified standard and that workflow processes are automated wherever possible. Calibration and automation are especially important for maintaining consistency in projects that involve multiple digitisers or cataloguers. Whilst trying to deskill the operator's job entirely will certainly be counterproductive, human error is likely to be the biggest cause of faults, and appropriately applied automation can improve the quality of output. Areas where automated QA can be successfully applied include (see the sketch after this list):
- File management - a digital asset management system can be used to name and keep track of digital resources and their metadata. It can also provide an audit trail, recording what has been done to a file, when and by whom.
- Creation of surrogates - files can often be reliably transcoded in batches for delivery purposes.
- Metadata collection - some metadata can be derived automatically from the management system. If some metadata already exists in an electronic form, it should be migrated rather than re-keyed, to avoid introducing errors. Similarly, if new data is required, typing errors can be minimised by using check-boxes, drop-down lists and spell-checking facilities.
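As a minimal sketch of what automated QA might look like in practice, the script below checks a batch of WAV masters against a file-naming convention and some benchmark properties. The naming pattern, the `masters` directory and the benchmark values are all hypothetical; a real project would take them from its documented specifications.

```python
# Minimal sketch of an automated QA pass over a batch of WAV masters.
# The naming convention (two letters + seven digits) and the benchmark
# values are hypothetical; substitute your project's documented standards.
import re
import wave
from pathlib import Path

NAME_PATTERN = re.compile(r"^[A-Z]{2}\d{7}\.wav$")
BENCHMARK = {"sample_rate": 44100, "channels": 2, "sample_width_bytes": 2}

def check_file(path: Path) -> list[str]:
    """Return benchmark violations for one file (an empty list means it passes)."""
    faults = []
    if not NAME_PATTERN.match(path.name):
        faults.append("file name breaks the naming convention")
    with wave.open(str(path), "rb") as wav:
        if wav.getframerate() != BENCHMARK["sample_rate"]:
            faults.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getnchannels() != BENCHMARK["channels"]:
            faults.append(f"has {wav.getnchannels()} channel(s)")
        if wav.getsampwidth() != BENCHMARK["sample_width_bytes"]:
            faults.append(f"sample width is {8 * wav.getsampwidth()} bits")
    return faults

for path in sorted(Path("masters").glob("*.wav")):
    for fault in check_file(path):
        print(f"{path.name}: {fault}")  # feed these into the exception log
```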
Many digitisation projects have to work at a rate faster than operators will, at least at first, be comfortable with, so mistakes are to be expected.
However good the QA process is, and however much automation is introduced, all output still needs to be checked and 'signed off' before final release. Mistakes are inevitable, and the QA system should be seen as a filter that gradually refines the material until it meets the required standards. As with proofreading, work should be signed off by someone other than the person who originally processed the file or metadata.
This sign-off system should be built into the project workflow and should enable the team to know not only that the quality of every file has been checked, but also by whom and when it was signed off. This piece of metadata forms a fundamental part of the QA audit history.
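A lightweight way to capture this sign-off metadata is an append-only audit log; in the sketch below, the field names and log format are illustrative rather than drawn from any standard.

```python
# Illustrative sign-off record appended to a QA audit log (one JSON object
# per line). Field names are hypothetical, not drawn from any standard.
import json
from datetime import datetime, timezone

def sign_off(file_id: str, checker: str, log_path: str = "qa_audit.log") -> None:
    """Record that `checker` has verified `file_id` against the benchmarks."""
    record = {
        "file_id": file_id,
        "signed_off_by": checker,
        "signed_off_at": datetime.now(timezone.utc).isoformat(),
        "status": "approved",
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")

sign_off("AD0000388.wav", "Alex Brand")
```

An append-only log has the useful property that earlier sign-offs cannot be silently overwritten, which suits an audit history.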
Despite your best efforts, a few faults will find their way into the digital collection. Once delivery is underway, it is good practice to provide a method for users to report errors back to the collection manager or digitisation team so that they can be rectified as soon as possible.
Details of any files ('exceptions') that fail to meet a QA benchmark should be recorded, and any patterns or similarities between them noted in periodic QA reports. These reports can be used by project managers to rectify faults attributable to the digitisation or cataloguing workflows.
A fault report system should be implemented that allows faults to be recorded both within the project team before delivery and by end users after delivery. Once faults have been recorded in the system they can be checked by the project team, addressed, fixed where possible, and re-signed off for delivery; one possible record structure is sketched below.
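The sketch below shows one possible shape for such a fault record, with a simple lifecycle from initial report through to re-sign-off; the status names and fields are illustrative, not a prescription.

```python
# One possible fault lifecycle for a report system: states and the record
# kept for each fault. The state names and fields are illustrative.
from dataclasses import dataclass, field
from enum import Enum

class FaultStatus(Enum):
    REPORTED = "reported"          # logged by a team member or an end user
    UNDER_REVIEW = "under review"  # being checked by the project team
    FIXED = "fixed"                # addressed, awaiting re-sign-off
    SIGNED_OFF = "signed off"      # re-checked and cleared for delivery

@dataclass
class FaultReport:
    file_id: str
    description: str
    reported_by: str
    status: FaultStatus = FaultStatus.REPORTED
    history: list = field(default_factory=list)

    def advance(self, new_status: FaultStatus, by: str) -> None:
        """Move the fault to a new state, preserving the audit trail."""
        self.history.append((self.status, new_status, by))
        self.status = new_status

report = FaultReport("AD0000388.wav", "Very quiet", "Jill Townsend")
report.advance(FaultStatus.UNDER_REVIEW, "Alex Brand")
```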
Depending on the nature of the project and the type of digital file, this will mean collecting various pieces of data. The simple examples below are from a project dealing with digital sound files.
Table 1.1 Example end-user error report form
| End-user error report form | |
| --- | --- |
| File name or identifier | AD0000388.wav |
| Nature of the fault | Very quiet |
| How to fix (if known) | Make it louder! |
| Name of fault reporter | Jill Townsend |
Table 1.2 Example in-project exception handling form
| Exception handling form | |
| --- | --- |
| Checked by | Alex Brand |
| Barred from delivery until addressed | Yes |
| Action taken | +5 dB increase to gain |
| New sign-off date | 13/11/08 |
This metadata creates a 'QA audit history' for all files and their faults within the digitisation project, which the project manager can then draw on for periodic QA reports. Exceptions can usually be classified as either 'process' or 'operator' faults. Process faults might be brought about by faulty hardware or software, low-quality original materials (either analogue masters or legacy digital files), or inaccurate original catalogue information. Operator faults occur within the day-to-day digitisation workflow and are attributable to human error.
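The 'very quiet' fault in Table 1.1 is exactly the kind of exception an automated check can catch before delivery. The sketch below flags 16-bit PCM WAV files whose RMS level falls below a threshold; the -30 dBFS figure is illustrative and should be set from your own benchmarks.

```python
# Sketch: flag unusually quiet 16-bit PCM WAV files by their RMS level.
# The -30 dBFS threshold is illustrative; set it from your own benchmarks.
import math
import struct
import wave

def rms_dbfs(path: str) -> float:
    """RMS level of a 16-bit PCM WAV file, in dB relative to full scale."""
    with wave.open(path, "rb") as wav:
        assert wav.getsampwidth() == 2, "sketch assumes 16-bit PCM"
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack(f"<{len(raw) // 2}h", raw)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(max(rms, 1e-9) / 32768)  # 32768 = 16-bit full scale

# Flag a file as an exception if it is quieter than the benchmark.
if rms_dbfs("AD0000388.wav") < -30:
    print("Exception: file is very quiet; bar from delivery until addressed")
```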
A QA system is only as good as the people who put it into operation. The whole team must be able to view and assess the quality of data as it moves through the workflow, compare it against the benchmarks, and easily revisit any output considered to have a fault.
Figure 1.3 Suggested quality assurance workflow diagram