Last updated: 10 November 2008
Published in: Digitising analogue media
Tags: analogue collections | digital preservation | digitisation
This document looks at points to consider when developing selection criteria for digitisation. Some of the points made here will not be relevant to all projects and collections, but the document is designed to serve as a useful overview and to stimulate further thinking and research. It is intended for resource managers who plan to digitise all or part of their collection.
Before digitising anything, it is important to establish procedures and criteria for selecting the material to be digitised. The decisions you make will depend on the particular aims and resources of your project, but it is vital that you develop some clearly defined criteria to guide the selection process.
It's important to note that in some cases there will be two steps to your selection. You might first be selecting a collection to digitise from among a number of possible collections and then choosing individual items from within that collection. This paper concentrates on the selection of items within collections, but many of the issues and principles will be relevant whether you're choosing a collection to digitise or selecting individual items.
Before you even begin to think about selection, you must first give some thought to why you're embarking on digitisation at all. It's all too easy to get caught up in the idea that "everyone is digitising, so we should be too". This is not, by itself, a good enough reason to undertake digitisation. Some better reasons are discussed in Deciding to Digitise. Rationales for digitisation will vary according to each collection and context. For some organisations the primary goal will be to widen access to a resource; for others, it might be to reunite a dispersed collection, or to reduce wear and tear on fragile originals by offering digital versions instead. Whatever your particular rationale, it's important that you're very clear about it, since it will influence the selection criteria you develop.
Of course, one approach to selection is not to select at all, and simply to digitise everything within a collection. While it would be unwise to take this as a default position, for some collections a comprehensive approach will indeed be appropriate. This is particularly the case where the collection is small and valuable, or where its integrity or usefulness would be compromised if it were not captured in its entirety. In a few situations, there may also be cost or efficiency reasons for capturing everything. For example, when outsourcing the digitisation of video materials, capturing only defined parts of a tape will increase costs. In these circumstances it may be preferable to digitise everything and then de-select once the digital copy comes back to you.
Even if you decide to undertake the comprehensive digitisation of a collection, it will still be important to prioritise the material – which will require you to develop some sort of criteria. One common approach in these circumstances is to prioritise according to user demand and then systematically fill in the gaps during the low-demand periods. Another is to begin by providing a representative sample of the collection and then progressively fill out the content.
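As a rough illustration of demand-led prioritisation, the sketch below orders items by how often they have been requested. The item identifiers and request counts are invented for the example, and prioritise_by_demand is a hypothetical helper rather than part of any real system.

```python
def prioritise_by_demand(request_counts):
    """Return item IDs ordered from most to least requested.

    Ties are broken alphabetically by ID so the order is repeatable.
    """
    return sorted(
        request_counts,
        key=lambda item_id: (-request_counts[item_id], item_id),
    )


# Hypothetical usage figures, e.g. reading-room requests over a year.
requests = {"album-03": 41, "album-01": 7, "album-07": 41, "album-02": 0}
queue = prioritise_by_demand(requests)
```

Items at the end of the queue, including those never requested, are the gaps to fill during low-demand periods.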
For most digitisation programmes or projects, selection will be a necessary and important part of the digitisation workflow. The purpose of this paper is to help you identify the factors you should take into account when developing your approach.
In order to develop an effective approach to selection, you must have a good understanding of your collection, of your users and their needs, and of the wider context within which your collection and its users exist. In addition to helping you make good selection decisions, this understanding will greatly assist you in identifying some of the risks involved in digitisation and in determining its benefits.
It's vital that you have a good understanding of the strengths and limitations of your collection, as these will influence which material you decide to include or exclude. In establishing this it will be important to consult the expertise of curators, collection managers and other staff who are involved with the collection on a day-to-day basis. Their detailed knowledge and advice will be of enormous help in understanding the material and its potential. Here are some specific things you may wish to investigate:
In addition to evaluating your collection, it is important that you undertake an assessment of your users and what they need. This might take a number of different forms (survey, focus group, analysis of previous usage), but it must establish who your potential users are, what parts of the collection they're likely to be most interested in, and how they might want to use your resources once digitised. You may find it helpful to make a distinction between core users who must be supported by your digital collection, and more peripheral users, who might be supported if there are sufficient resources. Here are some specific questions you may want to consider:
Your collection does not exist in a vacuum. There are wider things to consider that are likely to have some bearing on your selection, such as institutional priorities, trends in teaching or research, technological change, or the legal environment. Here are some things you might wish to consider:
The previous section has identified several factors that are going to impact on your selection. Many of these are collection- or context-specific, such as subject coverage, potential for commercial exploitation, or your collection's synergy with research or teaching priorities. However, there are some generic factors that are likely to have significant impact on your selection. This section pulls out a few of the main ones, discussing them in a bit more detail. Note that the factors here are predominantly exclusion criteria – i.e. reasons for not selecting certain items.
Copyright is one of the most important issues to address and will have a major influence on your selection and your project costs. Where your collection is fairly uniform (e.g. if it was created by one person or is confined to a narrow period of time), you may be able to assess the status of the material quite quickly. Where the collection is more diverse, you may need to consider the copyright status of each individual item, which is likely to take some time – although you could always check a sample to estimate the likely proportion of in-copyright material.
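Checking a sample can be sketched as follows. This is a minimal illustration, assuming you can list item identifiers and that someone can judge each sampled item. The function name, the is_in_copyright callback and the example collection are all hypothetical. The margin of error uses the simple normal approximation, which is unreliable for small samples or proportions close to 0 or 1.

```python
import math
import random


def estimate_proportion(item_ids, is_in_copyright, sample_size, seed=0):
    """Estimate the share of in-copyright items from a simple random sample.

    Returns (proportion, margin), where margin is an approximate 95%
    margin of error from the normal approximation. It ignores the
    finite-population correction, so it is slightly conservative.
    """
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn
    sample = rng.sample(list(item_ids), sample_size)
    hits = sum(1 for item_id in sample if is_in_copyright(item_id))
    p = hits / sample_size
    margin = 1.96 * math.sqrt(p * (1 - p) / sample_size)
    return p, margin


# Made-up collection of 1000 items in which every fourth item is in copyright.
collection = range(1000)
share, margin = estimate_proportion(
    collection, lambda item_id: item_id % 4 == 0, sample_size=100
)
```

In practice the callback would be replaced by a person's assessment of each sampled item, and the result would feed into your estimate of clearance costs.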
Copyright clearance is likely to involve considerable time and money, so you will need to decide whether it's important or worthwhile undertaking such effort for the items in your collection. If you decide that it is, then copyright may be a factor in your prioritisation of material for digitisation. If you decide that you cannot afford to undertake clearance, then copyright will become an important exclusion factor within your selection criteria. Either way, an item's copyright status should be one of the first things you consider when undertaking selection.
The condition of the original material will also be an important factor in your selection. Some of the material might be too fragile to risk digitising, or it might require expensive conservation work before it can be safely captured.
Where material is fragile, but stable in its current form, then you will need to consider whether the risk of digitisation is worth the gain. Sometimes digitisation can be viewed as a way of supporting the preservation of fragile originals (by reducing wear and tear through use) – but this might be undermined if the digitisation process itself subjects the material to greater stress than normal handling. You also need to bear in mind that in bringing more attention to your collection, your digitisation project might actually increase demand for access to the originals.
Where your original material is actively deteriorating and cannot be stabilised (e.g. magnetic audio tapes with ‘sticky shed’ syndrome) then your decision may be made on a different basis. If a resource is going to disappear, whether captured or not, it could be quite reasonable to undertake digitisation even though you know this is likely to hasten its decline.
Digitisation is a costly activity, so you might decide that this money and effort is better spent on unique material rather than duplicates. You might seek to avoid duplication with another, existing digital collection, or you might de-duplicate within your own collection if it contains lots of copies of the same item. This is not always straightforward: you might need to compare two or more copies to see which is going to be more suitable to digitise. You may need to make a decision between capturing an original item or a pre-existing surrogate. There may also be similar, near-duplicates to consider.
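Where catalogue records exist, a first pass at spotting possible duplicates might look like the sketch below. It is only an illustration: the field names (id, title, date) and the sample records are assumptions, and matching on normalised text will flag candidates for a person to compare rather than confirm that two items are the same.

```python
from collections import defaultdict


def normalise(text):
    """Lower-case and collapse whitespace so trivial differences are ignored."""
    return " ".join(str(text).lower().split())


def candidate_duplicates(records, key_fields=("title", "date")):
    """Group record IDs that share the same normalised key fields.

    Only groups with more than one member are returned. These are
    candidates for manual comparison, not confirmed duplicates.
    """
    groups = defaultdict(list)
    for record in records:
        key = tuple(normalise(record.get(field, "")) for field in key_fields)
        groups[key].append(record["id"])
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical catalogue entries.
records = [
    {"id": "n1", "title": "Market Square", "date": "1931"},
    {"id": "n2", "title": "market  square ", "date": "1931"},
    {"id": "n3", "title": "Market Square", "date": "1947"},
]
groups = candidate_duplicates(records)
```

Deciding which copy in each group is the better one to digitise still needs someone to examine the items themselves.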
You will also want to ensure there is adequate information about the items in your collection to enable retrieval and understanding of the resource. Where there is little or no original metadata (e.g. no known creator, unknown subject matter, unknown dates, etc.) you should ask yourself whether the item is worth digitising – or worth the expense of having experts provide you with a suitable identification. It is unlikely that there is going to be a clear-cut answer to this problem. One approach might be to specify a minimum amount of metadata required for material to be digitised, and then address separately any items that don't make the grade.
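A minimum metadata requirement could be checked along the following lines. The required fields chosen here (creator, date, subject) and the sample records are assumptions made for illustration; your own minimum will depend on your collection and your users.

```python
REQUIRED_FIELDS = ("creator", "date", "subject")  # assumed minimum; adjust to suit


def missing_fields(record, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or blank in a record."""
    return [f for f in required if not str(record.get(f) or "").strip()]


def split_by_metadata(records, required=REQUIRED_FIELDS):
    """Split records into (ready, needs_review) using the minimum standard."""
    ready, needs_review = [], []
    for record in records:
        if missing_fields(record, required):
            needs_review.append(record)
        else:
            ready.append(record)
    return ready, needs_review


# Hypothetical catalogue entries.
records = [
    {"id": "p001", "creator": "A. Smith", "date": "1923", "subject": "Harbour"},
    {"id": "p002", "creator": "", "date": None, "subject": "Portrait"},
]
ready, needs_review = split_by_metadata(records)
```

Items in needs_review are the ones to address separately, for example by deciding whether expert identification is worth the expense.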
Having carefully considered your users, your collection, the wider context and some specific problems such as copyright or duplication, you are now well placed to undertake the selection. However, rather than proceeding intuitively, it would be sensible to produce a document outlining the basis on which your material will be selected or excluded. Not only will this assist with the practicalities of the selection process, but it will also minimise the potential for "scope creep": allowing (or being pushed into) an enlargement of the scope of your digitisation project.
Your selection criteria could be documented in a number of different ways, such as a set of questions to answer, a decision tree or a matrix. They should be made explicit within your Project Specifications document. The diagram below offers a simplified decision tree from a photography digitisation project as an example.
Figure 1. Decision tree (simplified)

A non-graphical text version of this diagram is also available.
The order of your questions and assessments is likely to differ according to collection and context. If you have existing tools (e.g. catalogues) that will enable you to rule material in or out very quickly, then it would be sensible to employ these first, before any criteria that require you to pull things out of boxes and examine them closely for their physical condition.
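The idea of applying quick, catalogue-based checks before physical inspection can be sketched as a small decision function. This is not a reproduction of Figure 1. The criteria, their order and every field name are assumptions chosen to illustrate the principle, drawing on the exclusion factors discussed earlier (copyright, duplication, metadata and condition).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    # All fields are hypothetical; use whatever your catalogue records.
    item_id: str
    copyright_cleared: bool
    is_duplicate: bool
    has_minimum_metadata: bool
    too_fragile: Optional[bool] = None  # None until physically inspected


def select(item: Item) -> str:
    """Return 'exclude: <reason>', 'inspect' or 'select' for one item."""
    # Catalogue-based checks come first because they are quick and cheap.
    if not item.copyright_cleared:
        return "exclude: copyright"
    if item.is_duplicate:
        return "exclude: duplicate"
    if not item.has_minimum_metadata:
        return "exclude: metadata"
    # Only items that pass the desk checks need to come out of the box.
    if item.too_fragile is None:
        return "inspect"
    return "exclude: condition" if item.too_fragile else "select"


decision = select(Item("ph-0012", True, False, True, too_fragile=False))
```

Recording the reason for each exclusion also makes it easier to explain later how the digital collection differs from the physical one.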
You will also need to think about the order in which you are going to undertake the selection. You might be doing this from a printed list or by systematically working through shelves or cabinets. Again, the answer will very much depend upon your type of collection and its context. For example, if your collection is clearly labelled and physically ordered by creation date or subject, then there may be some advantages to working systematically from the shelves. Where it is arranged in a way that is less useful for selection (e.g. by acquisition date), you might prefer to begin with indexes and catalogues (if available).
It may be worthwhile undertaking a small pilot selection to help you identify any important criteria you have overlooked. This will also help you to see whether you are including or excluding too much content and where you may need to redraw the lines.
You may find that you're actually employing multiple sets of selection criteria within a digitisation project, especially if it's divided into a number of phases. You might, for example, have a first phase which aims to capture a representative sample of the collection, followed by further phases that provide more depth. Each of these phases is likely to require its own selection criteria.
While written selection criteria will be important in helping you avoid diversion or scope creep, they should not be set in stone. As you become more familiar with the collection and with the possibilities and constraints of the digitisation process, you may find that you need to make some adjustments to your selection process. This is normal and is likely to lead to revisions to your criteria.
If you have digitised only a selection of items rather than a complete collection, be explicit about your choices and make your criteria clear to your users. They need to understand what is included within the digital collection and how this compares with the collection as a whole. This is particularly important where the exclusion of items due to copyright or fragility has led to a digital collection with quite a different balance from the physical collection.