Selection Procedures for Digitisation

Last updated: 10 November 2008
Published in: Digitising analogue media
Tags: analogue collections | digital preservation | digitisation

Summary

This document looks at points to consider when developing selection criteria for digitisation. Some of the points made here will not be relevant to all projects and collections, but the document has been designed to serve as a useful overview and to stimulate some further thinking and research. It is intended to be of use to resource managers who intend to digitise all or part of their collection.

Contents

  1. Introduction
  2. Understanding your collection, users' needs, and context
  3. Some key issues to address
  4. Getting practical
  5. Further reading

1. Introduction

Before digitising anything, it is important to establish procedures and criteria for selecting the material to be digitised. The decisions you make will depend on the particular aims and resources of your project, but it is vital that you develop some clearly defined criteria to guide the selection process.

It's important to note that in some cases there will be two steps to your selection. You might first be selecting a collection to digitise from among a number of possible collections and then choosing individual items from within that collection. This paper concentrates on the selection of items within collections, but many of the issues and principles will be relevant whether you're choosing a collection to digitise or selecting individual items.

Why digitise at all?

Before you even begin to think about selection, you must first give some thought to why you're embarking on digitisation at all. It's all too easy to get caught up in the idea that "everyone is digitising, so we should be too". This is not, by itself, a good enough reason to undertake digitisation. Some better reasons are discussed in Deciding to Digitise. Rationales for digitisation will vary according to each collection and context. For some organisations the primary goal will be to widen access to a resource; for others, it might be to reunite a dispersed collection, or to reduce wear and tear on fragile originals by offering digital versions instead. Whatever your particular rationale, it's important that you're very clear about it, since it will influence the selection criteria you develop.

Why not just digitise everything?

Of course, one approach to selection is to not select – to simply digitise everything within a collection. While it would be unwise to take this as a default position, for some collections it will indeed be appropriate to adopt a comprehensive approach. This is particularly the case where the collection is small and valuable, or its integrity or usefulness will be compromised if it is not captured in its entirety. In a few situations, there may also be some cost or efficiency reasons for capturing everything. For example, when outsourcing the digitisation of video materials, it will increase costs to only capture some defined parts of a tape. In these circumstances it may be preferable to digitise everything and then undertake a de-selection once the digital copy comes back to you.

Even if you decide to undertake the comprehensive digitisation of a collection, it will still be important to prioritise the material – which will require you to develop some sort of criteria. One common approach in these circumstances is to prioritise according to user demand and then systematically fill in the gaps during the low-demand periods. Another is to begin by providing a representative sample of the collection and then progressively fill out the content.

For most digitisation programmes or projects selection will be a necessary and important part of the digitisation workflow. The purpose of this paper is to help you to identify factors you should take into account when developing your approach.

2. Understanding your collection, users' needs, and context

In order to develop an effective approach to selection, you must have a good understanding of the nature of your collection, your users and their needs, and the wider context within which your collection and its users exist. In addition to helping you make good selection decisions, these elements will greatly assist you in identifying some of the risks involved in digitisation and determining the benefits.

What are the important characteristics of your collection?

It's vital that you have a good understanding of the strengths and limitations of your collection, as these will influence which material you decide to include or exclude. In establishing this it will be important to consult the expertise of curators, collection managers and other staff who are involved with the collection on a day-to-day basis. Their detailed knowledge and advice will be of enormous help in understanding the material and its potential. Here are some specific things you may wish to investigate:

  • Size and scope of the collection
    How large is the collection? What does it cover in terms of material, time periods, or subject matter?
  • Uniqueness of the collection and its content
    Is this collection fairly common or does it contain rare or unique items? Is there any duplication within the collection or with other collections you are aware of?
  • Comprehensiveness of the collection
    Does your collection provide a complete or coherent representation of this type of material or subject matter? Or is its coverage patchy or eclectic?
  • Value of the collection
    How much is the collection worth? Not just in monetary terms, but also in ways that are harder to measure, such as its value in teaching or research. Will the collection be more valuable in a digital form? Would commercial exploitation of the digitised collection be possible?
  • Intellectual property rights in the collection
    What is the copyright status of the material within the collection? Is it out of copyright or will it require further investigation or the clearance of rights?
  • Physical condition: suitability for digitisation
    What condition is the material in? Is it safe and suitable for capturing using the technologies that are currently available and within your budget?
  • Physical condition: risk of deterioration
    Is the material in a stable condition or is it deteriorating? If it's deteriorating, then it might be important to digitise it now, even if this will hasten its decline and even if there are more 'intellectually worthy' resources to digitise.
  • Availability of metadata
    What information do you have about the items in your collection? Will this be sufficient for your own and your users' needs, or will you need to undertake some further research or effort to create metadata?

Who are your users and what do they need?

In addition to evaluating your collection, it is important that you undertake an assessment of your users and what they need. This might take a number of different forms (survey, focus group, analysis of previous usage), but it must establish who your potential users are, what parts of the collection they're likely to be most interested in, and how they might want to use your resources once digitised. You may find it helpful to make a distinction between core users who must be supported by your digital collection, and more peripheral users, who might be supported if there are sufficient resources. Here are some specific questions you may want to consider:

  • Who is likely to use the digital resource?
    Will it be the same groups who now use the collection, or will the digitised materials appeal to a wider audience? If the audience is small and/or the resource has a limited lifetime, you may need to consider how much investment the collection is worth.
  • Will digitisation actually help these current and potential users?
    Will it enable them to access and use the collection more easily? Will it facilitate new ways of understanding or using the collection? If there is no gain in terms of access, understanding, or the ability to use the collection then it may not be worth undertaking extensive digitisation.
  • What material would these users like to see prioritised for digitisation?
    Previous usage of the collection will provide some guidance on your users' interests, but they may not be aware of the full scope of your collection or you may discover new users with different interests. Try not to second-guess: there is no substitute for asking your users what they want.
  • Can you anticipate how the type of users and their usage might change over time?
    It's very difficult to predict such changes, but if you can anticipate future usage, then you may want your selection criteria to reflect this.
  • Will digitisation increase demand for the originals?
    Digitisation can be a means of reducing the need to access fragile originals, but it can have the opposite effect if you've only digitised a small selection or are offering low-quality surrogates. Could you cope with increased demand?
  • Will you have an adequate means of controlling access and use of the collection, once digitised?
    It can be more difficult to control what happens to materials once digitised. If control over your collection and its usage is important, then this may be an important factor to consider in your selection.

What about the wider context?

Your collection does not exist in a vacuum. There are wider things to consider that are likely to have some bearing on your selection, such as institutional priorities, trends in teaching or research, technological change, or the legal environment. Here are some things you might wish to consider:

  • Institutional priorities
    How does the digitisation of this collection fit in with the wider aims and objectives of your group, department or institution? Do your selection criteria need to fit within a broader collection development policy? Does the digitisation of this collection also need to serve some political purpose, such as demonstrating value or eliciting further funding?
  • Institutional resources
    What funds, skills and equipment do you have to undertake digitisation? This will be a key factor in determining the kind of material you choose to digitise and the amount you select.
  • External sources of funding
    If you're relying on external funding for your digitisation, then your choice of collections or items will need to fit in with the priorities of funding agencies or meet the requirements of particular funding calls.
  • Educational and research contexts
    Are there any developments or trends within teaching or research that should be informing your selection criteria? For example, are there changes to the curriculum or research priorities that might lead to increased or reduced demand for the collection or some of its content?
  • Legal context
    Your collection exists within a legal context and so you must take into account issues such as copyright and data protection. In addition to understanding the legal status of your collection, you should consider whether there are any changes afoot that might lead you to broaden or narrow your selection (e.g. a new licence or changes to copyright provisions).
  • Technological changes
    Can you identify or anticipate any changes in technology that might impact on your choice of items to digitise? For example, you might decide to delay digitising some items if you can anticipate a new technology that will enable you to carry out the work more quickly, more easily or at better quality.
  • External collections
    Are there similar or duplicate collections elsewhere that are already (or about to be) digitised? If some of your material is already available online, you may prefer to target the unique items. Or you may be able to identify some opportunities for collaborative selection and digitisation.

3. Some key issues to address

The previous section has identified several factors that are going to impact on your selection. Many of these are collection- or context-specific, such as subject coverage, potential for commercial exploitation, or your collection's synergy with research or teaching priorities. However, there are some generic factors that are likely to have significant impact on your selection. This section pulls out a few of the main ones, discussing them in a bit more detail. Note that the factors here are predominantly exclusion criteria – i.e. reasons for not selecting certain items.

Copyright

Copyright is one of the most important issues to address and will have a major influence on your selection and your project costs. Where your collection is fairly uniform you may be able to make a fairly quick assessment of the status of the material (e.g. if created by one person or constrained to a narrow period of time). Where the collection is more diverse, you may need to consider the copyright status of each individual item, which is likely to take some time – although you could always check a sample to determine the likely proportions of copyright material.
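The sampling idea mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name, the `is_in_copyright` check and the 95% margin (a simple normal approximation) are hypothetical choices, not part of any established procedure. In practice the check would be a person examining each sampled item, not a function call.

```python
import math
import random


def estimate_in_copyright_share(item_ids, is_in_copyright, sample_size, seed=0):
    """Estimate the share of in-copyright items from a simple random sample.

    item_ids        -- identifiers for every item in the collection
    is_in_copyright -- hypothetical check returning True or False for one item
    Returns (share, margin), where margin is an approximate 95% margin of
    error based on the normal approximation.
    """
    items = list(item_ids)
    if not items or sample_size < 1:
        raise ValueError("need at least one item and a positive sample size")
    rng = random.Random(seed)
    sample = rng.sample(items, min(sample_size, len(items)))
    hits = sum(1 for item_id in sample if is_in_copyright(item_id))
    share = hits / len(sample)
    margin = 1.96 * math.sqrt(share * (1 - share) / len(sample))
    return share, margin
```

A larger sample narrows the margin, but each sampled item still has to be assessed by hand, so the sample size is itself a trade-off between effort and confidence.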

Copyright clearance is likely to involve considerable time and money, so you will need to decide whether it's important or worthwhile undertaking such effort for the items in your collection. If you decide that it is, then copyright may be a factor in your prioritisation of material for digitisation. If you decide that you cannot afford to undertake clearance, then copyright will become an important exclusion factor within your selection criteria. Either way, an item's copyright status should be one of the first things you consider when undertaking selection.

Physical condition of the original material

The condition of the original material will also be an important factor in your selection. Some of the material might be too fragile to risk digitising, or it might require expensive conservation work before it can be safely captured.

Where material is fragile, but stable in its current form, then you will need to consider whether the risk of digitisation is worth the gain. Sometimes digitisation can be viewed as a way of supporting the preservation of fragile originals (by reducing wear and tear through use) – but this might be undermined if the digitisation process itself subjects the material to greater stress than normal handling. You also need to bear in mind that in bringing more attention to your collection, your digitisation project might actually increase demand for access to the originals.

Where your original material is actively deteriorating and cannot be stabilised (e.g. magnetic audio tapes with ‘sticky shed’ syndrome) then your decision may be made on a different basis. If a resource is going to disappear, whether captured or not, it could be quite reasonable to undertake digitisation even though you know this is likely to hasten its decline.

Duplication or near duplication

Digitisation is a costly activity, so you might decide that money and effort are better spent on unique material than on duplicates. You might seek to avoid duplication with another, existing digital collection, or you might de-duplicate within your own collection if it contains lots of copies of the same item. This is not always straightforward: you might need to compare two or more copies to see which is going to be more suitable to digitise. You may need to decide between capturing an original item and capturing a pre-existing surrogate. There may also be near-duplicates to consider.

Availability of metadata

You will also want to ensure there is adequate information about the items in your collection to enable retrieval and understanding of the resource. Where there is little or no original metadata (e.g. no known creator, unknown subject matter, unknown dates, etc.) you should ask yourself whether the item is worth digitising – or worth the expense of having experts provide you with a suitable identification. It is unlikely that there is going to be a clear-cut answer to this problem. One approach might be to specify a minimum amount of metadata required for material to be digitised, and then address separately any items that don't make the grade.
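One way to picture the "minimum metadata" approach is as a simple check against a list of required fields. The sketch below is hypothetical: the field names in `REQUIRED_FIELDS` and the dictionary-based item records are assumptions made for illustration, and each project would need to define its own minimum.

```python
# Hypothetical minimum: adjust to suit your own collection and users.
REQUIRED_FIELDS = ("title", "creator", "date")


def split_by_metadata(items, required=REQUIRED_FIELDS):
    """Separate items that meet the metadata minimum from those that do not.

    items -- a list of dictionaries, one per item
    Returns (ready, needs_review); needs_review pairs each item with the
    list of required fields that are missing or blank.
    """
    ready, needs_review = [], []
    for item in items:
        missing = [f for f in required if not str(item.get(f) or "").strip()]
        if missing:
            needs_review.append((item, missing))
        else:
            ready.append(item)
    return ready, needs_review
```

Items in the second list are not rejected outright; they are set aside so that a decision about further research or expert identification can be made separately.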

4. Getting practical

Having carefully considered your users, collection, the wider context and some specific problems such as copyright or duplication, you are now well placed to undertake the selection. However, rather than proceeding intuitively, it would be sensible to produce a document outlining the basis on which your material will be selected or excluded. Not only will this assist with the practicalities of undertaking the selection process, but it will also minimise any potential for "scope creep": allowing (or being pushed into) an enlargement of scope for your digitisation project.

Make sure you put it down in writing

Your selection criteria could be documented in a number of different ways, such as a set of questions to answer, a decision tree or a matrix. Selection criteria should be made explicit within your Project Specifications document. The diagram below offers a simplified decision tree from a photography digitisation project as an example.

Figure 1. Decision tree (simplified)

[Diagram of a simplified decision tree for selection criteria. A non-graphical text version of this diagram is also available.]
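For readers who prefer to see a decision tree written out, here is a minimal sketch in Python. It does not reproduce the diagram from the photography project; the questions, their order and the outcome labels are assumptions drawn from the exclusion factors discussed in section 3, and the `Item` fields are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Item:
    # All fields are hypothetical answers recorded during assessment.
    identifier: str
    rights_cleared: bool
    safe_to_handle: bool
    is_duplicate: bool
    has_minimum_metadata: bool


def decide(item):
    """Walk one item through a simple, illustrative decision tree."""
    if not item.rights_cleared:
        return "exclude: rights not cleared"
    if not item.safe_to_handle:
        return "refer: conservation assessment needed"
    if item.is_duplicate:
        return "exclude: duplicate"
    if not item.has_minimum_metadata:
        return "refer: metadata needed"
    return "select"
```

Writing the tree down in this form, or as a plain list of questions, makes it easier to check that everyone carrying out the selection applies the same tests in the same order.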

Take an orderly approach to your selection

The order of your questions and assessments is likely to differ according to collection and context. If you have existing tools (e.g. catalogues) that will enable you to rule material in or out very quickly, then it would be sensible to employ these first, before any criteria that require you to pull things out of boxes and examine them closely for their physical condition.

You will also need to think about the order in which you are going to undertake the selection. You might be doing this from a printed list or by systematically working through shelves or cabinets. Again, the answer will very much depend upon your type of collection and its context. For example, if your collection is clearly labelled and physically ordered by creation date or subject, then there may be some advantages to working systematically from the shelves. Where it is arranged in a less useful order (e.g. by acquisition date) you might prefer to begin with indexes and catalogues (if available).

Be aware that your criteria might need to change

It may be worthwhile undertaking a small pilot selection to help you identify any important criteria you have overlooked. This will also help you to see whether you are including or excluding too much content and where you may need to redraw the lines.

You may find that you're actually employing multiple selection criteria within a digitisation project, especially if it's divided into a number of phases. You might, for example, have a first phase which aims to capture a representative sample of the collection, followed by further phases that provide more depth. Each of these phases is likely to require unique selection criteria.

While written selection criteria will be important in helping you avoid diversion or scope creep, they should not be set in stone. As you become more familiar with the collection and with the possibilities and constraints of the digitisation process, you may find that you need to make some adjustments to your selection process. This is normal and likely to lead to revisions to your criteria.

Conclusion

If you digitise only a selection of items rather than a complete collection, then make sure you're explicit about your choices. It will be important to make your criteria clear to your users. They need to understand what is included within the digital collection and how this compares with the collection as a whole. This is particularly important where the exclusion of items due to copyright or fragility has led to a digital collection with quite a different balance to the physical collection.

5. Further reading
