General Guidelines and Information

Different Types of Digitization

Reference Versus Research Images

At one level of digitization of print materials is the quickly derived Reference image, which can be created using automated flatbed or orbital scanners. These devices require no specialized skill in lighting, composition, or focusing because they determine settings automatically. A Reference image can be very useful and is often made from text-based material slated for optical character recognition (see Digitization Optimized for OCR), but it is not optimized for deep zoom or detailed online research. Files created by these devices (PDF, JPEG) are not intended for further enhancement and are often low-resolution: ideal for reference, speedy transfer, and portability, but insufficient for quality reproduction.

At another level of digitization of cultural heritage materials is the skillfully derived Research image, created to documented preservation standards and informed by best-practice specifications that meet or exceed FADGI (Federal Agencies Digital Guidelines Initiative) standards. This type of digitization requires the skilled use of high-resolution photographic equipment (see PUL Imaging). The photographer will use lighting designed for cultural heritage imaging and must use professional judgment to set exposure, illuminate, and compose each photograph properly. In addition, the camera, lighting, and display monitor must be calibrated regularly. Files created using this equipment are in lossless formats (RAW, TIFF), which allow for optimized enhancement, and are captured at resolutions the equipment is capable of, suitable for high-quality reproduction.

Because both types of images may be ingested into the PUL repository, it is desirable to follow best-practice recommendations. Staff will need to consider what type of image is appropriate to the material being digitized and its intended use. Additional considerations are the availability and functionality of equipment, the timeframe, the quality of the source material, and storage costs.

File/Directory Naming

All image files should use an 8.3 naming convention: an eight-digit sequential number padded with leading zeros, followed by a period and a lowercase three-character file extension, e.g. 00000013.tif. This ensures a consistent and relevant image order. Directory names and structure should reflect the collection.
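As an illustration only, the file naming convention could be generated and checked with a short Python sketch like the one below. The function names are hypothetical, and two details are assumptions rather than stated rules: numbering starts at 1 (as in the examples in this section), and the extension may contain digits as well as lowercase letters.

```python
import re

# Eight digits, a period, and a three-character lowercase extension.
# Allowing digits in the extension is an assumption, not a stated rule.
NAME_PATTERN = re.compile(r"[0-9]{8}\.[a-z0-9]{3}")


def image_filename(sequence: int, extension: str = "tif") -> str:
    """Return a zero-padded 8.3 file name for a sequence number."""
    # Assumption: sequence numbers start at 1 and must fit in eight digits.
    if not 1 <= sequence <= 99_999_999:
        raise ValueError("sequence must be between 1 and 99999999")
    name = f"{sequence:08d}.{extension.lower()}"
    if NAME_PATTERN.fullmatch(name) is None:
        raise ValueError(f"extension {extension!r} is not three characters")
    return name


def is_valid_image_filename(name: str) -> bool:
    """Check whether a file name follows the 8.3 convention."""
    return NAME_PATTERN.fullmatch(name) is not None
```

For example, image_filename(13) returns 00000013.tif, while is_valid_image_filename rejects names such as 13.tif or 00000013.TIF.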

  • For material that has been described bibliographically, the directory containing the image files should be named with the metadata management system (MMS) ID, which is the 001 field in the catalog record. For example, MMS ID 6124186 and file name 00000001.tif give 6124186/00000001.tif. An intermediate directory is appropriate for multi-volume items. For example, volume 01 of the same item gives 6124186/01/00000001.tif. Directory names should not have punctuation or spaces.
  • For archival materials described in a finding aid, the directory structure generally follows the finding aid structure. The variation is that the collection code and component ID, separated by an underscore, are used in place of an MMS ID. For example, collection code C0744, component ID c002, and file name 00000001.tif should be organized as C0744/C0744_c002/00000001.tif.
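The two directory layouts above can be sketched in Python as follows. This is a minimal illustration, not an official tool: the function names are hypothetical, the .tif extension is fixed for brevity, and the two-digit volume directory is an assumption based on the "01" in the example.

```python
from pathlib import PurePosixPath
from typing import Optional


def _require_plain(part: str) -> str:
    """Reject empty parts and parts containing punctuation or spaces."""
    if not (part.isascii() and part.isalnum()):
        raise ValueError(f"{part!r} must contain only letters and digits")
    return part


def bibliographic_path(mms_id: str, sequence: int,
                       volume: Optional[int] = None) -> PurePosixPath:
    """Build MMS_ID/[volume/]NNNNNNNN.tif for cataloged material."""
    path = PurePosixPath(_require_plain(mms_id))
    if volume is not None:
        # Assumption: volume directories are zero-padded to two digits.
        path = path / f"{volume:02d}"
    return path / f"{sequence:08d}.tif"


def archival_path(collection_code: str, component_id: str,
                  sequence: int) -> PurePosixPath:
    """Build CODE/CODE_COMPONENT/NNNNNNNN.tif for finding aid material."""
    code = _require_plain(collection_code)
    component = _require_plain(component_id)
    return PurePosixPath(code) / f"{code}_{component}" / f"{sequence:08d}.tif"
```

With the values used in the examples, bibliographic_path("6124186", 1, volume=1) yields 6124186/01/00000001.tif and archival_path("C0744", "c002", 1) yields C0744/C0744_c002/00000001.tif.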

Some materials, such as audiovisual materials or ephemera, may be named after a barcode or other unique identifier assigned to each physical asset. The name may also include indicators for the side (for bilateral media) and derivative status, appended at the end. For example: 32101047381338_1_pm.wav (where 32101047381338 = barcode, 1 = side 1, pm = preservation master, and .wav = file extension).
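A name of this shape could be split into its parts with a sketch like the following. It assumes that both the side and the derivative indicators are present, as in the example above, and that the derivative code is lowercase letters; the AssetName type and parse_asset_name function are hypothetical names introduced for illustration.

```python
import re
from typing import NamedTuple


class AssetName(NamedTuple):
    barcode: str
    side: int
    derivative: str
    extension: str


# Assumed shape: barcode_side_derivative.extension, as in the example name.
ASSET_PATTERN = re.compile(
    r"(?P<barcode>[0-9]+)_(?P<side>[0-9]+)_(?P<derivative>[a-z]+)"
    r"\.(?P<extension>[a-z0-9]+)"
)


def parse_asset_name(name: str) -> AssetName:
    """Split a barcode-based file name into its labeled components."""
    match = ASSET_PATTERN.fullmatch(name)
    if match is None:
        raise ValueError(f"{name!r} does not match the assumed pattern")
    return AssetName(
        barcode=match["barcode"],
        side=int(match["side"]),
        derivative=match["derivative"],
        extension=match["extension"],
    )
```

Names that omit the side or derivative indicator would need a looser pattern than the one assumed here.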

Metadata Needs

The most efficient workflow for PUL is for descriptive metadata to be created prior to digitization. At minimum, there must be a unique identifier, such as a metadata management system ID or a Finding Aids component ID, connecting digitized content to a metadata record.

When to Outsource Digitization

The Digital Imaging Studio can digitize many types of materials and should be used for rare, valuable, and/or fragile material. DPSG issues a call for digitization projects three times a year, and a small number of items can be digitized by the studio outside of the proposal period at the discretion of the studio manager.

Several factors tend to make outsourcing digitization more practical:

  • AV digitization for any formats beyond what the Mendel Music Library can support
  • Mass paper digitization, or any digitization with a short timeline and a large volume
  • Dedicated external funding for digitization, where outsourcing may be more practical than hiring staff and purchasing equipment

Schedule for Review and Updating

This documentation will be reviewed at the end of each calendar year, and, when necessary, a person or working group will be designated to make updates.