Different Goals of Digitization

Reference and Research Images

At the low end of digitization of print materials is the simply derived reference image, which can be created using automated scanners or smartphones and is not intended for long-term preservation. These devices require no specialized skill in lighting, composition, or focusing because they determine settings automatically. The resulting image can be very useful, but it is not optimized for OCR or other high-level online research uses. Files created by these devices (PDF, JPEG, PNG) are not intended for further enhancement and are often low-resolution: ideal for reference, speedy transfer, and portability, but insufficient for quality reproduction.

At the high end of digitization of cultural heritage materials is the skillfully derived research image, created to documented preservation standards and informed by best-practice specifications that meet or exceed FADGI (Federal Agencies Digitization Guidelines Initiative) standards. This type of digitization requires the skilled use of high-resolution photographic equipment. The photographer uses lighting designed for cultural heritage imaging and must apply professional judgment to set exposure, illuminate, and compose each photograph properly. In addition, the camera, lighting, and display monitor must be calibrated regularly. Files created using this equipment are in lossless formats (RAW, TIFF), which allow for optimized enhancement, and are captured at the highest resolution the equipment supports, suitable for high-quality reproduction.

While both types of images may be ingested into the PUL repository, it is desirable to follow best-practice recommendations when creating research images. Staff will need to judge what quality is appropriate to the material being digitized and its intended use. Additional considerations are the availability of equipment, the timeframe, the quality of the source material, and storage costs. Some of these issues are already accounted for in the guidelines, such as using the lower-resolution "Optimized for OCR" standards for text-based material, as opposed to the "Special Collections on Paper and Film" standards.

File/Directory naming

All image files should use an 8.3 naming convention: eight numeric digits, sequential and padded with leading zeros, followed by a lowercase three-character file extension, e.g. 00000013.tif. This ensures a consistent and meaningful image order. Directory names and structure should reflect the collection.
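
As an illustration only, the naming convention above could be generated and checked with a short Python sketch. The function names are hypothetical and not part of any PUL tooling, and allowing digits in the extension (e.g. jp2) is an assumption rather than something the guidelines state.

```python
import re

# Eight digits, a dot, then a lowercase three-character extension.
# Allowing digits in the extension (e.g. "jp2") is an assumption.
NAME_PATTERN = re.compile(r"[0-9]{8}\.[a-z0-9]{3}")


def sequence_filename(index: int, extension: str = "tif") -> str:
    """Build a zero-padded, eight-digit file name for a sequence number."""
    if not 0 <= index <= 99_999_999:
        raise ValueError(f"index {index} does not fit in eight digits")
    name = f"{index:08d}.{extension.lower()}"
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"extension {extension!r} is not three characters")
    return name


def is_valid_name(name: str) -> bool:
    """Return True if the name follows the eight-digit, 8.3 convention."""
    return NAME_PATTERN.fullmatch(name) is not None
```

Calling sequence_filename(13) yields the example from the text, 00000013.tif. Because the names are zero-padded to a fixed width, a plain alphabetical sort of a directory listing matches the intended image order.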

...

Some materials, such as audiovisual materials or ephemera, may instead be named after a barcode or other unique identifier assigned to each physical asset. Such names may also include indicators for the side (for bilateral media) and derivative status, appended at the end. For example: 32101047381338_1_pm.wav (where 32101047381338 = barcode, 1 = side 1, pm = preservation master, and .wav = file extension).
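
The barcode-based pattern can be split into its parts with a sketch like the following. It assumes a purely numeric identifier, a numeric side indicator, and a lowercase alphabetic derivative code, with both indicators optional; these are illustrative assumptions, and the class and function names are hypothetical. Other identifier schemes would need a different pattern.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed layout: <barcode>[_<side>][_<derivative>].<extension>
ASSET_PATTERN = re.compile(
    r"(?P<barcode>[0-9]+)"
    r"(?:_(?P<side>[0-9]+))?"
    r"(?:_(?P<derivative>[a-z]+))?"
    r"\.(?P<extension>[a-z0-9]+)"
)


@dataclass
class AssetName:
    barcode: str
    side: Optional[int]
    derivative: Optional[str]
    extension: str


def parse_asset_name(filename: str) -> AssetName:
    """Split a barcode-based file name into its components."""
    match = ASSET_PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"unrecognized asset file name: {filename!r}")
    side = match.group("side")
    return AssetName(
        barcode=match.group("barcode"),
        side=int(side) if side is not None else None,
        derivative=match.group("derivative"),
        extension=match.group("extension"),
    )
```

For the example above, parse_asset_name("32101047381338_1_pm.wav") returns the barcode 32101047381338, side 1, derivative pm, and extension wav.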

Different Goals of Digitization

When digitizing to support access or to fulfill user requests, it is generally desirable to follow these standards and ingest the content into the digital repository, so that it can be reused and the materials do not have to be re-digitized in the future. However, staff will need to judge what quality is appropriate, based on available equipment, timeframe, quality of the source material, and storage costs. Some of these issues are already accounted for in the guidelines, such as using the lower-quality "Optimized for OCR" standards for mostly textual material, as opposed to the "Special Collections on Paper and Film" standards.

Metadata Standards

In general, descriptive metadata should be created prior to digitization. At minimum, there must be a unique identifier, such as a metadata management system ID or a Finding Aids component ID, connecting digitized content to a metadata record.

...