Digitization Guidelines for External Ephemera Contributors

Welcome!

This document walks you through a general example of Princeton University Library's ephemera digitization best practices for external contributors, which are used to prepare digital images for ingest into PUL's digital repository (Figgy). If you have questions or concerns, do not hesitate to contact us and your PUL collaborator!


Digitization Specifications

Digitization of all Special Collections Rare Book and Manuscript materials ideally will occur in accordance with Princeton University Library's best practices. Specifications for other paper-based material can be found on the webpage for Digitization for Special Collections on Paper and Film. These standards suggest best practices for digitizing paper-based material and film/glass originals that are slated for intensive visual study. Digitization standards for these types of material prioritize legibility, artistic creation, and historical significance, requiring deep zoom and resolution of fine detail. While material digitized according to these standards may be in good condition, it may be vulnerable due to inherent vice in the medium (such as brittle paper or deterioration of a film substrate). Conservation assessment is recommended prior to digitization, if possible. The result of the assessment may determine a need for treatment before digitization, after, or both.

Paper-based digitization standards for preservation

Paper-based items should be photographed or scanned to a minimum of 7200 pixels on the long axis at a minimum of 500 PPI, and generally should not exceed 9600 pixels on the long axis. This allows for predicting long-term storage needs based on the number of items in a collection or project. There are exceptions, such as scrolls, where 7200-9600 pixels is appropriate for the short axis.
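To make the storage prediction concrete, the sketch below estimates the pixel-data size of an uncompressed 24-bit RGB image from its pixel dimensions. It assumes 3 bytes per pixel and ignores TIFF header, metadata, and embedded-profile overhead, so real files will be slightly larger. The function names and the sample 7200 x 5400 pixel image are illustrative only and are not part of any PUL tooling.

```python
BYTES_PER_PIXEL = 3  # 24-bit RGB: 8 bits for each of three channels


def uncompressed_bytes(width_px: int, height_px: int) -> int:
    """Approximate pixel-data size of one uncompressed 24-bit RGB image."""
    return width_px * height_px * BYTES_PER_PIXEL


def project_bytes(image_count: int, width_px: int, height_px: int) -> int:
    """Approximate storage for a batch of same-sized images."""
    return image_count * uncompressed_bytes(width_px, height_px)
```

For example, uncompressed_bytes(7200, 5400) gives 116,640,000 bytes (roughly 111 MiB) for a single image, so multiplying by the expected number of images gives a rough planning figure for a project.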

OR

We require 500 ppi 24-bit RGB color TIFF v.6.0 images (not multi-page). A JHOVE audit of the deliverable files should be sent with each batch. Imaging equipment must be calibrated to ensure proper image density and color balance that faithfully reproduces the original material. The Library's naming conventions and instructions for directory structures must be followed.

  • JHOVE: use JHOVE to validate all files prior to delivery to PUL, and include the JHOVE results in each delivery for cross-referencing. JHOVE version 1.24 is currently available from the Open Preservation Foundation (https://jhove.openpreservation.org/).
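As a rough illustration of the validation step, the sketch below builds a JHOVE command line for one TIFF file and runs it. It assumes the JHOVE launcher is installed and available on your PATH as "jhove", and that the TIFF module (TIFF-hul) and XML output handler are the ones you want; the function names and the report path are hypothetical, so check them against your own JHOVE installation and the report format agreed with your PUL collaborator.

```python
import subprocess
from pathlib import Path


def build_jhove_command(tiff_path: Path, report_path: Path,
                        jhove_executable: str = "jhove") -> list[str]:
    """Build a JHOVE command line that checks one file with the TIFF module."""
    return [
        jhove_executable,
        "-m", "TIFF-hul",        # TIFF module
        "-h", "XML",             # XML output handler
        "-o", str(report_path),  # write the report to this file
        str(tiff_path),
    ]


def run_jhove(tiff_path: Path, report_path: Path) -> None:
    """Run JHOVE on one file; the validation status is recorded in the report."""
    subprocess.run(build_jhove_command(tiff_path, report_path), check=False)
```

The command is built separately from the call that runs it so that it can be printed and reviewed before a whole batch is processed.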

Image Capture and deliverables

  • All digitization: 500 PPI uncompressed TIFF v.6.0, in 24-bit RGB color, with the Adobe RGB (1998) color profile embedded.
  • Imaging equipment is to be calibrated to ensure proper image density and color balance.
    • A single image file of the calibration target may be included with each batch of material.
  • Deliverable images in Figgy are Pyramidal TIFF images with JPEG compression.
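Before running a full JHOVE audit, a quick pre-flight check can catch obvious problems such as compressed or undersized files. The sketch below reads a few single-value tags from the first image directory of a classic TIFF file using only the Python standard library. It is a simplified illustration, not a replacement for JHOVE: it does not read resolution, bits per sample, or the embedded color profile, and the function names and the 7200-pixel default are assumptions drawn from the specifications in this document.

```python
import struct

# Baseline TIFF 6.0 tag numbers used below.
TAG_WIDTH, TAG_LENGTH, TAG_COMPRESSION, TAG_PHOTOMETRIC = 256, 257, 259, 262
SHORT, LONG = 3, 4  # TIFF field types: 16-bit and 32-bit unsigned integers


def read_basic_tags(data: bytes) -> dict:
    """Return the single-value SHORT/LONG tags from the first IFD of a TIFF."""
    if data[:2] == b"II":
        endian = "<"  # little-endian file
    elif data[:2] == b"MM":
        endian = ">"  # big-endian file
    else:
        raise ValueError("not a TIFF: bad byte-order mark")
    magic, ifd_offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF: magic number is not 42")
    (entry_count,) = struct.unpack_from(endian + "H", data, ifd_offset)
    tags = {}
    for i in range(entry_count):
        pos = ifd_offset + 2 + 12 * i  # each IFD entry is 12 bytes long
        tag, field_type, count = struct.unpack_from(endian + "HHI", data, pos)
        if count != 1:
            continue  # multi-value tags are skipped in this sketch
        if field_type == SHORT:
            (value,) = struct.unpack_from(endian + "H", data, pos + 8)
        elif field_type == LONG:
            (value,) = struct.unpack_from(endian + "I", data, pos + 8)
        else:
            continue
        tags[tag] = value
    return tags


def check_deliverable(tags: dict, min_long_axis: int = 7200) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if tags.get(TAG_COMPRESSION, 1) != 1:  # 1 means no compression
        problems.append("image data is compressed")
    if tags.get(TAG_PHOTOMETRIC) != 2:  # 2 means RGB
        problems.append("image is not RGB")
    long_axis = max(tags.get(TAG_WIDTH, 0), tags.get(TAG_LENGTH, 0))
    if long_axis < min_long_axis:
        problems.append(f"long axis is {long_axis} px, below {min_long_axis} px")
    return problems
```

To use it, read a file's bytes (for example with open(path, "rb")) and pass them to read_basic_tags, then pass the result to check_deliverable.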

File/Directory naming

All image files will use an 8.3 naming convention: eight numeric digits, sequential and padded with leading zeros, followed by a lowercase, three-character file extension (e.g. 00000013.tif). This ensures a consistent and relevant image order. Directory names and structure should reflect the collection described in the provided template. For archival materials described at the item level as Figgy Ephemera, the directory structure generally follows the Ephemera structure:

  • Directory to be named for the Box/Folder number entry on the Ephemera Template. Example:
    • VAD6896-U-00001
    • Why is this important to me and my collection? We in IT depend on our collections managers and subject specialists (like you!) to help us connect your images to the correct metadata in our digital repository. If your unique identifier for the metadata matches the unique identifier for the item that was digitized, then we know how to match the digital images to your metadata!
  • The file will contain the JHOVE output, and will be named by the barcode/Project Unique Identifier entry followed by .jhove.xml. Example:
    • VAD6896-U-00001.jhove.txt. This file should remain outside of the directories.
  • Inside the directory, there are to be directories named after the barcodes/Project Unique Identifiers found in the Ephemera Template for the project. Inside these directories are the image files, with zero-padded integer sequence base names. Example:
    • VAD6896-U-00001.jhove.txt
    • VAD6896-U-00001/
      • RC_LLMC_000001/
        • 00000001.tif
        • 00000002.tif
        • Etc
      • RC_LLMC_000002/
        • 00000001.tif
        • 00000002.tif
        • Etc
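The naming rules above can be checked automatically before delivery. The sketch below generates zero-padded file names and reports names that do not fit the pattern or that break the sequence. It assumes that each item directory starts at 00000001.tif and has no gaps, which is inferred from the example above rather than stated as a rule; the function names are hypothetical.

```python
import re
from pathlib import Path

# Eight digits followed by a lowercase three-character extension.
IMAGE_NAME = re.compile(r"[0-9]{8}\.tif")


def image_name(sequence: int) -> str:
    """Return the zero-padded file name for a position in the sequence."""
    return f"{sequence:08d}.tif"


def check_image_names(names: list[str]) -> list[str]:
    """Return a list of problems found in one item directory's file names."""
    problems = [f"unexpected file name: {name}" for name in names
                if not IMAGE_NAME.fullmatch(name)]
    valid = sorted(name for name in names if IMAGE_NAME.fullmatch(name))
    for position, name in enumerate(valid, start=1):
        expected = image_name(position)
        if name != expected:
            problems.append(f"expected {expected} but found {name}")
            break  # every later name would be reported as shifted too
    return problems


def check_item_directory(item_dir: Path) -> list[str]:
    """Check the files inside one barcode/Project Unique Identifier directory."""
    return check_image_names([p.name for p in item_dir.iterdir() if p.is_file()])
```

Calling check_item_directory on a directory such as VAD6896-U-00001/RC_LLMC_000001 returns an empty list when every file follows the convention.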

Metadata needs for photography

The most efficient workflow for PUL is for descriptive metadata to be created prior to digitization. At minimum, there must be a unique identifier connecting digitized content to a metadata record.
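One simple way to confirm that every digitized item has a matching metadata record is to compare the identifiers in your metadata with the names of your item directories. The sketch below does this with plain sets; how you load the identifiers (for example from your copy of the Ephemera Template) is left out, and the function and key names are hypothetical.

```python
def compare_identifiers(metadata_ids, directory_names):
    """Report identifiers that appear on only one side of the match."""
    metadata_set, directory_set = set(metadata_ids), set(directory_names)
    return {
        # Records with no digitized images delivered.
        "metadata_without_images": sorted(metadata_set - directory_set),
        # Image directories with no metadata record to attach to.
        "images_without_metadata": sorted(directory_set - metadata_set),
    }
```

Both lists should be empty before a batch is delivered; any identifier that appears in either list points to an item that cannot be matched during ingest.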

Figgy Collections/Ephemera Project assignment(s)

All images for this project will be added to the following Figgy/DPUL Collections:

  • Please specify the preferred slug for each collection.

Specifications for Text-based images optimized for OCR

See the following (full Specification Listing may be referenced here)


Text-based (image specs optimized for OCR)

  • Printed computer documents
    • Archival master file: Uncompressed TIFF v.6
    • Optical capture resolution: 300-400 PPI
    • Bit depth: 8/24
    • Embedded color/gray profile: Adobe RGB (1998)/Gray Gamma 2.2
    • Notes: Size of text should be considered when determining resolution
  • Typed documents
    • Archival master file: Uncompressed TIFF v.6
    • Optical capture resolution: 300-400 PPI
    • Bit depth: 8/24
    • Embedded color/gray profile: Adobe RGB (1998)/Gray Gamma 2.2
    • Notes: Size of text should be considered when determining resolution
  • Printed publication matter
    • Archival master file: Uncompressed TIFF v.6
    • Optical capture resolution: 300-400 PPI
    • Bit depth: 8/24
    • Embedded color/gray profile: Adobe RGB (1998)/Gray Gamma 2.2
    • Notes: Size of text should be considered when determining resolution
  • Printed matter on microform
    • Archival master file: Uncompressed TIFF v.6
    • Optical capture resolution: *3500 PPI
    • Bit depth: 8
    • Embedded color/gray profile: Gray Gamma 2.2
    • Notes: *Accounts for magnification ratio
  • Production master file format
    • JPEG 2000 (see recipes https://github.com/pulibrary/figgy/blob/master/config/config.yml#L12)
    • Tiled multi-resolution TIFF (Pyramidal TIFF)

Specifications for Special Collections on paper

See the following (full Specification Listing may be referenced here)

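Special collections on paper

  • Bound material
    • Master file format: Uncompressed TIFF v.6
    • Optical capture resolution: 500 PPI minimum
    • Bit depth: 24
    • Embedded color/gray profile: Adobe RGB (1998)
  • Unbound material
    • Master file format: Uncompressed TIFF v.6
    • Optical capture resolution: 500 PPI minimum
    • Bit depth: 24
    • Embedded color/gray profile: Adobe RGB (1998)
  • Posters
    • Master file format: Uncompressed TIFF v.6
    • Optical capture resolution: 300 PPI minimum to yield 7200 px minimum on long axis
    • Bit depth: 24
    • Embedded color/gray profile: Adobe RGB (1998)
  • Scrolls
    • Master file format: Uncompressed TIFF v.6
    • Optical capture resolution: Minimum 7200 px on short axis
    • Bit depth: 24
    • Embedded color/gray profile: Adobe RGB (1998)
  • Photographs
    • Master file format: Uncompressed TIFF v.6
    • Optical capture resolution: Minimum 7200 px on long axis
    • Bit depth: 24
    • Embedded color/gray profile: Adobe RGB (1998)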
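The poster specification pairs a minimum resolution with a minimum pixel count, so the capture resolution you need depends on the size of the original. The sketch below shows one reading of that rule: pixels on the long axis equal the long-axis length in inches multiplied by PPI, and the capture resolution is raised above 300 PPI whenever 300 PPI would yield fewer than 7200 pixels. The function name is hypothetical, and the interpretation should be confirmed with your PUL collaborator for unusual sizes.

```python
import math


def poster_capture_ppi(long_axis_inches: float, min_ppi: int = 300,
                       min_long_axis_px: int = 7200) -> int:
    """Return the lowest whole-number PPI that satisfies both minimums."""
    if long_axis_inches <= 0:
        raise ValueError("long axis must be a positive length in inches")
    # PPI needed to reach the minimum pixel count on the long axis.
    needed_for_pixels = math.ceil(min_long_axis_px / long_axis_inches)
    return max(min_ppi, needed_for_pixels)
```

Under this reading, a poster 24 inches long reaches 7200 pixels at exactly 300 PPI, an 18-inch poster needs 400 PPI, and a 36-inch poster can be captured at the 300 PPI minimum.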

Qualitative Methods

Primary Stakeholders will perform 100% quality assurance on all TIFF image files after they have been ingested into Figgy. This quality assurance phase will include the evaluation of the overall quality and integrity of each image file. Any files that do not meet overall image quality requirements will be noted, rescanned, and reinserted into the stream of existing image files. Rescans will typically be completed within [ TIME FRAME ], depending upon the number of files.

Image file errors

If Primary Stakeholders find errors in the ingested images prior to Final Acceptance, Stakeholders may require the Digital Studio or external digitization vendor to correct the errors.


Communications & contact information

  • PUL Ephemera & Controlled Vocabularies Questions

  • PUL Digitization Best Practices Questions and Repository (Figgy) Ingest

  • PUL Project Management, Figgy and DPUL Application Questions