Digitization Guidelines for External Ephemera Contributors
Welcome!
This document walks you through a general example of Princeton University Library’s ephemera digitization best practices for external contributors, which are used to prepare digital images for ingest into PUL’s digital repository (Figgy). If you have questions or concerns, do not hesitate to contact us and your PUL collaborator!
Digitization Specifications
Digitization of all Special Collections Rare Book and Manuscript materials ideally will occur in accordance with Princeton University Library’s best practices. Specifications for other paper-based material can be found on the webpage for Digitization for Special Collections on Paper and Film. These standards suggest best practices for digitizing paper-based material and film/glass originals that are slated for intensive visual study. Digitization standards for these types of material prioritize legibility, artistic creation, and historical significance, requiring deep zoom and resolution of fine detail. While material digitized according to these standards may be in good condition, it may be vulnerable due to inherent vice in the medium (such as brittle paper or deterioration of a film substrate). Conservation assessment is recommended prior to digitization, if possible. The result of the assessment may determine a need for treatment before digitization, after, or both.
Paper-based digitization standards for preservation
Paper-based items should be photographed or scanned at a minimum of 500 PPI and 7200 pixels on the long axis, generally not exceeding 9600 pixels. This allows for predicting long-term storage needs based on the number of items in a collection or project. There are exceptions, such as scrolls, where 7200-9600 pixels is appropriate for the short axis.
OR
We require 500 ppi 24-bit RGB color TIFF v.6.0 images (not multi-page). A JHOVE audit of the deliverable files should be sent with each batch. Imaging equipment must be calibrated to ensure proper image density and color balance that faithfully reproduces the original material. The Library's naming conventions and instructions for directory structures must be followed.
- JHOVE: use JHOVE to validate all files prior to delivery to PUL, and include the JHOVE results in each delivery for cross-referencing. JHOVE version 1.24 is currently available from the Open Preservation Foundation (https://jhove.openpreservation.org/).
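As a rough illustration of the validation step, the sketch below builds a JHOVE command line for one delivery directory. It assumes JHOVE is installed and available on the PATH as `jhove`, and that its TIFF module (`TIFF-hul`) and the `-m`, `-h`, and `-o` options behave as described in the JHOVE documentation. The function name `jhove_command` and the report file name are illustrative choices, not a PUL requirement; confirm the expected report format with your PUL collaborator.

```python
def jhove_command(tiff_dir, report_path, handler=None, jhove_bin="jhove"):
    """Build (but do not run) a JHOVE command for a directory of TIFF files.

    handler is optional: when omitted, JHOVE falls back to its default
    output handler. jhove_bin is an assumption about how JHOVE is installed.
    """
    command = [jhove_bin, "-m", "TIFF-hul"]  # validate with the TIFF module
    if handler is not None:
        command += ["-h", handler]  # choose an explicit output handler
    command += ["-o", str(report_path), str(tiff_dir)]  # report file, then input
    return command


if __name__ == "__main__":
    cmd = jhove_command("VAD6896-U-00001", "VAD6896-U-00001.jhove.txt")
    print(" ".join(cmd))
    # To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list keeps it safe to pass to `subprocess.run` without shell quoting concerns.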
Image Capture and deliverables
- All digitization: 500 PPI uncompressed TIFF v6.0, in 24-bit RGB color, with the Adobe RGB (1998) color profile embedded.
- Imaging equipment to be calibrated to ensure proper image density and color balance.
- A single image file of the calibration target may be included with each batch of material.
- Deliverable images in Figgy are Pyramidal TIFF images with JPEG compression.
File/Directory naming
All image files will use an 8.3 naming convention: eight numeric digits, sequential and padded with leading zeros, followed by a lowercase three-character file extension (e.g. 00000013.tif). This ensures consistent and relevant image order. Directory names and structure should reflect the collection described in the provided template. For archival materials described at the item level as Figgy Ephemera, the directory structure generally follows the Ephemera structure:
- Directory to be named for the Box/Folder number entry on the Ephemera Template. Example:
- VAD6896-U-00001
- Why is this important to me and my collection? We in IT depend on our collections managers and subject specialists (like you!) to help us connect your images to the correct metadata in our digital repository. If your unique identifier for the metadata matches the unique identifier for the item that was digitized, then we know how to match the digital images to your metadata!
- A file containing the JHOVE output will be named by the barcode/Project Unique Identifier entry followed by .jhove.txt. Example:
- VAD6896-U-00001.jhove.txt. This file should remain outside of the directories.
- Inside the directory, there are to be directories named after the barcodes/Project Unique Identifiers found in the Ephemera Template for the project. Inside each of these directories are the image files, with zero-padded integer sequence base names. Example:
- VAD6896-U-00001.jhove.txt
- VAD6896-U-00001/
  - RC_LLMC_000001/
    - 00000001.tif
    - 00000002.tif
    - etc.
  - RC_LLMC_000002/
    - 00000001.tif
    - 00000002.tif
    - etc.
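The naming and directory rules above can be checked before delivery with a short script. The following is a minimal sketch, not a PUL tool: the function names `check_item_directory` and `check_delivery` are hypothetical, and it assumes that every image is a `.tif` file and that each item's sequence starts at 00000001 with no gaps.

```python
import re
from pathlib import Path

# Eight digits plus a lowercase three-character extension, e.g. 00000013.tif
IMAGE_NAME = re.compile(r"^(\d{8})\.tif$")


def check_item_directory(item_dir):
    """Return a list of problems found in one item directory (empty if none)."""
    problems = []
    numbers = []
    for path in sorted(Path(item_dir).iterdir()):
        match = IMAGE_NAME.match(path.name)
        if path.is_file() and match:
            numbers.append(int(match.group(1)))
        else:
            problems.append(f"unexpected name: {path.name}")
    # Assumption: sequences start at 1 and have no gaps.
    if numbers != list(range(1, len(numbers) + 1)):
        problems.append(f"non-sequential numbering: {numbers}")
    return problems


def check_delivery(box_dir):
    """Check every item directory inside a Box/Folder directory."""
    report = {}
    for entry in sorted(Path(box_dir).iterdir()):
        if entry.is_dir():
            report[entry.name] = check_item_directory(entry)
        else:
            report[entry.name] = ["file found where an item directory was expected"]
    return report
```

Calling `check_delivery("VAD6896-U-00001")` returns a dictionary keyed by item directory name; an empty list means no problems were found for that item.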
Metadata needs for photography
The most efficient workflow for PUL is for descriptive metadata to be created prior to digitization. At minimum, there must be a unique identifier connecting digitized content to a metadata record.
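One way to confirm that every digitized item has a matching metadata record is to compare the identifiers in the template with the delivered directory names. The sketch below assumes the template has been exported as CSV with an identifier column; the column name `identifier`, the function name `unmatched_identifiers`, and the sample rows are all made up for illustration.

```python
import csv
import io


def unmatched_identifiers(template_csv, directory_names, id_column="identifier"):
    """Compare template identifiers with delivered directory names."""
    reader = csv.DictReader(io.StringIO(template_csv))
    template_ids = {row[id_column].strip() for row in reader}
    delivered = set(directory_names)
    return {
        # Described in the template, but no image directory was delivered
        "missing_images": sorted(template_ids - delivered),
        # Delivered, but no matching record in the template
        "missing_metadata": sorted(delivered - template_ids),
    }


if __name__ == "__main__":
    # Placeholder template content, not real project data
    template = "identifier,title\nRC_LLMC_000001,Example A\nRC_LLMC_000002,Example B\n"
    print(unmatched_identifiers(template, ["RC_LLMC_000001", "RC_LLMC_000003"]))
```

Both lists should be empty before a batch is sent; anything listed points to an identifier mismatch to resolve first.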
Figgy Collections/Ephemera Project assignment(s)
All images for this project will be added to the following Figgy/DPUL Collections:
- Please specify the preferred slug for each collection.
Specifications for Text-based images optimized for OCR
See the following (full Specification Listing may be referenced here)
Text-based (image specs optimized for OCR) | Archival Master File | Optical capture resolution | Bit depth | Embedded color/gray profile | Notes | Production Master File Format |
Printed computer documents | Uncompressed TIFF v.6 | 300-400 PPI | 8/24 | Adobe RGB (1998)/Gray Gamma 2.2 | Size of text should be considered when determining resolution | JPEG 2000 (see recipes https://github.com/pulibrary/figgy/blob/master/config/config.yml#L12); Tiled multi-resolution TIFF (Pyramidal TIFF) |
Typed documents | Uncompressed TIFF v.6 | 300-400 PPI | 8/24 | Adobe RGB (1998)/Gray Gamma 2.2 | Size of text should be considered when determining resolution | |
Printed publication matter | Uncompressed TIFF v.6 | 300-400 PPI | 8/24 | Adobe RGB (1998)/Gray Gamma 2.2 | Size of text should be considered when determining resolution | |
Printed matter on microform | Uncompressed TIFF v.6 | *3500 PPI | 8 | Gray Gamma 2.2 | *Accounts for magnification ratio | |
Specifications for Special Collections on paper
See the following (full Specification Listing may be referenced here)
Special collections on paper | Master File format | Optical capture resolution | Bit depth | Embedded color/gray profile |
Bound material | Uncompressed TIFF v.6 | 500 PPI minimum | 24 | Adobe RGB (1998) |
Unbound material | Uncompressed TIFF v.6 | 500 PPI minimum | 24 | Adobe RGB (1998) |
Posters | Uncompressed TIFF v.6 | 300 PPI minimum to yield 7200 px minimum on long axis | 24 | Adobe RGB (1998) |
Scrolls | Uncompressed TIFF v.6 | Minimum 7200 px on short axis | 24 | Adobe RGB (1998) |
Photographs | Uncompressed TIFF v.6 | Minimum 7200 px on long axis | 24 | Adobe RGB (1998) |
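The relationship between physical size, PPI, and pixel count in the table above is simple arithmetic: pixels on an axis equal inches on that axis multiplied by PPI. The sketch below works through it under one assumption, namely that an item must meet both a PPI floor and the 7200-pixel long-axis floor. The function names are illustrative only.

```python
import math

MIN_LONG_AXIS_PX = 7200  # minimum long-axis size quoted in the table above


def capture_ppi(long_side_inches, min_ppi):
    """Lowest whole-number PPI meeting both the PPI floor and the pixel floor."""
    ppi_for_pixels = math.ceil(MIN_LONG_AXIS_PX / long_side_inches)
    return max(min_ppi, ppi_for_pixels)


def long_axis_pixels(long_side_inches, ppi):
    """Pixel count along the long axis at the given resolution."""
    return round(long_side_inches * ppi)


if __name__ == "__main__":
    # (long side in inches, PPI floor) for three hypothetical items
    for inches, floor in [(24, 300), (18, 300), (17, 500)]:
        ppi = capture_ppi(inches, floor)
        print(inches, "in ->", ppi, "PPI,", long_axis_pixels(inches, ppi), "px")
```

For example, a poster 24 inches on its long side reaches 7200 pixels at exactly 300 PPI, while an 18-inch poster needs 400 PPI to reach the same pixel count.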
Qualitative Methods
Primary Stakeholders will perform 100% quality assurance on all TIFF image files after they have been ingested into Figgy. This quality assurance phase will include evaluation of the overall quality and integrity of each image file. Any files that do not meet overall image quality requirements will be noted, rescanned, and reinserted into the stream of existing image files. Rescans will typically be completed within [ TIME FRAME ], depending upon the number of files affected.
Image file errors
If Primary Stakeholders find errors in the ingested images prior to Final Acceptance, Stakeholders may require the Digital Studio or external digitization vendor to correct the errors.
Communications & contact information
PUL Ephemera & Controlled Vocabularies Questions
Slack channel
- Appropriate Subject Librarian
PUL Digitization Best Practices Questions and Repository (Figgy) Ingest
Roel Munoz, Library Digital Imaging Manager
- rmunoz@princeton.edu
- Slack: Roel
PUL Project Management, Figgy and DPUL Application Questions
Kim Leaman, Library IT Project Manager
- kleaman@princeton.edu
- Slack: Kim (Kelea)
- Digital Library Services Team (DLS)
- #digital_library Slack channel