File/Directory naming
All image files should use an 8.3 naming convention: an eight-digit sequential number, padded with leading zeros, followed by a period and a lowercase three-character file extension (e.g. 00000013.tif). This keeps the images in a consistent and meaningful order. Directory names and structure should reflect the collection.
- For material that has been described bibliographically, the directory containing the image files should be named with the bibliographic ID number (the 001 field in the catalog record). For example, bib ID 6124186 and file name 00000001.tif give 6124186/00000001.tif. An intermediate directory is appropriate for multi-volume items. For example: 6124186 (bib ID)/01 (volume number)/00000001.tif (file name), i.e. 6124186/01/00000001.tif. Directory names should not contain punctuation or spaces.
- For archival materials described in a finding aid, the directory structure generally follows the finding aid structure. The variation is that the collection code and component identification numbers are separated by an underscore and are used in place of a bib ID. For example: C0744 (collection code)/c002 (component ID)/00000001.tif (file names) should be organized as C0744/C0744_c002/00000001.tif.
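The naming rules above can be sketched in Python as follows. This is only an illustration, not an official tool: the function names are hypothetical, and the two-digit volume padding is an assumption drawn from the "01" in the example above rather than a stated rule.

```python
from pathlib import PurePosixPath
from typing import Optional


def image_filename(sequence: int, extension: str = "tif") -> str:
    """Build an 8.3-style name: eight zero-padded digits plus a lowercase extension."""
    if not 0 <= sequence <= 99_999_999:
        raise ValueError("sequence number must fit in eight digits")
    ext = extension.lower().lstrip(".")
    if len(ext) != 3:
        raise ValueError("extension must be three characters")
    return f"{sequence:08d}.{ext}"


def bib_path(bib_id: str, sequence: int, volume: Optional[int] = None) -> PurePosixPath:
    """Path for bibliographically described material: bib ID[/volume]/file name."""
    parts = [str(bib_id)]
    if volume is not None:
        # Assumption: volume directories are padded to two digits, as in "01".
        parts.append(f"{volume:02d}")
    parts.append(image_filename(sequence))
    return PurePosixPath(*parts)


def finding_aid_path(collection_code: str, component_id: str, sequence: int) -> PurePosixPath:
    """Path for archival material: collection code/collection code_component ID/file name."""
    component_dir = f"{collection_code}_{component_id}"
    return PurePosixPath(collection_code, component_dir, image_filename(sequence))


print(bib_path("6124186", 1))                   # 6124186/00000001.tif
print(bib_path("6124186", 1, volume=1))         # 6124186/01/00000001.tif
print(finding_aid_path("C0744", "c002", 1))     # C0744/C0744_c002/00000001.tif
```

Generating names programmatically, rather than typing them by hand, is one way to keep the zero padding and lowercase extensions consistent across a large batch.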
Some materials, such as audiovisual materials or ephemera, may be named after a barcode or other unique identifier assigned to each physical asset. The name may also include indicators for the side (for two-sided media) and the derivative status, appended at the end. For example: 32101047381338_1_pm.wav (where 32101047381338 = barcode, 1 = side 1, pm = preservation master, and .wav = file extension).
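A name of this form can be split back into its parts with a short parser. The sketch below is hypothetical: it assumes the identifier is alphanumeric, the side is a number, the derivative code is lowercase letters, and both the side and derivative parts are optional. None of these constraints are stated in the guidelines beyond the single example above.

```python
import re
from typing import NamedTuple, Optional


class AssetName(NamedTuple):
    identifier: str            # barcode or other unique identifier
    side: Optional[str]        # e.g. "1" for side 1; None if absent
    derivative: Optional[str]  # e.g. "pm" for preservation master; None if absent
    extension: str             # file extension without the dot


# Assumed pattern: identifier[_side][_derivative].ext
ASSET_NAME = re.compile(
    r"^(?P<identifier>[0-9A-Za-z]+)"
    r"(?:_(?P<side>\d+))?"
    r"(?:_(?P<derivative>[a-z]+))?"
    r"\.(?P<extension>[a-z0-9]{3})$"
)


def parse_asset_name(filename: str) -> AssetName:
    """Split an asset file name into identifier, side, derivative, and extension."""
    match = ASSET_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized asset file name: {filename!r}")
    return AssetName(**match.groupdict())


parsed = parse_asset_name("32101047381338_1_pm.wav")
print(parsed.identifier, parsed.side, parsed.derivative, parsed.extension)
```

A check like this could be run over a delivery from a vendor to flag files whose names do not follow the expected pattern before ingest.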
Different Goals of Digitization
When digitizing to support access or to fulfill user requests, it is generally desirable to follow these standards and ingest the content into the digital repository, so that it can be reused rather than re-digitized in the future. However, staff will need to judge what quality is appropriate based on available equipment, timeframe, quality of the source material, and storage costs. Some of these issues are already accounted for in the guidelines, such as using the lower-quality "Optimized for OCR" standards for mostly textual material, as opposed to the "Special Collections on Paper and Film" standards.
Metadata Standards
In general, descriptive metadata should be created prior to digitization. At minimum, there must be a unique identifier, such as a metadata management system ID or a Finding Aids component ID, connecting digitized content to a metadata record.
When to Outsource Digitization
The Digital Imaging Studio can digitize many types of materials, and should generally be used for rare, valuable, and/or fragile material. The Digital Projects Steering Group (DPSG) issues a call for digitization projects three times a year. The studio can also digitize a small number of items outside of the proposal period, at the discretion of the studio manager.
Several factors tend to make outsourcing digitization more practical:
- AV digitization for any formats beyond what the Mendel Music Library can support
- Mass paper digitization, or any digitization with a short timeline and a large volume
- Dedicated external funding for digitization, which may be more practical than hiring staff and purchasing equipment
Schedule for Review and Updating
The Digital Projects Steering Group (DPSG) will review this documentation at the end of each calendar year and, when necessary, designate a person or working group to make updates.