Types of Treatment
Status
Approved by DSSG on
Overview
Providing bibliographic access to library materials can take a number of different forms depending on the nature of the material, availability of staffing, and user needs. Each type of treatment has implications for discovery: where the data can be searched and how detailed the description is. The type of metadata treatment an object receives is often determined automatically by established procedures, but there are edge cases where a specific decision must be made. Each of the “types of metadata treatment” documents describes a specific method for providing bibliographic access, so that catalogers, public services, and selectors share a common understanding of the discovery implications when they discuss treatment for material falling outside the established procedures.
When there is an “edge” case or perceived special treatment need, the stakeholder should initiate a conversation with the appropriate metadata personnel. Anticipated use of the material, value within the overall collections, type of material, and staffing will be taken into account in negotiating a final decision. The “types of metadata treatment” documents should be referred to so stakeholders have a clear understanding of the discovery implications of treatment options.
Note that ephemera covers a wide range of types of material and could potentially be handled in accordance with the “Bib Record”, “Finding Aid/Archival Description”, or “Ephemera in Figgy” treatments. The default treatment for ephemera being digitized is “Ephemera in Figgy”. Other ephemeral material needs a specific decision.
The following members of the Types of Treatment Working Group, chartered by the Discovery Systems Steering Committee, developed these descriptions:
- Eliot Jordan
- Alexis Antracoli
- Joyce Bell
- Don Thornbury
- Regine Heberlein
- Rex Hatfield
- Armando Suarez
- Thomas Keenan
Bib Record
(Item level, not abbreviated)
Standards:
- Content standards used in PUL MARC records are ALA cataloging rules, AACR1, AACR2, RDA, DACS, and "none of the above."
- The encoding standard is MARC 21. Princeton has added a few local fields in the 9xx range. Legacy records may contain variable fields or fixed-field values that are no longer valid or no longer applied in new records.
Who does the work?
Work is done by trained staff at many levels, from librarians to student workers.
Range of data elements available?
A vast range of variable fields: our bibliographic tag table lists 258 variable fields, and our holdings tag table lists 45. Some fields, such as the 500 general note, are loosely defined.
What system is the data created and stored in?
Data is stored in Voyager; the underlying database is Oracle. Records are created both in Voyager and in OCLC (primarily CJK and HAPY records). Vendor records for approval and shelf-ready materials are batch-loaded, as have been other special-project records, such as Visuals and Parrish records within the last year or so.
Distinctive features of the treatment?
- For content, MARC is very flexible. It can be used for anything and everything.
- Holdings can be linked to or embedded in bibliographic records.
- There are parallel (linked) fields for non-roman script and the romanized equivalent.
- There are many "linking" fields to handle relationships such as parent-child (both logical and physical), earlier-later serial titles, and so forth. They include record IDs.
- URIs can be added for entities that match "authority records".
- N.B. All links described above are free-text, without validation. In fact, everything in the variable fields is free-text.
How does it fit into our discovery environment?
The catalog (Blacklight)
How does it fit into external discovery environments?
- OCLC, primarily, if records are exported or originally created there.
- The catalog is crawled by search engines.
What material is this standard practice for?
Anything and everything (primarily printed monographs and serials but including e-books, data files, manuscripts, maps, scores, theses, videos, ephemera, realia, and visual materials). We have MARC records representing all of the material categories in the WG’s list and more.
Examples:
Collection-level MARC
Collection level is one of 7 bibliographic levels defined in MARC (LDR/07).
The entity described consists of more than one bibliographic component (child). The collection-level record describes the aggregation as a whole. Components may or may not be described on their own. If they are described, there are several options:
In the collection-level record:
- Contents note (basic or enhanced)
- Authorized access point for the Work/Expression.
Separately:
- Individual MARC record.
- Non-MARC form (as an EAD component)
If the component description is in an individual MARC record, it is generally customary to provide a link to the parent record. Links can also be provided from parents to children, as mentioned above.
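The linking pattern can be pictured in a few lines. The following is a minimal, illustrative sketch in Python (plain dictionaries rather than a real MARC serialization); the record IDs, titles, and field layout are hypothetical. It shows a child record pointing to its parent through a 773 (Host Item Entry) field and the parent pointing back through a 774 (Constituent Unit Entry) field, with the linked record ID carried in subfield $w. As noted above, MARC itself treats these links as free text and performs no validation.

```python
# Illustrative sketch only: hypothetical record IDs and titles, plain dicts
# rather than a real MARC serialization.
parent = {
    "001": "1234567",                                    # hypothetical parent record ID
    "245": {"a": "Collection of pamphlets on printing."},
    "774": [{"t": "Pamphlet on type design", "w": "7654321"}],   # link down to the child
}

child = {
    "001": "7654321",                                    # hypothetical child record ID
    "245": {"a": "Pamphlet on type design."},
    "773": [{"t": "Collection of pamphlets on printing", "w": "1234567"}],  # link up to the parent
}

# MARC does not validate these links, but a loader or display layer could:
for link in child["773"]:
    assert link["w"] == parent["001"], "773 $w should match the parent's record ID"
```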
Standard practice for: Collections described in EAD; large pamphlet volumes or other such physical aggregations of independent publications. This option has been used for some Cyrillic materials received on approval, mostly pamphlets and smaller publications.
[Other frequently used non-item bibliographic levels are Serial and Integrating resource.]
Examples
Examples of component records
Abbreviated-level MARC
Abbreviated level is one of 10 encoding levels defined in MARC (LDR/17). This level means that record content does not meet minimal level cataloging specifications in terms of data elements or standard vocabulary.
Standard practice for: Not generally used by metadata professionals in new cataloging; may be applied to data imported from non-MARC sources. Princeton has used this option for materials in languages not handled by staff or materials deemed suitable for such treatment by selectors.
Example: 6457002
Electronic Resources
Most of our electronic resources are discoverable via cataloging acquired from vendors. The Serials and Electronic Resources Team acquires and loads data, activates targets/databases and creates new cataloging. Completeness of the cataloging varies depending on the source of the data.
Databases and other integrating resources
Most resources that fall in this category appear in PUL's Databases list (http://library.princeton.edu/research/databases) with a concise general description, usually provided by the publisher or written by the relevant selector. Also, a cataloger creates a MARC record for each resource so that it is discoverable via searches in the PUL catalog. The catalog records generally link to the corresponding entries on the Databases list, from which users can access the resources. In the interest of discoverability, individual components within larger databases sometimes have their own MARC records and/or Databases list entries. Databases that appear in the SFX (link resolver) and Summon (Articles+) KnowledgeBases are activated there as well.
Some databases and integrating resources are not listed on PUL’s Databases list but are cataloged in Voyager so they can be discovered through Blacklight. Data and Statistical Services maintains an online catalog of statistical datasets (https://dss.princeton.edu/), and these are also cataloged in Voyager, either through bulk loading of records (e.g., ICPSR) or via original cataloging.
E-books
Generally, each individual title has its own MARC record in Voyager, usually acquired from the resource provider. When possible, e-books are also activated in SFX and Summon to make discovery and linking possible from Articles+ for any indexed chapters/sections.
E-journals
Generally, all titles are activated in SFX and Summon with data about specific coverage. Titles are added to SFX if not found in the global KB. MARC records are loaded into the catalog based on the SFX activations: full records from OCLC when possible, brief records otherwise. Links on these records lead to a "Get It" menu presenting users with options for electronic access.
How does it fit into our discovery environment?
All e-resources appear in Blacklight. E-journals also appear in the A-Z e-journal list. The “Get It” service draws on e-resources activated in SFX, and the “Articles+” search interface operates on e-resources activated in Summon.
How does it fit into external discovery environments?
- E-resources do not generally appear in WorldCat.
- PUL does not submit MARC records for electronic subscriptions, or for external e-resources to which it purchases temporary or permanent access, to external aggregators such as WorldCat. However, our holdings are set in WorldCat for record sets received via OCLC WorldShare, and we also set holdings for permanent e-resource acquisitions. Statistical data held by PUL are generally reflected in WorldCat unless use is restricted.
What material is this standard practice for?
- Remote access electronic resources.
- External e-resources to which PUL purchases temporary or permanent access.
Examples:
- E-book collection record
- E-book individual title record
- E-journal records: 11105215, 10735576
- Database
Ephemera in Figgy
Ephemeral objects are described at the item level in Figgy. Data entry is through an online form with a fixed universe of data elements, many of which are constrained to a closed list of values from an established vocabulary (e.g., genre, subject). The system is designed to allow data entry by staff and students with language and subject knowledge who do not need any cataloging background. The main bibliographic elements are:
- Title(s)
- Language
- Genre
- Dimensions
- Series
- Creator/Contributor
- Publication information
- Subject
- Description (optional)
- Provenance
Several of these elements are optional (e.g., creator/contributor, publication information). The person entering the metadata for an item will ordinarily include in the record all information readily extractable from the item itself, but will not resort to inference or extensive research to supply metadata elements, and will leave optional fields blank where there is no readily available, authoritative information. The title, series, creator/contributor, publication information, description, and provenance fields are free text, but it is considered best practice to use authorized forms of names (e.g., LCNAF) wherever possible, with a view to future linked-data and other interoperability possibilities.
The optional fields include added fields in the title and creator/contributor categories. In addition to instances where an item has multiple alternate titles or multiple creators/contributors, these repeated fields are used for multi-script records, e.g., original-script and romanized forms of titles and names of creators/contributors in languages using non-Roman scripts.
Subjects in the ephemera system come from a locally created list of categories and sub-categories originally designed for the Latin American ephemera collection. This subject vocabulary periodically undergoes minor expansions and modifications to accommodate new additions to the ephemera collections, and its elements are mapped to LCSH. Geographic descriptors from the MARC list for countries (http://id.loc.gov/vocabulary/countries) are kept separate from the subject vocabulary so as to avoid creating multiple region-specific strings that duplicate each other thematically. There are two geographic descriptor categories: one for the scope of the materials and one for place of origin. Researchers can combine geographic and subject facets in the search interface to target material of interest.
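Taken together, the element set and vocabularies above amount to a record shape roughly like the sketch below. This is a minimal, illustrative sketch in Python; the key names and values are assumptions chosen for illustration and do not reproduce Figgy's actual internal schema or the local vocabulary terms.

```python
# Illustrative only: key names and values are hypothetical, not Figgy's schema.
ephemera_item = {
    "title": ["Sample poster title"],        # free text; repeatable for alternate/original-script titles
    "language": ["Spanish"],
    "genre": ["Posters"],                    # closed list from an established vocabulary
    "dimensions": "42 x 30 cm",
    "series": None,                          # optional; left blank when no ready information
    "creator_contributor": [],               # optional; authorized (e.g., LCNAF) forms preferred
    "publication_info": None,                # optional
    "subject": ["Women's movements"],        # local category/sub-category vocabulary, mapped to LCSH
    "geographic_coverage": ["Peru"],         # scope of the material (MARC country list)
    "geographic_origin": ["Peru"],           # place of origin (MARC country list)
    "description": None,                     # optional
    "provenance": "Sample provenance note",  # free text
}
```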
When print ephemera are prepared for imaging, they are organized into numbered boxes and folders and, where multiple items are contained within a single folder, item numbers are often penciled onto the items themselves. Boxes, folders, and items are all individually barcoded. Where applying barcodes directly to items is inadvisable from a preservation standpoint, item barcodes can be affixed to the inside of the relevant folder. Box, folder, and item barcodes are scanned by the digital studio in a way that creates a virtual box-folder-item structure mirroring the physical box-folder-item structure and imports images directly into their assigned place within that structure.
How does it fit into our discovery environment?
Completed ephemeral objects in Figgy appear in the digital library (DPUL). MARC records are not created for ephemera at the item level. No ephemera collections are yet active in DPUL. As they become active, a workflow will be created to add collection-level MARC records in Voyager so these collections (though not individual items) can be discovered via PUL’s OPAC.
How does it fit into external discovery environments?
Completed ephemeral objects can appear in Google search results, but they are not sent to WorldCat. As collections become public and MARC records are created, the collections will be represented in WorldCat.
What material is this standard practice for?
Ephemeral material which is being digitized.
Examples:
The only examples currently public appear in the custom Latin American Ephemera site: https://lae.princeton.edu. An in-production collection of over 1,000 late-Soviet posters is being processed as an ephemera collection in Figgy, and a collection of ephemera related to Euromaidan and the Ukraine Crisis was recently approved for digitization and Figgy processing by the Digital Projects Steering Group.
Finding Aid/Archival Description
Archival description is designed to describe collections of generally unpublished materials, such as organizational records, personal and family papers, and collected materials. The main principle of all archival description is respect des fonds, which combines the principles of provenance and received order. Provenance is the principle that records are arranged, described, and kept according to their origin, and that records from separate creators are kept separately. Received order is the principle that records are kept in the order established by the creator of the records. Order can be imposed in cases where material is received without any discernible organization. Intellectual order is distinct from physical order.
- Content standards: Describing Archives: A Content Standard (DACS), Adapted UC Guidelines for Born Digital Archival Description
- Encoding standards: EAD (Princeton uses EAD 2002, not EAD3), EAC-CPF
- Who does the work: Archival Description and Processing Team (ADAPT)
- Range of data elements: A vast range of fields. See the links above for the full list of DACS elements and EAD and EAC-CPF tags; EADiva is also a useful resource for information on EAD tags and how they are used.
- Data is created in both Archivists’ Toolkit and Oxygen, and stored in SVN (“Subversion”), a central version-control system that stores and makes retrievable versions of edited files.
- There are various levels of treatment (regardless of format) depending on the collection or sections within a collection: Collection-level; Series or Box-level; Sub-series-level; Folder-level; and, on very rare occasions, Item-level.
- For procedures regarding the arrangement and description of born-digital records, the Description and Access for Born-Digital Archival Collections (DABDAC) working group is currently drafting a document based on the UC Guidelines, which will be shared when ready.
How does it fit into our discovery environment?
- Finding aids are published on the PUL finding aids site, also known as PULFA.
- Some MARC records are published in the library catalog. An automated process is currently being developed that will create collection-level MARC records from the descriptive data in the finding aids.
- EAC-CPF records are published in PULFA and are available via the Names tab at the top of the page.
How does it fit into external discovery environments?
- Philadelphia Area Consortium of Special Collections Libraries (PACSCL)
- Google search
- Some records are found in WorldCat
What happens when the material is digitized?
- The current workflow involves generating a regular report of all collections that have had content added to Figgy; a script then adds the corresponding <dao> elements to the EAD files in a single pass (see the sketch after this list).
- Material is generally digitized at an aggregate and not at the item level and is viewable via the finding aids site.
- ADAPT created the following digitization policies and guidelines, which have been approved by the Digital Projects Steering Group and are being integrated into digital projects documentation.
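As a rough illustration of the <dao> step mentioned above, the sketch below shows one way such a script could insert a <dao> element into a component's <did> using Python's standard XML library. It assumes EAD 2002 element names and xlink attributes; the file path, component ID, and Figgy URL are hypothetical, and the real report-driven script is not reproduced here.

```python
# Minimal sketch, assuming EAD 2002 and xlink; paths, IDs, and URLs are hypothetical.
import xml.etree.ElementTree as ET

EAD_NS = "urn:isbn:1-931666-22-9"            # EAD 2002 namespace
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", EAD_NS)
ET.register_namespace("xlink", XLINK_NS)

def add_dao(ead_path, component_id, figgy_url):
    """Add a <dao> link to the <did> of the component whose @id matches."""
    tree = ET.parse(ead_path)
    for element in tree.iter():
        if element.get("id") == component_id:
            did = element.find(f"{{{EAD_NS}}}did")
            if did is not None:
                dao = ET.SubElement(did, f"{{{EAD_NS}}}dao")
                dao.set(f"{{{XLINK_NS}}}type", "simple")
                dao.set(f"{{{XLINK_NS}}}href", figgy_url)
    tree.write(ead_path, xml_declaration=True, encoding="utf-8")

# Hypothetical invocation:
# add_dao("C0001.EAD.xml", "c0001_c1", "https://figgy.example/concern/scanned_resources/...")
```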
What material is this standard practice for?
- Archival description is designed to describe collections, such as organizational records, personal and family papers, and collected materials.
- Archival materials may include a variety of types/formats such as audio and video recordings in a variety of formats, born-digital files, paper records, and photographs.
- Archival description does not separate materials based on format. It describes materials holistically according to the principle of respect des fonds.
Edge cases:
- Single item acquisitions, both bound and unbound.
- See Single Item Manuscript Acquisitions
Examples:
- Personal papers finding aid using DACS and EAD
- Personal papers collection level MARC record
- EAC-CPF record for an individual
- Organizational records in born digital format finding aid using DACS and EAD
- Collected materials finding aid using DACS and EAD
Geospatial Data
Standards
- Federal Geographic Data Committee (FGDC) Content Standard for Digital Geospatial Metadata (CSDGM)
Who does the work?
- Wangyal Shawa and his team in the Map and Geospatial Information Center. In that group, Dan Walker currently does most of the FGDC metadata creation with assistance from student workers.
- The metadata is created using an editor included with the ArcGIS for Desktop software package.
- FGDC metadata is exported from ArcGIS as an XML file and imported into the Figgy repository along with the dataset it describes. Relevant metadata is automatically extracted and saved in Figgy (see the sketch below).
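The extraction step can be pictured roughly as follows. This is a minimal sketch using Python's standard XML library and standard CSDGM element paths; the function name and the shape of the returned dictionary are assumptions, not the actual Figgy import code.

```python
# Minimal sketch: standard CSDGM paths, hypothetical function name and output shape.
import xml.etree.ElementTree as ET

def extract_fgdc(xml_path):
    root = ET.parse(xml_path).getroot()      # CSDGM files have a <metadata> root element

    def text(path):
        node = root.find(path)
        return node.text.strip() if node is not None and node.text else None

    return {
        "title": text("idinfo/citation/citeinfo/title"),
        "description": text("idinfo/descript/abstract"),
        # Bounding box edges in decimal degrees: west, east, south, north
        "bbox": [text(f"idinfo/spdom/bounding/{edge}")
                 for edge in ("westbc", "eastbc", "southbc", "northbc")],
    }
```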
How does it fit into our discovery environment?
- Records for geospatial data are indexed into Pulmap, our GeoBlacklight discovery portal.
- Some geospatial datasets have brief order records in Voyager. Efforts are currently underway to enhance these records with extra metadata and 856 field URIs pointing to the relevant records in Pulmap (see the sketch below). There is also a proposal, not yet implemented, to index geospatial records directly from Figgy into the Orangelight catalog.
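For illustration, an enhanced order record might carry an 856 field along the lines of the sketch below (a plain Python dict, not a real MARC serialization). The bib ID, URL, and indicator values are hypothetical; subfield $u carries the URI and $z a public note.

```python
# Illustrative only: hypothetical Voyager bib ID, Pulmap URL, and indicator values.
order_record = {
    "001": "9876543",
    "245": {"a": "Hypothetical geospatial dataset title."},
    "856": [{
        "ind1": "4",                                                  # HTTP
        "ind2": "0",                                                  # illustrative; local practice may differ
        "u": "https://maps.princeton.example/catalog/some-record-id", # hypothetical Pulmap record URL
        "z": "Download and view this dataset in Pulmap",
    }],
}
```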
How does it fit into external discovery environments?
- Pulmap records are shared and indexed into the GeoBlacklight instances of other institutions via an OpenGeoMetadata repository.
- For more information: https://doi.org/10.1080/19386389.2018.1443414
What material is this standard practice for?
- Vector-based geospatial data
  - Data representing geographic features as points, lines, and polygons. Includes census tracts, congressional districts, parcels, points of interest, stream networks, and road centerlines.
  - Formats include Shapefile, Geodatabase, KML, and GeoJSON.
  - See: https://gdal.org/ogr_formats.html
- Raster-based geospatial data
  - Pixel-based data including georeferenced scanned maps, elevation models, bathymetry, aerial imagery, satellite imagery, and land cover classification, among many others.
  - Formats include GeoTIFF, USGS DEM, MrSID, and ERDAS Imagine.
  - See: https://www.gdal.org/formats_list.html
Vector Data Examples:
- FGDC Document
- Figgy Metadata as JSON
- Record in Pulmap
Raster Data Examples:
- We currently do not have much raster data in Figgy, but we will have much more by the end of the year. This is an example from Cornell that we indexed into Pulmap.
- FGDC Document
- Record in Pulmap
- Georeferenced Scanned Map
- Land Cover
Edge cases:
Georeferenced maps often have a single Voyager bib id with a record in Figgy for the non-georeferenced TIFF as well as a record for the georeferenced GeoTIFF. The “Georeferenced Scanned Map” example above is one such case.
Maps
Encoding standard: MARC21
Content standards:
- These references are for cataloging maps, scanned or unscanned.
- Cartographic materials: a manual of interpretation for AACR2, 2002 revision / prepared by the Anglo-American Cataloguing Committee for Cartographic Materials; Elizabeth U. Mangan, editor. 2nd edition. Chicago: American Library Association, 2003.
- RDA, resource description & access and cartographic resources / Paige G. Andrew, Susan M. Moore, Mary Lynette Larsgaard. Chicago: ALA Editions, an imprint of the American Library Association, 2015.
- Maps and related cartographic materials: cataloging, classification, and bibliographic control / Paige G. Andrew, Mary Lynette Larsgaard, editors. Binghamton, N.Y.: Haworth Information Press, 1999.
- http://www.itsmarc.com/crs/mergedProjects/mapcat/mapcat/Contents.htm
- http://www.princeton.edu/~shawatw/classifi.html
- http://www.loc.gov/rr/geogmap/catteam.html
- DCRM(C)
Who does the work?
Cataloging is done in Voyager.
- Contemporary Maps: Berta Harvey in the Map and Geospatial Information Center
- Historic Maps: RBCat staff and SC staff
- Atlases: CaMS staff
After digitization, resources are created for each scanned map in Figgy. Metadata is then extracted from the MARC record and saved in the repository.
How does it fit into our discovery environment?
- Public domain scanned maps are viewable in Orangelight.
- Records for scanned maps are also indexed into Pulmap, our GeoBlacklight discovery portal.
How does it fit into external discovery environments?
- OCLC
- Like geospatial data, scanned map Pulmap records are shared and indexed into the GeoBlacklight instances of other institutions via an OpenGeoMetadata repository.
What material is this standard practice for?
- Contemporary Maps: Geologic, USGS, Army Map Service, etc.
- Historic Maps
- RBSC maps from 1918 and before. There are hundreds of exceptions to this cutoff date, however.
- Sanborn Fire Insurance Maps.
Edge cases:
- There are cases where one MARC record describes a set of maps or an atlas.
Examples:
- Historic Map (SC) - Single scanned map: Orangelight, OCLC, Pulmap, Figgy
- Historic Map (SC) - MapSet / Atlas: Orangelight, OCLC, Pulmap, Figgy
- Contemporary Map - Single scanned map: Orangelight, OCLC, Pulmap, Figgy
- Contemporary Map - MapSet / Atlas: Orangelight, OCLC, Pulmap, Figgy
Numismatics
The element set derives from one developed by the American Numismatic Society and is broadly similar to the Numismatic Description Standard (NUDS) schema. The basic descriptive units are “Issue” (roughly a FRBR Manifestation) and “Coin” (roughly a FRBR Item). Content standards do not exist, and data is not validated. The work is largely done by students under curatorial direction, though others contribute (including a professional specialist working on Persian coins). The system currently used is PrinNum, a local SQL database designed around 2004; efforts are underway to migrate it to Figgy. There is no provision for aggregate description: everything must be described at the “Coin” (item) level. Princeton holds about 115,000 coins and other numismatic objects such as medals, but only 13,000 of them are in the database. The PrinNum database needs a thorough reconceptualization, a rebuild, and extensive data cleanup.
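The two-level model can be sketched as follows. This is a minimal, illustrative sketch using Python dataclasses; the field names are assumptions chosen for illustration and are not the actual PrinNum columns or NUDS element names.

```python
# Illustrative only: field names are hypothetical, not PrinNum's actual schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Issue:
    """Roughly analogous to a FRBR Manifestation: attributes shared by a coin type."""
    issue_id: str
    denomination: Optional[str] = None
    mint: Optional[str] = None
    date_range: Optional[str] = None
    obverse_description: Optional[str] = None
    reverse_description: Optional[str] = None

@dataclass
class Coin:
    """Roughly analogous to a FRBR Item: one physical object in the collection."""
    coin_id: str
    issue: Issue                          # every object is described at this item level
    weight_grams: Optional[float] = None
    diameter_mm: Optional[float] = None
    provenance_note: Optional[str] = None
```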
How does it fit into our discovery environment?
Data is in Figgy and is discoverable in the catalog, where it is also available as a facet.
How does it fit into external discovery environments?
Some interest has been expressed in getting PrinNum records into OCRE (Online Coins of the Roman Empire) and CRRO (Coinage of the Roman Republic Online).
What material is this standard practice for?
Holdings of the Numismatic Collection in Special Collections
Edge cases:
- A few medals acquired by Graphic Arts were described in Visuals as works of art. The Visuals records are now in MARC. Example: See 10662441
- There are also MARC records derived from Visuals for Numismatics holdings. See 10641237
Examples:
- Coin
- Examples under edge cases above