Princeton OCLC Data Sync Process

Summary

OCLC Data Sync is a process to synchronize institutional holdings with WorldCat to make local library collections visible and available through OCLC services by:

  • Adding original cataloging records to WorldCat
  • Matching records from your local catalog with records in WorldCat
  • Managing local holdings data
  • Setting or deleting holdings for single institutions or groups to accurately reflect what is in your collection
  • Updating your holdings in WorldCat with additional Local Bibliographic data

PUL began sending records to OCLC via DataSync on Nov 17, 2022; the process was automated on Aug 22, 2023.
Automatic Alma updates from OCLC were activated on Jan 8, 2024, and suspended indefinitely on Mar 4, 2024.
Since Jan 8, 2024, Alma’s OCLC numbers have been updated whenever OCLC Master Records are merged into other OCLC records.

Initial Reclamation Project

In August of 2022, the Data Sync group sent a large batch of records to OCLC to synchronize holdings. Criteria for inclusion were:

  • Bib record is unsuppressed.
  • Bib record has at least one unsuppressed holding.
  • Bib record passed extensive validation checks (see here).

Staff remediated many of the records that did not pass validation to make them ready for the automated DataSync process that was to be turned on after this initial reclamation.

If a record received an OCLC number from OCLC that differed from the OCLC number present in the bib record, the original OCLC number was moved to a 919 field (see documentation on the 919 field here).

Criteria For Sending Records

Every week, PUL extracts records (mostly newly cataloged titles) and submits them to OCLC via an Alma publishing job to sync PUL holdings with the WorldCat database. The criteria used to identify records eligible for Data Sync are:

  • Bib record has no 914 fields.
  • Bib record has no 915 fields.
  • Bib is unsuppressed.
  • Bib record is brief level 4 or above (see Brief Level documentation).
  • Bib record has at least one physical item attached to an unsuppressed holding that is not in the Acquisition Process or the Acquisitions and Cataloging Work Order.
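
The criteria above can be read as a single predicate. The following is a minimal Python sketch under stated assumptions: the `Bib`, `Holding`, and `Item` classes and the process-type labels are hypothetical stand-ins, not Alma's data model, and the last criterion is read as applying to the item rather than the holding.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Hypothetical labels for the two excluded process types; real Alma values may differ.
EXCLUDED_PROCESS_TYPES = {"Acquisition", "Acquisitions and Cataloging Work Order"}


@dataclass
class Item:
    process_type: Optional[str] = None  # None means the item is not in any process


@dataclass
class Holding:
    suppressed: bool = False
    items: List[Item] = field(default_factory=list)


@dataclass
class Bib:
    tags: Set[str]                      # MARC tags present in the bib record
    suppressed: bool = False
    brief_level: int = 4
    holdings: List[Holding] = field(default_factory=list)


def is_eligible_for_datasync(bib: Bib) -> bool:
    """Return True when the bib meets every criterion listed above."""
    if bib.tags & {"914", "915"}:       # already processed, or has open errors
        return False
    if bib.suppressed or bib.brief_level < 4:
        return False
    # At least one item on an unsuppressed holding, outside the excluded processes.
    return any(
        item.process_type not in EXCLUDED_PROCESS_TYPES
        for holding in bib.holdings
        if not holding.suppressed
        for item in holding.items
    )
```

For example, a bib with an open 915 field is skipped until staff delete that field, at which point the same predicate returns True on the next weekly run.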

OCLC Process and Reports

When OCLC receives the files, the records will go through OCLC’s own validation process. OCLC then generates a report that shows the action taken with each record (BibProcessingReport). If a record passes validation, OCLC takes one of the following actions:

  1. Create: A new OCLC record is created, and PUL’s holdings are set on that record.
  2. Match: PUL’s holdings are set on an existing OCLC record, but no metadata is transferred.
  3. Field Transfer: PUL’s holdings are set on an existing OCLC record, and one or more fields eligible for metadata transfer are updated in the OCLC record (see the fields eligible for transfer below).
  4. Replace: PUL’s holdings are set on an existing OCLC record that PUL originated, and the entire record is replaced with the new version of the record.

If a record fails validation or cannot be disambiguated from similar records, it is marked as Unresolved. The errors found in unresolved records are documented in a separate report (BibExceptionReport).

Errors are grouped by level of severity: MINOR, SEVERE, and CRITICAL.

MINOR errors include invalid repeated fields, fields with incorrect length, invalid subfield codes, and other MARC format issues. These errors do not prevent OCLC from processing the record and setting holdings; they are informational.

SEVERE and CRITICAL errors prevent OCLC from processing the incoming record and setting holdings. SEVERE errors include invalid indicators in key fields and repeated 245 or 010 fields. CRITICAL errors include invalid data in the leader and 008 field.
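
The severity rule above reduces to finding the worst error on a record and checking whether it is above MINOR. A minimal Python sketch follows; the function names are illustrative, and the input is assumed to be a plain list of severity strings rather than the actual BibExceptionReport format.

```python
# Severity levels in increasing order, as described above.
SEVERITY_ORDER = ["MINOR", "SEVERE", "CRITICAL"]


def worst_severity(severities):
    """Return the most severe level in the list, or None if there are no errors."""
    return max(severities, key=SEVERITY_ORDER.index, default=None)


def blocks_holdings(severities):
    """True when at least one error prevents OCLC from setting holdings."""
    return worst_severity(severities) in ("SEVERE", "CRITICAL")
```

A record with only MINOR errors still has its holdings set; a single SEVERE or CRITICAL error is enough to block it.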

Fields Eligible for Transfer

Tag  Field
050  Library of Congress Call Number (R)
055  Classification Numbers Assigned in Canada (R)
060  National Library of Medicine Call Number (R)
070  National Agricultural Library Call Number (R)
080  Universal Decimal Classification Number (R)
082  Dewey Decimal Classification Number (R)
083  Additional Dewey Decimal Classification Number (R)
084  Other Classification Number (R)
085  Synthesized Classification Number Components (R)
086  Government Document Classification Number (R)
090  Local Call Number (R)
092  Local Call Number (R)
505  Formatted Contents Note (R)
508  Creation/Production Credits Note (R)
511  Participant or Performer Note (R)
520  Summary, Etc. (R)
600  Subject Added Entry - Personal Name (R)
610  Subject Added Entry - Corporate Name (R)
611  Subject Added Entry - Meeting Name (R)
630  Subject Added Entry - Uniform Title (R)
648  Subject Added Entry - Chronological Term (R)
650  Subject Added Entry - Topical Term (R)
651  Subject Added Entry - Geographic Name (R)
655  Index Term - Genre/Form (R)
856  Electronic Location and Access (R)

Processing OCLC Reports

Records in the BibProcessingReport that passed validation with no errors above MINOR severity are updated in Alma through an external job to add a 914 field to each record. The 914 field contains the definitive OCLC number and the action taken.

Records in the BibExceptionReport are updated in Alma through an external job that adds one 915 field per error to each record. The 915 field contains the level of severity and the specific error found.

A data quality group will review and resolve the validation errors listed in the 915 field (mainly SEVERE or CRITICAL errors). Once resolved, staff will delete the 915 field. The resolved records will be picked up by the Data Sync publishing job when all 915 fields are deleted.


OCLC WorldCat Updates

In Jan 2024, Princeton University Library automated a process to receive updated records when records with PUL holdings in WorldCat were enhanced. Note that automated record updates were suspended indefinitely as of Mar 4, 2024. The following documents the settings and processes that were configured in WorldShare Collection Manager and Alma.


Triggers for updated records from WorldCat

Defined within OCLC WorldShare Collection Manager.
We will receive records from OCLC if any of the fields in the table below change. Continuing resources are excluded from the process.
Changes we make to records in Connexion under the Princeton symbol will be included in the files we receive from OCLC.

Tag             Field
Enc Lvl LDR/06  Only if updated to ^, I, 7
006             Fixed-Length Data Elements - Additional Material Characteristics
007             Physical Description Fixed Field
008             Fixed-Length Data Elements
019             Former OCLC number, will be translated to 035 $z
020             International Standard Book Number (R)
022             International Standard Serial Number (R)
024             Other Standard Identifier (R)
028             Publisher or Distributor Number (R)
035             System Control Number (R) - OCLC number only
050             Library of Congress Call Number (R)
066             Character Sets Present (NR) - Indicates presence of non-Latin script
086             Government Document Classification Number (R) - SuDoc
090             Local Call Numbers
1XX             Main Entries - General Information
245             Title Statement (NR)
246             Varying Form of Title (R)
247             Former Title (R)
26X             Publication, Distribution, etc. (Imprint) (R); Projected Publication Date (NR); Production, Publication, Distribution, Manufacture, and Copyright Notice (R)
300             Physical Description (R)
310             Current Publication Frequency (R)
34X             Physical Medium (R); Accessibility Content (R); Geospatial Reference Data (R); Planar Coordinate Data (R); Sound Characteristics (R); Moving Image Characteristics (R); Video Characteristics (R); Digital File Characteristics (R); Notated Music Characteristics (R)
362             Dates of Publication and/or Sequential Designation (R)
38X             Form of Work (R); Other Distinguishing Characteristics of Work or Expression (R); Medium of Performance (R); Numeric Designation of Musical Work (R); Key (R); Audience Characteristics (R); Creator/Contributor Characteristics (R); Representative Expression Characteristics (R); Time Period of Creation (R)
490             Series Statement (R)
500             General Note (R)
502             Dissertation Note (R)
505             Formatted Contents Note (R)
508             Creation/Production Credits Note (R)
511             Participant or Performer Note (R)
520             Summary, etc. (R)
525             Supplement Note (R)
538             System Details Note (R)
546             Language Note (R)
580             Linking Entry Complexity Note (R)
6XX             Subject Added Entries
700-754         Added Entry Fields - General Information; System Details Access to Computer Files (R)
760-788         Linking Entries - General Information; Parallel Description in Another Language of Cataloging (R)
800             Series Added Entry - Personal Name (R)
810             Series Added Entry - Corporate Name (R)
811             Series Added Entry - Meeting Name (R)
830             Series Added Entry - Uniform Title (R)
880             Alternate Graphic Representation (R)

Fields deleted from updated WorldCat records

Defined within OCLC WorldShare Collection Manager.
The following fields will be removed from updated records that we receive from OCLC.

Tag  Ind1  Ind2
902  any   any
910  any   any
917  any   any
936  any   any
938  any   any
949  any   any
984  any   any
989  any   any

Fields added to updated WorldCat records

Defined within OCLC WorldShare Collection Manager.
The following fields will be added to updated records that we receive from OCLC.

Tag  Subfield  Value
980  a         Invoice Date
980  b         List Price
980  e         Net Price
980  f         Invoice Number
980  k         Custom text1
980  l         Custom text2
916  a         Date Delivered
916  b         OCLC Number
916  c         Reason for Record Output
916  d         Local System Number
916  e         "WorldShare Record Update"
856  u         KB URL

Alma Indication Rule Actions

Defined within Alma.
The following records are filtered out from the DataSync process.
CGCRB stands for Cataloging Guidelines for Creating Chinese Rare Book Records In Machine-Readable Form.

Tag     Subfield  Value
040     e         cgcrb
LDR/07  N/A       s (filters out serials)
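
The two filter conditions above can be sketched as a small Python check. This is an illustration only, not the Alma indication rule itself: the record is assumed to be a leader string plus a list of `(tag, subfields)` pairs, where `subfields` maps a subfield code to a list of values.

```python
def is_filtered_out(leader, fields):
    """Return True when a record matches either filter condition above."""
    # LDR/07 is the eighth character of the leader (bibliographic level); "s" = serial.
    if len(leader) > 7 and leader[7] == "s":
        return True
    # 040 $e cgcrb marks records described under the Chinese rare book guidelines.
    return any(
        tag == "040" and "cgcrb" in subfields.get("e", [])
        for tag, subfields in fields
    )
```

For example, `is_filtered_out("00000nas a2200000 a 4500", [])` is True because position 07 of the leader is `s`.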

Alma Normalization Rule Actions

Defined within Alma.
When updated records are imported into Alma, normalization rule actions are performed first to refine metadata in the OCLC record. 

Tag  Condition                        Record Action
001                                   remove
003                                   remove
029                                   remove
035  pattern is not "035.a.(OCoLC)*"  remove
035  OCoLC number has leading zeroes  remove leading zeroes
2XX  has $5                           remove
3XX  has $5                           remove
5XX  has $5                           remove
6XX  has $5                           remove
7XX  has $5                           remove
856                                   remove
8XX  has $5                           remove
880  has $6 020 or 022                remove
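
The two 035 rules above (drop non-OCLC numbers, strip leading zeroes) can be illustrated in Python. This is a sketch of the intended effect, not the Alma normalization rule syntax; it assumes each 035 is represented by its `$a` value as a string and that the digits follow the `(OCoLC)` prefix directly.

```python
import re

# Matches an 035 $a that starts with the OCLC prefix.
OCLC_PREFIX = re.compile(r"^\(OCoLC\)(.*)$")


def normalize_035_values(values):
    """Keep only (OCoLC) numbers and remove their leading zeroes."""
    kept = []
    for value in values:
        match = OCLC_PREFIX.match(value)
        if match is None:
            continue  # rule: remove 035 fields that are not OCLC numbers
        number = match.group(1)
        # Strip leading zeroes, but do not reduce an all-zero value to nothing.
        kept.append("(OCoLC)" + (number.lstrip("0") or number))
    return kept
```

For example, `["(OCoLC)000123", "(NjP)456"]` becomes `["(OCoLC)123"]`.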

Alma Merge Rule Actions: Updates

Defined within Alma, relevant to "updates" files (records that have PUL holdings updated in WorldCat as defined in Triggers).
Merge rule actions are applied to the Alma record (how OCLC records merge with existing Alma records).
NOTE: Fields that are not explicitly defined here will not be affected. 

Tag      Condition                          Record Action
LDR                                         replace
006                                         replace
007                                         replace
008                                         replace
01X                                         replace
02X                                         replace
031                                         replace
032                                         replace
033                                         replace
034      if not already in Alma record      add
035                                         add
036                                         replace
037                                         add
038                                         replace
040                                         replace
041                                         replace
042                                         replace
043      if not already in Alma record      add
044                                         replace
045                                         replace
046                                         replace
047                                         replace
048                                         replace
050      if not already in Alma record      add
06X                                         replace
07X                                         replace
08X                                         replace
090                                         add
1XX                                         remove from Alma; add from OCLC
240                                         replace
245                                         replace
255      if not already in Alma record      add
246      if not already in Alma record      add
247                                         add
250                                         replace
251                                         replace
254                                         replace
256                                         replace
257                                         replace
258                                         replace
26X                                         remove from Alma; add from OCLC
27X                                         replace
300      if not already in Alma record      add
306                                         replace
307                                         replace
310                                         replace
34X                                         replace
35X                                         replace
362                                         replace
37X                                         replace
38X                                         replace
440                                         replace
490                                         replace
500      if not already in Alma record      add
502                                         replace
504      if not already in Alma record      add
505      if not already in Alma record      add
508                                         replace
511                                         replace
520      if not already in Alma record      add
525                                         replace
538                                         replace
580                                         replace
6XX      if no subfield "5" in Alma record  remove from Alma; add from OCLC
700-758  if no subfield "5" in Alma record  remove from Alma; add from OCLC
760-787  if no subfield "w" in Alma record  remove from Alma; add from OCLC
800      if no subfield "5" in Alma record  remove from Alma; add from OCLC
810      if no subfield "5" in Alma record  remove from Alma; add from OCLC
811      if no subfield "5" in Alma record  remove from Alma; add from OCLC
830      if no subfield "5" in Alma record  remove from Alma; add from OCLC
880      if exists in OCLC record           replace
916                                         add
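
To make the difference between the four record actions concrete, here is a minimal Python sketch. It is not Alma's merge-rule engine: records are assumed to be dicts mapping a tag to a list of field strings, the action names are hypothetical labels for the table's wording, and the exact semantics chosen here (for instance, that "replace" keeps the Alma field when OCLC supplies none, and that "if not already in Alma record" compares whole fields) are assumptions. The subfield "5" and "w" conditions are not modeled.

```python
def apply_rule(alma, oclc, tag, action):
    """Return a copy of the Alma record with one merge action applied to one tag."""
    existing = list(alma.get(tag, []))
    incoming = list(oclc.get(tag, []))

    if action == "replace":
        # Assumption: keep the Alma fields when the OCLC record has none for this tag.
        merged = incoming or existing
    elif action == "add":
        merged = existing + incoming
    elif action == "add_if_absent":
        # Assumption: "not already in Alma record" means no identical field exists.
        merged = existing + [f for f in incoming if f not in existing]
    elif action == "remove_then_add":
        # Alma fields are dropped unconditionally; only OCLC fields remain.
        merged = incoming
    else:
        raise ValueError(f"unknown action: {action}")

    result = dict(alma)
    if merged:
        result[tag] = merged
    else:
        result.pop(tag, None)
    return result
```

Under these assumptions, "replace" and "remove from Alma; add from OCLC" differ only when the OCLC record lacks the tag: the former leaves the Alma field in place, the latter removes it.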

Alma Merge Rule Actions: Merges

Defined within Alma, relevant to "merge" files (records that have PUL holdings merged in WorldCat).
Merge rule actions are applied to the Alma record.
NOTE: Fields that are not explicitly defined here will not be affected. 

Tag  Condition  Record Action
019             replace
035             add
916             add

OCLC Update Workflow Diagram

Process Run Times

Schedule: Tuesday 10:00 pm EST/EDT
Description: Eligible records are captured in Alma and sent to OCLC as the first step of the DataSync process. Done by a Publishing Profile that is pointed at a logical set.
Process Name: Records eligible for DataSync; Publishing Profile: Datasync export without Inventory

Schedule: Wednesday 5:00 am EST/EDT
Description: When OCLC Master Records are merged into other OCLC records, Alma’s OCLC numbers are updated.
Process Name: Import Profile: WorldShare Record Update: "merge" files

Schedule: Wednesday 7:00 am EST or 8:00 am EDT
Description: Processes DataSync Exceptions: downloads BibException reports from OCLC-sftp, creates a MARC collection with individual records for each MMS ID, and uploads the file with the MARC collection to lib-sftp, in preparation for further processing by Alma. Done by lib_job. Results in 915 fields.
Process Name: lib_job: Datasync Exceptions

Schedule: Wednesday 7:30 am EST or 8:30 am EDT
Description: Processes DataSync Updates: downloads BibProcessing reports from OCLC-sftp, creates a MARC record for each MMS ID, and uploads the file with the MARC records to lib-sftp, for ingest into Alma. Done by lib_job. Results in 914 fields.
Process Name: lib_job: DataSync Processed

Schedule: Thursday 3:00 am EST or 4:00 am EDT
Description: Triggered by the 'alma_bib_norm' lib_job: moves the OCLC number from the 914 field to the 035 field using the normalization process PUL-BIBNorm. Done by Repository Job (via API).
Process Name: Repository Job: Unprocessed Datasync 914 fields
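
The paired EST and EDT clock times in the schedule above are consistent with jobs that run at a fixed UTC time, so the local time shifts by an hour with daylight saving. That reading is an inference, not something this page states. The Python sketch below checks the arithmetic with fixed UTC offsets; the helper name is illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Eastern standard and daylight time.
EST = timezone(timedelta(hours=-5), "EST")
EDT = timezone(timedelta(hours=-4), "EDT")


def to_utc_clock(hour, minute, tz):
    """Convert a local clock time to an HH:MM string in UTC."""
    # The date is arbitrary; only the clock time matters with a fixed offset.
    local = datetime(2024, 1, 3, hour, minute, tzinfo=tz)
    return local.astimezone(timezone.utc).strftime("%H:%M")


# Each pair of scheduled times maps to the same UTC clock time.
exceptions_job = (to_utc_clock(7, 0, EST), to_utc_clock(8, 0, EDT))
processed_job = (to_utc_clock(7, 30, EST), to_utc_clock(8, 30, EDT))
normalization_job = (to_utc_clock(3, 0, EST), to_utc_clock(4, 0, EDT))
```

Under this reading, the three jobs run at 12:00, 12:30, and 08:00 UTC respectively, while the Tuesday 10:00 pm and Wednesday 5:00 am jobs follow local Eastern time year-round.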


See Also