Thoughts on metadata in OPC

Arms, Caroline caar at loc.gov
Sun Feb 3 18:03:23 CET 2013


Since I won't be in Copenhagen and don't know whether I will be able to call in at the appropriate time, here are some thoughts on the idea of adding a feature to OPC that would allow externally generated metadata to be embedded in an OPC package.

   Caroline

>>>>
Informal suggestion for a new feature for OPC (ISO/IEC 29500-2).

Introduce a documented mechanism whereby metadata parts can be embedded in a package with an expectation that consuming applications that do not recognize the syntax or semantics of the metadata will retain metadata parts by default when producing a new version of the package.

One particular scenario I have in mind is in association with submission of a package to an archive or repository that re-distributes content.  It is often useful to have rich metadata for a package embedded within the package rather than (or as well as) being submitted separately.  Given that OPC is designed as a general package format, it seems appropriate to have a recognizable mechanism for embedding such metadata files.  Otherwise an OPC package would need to be wrapped inside another package that can incorporate a separate metadata file.  Examples of rich metadata schemes that are standard within different scientific communities that aggregate resources are Federal Geographic Data Committee (FGDC) Geospatial Metadata and EML (Ecological Metadata Language).  These metadata schemes are fundamental to interoperability within those communities.  In other contexts, ONIX metadata, IPTC4XMP, an application profile of Dublin Core, or a locally specified metadata format might be usefully embedded.

Another possible use of such a mechanism in the same scenario is that metadata about the submission transaction could be embedded in a way that is independent of the submitted content.

Thoughts:
More than one such metadata file/part needs to be allowed.
Somewhat similar to Custom File Properties in Part 1.
Metadata in XML might be recommended but not required.
Probably need some way to name/identify/recognize the metadata format/schema.
May be useful to have a way to identify date of metadata part separately from package as a whole.  But maybe that should be left to the metadata format if it is difficult.
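
To make the thoughts above a little more concrete, here is a rough Python sketch of one way such parts might be embedded and later discovered. It is only an illustration under assumptions, not a proposed design: the relationship type URI, the part names under /metadata/, and the helper functions build_package and list_metadata_parts are all hypothetical, and nothing like them is defined in OPC today. The sketch assumes each metadata part is the target of a package-level relationship of one agreed type, and that its format is identified by the content type declared in [Content_Types].xml. It leaves aside the question of dating a metadata part.

```python
import io
import zipfile
from xml.etree import ElementTree as ET

CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
# Hypothetical relationship type, invented for this sketch only.
METADATA_REL = "http://example.org/opc/relationships/embedded-metadata"


def build_package(metadata_parts):
    """Build a minimal package in memory.

    metadata_parts maps a part name to a (content type, bytes) pair.
    """
    types = ET.Element(f"{{{CT_NS}}}Types")
    ET.SubElement(
        types, f"{{{CT_NS}}}Default", Extension="rels",
        ContentType="application/vnd.openxmlformats-package.relationships+xml")
    rels = ET.Element(f"{{{REL_NS}}}Relationships")
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as z:
        for i, (name, (ctype, data)) in enumerate(metadata_parts.items(), 1):
            # The content type is what identifies the metadata format.
            ET.SubElement(types, f"{{{CT_NS}}}Override",
                          PartName=name, ContentType=ctype)
            # One package-level relationship per metadata part.
            ET.SubElement(rels, f"{{{REL_NS}}}Relationship",
                          Id=f"rIdMeta{i}", Type=METADATA_REL, Target=name)
            z.writestr(name.lstrip("/"), data)
        z.writestr("[Content_Types].xml", ET.tostring(types, encoding="UTF-8"))
        z.writestr("_rels/.rels", ET.tostring(rels, encoding="UTF-8"))
    return buf.getvalue()


def list_metadata_parts(package_bytes):
    """Return {part name: content type} for the embedded metadata parts."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as z:
        types = ET.fromstring(z.read("[Content_Types].xml"))
        rels = ET.fromstring(z.read("_rels/.rels"))
    overrides = {o.get("PartName"): o.get("ContentType")
                 for o in types.findall(f"{{{CT_NS}}}Override")}
    return {r.get("Target"): overrides.get(r.get("Target"))
            for r in rels.findall(f"{{{REL_NS}}}Relationship")
            if r.get("Type") == METADATA_REL}


# Two metadata parts in one package, one XML and one not.
package = build_package({
    "/metadata/eml.xml": ("application/xml", b"<eml/>"),
    "/metadata/submission.json": ("application/json", b"{}"),
})
print(list_metadata_parts(package))
```

A consumer that does not understand either format could still find these parts through the relationship type and copy them unchanged when writing a new version of the package, which is the default behaviour the suggestion is after.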
>>>>

Caroline Arms
Library of Congress Contractor
Co-compiler of Sustainability of Digital Formats resource
http://www.digitalpreservation.gov/formats/

** Views expressed are personal and not necessarily those of the institution **


