FW: FYI -- more thoughts on metadata for OPC

John Haug johnhaug at exchange.microsoft.com
Fri Mar 30 22:47:17 CEST 2012


Forwarding for Caroline.

-----Original Message-----
From: Arms, Caroline [mailto:caar at loc.gov] 
Sent: Friday, March 23, 2012 3:30 PM
To: rex at RexJaeschke.com; John Haug
Subject: FYI -- more thoughts on metadata for OPC

John and Rex,

As I had said I would, I sent Murata-san contact information for a Japanese guy active in the Dublin Core community.  Murata-san already knows him.  In replying to Murata-san, I tried to articulate some of my thoughts on metadata in relation to OPC (certainly not comprehensive or final).  Given that I will be away for several months,  I thought I would send you these thoughts to keep in a back pocket and bring out if appropriate.  

I hate to clutter the list, but if you think it best for me to send these thoughts to the whole list, I can do that, framing them as a continuation of the discussion on the phone-call.

     Caroline

Caroline Arms
Library of Congress Contractor
Co-compiler of Sustainability of Digital Formats resource http://www.digitalpreservation.gov/formats/

** Views expressed are personal and not necessarily those of the institution **
________________________________________
From: Arms, Caroline
Sent: Friday, March 23, 2012 6:17 PM
To: MURATA Makoto
Subject: RE: FW: Call for volunteers to organise the session of DCMI Localization & Internationalization at DC2012, Malaysia, Sept 3-7, 2012

Murata-san,

Great.  If you see Prof. Sugimoto again (or Shigeo, as he went by when he was in the States), please give him my regards -- and from my husband Bill too.

I agree that a generic solution to multilingual metadata would be a useful direction.  Whether OPC is the right vehicle for moving in that direction is a different question.  I don't have a strong view on that.

As an example where your problem is addressed:  http://www.loc.gov/standards/mods/mods-changes-3-4.html   see the section on "Link for transliterations and translations of the same"

To add to your list of schemas  that do not do what you want:
  http://dtd.nlm.nih.gov/   I did notice a trans-title element here.
  http://dtd.nlm.nih.gov/book/tag-library/       trans-...   elements here too, but not for <name>.  Both these tag libraries intended for published content are aimed at publications in English -- not surprising since NLM is looking to influence what it gets via mandatory copyright deposit -- and the set of medical publications in Japanese published in the US may be empty.
BTW, both of these schemas do have an attribute that identifies the correct sort and display orders for the names.

In relation to another question you asked, I checked on how bi-directionality is handled in MARC.  The expectation is that within any MARC subfield, character data is entered in the order intended for display given the language/script in use.  I find that stated explicitly.  Display interfaces (i.e. library catalogs) seem to be expected to order subfields within fields as appropriate for the audience and/or language.

My interest in having a recognized place to embed a chunk of metadata in an OPC package is a generic solution to a different problem.   What constitutes "good" metadata is very context-dependent.  Many communities have developed their own schemas and practices.  Transforming from one schema to another is always fraught with problems.  For example, in some schemes, a name is split into first/last or family/given, with forms of address kept separate; in others, it is just a string.  Geospatial data is sometimes recorded in spreadsheets and there is a relevant ISO standard: ISO 19115.  In some workflows, it could be useful to record important metadata about the time/place/equipment/accuracy of recorded observations, not to mention the necessary coordinate reference scheme for any coordinates.  It makes no sense to think of expanding OPC metadata to cover everything in ISO 19115.  DDI [http://www.ddialliance.org/] is another such rich domain-specific metadata scheme.  If there was a recognized place in the package to look for metadata chunks in other schemes, that could be useful in many workflows.
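As a minimal sketch of what such a "recognized place" could look like, the Python below builds a tiny OPC-style ZIP package in memory and then locates an embedded metadata chunk through the package-level relationships part. The relationship type URI, the part name metadata/geo.xml, and the placeholder payload are all hypothetical, invented for illustration; OPC does not define any of them.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"
CT_NS = "http://schemas.openxmlformats.org/package/2006/content-types"
# Hypothetical relationship type: not defined by OPC, used only for this sketch.
EXTRA_METADATA_TYPE = "http://example.org/opc/relationships/extra-metadata"


def build_package():
    """Return the bytes of a minimal package holding one foreign metadata part."""
    content_types = (
        f'<Types xmlns="{CT_NS}">'
        '<Default Extension="rels" ContentType='
        '"application/vnd.openxmlformats-package.relationships+xml"/>'
        '<Default Extension="xml" ContentType="application/xml"/>'
        "</Types>"
    )
    rels = (
        f'<Relationships xmlns="{RELS_NS}">'
        f'<Relationship Id="rId1" Type="{EXTRA_METADATA_TYPE}" '
        'Target="metadata/geo.xml"/>'
        "</Relationships>"
    )
    # Placeholder payload; a real workflow would store e.g. an ISO 19115 record.
    payload = "<placeholder>ISO 19115 record would go here</placeholder>"
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", content_types)
        zf.writestr("_rels/.rels", rels)
        zf.writestr("metadata/geo.xml", payload)
    return buf.getvalue()


def find_metadata_parts(package_bytes, rel_type=EXTRA_METADATA_TYPE):
    """Map each part targeted by a relationship of rel_type to its text."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        root = ET.fromstring(zf.read("_rels/.rels"))
        targets = [
            rel.get("Target")
            for rel in root.iter(f"{{{RELS_NS}}}Relationship")
            if rel.get("Type") == rel_type
        ]
        return {t: zf.read(t.lstrip("/")).decode("utf-8") for t in targets}


parts = find_metadata_parts(build_package())
```

A consumer that does not know the relationship type simply ignores the part, which is the attraction of looking metadata up by relationship rather than by a fixed file name.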

This does not mean that I don't think there is room for improvement in the "basic" metadata that goes with any OPC package.  Any changes probably need to be marketplace relevant -- and I don't claim to be an expert on the overall marketplace for "office documents" or OPC.  What I do know is that no one enters metadata at all unless it is easy and they have a reason to do it!!   I think metadata for OPC is an issue that will warrant quite a bit of discussion.

Have a good time while I'm away.  Assuming I get a new contract with the Library of Congress, I hope to be back as part of the team.  But the budget situation is uncertain and how management will react to the uncertainty is an open question!

   Caroline

Caroline Arms
Library of Congress Contractor
Co-compiler of Sustainability of Digital Formats resource http://www.digitalpreservation.gov/formats/

** Views expressed are personal and not necessarily those of the institution **
________________________________________
From: eb2mmrt at gmail.com [eb2mmrt at gmail.com] On Behalf Of MURATA Makoto [eb2m-mrt at asahi-net.or.jp]
Sent: Friday, March 23, 2012 12:43 AM
To: Arms, Caroline
Cc: MURATA Makoto (FAMILY Given)
Subject: Re: FW: Call for volunteers to organise the session of DCMI Localization & Internationalization at DC2012, Malaysia, Sept 3-7, 2012

Caroline,

Thank you.  Last year, I attended a meeting of a government metadata project led by Prof. Sugimoto.  I insisted that we need a generic solution to the Japanese phonetics issue.  Prof. Miyazawa was also there.  Consensus in this group was that we need a multi-lingual solution to this issue and Japanese phonetics is just a special case.  In other words, we need multiple elements for a single author, with each element in a different language or script.  So far so good.  But OPC, Atom, ODF, and ONIX do not allow such multiple elements for a single author.

Incidentally, PRISM people (metadata for magazine articles) also contacted me yesterday.  It appears that they do allow multiple dc:title elements (with @xml:lang) for a single article.  But they are now inclined to introduce an additional attribute.  I am trying to persuade them.
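To make the "multiple elements distinguished by @xml:lang" pattern concrete, here is a minimal Python sketch that reads several dc:title elements for one article and indexes them by language tag. The wrapper element name, the sample titles, and the helper titles_by_language are assumptions made for illustration; they are not taken from the PRISM specification.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical record: the <article> wrapper and the titles are illustrative.
SAMPLE = """<article xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title xml:lang="ja">源氏物語</dc:title>
  <dc:title xml:lang="ja-Hira">げんじものがたり</dc:title>
  <dc:title xml:lang="en">The Tale of Genji</dc:title>
</article>"""


def titles_by_language(xml_text):
    """Return a dict mapping each xml:lang value to the title text."""
    root = ET.fromstring(xml_text)
    return {el.get(XML_LANG): el.text for el in root.iter(f"{{{DC_NS}}}title")}


titles = titles_by_language(SAMPLE)
```

In this sketch the phonetic (hiragana) reading is just another title whose language tag carries a script subtag, so no additional attribute is needed to tell it apart from the kanji form.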

Regards,
Makoto

2012/3/23 Arms, Caroline <caar at loc.gov>:
> Murata-san,
>
> Here is an e-mail that serves two purposes. It gives you the email address for Shigeo Sugimoto and tells you about a proposed meeting of a DC interest group that might be relevant (although I suspect not).  Given that this is a public announcement, I don't feel a reason to make a formal introduction.
>
>     Caroline
>
> Caroline Arms
> Library of Congress Contractor
> Co-compiler of Sustainability of Digital Formats resource 
> http://www.digitalpreservation.gov/formats/
>
> ** Views expressed are personal and not necessarily those of the institution **
> ________________________________________
> From: General DCMI discussion list [DC-GENERAL at JISCMAIL.AC.UK] On 
> Behalf Of Shigeo Sugimoto [sugimoto at SLIS.TSUKUBA.AC.JP]
> Sent: Wednesday, March 21, 2012 6:18 PM
> To: DC-GENERAL at JISCMAIL.AC.UK
> Subject: Fwd: Call for volunteers to organise the session of DCMI 
> Localization & Internationalization at DC2012, Malaysia, Sept 3-7, 
> 2012
>
> Dear Subscribers,
>
> I'm forwarding a Call for Volunteers from the DC-International list.
> Localization and Internationalization issues such as multi-lingual 
> vocabulary and translation of DC standards have been and still are an 
> important issue for our community.
> Please feel free to contact the moderators if you are planning to 
> attend DC-2012 in Kuching, Malaysia and/or are interested in L&I 
> activities.
>
> Best regards,
>
> Shigeo Sugimoto
>
> -------- Original Message --------
> Subject:        Call for volunteers to organise the session of DCMI
> Localization & Internationalization at DC2012, Malaysia, Sept 3-7, 2012
> Date:   Tue, 20 Mar 2012 01:40:31 +0000
> From:   Karen Rollitt <Karen.Rollitt at dia.govt.nz>
> To:     'DC-INTERNATIONAL at JISCMAIL.AC.UK'
> <DC-INTERNATIONAL at JISCMAIL.AC.UK>, 'Shigeo Sugimoto'
> <sugimoto at slis.tsukuba.ac.jp>
>
>
>
> Dear Subscribers,
>
> This is a call for a volunteer or volunteers to organise the session 
> of the DCMI Localization & Internationalization Community at the next 
> Dublin Core Conference, DC2012, to be held in Kuching, Sarawak, 
> Malaysia, September 3-7, 2012. The session typically lasts one and one 
> half hours and would include setting the topic and agenda and 
> facilitating the meeting.
>
> Topics for the Localization & Internationalization session will be in 
> scope for the DCMI Localization & Internationalization Community, such 
> as multilingual topics in linked data and vocabulary issues. See the 
> DCMI Localization and Internationalization Wiki [1] and also the DCMI 
> webpage for the DCMI Localization & Internationalization Community [2] 
> for more topic ideas for the meeting.
>
> If you are interested in organising this meeting please contact the 
> moderators of the DCMI Localization & Internationalization Community 
> by 25 March 2012.
>
> Kind regards
>
> Karen Rollitt karen.rollitt at dia.govt.nz
>
> Shigeo Sugimoto sugimoto at slis.tsukuba.ac.jp
>
> Moderators
>
> DCMI Localization & Internationalization Community
>
> [1] http://wiki.dublincore.org/index.php/DCMI_Localization_and_Internationalization_Wiki
>
> [2] http://dublincore.org/groups/languages
>
> ====
> CAUTION: This email message and any attachments contain information 
> that may be confidential and may be LEGALLY PRIVILEGED. If you are not 
> the intended recipient, any use, disclosure or copying of this message 
> or attachments is strictly prohibited. If you have received this email 
> message in error please notify us immediately and erase all copies of 
> the message and attachments. Thank you.
> ====
>
> =========================================================
> SUGIMOTO, Shigeo, Ph.D.
> Professor
> Research Center for Knowledge Communities
> Faculty of Library, Information and Media Studies
> University of Tsukuba
> postal address: 1-2, Kasuga, Tsukuba, Ibaraki 305-8550, JAPAN
> phones: +81-29-859-1348(office), +81-29-859-1531 (secretary)
> fax: +81-29-859-1093    email: sugimoto at slis.tsukuba.ac.jp



--

Praying for the victims of the Japan Tohoku earthquake

Makoto

