OPC names in general

Mon Aug 16 20:46:22 CEST 2010

Regarding the Unicode conversion: it's my understanding that the text in that Annex of Part 3 just covers a standardised way of getting to a part name from a Unicode string; it doesn't imply that part names can actually be Unicode strings (as it's specified elsewhere that they cannot). I think we could remove that Annex if we had to, although I think it's useful guidance.

Hello again - my response on the other thread (DR 09-0321) covers a lot of the topics in here as well. I'm against expanding the scope of possible part names, simply because existing IS 29500-compliant applications will be unable to read the new files, even though those files won't actually contain any new user content. I don't think this will be a very good experience for document owners, and I suspect that applications will tend to favour backward compatibility over nicer part names (given that these are not typically something exposed to the user anyway).

Chris

-----Original Message-----
From: MURATA Makoto (FAMILY Given) [mailto:eb2m-mrt at asahi-net.or.jp] 
Sent: 14 August 2010 18:31
To: e-SC34-WG4 at ecma-international.org
Subject: OPC names in general

Here are my ideas about OPC part names.  I think we should follow RFC 3986, RFC 3987, and LEIRI whenever possible.  At present, OPC has too many ad-hoc things, which deviate from the best practices of the Internet.

First, the first sentence of 9.1.1.1 is too confusing.  It even appears to allow schemes such as "http:".

	A Part name shall be an IRI and shall be encoded as either a Part IRI or a Part URI. 

I would argue that the correct syntax of OPC part names should be specified as follows:

	An OPC part name shall be a non-empty absolute path, which is defined
	by the non-terminal "ipath-absolute" in RFC 3987.

Note: We might want to attach two restrictions (not containing "//" and
not ending with "/"), but the 7th and 8th bullets in A.3 already
ignore them.
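
To make the proposed syntax concrete, here is a rough Python sketch of a checker for "ipath-absolute" as I read the ABNF in RFC 3987 (section 2.2).  It is only an illustration, not text for the standard.  The function names and the "strict" flag are hypothetical, and treating "non-empty" as "more than a bare '/'" is an assumption on my part.

```python
import re

# ucschar ranges as listed in RFC 3987, section 2.2.
_ranges = [(0xA0, 0xD7FF), (0xF900, 0xFDCF), (0xFDF0, 0xFFEF)]
_ranges += [(p << 16, (p << 16) | 0xFFFD) for p in range(1, 14)]
_ranges.append((0xE1000, 0xEFFFD))
_UCSCHAR = "".join(f"{chr(lo)}-{chr(hi)}" for lo, hi in _ranges)

# ipchar = iunreserved / pct-encoded / sub-delims / ":" / "@"
_IPCHAR = rf"(?:[A-Za-z0-9\-._~{_UCSCHAR}!$&'()*+,;=:@]|%[0-9A-Fa-f]{{2}})"

# ipath-absolute = "/" [ isegment-nz *( "/" isegment ) ]
_IPATH_ABSOLUTE = re.compile(rf"/(?:{_IPCHAR}+(?:/{_IPCHAR}*)*)?")


def is_ipath_absolute(s: str) -> bool:
    """True if s matches the ipath-absolute production (hypothetical helper)."""
    return _IPATH_ABSOLUTE.fullmatch(s) is not None


def is_candidate_part_name(s: str, strict: bool = False) -> bool:
    """Check s against the proposed wording (hypothetical helper).

    Assumption: "non-empty" is read as "more than a bare '/'".
    With strict=True, the two optional restrictions from the note
    (no "//", no trailing "/") are applied as well.
    """
    if s == "/" or not is_ipath_absolute(s):
        return False
    if strict and ("//" in s or s.endswith("/")):
        return False
    return True
```

With this reading, "/word/document.xml" is accepted, while "http://example.com/a" and "/a b" are rejected because a scheme and an unescaped space are both outside the production.  "/a//b" and "/a/" pass the grammar itself and are rejected only when strict=True, which is why the two extra restrictions would have to be stated separately.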

Second, why do we need "Unicode string" in Part 3?  I think that it is not needed at all as long as we do not want to allow LEIRI-specific characters.  If we do want to allow them, we use "legacy extended OPC part names".  I oppose allowing the space character while disallowing the other LEIRI additional characters.  I also oppose allowing what LEIRI does not allow.

Third, the conversion described in A.3 appears to be needed because of the mismatch between the ZIP format and URIs.  If this is the case, should OPC and other formats (e.g., OPF of IDPF) do the same thing?

Cheers,
Makoto