Four levels of document packages
robert_weir at us.ibm.com
Mon Oct 25 17:18:02 CEST 2010
At our last call I tried to point out that you can look at a document
package at different levels:
1) At the lowest level, the actual bits, which might be ZIP, TAR, gzip,
2) At the level of an abstract file system, with Files and Directories and
metadata commonly found on file systems, e.g., date stamps
3) Document-package conventions, dealing with things like manifests,
declaration of contained content types, digital signatures, etc.
4) The details of individual document formats, e.g., what exactly ISO/IEC
26300 puts in content.xml
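To make the four levels concrete, here is a rough Python sketch that looks at one small package at each level in turn. It is only an illustration under stated assumptions: the package is built in memory, the manifest is a cut-down imitation of an ODF manifest (ODF packages keep it in META-INF/manifest.xml and name the main content file content.xml), the content is a placeholder rather than real ODF markup, and the helper names build_sample_package and describe_levels are made up for this example.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by ODF manifests; the manifest below is heavily simplified.
MANIFEST_NS = "urn:oasis:names:tc:opendocument:xmlns:manifest:1.0"

SAMPLE_MANIFEST = f"""<manifest:manifest xmlns:manifest="{MANIFEST_NS}">
  <manifest:file-entry manifest:full-path="/"
      manifest:media-type="application/vnd.oasis.opendocument.text"/>
  <manifest:file-entry manifest:full-path="content.xml"
      manifest:media-type="text/xml"/>
</manifest:manifest>"""


def build_sample_package() -> bytes:
    """Build a tiny ODF-like package in memory (hypothetical sample data)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/vnd.oasis.opendocument.text")
        zf.writestr("META-INF/manifest.xml", SAMPLE_MANIFEST)
        # Placeholder content, not real ODF markup.
        zf.writestr("content.xml", "<doc><p>Hello</p></doc>")
    return buf.getvalue()


def describe_levels(data: bytes) -> dict:
    """Return one observation per level for the given package bytes."""
    levels = {}
    # Level 1: the actual bits. A ZIP local file header starts with PK\x03\x04.
    levels["bits"] = data[:4]
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        # Level 2: abstract file system, i.e. names plus metadata such as
        # date stamps.
        levels["files"] = {i.filename: i.date_time for i in zf.infolist()}
        # Level 3: package conventions, here the declared content types.
        manifest = ET.fromstring(zf.read("META-INF/manifest.xml"))
        levels["media_types"] = {
            e.get(f"{{{MANIFEST_NS}}}full-path"): e.get(f"{{{MANIFEST_NS}}}media-type")
            for e in manifest.iter(f"{{{MANIFEST_NS}}}file-entry")
        }
        # Level 4: the document format itself, i.e. what is inside content.xml.
        levels["content_root"] = ET.fromstring(zf.read("content.xml")).tag
    return levels


levels = describe_levels(build_sample_package())
```

Each key in the returned dictionary corresponds to one level, and each could in principle be specified (and standardized) independently of the others.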
IMHO, we don't have a significant technical problem with using the ZIP
Application Note for #1 above. It is working today for several standards.
However, we do have a referencing problem: external normative
references may only be made to other International Standards, or to
specifications for which there is an approved Referencing Explanatory
Report (RER), or to specifications from an Approved RS Originator
Organization (ARO). The ZIP Application Note does not fall into any of
these categories. But clearly there is more than one way to remedy the
#2 is a more interesting problem for us to tackle, I think, because it
deals with how to specify the conventions that make ZIP archives "play
well" in the XML/Unicode/Web world: things like a ZIP URL protocol,
a specification for fragment identifiers, etc. Given only the ZIP
Application Note it is not clear how content within a ZIP interacts with
the broader world.
For example, a goal could be to have XML in one ZIP define a reference to
an XML element in another XML file in another ZIP. And do this all in
Japanese. This is harder than it looks, since you are traversing three
different domains with three different conventions: XML/XPath, "paths"
within a ZIP (which are really just names that happen to look like paths)
and paths according to your protocol, e.g., HTTP/URL. IMHO, this is the
practical problem we have with implementations today. Solve this and we
open up greater opportunities for intra- and inter-linking of these kinds
of documents. I think this is where we should focus.
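A small Python sketch may help show where the three domains meet. The reference syntax used here, package URL, then "!/", then the entry name, then "#" and an XPath-like fragment, is purely hypothetical (loosely modeled on the "!/" separator of Java's jar: URLs); nothing in the ZIP Application Note defines it. The function names, the example.org URL, and the sample data are likewise invented for illustration, and the "fetch" callable stands in for whatever protocol actually retrieves the package.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote, urlsplit


def make_package(entry_name: str, xml_text: str) -> bytes:
    """Build an in-memory ZIP holding one UTF-8 encoded XML entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(entry_name, xml_text.encode("utf-8"))
    return buf.getvalue()


def build_reference(package_url: str, entry_name: str, xpath: str) -> str:
    """Hypothetical syntax: <package-url>!/<entry-name>#<xpath>.

    Non-ASCII characters are percent-encoded as UTF-8, as is usual for URLs.
    """
    return f"{package_url}!/{quote(entry_name)}#{quote(xpath)}"


def split_reference(ref: str):
    """Split a reference into its three domains."""
    parts = urlsplit(ref)
    package_path, _, entry = parts.path.partition("!/")
    package_url = f"{parts.scheme}://{parts.netloc}{package_path}"
    # Domain 1: protocol-level URL. Domain 2: ZIP entry name (just a
    # string that looks like a path). Domain 3: path within the XML.
    return package_url, unquote(entry), unquote(parts.fragment)


def resolve(ref: str, fetch):
    """Resolve a reference to an XML element.

    `fetch` is a caller-supplied function mapping a package URL to bytes.
    """
    package_url, entry_name, xpath = split_reference(ref)
    with zipfile.ZipFile(io.BytesIO(fetch(package_url))) as zf:
        root = ET.fromstring(zf.read(entry_name))
    # ElementTree supports only a limited XPath subset.
    return root.find(xpath)


# Usage: a reference into a package with Japanese entry and element names.
PACKAGE_URL = "http://example.org/docs/b.zip"
ENTRY = "文書/内容.xml"
XPATH = ".//項目[@id='a1']"
package = make_package(ENTRY, '<文書><項目 id="a1">こんにちは</項目></文書>')

ref = build_reference(PACKAGE_URL, ENTRY, XPATH)
element = resolve(ref, lambda url: package)
```

Even in this toy version, three separate escaping and encoding decisions had to be made up on the spot: how the entry name is encoded in the URL, how the fragment is delimited, and which character encoding the ZIP entry name uses. Those are exactly the conventions that are currently unspecified.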
#3 is also of interest to some parties. For example, we (the ODF TC)
recently received a liaison message from ETSI where they describe their
ASig work which deals with associating digital signatures in XML/ZIP
packages, including ODF, EPub, etc. There is some interest in doing this
As for #4, this is the job of WG4 and WG6, not WG1.