Four levels of document packages

robert_weir at us.ibm.com
Mon Oct 25 17:18:02 CEST 2010


At our last call I tried to point out that you can look at a document 
package at different levels:

1) At the lowest level, the actual bits, which might be ZIP, TAR, gzip, 
etc.

2) At the level of an abstract file system, with Files and Directories and 
metadata commonly found on file systems, e.g., date stamps

3) Document-package conventions, dealing with things like manifests, 
declaration of contained content types, digital signatures, etc.

4) The details of individual document formats, e.g., what exactly ISO/IEC 
26300 puts in content.xml
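
To make the split between levels 1 and 2 concrete, here is a minimal sketch using Python's standard zipfile module.  The entry names and contents are made up for illustration (loosely ODF-like) and are not taken from any standard.  build_package() produces the actual bits (level 1); list_files() looks at the same bits as an abstract file system with names, date stamps and sizes (level 2).

```python
import io
import zipfile

# Hypothetical date stamp, used only for illustration.
STAMP = (2010, 10, 25, 17, 18, 2)


def build_package() -> bytes:
    """Level 1: produce the actual bits of a small ZIP archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Hypothetical entries; the names are assumptions, not a real format.
        for name, text in [
            ("mimetype", "application/example"),
            ("META-INF/manifest.xml", "<manifest/>"),
            ("content.xml", "<doc/>"),
        ]:
            zf.writestr(zipfile.ZipInfo(name, date_time=STAMP), text)
    return buf.getvalue()


def list_files(data: bytes):
    """Level 2: view the same bits as files with names, dates and sizes."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        return [(i.filename, i.date_time, i.file_size) for i in zf.infolist()]


if __name__ == "__main__":
    for entry in list_files(build_package()):
        print(entry)
```

Nothing at these two levels says what a manifest means or what goes inside content.xml; that is what levels 3 and 4 add.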


IMHO, we don't have a significant technical problem with using the ZIP 
Application Note for #1 above.  It is working today for several standards. 
However, we do have a referencing problem: external normative 
references may only be made to other International Standards, to 
specifications for which there is an approved Referencing Explanatory 
Report (RER), or to specifications from an Approved RS Originator 
Organization (ARO).  The ZIP Application Note does not fall into any of 
these categories.  But clearly there is more than one way to remedy the 
referencing issue.

#2 is a more interesting problem for us to tackle, I think, because it 
deals with how to specify the conventions that make ZIP archives "play 
well" in the XML/Unicode/Web world.  Things like a ZIP URL protocol, a 
specification for fragment identifiers, etc.  Given only the ZIP 
Application Note, it is not clear how content within a ZIP interacts with 
the broader world.

For example, a goal could be to have XML in one ZIP define a reference to 
an XML element in another XML file in another ZIP, and to do this all in 
Japanese.  This is harder than it looks, since you are traversing three 
different domains with three different conventions:  XML/XPath, "paths" 
within a ZIP (which are really just names that happen to look like paths), 
and paths according to your protocol, e.g., HTTP/URL.  IMHO, this is the 
practical problem we have with implementations today.  Solve this and we 
open up greater opportunities for intra- and inter-linking of these kinds 
of documents.  I think this is where we should focus.
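
A rough sketch of what crossing those three domains involves, in Python.  Everything here is an assumption made for illustration: the "URL!/entry#xpath" reference syntax is invented (loosely in the spirit of Java's jar: URLs), the example.org URL and the Japanese entry and element names are made up, and a dictionary stands in for the HTTP fetch.  It is not a proposal for the actual convention.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from urllib.parse import quote, unquote


def make_zip(entries: dict) -> bytes:
    """Build an in-memory ZIP from {entry name: XML text}."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, text in entries.items():
            zf.writestr(name, text.encode("utf-8"))
    return buf.getvalue()


# Stand-in for the network: hypothetical URL -> package bits.
PACKAGES = {
    "http://example.org/b.zip": make_zip({
        "文書/内容.xml":
            "<本><段落 id='p1'>一</段落><段落 id='p2'>二</段落></本>",
    }),
}


def make_reference(package_url: str, entry_name: str, xpath: str) -> str:
    """Hypothetical syntax: URL, then '!/' + entry name, then '#' + XPath.

    Entry name and XPath are percent-encoded as UTF-8 so the result is ASCII.
    """
    return f"{package_url}!/{quote(entry_name)}#{quote(xpath, safe='')}"


def resolve(reference: str):
    """Walk the three domains in turn: protocol, ZIP entry name, XPath."""
    location, _, fragment = reference.partition("#")
    package_url, _, entry = location.partition("!/")
    data = PACKAGES[package_url]                  # domain 1: protocol/URL
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        xml_bytes = zf.read(unquote(entry))       # domain 2: ZIP entry name
    root = ET.fromstring(xml_bytes)
    return root.find(unquote(fragment))           # domain 3: XPath subset


if __name__ == "__main__":
    ref = make_reference("http://example.org/b.zip",
                         "文書/内容.xml", ".//段落[@id='p2']")
    print(ref)
    print(resolve(ref).text)
```

Even this toy has to make up answers to questions the ZIP Application Note leaves open: how the entry name is separated from the package location, which character encoding and escaping apply to it, and what a fragment identifier means inside an entry.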

#3 is also of interest to some parties.  For example, we (the ODF TC) 
recently received a liaison message from ETSI where they describe their 
ASiC work, which deals with associating digital signatures with XML/ZIP 
packages, including ODF, EPUB, etc.  There is some interest in doing this 
generically.

Details here:  
http://lists.oasis-open.org/archives/office/201010/msg00522.html

As for #4, this is the job of WG4 and WG6, not WG1.

-Rob



More information about the sc34wg1study mailing list