Guidelines + Hints (Re: DR-08-0012: Preventing a slippery slope)

rjelliffe at allette.com.au rjelliffe at allette.com.au
Fri May 29 07:43:16 CEST 2009


A couple of comments.

===================================
First, would it be useful for us to make some kind of development strategy
to guide IS29500 development wrt versioning?

For example, for different kinds of changes:
  When do we make a new namespace?
  When do we keep the namespace but rely on other markup?
  When do we keep the namespace and just add or remove the change?

For example, are new namespaces the equivalent of "major" (i.e. breaking)
version numbers to be handled by MCE, but we still require something else
for smaller changes? Or are even the smallest changes supposed to be handled
by new namespaces and MCE?


===================================
Second, we need to be careful that defining the issue as 'versioning' does
not push us in a certain direction for the solution: such as that if the
problem is versioning the solution must be version numbers.

MCE is great. It takes the SGML-y line that explicit labelling is the best
policy: excellent. But it works by duplication to get optimal data
interchange: the more namespaces that consuming and generating applications
import and export using MCE, the better the chance of a happy marriage.

This will favour market operation, so that in effect only a few basic
alternatives will appear in documents. Postel's principle will be
relevant.

But MCE does not address one particular 'versioning' problem: how can a
consuming application make a stab at using an OOXML document which does
not contain any namespaces it knows about?  For example, how can an "ecma"
application use a document that only has "strict" in it? This of course
becomes a much stronger issue over time, particularly if MCE encourages
WG4/ECMA to regularly change the namespace.

Will OpenOffice2030 understand every OOXML namespace for the next 20
years, in order to be able to open a current "ecma" document? This is of
course one of the issues we need to be taking seriously.

I suggest that MCE could usefully be augmented by a hint mechanism, that
gives the relationships between namespaces. For example

  <namespace iri="http://purl.olcl.org/schemas/wordprocessingml2030">
      <superset iri="http://purl.olcl.org/schemas/wordprocessingml2029" />
      <dialect iri="http://purl.olcl.org/schemas/wordprocessingml2010" />
  </namespace>

We are in 2030. We have a document in the 2030 namespace. When we open it
with Office 2029, the hint is enough to say "You can convert this document
to the 2029 namespace and read it with that (but warn the user *if* you
find things you don't understand)". If we open it with Office 2028, the
application converts it to the 2010 namespace and tries the same.

Now this is perhaps a kind of type relationship system for namespaces...it
could go in a more theoretical direction but I don't think that is
necessary. The point is just to provide hints to allow graceful
degradation that can kick in when a document uses namespaces that the
consuming application is not aware of, because MCE is a mechanism that
does not guarantee that the document contains any namespaces the consumer
understands.
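
To make the fallback concrete, here is a rough sketch in Python of how a
consumer might use such hints. It is only an illustration under assumptions:
the <hints> wrapper element, the function names load_hints and pick_fallback,
and the rule "try the listed relationships in document order" are all mine,
not part of MCE or of any proposal on the table.

```python
import xml.etree.ElementTree as ET

# Hint markup from the example above. The <hints> wrapper is an assumption
# added only so that several <namespace> entries could share one file.
HINTS = """
<hints>
  <namespace iri="http://purl.olcl.org/schemas/wordprocessingml2030">
    <superset iri="http://purl.olcl.org/schemas/wordprocessingml2029" />
    <dialect iri="http://purl.olcl.org/schemas/wordprocessingml2010" />
  </namespace>
</hints>
"""


def load_hints(xml_text):
    """Map each namespace IRI to an ordered list of (relation, related IRI)."""
    hints = {}
    for ns in ET.fromstring(xml_text).iter("namespace"):
        hints[ns.get("iri")] = [(rel.tag, rel.get("iri")) for rel in ns]
    return hints


def pick_fallback(doc_ns, understood, hints):
    """Return (namespace to process as, relation), or None if nothing fits.

    Hypothetical policy: use the document namespace if it is understood,
    otherwise take the first hinted namespace the consumer understands.
    """
    if doc_ns in understood:
        return doc_ns, "exact"
    for relation, iri in hints.get(doc_ns, []):
        if iri in understood:
            return iri, relation
    return None


NS_2030 = "http://purl.olcl.org/schemas/wordprocessingml2030"
NS_2029 = "http://purl.olcl.org/schemas/wordprocessingml2029"
NS_2010 = "http://purl.olcl.org/schemas/wordprocessingml2010"

hints = load_hints(HINTS)
office_2029 = {NS_2029, NS_2010}   # namespaces this consumer understands
office_2028 = {NS_2010}

print(pick_fallback(NS_2030, office_2029, hints))
print(pick_fallback(NS_2030, office_2028, hints))
```

A real consumer would still have to warn the user about anything it does
not understand after the conversion; the sketch only covers choosing which
namespace to try.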

In terms of "S", we could say (the namespaces are approximated, sorry)

 <namespace iri="http://purl.olcl.org/schemas/ooxml/wordprocessingml">
    <subset iri="http://www.openxml.org/schemas/2007/wordprocessingml" />
    <dialect iri="http://www.ecma-international.org/schemas/ecma376" />
 </namespace>

This approach has the virtue of defining the versioning problems in terms
of filling in gaps from MCE to allow a better story on graceful
degradation: I see this playing out

 - Short term: helps where a new generation of generators comes out
generating documents with a new namespace: the previous generation can use
the hint to version-down the namespace. MCE has a good story here, except
perhaps for very large documents where the duplication from a new
namespace might be unfortunate.

 - Mid term: helps older applications or apps with a slower development
cycle. An MCE document that had the data in the new namespace and one
previous (since having it in all alternative namespaces including ODF
seems unlikely) may still not match with, say, someone using a six-year-old
version of their application.

 - Long term: an old document saying which dialect it uses can allow an
application to make a stab at the appropriate processor.

I recognize that the more that we move to "download codec when required"
architectures for document systems, the less justification there may be
for my suggestion of markup giving the relationship between namespaces for
application fallback.

Cheers
Rick Jelliffe

P.S. For WG1 people. This hinting mechanism has a possible DSRL tie-in. If
a consuming application decides it does not understand any namespace in
its MCE data, but it does see that the namespace is a
dialect/superset/subset of a namespace it does understand, it could look
for a DSRL schema to perform the transform and try to load the result. But
that is blue-sky, not necessary.


