Maintaining schemas and tracking changes (formerly, Making RELAX NG schemas normative rather than non-normative)

MURATA Makoto eb2m-mrt at asahi-net.or.jp
Sat Mar 26 04:51:56 CET 2011


Rex,
>
> Up until now, all changes to 29500 have been documented in the DR log, and
> that log was then used to generate the COR 1 set and then the AMD1 set. As

Unfortunately, no.

Before the completion of DCOR1 and FPDAM1, the schemas in
the DR log were very buggy.  There were many syntax errors as well as
other errors.  When the electronic schemas were consolidated and tested,
the bugs in the DR log schemas were fixed.

Now, some schema changes for the closed DRs (to appear in COR2) exist
only in the Subversion repository.  I would like to know the opinion of
WG members about this.

> we were resolving DRs, I argued that we shouldn't declare a DR to be closed
> until all the detailed changes were documented as part of the corresponding
> DR(s) entries in the log. And I was successful except for RELAX NG edits,
> which came sometime (very much) later. (I accepted that delay because
> support for this schema was not required; that is, it was non-normative).

No, you were not successful, since XSD schemas in the DR log were buggy.
Correct XSD schemas also came later.

> Anyone could look at a closed DR at any time and see exactly what XSD schema
> changes had been agreed to by WG4. And given that the complete history of a
> DR's processing is captured there, interim schema changes are also recorded.

I, as a schema hacker, do not believe anything that has not been tested by
a validator.  Even if a schema fragment in the DR log is created from
a tested schema, I do not trust it unless the creation process is completely
automatic.
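
As a minimal sketch of an automatic first check, the Python below only
tests whether a schema fragment is well-formed XML, which would catch
the syntax errors mentioned above.  It is not a full XSD or RELAX NG
validation; the function name check_well_formed and the two sample
strings are hypothetical.

```python
import xml.etree.ElementTree as ET


def check_well_formed(schema_text):
    """Return None if the text parses as XML, else the parser's error message.

    This only checks well-formedness; it does not validate the schema itself.
    """
    try:
        ET.fromstring(schema_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Hypothetical samples: the second one is missing its closing tag.
good = '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"/>'
bad = '<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">'

for name, text in (("good", good), ("bad", bad)):
    error = check_well_formed(text)
    print(name, "OK" if error is None else "ERROR: " + error)
```

A real check would go further and run the complete schema through a
schema-aware validator.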

> And as they were maintained using pseudo-change-tracking notation, the
> reader could see what was deleted or added. CORs (and AMDs like AMD1) need
> to continue to show JUST the changes being made, NOT the whole schemas.

I do not agree here.  Those who really care about schemas can compare
the old schemas and the new schemas with a program.  I do not think
that they trust the results of manual operations.  I certainly do not.
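
For illustration, such a comparison can be sketched with Python's
standard difflib module.  The function name schema_diff, the file
labels, and the two-line sample fragments are assumptions made for this
sketch; they are not taken from the 29500 schemas.

```python
import difflib


def schema_diff(old_text, new_text, old_name="old.xsd", new_name="new.xsd"):
    """Return a unified diff between two schema texts (empty if identical)."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(old_lines, new_lines,
                             fromfile=old_name, tofile=new_name)
    )


# Hypothetical fragments that differ in one attribute value.
old = '<xsd:element name="a"/>\n<xsd:element name="b"/>\n'
new = '<xsd:element name="a"/>\n<xsd:element name="c"/>\n'
print(schema_diff(old, new))
```

The output marks removed lines with "-" and added lines with "+", and
no manual step is involved.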

I think that there is nothing wrong with providing the entire Annexes
A and B for each COR or AMD, but it is certainly possible to create a
schema-change-tracking WordprocessingML document from
two WordprocessingML documents: one for the old schemas and
the other for the new schemas.


> I believe that if we stop showing the schema changes in the DR log, that
> would be a major step backwards in allowing readers to see all the
> information in one place.

I see two possible options.  One is to create an image file (or something
else) generated by an XML or text comparison program, and to embed
that image in the DR log.  The other is simply to reference the SVN
repository.  Schema hackers should have no problem finding the differences.
Note that each commit is accompanied by a comment indicating the DR number.
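
As a rough sketch of how the commit comments could be used, the Python
below filters the XML output of "svn log --xml" for revisions whose
message mentions a given DR identifier.  The function name
revisions_mentioning, the sample log, and the placeholder identifiers
"DR-X" and "DR-Y" are hypothetical; the actual comment format in the
repository may differ.

```python
import xml.etree.ElementTree as ET


def revisions_mentioning(log_xml, dr_number):
    """Return (revision, message) pairs whose commit message mentions dr_number."""
    root = ET.fromstring(log_xml)
    hits = []
    for entry in root.iter("logentry"):
        msg = entry.findtext("msg", default="")
        if dr_number in msg:
            hits.append((entry.get("revision"), msg.strip()))
    return hits


# Hypothetical log in the shape produced by "svn log --xml".
sample_log = """<log>
  <logentry revision="1"><msg>Schema change for DR-Y</msg></logentry>
  <logentry revision="2"><msg>Schema change for DR-X</msg></logentry>
</log>"""

print(revisions_mentioning(sample_log, "DR-X"))
```

The revision numbers found this way can then be handed to a diff tool
to see exactly what changed for that DR.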

(By the way, the Web-based Assembla UI for showing schema changes
does not work well.  I always use Oxygen Diff.)

> It would also make the generation of a COR or AMD
> much more complicated requiring a lot of work to be done very late in the
> process rather than as each DR is closed. Now, maintaining schemas changes

In the case of AMD1 and COR1, a lot of work on schemas was done
very late in the process.

> in the DR log as well as electronically could lead to differences between
> them. How can that happen? If WG4 approves the changes documented in the DR
> log then those are the official ones (even if they are technically incorrect
> or don't validate!).

I think that WG4 should not approve schemas in the DR log.  WG4 should
approve schemas in the SVN repository.

> If the electronic schema is found to be different then
> someone made changes there without having WG4 review them and approve them
> formally in a teleconference or F2F meeting, or I made an error in what was
> recorded in the meeting minutes and/or DR log. To make sure differences
> don't happen, all schema changes should be tested electronically BEFORE the
> DR is accepted as closed. Any change to the printed or electronic version
> after that requires the DR to be re-opened.

Even if each schema fragment is tested before a DR is closed,
subsequent manual operations (embedding the schema fragment
in the DR log and consolidation of Annexes A and B) will lead to
errors, as demonstrated by the recent reprint.

In the case of AMD1 and COR1, the creation of electronic schemas
was a nightmare, since I had to generate them from merged schemas.
At present, every schema change from closed DRs exists in one branch.
If we again have to use AMDs heavily, we will have a similar problem,
but the change log in the Assembla SVN repository will help greatly.

Our life is hard since we have to develop more than one COR/AMD
in parallel.  We have AMD2 already.  We are likely to have another AMD
for anchors in DrawingML.  Thus, we cannot avoid merging schema sets,
one set per COR or AMD.  Electronic schemas and the automatic
merging provided by TortoiseSVN help greatly.  A manual merging process
based on the DR log cannot take advantage of any software and
requires tedious, labor-intensive, and error-prone operations.




Praying for the victims of the Japan Tohoku earthquake

Makoto

