Upgrading the 29500 DR Processing System (was: Following up on my action item to give feedback on my DR Log V2 Plan)

Rex Jaeschke rex at RexJaeschke.com
Wed Sep 30 04:12:01 CEST 2009


Murata-san has posted a reply to my proposed DR Log V2, and made his own
proposal. However, his posting still does not address my main concerns,
which are as follows:

A. What is the problem he is trying to solve?
B. Does WG4 agree that the problem is real and actually *needs* solving?
C. Who will provide the resources for the work that would be needed to
implement the solution?


Before I address Murata-san's specific points, here is some of my thinking
with respect to 29500 DR Processing: 

1. I had a detailed look at our email archive, and here's what I found. For
the 9 months from October 23, 2008 (the first message), until July 31,
2009 (when we closed-out the DCOR1 and FPDAM1 sets):

* There were 650 emails posted, of which I estimate at least 20% [130] were
administrative in nature and/or not directly connected to DR processing.
That leaves 520 DR-related messages, which comes to approximately 60/month
and 15/week.
* We disposed of 293 DRs
* 85 DRs were still open.

Based on my experience, we made *very* good progress.

Note that there are 64 members of WG4 registered to access the email list.
That means that the average number of postings per member per month is 1!
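
The arithmetic behind these figures can be checked with a few lines of
Python. This is only a sketch: the 20% administrative share is the estimate
given above, and treating 9 months as 39 weeks is an assumption made here
for the conversion. The unrounded results come out slightly under the
rounded 60/month and 15/week quoted above.

```python
# Figures taken from the text above.
total_emails = 650
admin_share = 0.20        # estimated share of administrative messages
months = 9
members = 64

# Assumption for this sketch: 9 months is treated as 9 * 52 / 12 = 39 weeks.
weeks = months * 52 / 12

admin_emails = round(total_emails * admin_share)   # 130
dr_emails = total_emails - admin_emails            # 520

per_month = dr_emails / months                     # about 57.8
per_week = dr_emails / weeks                       # about 13.3
per_member_per_month = per_month / members         # about 0.9

print(f"DR-related emails: {dr_emails}")
print(f"per month: {per_month:.1f}, per week: {per_week:.1f}")
print(f"per member per month: {per_member_per_month:.2f}")
```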

2. Given that 29500 provides a rich set of facilities to help in publishing
documents, it seems like a *really* good idea to use those as much as
possible to help manage the 29500 standardization process. After all,
shouldn't we be promoting the use of our own standard? (I use some of these
facilities already, and my recent proposal uses more.)

3. I am not opposed to using Assembla, or something like it, to *help* with
DR processing. (One example appears to be to maintain various versions of
the *electronic* schema.)

4. As editor, I believe strongly that the format of any text in interim
"spec" releases should be the same as that intended for the final version,
so readers get comfortable with, and can gain confidence in, the work of the
author. And since our main goal is to produce CORs and Amds (and,
eventually, a revised standard), then we should use a consistent style at
every step along the way. (That is, not only do we need support for rich
text, we need support for the styles actually used in 29500).

5. In a world of finite (and currently diminishing) resources, when it comes
to developing tools to do a particular job, I'm at the minimalist end of the
spectrum. From that comes my long-held attitude of "It might not be the best
that is possible, but it sure looks good enough." This does *not* mean I'm
in favor of doing a poor job; quite the opposite, I'm very big on quality
assurance. It simply means that we almost certainly do not *need* the best
system, and we almost never have the resources to build it anyway, or to
deliver it in a timely fashion.

6. The fact that one can access something on-line, at any hour of the day,
is not justification in itself for having such access. And that goes the
same for *wanting* all kinds of statistics.

7. Let's be realistic about the usage we have made of the existing DR Log
system before we start claiming that we need "something better". The
existing system is email-based, with no registration or education required
to use it, and yet we've averaged roughly one participation per WG4 member
per month.  If we haven't used it all that much, why is that? Perhaps, we
have used the existing system "enough", and "more" is not necessary.

8. We have a master DR Log. It is intended to contain all information
relevant to any DR, from its initial submission, to its discussion on the
email list, and in meetings, through to the final resolution. (Supporting
documents are not included in the DR Log itself, but are made committee
documents that are referenced in that Log.) If anything is missing, then
members should be telling me, so I can rectify the situation.

9. But, the DR Log is large and growing: In my recent proposal I suggested
breaking it into individual DR chunks. That said, so what if it is large? So
long as you can download a copy and load it into some tool to use it, then
who cares what its size is? Can you do real work with it? If not, say what
the problem is, and let's address that.

10. The DR Log is sometimes out of date: I have issued a revised Log on a
regular basis, typically soon after each meeting or call, which has been
every 2-4 weeks. In-between times, we all have access to the emails and
attachments that members have posted, so we do have access to *all*
contributions, in one of two places: the DR Log or our email in-trays.

11. An on-line system will make it easier for (more) members to participate:
I don't believe that for an instant! Right now, the way to participate is by
posting a new email or responding to a previous one. What can be easier than
that?

12. We really need facility xxx: Don't confuse *need* with *want*. And ask
yourself if *you* are willing to pay for its implementation.

13. In the 4 months assembla has been "on the table", no-one has objected to
using it: Having been on standards' committees for 25 years, I've found that
if you ask members "Who is in favor of someone else doing some work?", there
are rarely objections. From that, one might conclude there is a lot of
support for that work. However, if you ask the question, "Who is in favor of
going in some given direction if *they* had to fund their share of the
time/effort needed to implement it?", then you often get the complete
opposite result. So, to make it quite clear, I'm objecting to any sort of
wholesale move to using assembla anytime soon.



BTW, in more than 3 years, Ecma TC45 has received only 3 or 4 emails from
members of the public, asking about details in the standard. I have received
no correspondence from the public via JTC 1, and I don't recall anyone
within WG4 saying they were contacted by the public and asked to submit any
DRs on their behalf. In short, the level of direct technical inquiry from
the public has been negligible. And think of all the implementers out there
who are *not* directly represented within WG4, yet, presumably need to
understand the standard; they seem to be getting along just fine without
contacting us at all. So, who then is clamoring for more information? Murata
already makes the DR Log available publicly. I would say that most people
involved in OOXML-related projects are more interested in the published
standard and published CORs and Amds, than in what WG4 is doing internally.
Besides, pretty much anyone who *really* has a stake in 29500 (rather than
just "being interested") has a way to participate through their NBs, a
liaison, or as an invited guest. (I know, because as an independent
consultant I funded myself to participate for 15 years in the ANSI/ISO
standardization for the C language, which typically had 4.5-day face-to-face
meetings 4 times per year.)


Ok, let's move on to specific points raised by Murata-san (see my replies
in-line):


-----Original Message-----
From: MURATA Makoto (FAMILY Given) [mailto:eb2m-mrt at asahi-net.or.jp] 
Sent: Monday, September 28, 2009 1:34 AM
To: SC 34 WG4
Subject: Re: Following up on my action item to give feedback on my DR Log V2
Plan

Dear colleagues,

Here is my proposal.  I strongly advocate the use of Assembla.

1. Problems of the current approach

1.1 Schema maintenance problems

1) Different specifications (e.g., FPDAM1 and DCOR1) developed 
   in parallel share a single set of schemas.  What happens if 
   one of them is disapproved?

Rex> This could be a problem. And as I stated above, I am not opposed to
using assembla to help solve certain parts of the maintenance problem. Note,
however, in the DCOR1 and FPDAM1 sets (which closed 293 DRs), there was only
one instance of overlap with respect to changes to the same lines of schema.
In any event, the same issue exists with changes to the same sentence or
paragraph by multiple DR resolutions. In both cases, when I produced the
DCOR1 and FPDAM1 sets, I noticed this and made appropriate adjustments. 


2) WG4 members cannot obtain schema files and point out problems 
   (including syntactical errors) before schemas
   are generated from change-tracked Word files.

Rex> In my V2 proposal I suggested that we not actually mark a DR closed
until its schema changes have been applied to the electronic version and
validated. As to how this is done, I have no preference at this time.


3) RELAX NG schemas cannot be created before XSD schemas are 
   generated from change-tracked Word files.

Rex> See previous response.


1.2 DR maintenance problems

1) The DR Log file is bulky and cumbersome.

Rex> These are not meaningful characterizations. Please be specific about
how this impairs maintenance. Also, see my earlier response.


2) The task of maintaining the DR Log is too significant a 
   burden on the project editor(s), and thus the DR Log 
   does not always contain all the relevant e-mails or attachment files.

Rex> Perhaps you should have asked the project editor first before making
such a bold statement! Let me state for the record that I have never
claimed, nor at any time do I recall having thought, that this was true. As to the
possibility of there being missing information, what steps has anyone taken
to notify me of that? I'm not aware of any substantive problem here.


3) Maintaining the consistency between the DR index (as an Excel file) 
   and the DR Log is also a significant burden on the project editor(s).

Rex> Once again, you are speaking on my behalf without checking with me.
Because I started the spreadsheet Index well after I started the DR Log,
yes, it took a couple of hours of cutting and pasting to populate the sheet,
but, now, it's trivial to create a new row for each DR I add to the Log. In
any event, my V2 proposal makes the generation of the Index automatic.


4) Statistics are published only occasionally (by hand), and they are
   not very flexible.  For example, we cannot count the number of 
   MIME-related DRs.

Rex> As far as I know, other SCs and WGs generate statistics by hand too,
since statistics are only needed for occasional summaries of the status of
work. In any case, you can take the DR Log PDF and search for a string, such
as "MIME", and find the number of occurrences. Although a rough gauge, I
expect it will be good enough for most purposes. I can't say that I've had
the need to produce any other statistics. What else do members *need*, how
often, and why?
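
The string-search approach can be sketched in Python, assuming the DR Log
has first been exported to plain text. The layout used here (each DR section
starting with a line of the form "DR <id>") and the sample entries are
hypothetical and serve only to illustrate the idea; the real Log's layout
may differ.

```python
import re

def count_term(log_text: str, term: str) -> int:
    """Count case-insensitive whole-word occurrences of term."""
    pattern = rf"\b{re.escape(term)}\b"
    return len(re.findall(pattern, log_text, flags=re.IGNORECASE))

def drs_mentioning(log_text: str, term: str) -> list[str]:
    """Return the ids of DR sections that mention term at least once.

    Hypothetical layout: every DR section starts with a line 'DR <id>'.
    """
    # With one capture group, re.split yields
    # [preamble, id1, body1, id2, body2, ...].
    parts = re.split(r"(?m)^DR (\S+)[ \t]*$", log_text)
    ids, bodies = parts[1::2], parts[2::2]
    return [dr_id for dr_id, body in zip(ids, bodies)
            if count_term(body, term) > 0]

# Made-up sample text, not taken from the real DR Log.
sample = """DR 0001
The MIME type for the part is wrong.
See also the mime mapping table.
DR 0002
A typo in the schema annex.
DR 0003
Content type (MIME) is missing.
"""

print(count_term(sample, "MIME"))       # raw occurrence count
print(drs_mentioning(sample, "MIME"))   # DRs that mention the term
```

The raw occurrence count is the rough gauge described above; counting the
DR sections that mention the term is a slightly closer approximation to
"the number of MIME-related DRs".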


1.3 DR submission problems

When we developed the web form for submitting DRs, we thought that
this will become by far the most important channel.  Apparently, it is
not.  Since some editorial defects are best described as changes in
the DOCX file rather than filling out flat-text-only forms, some
member bodies continue to use DOCX files for submitting DRs.  The web
form does not cause any troubles, however.

Rex> Correct, the web form does not cause any "trouble", but it does make
work for me. However, let's be careful about the use of "we". I argued
against it from the start. It is only able to capture initial submissions,
not track anything beyond that, and it has no support for rich text. As for
others using an alternate approach to DR submission, that's because I have
openly encouraged them to do so by providing them with a "blank" DOCX file
for them to "fill in". 50% of the DRs come to me via that method, which
makes it trivial for me to integrate them into the DR Log. So, if others
would submit their DRs that way too, that would reduce my workload.


2. My proposal

The proposal by Rex addresses 1), 3) and 4) in 1.2.  However, 
the other problems are not solved.  I would like to provide 
some additions and changes to the proposed plan.

Rex> Since 2) was about the current system's burden on the project editor,
and as the project editor I've explained that this was an incorrect
assumption, it appears that my proposal does, in fact, address all the
problems you identify.


I would like to advocate the use of Assembla.  In particular,
subversion can easily address 1), 2), and 3) in 1.1.

2.1 Schema maintenance

Schemas should be maintained in the subversion repository of Assembla
rather than change-tracked MS-Word documents.

Rex> So long as the schemas are printed in annexes, I *need* to publish
tracked changes in CORs and Amds to show how those annexes' contents are
changed. So, it is not an "either/or" situation; we need to deal with
changes in the printed annex versions, and we need to deal with the
electronic versions. Maybe my current approach can be used for the annexes,
and assembla for the xsd/rnc file versions.


Schema files can be accessed via the web-based user interface of
Assembla, but schema experts and the project editor will require
front-end systems (such as Tortoise) for schema authoring.

Rex> At this time, I am open to how changes to the electronic version of the
schemas are handled. 


- For each specification (AM or DCOR), we create a subversion
  branch.  This branch contains those schema changes required for the
  amendment or DCOR.  Note that different specifications developed in
  parallel have different schemas.

- Whenever a WG4 member proposes a schema change in reply to a DR,
  that member should store the change into the subversion repository,
  and should create a link to the DR at the same time.  This 
  action requires understanding of subversion and the front end system.

- Everybody can view schemas stored in the Assembla subversion 
  repository at any time.

An issue in the subversion approach (which maintains schema files
only) is the creation of change-tracked Word files.  I believe that
this issue can be addressed by using Word for comparing two DOCX
documents.

- Create a DOCX document containing old schemas.

- Create another DOCX document containing new schemas for the 
  specification in question.

- Use the file comparison feature of MS Word for generating change-
  tracked DOCX documents.

- Incorporate the change-tracked DOCX documents as part of the 
  (F)PDAM or DCOR.
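
The old-versus-new comparison step can be illustrated on plain schema text.
The sketch below uses Python's difflib rather than Word's comparison
feature, so it is a stand-in for the idea, not the proposed procedure
itself. The type name ST_Example, its contents, and the file names are
hypothetical.

```python
import difflib

# Hypothetical "old" schema fragment.
old_schema = """<xsd:simpleType name="ST_Example">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="a"/>
  </xsd:restriction>
</xsd:simpleType>
""".splitlines(keepends=True)

# Hypothetical "new" fragment: one enumeration value added.
new_schema = """<xsd:simpleType name="ST_Example">
  <xsd:restriction base="xsd:string">
    <xsd:enumeration value="a"/>
    <xsd:enumeration value="b"/>
  </xsd:restriction>
</xsd:simpleType>
""".splitlines(keepends=True)

# Line-based unified diff between the two versions.
diff = list(difflib.unified_diff(
    old_schema, new_schema,
    fromfile="old/example.xsd", tofile="new/example.xsd"))

# Separate changed lines from the '---'/'+++' file headers.
added = [ln for ln in diff if ln.startswith("+") and not ln.startswith("+++")]
removed = [ln for ln in diff if ln.startswith("-") and not ln.startswith("---")]

print("".join(diff))
```

A line-based diff like this shows which schema lines a given change
touches; it does not produce the change-tracked DOCX that the steps above
call for.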

Rex> At a glance, this sounds like a non-trivial process. When considering
Murata-san's proposal, keep in mind that of the 213 DRs we closed in the
DCOR1/FPDAM1 sets, only 37 (that is, 17%) involved changes to schema. So, we
need to keep in perspective how sophisticated a system we really *need* to
manage this. In any event, I don't recall any other members raising any
concerns about their inability to do anything schema-related in terms of the
DR Log machinery.


2.2 DR maintenance

Although I strongly advocate the use of Assembla, we should 
not force the project editor to use the web-based user interface 
of Assembla for DR maintenance.   This is because he might 
not be always able to access Assembla and also because MS Office
is sometimes more convenient than web-based user interfaces.

- For each DR, the project editor creates a DR file as a DOCX document, 
  as proposed by Rex.  Whenever the status of the DR changes, the 
  project editor revises the DR file.

- The project editor uses a program for automatically creating an
  Assembla ticket from the DR file.  Custom xml or smart tags embedded
  in the DR file allow data to be easily extracted.  The program also 
  stores the DR file as an attachment file in Assembla.

- The project editor uses another program for creating the DR Log
  Index from the Assembla tickets (not from the DR files) as an XLSX
  document.  Links from the DR Log Index to DR files use the URI of
  the Assembla repository as a base URI.  The DR Log Index also 
  contains the summaries and statistics on a new Sheet.

- The project editor then stores the DR Log Index in three formats, 
  namely XLSX, TSV, and PDF, as attachment files, and creates links from
  the Assembla wiki.

- Each WG4 member receives a notification mail for each newly-created DR.

- WG4 members discuss each DR by using the comment feature of 
  Assembla.  All comments on DRs are stored in the Assembla
  repository, and are accessible via the web-based Assembla user 
  interface.  Attachment files are also stored in Assembla.

- If necessary, we develop a program for generating a single file (zip? pdf?)
  for each DR or all DRs.  This program would help those who do not always 
  have access to the Internet.

- As described in 2.1, schema changes required for any DR should be
  stored in the Assembla subversion repository, and each commit action 
  should provide a link to the relevant DR.  Such links make it easy 
  to maintain the relationship between schema changes and DRs.
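
As a rough illustration of the index-generation step in the list above,
the following Python sketch builds a TSV index and a status summary from
ticket-like records. It does not call Assembla at all; the record fields
(id, status, title) and the sample values are hypothetical and stand in for
whatever data the tickets would actually carry.

```python
import csv
import io
from collections import Counter

# Hypothetical ticket records; field names and values are made up.
tickets = [
    {"id": "DR-0002", "status": "Open", "title": "Example defect B"},
    {"id": "DR-0001", "status": "Closed", "title": "Example defect A"},
    {"id": "DR-0003", "status": "Closed", "title": "Example defect C"},
]

FIELDS = ["id", "status", "title"]

def build_index_tsv(records: list[dict]) -> str:
    """Return a tab-separated index, one row per DR, sorted by id."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(sorted(records, key=lambda r: r["id"]))
    return buf.getvalue()

def status_summary(records: list[dict]) -> Counter:
    """Count DRs per status, for a summary sheet."""
    return Counter(r["status"] for r in records)

index_tsv = build_index_tsv(tickets)
summary = status_summary(tickets)

print(index_tsv)
print(dict(summary))
```

Producing the XLSX and PDF forms, and the links back to the DR files, would
need further tooling that is not sketched here.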

The project editor will be freed from two tasks.

- He does not have to copy all e-mails and attachment files to DR Log files.

- He does not have to maintain the DR Log and the DR Log Index 
  in sync.

Note: I do not have an opinion about the DR history log.  Is it hard
to maintain?  I have almost never read the DR history log.
  
2.3 Milestones

Whenever a new amendment project is created, we create an Assembla
milestone and a branch in the subversion repository.  Whenever we
start to create a DCOR, we do the same thing.



Regards,

SC34/WG4 Convenor
MURATA Makoto (FAMILY Given)


Rex> As far as I can tell, with respect to 29500 maintenance, the assembla
solution currently being proposed appears to me largely to be a solution in
search of a problem to solve. I believe that the use of it would make the
project editor's work more time-consuming.  It might also reduce
participation due to the administrative and training overhead involved, and the
time and resources needed to move to assembla would distract WG4 from the
more important work of processing defect reports.  WG4 has moved rapidly
through a large number of DRs already, and I think we should build on that
success.  I'm open to fine-tuning the process, but I don't believe anything
has happened to justify a wholesale change to a radically different
approach.

Rex Jaeschke
29500 Project Editor







