DR-16-0018: WML: need sort method for special characters [for today's call]

Francis Cave francis at franciscave.com
Tue Dec 20 22:37:33 CET 2016


Hi Keld

Thanks for your suggestions. We could certainly consider passing these on to
the DR submitter as a possible approach to implementing the sort method for
special characters. Whether or not they would match how existing
implementers have tackled this issue is for those implementers to say,
should they choose to do so.

Kind regards,

Francis



-----Original Message-----
From: keld at keldix.com [mailto:keld at keldix.com] 
Sent: 20 December 2016 19:24
To: Francis Cave <francis at franciscave.com>
Cc: 'MURATA Makoto' <eb2m-mrt at asahi-net.or.jp>; 'SC 34 WG4'
<e-SC34-WG4 at ecma-international.org>
Subject: Re: DR-16-0018: WML: need sort method for special characters [for
today's call]

Hi Francis

I understand that there are 3 valid values for sortMethod: stroke, pinYin
and none.
The DR asks for what sorting to use for special characters.
I propose to use the one defined by the null tailoring of ISO 14651.

As strings can contain any UCS characters, the ordering just mentioned
fits that bill nicely as it is defined on all UCS characters. ISO 14651 also
addresses other codesets than UCS, if that is relevant.

It would make the first accent difference decide the order when all other
characters are considered equivalent, which corresponds to the expected
ordering in English and many other languages.
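
To illustrate the difference between "first accent difference" and "last
accent difference" ordering, here is a minimal Python sketch. It is not the
ISO 14651 or UTS#10 table: it assumes a simplified two-level key (base
letters, then accents) built from Unicode decomposition, and the names
`sort_key` and `backward_accents` are hypothetical.

```python
import unicodedata


def _split(word):
    """Split a word into base letters and the accent marks on each letter."""
    base, accents = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # Attach the combining mark to the preceding base letter, if any.
            if accents:
                accents[-1] += ch
        else:
            base.append(ch.lower())  # simplification: case is ignored
            accents.append("")
    return base, accents


def sort_key(word, backward_accents=False):
    """Two-level key: base letters first, accents only as a tie-breaker.

    With backward_accents=True the accent level is compared from the end of
    the word, so the last accent difference decides (the "French" approach).
    """
    base, accents = _split(word)
    if backward_accents:
        accents = accents[::-1]
    return (base, accents)


words = ["côté", "cote", "coté", "côte"]

forward = sorted(words, key=sort_key)
backward = sorted(words, key=lambda w: sort_key(w, backward_accents=True))

print(forward)   # ['cote', 'coté', 'côte', 'côté']
print(backward)  # ['cote', 'côte', 'coté', 'côté']
```

The two results differ only in the position of "coté" and "côte", which is
the kind of difference a user could see if two implementations applied
different default accent rules to the same data.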

Maybe one could use extLst to further describe the sorting method. I would
advise to use normal internationalisation mechanisms, such as given by a
locale, as this is a well-known concept, and readily available in most
operating systems. This is not a new feature, so there is no need to have an
amendment or revision of the standard; it could be handled by a DR with some
guidance on how to do it.

Some suggested advice could be: if the extLst is empty, then the associated
locale of the current environment should be used.
By this I mean the language setting for the text in question; it could be an
English part of a Spanish document.

extLst could also be a name, in that case it should be the name of an
implementation-defined locale of the operating system, of which the sorting
spec is to be used.
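
The selection rule suggested above can be sketched as a small Python
function. This is only an illustration of the proposal, not anything
defined in OOXML: the function name `choose_sort_locale`, the idea of
passing the extLst content as a plain string, and the locale names in the
example are all assumptions.

```python
from typing import Optional


def choose_sort_locale(ext_lst_name: Optional[str], text_language: str) -> str:
    """Pick the locale whose sorting specification should be used.

    ext_lst_name:  hypothetical locale name carried in extLst (None or
                   blank if extLst is empty).
    text_language: language setting of the text being sorted, e.g. the
                   English part of a Spanish document.
    """
    name = (ext_lst_name or "").strip()
    # A named locale wins; otherwise fall back to the text's own language.
    return name if name else text_language


print(choose_sort_locale(None, "en-GB"))     # en-GB
print(choose_sort_locale("da_DK", "en-GB"))  # da_DK
```

Whether a named locale is actually available would remain
implementation-defined, as noted above.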

I don't know if a new DR is needed or this could be part of the answer to
DR-16-0018.

Best regards
keld

On Tue, Dec 20, 2016 at 04:09:41PM -0000, Francis Cave wrote:
> 
> Hi Keld
> 
> My main concern is whether the sort method, if specified, affects what 
> the user sees when they open a document. Suppose that, for the sake of 
> argument, the default sort method of implementation A sorts accented 
> text according to the "normal" approach (first accent difference 
> determines the order), while implementation B sorts accented text 
> according to the "French" approach (last accent difference determines 
> the order). Suppose a spreadsheet is created by implementation A and 
> has an auto-filter applied that sorts according to its default sort 
> method.  If this spreadsheet is subsequently opened by implementation 
> B, will it appear the same as if it was re-opened by implementation A, 
> or will implementation B's default sort method be automatically 
> applied? My guess is that generally the latter will be the case, but this
> is obviously implementation-dependent.
> 
> I suspect that this is a case where implementations must be free to 
> choose alternative approaches that aren't fully interoperable. Both 
> implementations A and B in my example will have had good reasons for 
> choosing different sort methods, e.g. based upon market demands.
> 
> It might be nice if the sort method were spelt out in the document, 
> but this would definitely be a new feature. For now, in response to DR 
> 16-0018, I think we should simply ensure that the specification is 
> consistent with the schema and is clear about what are meant by the 
> existing values of ST_SortMethod, i.e. 'none', 'pinYin' and 'stroke'.
> 
> Interestingly, I note that the content model of sortState includes the 
> application-defined extension element extLst, which could be used to 
> specify the sort state in more detail, using MCE. So, in theory, it 
> would be possible to define an extension to OOXML without having to 
> amend the base standard. However, I'm not sure that there'd be sufficient
> demand for this.
> 
> Kind regards,
> 
> Francis
> 
> 
> 
> -----Original Message-----
> From: keld at keldix.com [mailto:keld at keldix.com]
> Sent: 20 December 2016 09:37
> To: Francis Cave <francis at franciscave.com>
> Cc: 'MURATA Makoto' <eb2m-mrt at asahi-net.or.jp>; 'SC 34 WG4'
> <e-SC34-WG4 at ecma-international.org>
> Subject: Re: DR-16-0018: WML: need sort method for special characters 
> [for today's call]
> 
> Hi Francis
> 
> Yes, I understand that new requirements need an amendment or revision.
> 
> However, this is not what I recommend as the immediate solution to 
> sorting special characters. I just propose to use the ISO 14651/UTS#10 
> kind of default spec.
> And no locale choice. 
> 
> Best regards
> keld
> 
> On Mon, Dec 19, 2016 at 05:27:25PM -0000, Francis Cave wrote:
> > 
> > Try again...
> > 
> > In the first paragraph, for "cannot be done" read "can only be done".
> > Clearly a serious finger malfunction...
> > 
> > Francis
> > 
> > 
> > 
> > -----Original Message-----
> > From: Francis Cave [mailto:francis at franciscave.com]
> > Sent: 19 December 2016 16:41
> > To: keld at keldix.com; 'MURATA Makoto' <eb2m-mrt at asahi-net.or.jp>
> > Cc: 'SC 34 WG4' <e-SC34-WG4 at ecma-international.org>
> > Subject: RE: DR-16-0018: WML: need sort method for special 
> > characters [for today's call]
> > 
> > 
> > In the first paragraph or "cannot be done" ready "can only be done".
> > Sorry!
> > 
> > Francis
> > 
> > 
> > 
> > -----Original Message-----
> > From: Francis Cave [mailto:francis at franciscave.com]
> > Sent: 19 December 2016 16:33
> > To: keld at keldix.com; 'MURATA Makoto' <eb2m-mrt at asahi-net.or.jp>
> > Cc: 'SC 34 WG4' <e-SC34-WG4 at ecma-international.org>
> > Subject: RE: DR-16-0018: WML: need sort method for special 
> > characters [for today's call]
> > 
> > 
> > Keld
> > 
> > You may or may not be correct in your view that there is a user 
> > requirement that fields be sortable according to the current locale.
> > However, if there is such a user requirement, it isn't currently met 
> > by OOXML or by implementations. Support for specifying the current 
> > locale as the sort method would involve an extension to OOXML and as 
> > such cannot be done by amendment or revision of the standard, not by 
> > Technical Corrigendum, so this is a big deal. Although this might be 
> > a relatively simple change, e.g. by changing ST_SortMethod to allow 
> > any string (§18.18.73 of ISO/IEC 29500-1:2016), it could not be made 
> > mandatory for implementations to support these new values without 
> > breaking existing implementations, and if they do not already do so, 
> > that suggests that there hasn't been much market pressure that would
> > persuade them to implement the enhancement.
> > 
> > I note that ODF doesn't appear to have this feature either (although 
> > it has a similar feature – §19.865 text:sort-algorithm – for sorting 
> > text, e.g. bibliographies), and in §19.685 table:order there is the
> > following note:
> > 
> > 	Note: Sorting is locale and implementation-dependent.
> > 
> > It is hard to avoid the conclusion that demand for this feature is 
> > very limited, at least in office document applications.
> > 
> > Francis
> > 
> > 
> > 
> > -----Original Message-----
> > From: keld at keldix.com [mailto:keld at keldix.com]
> > Sent: 18 December 2016 20:19
> > To: MURATA Makoto <eb2m-mrt at asahi-net.or.jp>
> > Cc: SC 34 WG4 <e-SC34-WG4 at ecma-international.org>
> > Subject: Re: DR-16-0018: WML: need sort method for special 
> > characters [for today's call]
> > 
> > I am not aware of the reason why this is so.
> > 
> > But anyway, why not then use the 14651 tailorable ordering in its 
> > template form, which is equivalent to UTS#10, as the universal 
> > sorting in OOXML?
> > 
> > I would think it was a user requirement that fields are sortable 
> > according to the current locale, eg a list of names.
> > 
> > Having the sorting order not being changeable creates troubles for 
> > users too.
> > 
> > best regards
> > keld
> > 
> > 
> > On Mon, Dec 19, 2016 at 04:48:21AM +0900, MURATA Makoto wrote:
> > > The sort order of Excel cannot be changed without causing troubles 
> > > to users.
> > > 
> > > Regards,
> > > Makoto
> > > 
> > > 2016-12-08 3:17 GMT+09:00 Keld Simonsen <keld at keldix.com>:
> > > 
> > > > I propose that you use the locale of the current process, and 
> > > > the implied sorting sequence for special characters there, or at 
> > > > least the sorting specified in ISO/IEC 14651
> > > > or the equivalent Unicode specification.
> > > >
> > > > best regards
> > > > keld Simonsen
> > > >
> > > 
> > > 
> > > 
> > > --
> > > 
> > > Praying for the victims of the Japan Tohoku earthquake
> > > 
> > > Makoto
> > 
> > 
> > 
> > 
> > 
> 



