<div dir="ltr">Keld,<div><br></div><div>Thank you. I did not know that. I am still looking for the tailoring for JIS X 4061, </div><div>but it is good to know it is somewhere in CLDR. Is there a tailoring for Chinese?</div><div><br></div><div>Regards,</div><div>Makoto</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">2017-07-19 19:04 GMT+09:00 <span dir="ltr"><<a href="mailto:keld@keldix.com" target="_blank">keld@keldix.com</a>></span>:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Dear Makoto-san<br>
<br>
I found out that JIS X 4061 is indeed supported in the Unicode collection of sorting specifications,<br>
which are tailorings of the 14651 specification:<br>
<br>
<a href="https://stackoverflow.com/questions/29874198/which-japanese-sorting-collation-orders-are-supported-by-icu-cldr-uca" rel="noreferrer" target="_blank">https://stackoverflow.com/<wbr>questions/29874198/which-<wbr>japanese-sorting-collation-<wbr>orders-are-supported-by-icu-<wbr>cldr-uca</a><br>
<br>
Best regards<br>
<span class="HOEnZb"><font color="#888888">keld<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
On Wed, Jul 19, 2017 at 11:33:03AM +0200, <a href="mailto:keld@keldix.com">keld@keldix.com</a> wrote:<br>
> I think that MS Excel uses 14651 or the equivalent Unicode UTS #10.<br>
> Microsoft normally uses Unicode specifications in many of their<br>
> products, and 14651/UTS #10 is readily available in Microsoft's<br>
> operating systems, as far as I know.<br>
><br>
> But we can find out.<br>
><br>
> Many applications nowadays use ISO 10646/Unicode, and just having the<br>
> sorting specifications for the characters of X0208 and X0201 is not enough in<br>
> today's Japanese environments, especially if you run modern<br>
> Microsoft operating systems and use the MS Office applications.<br>
><br>
> But you can have a tailoring of the 14651 template that is conforming to<br>
> JIS X 4061, so that all of the characters of X0208 and X0201 are sorted<br>
> correctly - and then the rest of the Japanese characters, including those in X0212,<br>
> are also sorted in a culturally acceptable order.<br>
><br>
> I remember from our previous meetings, Makoto-san, that you seem to be fond<br>
> of Unicode specifications. In this case the ISO and Unicode specs are technically the<br>
> same, so I hope that this fact will make you more positive about referencing<br>
> the ISO 14651 standard and template.<br>
><br>
> Best regards<br>
> Keld<br>
><br>
><br>
><br>
> On Wed, Jul 19, 2017 at 06:07:06PM +0900, MURATA Makoto wrote:<br>
> > But in Japan, we did not rewrite JIS X 4061 on the basis of ISO 14651.<br>
> > I do not know if China has created their tailoring of 14651. I do not<br>
> > think that MS Excel relies on 14651.<br>
> ><br>
> > I agree that 14651 is good for future projects. But I do not<br>
> > think that it is practically possible to document what MS Excel<br>
> > does using 14651.<br>
> ><br>
> > Regards,<br>
> > Makoto<br>
> ><br>
> > 2017-07-19 18:00 GMT+09:00 <<a href="mailto:keld@keldix.com">keld@keldix.com</a>>:<br>
> ><br>
> > > Dear Makoto-san<br>
> > ><br>
> > > I have asked a sorting expert, who is the editor of 14651, and he says that<br>
> > > for Japanese and Chinese, there surely is a need for tailoring the sorting<br>
> > > template in 14651. I also know that for my own language, Danish, a tailoring<br>
> > > of 14651 is required.<br>
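<br>
The idea of tailoring a common template for one language, as in the Danish case mentioned above, can be sketched roughly as follows. This is only a toy illustration: the table covers a-z, and the names TEMPLATE, tailor, make_key and DANISH are hypothetical, not taken from ISO 14651, glibc or CLDR. The only assumption about Danish is that the letters æ, ø, å come after z in the alphabet.<br>
<pre>
```python
import unicodedata

# Toy "template": primary weights for a-z only (an assumption for this sketch,
# not the real ISO 14651 common template table).
TEMPLATE = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}


def tailor(template, extra_letters):
    """Return a copy of the table with extra letters placed after the last entry."""
    table = dict(template)
    for ch in extra_letters:
        table[ch] = len(table)
    return table


def make_key(table):
    """Build a sort key function that maps each letter to its weight in the table."""
    def key(word):
        weights = []
        for ch in unicodedata.normalize("NFC", word).lower():
            if ch not in table:
                # Fall back to the base letter, e.g. "å" becomes "a".
                ch = unicodedata.normalize("NFD", ch)[0]
            # Letters still unknown to the table sort after everything else.
            weights.append(table.get(ch, len(table)))
        return weights
    return key


# Hypothetical Danish-style tailoring: æ, ø, å are letters in their own right, after z.
DANISH = tailor(TEMPLATE, "æøå")

names = ["Åse", "Zoe", "Anna"]
untailored = sorted(names, key=make_key(TEMPLATE))  # Anna, Åse, Zoe
danish = sorted(names, key=make_key(DANISH))        # Anna, Zoe, Åse
```
</pre>
With the untailored table, "Åse" is filed under "a"; with the tailored table it moves after "Zoe", which is what a Danish reader would expect.<br>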
> > ><br>
> > > A number of tailorings are already available for this, for many languages;<br>
> > > they are in use in the marketplace and readily available as open source<br>
> > > specifications. There are many hundreds of them. I can come back with more<br>
> > > information.<br>
> > > But in the first place I can refer to the glibc specifications and the<br>
> > > CLDR collections of sorting specifications. They both build on the<br>
> > > ISO 14651 template.<br>
> > ><br>
> > > So I think a normative reference to ISO 14651 is very relevant, as this is<br>
> > > what is done in the marketplace, and glibc and CLDR sorting could then also<br>
> > > be mentioned in the bibliography.<br>
> > ><br>
> > > Best regards<br>
> > > keld<br>
> > ><br>
> > > On Tue, Jul 18, 2017 at 09:43:00PM +0900, MURATA Makoto wrote:<br>
> > > > ISO/IEC 14651 defines a reference comparison method and a common<br>
> > > > template table for ordering text data. It is necessary to tailor the<br>
> > > > common template table for a given language's ordering. For example,<br>
> > > > the order of CJK ideographic characters is based on UCS code points,<br>
> > > > and thus looks meaningless to human CJK users.<br>
> > > ><br>
> > > > Meanwhile, the Japanese standard (JIS X 4061:1996) defines ordering of<br>
> > > > Japanese text. JIS X 4061 is NOT based on ISO/IEC 14651. Its<br>
> > > > ordering is based on code points in JIS X 0208 and X0201 (which is<br>
> > > > roughly equal to US-ASCII). This ordering makes sense for Japanese<br>
> > > > users, since code points in these standards are based on the Japanese<br>
> > > > alphabetical order of the kana transcription of each Kanji.<br>
> > > ><br>
> > > > I do not believe that referencing ISO/IEC 14651 from OOXML is useful,<br>
> > > > unless we provide our own variation of the common template table.<br>
> > > > This variation should cover ordering in JIS X 0208. I suppose that it<br>
> > > > should also cover orderings in mainland China, Taiwan, Hong Kong, and<br>
> > > > many other areas. It might be technically possible to do so,<br>
> > > > but I do not think that it is practically possible.<br>
> > > ><br>
> > > > Regards,<br>
> > > > Makoto<br>
> > > ><br>
> > > > 2016-12-21 6:37 GMT+09:00 Francis Cave <<a href="mailto:francis@franciscave.com">francis@franciscave.com</a>>:<br>
> > > ><br>
> > > > ><br>
> > > > > Hi Keld<br>
> > > > ><br>
> > > > > Thanks for your suggestions. We could certainly consider passing these on<br>
> > > > > to the DR submitter as a possible approach to implementing the sort<br>
> > > > > method for special characters. Whether or not they would match how existing<br>
> > > > > implementers have tackled this issue is for those implementers to say,<br>
> > > > > should they choose to do so.<br>
> > > > ><br>
> > > > > Kind regards,<br>
> > > > ><br>
> > > > > Francis<br>
> > > > ><br>
> > > > ><br>
> > > > ><br>
> > > > > -----Original Message-----<br>
> > > > > From: <a href="mailto:keld@keldix.com">keld@keldix.com</a> [mailto:<a href="mailto:keld@keldix.com">keld@keldix.com</a>]<br>
> > > > > Sent: 20 December 2016 19:24<br>
> > > > > To: Francis Cave <<a href="mailto:francis@franciscave.com">francis@franciscave.com</a>><br>
> > > > > Cc: 'MURATA Makoto' <<a href="mailto:eb2m-mrt@asahi-net.or.jp">eb2m-mrt@asahi-net.or.jp</a>>; 'SC 34 WG4'<br>
> > > > > <<a href="mailto:e-SC34-WG4@ecma-international.org">e-SC34-WG4@ecma-<wbr>international.org</a>><br>
> > > > > Subject: Re: DR-16-0018: WML: need sort method for special characters<br>
> > > > > [for today's call]<br>
> > > > ><br>
> > > > > Hi Francis<br>
> > > > ><br>
> > > > > I understand that there are 3 valid values for sortMethod: stroke,<br>
> > > > > pinYin and none.<br>
> > > > > The DR asks for what sorting to use for special characters.<br>
> > > > > I propose to use the one defined by the null tailoring of ISO 14651.<br>
> > > > ><br>
> > > > > As strings can contain any UCS characters, the ordering just mentioned<br>
> > > > > fits that bill nicely as it is defined on all UCS characters. ISO 14651<br>
> > > > > also addresses other codesets than UCS, if that is relevant.<br>
> > > > ><br>
> > > > > It would make a difference on the first accent, all other characters<br>
> > > > > considered equivalent, corresponding to the expected ordering in English<br>
> > > > > and many other languages.<br>
> > > > ><br>
> > > > > Maybe one could use extLst to further describe the sorting method. I would<br>
> > > > > advise using normal internationalisation mechanisms, such as those given by a<br>
> > > > > locale, as this is a well-known concept, and readily available in most<br>
> > > > > operating systems. This is not a new feature, so there is no need to have<br>
> > > > > an amendment or revision of the standard; it could be handled by a DR with<br>
> > > > > some guidance on how to do it.<br>
> > > > ><br>
> > > > > Some suggested advice could be: if the extLst is empty, then the<br>
> > > > > associated locale of the current environment should be used.<br>
> > > > > By this I mean the language setting for the text in question; it could be<br>
> > > > > an English part of a Spanish document.<br>
> > > > ><br>
> > > > > extLst could also be a name; in that case it should be the name of an<br>
> > > > > implementation-defined locale of the operating system, of which the sorting<br>
> > > > > spec is to be used.<br>
> > > > ><br>
> > > > > I don't know if a new DR is needed or this could be part of the answer to<br>
> > > > > DR-16-0018.<br>
> > > > ><br>
> > > > > Best regards<br>
> > > > > keld<br>
> > > > ><br>
> > > > > On Tue, Dec 20, 2016 at 04:09:41PM -0000, Francis Cave wrote:<br>
> > > > > ><br>
> > > > > > Hi Keld<br>
> > > > > ><br>
> > > > > > My main concern is whether the sort method, if specified, affects what<br>
> > > > > > the user sees when they open a document. Suppose that, for the sake of<br>
> > > > > > argument, the default sort method of implementation A sorts accented<br>
> > > > > > text according to the "normal" approach (first accent difference<br>
> > > > > > determines the order), while implementation B sorts accented text<br>
> > > > > > according to the "French" approach (last accent difference determines<br>
> > > > > > the order). Suppose a spreadsheet is created by implementation A and<br>
> > > > > > has an auto-filter applied that sorts according to its default sort<br>
> > > > > > method. If this spreadsheet is subsequently opened by implementation<br>
> > > > > > B, will it appear the same as if it was re-opened by implementation A,<br>
> > > > > > or will implementation B's default sort method be automatically<br>
> > > > > > applied? My guess is that generally the latter will be the case, but this<br>
> > > > > > is obviously implementation-dependent.<br>
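<br>
The difference between the "normal" approach (first accent difference decides) and the "French" approach (last accent difference decides) described above can be sketched as follows. This is a simplified two-level key written for illustration only; it is not the ISO 14651 or UTS #10 algorithm, and the helper names _levels and sort_key are hypothetical.<br>
<pre>
```python
import unicodedata


def _levels(word):
    """Split a word into base letters and one tuple of accent marks per letter."""
    bases, accents = [], []
    for ch in unicodedata.normalize("NFD", word):
        if unicodedata.combining(ch):
            # Attach the combining mark to the preceding base letter, if any.
            if accents:
                accents[-1] += (ord(ch),)
        else:
            bases.append(ch.lower())
            accents.append(())
    return bases, accents


def sort_key(word, backward_accents=False):
    """Compare base letters first, then accents (forward, or backward as in French)."""
    bases, accents = _levels(word)
    if backward_accents:
        # Scan accents from the end of the word, so the last difference decides.
        accents = accents[::-1]
    return (bases, accents)


words = ["côté", "cote", "coté", "côte"]

# Implementation A in the example: first accent difference decides.
forward = sorted(words, key=sort_key)  # cote, coté, côte, côté

# Implementation B in the example: last accent difference decides.
french = sorted(words, key=lambda w: sort_key(w, backward_accents=True))  # cote, côte, coté, côté
```
</pre>
The same four strings come out in a different order under the two keys, which is the situation where a document sorted by one implementation would look different if re-sorted by the other.<br>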
> > > > > ><br>
> > > > > > I suspect that this is a case where implementations must be free to<br>
> > > > > > choose alternative approaches that aren't fully interoperable. Both<br>
> > > > > > implementations A and B in my example will have had good reasons for<br>
> > > > > > choosing different sort methods, e.g. based upon market demands.<br>
> > > > > ><br>
> > > > > > It might be nice if the sort method were spelt out in the document,<br>
> > > > > > but this would definitely be a new feature. For now, in response to DR<br>
> > > > > > 16-0018, I think we should simply ensure that the specification is<br>
> > > > > > consistent with the schema and is clear about what are meant by the<br>
> > > > > > existing values of ST_SortMethod, i.e. 'none', 'pinYin' and 'stroke'.<br>
> > > > > ><br>
> > > > > > Interestingly, I note that the content model of sortState includes the<br>
> > > > > > application-defined extension element extLst, which could be used to<br>
> > > > > > specify the sort state in more detail, using MCE. So, in theory, it<br>
> > > > > > would be possible to define an extension to OOXML without having to<br>
> > > > > > amend the base standard. However, I'm not sure that there'd be sufficient<br>
> > > > > > demand for this.<br>
> > > > > ><br>
> > > > > > Kind regards,<br>
> > > > > ><br>
> > > > > > Francis<br>
> > > > > ><br>
> > > > > ><br>
> > > > > ><br>
> > > > > > -----Original Message-----<br>
> > > > > > From: <a href="mailto:keld@keldix.com">keld@keldix.com</a> [mailto:<a href="mailto:keld@keldix.com">keld@keldix.com</a>]<br>
> > > > > > Sent: 20 December 2016 09:37<br>
> > > > > > To: Francis Cave <<a href="mailto:francis@franciscave.com">francis@franciscave.com</a>><br>
> > > > > > Cc: 'MURATA Makoto' <<a href="mailto:eb2m-mrt@asahi-net.or.jp">eb2m-mrt@asahi-net.or.jp</a>>; 'SC 34 WG4'<br>
> > > > > > <<a href="mailto:e-SC34-WG4@ecma-international.org">e-SC34-WG4@ecma-<wbr>international.org</a>><br>
> > > > > > Subject: Re: DR-16-0018: WML: need sort method for special characters<br>
> > > > > > [for today's call]<br>
> > > > > ><br>
> > > > > > Hi Francis<br>
> > > > > ><br>
> > > > > > Yes, I understand that new requirements need an amendment or revision.<br>
> > > > > ><br>
> > > > > > However, this is not what I recommend as the immediate solution to<br>
> > > > > > sorting special characters. I just propose to use the ISO 14651/UTS#10<br>
> > > > > > kind of default spec.<br>
> > > > > > And no locale choice.<br>
> > > > > ><br>
> > > > > > Best regards<br>
> > > > > > keld<br>
> > > > > ><br>
> > > > > > On Mon, Dec 19, 2016 at 05:27:25PM -0000, Francis Cave wrote:<br>
> > > > > > ><br>
> > > > > > > Try again...<br>
> > > > > > ><br>
> > > > > > > In the first paragraph, for "cannot be done" read "can only be done".<br>
> > > > > > > Clearly a serious finger malfunction...<br>
> > > > > > ><br>
> > > > > > > Francis<br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > > -----Original Message-----<br>
> > > > > > > From: Francis Cave [mailto:<a href="mailto:francis@franciscave.com">francis@franciscave.<wbr>com</a>]<br>
> > > > > > > Sent: 19 December 2016 16:41<br>
> > > > > > > To: <a href="mailto:keld@keldix.com">keld@keldix.com</a>; 'MURATA Makoto' <<a href="mailto:eb2m-mrt@asahi-net.or.jp">eb2m-mrt@asahi-net.or.jp</a>><br>
> > > > > > > Cc: 'SC 34 WG4' <<a href="mailto:e-SC34-WG4@ecma-international.org">e-SC34-WG4@ecma-<wbr>international.org</a>><br>
> > > > > > > Subject: RE: DR-16-0018: WML: need sort method for special<br>
> > > > > > > characters [for today's call]<br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > > In the first paragraph or "cannot be done" ready "can only be<br>
> > > done".<br>
> > > > > > Sorry!<br>
> > > > > > ><br>
> > > > > > > Francis<br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > > -----Original Message-----<br>
> > > > > > > From: Francis Cave [mailto:<a href="mailto:francis@franciscave.com">francis@franciscave.<wbr>com</a>]<br>
> > > > > > > Sent: 19 December 2016 16:33<br>
> > > > > > > To: <a href="mailto:keld@keldix.com">keld@keldix.com</a>; 'MURATA Makoto' <<a href="mailto:eb2m-mrt@asahi-net.or.jp">eb2m-mrt@asahi-net.or.jp</a>><br>
> > > > > > > Cc: 'SC 34 WG4' <<a href="mailto:e-SC34-WG4@ecma-international.org">e-SC34-WG4@ecma-<wbr>international.org</a>><br>
> > > > > > > Subject: RE: DR-16-0018: WML: need sort method for special<br>
> > > > > > > characters [for today's call]<br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > > Keld<br>
> > > > > > ><br>
> > > > > > > You may or may not be correct in your view that there is a user<br>
> > > > > > > requirement that fields be sortable according to the current locale.<br>
> > > > > > > However, if there is such a user requirement, it isn't currently met<br>
> > > > > > > by OOXML or by implementations. Support for specifying the current<br>
> > > > > > > locale as the sort method would involve an extension to OOXML and as<br>
> > > > > > > such cannot be done by amendment or revision of the standard, not by<br>
> > > > > > > Technical Corrigendum, so this is a big deal. Although this might be<br>
> > > > > > > a relatively simple change, e.g. by changing ST_SortMethod to allow<br>
> > > > > > > any string (§18.18.73 of ISO/IEC 29500-1:2016), it could not be made<br>
> > > > > > > mandatory for implementations to support these new values without<br>
> > > > > > > breaking existing implementations, and if they do not already do so,<br>
> > > > > > > that suggests that there hasn't been much market pressure that would<br>
> > > > > > > persuade them to implement the enhancement.<br>
> > > > > > ><br>
> > > > > > > I note that ODF doesn't appear to have this feature either (although<br>
> > > > > > > it has a similar feature, §19.865 text:sort-algorithm, for sorting<br>
> > > > > > > text, e.g. bibliographies), and in §19.685 table:order there is the<br>
> > > > > > > following note:<br>
> > > > > > ><br>
> > > > > > > Note: Sorting is locale and implementation-dependent.<br>
> > > > > > ><br>
> > > > > > > It is hard to avoid the conclusion that demand for this feature is<br>
> > > > > > > very limited, at least in office document applications.<br>
> > > > > > ><br>
> > > > > > > Francis<br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > > -----Original Message-----<br>
> > > > > > > From: <a href="mailto:keld@keldix.com">keld@keldix.com</a> [mailto:<a href="mailto:keld@keldix.com">keld@keldix.com</a>]<br>
> > > > > > > Sent: 18 December 2016 20:19<br>
> > > > > > > To: MURATA Makoto <<a href="mailto:eb2m-mrt@asahi-net.or.jp">eb2m-mrt@asahi-net.or.jp</a>><br>
> > > > > > > Cc: SC 34 WG4 <<a href="mailto:e-SC34-WG4@ecma-international.org">e-SC34-WG4@ecma-<wbr>international.org</a>><br>
> > > > > > > Subject: Re: DR-16-0018: WML: need sort method for special<br>
> > > > > > > characters [for today's call]<br>
> > > > > > ><br>
> > > > > > > I am not aware of the reason why this is so.<br>
> > > > > > ><br>
> > > > > > > But anyway, why not then use the 14651 tailorable ordering in its<br>
> > > > > > > template form, which is equivalent to UTS#10, as the universal<br>
> > > > > > > sorting in OOXML?<br>
> > > > > > ><br>
> > > > > > > I would think it was a user requirement that fields are sortable<br>
> > > > > > > according to the current locale, eg a list of names.<br>
> > > > > > ><br>
> > > > > > > A sorting order that cannot be changed creates trouble for<br>
> > > > > > > users too.<br>
> > > > > > ><br>
> > > > > > > best regards<br>
> > > > > > > keld<br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > > On Mon, Dec 19, 2016 at 04:48:21AM +0900, MURATA Makoto wrote:<br>
> > > > > > > > The sort order of Excel cannot be changed without causing trouble<br>
> > > > > > > > to users.<br>
> > > > > > > ><br>
> > > > > > > > Regards,<br>
> > > > > > > > Makoto<br>
> > > > > > > ><br>
> > > > > > > > 2016-12-08 3:17 GMT+09:00 Keld Simonsen <<a href="mailto:keld@keldix.com">keld@keldix.com</a>>:<br>
> > > > > > > ><br>
> > > > > > > > > I propose that you use the locale of the current process, and<br>
> > > > > > > > > the implied sorting sequence for special characters there, or at<br>
> > > > > > > > > least the sorting specified in ISO/IEC 14651<br>
> > > > > > > > > or the equivalent Unicode specification.<br>
> > > > > > > > ><br>
> > > > > > > > > best regards<br>
> > > > > > > > > keld Simonsen<br>
> > > > > > > > ><br>
> > > > > > > ><br>
> > > > > > > ><br>
> > > > > > > ><br>
> > > > > > > > --<br>
> > > > > > > ><br>
> > > > > > > > Praying for the victims of the Japan Tohoku earthquake<br>
> > > > > > > ><br>
> > > > > > > > Makoto<br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > > ><br>
> > > > > ><br>
> > > > ><br>
> > > > ><br>
> > > > ><br>
> > > ><br>
> > > ><br>
> > > > --<br>
> > > ><br>
> > > > Praying for the victims of the Japan Tohoku earthquake<br>
> > > ><br>
> > > > Makoto<br>
> > ><br>
> ><br>
> ><br>
> ><br>
> > --<br>
> ><br>
> > Praying for the victims of the Japan Tohoku earthquake<br>
> ><br>
> > Makoto<br>
</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature" data-smartmail="gmail_signature"><br>Praying for the victims of the Japan Tohoku earthquake<br><br>Makoto</div>
</div>