DR 18-0002: SML: Sorting CJK ideographic characters based on Japanese phonetics

MURATA Makoto eb2m-mrt at asahi-net.or.jp
Thu Sep 6 04:53:54 CEST 2018


Here is my proposal.

Add a normative paragraph and a note.

When a String Item (si) of a cell contains Phonetic Run (rPh)
elements, sorting shall use such rPh elements rather than the text (t)
elements representing the base text of the rPh elements.

[Example:

Suppose that the content of a cell is

    <si>
        <t>正直</t>
        <rPh sb="0" eb="2">
            <t>ショウジキ</t>
        </rPh>
        <phoneticPr fontId="1"/>
    </si>

and that of another cell is

    <si>
        <t>正直</t>
        <rPh sb="0" eb="2">
            <t>マサナオ</t>
        </rPh>
        <phoneticPr fontId="1"/>
    </si>

Although 正直 is visually rendered for these two cells,
sorting uses ショウジキ for the first cell and
uses マサナオ for the second cell.

end example]
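
To illustrate the intent of the paragraph above, here is a minimal sketch in Python. It is not part of the proposed text. The function name sort_key is hypothetical, the fragments are un-namespaced as in the example (real SpreadsheetML parts carry a namespace), and plain code-point comparison is only a stand-in for whatever Japanese collation an implementation actually applies. Concatenating all rPh texts is also an assumption; the proposal does not say how to handle a cell whose phonetic runs cover only part of the base text.

```python
import xml.etree.ElementTree as ET


def sort_key(si_xml: str) -> str:
    """Return the phonetic text of an <si> if it has rPh elements, else its base text."""
    si = ET.fromstring(si_xml)
    # Text of the phonetic runs: <rPh><t>...</t></rPh>
    phonetic = "".join(t.text or "" for t in si.findall("./rPh/t"))
    if phonetic:
        return phonetic
    # No phonetic runs: fall back to the base text in the direct <t> children.
    return "".join(t.text or "" for t in si.findall("./t"))


SHOJIKI = (
    '<si><t>正直</t><rPh sb="0" eb="2"><t>ショウジキ</t></rPh>'
    '<phoneticPr fontId="1"/></si>'
)
MASANAO = (
    '<si><t>正直</t><rPh sb="0" eb="2"><t>マサナオ</t></rPh>'
    '<phoneticPr fontId="1"/></si>'
)

# Both cells render as 正直, but they sort by their kana readings.
# Code-point order is an assumption standing in for real collation.
ordered = sorted([MASANAO, SHOJIKI], key=sort_key)
```

With these two cells, the keys are ショウジキ and マサナオ, so the two cells are distinguishable for sorting even though their base text is identical.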

Regards,
Makoto

On Sun, Jun 24, 2018 at 19:13, Rex Jaeschke <rex at rexjaeschke.com> wrote:

> Attached is the DR log entry with my proposed note included for review.
>
> The other attachments are examples Murata-san produced when researching
> this topic. I have chosen to write a simple note **without** including
> any of his specific examples.
>
> Rex
>
> *From:* eb2mmrt at gmail.com <eb2mmrt at gmail.com> *On Behalf Of *MURATA Makoto
> *Sent:* Wednesday, June 6, 2018 10:03 PM
> *To:* Rex Jaeschke <Rex at rexjaeschke.com>
> *Subject:* Examples for kana-based sorting
>
> Rex,
>
> I created four SML documents.  They share the same sharedStrings.xml.
> These SML documents were created by modifying the value of the sortMethod
> attribute.
>
> The sharedStrings.xml shows some combinations of kana and CJK ideographic
> characters.
>
> 1) 正直 without Kana
> 2) 正直 with ショウジキ as Kana
> 3) 正直 with マサナオ as Kana
> 4) 政直 with マサナオ as Kana
> 5) 政直 without Kana
>
> I also added two cells containing す and み, respectively.
>
> The four documents exhibit different behaviors.  Both sortMethod="none" and
> the absence of this attribute provide Kana-based sorting.  In other words,
> if Kana is present, the CJK ideographic characters are ignored.  Meanwhile,
> sortMethod="pinYin" and sortMethod="stroke" appear to ignore Kana.
>
> It appears that sortMethod="pinYin" is converted to sortMethod="stroke"
> when Excel saves the document.
>
> Regards,
> Makoto
>
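
The behavior reported in the quoted message can be summarized in a small Python sketch. This only restates those observations; it is not taken from the standard or from Excel itself. The names uses_phonetic_text and comparison_text are hypothetical, and the sketch only selects which text the comparison is based on, without implementing stroke or pinyin ordering.

```python
from typing import Optional


def uses_phonetic_text(sort_method: Optional[str]) -> bool:
    """Kana was observed to be used only when sortMethod is "none" or absent (None)."""
    return sort_method is None or sort_method == "none"


def comparison_text(base: str, kana: Optional[str], sort_method: Optional[str]) -> str:
    """Pick the text that sorting appears to be based on for one cell."""
    if kana and uses_phonetic_text(sort_method):
        # Kana present and kana-based sorting in effect: the ideographs are ignored.
        return kana
    # "pinYin" and "stroke" appeared to ignore kana; cells without kana use base text.
    return base


# Entry 3 of the quoted list (正直 with マサナオ) under each observed setting.
for method in (None, "none", "pinYin", "stroke"):
    print(method, comparison_text("正直", "マサナオ", method))
```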


-- 

Praying for the victims of the Japan Tohoku earthquake

Makoto

