COR3 issue: ST_Hint
John Haug
johnhaug at exchange.microsoft.com
Sat Oct 17 00:32:11 CEST 2015
I toyed with the language a bit based on what I think you’re going for.
This simple type specifies information used to decide how to format any characters in the current run for which the font type is otherwise ambiguous.
Certain characters can be mapped into more than one of the font slot categories described in the parent element. This attribute shall be used to determine how ambiguities in this run shall be handled. [Note: This can be used to handle the formatting on the paragraph mark glyph, and other characters that are not stored as text in the WordprocessingML document. Some printable characters can be mapped to more than one font slot, such as Unicode glyph U+2026 ‘HORIZONTAL ELLIPSIS’. end note]
Does that get across everything discussed below?
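For readers following the thread, here is a minimal sketch of where the attribute sits in markup, assuming the usual WordprocessingML namespace and a w:rFonts element carrying w:hint. The font names, the sample run, and the helper read_hint are illustrative assumptions and are not taken from the attached draft; treating a missing attribute as "default" is likewise an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Illustrative run only: the font names are placeholders, not from the thread.
RUN_XML = f"""
<w:r xmlns:w="{W_NS}">
  <w:rPr>
    <w:rFonts w:hint="eastAsia" w:ascii="Times New Roman" w:eastAsia="MS Mincho"/>
  </w:rPr>
  <w:t>\u2026</w:t>
</w:r>
"""


def read_hint(run_xml: str) -> str:
    """Return the w:hint value on w:rFonts, or "default" when it is absent.

    Falling back to "default" is an assumption made for this sketch.
    """
    run = ET.fromstring(run_xml)
    r_fonts = run.find(f"{{{W_NS}}}rPr/{{{W_NS}}}rFonts")
    if r_fonts is None:
        return "default"
    return r_fonts.get(f"{{{W_NS}}}hint", "default")


if __name__ == "__main__":
    print(read_hint(RUN_XML))
```

The run contains only U+2026, the character discussed below as one whose font slot is ambiguous, so the hint is what a consumer would consult to resolve it.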
From: eb2mmrt at gmail.com [mailto:eb2mmrt at gmail.com] On Behalf Of MURATA Makoto
Sent: Friday, October 16, 2015 3:25 PM
To: John Haug <johnhaug at exchange.microsoft.com>
Cc: SC 34 WG4 <e-SC34-WG4 at ecma-international.org>; suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp>
Subject: Re: COR3 issue: ST_Hint
Oops, I intended to make this note look like
an example. But dropping "primarily" would
not do that.
How about
[Example: This can be used to handle the formatting on the paragraph mark glyph,
and other characters that are not stored as text in the WordprocessingML document. end example].
2015-10-17 6:47 GMT+09:00 John Haug <johnhaug at exchange.microsoft.com>:
FYI, that was part of the original text. If we drop that, then the note says this simple type is for the paragraph mark, but we mention ellipsis as an example of a character stored in the document that has mapping ambiguity. Would that confuse?
From: eb2mmrt at gmail.com [mailto:eb2mmrt at gmail.com] On Behalf Of MURATA Makoto
Sent: Friday, October 16, 2015 2:28 PM
To: John Haug <johnhaug at exchange.microsoft.com>
Cc: SC 34 WG4 <e-SC34-WG4 at ecma-international.org>; suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp>
Subject: Re: COR3 issue: ST_Hint
Looks good, but I would like to drop "primarily".
Regards,
Makoto
2015-10-17 3:06 GMT+09:00 John Haug <johnhaug at exchange.microsoft.com>:
How about:
For example, certain characters are not explicitly stored in the document or can otherwise be mapped into more than one of the font slot categories [Example: Unicode glyph U+2026 ‘HORIZONTAL ELLIPSIS’ end example] described in the parent element. This attribute shall be used to determine how ambiguities in this run shall be handled. [Note: This is primarily used to handle the formatting on the paragraph mark glyph, and other characters that are not stored as text in the WordprocessingML document. end note]
John
From: eb2mmrt at gmail.com [mailto:eb2mmrt at gmail.com] On Behalf Of MURATA Makoto
Sent: Thursday, October 15, 2015 4:12 PM
To: SC 34 WG4 <e-SC34-WG4 at ecma-international.org>
Cc: suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp>
Subject: Re: COR3 issue: ST_Hint
2015-10-16 7:51 GMT+09:00 John Haug <johnhaug at exchange.microsoft.com>:
Does U+2026 actually work that way in WML? If so, we could note it as an example.
Absolutely yes. I learned this from Ishii-san.
I wasn’t sure what to say about cs, so I left it unsaid, since it’s all dependent on the parent element anyway, which is noted in the second paragraph. If there are any suggestions…
I spoke with Suzuki-san in a recent meeting of the Japanese mirror.
We agreed that we do not know what should be said... 😩
Regards,
Makoto
From: eb2mmrt at gmail.com [mailto:eb2mmrt at gmail.com] On Behalf Of MURATA Makoto
Sent: Sunday, October 4, 2015 1:44 AM
To: John Haug <johnhaug at exchange.microsoft.com>
Cc: SC 34 WG4 <e-SC34-WG4 at ecma-international.org>; suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp>
Subject: Re: COR3 issue: ST_Hint
Shouldn't we say why "cs" was kept although it never appears in the
table in Part 1, §17.3.2.26, “rFonts (Run Fonts)”? Just like us, users
will be confused.
Regards,
Makoto
2015-10-04 17:20 GMT+09:00 MURATA Makoto <eb2m-mrt at asahi-net.or.jp>:
I think that we should mention U+2026 (Ellipsis) <http://www.fileformat.info/info/unicode/char/2026/index.htm>. This is a very important use
case for Japanese users of the hint attribute.
https://en.wikipedia.org/wiki/Ellipsis
Regards,
Makoto
2015-10-01 9:32 GMT+09:00 John Haug <johnhaug at exchange.microsoft.com>:
I edited the doc as suggested, though rather than saying (only) for characters which are not explicitly stored… I made it less specific. I believe someone said there are some printable characters that could map to different font slots. Attached. How does that work?
John
-----Original Message-----
From: suzuki toshiya [mailto:mpsuzuki at hiroshima-u.ac.jp]
Sent: Thursday, September 24, 2015 4:33 AM
To: Arms, Caroline <caar at loc.gov>
Cc: John Haug <johnhaug at exchange.microsoft.com>; SC 34 WG4 <e-SC34-WG4 at ecma-international.org>
Subject: Re: [sc34wg4] RE: COR3 issue: ST_Hint
I have no problem with the proposed improvement of the text.
Regards,
mpsuzuki
________________________________
From: Arms, Caroline [mailto:caar at loc.gov]
Sent: Thursday, September 24, 2015 3:20 AM
To: John Haug <johnhaug at exchange.microsoft.com>; SC 34 WG4 <e-SC34-WG4 at ecma-international.org>; suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp>
Subject: RE: COR3 issue: ST_Hint
John,
From a grammatical viewpoint, I certainly don’t like “multiple of the”. What about “more than one of the”?
I found the paragraph beginning “There are” rather hard to grasp. I really needed to read the Note to understand it. The dogmatic “There are” seemed odd. And “arbitrate that conflict” and “determine how ambiguities in this run shall be handled” seem to be saying the same thing.
Would there be any problem with:
For characters which are not explicitly stored in the document, and which can be mapped into more than one of the font slot categories described in the parent element, this attribute shall be used to determine how ambiguities in this run shall be handled.
Just a thought.
Caroline
Caroline Arms
Library of Congress Contractor
Co-compiler of Sustainability of Digital Formats resource http://www.digitalpreservation.gov/formats/
** Views expressed are personal and not necessarily those of the institution **
From: John Haug [mailto:johnhaug at exchange.microsoft.com]
Sent: Thursday, September 24, 2015 4:44 AM
To: SC 34 WG4; suzuki toshiya
Subject: RE: COR3 issue: ST_Hint
Keeping this in the existing e-mail thread. I've attached the final text we arrived at during the face-to-face meeting in Beijing. In short, I modified the descriptive text to make it less prescriptive and to indicate that the attribute values are indicators that only have meaning within the context of the parent element containing the attribute that uses this simple type.
John
From: John Haug [mailto:johnhaug at exchange.microsoft.com]
Sent: Friday, September 18, 2015 3:28 PM
To: SC 34 WG4 <e-SC34-WG4 at ecma-international.org>; suzuki toshiya <mpsuzuki at hiroshima-u.ac.jp>
Subject: RE: COR3 issue: ST_Hint
Suzuki-san offered these comments on DR 09-0040 along with others in another e-mail:
DR 09-0040
----------
Can the "hint" attribute take "cs" (complex script) for font slot selection?
The reason why "cs" was to be removed is as follows. By default, the font slot (from ascii/hAnsi/ea/cs) is selected by the codepoint of the character to be rendered.
The selection is also forcibly changed by the w:cs or w:rtl elements.
The default slot selection by codepoint is sometimes not unique but configurable (e.g. the ellipsis is rendered with the ascii slot by default, but this can be changed to the ea slot by setting hint to ea). For such configurable cases, the codepoint-to-slot table has special descriptions like "use X slot, but if hint is set to Y, Y slot is used".
We should be careful to note that the hint does not change everything; it changes only the configurable parts.
At present, the codepoint-to-slot table has no character for which the cs slot is used by default, or for which the cs slot could be selected by setting hint appropriately. If the table is correct, I doubt that setting hint to cs has any effect.
I think this was the background to why cs would be removed from the possible values of hint.
There would be a few rationales for permitting the "cs" value in hint:
a) the table was incorrect; some codepoints could be configured to use the cs font slot by setting hint to cs.
b) setting hint to cs makes everything render with the cs font slot, as w:cs and w:rtl do.
c) setting hint to cs does nothing, but the value should be permitted because existing implementations could write it (treating it as "invalid" is problematic).
I think any of the above three rationales would be a reasonable basis for permitting "cs" in hint; however, what happens when hint is set to cs should be clarified.
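The selection logic described in these comments can be sketched roughly as follows. This is only an illustration under stated assumptions: SLOT_TABLE is a hypothetical three-row excerpt, not the table added to 17.3.2.26; only the U+2026 row reflects behaviour described in this thread; the fallback slot for unlisted codepoints is an arbitrary choice; and select_slot and force_cs are names invented for the sketch.

```python
# Hypothetical excerpt of a codepoint-to-slot table. Each entry gives the
# default slot and, for configurable codepoints, the slot chosen per hint
# value. Only the U+2026 row reflects behaviour described in the thread;
# the other rows are placeholders for illustration.
SLOT_TABLE = {
    0x2026: ("ascii", {"eastAsia": "ea"}),  # ellipsis: configurable via hint
    0x0041: ("ascii", {}),                  # 'A': no hint-dependent entry
    0x3042: ("ea", {}),                     # hiragana 'a': no hint-dependent entry
}


def select_slot(codepoint: int, hint: str = "default", force_cs: bool = False) -> str:
    """Pick a font slot for one character.

    force_cs stands in for the w:cs / w:rtl elements, which the comments
    say forcibly change the slot regardless of the codepoint.
    """
    if force_cs:
        return "cs"
    # Fallback for codepoints missing from this excerpt is arbitrary.
    default_slot, overrides = SLOT_TABLE.get(codepoint, ("hAnsi", {}))
    # The hint only matters where the table lists an override for it.
    return overrides.get(hint, default_slot)


if __name__ == "__main__":
    print(select_slot(0x2026))                   # default slot for the ellipsis
    print(select_slot(0x2026, hint="eastAsia"))  # hint switches the slot
    print(select_slot(0x2026, hint="cs"))        # no "cs" override in the table
```

With a table shaped like this, hint="cs" never matches an override, so it falls through to the default slot. That corresponds to rationale (c) above; rationales (a) and (b) would instead require extra table rows or a branch like the force_cs one.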
I can confirm after talking with a developer here that Word will write hint=”cs” and that it appears only to be used when Word writes list formatting in HTML output. So, hopefully we can agree to keep the attribute value and we are all correct that it has no effect on the giant table added to 17.3.2.26 rFonts by this DR. Assuming there are no other concerns, we should take a little time to tweak the text on the cs description (and possibly reflect the same in eastAsia) to sound less directive since the actions implementers are to take are defined well in the table added to 17.3.2.26.
Thanks for the analysis, Suzuki-san!
John
From: John Haug [mailto:johnhaug at exchange.microsoft.com]
Sent: Wednesday, September 9, 2015 4:06 PM
To: 'SC 34 WG4' <e-SC34-WG4 at ecma-international.org>
Subject: RE: COR3 issue: ST_Hint
We discussed this on the last teleconference. Nobody remembered the details behind this and some old e-mail dug up by Murata-san and Caroline didn’t show a specific decision or reason. Chris and I looked through a trove of public files we have access to and “cs” does appear in some. I think the right resolution here is to keep “cs”.
John
From: John Haug [mailto:johnhaug at exchange.microsoft.com]
Sent: Friday, August 14, 2015 2:52 PM
To: 'SC 34 WG4' <e-SC34-WG4 at ecma-international.org>
Subject: COR3 issue: ST_Hint
Background: E-mail “Leftover from DR 09-0040” (latest reply 2015-08-03)
Summary:
ST_Hint (DR 09-0040)
• DR removed “cs” value
• --> Murata-san to remove "cs" from schemas
• ** Only the value “default” remains; where did “eastAsia” get removed? DR 09-0040 adds lots of text referring to “cs” and “eastAsia” values of the hint attribute for some CTs, which is of type ST_Hint. DR 09-0040 is therefore internally inconsistent and should not be applied as is. I haven’t yet rediscovered why ST_Hint had items removed.
Rex> I just checked MM’s mail, “Leftover from DR 09-0040” of 2015-08-02, and his proposed schema fix only removes cs; eastAsia is still there.
However, on close inspection, the Enumeration Value table in the COR (see entry #81) is written as if it were the complete table. As eastAsia was never intended to be removed, I should have shown an empty row after the default row, containing …, indicating that the remainder of the table stays as is. And even though the COR didn’t contain that …, when I applied the COR to 2012, I did NOT remove eastAsia, because there was no delete (strike-through in red) instruction in the COR to do so.
Regarding “DR 09-0040 adds lots of text referring to “cs” and “eastAsia” values of the hint attribute for some CTs, which is of type ST_Hint. DR 09-0040 is therefore internally inconsistent and should not be applied as is”, I see there is a cs element APART from the cs enumeration value in ST_Hint. (I also see that eastAsia is an attribute name as well as an enum value.)
I think this needs to be looked at to determine whether the cs attribute value should be removed and to confirm whether eastAsia was to be removed (and if so whether it ought to). Because there is such a significant amount of text changed here, I think publishing as is would be a problem and that we should correct this before the Beijing meeting so the new DCOR ballot (COR3b?) will have it.
--
Praying for the victims of the Japan Tohoku earthquake
Makoto