RE: DR 09-0045 ― WML, Fonts: Character encodings of font names
Chris Rae
Chris.Rae at microsoft.com
Tue Oct 5 01:49:56 CEST 2010
Hi Suzuki-san - just getting back onto this now. I've made a reworked version incorporating your comments - what do you think of the attached?
Regarding guiding people towards the right sort of character encoding - are you thinking about this only in terms of font name encoding, or do you mean something wider (i.e. encoding anything that has to go into an IS 29500 file)?
Your thoughts,
Chris
-----Original Message-----
From: suzuki toshiya [mailto:mpsuzuki at hiroshima-u.ac.jp]
Sent: 06 September 2010 21:57
To: Chris Rae
Cc: e-SC34-WG4 at ecma-international.org; MURATA Makoto (mmurata at japan.email.ne.jp)
Subject: Re: DR 09-0045 ― WML, Fonts: Character encodings of font names
Dear Chris,
Thank you for the quick drafting.
I think it's far better than the original version, but I want to make the following points clearer:
1) a font file (embedded or external) may not have UTF-8 font names
2) copying font names from a font file is not recommended if their character encoding is incompatible with the XML part.
3) code conversion is usually required to handle such localized font names.
I drafted as:
--------------------------------------------------------------
Localized font names stored in embedded font resources or external font resources may be encoded in a character encoding that is incompatible with the character encodings supported by the XML parser.
To use these font names in the values of this attribute in the XML part, they should be converted to a character encoding suitable for the XML parser. Copying raw byte sequences from font files should be avoided. [Note: ISO/IEC 14496-22:2007 does not permit storing UTF-8 font names in the font file. end note]
--------------------------------------------------------------
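The conversion described in the draft can be sketched in Python as follows. This is only an illustration, not part of the proposed wording: the helper `font_name_to_xml`, its `source_codec` parameter, and the sample name are assumptions, and a real implementation would have to work out the source encoding from the font's own name records.

```python
from xml.sax.saxutils import quoteattr

def font_name_to_xml(raw_name: bytes, source_codec: str) -> bytes:
    """Decode a font name from its stored encoding and emit it as UTF-8 XML.

    Hypothetical helper: source_codec must be supplied by the caller,
    who is assumed to know how the font file stored the name.
    """
    # Step 1: turn the stored bytes into Unicode characters.
    name = raw_name.decode(source_codec)
    # Step 2: write the characters into the attribute value, quoted for XML.
    element = "<w:font w:name=%s/>" % quoteattr(name)
    # Step 3: serialize in an encoding every XML parser accepts.
    return element.encode("utf-8")

# Sample input: a name as it might be stored in Shift-JIS inside a font file.
raw = "明朝".encode("shift_jis")
xml_bytes = font_name_to_xml(raw, "shift_jis")
print(xml_bytes.decode("utf-8"))
```

The raw Shift-JIS bytes never reach the XML part; only the decoded characters do, re-encoded as UTF-8.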
Please give me your comments.
I have a question about the phrase "converted to the character encoding". What I have in mind is a conversion from "a Kanji expressed by a Shift-JIS codepoint" to "a Kanji expressed by a Unicode codepoint".
I want to exclude the ASCII-fy approach of converting "字" (ShiftJIS = 8E9A, UCS2 = U+5B57) to "x8Ex9A" (worst) or "x5Bx57" (worse) etc. This is because restoring such an "escaped byte sequence" to the original font name is not easy for most font management systems (they check for the existence of "x8Ex9A"; if not found, try "U+8E9A"; if not found, ...).
Is there good terminology to exclude such conversions?
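To make the distinction concrete, here is a minimal Python sketch contrasting the intended conversion with the two escaped forms mentioned above. The `installed_fonts` set is a hypothetical stand-in for a font management system's name list.

```python
# The Shift-JIS bytes for the kanji used as the example in this message.
raw = b"\x8e\x9a"

# Intended conversion: the same character, re-expressed as a Unicode code point.
converted = raw.decode("shift_jis")

# Discouraged: spelling the Shift-JIS bytes out as ASCII text.
escaped_sjis = "".join("x%02X" % b for b in raw)

# Also discouraged: spelling the UCS-2 code unit bytes out as ASCII text.
escaped_ucs2 = "".join("x%02X" % b for b in converted.encode("utf_16_be"))

# Hypothetical font list holding the real name as Unicode characters.
installed_fonts = {"\u5b57"}

print(converted in installed_fonts)
print(escaped_sjis in installed_fonts)
print(escaped_ucs2 in installed_fonts)
```

Only the first lookup can succeed without the consumer guessing which escaping scheme was used and reversing it.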
Regards,
suzuki toshiya
Chris Rae wrote (2010/09/07 13:19):
> http://cid-c8ba0861dc5e4adc.office.live.com/view.aspx/Public%20Documents/2009/DR-09-0045.docx
>
> I've updated the wording to incorporate a note explaining that intra-font names will often need alterations to appear in XML. Suzuki-san and Murata-san, does this look reasonable?
>
> Changes attached.
>
> Chris
-------------- next part --------------
A non-text attachment was scrubbed...
Name: DR 09-0045 proposed changes.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 13698 bytes
Desc: DR 09-0045 proposed changes.docx
URL: <http://mailman.vse.cz/pipermail/sc34wg4/attachments/20101004/0a25d07c/attachment.bin>