DR 09-0040: WML/DML: Complex scripts

Chris Rae Chris.Rae at microsoft.com
Tue Dec 7 08:20:45 CET 2010


Hi all - we've done a bit of a rewrite of this text that I think should answer Caroline's concerns. We've added some more normative text, and also created a couple more examples.

Those at WG4 at the moment - I know it's short notice, but if anyone has the time to review it this evening or early tomorrow then we could talk through the details tomorrow.

Chris

-----Original Message-----
From: Arms, Caroline [mailto:caar at loc.gov] 
Sent: 04 November 2010 04:00
To: e-SC34-WG4 at ecma-international.org
Subject: RE: DR 09-0040: WML/DML: Complex scripts

Chris,
I have no doubt that the "algorithm" addresses issues raised by this DR, but it's not clear to me that the precise question has been answered directly.  The DR mentions several specific sections that use the phrases "complex script characters" or "complex script contents [of a run]" with apparently different interpretations for the range of characters included.  Your proposed change to 17.3.2.26 certainly doesn't address that directly.

THE SECTION YOU PROPOSED ADDING THE ALGORITHM TO BEGINS BY DIVIDING CHARACTERS INTO 4 SETS:
17.3.2.26 rFonts (Run Fonts)
This element specifies the fonts which shall be used to display the text contents of this run. Within a single run, there can be up to four types of content present which shall each be allowed to use a unique font:
• ASCII (i.e., the first 128 Unicode code points)
• High ANSI
• Complex Script
• East Asian
The use of each of these fonts shall be determined by the Unicode character values of the run content, unless manually overridden via use of the cs element (§17.3.2.7).

[Aside: It might be helpful to provide the ranges that apply unless manually overridden.]

MEANWHILE
17.3.2.1 b (Bold)
This element specifies whether the bold property shall be applied to all non-complex script characters in the contents of this run when displayed in a document.

AND
17.3.2.2 bCs (Complex Script Bold)
This element specifies whether the bold property shall be applied to all complex script characters in the contents of this run when displayed in a document.

APPEAR TO DIVIDE THE CHARACTER UNIVERSE INTO 2 SETS

SO WHAT DO b AND bCs DO FOR High ANSI or East Asian as defined in 17.3.2.26?

This is just ONE of the examples raised in the DR.

I find myself wondering whether Shawn's earlier work on this DR might have addressed this, but it is not reflected in the DR Log.

   Caroline

Caroline Arms
Library of Congress Contractor
Co-compiler of Sustainability of Digital Formats resource http://www.digitalpreservation.gov/formats/

** Views expressed are personal and not necessarily those of the institution **
________________________________________
From: Rex Jaeschke [rex at RexJaeschke.com]
Sent: Wednesday, October 27, 2010 2:54 PM
To: e-SC34-WG4 at ecma-international.org
Subject: RE: DR 09-0040: WML/DML: Complex scripts

I applied those edits to Chris' proposal when I added it to the log. Should I also be adding a language code for Chinese Simplified?

Rex



> -----Original Message-----
> From: Jirka Kosek [mailto:jirka at kosek.cz]
> Sent: Tuesday, October 26, 2010 4:27 AM
> To: Chris Rae
> Cc: e-SC34-WG4 at ecma-international.org
> Subject: Re: DR 09-0040: WML/DML: Complex scripts
>
> Chris Rae wrote:
>
> > This DR is asking when exactly the various different
> > script-mandating tags are used inside runs of text in
> > WordprocessingML. I've dug up the algorithm for this and I think
> > the best place to put it is next to where the Run is defined.
> > Proposed changes are attached.
>
> Thanks Chris. That's an interesting algorithm, but I bet we can't
> improve it as we are just describing existing behaviour. ;-)
>
> I think it would be better to reference Unicode characters using
> U+XXXX notation, not just by XX. U+XXXX notation is used elsewhere in
> the standard. Also, when referencing a language (e.g. Chinese
> Traditional), it would be useful to add a BCP 47 language code to the
> reference to make it unambiguous.
>
>                       Jirka
>
>
> --
> ------------------------------------------------------------------
>   Jirka Kosek      e-mail: jirka at kosek.cz      http://xmlguru.cz
> ------------------------------------------------------------------
>        Professional XML consulting and training services
>   DocBook customization, custom XSLT/XSL-FO document processing
> ------------------------------------------------------------------
>  OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
> ------------------------------------------------------------------




-------------- next part --------------
A non-text attachment was scrubbed...
Name: DR 09-0040 proposed changes.docx
Type: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Size: 80485 bytes
Desc: DR 09-0040 proposed changes.docx
URL: <http://mailman.vse.cz/pipermail/sc34wg4/attachments/20101207/9663b642/attachment-0001.bin>

