DR 09-0040: WML/DML: Complex scripts

Arms, Caroline caar at loc.gov
Wed Nov 3 20:59:59 CET 2010


Chris,
I have no doubt that the "algorithm" addresses issues raised by this DR, but it's not clear to me that the precise question has been answered directly.  The DR mentions several specific sections that use the phrases "complex script characters" or "complex script contents [of a run]" with apparently different interpretations for the range of characters included.  Your proposed change to 17.3.2.26 certainly doesn't address that directly.

THE SECTION YOU PROPOSED ADDING THE ALGORITHM TO BEGINS BY DIVIDING CHARACTERS INTO 4 SETS:
17.3.2.26 rFonts (Run Fonts)
This element specifies the fonts which shall be used to display the text contents of this run. Within a single run, there can be up to four types of content present which shall each be allowed to use a unique font:
•ASCII (i.e., the first 128 Unicode code points)
•High ANSI
•Complex Script
•East Asian
The use of each of these fonts shall be determined by the Unicode character values of the run content, unless manually overridden via use of the cs element (§17.3.2.7).

[Aside: It might be helpful to provide the ranges that apply unless manually overridden.]

MEANWHILE
17.3.2.1 b (Bold)
This element specifies whether the bold property shall be applied to all non-complex script characters in the contents of this run when displayed in a document.

AND
17.3.2.2 bCs (Complex Script Bold)
This element specifies whether the bold property shall be applied to all complex script characters in the contents of this run when displayed in a document.

APPEAR TO DIVIDE THE CHARACTER UNIVERSE INTO 2 SETS

SO WHAT DO b AND bCs DO FOR High ANSI or East Asian as defined in 17.3.2.26?
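
To make the question concrete, here is a hypothetical run (not taken from the DR; the font names and the text are placeholders) that sets all four rFonts attributes alongside both b and bCs. The rFonts attributes name four kinds of content, while b and bCs between them name only two, so it is not obvious from the text quoted above which toggle governs the High ANSI and East Asian characters in such a run:

```xml
<!-- Illustrative sketch only: font names and run text are placeholders -->
<w:r xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:rPr>
    <!-- One font slot per content type listed in 17.3.2.26 -->
    <w:rFonts w:ascii="Times New Roman" w:hAnsi="Times New Roman"
              w:cs="Arial" w:eastAsia="SimSun"/>
    <!-- 17.3.2.1: bold for "non-complex script characters" -->
    <w:b/>
    <!-- 17.3.2.2: bold for "complex script characters" -->
    <w:bCs/>
  </w:rPr>
  <w:t>placeholder text mixing several scripts</w:t>
</w:r>
```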

This is just ONE of the examples raised in the DR.

I find myself wondering whether Shawn's earlier work on this DR might have addressed this, but it is not reflected in the DR Log.

   Caroline

Caroline Arms
Library of Congress Contractor
Co-compiler of Sustainability of Digital Formats resource
http://www.digitalpreservation.gov/formats/

** Views expressed are personal and not necessarily those of the institution **
________________________________________
From: Rex Jaeschke [rex at RexJaeschke.com]
Sent: Wednesday, October 27, 2010 2:54 PM
To: e-SC34-WG4 at ecma-international.org
Subject: RE: DR 09-0040: WML/DML: Complex scripts

I applied those edits to Chris' proposal when I added it to the log. Should I also be adding a language code for Chinese Simplified?

Rex



> -----Original Message-----
> From: Jirka Kosek [mailto:jirka at kosek.cz]
> Sent: Tuesday, October 26, 2010 4:27 AM
> To: Chris Rae
> Cc: e-SC34-WG4 at ecma-international.org
> Subject: Re: DR 09-0040: WML/DML: Complex scripts
>
> Chris Rae wrote:
>
> > This DR is asking when exactly the various different script-mandating
> > tags are used inside runs of text in WordprocessingML. I've dug up the
> > algorithm for this and I think the best place to put it is next to
> > where the Run is defined. Proposed changes are attached.
>
> Thanks Chris. That's an interesting algorithm, but I bet we can't improve
> it as we are just describing existing behaviour. ;-)
>
> I think it would be better to reference Unicode characters using U+XXXX
> notation, not just by XX. U+XXXX notation is used elsewhere in the
> standard. Also, when referencing a language (e.g. Chinese Traditional), it
> would be useful to add a BCP 47 language code to the reference to make
> the language reference unambiguous.
>
>                       Jirka
>
>
> --
> ------------------------------------------------------------------
>   Jirka Kosek      e-mail: jirka at kosek.cz      http://xmlguru.cz
> ------------------------------------------------------------------
>        Professional XML consulting and training services
>   DocBook customization, custom XSLT/XSL-FO document processing
> ------------------------------------------------------------------
>  OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
> ------------------------------------------------------------------





