DR 09-0061 _ Shared MLs, Shared Simple Types: Constrain ST_Panose value set

Chris Rae Chris.Rae at microsoft.com
Mon Nov 8 21:55:39 CET 2010


Suzuki-san - we discussed this defect report briefly on the last WG4 call. Many thanks, as usual, for the immensely useful background information and all the work you've done here. It's great to see detail experts participating in WG4 like this!

I mentioned on the call that my preference would be to take your adjusted RegExp from the later mail in this thread as the preferred version in the standard (I'll confirm that in the next week or two - we're busy testing it at the moment). However, I believe Microsoft didn't ever knowingly contradict the Panose standard, so I'm hoping I can simultaneously convince you that that RegExp is in fact a valid RegExp for testing Panose.

I think the part of the Panose standard (http://www.panose.com/ProductsServices/pan1.aspx for everyone else) that disagrees with the "Latin Pictorial" page is section 1.5. It reads:

--
1.5 Digit values of 0 and 1
The reader will notice that the value 0 and 1 are defined as Any and No Fit for every digit in the PANOSE system. These have specific meanings to the mapper. 0 means match that digit with any available digit. This allows the mapper to handle distortable typefaces such as multiple master fonts in which, for example, weights may be variable or serifs may change. 1 means that the item being classified does not fit within the present system. There are two possible causes of this. First is that there has been no work done on that family of faces, for example at the present time an Arabic cursive font would have the PANOSE number 1 1 1 1 1 1 1 1 1 1 as there has as yet been no work done on Arabic fonts.
--

I think these global overrides make values such as the 05000000000000000000 we were discussing valid Panose. Do you have any thoughts?
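
To make that reading of section 1.5 concrete, here is a minimal Python sketch. The helper names `split_panose` and `is_all_any_or_no_fit` are illustrative and do not come from the Panose specification. The sketch assumes the 20-hex-digit string form used in this thread, splits it into ten digit values, and checks whether every digit after the family kind is 0 (Any) or 1 (No Fit):

```python
def split_panose(value: str) -> list[int]:
    """Split a 20-hex-digit Panose string into ten integer digit values."""
    if len(value) != 20:
        raise ValueError("expected 20 hex digits")
    return [int(value[i:i + 2], 16) for i in range(0, 20, 2)]


def is_all_any_or_no_fit(value: str) -> bool:
    """True if every digit after the family kind is 0 (Any) or 1 (No Fit)."""
    return all(d in (0, 1) for d in split_panose(value)[1:])


print(split_panose("05000000000000000000"))          # family kind 5, then nine zeros
print(is_all_any_or_no_fit("05000000000000000000"))  # True under this reading
print(is_all_any_or_no_fit("020F0502020204030204"))  # False: a fully classified value
```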

Chris

-----Original Message-----
From: mpsuzuki at hiroshima-u.ac.jp [mailto:mpsuzuki at hiroshima-u.ac.jp] 
Sent: 26 October 2010 22:03
To: Chris Rae
Cc: e-SC34-WG4 at ecma-international.org
Subject: Re: DR 09-0061 _ Shared MLs, Shared Simple Types: Constrain ST_Panose value set

Dear Chris,

Excuse me, but 05000000000000000000 is invalid Panose if I follow the panose.com definition.

Please refer to the "Latin Pictorial" page,

  http://www.panose.com/ProductsServices/pan5.aspx

"5.3 Weight" and "5.5 Aspect ratio & contrast".
In those sections, the 3rd and 5th digits of the Panose for family kind 5 are restricted to "1".
"0" is unavailable. Thus my panose.com-strict regex

\s*0?2\s*0?[0-9A-Fa-f]\s*0?[0-9ABab]\s*0?[0-9]\s*0?[0-9]\s*0?[0-9Aa]\s*0?[0-9ABab]\s*0?[0-9A-Fa-f]\s*0?[0-9A-Da-d]\s*0?[0-7]\s*|\s*0?3\s*0?[0-9]\s*0?[0-9ABab]\s*0?[0-3]\s*0?[0-6]\s*0?[0-9]\s*0?[0-9Aa]\s*0?[0-9A-Da-d]\s*0?[0-9A-Da-d]\s*0?[0-6]\s*|\s*0?4\s*0?[0-9A-Ca-c]\s*0?[0-9ABab]\s*0?[0-9]\s*0?[0-9A-Da-d]\s*(0?[0-9A-Fa-f]|10)\s*0[0-7]\s*0?[0-8]\s*0[0-9A-Fa-f]\s*0[0-5]\s*|\s*0?5\s*0?[0-9A-Ca-c]\s*0?1\s*0?[0-3]\s*0?1\s*0?[0-9]\s*0?[0-9]\s*0?[0-9]\s*0?[0-9]\s*0?[0-9]\s*

refuses 05000000000000000000.
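
As a quick check of that claim, the following Python sketch tests only the family-kind-5 alternative of the pattern above, since no other alternative can match a value that starts with 05. The variable name is illustrative, and matching the whole string with `re.fullmatch` is an assumption; the thread does not say how the pattern is anchored.

```python
import re

# Family-kind-5 ("Latin Pictorial") alternative, copied from the strict pattern.
STRICT_FAMILY_5 = re.compile(
    r"\s*0?5\s*0?[0-9A-Ca-c]\s*0?1\s*0?[0-3]\s*0?1"
    r"\s*0?[0-9]\s*0?[0-9]\s*0?[0-9]\s*0?[0-9]\s*0?[0-9]\s*"
)

# Digits 3 and 5 must be 1, so the all-zero value is rejected.
print(STRICT_FAMILY_5.fullmatch("05000000000000000000") is None)      # True
# The same value with digits 3 and 5 set to 1 is accepted.
print(STRICT_FAMILY_5.fullmatch("05000100010000000000") is not None)  # True
```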

However, as I mentioned at the Tokyo meeting, the definition of Panose differs between MSDN and panose.com.

http://msdn.microsoft.com/en-us/library/ms533998.aspx
http://www.panose.com/

The MSDN-style regex

\s*0?[0-5]\s*0?[0-9A-Fa-f]\s*0?[0-9ABab]\s*0?[0-9]\s*0?[0-9]\s*0?[0-8]\s*0?[0-9ABab]\s*0?[0-9A-Fa-f]\s*0?[0-9A-Da-d]\s*0?[0-7]\s*

accepts 05000000000000000000.
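
The same kind of check for the MSDN-style pattern, again as a hedged Python sketch: the pattern is copied from this message, while the helper name `msdn_accepts` and the use of `re.fullmatch` are assumptions made for illustration.

```python
import re

# MSDN-style pattern, copied from the message above.
MSDN_PANOSE = re.compile(
    r"\s*0?[0-5]\s*0?[0-9A-Fa-f]\s*0?[0-9ABab]\s*0?[0-9]\s*0?[0-9]"
    r"\s*0?[0-8]\s*0?[0-9ABab]\s*0?[0-9A-Fa-f]\s*0?[0-9A-Da-d]\s*0?[0-7]\s*"
)


def msdn_accepts(value: str) -> bool:
    """True if the whole string matches the MSDN-style pattern."""
    return MSDN_PANOSE.fullmatch(value) is not None


print(msdn_accepts("05000000000000000000"))  # True
print(msdn_accepts("06000000000000000000"))  # False: family kind 6 is out of range
```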

I will check the Panose values in existing TrueType fonts bundled with Microsoft Windows etc.; the number of Panose values that are valid in the MSDN syntax but invalid in the panose.com syntax is remarkably large.
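
For anyone wanting to repeat such a survey, a minimal Python sketch follows. It assumes a single-font sfnt file (not a .ttc collection) and relies on the OpenType OS/2 table layout, in which the ten PANOSE bytes start 32 bytes into the table. The function name `read_panose` is hypothetical, and this is not necessarily how the check described above was performed.

```python
import struct


def read_panose(font_bytes: bytes) -> str:
    """Return the OS/2 PANOSE field as 20 uppercase hex digits."""
    # sfnt header: 4-byte version, then numTables as a big-endian uint16.
    num_tables = struct.unpack_from(">H", font_bytes, 4)[0]
    for i in range(num_tables):
        # Each 16-byte table record holds tag, checksum, offset and length.
        tag, _checksum, offset, _length = struct.unpack_from(
            ">4sIII", font_bytes, 12 + 16 * i
        )
        if tag == b"OS/2":
            # The ten PANOSE bytes sit 32 bytes into the OS/2 table.
            return font_bytes[offset + 32:offset + 42].hex().upper()
    raise ValueError("no OS/2 table found")
```

The returned string can then be fed to either pattern in this thread, for example by calling `read_panose(open(path, "rb").read())` for each font file, where `path` is a placeholder for a font on disk.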

Regards,
mpsuzuki

On Wed, 27 Oct 2010 11:57:27 +0900
mpsuzuki at hiroshima-u.ac.jp wrote:

>Dear Chris,
>
>Sorry for the late response. The value 05000000000000000000 must be
>accepted, and I thought my proposal accepted it. I will check,
>please wait. I will post my replies to other issues within 12 hours.
>
>Regards,
>suzuki toshiya, Hiroshima University, Japan
>
>On Tue, 26 Oct 2010 21:24:50 +0000
>Chris Rae <Chris.Rae at microsoft.com> wrote:
>
>>I may have spoken too soon on this one. It would appear there are some values which are acceptable according to the Panose spec which this RegExp does not regard as valid. For example, 05000000000000000000. This seems to be valid according to section 1.5 of the Panose spec (http://www.panose.com/ProductsServices/pan1.aspx) but doesn't match this RegEx. I've pasted the section below.
>>
>>Suzuki-san - is there a chance you could check that I'm right in this assertion?
>>
>>Chris
>>
>>--
>>
>>Panose: 1.5 Digit values of 0 and 1
>>The reader will notice that the value 0 and 1 are defined as Any and No Fit for every digit in the PANOSE system. These have specific meanings to the mapper. 0 means match that digit with any available digit. This allows the mapper to handle distortable typefaces such as multiple master fonts in which, for example, weights may be variable or serifs may change. 1 means that the item being classified does not fit within the present system. There are two possible causes of this. First is that there has been no work done on that family of faces, for example at the present time an Arabic cursive font would have the PANOSE number 1 1 1 1 1 1 1 1 1 1 as there has as yet been no work done on Arabic fonts.
>>
>>-----Original Message-----
>>From: Chris Rae [mailto:Chris.Rae at microsoft.com]
>>Sent: 25 October 2010 15:37
>>To: e-SC34-WG4 at ecma-international.org
>>Cc: suzuki toshiya (mpsuzuki at hiroshima-u.ac.jp)
>>Subject: DR 09-0061 _ Shared MLs, Shared Simple Types: Constrain ST_Panose value set
>>
>>http://cid-c8ba0861dc5e4adc.office.live.com/view.aspx/Public%20Documents/2009/DR-09-0061.docx
>>
>>This is a very simple one indeed. We talked about this at some length at Tokyo - I was under the impression that certain valid Panose values were not accepted by Suzuki-san's RegEx. This, it turns out, was a mistake - I was truncating unused zeros from the front of the strings and in actual fact this is not done (in Office, or in the Panose specification itself). Panose consists of ten byte couplets denoted in hex where leading zeros are always included, even on the first byte.
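
The couplet encoding described in the quoted paragraph can be sketched in Python as follows. The function name `panose_to_hex` is illustrative, and stripping zeros with `lstrip` is only one plausible form of the truncation mentioned; the message does not spell out exactly what was removed.

```python
def panose_to_hex(digits: bytes) -> str:
    """Format ten Panose bytes as 20 hex digits, keeping every leading zero."""
    if len(digits) != 10:
        raise ValueError("Panose has exactly ten digits")
    return "".join(f"{d:02X}" for d in digits)


value = panose_to_hex(bytes([2, 15, 5, 2, 2, 2, 4, 3, 2, 4]))
print(value)              # 020F0502020204030204
print(value.lstrip("0"))  # 20F0502020204030204, one digit short of 20
```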
>>
>>I used http://www.regexplanet.com/simple/index.html to validate Suzuki-san's sample against some Word documents and it looks like this RegEx does indeed work fine (some sample values: 020F0502020204030204, 02010600030101010101, 020B0604020202020204, 02020603050405020304, 02040503050406030204).
>>
>>I think we can accept the original solution as proposed by the submitter in the DR. Let's discuss on the next call.
>>
>>Chris
>>


