RE: Clauses §8.3.5 and §8.2.2.3

Francis Cave francis at franciscave.com
Mon Jan 22 17:37:09 CET 2018


In case you’ve missed them, I have spotted a couple of typos. See corrected text highlighted in cyan.

 

Francis

 

 

From: eb2mmrt at gmail.com [mailto:eb2mmrt at gmail.com] On Behalf Of MURATA Makoto
Sent: 22 January 2018 01:04
To: caroline arms <caroline.arms at gmail.com>
Cc: Rex Jaeschke <rex at rexjaeschke.com>; SC 34 WG4 <e-SC34-WG4 at ecma-international.org>
Subject: Re: Clauses §8.3.5 and §8.2.2.3

 

Caroline,

 

Thank you very much for your thoughtful suggestions.

Here is my first cut.

 

Regards,

Makoto

 

---------------------------------------------------------------------

(The first paragraph, which defines equivalence, is not changed.)

 

The names of two different parts within a package shall not

be equivalent.
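
As a rough illustration of this rule, the following Python sketch flags pairs of equivalent names in a package. It assumes that equivalence means ASCII case-insensitive matching, which is how the first paragraph is characterized elsewhere in this thread; the definition itself is not reproduced in this message. The helper names (ascii_lower, equivalent, find_equivalent_names) are hypothetical.

```python
def ascii_lower(name: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave every other character alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)


def equivalent(n1: str, n2: str) -> bool:
    """Assumed equivalence: ASCII case-insensitive string comparison."""
    return ascii_lower(n1) == ascii_lower(n2)


def find_equivalent_names(names):
    """Return (first_seen, later) pairs of part names that are equivalent."""
    seen = {}      # folded name -> first original name with that fold
    clashes = []
    for name in names:
        key = ascii_lower(name)
        if key in seen:
            clashes.append((seen[key], name))
        else:
            seen[key] = name
    return clashes


if __name__ == "__main__":
    print(equivalent("/a", "/A"))
    print(find_equivalent_names(["/a", "/b", "/A"]))
```

A package that passes this check has no two part names that differ only in ASCII letter case.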

 

[Example: If a package contains a part named "/a", the

name of another part in that package must not be "/a" or

"/A". end example]

 

For each part name N and string S, let the result of

concatenating N, the forward slash and S be denoted by

N[S]. A part name N1 is said to be derivable from another

part name N2 if, for some string S, N1 is equivalent to

N2[S].

 

[Example: "/a/b" is derivable from "/a", where N is

"/a" and S is "b". end example]

 

The name of a part shall not be derivable from the name of

another part.
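
The derivability rule can be sketched as a prefix test. This is only an illustration: it assumes ASCII case-insensitive equivalence (the definition is not reproduced in this message), the function names are hypothetical, and a concrete three-segment name stands in for the elided "/segment1/segment2/…/segmentn" form.

```python
def ascii_lower(name: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave every other character alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)


def derivable(n1: str, n2: str) -> bool:
    """True if n1 is equivalent to n2 + "/" + S for some string S.

    Under the assumed equivalence this reduces to a prefix test on the
    ASCII-lowercased names.
    """
    return ascii_lower(n1).startswith(ascii_lower(n2) + "/")


def find_derivable_names(names):
    """Return (derived, base) pairs where `derived` is derivable from `base`."""
    return [(n1, n2) for n1 in names for n2 in names if derivable(n1, n2)]


if __name__ == "__main__":
    print(derivable("/a/b", "/a"))
    print(derivable("/ab", "/a"))
    print(find_derivable_names(["/segment1/segment2/segment3", "/SEGMENT1"]))
```

Note that "/ab" is not derivable from "/a": the forward slash in N[S] means a name only clashes when the shorter name matches whole leading segments.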

 

[Example: Suppose that a package contains a part named

"/segment1/segment2/…/segmentn".  For it not to be

derivable, other parts in that package must not have names

such as "/segment1", "/SEGMENT1",

"/segment1/segment2", "/segment1/SEGMENT2", or

"/segment1/segment2/…/segmentn-1".  end example]

 

This subclause further introduces recommendations so that

applying NFC normalization to part names does not cause part name

clashes. [Note: Some implementations of the directory

structure always apply NFC normalization. end note]

 

The result of applying Unicode Normalization Form C (NFC)

to the names of two different parts within a package should

not be equivalent.

 

[Example: If a package contains a part named "/é", where

é is 'LATIN SMALL LETTER E' (U+0065) followed by 'COMBINING

ACUTE ACCENT' (U+0301), the name of another part in that

package should not be "/é", where é is 'LATIN SMALL

LETTER E WITH ACUTE' (U+00E9), or "/É", where É is

'LATIN CAPITAL LETTER E WITH ACUTE' (U+00C9). end example]

 

[Example: If a package contains a part named "/Å", where

Å is 'ANGSTROM SIGN' (U+212B), the name of another part in

that package should not be "/Å", where Å is 'LATIN CAPITAL

LETTER A WITH RING ABOVE' (U+00C5) because U+212B and

U+00C5 are normalized to the same character sequence. end

example]
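
The two normalization facts used in these examples can be checked with Python's standard unicodedata module. The sketch below compares NFC results for exact equality only; the case-insensitive part of equivalence is deliberately left out, and the helper names are hypothetical.

```python
import unicodedata
from collections import defaultdict


def nfc(name: str) -> str:
    """Apply Unicode Normalization Form C to a part name."""
    return unicodedata.normalize("NFC", name)


def nfc_collisions(names):
    """Group part names whose NFC forms are identical (exact match only)."""
    groups = defaultdict(list)
    for name in names:
        groups[nfc(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


if __name__ == "__main__":
    # U+0065 followed by U+0301 composes to U+00E9 under NFC.
    print(nfc("/e\u0301") == "/\u00e9")
    # U+212B (ANGSTROM SIGN) normalizes to U+00C5.
    print(nfc("/\u212b") == "/\u00c5")
    print(nfc_collisions(["/e\u0301", "/\u00e9", "/x"]))
```

On a system that always normalizes names to NFC, each group returned by nfc_collisions would end up as a single name.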

 

A part name N1 is said to be weakly derivable from another

part name N2 if, for some string S, the result of applying

NFC to N1 is equivalent to the result of applying NFC to

N2[S].

 

[Example: Consider a part name "/é", where é is 'LATIN

SMALL LETTER E WITH ACUTE' (U+00E9).  Another part name

"/é/a", where é is 'LATIN SMALL LETTER E' (U+0065)

followed by 'COMBINING ACUTE ACCENT' (U+0301) is weakly

derivable from "/é".  Yet another part name "/É/a",

where É is 'LATIN CAPITAL LETTER E' (U+0045) followed by

'COMBINING ACUTE ACCENT' (U+0301) is also weakly derivable.

end example]

 

The name of a part should not be weakly derivable from the

name of another part.

 

[Example: Suppose that a package contains a part named

"/é/Å/foo", where é is 'LATIN SMALL LETTER E WITH ACUTE'

(U+00E9) and Å is 'ANGSTROM SIGN' (U+212B).  For it not to

be weakly derivable, other parts in that package should not

have names such as "/É" and "/É/Å", where É is 'LATIN

CAPITAL LETTER E' (U+0045) followed by 'COMBINING ACUTE

ACCENT' (U+0301) and Å is 'LATIN CAPITAL LETTER A WITH RING

ABOVE' (U+00C5).  end example]
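
Weak derivability can be sketched the same way as derivability, with NFC applied to both names first. This is an illustration under stated assumptions, not a definitive reading of the text. The equivalence definition is not reproduced in this message, so the case fold is a parameter: the default is ASCII-only, and str.casefold is shown purely for comparison. With the ASCII-only fold, the capital-É names in the examples above are not flagged; they are flagged only when the fold also covers non-ASCII letters. All function names are hypothetical.

```python
import unicodedata


def ascii_lower(name: str) -> str:
    """Lowercase only the ASCII letters A-Z; leave every other character alone."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name)


def weakly_derivable(n1: str, n2: str, fold=ascii_lower) -> bool:
    """True if NFC(n1) is equivalent to NFC(n2 + "/" + S) for some string S.

    `fold` is the assumed case fold behind "equivalent". Because "/" does not
    compose with neighbouring characters, the check reduces to a prefix test
    on the normalized, folded names.
    """
    a = fold(unicodedata.normalize("NFC", n1))
    b = fold(unicodedata.normalize("NFC", n2))
    return a.startswith(b + "/")


if __name__ == "__main__":
    # Decomposed e + U+0301 versus precomposed U+00E9: flagged with either fold.
    print(weakly_derivable("/e\u0301/a", "/\u00e9"))
    # Capital E + U+0301 versus U+00E9: depends on the fold.
    print(weakly_derivable("/E\u0301/a", "/\u00e9"))
    print(weakly_derivable("/E\u0301/a", "/\u00e9", fold=str.casefold))
```

Keeping the fold as a parameter makes the dependence on the equivalence definition explicit instead of hiding it inside the check.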

 

 

2018-01-22 8:10 GMT+09:00 caroline arms <caroline.arms at gmail.com>:

Thanks for your response.  

 

I have one substantive question, relating to "in its name"  (in red below).  As I read 9.1.1.4 in the 2012 published version, I would understand the constraint to mean that another part's name should not BEGIN with "/a" or "/A".   And I believe that's what the explanation of derivable says.  So I might change:

 

If a package contains a part named “/a”, another part in that package must not have “/a” or “/A” in its name.

   to
If a package contains a part named “/a”, another part in that package must not have a name that begins with “/a” or “/A”.

 

But perhaps I'm missing something.

 

However, my main concern about §8.2.2.3 is that the order of its normative text and examples is very confusing.  It is not obvious what the examples are examples of. Also, each of the [Example] blocks holds more than one example, which makes them very hard to read.

I have several possible suggestions that might make the clause more readable.  I've listed them as options.  One or more could be followed.

Option 1.  Put everything related to "derivable" before everything related to "weakly derivable."

Hence:

For each part name N and string S, let the result of concatenating N, the forward slash and S be denoted by N[S]. A part name N1 is said to be derivable from another part name N2 if, for some string S, N1 is equivalent to N2[S].

The name of a part shall not be derivable from the name of another part.

[Example: If a package contains a part named “/a”, another part in that package must not have “/a” or “/A” in its name. If a package contains a part named “/segment1/segment2/…/segmentn”, other parts in that package must not have names such as “/segment1”, “/SEGMENT1”, “/segment1/segment2”, “/segment1/SEGMENT2”, or “/segment1/segment2/…/segmentn-1”. If a package contains a part named “/Å” where Å is 'ANGSTROM SIGN' (U+212B), another part in that package should not have in its name “/Å” where Å is 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5) because U+212B and U+00C5 are normalized to the same character sequence. end example]

A part name N1 is said to be weakly derivable from another part name N2 if, for some string S, the result of applying NFC to N1 is equivalent to the result of applying NFC to N2[S]. 

The name of a part should not be weakly derivable from the name of another part.

[Example: Given N[S] equal to “/a/b” where N is “/a” and S is “b”, then “/a/b” is derivable from “/a”. A part named “/é/a”, where é is 'LATIN SMALL LETTER E' (U+0065) followed by 'COMBINING ACUTE ACCENT' (U+0301) is weakly derivable from “/é”, where é is 'LATIN SMALL LETTER E WITH ACUTE' (U+00E9).  A part named “/É/a”, where É is 'LATIN CAPITAL LETTER E' (U+0045) followed by 'COMBINING ACUTE ACCENT' (U+0301) is also weakly derivable. end example]

Option 2.  Break the text in the [Example] blocks into the independent examples.

 [Example: 
If a package contains a part named “/a”, another part in that package must not have “/a” or “/A” in its name. 

If a package contains a part named “/segment1/segment2/…/segmentn”, other parts in that package must not have names such as “/segment1”, “/SEGMENT1”, “/segment1/segment2”, “/segment1/SEGMENT2”, or “/segment1/segment2/…/segmentn-1”. 

If a package contains a part named “/Å” where Å is 'ANGSTROM SIGN' (U+212B), another part in that package should not have in its name “/Å” where Å is 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5) because U+212B and U+00C5 are normalized to the same character sequence. end example]

Option 3.  Explain what the examples are about at the beginning of the block -- which might require dividing the first example block into two, since the third example is related to NFC and is a "should not"  whereas the first two are about derivability and case-insensitive matching.

       To be continued, no doubt ...

       Caroline

 

On Sat, Jan 20, 2018 at 4:34 PM, MURATA Makoto <eb2m-mrt at asahi-net.or.jp> wrote:

Caroline,

 

Thank you for your careful reviewing!

 

2018-01-21 5:53 GMT+09:00 caroline arms <caroline.arms at gmail.com>:

Murata-san, Rex, et al.

I have started to go through the draft.  Rather than wait until the teleconference, I thought I would send emails on issues that are not simply fixing typos or grammar as I come across them.

Clause §8.3.5 includes
"The path components are equivalent part names, as specified in §8.2.2 [M7.3]"

Should this point instead to §8.2.2.3 Part Name Equivalence and Integrity in a Package?

 

"equivalent" is defined in 8.2.2.3, but "part names" is defined 

in  8.2.2.1 and 8.2.2.2.

 


I find §8.2.2.3 rather confusing and as I read it carefully, I realized that "equivalence" as meant in §8.3.5 might need to incorporate more than ASCII case-insensitive matching -- as equivalence is defined in the first paragraph of §8.2.2.3.  In particular, I wondered whether equivalence after application of NFC was also relevant.  

 

No, it is not.  Microsoft never applies NFC to part names.

 


Perhaps someone more expert than me can weigh in here.

I realize that §8.2.2.3 mixes "shall" and "should" -- presumably deliberately.  That probably adds complexity here.

 

Yes, it is confusing.  But I believe that Microsoft and Apple do different 

things here.  Microsoft never normalizes file names.  Apple always 

does.  Part name clashes caused by Apple (and possibly others) should

be avoided.  That's why we have a number of "should" in 8.2.2.3.

 

Regards,

Makoto

 


I have some other concerns about §8.2.2.3, but I would need clarification on what "equivalence" needs to be for §8.3.5 before I could make useful suggestions.

    Caroline

PS:  Given the government shutdown, please be sure to send important emails to my gmail account (or to the WG4 list).  I'm afraid the shutdown may play havoc with my schedule, just as the threat of a shutdown has been leading to inefficiency over the last few weeks.

 

On Tue, Jan 9, 2018 at 3:46 PM, Rex Jaeschke <rex at rexjaeschke.com> wrote:

Attached is WD3.3 of the OPC Spec.

 

Once I got it back from Murata-san, here’s what I did:

 

1.	I adopted all edits from WD3.2 and prior that had been resolved, so they no longer show as tracked changes.
2.	I kept all the comments that do not appear to have been resolved.
3.	All Murata-san’s edits proposed since WD3.2 are shown as tracked changes.

 

I propose that at the March F2F meeting, we walk through this document and accept/reject the proposed edits, and resolve the issues raised in comments.

 

Our most recent discussion of a timeline for this spec was to have a complete version at the end of the March 2018 meeting and, after minor changes from the F2F meeting, to send it out for a 2-month CD ballot, closing before the June F2F. I now think this is quite unrealistic. There is a lot of work to do yet, and the decisions we make in March will need to be applied to the spec and then reviewed in the following teleconferences. We might have a shot at getting a near-final draft for review at the June meeting.

 

Murata-san has long pushed to get rid of informative Annex G [formerly H], “Guidelines for Meeting Conformance”, while I pushed for keeping it. And while we agreed to keep it, it still needed serious work to make it complete. Unfortunately, in its current state, many of its links and bookmarks are now badly broken, and will be non-trivial to reconstruct. So, reluctantly, I am dropping my objection to removing this Annex. As such, I have *not* done any work on repairing/updating this annex. If we drop this annex, we’ll need to decide what to do about all the [M], [O], and [S] markers spread throughout the normative text.

 

In DR 13-0002, Murata-san proposed the addition of a new informative Annex, “Guidelines for Format Designers” (see https://goo.gl/gzIX9y). As I cannot access this link, I have not added this annex. Murata-san, can you please circulate this proposed text?

 

As Caroline will likely not attend the March meeting, I’d like to give her time to review and submit feedback before then. Likewise for Aarti’s experts (who likely will not attend that meeting).

 

We’ll have a big job in March resolving all the open issues, so the more preparation you can do before then, the better. And, of course, we can do serious work on this on our January 31 teleconference.

 

Rex

 

 





 

-- 


Praying for the victims of the Japan Tohoku earthquake

Makoto

 





 

