DR 09-0210: WML: Custom XML and Smart Tags

rjelliffe at allette.com.au
Wed Jun 17 07:51:25 CEST 2009


> What's the migration story for files which pre-date 29500?  For example,
> suppose I have a binary wordprocessing document and it contains the
> equivalent @uri value of something like "c:\users\shawnv\desktop"?  Are
> you suggesting that there is no upgrade path to Strict for such files?
> Wouldn't this bring us into direct conflict with the existing scope of
> 29500?
>
>
>
> I wonder if there is the opportunity to slightly shift your proposal:
>
>
>
> *         In "transitional", we offer no guidance
>
> *         In "strict", we offer guidance that should use a namespace name
>
>
>
> The data type is going to be similarly tricky, for the same reasons.

Please note that there is no difference in the lexical or value spaces of
xs:string and xs:anyURI. Both allow any character allowed by XML; both may
be the empty string. This is obscure in the XSD 1.0 Datatypes spec, but much
clearer in the XSD 1.1 CR draft. (The modest typechecking referred to is,
for example, that only XML Char characters are allowed if the value is to
be valid in a DOM; it does not mean that xs:anyURI needs any special
checking.)

The intent of xs:anyURI was to allow stronger labelling of the "intention"
or semantics of the datatype rather than to constrain the lexical or value
space at all.

So the choice between xs:string and xs:anyURI makes almost no difference
in practical terms.

I proposed anyURI at the XSD WG to overcome the lack, at that time, of a
standard for IRIs (now http://www.ietf.org/rfc/rfc3987.txt) or for Legacy
Extended IRIs (now http://www.w3.org/TR/leiri/), which are also covered in
the new IRI draft
(http://www.ietf.org/internet-drafts/draft-duerst-iri-bis-05.txt).

These LEIRIs allow "\" and spaces directly, for example.
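
To make that concrete, here is a rough Python sketch of the kind of
mapping the LEIRI note describes: characters a LEIRI permits but an IRI
reference does not are percent-encoded as UTF-8 bytes. The function name
leiri_to_iri and the character set below are my own illustration, not
taken from any spec text; the set covers only the printable ASCII cases,
and the note also lists control and other code points that I leave out
here.

```python
# Sketch only. Assumption: these are the printable ASCII characters that a
# LEIRI allows directly but an IRI reference does not. The full list in the
# LEIRI note also includes control characters and other code points.
LEIRI_ONLY_ASCII = set(' <>"{}|\\^`')


def leiri_to_iri(value: str) -> str:
    """Percent-encode the LEIRI-only ASCII characters; keep the rest as-is."""
    out = []
    for ch in value:
        if ch in LEIRI_ONLY_ASCII:
            # Encode as UTF-8 and escape each byte as %HH.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)


# The path from the quoted message above.
print(leiri_to_iri(r"c:\users\shawnv\desktop"))
# prints: c:%5Cusers%5Cshawnv%5Cdesktop
```

Whether the mapped value is a useful identifier is a separate question;
the sketch only shows that a legacy value of this kind has a mechanical
route to IRI syntax.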

I suggest the following:
  * Both S & T use xs:anyURI
  * The minLength facet be specified as 1 to prevent empty URIs, if that
is appropriate
  * Both the strict and transitional specs non-normatively reference the
LEIRI note as the intended mapping (if that does indeed do the job) so
that developers know what to expect
  * A normative requirement along the lines "Strict and new transitional
documents should conform to IRI RFC 3987 syntax."

Cheers
Rick Jelliffe



