DR 09-0207: WML: Custom XML and Smart Tags

Shawn Villaron shawnv at microsoft.com
Thu Jun 11 02:58:20 CEST 2009


Here is an updated response to this defect report.  Based on WG4 feedback, I've made the following changes:



1.  I've replaced "customer-defined" with "extra-standard"

2.  I've removed "customer" whenever possible from this part of the standard (note: I did NOT change element or attribute names, just prose)

3.  I've extended the original defect report, which covered one misuse of "example", to include two others



Hopefully this is an improvement on this part of the standard.  More to come, naturally.



DR 09-0207 - WML: Custom XML and Smart Tags
Part 1, §17.5 will be updated as follows:
17.5   Custom Markup
Within a WordprocessingML document, it is often necessary for specific documents to contain semantic information beyond the presentation information specified by ISO/IEC 29500. [Example: An invoice document might wish to specify that a particular sentence of text is a customer name, in order for that information to be easily extracted from the document without the need to parse the text using regular expression matching or similar. end example]
For these scenarios, multiple facilities are provided for the insertion and round-tripping of extra-standard semantics within a WordprocessingML document. There are three distinct forms in which extra-standard semantics can be inserted into a WordprocessingML document, each with its own specific intended usage:

·         Smart tags

·         Custom XML markup

·         Structured document tags (content controls)
The elements and attributes which define each of these forms are described in the following clauses.
Part 1, §17.5.1 will be updated as follows:
17.5.1            Custom XML and Smart Tags
The first form of extra-standard semantics that can be embedded in a WordprocessingML document is the smart tag. Implementations can establish sets of smart tags that allow semantic labels to be added around an arbitrary run or set of runs within a document to provide information about the type of data contained within.
[Example: Consider the following text in a WordprocessingML document, with a smart tag around the stock symbol 'CNTS':
        This is a stock symbol: CNTS
This text would translate to the following WordprocessingML markup:

<w:p w:rsidR="00672474" w:rsidRDefault="00672474">
  <w:r>
    <w:t xml:space="preserve">This is a stock symbol: </w:t>
  </w:r>
  <w:smartTag w:uri="http://www.example.com"
     w:element="stockticker">
    <w:r>
      <w:t>CNTS</w:t>
    </w:r>
  </w:smartTag>
</w:p>
As shown above, the smart tag is delimited by the smartTag element, which surrounds the run (or runs) containing the text that is part of the smart tag. end example]
The smart tag itself carries two required pieces of information, which together contain the extra-standard semantics for this smart tag:

·         The first of these is the namespace for this smart tag (contained in the uri attribute). This allows the smart tag to specify a URI which identifies the namespace of this smart tag to a consumer. It is intended to be used to specify a family of smart tags to which this one belongs. [Example: In the sample above, the smart tag belongs to the http://www.example.com namespace. end example]

·         The second of these is the element name for this smart tag (contained in the element attribute). This allows the smart tag to specify a name which identifies this type of smart tag within its namespace to a consumer. It is intended to be used to specify a unique name for this type of smart tag. [Example: In the sample above, the smart tag specifies that its data is of type stockticker. end example]
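As an illustration only, and not part of the proposed standard text, the following Python sketch shows how a consumer might read the uri and element attributes from the smart tag example above. It assumes the w prefix is bound to the Transitional WordprocessingML namespace, which the fragment in this message does not declare; the function name smart_tags is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the w prefix is bound to the Transitional WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

PARAGRAPH = f"""
<w:p xmlns:w="{W}">
  <w:r>
    <w:t xml:space="preserve">This is a stock symbol: </w:t>
  </w:r>
  <w:smartTag w:uri="http://www.example.com" w:element="stockticker">
    <w:r>
      <w:t>CNTS</w:t>
    </w:r>
  </w:smartTag>
</w:p>
"""


def smart_tags(paragraph_xml):
    """Return (uri, element, text) for every smartTag in the fragment."""
    root = ET.fromstring(paragraph_xml)
    found = []
    for tag in root.iter(f"{{{W}}}smartTag"):
        # Concatenate the text of all w:t descendants inside the smart tag.
        text = "".join(t.text or "" for t in tag.iter(f"{{{W}}}t"))
        found.append((tag.get(f"{{{W}}}uri"), tag.get(f"{{{W}}}element"), text))
    return found


print(smart_tags(PARAGRAPH))
```

The namespace/name pair is all the consumer needs to classify the tagged run; no text matching is involved.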
The next form of extra-standard semantics which can be embedded in a WordprocessingML document is custom XML markup. Custom XML markup allows the XML elements defined in any schema syntax (XML Schema, NVDL, etc.) to be applied to the contents of a WordprocessingML document in one of two locations: around a paragraph or set of paragraphs (at the block level); or around an arbitrary run or set of runs within a document (at the inline level). In either location, the markup provides semantics to that content within the context and structures defined by the associated schema definition.
The distinction between custom XML markup and smart tags is that custom XML markup is based on a specified schema.  As a result, the custom XML elements can be validated against the schema.  Also, as shown below, custom XML markup can be used at the block-level as well as on the inline (run) level.
[Example: Consider a simple XML Schema which defines two elements: a root element of invoice, and a child element of customerName - the first defining that this file's contents are an invoice, and the second specifying that the enclosed text is a customer's name:
[Image: image001.png]
This output would translate to the following WordprocessingML markup:

<w:customXml w:uri="http://www.example.com/2006/invoice" w:element="invoice">
  <w:p>
    <w:r>
      <w:t>This is an invoice.</w:t>
    </w:r>
  </w:p>
  <w:p>
    <w:r>
      <w:t xml:space="preserve">And this is a customer name: </w:t>
    </w:r>
    <w:customXml w:uri="http://www.example.com/2006/invoice" w:element="customerName">
      <w:r>
        <w:t>Tristan Davis</w:t>
      </w:r>
    </w:customXml>
  </w:p>
</w:customXml>
As shown above, each of the XML elements from the implementation-supplied XML schema is represented within the document output as a customXml element. end example]
Similar to the smart tag example above, a custom XML element in a document has two required attributes.

·         The first is the uri attribute, whose contents specify the namespace of the custom XML element in the document. In the example above, the elements each belong to the http://www.example.com/2006/invoice namespace.

·         The second is the element attribute, whose contents specify the name of the custom XML element at this location in the document. In the example above, the root element is called invoice and the child element is called customerName.
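As an illustration only, and not part of the proposed standard text, the sketch below walks the invoice example above and lists each customXml element with its nesting depth. The w namespace binding is an assumption (the Transitional WordprocessingML namespace), and custom_xml_outline is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumption: the w prefix is bound to the Transitional WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
INVOICE_NS = "http://www.example.com/2006/invoice"

BODY = f"""
<w:customXml xmlns:w="{W}" w:uri="{INVOICE_NS}" w:element="invoice">
  <w:p>
    <w:r><w:t>This is an invoice.</w:t></w:r>
  </w:p>
  <w:p>
    <w:r><w:t xml:space="preserve">And this is a customer name: </w:t></w:r>
    <w:customXml w:uri="{INVOICE_NS}" w:element="customerName">
      <w:r><w:t>Tristan Davis</w:t></w:r>
    </w:customXml>
  </w:p>
</w:customXml>
"""


def custom_xml_outline(node, depth=0):
    """Yield (depth, uri, element) for each customXml element, outermost first."""
    if node.tag == f"{{{W}}}customXml":
        yield depth, node.get(f"{{{W}}}uri"), node.get(f"{{{W}}}element")
        depth += 1  # anything below this element is nested one level deeper
    for child in node:
        yield from custom_xml_outline(child, depth)


outline = list(custom_xml_outline(ET.fromstring(BODY)))
for depth, uri, element in outline:
    print("  " * depth + f"{element} ({uri})")
```

The block-level invoice element appears at depth 0 and the inline customerName element at depth 1, matching the two placement levels described above.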
As well as the required information specified above, custom XML elements can also specify any number of attributes (as specified in the associated XML Schema) on the element. To add this information, the customXmlPr element (properties on the custom XML element) specifies one or more attr elements.
[Example: Using the example above, we can add a type attribute to the customerName element as follows:

<w:customXml w:uri="http://www.example.com/2006/invoice" w:element="customerName">
  <w:customXmlPr>
    <w:attr w:uri="http://www.example.com/2006/invoice" w:name="type" w:val="individual"/>
  </w:customXmlPr>
  <w:r>
    <w:t>Tristan Davis</w:t>
  </w:r>
</w:customXml>
The resulting XML, as seen above, simply adds an attr element which specifies the attribute for the custom XML element. end example]...
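As an illustration only, and not part of the proposed standard text, a consumer could collect the attr children of customXmlPr into a mapping keyed by namespace and name. The sketch assumes the Transitional WordprocessingML namespace for the w prefix; the helper names q and custom_xml_attributes are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: the w prefix is bound to the Transitional WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

CUSTOMER_NAME = f"""
<w:customXml xmlns:w="{W}"
    w:uri="http://www.example.com/2006/invoice" w:element="customerName">
  <w:customXmlPr>
    <w:attr w:uri="http://www.example.com/2006/invoice" w:name="type" w:val="individual"/>
  </w:customXmlPr>
  <w:r>
    <w:t>Tristan Davis</w:t>
  </w:r>
</w:customXml>
"""


def q(local_name):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W}}}{local_name}"


def custom_xml_attributes(custom_xml):
    """Return {(uri, name): value} for the attr children of customXmlPr."""
    props = custom_xml.find(q("customXmlPr"))
    if props is None:
        return {}  # no properties element means no custom attributes
    return {
        (attr.get(q("uri")), attr.get(q("name"))): attr.get(q("val"))
        for attr in props.findall(q("attr"))
    }


attributes = custom_xml_attributes(ET.fromstring(CUSTOMER_NAME))
print(attributes)
```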
Part 1, §17.5.2 will be updated as follows:
17.5.2            Structured Document Tags
The final form of extra-standard semantics which can be embedded in a WordprocessingML document is the structured document tag (SDT).
As shown above, smart tags and custom XML markup each provide a facility for embedding extra-standard semantics into the document: smart tags, via the ability to provide a basic namespace/name for a run or set of runs within a document; and custom XML markup, via the ability to tag the document with XML elements and attributes specified by any XML Schema file.
However, while each of these techniques provides a way to add the desired semantic information, neither provides a way to affect the presentation or interaction within the document. To bridge these two worlds, structured document tags allow both the specification of extra-standard semantics as well as the ability to influence the presentation of that data in the document.
This means that the implementation can define the semantics and context of the tag, but can then use a rich set of pre-defined properties to define its behavior and appearance within the WordprocessingML document's presentation.
[Example: Consider a region which should be tagged with the semantic of "birthday", for the user to enter their date of birth into the document. Ideally, this region would also utilize a date picker to allow the user to enter the date from a calendar:
[Image: image002.png]
This content would be specified using the following WordprocessingML:

<w:sdt>
  <w:sdtPr>
    <w:alias w:val="Birthday"/>
    <w:id w:val="8775518"/>
    <w:placeholder>
      <w:docPart w:val="DefaultPlaceholder_22479095"/>
    </w:placeholder>
    <w:showingPlcHdr/>
    <w:date>
      <w:dateFormat w:val="M/d/yyyy"/>
      <w:lid w:val="EN-US"/>
    </w:date>
  </w:sdtPr>
  <w:sdtContent>
    <w:p>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="PlaceholderText"/>
        </w:rPr>
        <w:t>Click here to enter a date...</w:t>
      </w:r>
    </w:p>
  </w:sdtContent>
</w:sdt>
end example]
As shown above, each of the structured document tags in the WordprocessingML file is represented using the sdt element.
Within a structured document tag, there are two child elements which contain the definition and the content of this SDT. The first of these is the sdtPr element, which contains the set of properties specified for this structured document tag. The second is the sdtContent element, which holds all the content within this structured document tag.
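As an illustration only, and not part of the proposed standard text, the sketch below separates the two children of the sdt element from the birthday example: properties are read from sdtPr and text from sdtContent. The w namespace binding is an assumption (the Transitional WordprocessingML namespace), and q and describe_sdt are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Assumption: the w prefix is bound to the Transitional WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

SDT = f"""
<w:sdt xmlns:w="{W}">
  <w:sdtPr>
    <w:alias w:val="Birthday"/>
    <w:id w:val="8775518"/>
    <w:showingPlcHdr/>
    <w:date>
      <w:dateFormat w:val="M/d/yyyy"/>
      <w:lid w:val="EN-US"/>
    </w:date>
  </w:sdtPr>
  <w:sdtContent>
    <w:p>
      <w:r>
        <w:t>Click here to enter a date...</w:t>
      </w:r>
    </w:p>
  </w:sdtContent>
</w:sdt>
"""


def q(local_name):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W}}}{local_name}"


def describe_sdt(sdt):
    """Summarise an sdt element: properties from sdtPr, text from sdtContent."""
    props = sdt.find(q("sdtPr"))
    content = sdt.find(q("sdtContent"))
    alias = props.find(q("alias"))
    date_format = props.find(f"{q('date')}/{q('dateFormat')}")
    return {
        "alias": alias.get(q("val")) if alias is not None else None,
        "date_format": date_format.get(q("val")) if date_format is not None else None,
        # An empty showingPlcHdr element signals that placeholder text is displayed.
        "showing_placeholder": props.find(q("showingPlcHdr")) is not None,
        "text": "".join(t.text or "" for t in content.iter(q("t"))),
    }


summary = describe_sdt(ET.fromstring(SDT))
print(summary)
```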
Part 1, §M.1.6 will be updated as follows:

M.1.6  Custom Markup
Within a WordprocessingML document, it is often necessary for specific documents to contain semantic information beyond the presentation information specified by this Office Open XML specification. For example, an invoice document might wish to specify that a particular sentence of text is a customer name, in order for that information to be easily extracted from the document without the need to parse the text using regular expression matching or similar. For those cases, multiple facilities are provided for the insertion and round-tripping of extra-standard semantics within a WordprocessingML document.
There are three distinct forms in which extra-standard semantics can be inserted into a WordprocessingML document, each with its own specific intended usage:

·         Smart tags

·         Custom XML markup

·         Structured document tags (content controls)
The usage and presentation of each of these forms is described in the following sections.
Part 1, §M.1.6.1 will be updated as follows:

M.1.6.1            Smart Tags
The first form of extra-standard semantics which can be embedded in a WordprocessingML document is the smart tag. Smart tags allow semantic information to be added around an arbitrary run or set of runs within a document to provide information about the kind of data contained within.
Consider the following text in a WordprocessingML document, with a smart tag around the stock symbol 'CNTS' (where the smart tag is displayed using a purple dotted underline):
        This is a stock symbol: CNTS
This text would translate to the following WordprocessingML markup:

<w:p w:rsidR="00672474" w:rsidRDefault="00672474">
  <w:r>
    <w:t xml:space="preserve">This is a stock symbol: </w:t>
  </w:r>
  <w:smartTag w:uri="http://schemas.openxmlformats.org/2006/smarttags"
     w:element="stockticker">
    <w:r>
      <w:t>MSFT</w:t>
    </w:r>
  </w:smartTag>
</w:p>
As shown above, the smart tag is delimited by the smartTag element, which surrounds the run (or runs) which contain the text which is part of the smart tag.
The smart tag itself carries two required pieces of information, which together contain the extra-standard semantics for this smart tag.
The first of these is the namespace for this smart tag (contained in the uri attribute). This allows the smart tag to specify a URI which should be round-tripped with this smart tag and be available to a consumer. It is intended to be used to specify a family of smart tags to which this one belongs - for example, in the sample above, the smart tag belongs to the http://schemas.openxmlformats.org/2006/smarttags namespace.
The second of these is the element name for this smart tag (contained in the element attribute). This allows the smart tag to specify a name which should be round-tripped with this smart tag and again available to a consumer. It is intended to be used to specify a unique name for this class of smart tag - for example, in the sample above, the smart tag specifies that its data is of class stockticker.
As well as the required information specified above, a smart tag can also contain any number of additional properties in namespace/name/value sets by adding them to the smart tag's property bag.
Using the example above, adding a new property called fullCompanyName with no namespace and value Microsoft Corporation to the smart tag would mean augmenting the output to add the smartTagPr element with this new property as follows:

<w:smartTag w:uri="http://schemas.openxmlformats.org/2006/smarttags"
     w:element="stockticker">
  <w:smartTagPr>
    <w:attr w:name="fullCompanyName" w:val="Microsoft Corporation"/>
  </w:smartTagPr>
  <w:r>
    <w:t>MSFT</w:t>
  </w:r>
</w:smartTag>
The resulting XML, as seen above, simply adds an attr element which specifies the property and value for the property bag.
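As an illustration only, and not part of the proposed standard text, the property bag above could be read as a list of namespace/name/value sets. The sketch assumes the Transitional WordprocessingML namespace for the w prefix; q and smart_tag_properties are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Assumption: the w prefix is bound to the Transitional WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

SMART_TAG = f"""
<w:smartTag xmlns:w="{W}"
    w:uri="http://schemas.openxmlformats.org/2006/smarttags"
    w:element="stockticker">
  <w:smartTagPr>
    <w:attr w:name="fullCompanyName" w:val="Microsoft Corporation"/>
  </w:smartTagPr>
  <w:r>
    <w:t>MSFT</w:t>
  </w:r>
</w:smartTag>
"""


def q(local_name):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W}}}{local_name}"


def smart_tag_properties(smart_tag):
    """Return (namespace, name, value) for each entry in the property bag."""
    bag = smart_tag.find(q("smartTagPr"))
    if bag is None:
        return []
    # A property without a uri attribute has no namespace, reported here as None.
    return [
        (attr.get(q("uri")), attr.get(q("name")), attr.get(q("val")))
        for attr in bag.findall(q("attr"))
    ]


properties = smart_tag_properties(ET.fromstring(SMART_TAG))
print(properties)
```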
A producer can embed a smart tag around any run-level content in a WordprocessingML document in order to embed additional information about the family and class of the data contained within. This allows 'tagging' of specific regions of a document with these semantics without need to provide context beyond the information provided in the uri and element attributes.
A consumer can read this smart tag data and provide additional functionality around these namespace/element pairs, which might or might not be specific to that smart tag class in the document. Examples of this functionality include: the ability to add/remove this markup via a user interface, the ability to provide actions that operate in the context of this data classification, etc.
Part 1, §M.1.6.2 will be updated as follows:

M.1.6.2            Custom XML Markup
The next form of extra-standard semantics which can be embedded in a WordprocessingML document is custom XML markup. Custom XML markup allows the XML elements defined in any valid XML Schema file to be applied to the contents of a WordprocessingML document in one of two locations: around a paragraph or set of paragraphs (at the block level); or around an arbitrary run or set of runs within a document (at the inline level). In either location, the markup provides semantics to that content within the context and structures defined by the associated XML Schema definition file.
The distinction between custom XML markup and smart tags is that custom XML markup corresponds to the contents of a custom XML schema. As a result, as shown below, custom XML markup can be used at the block level, to mark up the contents of a document beyond the level of one or more runs, as well as at the inline (run) level. It can also be validated against a custom XML schema by a producer at run time.
Consider a simple XML Schema which defines two elements: a root element of invoice, and a child element of customerName - the first defining that this file's contents are an invoice, and the second specifying that the enclosed text is a customer's name:
[Image: image001.png]
This output would translate to the following WordprocessingML markup:

<w:customXml w:uri="http://www.example.com/2006/invoice" w:element="invoice">
  <w:p>
    <w:r>
      <w:t>This is an invoice.</w:t>
    </w:r>
  </w:p>
  <w:p>
    <w:r>
      <w:t xml:space="preserve">And this is a customer name: </w:t>
    </w:r>
    <w:customXml w:uri="http://www.example.com/2006/invoice" w:element="customerName">
      <w:r>
        <w:t>Tristan Davis</w:t>
      </w:r>
    </w:customXml>
  </w:p>
</w:customXml>
As shown above, each of the XML elements from the implementation-supplied XML schema is represented within the document output as a customXml element.
Similar to the smart tag example above, a custom XML element in a document has two required attributes.
The first is the uri attribute, whose contents specify the namespace of the custom XML element in the document. In the example above, the elements each belong to the http://www.example.com/2006/invoice namespace.
The second is the element attribute, whose contents specify the name of the custom XML element at this location in the document. In the example above, the root element is called invoice and the child element is called customerName.
As well as the required information specified above, custom XML elements can also specify any number of attributes (as specified in the associated XML Schema) on the element. To add this information, the customXmlPr element (properties on the custom XML element) specifies one or more attr elements.
Using the example above, we can add a type attribute to the customerName element as follows:

<w:customXml w:uri="http://www.example.com/2006/invoice" w:element="customerName">
  <w:customXmlPr>
    <w:attr w:uri="http://www.example.com/2006/invoice" w:name="type" w:val="individual"/>
  </w:customXmlPr>
  <w:r>
    <w:t>Tristan Davis</w:t>
  </w:r>
</w:customXml>
The resulting XML, as seen above, simply adds an attr element which specifies the attribute for the custom XML element.
A producer can embed a custom XML element around or with block-level or run-level content in a WordprocessingML document in order to embed the structure of the extra-standard XML Schema within the WordprocessingML content. This allows 'tagging' of specific regions of a document with the semantics from this schema, while ensuring that the resulting file can be validated to the WordprocessingML schemas.
A consumer can read this custom XML markup and provide additional functionality around this extra-standard XML markup, which might or might not be specific to that particular XML namespace. Examples of this functionality include: the ability to add/remove this XML markup via a user interface, the ability to provide actions that operate in the context of this namespace, etc.
Each custom XML element is analogous to an XML element in the specified XML schema, and can be nested arbitrarily to any depth in the document. This facility is limited only by the XML Schema file itself, and the contents of the current document.
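As an illustration only, and not part of the proposed standard text, the analogy between nested customXml elements and elements of the schema can be made concrete by rebuilding a plain XML tree from the markup. The sketch assumes the Transitional WordprocessingML namespace for the w prefix, ignores customXmlPr attributes, and gives text only to custom XML elements with no nested custom XML; q, nearest_custom_xml and to_schema_element are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Assumption: the w prefix is bound to the Transitional WordprocessingML namespace.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
INVOICE_NS = "http://www.example.com/2006/invoice"

BODY = f"""
<w:customXml xmlns:w="{W}" w:uri="{INVOICE_NS}" w:element="invoice">
  <w:p>
    <w:r><w:t>This is an invoice.</w:t></w:r>
  </w:p>
  <w:p>
    <w:r><w:t xml:space="preserve">And this is a customer name: </w:t></w:r>
    <w:customXml w:uri="{INVOICE_NS}" w:element="customerName">
      <w:r><w:t>Tristan Davis</w:t></w:r>
    </w:customXml>
  </w:p>
</w:customXml>
"""


def q(local_name):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W}}}{local_name}"


def nearest_custom_xml(node):
    """Yield the closest customXml descendants of node, skipping w:p, w:r, etc."""
    for child in node:
        if child.tag == q("customXml"):
            yield child
        else:
            yield from nearest_custom_xml(child)


def to_schema_element(custom_xml):
    """Rebuild the schema element that a customXml element stands for."""
    uri = custom_xml.get(q("uri"))
    name = custom_xml.get(q("element"))
    element = ET.Element(f"{{{uri}}}{name}")
    nested = list(nearest_custom_xml(custom_xml))
    if nested:
        for inner in nested:
            element.append(to_schema_element(inner))
    else:
        # Simplification: only leaf elements carry the enclosed run text.
        element.text = "".join(t.text or "" for t in custom_xml.iter(q("t")))
    return element


invoice = to_schema_element(ET.fromstring(BODY))
print(ET.tostring(invoice, encoding="unicode"))
```

Because the recursion follows the nesting of the markup, the depth of the rebuilt tree is bounded only by the depth of the customXml elements in the document.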
Part 1, §M.1.6.3 will be updated as follows:

M.1.6.3            Structured Document Tags
The final form of extra-standard semantics which can be embedded in a WordprocessingML document is the structured document tag (SDT).
As shown above, smart tags and custom XML markup each provide a facility for embedding extra-standard semantics into the document: smart tags, via the ability to provide a basic namespace/name for a run or set of runs within a document; and custom XML markup, via the ability to tag the document with XML elements and attributes specified by any valid XML Schema file.
However, while each of these techniques provides a way to add the desired semantic information, neither provides a way to affect the presentation or interaction within the document. To bridge these two worlds, structured document tags allow both the specification of extra-standard semantics as well as the ability to influence the presentation of that data in the document.
This means that the implementation can define the semantics and context of the tag, but can then use a rich set of pre-defined properties to define its behavior and appearance within the WordprocessingML document's presentation.
Consider a region which should be tagged with the semantic of "birthday", for the user to enter their date of birth into the document. Ideally, this region would also utilize a date picker to allow the user to enter the date from a calendar:
[Image: image003.png]
This content would translate to the following WordprocessingML markup:

<w:sdt>
  <w:sdtPr>
    <w:alias w:val="Birthday"/>
    <w:id w:val="8775518"/>
    <w:placeholder>
      <w:docPart w:val="DefaultPlaceholder_22479095"/>
    </w:placeholder>
    <w:showingPlcHdr/>
    <w:date>
      <w:dateFormat w:val="M/d/yyyy"/>
      <w:lid w:val="EN-US"/>
    </w:date>
  </w:sdtPr>

  <w:sdtContent>
    <w:p>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="PlaceholderText"/>
        </w:rPr>
        <w:t>Click here to enter a date...</w:t>
      </w:r>
    </w:p>
  </w:sdtContent>
</w:sdt>
As shown above, each of the structured document tags in the WordprocessingML file is represented using the sdt element.
Within a structured document tag, there are two child elements which contain the definition and the content of this SDT. The first of these is the sdtPr element, which contains the set of properties specified for this structured document tag. The second is the sdtContent element, which holds all the content within this structured document tag.
Part 1, §M.3.1.2.8 will be updated as follows:

M.3.1.2.8         Customer Data
There is a set of utilities that facilitate the storage of customer XML data within the file format.  Although a topic for a separate paper, essentially, this functionality comes down to the ability to store extra-standard XML in the file format in a way that it can be easily queried, modified and/or surfaced in the presentation.  Suffice it to say, the data is stored in a separate part within the package, and hence the utility pairs the object using it with the part within the package.






-----Original Message-----
From: Rick Jelliffe [mailto:rjelliffe at allette.com.au]
Sent: Saturday, May 30, 2009 10:57 PM
To: Shawn Villaron
Cc: SC 34 WG4
Subject: Re: DR 09-0207: WML: Custom XML and Smart Tags



Shawn Villaron wrote:
>
> *Nature of the Defect:*
>
> Para 3 begins, "The first example of customer-defined semantics that
> can be embedded in a WordprocessingML document are smart tags [...]"
>
> This is a misuse of the word "example".
>
> *Here is the proposed response for this DR:*
>
> The exact changes are as follows:
>
> The first example _form _of customer-defined semantics that can be
> embedded in a WordprocessingML document are smart tags.
>
> I've like to suggest that we move this to LAST CALL.
>
> shawn
>

Is "customer" a defined term? I think for wordsmithing, when we re-write a sentence, if it has "customer-defined" it should be replaced by a defined term like "foreign semantics" or "extra-standard semantics" or "additional semantics".



Cheers

Rick Jelliffe


-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.png
Type: image/png
Size: 2570 bytes
Desc: image001.png
URL: <http://mailman.vse.cz/pipermail/sc34wg4/attachments/20090610/b2a2bdb8/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image002.png
Type: image/png
Size: 2755 bytes
Desc: image002.png
URL: <http://mailman.vse.cz/pipermail/sc34wg4/attachments/20090610/b2a2bdb8/attachment-0001.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image003.png
Type: image/png
Size: 3686 bytes
Desc: image003.png
URL: <http://mailman.vse.cz/pipermail/sc34wg4/attachments/20090610/b2a2bdb8/attachment-0002.png>

