DR-09-0014: SML: Incorrect restriction on the Name Representation

Shawn Villaron shawnv at exchange.microsoft.com
Thu Feb 19 03:15:25 CET 2009


DR-09-0014: SML: Incorrect restriction on the Name Representation

Hi everyone,

Not everyone was able to attend Okinawa last month and so I wanted to take some time today to write up a summary mail regarding this particular defect for your review.  I've incorporated some of the feedback I received in Okinawa, so even for the folks who were able to attend that meeting, there is some new information contained herein.

As background information for this defect, it is important to remember that the intent of the IS 29500 standard was to fully encode existing binary documents, binary spreadsheets and binary presentations using XML.  The latest document from Microsoft regarding these binary file formats can be found here:

http://www.microsoft.com/interop/docs/officebinaryformats.mspx/

The applicable documentation for this issue can be found in the spreadsheet documentation.  In particular, we're addressing the encoding of record 218h which is defined on page 144 in the PDF version.

DR-09-0014
This issue deals with the scope of named cells within a spreadsheet.

Named cells are a common spreadsheet feature where one or more cells in a spreadsheet can be given a "friendly" name which makes it easier to reference those cells elsewhere in the spreadsheet.

For example, suppose you want to calculate the payment on a loan and you want to evaluate the impact that various interest rates would have on that payment.  One way to do this would be to use a cell to represent the interest rate, name that cell interestRate and then use that name in the formula that calculates the payment.  This way you could play with the value in the cell and not have to modify the actual formula to see the impact it has on the payment.

If we consider ThinkFree's Office Calc 3 implementation, the customer experience is pretty simple: select the cell that you will use for the interest rate and use the Insert / Name / Define ... menu.  This will bring up a dialog that shows you any named cells found in the spreadsheet.  In the dialog you can provide the name for the selected cell(s).  At this point, any time you want to refer to the contents of this cell, you can refer to is as "interestRate" as opposed to "A:B2".

An important observation here is that the scope of the named cells is at the workbook level and not the worksheet level.  Given that worksheets can refer to cells in other worksheets, the uniqueness of named cells needs to be scoped at the workbook level.  Otherwise, there could be two different cell ranges with the same name causing confusion as to which named cell(s) value(s) should be used.  Clearly this would be a problem for data integrity and interoperability.

Other vendors' products such as OpenOffice.org Calc 3, Corel Quatro Pro X4 and Microsoft's Excel 2003 all have similar implementations: they support the ability to name cells, they require workbook-level uniqueness and they express named cells using the same manner described in the binary file format document above.  Because these four vendors have chosen similar implementations, spreadsheets with named cells can be successfully passed between these vendors' products.

I've attached a sample file generated from Think Free's Office Calc 3 product in case folks would like to experiment with this functionality and the associated interop scenarios themselves.  Naturally there are some bugs in the vendors' implementations, but you can take the attached Think Free file and open it up successfully in the other implementations.  The spreadsheets open, the named cells are persisted and the function which consumes the named cells continues to work as expected.

Additionally, Apple's iWorks '09 Numbers can consume files generated from such implementations and make use of those named cells.  While they have similar functionality to named cells, I would not go so far as to say they have native support for authoring named cells ( I can go into more detail if anyone cares ).

If we look at the IS 29500 standard, the prose states the following ( italics mine ):

18.17.6.5 Name Representation
A formula can contain one or more names. These names shall be defined in the Worksheet part's XML with each being the subject of a definedName element, inside a definedNames element.

Not only does this conflict with existing binary spreadsheets, it also conflicts with the intent of the schema.  The only valid parent element for the definedNames element is the workbook element.  I interpret this such that the intent of the standard was to match the existing binary spreadsheets' design and scope the uniqueness of named cells to the workbook and not the worksheet.

Given that the standard's prose conflicts with the standard's schema, that the industry has a common implementation of this that successfully enables interoperability through the exchange of binary spreadsheets and that the vendors' implementation is consistent with the binary spreadsheet documentation, I would recommend we change the prose to scope the uniqueness of named cells to the workbook.  The updated text would look as follows ( italics to emphasize changed text ):

18.17.6.5 Name Representation
A formula can contain one or more names. These names shall be defined in the Workbook part's XML with each being the subject of a definedName element, inside a definedNames element.

Thanks for reading.

shawn



-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mailman.vse.cz/pipermail/sc34wg4/attachments/20090219/e2ff6fda/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: thinkfree-namedcell.xls
Type: application/vnd.ms-excel
Size: 5632 bytes
Desc: thinkfree-namedcell.xls
URL: <http://mailman.vse.cz/pipermail/sc34wg4/attachments/20090219/e2ff6fda/attachment.xls>


More information about the sc34wg4 mailing list