Growth hint -- used in file format identification

Arms, Caroline caar at loc.gov
Tue Nov 4 23:23:37 CET 2014


Jirka,

I suspected that was the case. I'm not able to jump in myself (it's out of scope for my contract with the Library of Congress), but I might contact someone who probably interacts with the PRONOM folks anyway and alert them to the potential problem.

     Thanks.   Caroline

Caroline Arms
Library of Congress Contractor
Co-compiler of Sustainability of Digital Formats resource http://www.digitalpreservation.gov/formats/

** Views expressed are personal and not necessarily those of the institution **

-----Original Message-----
From: Jirka Kosek [mailto:jirka at kosek.cz] 
Sent: Tuesday, November 04, 2014 4:41 PM
To: Arms, Caroline; SC34
Subject: Re: Growth hint -- used in file format identification

On 4.11.2014 19:25, Arms, Caroline wrote:
> Just FYI.  The OOXML record in the primary database of file format signatures used by the archival community includes:
> http://apps.nationalarchives.gov.uk/PRONOM/Format/proFormatSearch.aspx?status=detailReport&id=910&strPageToDisplay=signatures
> which takes advantage of the growth hint bytes actually found in .xlsx, .docx, and .pptx (etc.) files to recognize OOXML files from the Zip-based package without unpacking it.
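
[Illustration] Recognizing a package "without unpacking it" comes down to reading fixed-position bytes at the start of the ZIP. The sketch below is not the PRONOM signature itself; it only shows where the first entry's name and extra field (where padding such as a growth hint would sit) can be read from the local file header. The helper name first_local_header and the sample extra-field bytes are hypothetical.

```python
import struct

# Layout of the fixed 30-byte ZIP local file header (little-endian).
LOCAL_HEADER_MAGIC = b"PK\x03\x04"
_LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def first_local_header(data: bytes):
    """Return (name, extra) for the first ZIP entry, or None if not a ZIP.

    Reads only the local file header; nothing is decompressed.
    """
    if len(data) < _LOCAL_HEADER.size:
        return None
    fields = _LOCAL_HEADER.unpack_from(data, 0)
    magic, name_len, extra_len = fields[0], fields[9], fields[10]
    if magic != LOCAL_HEADER_MAGIC:
        return None
    start = _LOCAL_HEADER.size
    name = data[start:start + name_len]
    extra = data[start + name_len:start + name_len + extra_len]
    return name.decode("cp437"), extra


# Hand-built sample header. The extra bytes are placeholders only,
# NOT the real growth-hint layout written by any producer.
name = b"[Content_Types].xml"
extra = b"\x00" * 8
sample = _LOCAL_HEADER.pack(
    LOCAL_HEADER_MAGIC, 20, 0, 0, 0, 0, 0, 0, 0, len(name), len(extra)
) + name + extra

print(first_local_header(sample))
```

A byte signature that expects specific extra-field content at this position can only match files whose producer wrote that content.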

Hi Caroline,

but this signature matches only files produced by MS Office, right?

I sometimes create OOXML files programmatically and zip them using a generic ZIP library. Such files do not match the signature mentioned above.
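
[Illustration] A minimal sketch of that situation, assuming Python's standard zipfile module as the "generic ZIP library". The two-part package is a stub rather than a complete OOXML document, and the function names are hypothetical. It shows that a generic writer leaves the first entry's extra field empty, while a check based on the package structure (presence of [Content_Types].xml) still succeeds. That structural check is one possible alternative, not what PRONOM does.

```python
import io
import zipfile


def build_minimal_package() -> bytes:
    """Write a stub OPC-style package with a generic ZIP writer."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(
            "[Content_Types].xml",
            "<Types xmlns='http://schemas.openxmlformats.org/"
            "package/2006/content-types'/>",
        )
        zf.writestr(
            "_rels/.rels",
            "<Relationships xmlns='http://schemas.openxmlformats.org/"
            "package/2006/relationships'/>",
        )
    return buf.getvalue()


def first_entry_extra_length(data: bytes) -> int:
    """Extra-field length sits at bytes 28-29 of the first local header."""
    return int.from_bytes(data[28:30], "little")


def looks_like_opc_package(data: bytes) -> bool:
    """Structure-based check: is it a ZIP containing [Content_Types].xml?"""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            return "[Content_Types].xml" in zf.namelist()
    except zipfile.BadZipFile:
        return False


package = build_minimal_package()
print(first_entry_extra_length(package))  # no padding bytes to match on
print(looks_like_opc_package(package))
```

The trade-off is that the structural check has to read the ZIP directory, whereas a byte signature only needs the first bytes of the file.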

				Jirka


--
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirka at kosek.cz      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 rep.
------------------------------------------------------------------
    Bringing you XML Prague conference    http://xmlprague.cz
------------------------------------------------------------------


