UTF-8 in ZIP

Dennis E. Hamilton dennis.hamilton at acm.org
Tue Nov 2 17:21:14 CET 2010

The main difficulty is that the default in Zip is a single-byte encoding,
with a presumed single-byte code page for the filename entry.  This clashes
with UTF-8 for every Unicode code point outside 7-bit ASCII (bit 8 = 0);
only within that range is UTF-8 identical to single-byte ASCII.
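
A minimal Python sketch of the clash described above. It assumes CP437 as
the presumed single-byte code page; the function names and sample file
names are illustrative only and do not come from any Zip tool.

```python
def zip_name_bytes(name: str) -> dict:
    """Encode a file name as UTF-8 and as CP437 (assumed code page)."""
    return {
        "utf-8": name.encode("utf-8"),
        "cp437": name.encode("cp437"),
    }


def misread_as_cp437(name: str) -> str:
    """Show what a reader that presumes CP437 sees for a UTF-8 name."""
    return name.encode("utf-8").decode("cp437")


if __name__ == "__main__":
    ascii_name = "readme.txt"
    accented = "r\u00e9sum\u00e9.txt"  # "resume.txt" with two e-acute letters

    # Pure 7-bit ASCII: both encodings give the same bytes.
    print(zip_name_bytes(ascii_name))

    # Non-ASCII: UTF-8 uses two bytes per accented letter, CP437 uses one.
    print(zip_name_bytes(accented))

    # The UTF-8 bytes, read under the single-byte assumption, are garbled.
    print(misread_as_cp437(accented))
```

For names restricted to 7-bit ASCII the two byte strings are identical,
which is why such names have traditionally been safe cross-platform.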

There is an Appendix about this in versions of the App Note more recent than

Of course, if we introduced %-encoding of other UTF-8 sequences (say, using
the IRI collapse to URI mapping), it would fit that practice and we would be
within the sweet spot that Zip has traditionally supported cross-platform.
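
A rough Python sketch of that %-encoding idea, assuming the usual
UTF-8-then-percent-escape step of the IRI-to-URI mapping. The helper names
are hypothetical; only urllib.parse.quote and unquote are standard-library
calls.

```python
from urllib.parse import quote, unquote


def to_ascii_entry_name(name: str) -> str:
    """Percent-encode the UTF-8 bytes of every non-ASCII character.

    "/" is left alone so that path separators inside the archive survive.
    """
    return quote(name, safe="/")


def from_ascii_entry_name(entry: str) -> str:
    """Reverse the mapping: percent-decode and interpret bytes as UTF-8."""
    return unquote(entry)


if __name__ == "__main__":
    original = "docs/r\u00e9sum\u00e9.txt"
    stored = to_ascii_entry_name(original)
    print(stored)  # docs/r%C3%A9sum%C3%A9.txt
    assert stored.isascii()
    assert from_ascii_entry_name(stored) == original
```

The stored name is pure 7-bit ASCII, so it reads the same under any
presumed single-byte code page.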

 - Dennis

-----Original Message-----
From: sc34wg1study-bounces at vse.cz [mailto:sc34wg1study-bounces at vse.cz] On
Behalf Of MURATA Makoto (FAMILY Given)
Sent: Tuesday, November 02, 2010 07:39
To: ISO Zip
Subject: UTF-8 in ZIP

Dear colleagues,

I am wondering if I have to pay a license fee to PKWARE if 
I use UTF-8 file names in a ZIP file.  Note that EPUB and 
Widget packaging of W3C use UTF-8 file names in ZIP.  (I plan 
to submit an issue about this topic to the IDPF EPUB WG.)


MURATA Makoto (FAMILY Given)
