UTF-8 in ZIP
Dennis E. Hamilton
dennis.hamilton at acm.org
Tue Nov 2 17:21:14 CET 2010
The main difficulty is that Zip by default assumes a single-byte
encoding, with a presumed single-byte code page, for the filename entry. This
clashes with UTF-8 for any Unicode code point outside 7-bit ASCII. Within
7-bit ASCII (high bit = 0), UTF-8 is simply single-byte ASCII.
There is an Appendix about this in versions of the App Note more recent than
6.2.0.
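
To make the clash concrete, here is a minimal sketch using Python's standard zipfile module. It is an illustration only, not anything taken from the App Note text: the helper name name_flags and the sample file names are made up for the example. It relies on general purpose flag bit 11 (0x800) marking a name as UTF-8, which, as far as I know, is the mechanism the later App Note versions describe.

```python
import io
import zipfile

# General purpose bit 11: the file name is stored as UTF-8.
UTF8_FLAG = 0x800


def name_flags(name):
    """Write one empty member called `name` to an in-memory archive,
    read it back, and report (stored name, whether the UTF-8 flag is set).
    Hypothetical helper, for illustration only."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(name, b"")
    with zipfile.ZipFile(buf) as zf:
        info = zf.infolist()[0]
    return info.filename, bool(info.flag_bits & UTF8_FLAG)


ascii_name = "chapter1.xhtml"
non_ascii_name = "r\u00e9sum\u00e9.xhtml"

# Pure ASCII is byte-for-byte identical in UTF-8, so no flag is needed.
assert ascii_name.encode("utf-8") == ascii_name.encode("ascii")

# Each accented letter becomes a two-byte UTF-8 sequence, which a reader
# assuming a single-byte code page would show as two wrong characters.
assert len(non_ascii_name.encode("utf-8")) == len(non_ascii_name) + 2

print(name_flags(ascii_name))
print(name_flags(non_ascii_name))
```

Python's writer only sets the flag when the name is not plain ASCII, which is why the first name stays in the traditional single-byte territory.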
Of course, if we introduced %-encoding of the non-ASCII UTF-8 sequences (say,
using the IRI-to-URI mapping), file names would stay within ASCII, fit the
single-byte practice, and remain within the sweet spot that Zip has
traditionally supported cross-platform.
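
A rough sketch of what such %-encoding could look like, using Python's urllib.parse. The function names to_ascii_name and from_ascii_name are hypothetical, and this is an approximation under stated assumptions rather than the exact IRI-to-URI mapping: quote() also escapes some reserved ASCII characters (a space, a literal "%"), whereas the IRI mapping, as I understand it, touches only the non-ASCII characters.

```python
from urllib.parse import quote, unquote


def to_ascii_name(name):
    """Percent-encode the UTF-8 bytes of `name`, keeping "/" as the
    path separator. Hypothetical helper, for illustration only."""
    return quote(name, safe="/", encoding="utf-8")


def from_ascii_name(stored):
    """Reverse the mapping: decode %XX escapes as UTF-8."""
    return unquote(stored, encoding="utf-8")


original = "OEBPS/r\u00e9sum\u00e9.xhtml"
stored = to_ascii_name(original)

print(stored)  # OEBPS/r%C3%A9sum%C3%A9.xhtml
assert stored.isascii()
assert from_ascii_name(stored) == original
```

The stored form is plain 7-bit ASCII, so it survives any single-byte code page assumption. The cost is that every consumer has to agree to decode the escapes.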
- Dennis
-----Original Message-----
From: sc34wg1study-bounces at vse.cz [mailto:sc34wg1study-bounces at vse.cz] On
Behalf Of MURATA Makoto (FAMILY Given)
Sent: Tuesday, November 02, 2010 07:39
To: ISO Zip
Subject: UTF-8 in ZIP
Dear colleagues,
I am wondering whether I have to pay a license fee to PKWARE if
I use UTF-8 file names in a ZIP file. Note that EPUB and
W3C Widget packaging use UTF-8 file names in ZIP. (I plan
to submit an issue about this topic to the IDPF EPUB WG.)
Regards,
MURATA Makoto (FAMILY Given)