Copyright | Copyright (C) 2004-2011 John Goerzen |
---|---|
License | BSD-3-Clause |
Stability | stable |
Portability | portable |
Safe Haskell | Safe |
Language | Haskell2010 |
GZip file decompression
Copyright (c) 2004 John Goerzen, jgoerzen@complete.org
The GZip format is described in RFC1952.
Synopsis
- data Header = Header {}
- type Section = (Header, String, Footer)
- data GZipError
- data Footer = Footer {}
- decompress :: String -> (String, Maybe GZipError)
- hDecompress :: Handle -> Handle -> IO (Maybe GZipError)
- read_sections :: String -> Either GZipError [Section]
- read_header :: String -> Either GZipError (Header, String)
- read_section :: String -> Either GZipError (Section, String)
GZip Files
GZip files contain one or more Sections. Each Section, on disk, begins with a GZip Header, then stores the compressed data itself, and finally stores a GZip Footer.
The Header identifies the file as a GZip file, records the original modification date and time, and, in some cases, also records the original filename and comments.
The Footer contains a GZip CRC32 checksum over the decompressed data as well as a 32-bit length of the decompressed data. The module GZip is used to validate stored CRC32 values.
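To make the Footer layout concrete, here is a minimal sketch that decodes the two 32-bit fields from the final eight bytes of a single-section file. It follows the trailer layout given in RFC 1952 (CRC32 first, then length, both least-significant byte first). The names Trailer, le32 and parseTrailer are hypothetical and are not part of this module; the module's own Footer type may be organized differently.

```haskell
import Data.Bits (shiftL, (.|.))
import Data.Char (ord)
import Data.Word (Word32)

-- Hypothetical stand-in for the trailer fields; NOT this module's Footer type.
data Trailer = Trailer
  { trailerCrc32 :: Word32  -- CRC32 of the decompressed data
  , trailerSize  :: Word32  -- decompressed length, modulo 2^32
  } deriving (Eq, Show)

-- Combine bytes, least significant first, into a 32-bit word.
-- Assumes every Char in the String holds a single byte (0-255).
le32 :: String -> Word32
le32 = foldr (\c acc -> (acc `shiftL` 8) .|. fromIntegral (ord c)) 0

-- Decode the last eight bytes of the input: CRC32, then length.
parseTrailer :: String -> Maybe Trailer
parseTrailer s
  | length s < 8 = Nothing
  | otherwise    = Just (Trailer (le32 crcBytes) (le32 sizeBytes))
  where
    lastEight             = drop (length s - 8) s
    (crcBytes, sizeBytes) = splitAt 4 lastEight
```

Because the length field is only 32 bits wide, it cannot by itself identify the true size of data larger than 4 GiB.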
The vast majority of GZip files contain only one Section. Standard tools that work with GZip files create single-section files by default. Multi-section files can be created by concatenating two existing GZip files. The standard gunzip and zcat tools concatenate the decompressed data when reading these files back. The decompress function in this module does the same.
When reading data from this module, take care with the order in which you access the results. For instance, if you want to write the decompressed stream to disk and validate its CRC32 value, you could use the decompress function. However, you should process the entire stream before you inspect the Maybe GZipError it returns. Otherwise, you will force Haskell to buffer the entire file in memory just so it can check the CRC32.
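The ordering advice above can be sketched as follows. This assumes the module is importable as System.FileArchive.GZip (the module path is not stated on this page), and the helper names describe and decompressFile are hypothetical. The key point is that the decompressed stream is written out before the error value is examined.

```haskell
import System.FileArchive.GZip (GZipError (..), decompress)  -- assumed module path
import System.IO

-- Turn each error constructor into a short message.
describe :: GZipError -> String
describe CRCError           = "CRC-32 check failed"
describe NotGZIPFile        = "no GZip header found"
describe UnknownMethod      = "compression method is not deflate"
describe (UnknownError msg) = "other problem: " ++ msg

-- Decompress inPath to outPath, writing the stream before inspecting the result.
decompressFile :: FilePath -> FilePath -> IO (Maybe GZipError)
decompressFile inPath outPath =
  withBinaryFile inPath ReadMode $ \inH ->
    withBinaryFile outPath WriteMode $ \outH -> do
      compressed <- hGetContents inH
      let (plain, result) = decompress compressed
      hPutStr outH plain   -- consume the stream first, piece by piece
      return $! result     -- only now force the error value, while inH is still open
```

A caller might run `decompressFile "in.gz" "out"` and, on `Just e`, report `describe e` and delete the output file, since the data written is unreliable in that case.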
Types
The data structure representing the GZip header. This occurs at the beginning of each Section on disk.
type Section = (Header, String, Footer)
A section represents a compressed component in a GZip file. Every GZip file has at least one.
Constructor | Meaning |
---|---|
CRCError | CRC-32 check failed |
NotGZIPFile | Couldn't find a GZip header |
UnknownMethod | Compressed with something other than method 8 (deflate) |
UnknownError String | Other problem arose |
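As an illustration of the header-related error cases, the following sketch inspects the first three bytes of a section. It is based on RFC 1952 (magic bytes 0x1f 0x8b followed by the compression method byte) and is not this module's implementation. The type HeaderProblem and the function checkHeaderStart are hypothetical names introduced only for this example.

```haskell
import Data.Char (ord)

-- Hypothetical local type mirroring two of the error cases; NOT GZipError itself.
data HeaderProblem = NotGZip | NotDeflate
  deriving (Eq, Show)

-- Inspect the first three bytes: the two magic bytes, then the method byte.
checkHeaderStart :: String -> Either HeaderProblem ()
checkHeaderStart (m1 : m2 : method : _)
  | map ord [m1, m2] /= [0x1f, 0x8b] = Left NotGZip     -- missing GZip magic
  | ord method /= 8                  = Left NotDeflate  -- method 8 is deflate
  | otherwise                        = Right ()
checkHeaderStart _ = Left NotGZip  -- fewer than three bytes cannot hold a header
```

Here NotGZip corresponds roughly to NotGZIPFile and NotDeflate to UnknownMethod.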
The Footer is stored on disk at the end of each Section.
Whole-File Processing
decompress :: String -> (String, Maybe GZipError)
Read a GZip file, decompressing all sections that are found.
Returns the decompressed data stream paired with Nothing, or an unreliable string paired with Just an error. If you get anything other than Nothing, the String returned should be discarded.
hDecompress :: Handle -> Handle -> IO (Maybe GZipError)
Read a GZip file from the input handle, decompressing all sections found.
Writes the decompressed data stream to the given output handle.
Returns Nothing if the action was successful, or Just GZipError if there was a problem. If there was a problem, the data written to the output handle should be discarded.
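A possible way to call hDecompress on two files is sketched below. It assumes the module path System.FileArchive.GZip and that the first handle is the input and the second the output. The helpers gunzipFile and trustworthy are hypothetical names used only for illustration.

```haskell
import System.FileArchive.GZip (GZipError, hDecompress)  -- assumed module path
import System.IO

-- True when no error was reported, i.e. the written output can be kept.
trustworthy :: Maybe GZipError -> Bool
trustworthy Nothing  = True
trustworthy (Just _) = False

-- Decompress inPath into outPath using binary-mode handles.
gunzipFile :: FilePath -> FilePath -> IO Bool
gunzipFile inPath outPath =
  withBinaryFile inPath ReadMode $ \inH ->
    withBinaryFile outPath WriteMode $ \outH -> do
      result <- hDecompress inH outH
      -- Force the result before the handles close, as a precaution against laziness.
      return $! trustworthy result
```

If gunzipFile returns False, the file at outPath should be treated as unreliable and removed.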