darcs-2.12.5: a distributed, interactive, smart revision control system

Copyright(c) The University of Glasgow 2001
David Roundy 2003-2005
LicenseGPL (I'm happy to also license this file BSD style but don't want to bother distributing two license files with darcs.
Maintainerdroundy@abridgegame.org
Stabilityexperimental
Portabilityportable
Safe HaskellNone
LanguageHaskell2010

Darcs.Util.ByteString

Description

GZIp and MMap IO for ByteStrings, encoding utilities, and miscellaneous functions for Data.ByteString

Synopsis

Documentation

unsafeWithInternals :: ByteString -> (Ptr Word8 -> Int -> IO a) -> IO a Source #

Do something with the internals of a PackedString. Beware of altering the contents!

unpackPSFromUTF8 :: ByteString -> String Source #

Decodes a ByteString containing UTF-8 to a String. Decoding errors are flagged with the U+FFFD character.

gzReadFilePS :: FilePath -> IO ByteString Source #

Read an entire file, which may or may not be gzip compressed, directly into a ByteString.

mmapFilePS :: FilePath -> IO ByteString Source #

Like readFilePS, this reads an entire file directly into a ByteString, but it is even more efficient. It involves directly mapping the file to memory. This has the advantage that the contents of the file never need to be copied. Also, under memory pressure the page may simply be discarded, wile in the case of readFilePS it would need to be written to swap. If you read many small files, mmapFilePS will be less memory-efficient than readFilePS, since each mmapFilePS takes up a separate page of memory. Also, you can run into bus errors if the file is modified. NOTE: as with readFilePS, the string representation in the file is assumed to be ISO-8859-1.

gzReadStdin :: IO ByteString Source #

Read standard input, which may or may not be gzip compressed, directly into a ByteString.

type FileSegment = (FilePath, Maybe (Int64, Int)) Source #

Pointer to a filesystem, possibly with start/end offsets. Supposed to be fed to (uncurry mmapFileByteString) or similar.

readSegment :: FileSegment -> IO ByteString Source #

Read in a FileSegment into a Lazy ByteString. Implemented using mmap.

gzDecompress :: Maybe Int -> ByteString -> ([ByteString], Bool) Source #

Decompress the given bytestring into a lazy list of chunks, along with a boolean flag indicating (if True) that the CRC was corrupted. Inspecting the flag will cause the entire list of chunks to be evaluated (but if you throw away the list immediately this should run in constant space).

unlinesPS :: [ByteString] -> ByteString Source #

This function acts exactly like the Prelude unlines function, or like Data.ByteString.Char8 unlines, but with one important difference: it will produce a string which may not end with a newline! That is:

unlinesPS ["foo", "bar"]

evaluates to "foo\nbar", not "foo\nbar\n"! This point should hold true for linesPS as well.

TODO: rename this function.

readIntPS :: ByteString -> Maybe (Int, ByteString) Source #

readIntPS skips any whitespace at the beginning of its argument, and reads an Int from the beginning of the PackedString. If there is no integer at the beginning of the string, it returns Nothing, otherwise it just returns the int read, along with a B.ByteString containing the remainder of its input.

betweenLinesPS :: ByteString -> ByteString -> ByteString -> Maybe ByteString Source #

betweenLinesPS returns the B.ByteString between the two lines given, or Nothing if they do not appear.

intercalate :: ByteString -> [ByteString] -> ByteString #

O(n) The intercalate function takes a ByteString and a list of ByteStrings and concatenates the list after interspersing the first argument between each element of the list.

isAscii :: ByteString -> Bool Source #

Test if a ByteString is made of ascii characters

decodeLocale :: ByteString -> String Source #

Decode a ByteString to a String according to the current locale unsafePerformIO in the locale function is ratified by the fact that GHC 6.12 and above also supply locale conversion with functions with a pure type. Unrecognized byte sequences in the input are skipped.

encodeLocale :: String -> ByteString Source #

Encode a String to a ByteString according to the current locale

decodeString :: String -> IO String Source #

Take a String that represents byte values and re-decode it acording to the current locale. Note: we globally enforce char8 as the default encoding, see Main and Darcs.Utils. This means we get command line args and environment variables as Strings with char8 encoding, too. So we need this to convert such strings back to the user's encoding.