ssv-0.3: Comma-separated-value (CSV) read, show and write routines

Safe Haskell: Safe-Inferred
Language: Haskell2010

Text.SSV

Description

This module provides conversion routines to and from various "something-separated value" (SSV) formats. In particular, it converts the infamous "comma separated value" (CSV) format.

SSV format descriptions

These records define a fairly flexible, if entirely kludgy, domain-specific language for describing "something-separated value" formats. An attempt is made in the reader and formatter to allow for fairly arbitrary combinations of features in a sane way. However, your mileage may undoubtedly vary; CSV is the only tested configuration.

data SSVFormat

Formatting information for a particular SSV variant.

Constructors

SSVFormat

Fields

ssvFormatName :: String

ssvFormatTerminator :: Char

End of row.

ssvFormatSeparator :: Char

Field separator.

ssvFormatEscape :: Maybe Char

Escape character outside of quotes.

ssvFormatStripWhite :: Bool

Strip "extraneous" whitespace next to separators on input.

ssvFormatQuote :: Maybe SSVFormatQuote

Quote format.

ssvFormatWhiteChars :: String

Characters regarded as whitespace.

data SSVFormatQuote

Formatting information for quoted strings for a particular SSV variant.
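
As a hedged illustration of how these records might be used, the sketch below derives a variant from csvFormat by record update instead of building an SSVFormat from scratch. The name semicolonFormat and the label "SCSV" are invented for this example and are not part of the library. Since CSV is the only tested configuration, treat any such variant as untested.

```haskell
import Text.SSV (SSVFormat (..), csvFormat, showSSV)

-- Hypothetical variant: keep the CSV rules, but separate fields with ';'.
-- Only the two fields named here change; everything else comes from csvFormat.
semicolonFormat :: SSVFormat
semicolonFormat = csvFormat
  { ssvFormatName      = "SCSV"  -- illustrative label, not defined by the library
  , ssvFormatSeparator = ';'
  }

main :: IO ()
main = putStr (showSSV semicolonFormat [["name", "qty"], ["apple", "3"]])
```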

SSV read, show and IO routines

readSSV :: SSVFormat -> String -> [[String]]

Read using an arbitrary SSVFormat. The input is not cleaned with toNL; if you want this, do it yourself. The standard SSV formats csvFormat and pwfFormat are provided.
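
A minimal sketch of reading with a non-CSV format, assuming the ssv package is installed. The sample line is made up for illustration. Because readSSV does not call toNL itself, the example applies it explicitly before parsing.

```haskell
import Text.SSV (pwfFormat, readSSV, toNL)

main :: IO ()
main = do
  -- Made-up password-file line with a CRLF ending.
  let raw  = "root:x:0:0:root:/root:/bin/sh\r\n"
      -- readSSV leaves line endings alone, so normalise them first.
      rows = readSSV pwfFormat (toNL raw)
  mapM_ print rows
```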

showSSV :: SSVFormat -> [[String]] -> String

Show using an arbitrary SSVFormat. The standard SSV formats csvFormat and pwfFormat are provided. Some effort is made to "intelligently" quote the fields; in the worst case an SSVShowException will be thrown to indicate that a field had characters that could not be quoted. Spaces or tabs in input fields cause quoting only if they are adjacent to a separator, and then only if ssvFormatStripWhite is True.
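
The following sketch feeds showSSV a few fields that exercise the quoting rules described above. The sample rows are invented, and no particular output is claimed here; run it to see which fields the formatter decides to quote.

```haskell
import Text.SSV (csvFormat, showSSV)

-- Invented sample data: one ordinary field, one containing the separator,
-- and one with whitespace that would sit next to a separator.
sampleRows :: [[String]]
sampleRows =
  [ ["id", "note"]
  , ["1", "plain"]
  , ["2", "has, comma"]       -- contains the separator, so it needs quoting
  , ["3", " leading space"]   -- quoted only if ssvFormatStripWhite is True
  ]

main :: IO ()
main = putStr (showSSV csvFormat sampleRows)
```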

hPutSSV :: SSVFormat -> Handle -> [[String]] -> IO ()

Put a representation of the given SSV input out on a file handle using the given SSVFormat. Uses CRLF as the line terminator, as recommended by RFC 4180 for CSV. Otherwise, this function behaves like writing the output of showSSV to the Handle; if you want native line terminators, use that latter method instead.
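
The two options mentioned above can be sketched side by side. The file names are placeholders chosen for this example. Separate handles are used on purpose, on the assumption that hPutSSV may adjust settings on the handle it is given.

```haskell
import System.IO (IOMode (WriteMode), hPutStr, withFile)
import Text.SSV (csvFormat, hPutSSV, showSSV)

rows :: [[String]]
rows = [["a", "b"], ["1", "2"]]

main :: IO ()
main = do
  -- CRLF line terminators, as hPutSSV documents.
  withFile "crlf.csv" WriteMode $ \h ->
    hPutSSV csvFormat h rows
  -- Native line terminators: write the showSSV output yourself.
  withFile "native.csv" WriteMode $ \h ->
    hPutStr h (showSSV csvFormat rows)
```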

writeSSVFile :: SSVFormat -> String -> [[String]] -> IO ()

Write an SSV representation of the given input into a new file located at the given path, using the given SSVFormat. As with hPutSSV, CRLF will be used as the line terminator.
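
A small round-trip sketch, assuming a writable working directory; the path example.csv and the sample rows are placeholders. Since the file is written with CRLF terminators and readSSV does not clean its input, toNL is applied before reading the contents back.

```haskell
import Text.SSV (csvFormat, readSSV, toNL, writeSSVFile)

sample :: [[String]]
sample = [["city", "country"], ["Oslo", "Norway"]]

main :: IO ()
main = do
  let path = "example.csv"  -- placeholder path
  writeSSVFile csvFormat path sample
  contents <- readFile path
  -- Normalise the CRLF terminators, parse, and compare with the input rows.
  print (readSSV csvFormat (toNL contents) == sample)
```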

CSV read, show and IO routines

CSV is a special case here. Partly this is by virtue of being the most common format. CSV also needs a little bit of "special" help with input line endings to conform to RFC 4180.

readCSV :: String -> [[String]]

Convert a String representing a CSV file into a properly-parsed list of rows, each a list of String fields. Adheres to the spirit and (mostly) to the letter of RFC 4180, which defines the `text/csv` MIME type.

toNL is used on the input string to clean up the various line endings that might appear. Note that this may result in irreversible, undesired manglings of CRs and LFs.

Fields are expected to be separated by commas. Per RFC 4180, fields may be double-quoted: only whitespace, which is discarded, may appear outside the double-quotes of a quoted field. For unquoted fields, whitespace to the left of the field is discarded, but whitespace to the right is retained; this is convenient for the parser, and probably corresponds to the typical intent of CSV authors. Whitespace on both sides of a quoted field is discarded. If a double-quoted field contains two double-quotes in a row, these are treated as an escaped encoding of a single double-quote.

The final line of the input may end with a line terminator, which will be ignored, or without one.
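
To make the rules above concrete, here is a hedged sketch that parses a small made-up input with mixed line endings, a quoted field containing doubled quotes, and stray whitespace around fields. It prints the parsed rows so the whitespace handling can be inspected directly.

```haskell
import Text.SSV (readCSV)

-- Made-up input:
--   line 1 ends with CRLF;
--   line 2 has a quoted field with doubled quotes and whitespace around it;
--   line 3 has an unquoted field with whitespace on both sides.
input :: String
input = "name,quote\r\nAda,  \"said \"\"hi\"\"\"  \r\n  Bob ,plain\n"

main :: IO ()
main = mapM_ print (readCSV input)
```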

showCSV :: [[String]] -> String

Convert a list of rows, each a list of String fields, to a single String CSV representation. Adheres to the spirit and (mostly) to the letter of RFC 4180, which defines the `text/csv` MIME type.

Newline will be used as the end-of-line character, and no discardable whitespace will appear in fields. Fields that need to be quoted because they contain a special character or line terminator will be quoted; all other fields will be left unquoted. The final row of CSV will end with a newline.
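
A brief sketch of a show/read round trip on fields that need quoting. The data is invented, and the comparison is printed instead of assumed, since RFC 4180 is followed only "mostly" to the letter.

```haskell
import Text.SSV (readCSV, showCSV)

-- Invented rows with a comma, an embedded double-quote and an embedded newline.
tricky :: [[String]]
tricky =
  [ ["title", "body"]
  , ["a, b", "she said \"ok\""]
  , ["multi", "line one\nline two"]
  ]

main :: IO ()
main = do
  let text = showCSV tricky
  putStr text
  -- Check whether parsing the generated text gives back the original rows.
  print (readCSV text == tricky)
```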

hPutCSV :: Handle -> [[String]] -> IO ()

Perform hPutSSV with csvFormat.

Newline conversions

toNL :: String -> String

Convert CR LF sequences on input to LF (NL). Also convert other CRs to LF. This is probably the right way to handle CSV data.

fromNL :: String -> String

Convert LF (NL) sequences on input to CR LF. Leaves other CRs alone.
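
The two conversions can be sketched without the library. The functions toNLSketch and fromNLSketch below are illustrative re-implementations of the behaviour described above, not the library's own code; in particular, how the real fromNL treats a CR that already precedes an LF is an assumption here.

```haskell
-- Sketch of toNL: CR LF becomes LF, and any remaining lone CR also becomes LF.
toNLSketch :: String -> String
toNLSketch ('\r' : '\n' : cs) = '\n' : toNLSketch cs
toNLSketch ('\r' : cs)        = '\n' : toNLSketch cs
toNLSketch (c : cs)           = c : toNLSketch cs
toNLSketch []                 = []

-- Sketch of fromNL: every LF becomes CR LF; CRs are passed through untouched.
fromNLSketch :: String -> String
fromNLSketch ('\n' : cs) = '\r' : '\n' : fromNLSketch cs
fromNLSketch (c : cs)    = c : fromNLSketch cs
fromNLSketch []          = []

main :: IO ()
main = do
  print (toNLSketch "a\r\nb\rc\n")   -- prints "a\nb\nc\n"
  print (fromNLSketch "a\nb\rc")     -- prints "a\r\nb\rc"
```

Note that the pair is not an inverse: a lone CR turned into LF by the first function comes back as CR LF from the second, which is the kind of irreversible mangling mentioned for readCSV.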

Exceptions

data SSVReadException

Indicates format name, line and column and gives an error message.

data SSVShowException

Indicates format name and failed field and gives an error message. This should probably just be an error, as the calling program is really responsible for passing something formattable to the show routines.
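
A hedged sketch of catching a show failure. It assumes that SSVShowException is an instance of Exception and that pwfFormat, having no escape convention, cannot represent a field containing a colon; neither assumption is confirmed by the text above. The result is forced with evaluate because showSSV is a pure function and would otherwise fail lazily, outside the try.

```haskell
import Control.Exception (evaluate, try)
import Text.SSV (SSVShowException, pwfFormat, showSSV)

-- Invented row whose second field contains the pwfFormat separator.
rows :: [[String]]
rows = [["user", "home:dir"]]

main :: IO ()
main = do
  -- Force the whole output string inside try so any exception surfaces here.
  result <- try (evaluate (length (showSSV pwfFormat rows)))
  case result of
    Left e  -> putStrLn ("could not format: " ++ show (e :: SSVShowException))
    Right n -> putStrLn ("formatted " ++ show n ++ " characters")
```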

Predefined formats

csvFormat :: SSVFormat

SSVFormat for CSV data. Closely follows RFC 4180.

pwfFormat :: SSVFormat

SSVFormat for UNIX "password file" data, i.e. colon-separated fields with no escape convention.
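
Since both predefined formats are ordinary SSVFormat records, their settings can be inspected through the field accessors. The helper describe below is a hypothetical name introduced for this sketch; it avoids showing the SSVFormatQuote value directly because its structure is not documented here.

```haskell
import Data.Maybe (isJust)
import Text.SSV (SSVFormat (..), csvFormat, pwfFormat)

-- Hypothetical helper: summarise a format using only the documented fields.
describe :: SSVFormat -> String
describe f =
  ssvFormatName f
    ++ ": separator "  ++ show (ssvFormatSeparator f)
    ++ ", terminator " ++ show (ssvFormatTerminator f)
    ++ ", escape "     ++ show (ssvFormatEscape f)
    ++ ", quoting "    ++ (if isJust (ssvFormatQuote f) then "yes" else "no")

main :: IO ()
main = mapM_ (putStrLn . describe) [csvFormat, pwfFormat]
```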