hadoop-streaming-0.2.0.0: A simple Hadoop streaming library

Maintainer: Ziyang Liu <free@cofree.io>
Safe Haskell: None
Language: Haskell2010

HadoopStreaming.ByteString

Description

This module provides utilities for working with ByteString data in Hadoop streaming.

Documentation

sourceHandle :: MonadIO m => Handle -> ConduitT i ByteString m ()

Stream the contents of a Handle one line at a time as ByteString.

NB: This only works if the input from the Handle is UTF-8 encoded.

sinkHandle :: MonadIO m => Handle -> ConduitT ByteString o m ()

Stream data to a Handle, separated by \n.

NB: This only works if the data is UTF-8 encoded.
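As a usage illustration, the following minimal sketch copies one file to another line by line by connecting sourceHandle to sinkHandle. The helper name copyLines and the use of withFile are assumptions made for this example; only the two signatures documented above are relied on, together with runConduit and (.|) from the conduit package.

```haskell
import Data.Conduit (runConduit, (.|))
import HadoopStreaming.ByteString (sinkHandle, sourceHandle)
import System.IO (IOMode (ReadMode, WriteMode), withFile)

-- | Hypothetical helper: stream every line of one file into another.
-- Both handles are closed by 'withFile' once the pipeline finishes.
copyLines :: FilePath -> FilePath -> IO ()
copyLines from to =
  withFile from ReadMode $ \hIn ->
    withFile to WriteMode $ \hOut ->
      -- Each line read from hIn is written to hOut, separated by \n.
      runConduit (sourceHandle hIn .| sinkHandle hOut)
```

Because both ends work on lines, a transformation stage can be placed between them without any extra framing logic.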

stdinLn :: (MonadIO m, MonadThrow m) => ConduitT i ByteString m ()

Stream the contents from stdin one line at a time as ByteString.

NB: This only works if the input from stdin is UTF-8 encoded.

stdinLn = sourceHandle System.IO.stdin

stdoutLn :: MonadIO m => ConduitT ByteString o m ()

Stream data to stdout, separated by \n.

NB: This only works if the data is UTF-8 encoded.

stdoutLn = sinkHandle System.IO.stdout
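A mapper-style pipeline built from stdinLn and stdoutLn might look like the sketch below. The function emitOne and the overall program are hypothetical and only illustrate how the two conduits compose; C.map comes from Data.Conduit.Combinators in the conduit package. The transformation only appends ASCII bytes, so it leaves UTF-8 input intact.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC
import Data.Conduit (runConduit, (.|))
import qualified Data.Conduit.Combinators as C
import HadoopStreaming.ByteString (stdinLn, stdoutLn)

-- | Hypothetical per-line transformation: treat the whole input line
-- as a key and emit it with the count 1, separated by a tab.
emitOne :: ByteString -> ByteString
emitOne line = line <> BC.pack "\t1"

-- | Read lines from stdin, transform each one, write them to stdout.
main :: IO ()
main = runConduit (stdinLn .| C.map emitOne .| stdoutLn)
```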

defaultKeyValueEncoder
  :: (k -> ByteString)  -- ^ Key encoder
  -> (v -> ByteString)  -- ^ Value encoder
  -> k
  -> v
  -> ByteString

Encode a key-value pair by joining the encoded key and the encoded value with a tab (a single 0x09 byte in UTF-8). This is the default format expected for mapper output.
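The behaviour described above can be sketched with the bytestring package alone. The function encodeKV below is a hypothetical stand-in written from the description, not the library's actual implementation, and wordCountLine is an illustrative use of it.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC

-- | Hypothetical stand-in for the documented encoder: encoded key,
-- one tab byte (0x09), then the encoded value.
encodeKV :: (k -> ByteString) -> (v -> ByteString) -> k -> v -> ByteString
encodeKV encodeKey encodeValue k v =
  BC.concat [encodeKey k, BC.singleton '\t', encodeValue v]

-- | Example: a word-count record with a String key and an Int value.
-- BC.pack truncates characters above 0xFF, so this helper is only
-- suitable for ASCII keys.
wordCountLine :: String -> Int -> ByteString
wordCountLine = encodeKV BC.pack (BC.pack . show)
```

With these definitions, wordCountLine "hadoop" 3 yields the bytes of "hadoop", a tab, and "3".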

defaultKeyValueDecoder
  :: (ByteString -> Either e k)  -- ^ Key decoder
  -> (ByteString -> Either e v)  -- ^ Value decoder
  -> ByteString
  -> Either e (k, Maybe v)

Decode a line by treating the prefix up to the first tab as the key and the suffix after the first tab as the value. If the line does not contain a tab, or if the first tab is the last character, the whole line is treated as the key, the value decoder is not used, and the resulting value is Nothing.

NB: This only works if the data is UTF-8 encoded.
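The decoding rule can be sketched as follows, using only the bytestring package. The function decodeKV is a hypothetical reimplementation that follows the description literally (in particular, a trailing tab stays part of the key); the library's actual code may differ in detail.

```haskell
import Data.ByteString (ByteString)
import qualified Data.ByteString.Char8 as BC

-- | Hypothetical stand-in for the documented decoder.
decodeKV
  :: (ByteString -> Either e k)  -- key decoder
  -> (ByteString -> Either e v)  -- value decoder
  -> ByteString
  -> Either e (k, Maybe v)
decodeKV decodeKey decodeValue line =
  -- 'rest' is empty if there is no tab; otherwise it starts with the first tab.
  case BC.break (== '\t') line of
    (key, rest)
      -- A tab followed by at least one more byte: split into key and value.
      | BC.length rest > 1 -> do
          k <- decodeKey key
          v <- decodeValue (BC.drop 1 rest)
          pure (k, Just v)
      -- No tab, or the first tab is the last byte: the whole line is the key.
      | otherwise -> do
          k <- decodeKey line
          pure (k, Nothing)
```

Splitting on the first tab only means that any further tabs remain in the value and are left for the value decoder to interpret.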