krapsh-0.1.6.1: Haskell bindings for Spark Dataframes and Datasets

Safe Haskell: None
Language: Haskell2010

Spark.Core.Context

Description

This module defines session objects that act as entry points to Spark.

There are two ways to interact with Spark: using an explicit state object, or using the default state object (interactive session).

While the interactive session is the most convenient, it should only be used for quick experiments. Any complex code should use the SparkSession and SparkState objects.

Documentation

data SparkSessionConf

The configuration of a remote Spark session in krapsh.

Constructors

SparkSessionConf 

Fields

  • confEndPoint :: !Text

    The URL of the end point.

  • confPort :: !Int

    The port used to configure the end point.

  • confPollingIntervalMillis :: !Int

    (internal) The polling interval, in milliseconds.

  • confRequestedSessionName :: !Text

    (optional) The requested name of the session. The name must consist only of alphanumeric characters, - and _: [a-zA-Z0-9-_]. If a session with this name already exists on the server, the client reconnects to it.

    The default value is "" (a new random context name will be chosen).
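The naming rule above can be checked on the client side before a session is requested. The following is a minimal sketch, not part of krapsh: MockConf, mockDefaultConf, isValidSessionName and withSessionName are hypothetical names, String stands in for Text so that only base is needed, and the default values shown are placeholders rather than those of defaultConf.

```haskell
import Data.Char (isAsciiLower, isAsciiUpper, isDigit)

-- Hypothetical stand-in for SparkSessionConf (String replaces Text).
data MockConf = MockConf
  { mockEndPoint :: String
  , mockPort :: Int
  , mockPollingIntervalMillis :: Int
  , mockRequestedSessionName :: String
  } deriving (Eq, Show)

-- Placeholder values; these are assumptions, not the values of defaultConf.
mockDefaultConf :: MockConf
mockDefaultConf = MockConf
  { mockEndPoint = "http://localhost"
  , mockPort = 8080
  , mockPollingIntervalMillis = 500
  , mockRequestedSessionName = ""
  }

-- True when every character is in [a-zA-Z0-9-_].
-- The empty name passes, matching the default "" (no name requested).
isValidSessionName :: String -> Bool
isValidSessionName = all allowed
  where
    allowed c =
      isAsciiLower c || isAsciiUpper c || isDigit c || c == '-' || c == '_'

-- Returns an updated configuration, or an error message for a bad name.
withSessionName :: String -> MockConf -> Either String MockConf
withSessionName name conf
  | isValidSessionName name = Right conf { mockRequestedSessionName = name }
  | otherwise = Left ("invalid session name: " ++ name)
```

Record update syntax keeps the other fields unchanged, so a caller only states what differs from the defaults.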

data SparkSession

A session in Spark. Encapsulates all the state needed to communicate with Spark and to perform some simple optimizations on the code.

type SparkState a = LoggingT (StateT SparkSession IO) a

Represents the state of a session and accounts for the communication with the server.
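To illustrate how a session can be threaded through a computation, here is a simplified sketch of the same pattern. It is an assumption-laden analogue, not the krapsh implementation: MockSession, MockState, mockExecute and runTwo are hypothetical, the LoggingT layer is omitted, and the session only counts the commands it has seen.

```haskell
import Control.Monad.Trans.State.Strict (StateT, gets, modify', runStateT)

-- Hypothetical session: only counts the commands sent so far.
newtype MockSession = MockSession { commandsSent :: Int }
  deriving (Eq, Show)

-- Simplified analogue of SparkState, without the LoggingT layer.
type MockState a = StateT MockSession IO a

-- Pretends to run a command: updates the session and returns a label.
mockExecute :: String -> MockState String
mockExecute cmd = do
  modify' (\s -> s { commandsSent = commandsSent s + 1 })
  n <- gets commandsSent
  pure (cmd ++ " #" ++ show n)

-- Runs two commands against an explicit session.
-- Returns both results together with the final session.
runTwo :: MockSession -> IO ((String, String), MockSession)
runTwo = runStateT $ do
  a <- mockExecute "first"
  b <- mockExecute "second"
  pure (a, b)
```

Because the session is an explicit value, the caller decides where it comes from and what happens to the updated session afterwards.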

class FromSQL a

Instances

FromSQL Int

Methods

_cellToValue :: Cell -> TryS Int

FromSQL Cell

Methods

_cellToValue :: Cell -> TryS Cell

FromSQL a => FromSQL [a]

Methods

_cellToValue :: Cell -> TryS [a]

FromSQL a => FromSQL (Maybe a)

Methods

_cellToValue :: Cell -> TryS (Maybe a)
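The instances above compose: a list or optional value can be decoded whenever its element type can. The sketch below mirrors that shape with stand-in types, since Cell and TryS are not described on this page. MockCell, MockTry, FromMockCell and fromCell are hypothetical names, and the treatment of null cells is an assumption.

```haskell
-- Hypothetical cell type; the real Cell is defined elsewhere in krapsh.
data MockCell
  = IntCell Int
  | ListCell [MockCell]
  | NullCell
  deriving (Eq, Show)

-- Stand-in for TryS: Left carries an error message.
type MockTry a = Either String a

-- Analogue of FromSQL, with a single decoding method.
class FromMockCell a where
  fromCell :: MockCell -> MockTry a

instance FromMockCell Int where
  fromCell (IntCell n) = Right n
  fromCell other = Left ("expected an integer cell, got " ++ show other)

-- Decoding a cell as a cell always succeeds.
instance FromMockCell MockCell where
  fromCell = Right

-- A list decodes element by element; the first failure aborts the decoding.
instance FromMockCell a => FromMockCell [a] where
  fromCell (ListCell cells) = traverse fromCell cells
  fromCell other = Left ("expected a list cell, got " ++ show other)

-- Assumption: a null cell maps to Nothing, anything else to Just.
instance FromMockCell a => FromMockCell (Maybe a) where
  fromCell NullCell = Right Nothing
  fromCell cell = Just <$> fromCell cell
```

The target type selects the instance, so `fromCell cell :: MockTry [Maybe Int]` decodes a list of optional integers without any extra code.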

defaultConf :: SparkSessionConf

The default configuration if the krapsh server is being run locally.

executeCommand1 :: forall a. (FromSQL a, HasCallStack) => LocalData a -> SparkState (Try a)

Executes a command:

  • performs the transforms and the optimizations in the pure state
  • sends the computation to the backend
  • waits for the terminal nodes to reach a final state
  • commits the final results to the state

If a failure internal to Krapsh is detected, it is returned as an error value. If the error comes from an underlying library (HTTP stack, programming failure), an exception may be thrown instead.
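A caller that wants to handle both failure modes has to inspect the returned value and also guard against exceptions. The following is a minimal sketch of that idea using only base. MockTry, Outcome and classify are hypothetical names, and the action passed in stands in for a command that has already been run down to IO.

```haskell
import Control.Exception (SomeException, try)

-- Stand-in for Try: Left carries an error message.
type MockTry a = Either String a

data Outcome a
  = Succeeded a
  | ReportedError String   -- returned as a value by the command
  | ThrownException String -- raised as an exception while it ran
  deriving (Eq, Show)

-- Runs an action and folds both failure modes into a single value.
-- Note: catching SomeException also intercepts asynchronous exceptions,
-- which is acceptable for a sketch but often too broad in real code.
classify :: IO (MockTry a) -> IO (Outcome a)
classify action = do
  result <- try action
  pure $ case result of
    Left ex -> ThrownException (show (ex :: SomeException))
    Right (Left msg) -> ReportedError msg
    Right (Right value) -> Succeeded value
```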

createSparkSessionDef :: HasCallStack => SparkSessionConf -> IO ()

Creates a Spark session that will be used as the default session.

If a session already exists, an exception will be thrown.

closeSparkSessionDef :: HasCallStack => IO ()

Closes the default session. The default session is empty after this call completes.

NOTE: This does not currently release the underlying resources! It is a stub implementation used in testing.
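The create/close contract described above (at most one default session, an exception on a second creation, an empty slot after closing) can be sketched with a mutable reference. This is an illustration under assumptions, not the krapsh implementation: DefaultSlot, createDefault, closeDefault and succeeds are hypothetical, the slot is passed explicitly rather than stored globally, and the read-then-write sequence is not atomic.

```haskell
import Control.Exception (IOException, try)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- Hypothetical session, identified only by its name.
newtype MockSession = MockSession { sessionName :: String }
  deriving (Eq, Show)

-- Holds the default session, if one exists.
type DefaultSlot = IORef (Maybe MockSession)

newDefaultSlot :: IO DefaultSlot
newDefaultSlot = newIORef Nothing

-- Fills the slot, or throws if a default session already exists.
-- Not thread-safe: the check and the write are separate steps.
createDefault :: DefaultSlot -> String -> IO ()
createDefault slot name = do
  current <- readIORef slot
  case current of
    Just existing ->
      ioError (userError ("default session already exists: " ++ sessionName existing))
    Nothing -> writeIORef slot (Just (MockSession name))

-- Empties the slot; like the stub described above, it releases nothing else.
closeDefault :: DefaultSlot -> IO ()
closeDefault slot = writeIORef slot Nothing

-- Reports whether an action completed without raising an IOException.
succeeds :: IO () -> IO Bool
succeeds action = do
  result <- try action
  pure $ case result of
    Left ex -> const False (ex :: IOException)
    Right () -> True
```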

exec1Def :: (FromSQL a, HasCallStack) => LocalData a -> IO a

Executes a command using the default spark session.

This is the least safe way of running a command: any error, whether internal to Krapsh or raised by an underlying library, is thrown as an exception instead of being returned as a value.
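The difference between the two styles comes down to where the error ends up. The sketch below shows one plausible way an exception-throwing wrapper could be derived from a value-returning one. MockSparkError, unsafeRun and withFallback are hypothetical names, and nothing here is taken from the krapsh source.

```haskell
import Control.Exception (Exception, throwIO, try)

-- Hypothetical exception type carrying the error message.
newtype MockSparkError = MockSparkError String
  deriving (Eq, Show)

instance Exception MockSparkError

-- Unwraps a successful result and throws on a reported error.
unsafeRun :: IO (Either String a) -> IO a
unsafeRun action = action >>= either (throwIO . MockSparkError) pure

-- A caller of the unsafe variant must catch the exception to recover.
withFallback :: a -> IO (Either String a) -> IO a
withFallback fallback action = do
  result <- try (unsafeRun action)
  pure $ case result of
    Left (MockSparkError _) -> fallback
    Right value -> value
```

The unsafe variant is shorter to call in an interactive session, at the cost of moving error handling out of the types.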