karps-0.2.0.0: Haskell bindings for Spark Dataframes and Datasets

Safe Haskell: None
Language: Haskell2010

Spark.IO.Inputs

Documentation

data SparkPath

A path to some data that can be read by Spark.

data DataSchema

The schema policy with respect to a data source. It either asks Spark to infer the schema from the source, or tries to match the source against a schema provided by the user.
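The two policies can be pictured as a sum type. The following is a minimal, self-contained sketch and not the library's definition: `ColumnType`, `Schema`, `SchemaPolicy`, its constructors, and `resolveSchema` are all hypothetical names introduced for illustration, and the real `DataSchema` constructors may differ.

```haskell
-- Hypothetical stand-ins for illustration; the real karps types may differ.

-- A column type, reduced to two cases.
data ColumnType = IntColumn | StringColumn
  deriving (Eq, Show)

-- A schema as an ordered list of (field name, column type) pairs.
type Schema = [(String, ColumnType)]

-- The two policies described above.
data SchemaPolicy
  = InferFromSource      -- ask the engine to infer the schema
  | MatchAgainst Schema  -- check the source against a user-provided schema
  deriving (Eq, Show)

-- Resolve a policy against the schema actually found in the source.
-- Inference accepts whatever was found; matching fails on any difference.
resolveSchema :: SchemaPolicy -> Schema -> Either String Schema
resolveSchema InferFromSource found = Right found
resolveSchema (MatchAgainst expected) found
  | expected == found = Right expected
  | otherwise =
      Left ("schema mismatch: expected " ++ show expected
            ++ ", found " ++ show found)
```

In this sketch, a mismatch is reported as a `Left` value rather than an exception, so the caller decides how to handle an invalid source.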

data JsonOptions

The options for the JSON input.

data SourceDescription

A description of a data source, following Spark's reader API version 2.

Each source consists of an input format (json, xml, etc.), an optional schema for this source, and a number of options specific to this source.

Since this description is rather low-level, a number of wrappers are provided for the most popular sources that are already built into Spark.
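The relationship between a low-level description and a convenience wrapper can be sketched as follows. This is an illustration under assumptions, not the library's record: `InputFormat`, `SourceSketch`, its field names, `jsonSource`, and `withOption` are hypothetical, and the option key used in the usage note is a placeholder.

```haskell
-- Hypothetical stand-ins for illustration; not the karps definitions.

-- The input format of a source.
data InputFormat = JsonFormat | CustomFormat String
  deriving (Eq, Show)

-- A low-level description: where the data is, its format, an optional
-- schema, and format-specific options as key/value pairs.
data SourceSketch = SourceSketch
  { sketchPath    :: String
  , sketchFormat  :: InputFormat
  , sketchSchema  :: Maybe [(String, String)]  -- Nothing: let the engine infer
  , sketchOptions :: [(String, String)]
  } deriving (Eq, Show)

-- A wrapper for one popular format: fills in the defaults so that the
-- caller only supplies a path.
jsonSource :: String -> SourceSketch
jsonSource path = SourceSketch
  { sketchPath    = path
  , sketchFormat  = JsonFormat
  , sketchSchema  = Nothing
  , sketchOptions = []
  }

-- Set an option, replacing any earlier value stored under the same key.
withOption :: String -> String -> SourceSketch -> SourceSketch
withOption key value src =
  src { sketchOptions = (key, value) : filter ((/= key) . fst) (sketchOptions src) }
```

With these definitions, `withOption "exampleKey" "exampleValue" (jsonSource "/data/events.json")` builds a JSON description with a single option, while the full record stays available for sources that have no wrapper.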

json' :: DataType -> String -> DataFrame

Declares a source of data of the given data type.

The source is not read at this point; it is only declared. It may be found to be invalid in subsequent computations.

json :: SQLTypeable a => String -> Dataset a

Declares a source of data of the given data type.

The source is not read at this point; it is only declared.
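A signature of this shape lets the result type determine the expected schema, without touching the data. The sketch below illustrates that idea with local stand-ins only: `HasSchema`, `Declared`, `declareJson`, and `Person` are hypothetical and are not part of the library, whose `SQLTypeable` class and `Dataset` type may work differently.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Proxy (Proxy (..))

-- Hypothetical stand-in for a class that maps a Haskell type to a schema.
class HasSchema a where
  schemaOf :: Proxy a -> [(String, String)]

-- A declared source: a path and an expected schema. The type parameter is
-- a phantom; no data is read when a value of this type is built.
data Declared a = Declared
  { declaredPath   :: String
  , declaredSchema :: [(String, String)]
  }

-- Declare a JSON source whose schema is derived from the result type.
declareJson :: forall a. HasSchema a => String -> Declared a
declareJson path = Declared path (schemaOf (Proxy :: Proxy a))

-- An example record type and its (hand-written) schema.
data Person = Person { personName :: String, personAge :: Int }

instance HasSchema Person where
  schemaOf _ = [("name", "string"), ("age", "int")]

-- The type annotation alone selects the schema; the file is never opened.
people :: Declared Person
people = declareJson "/data/people.json"
```

Because the declaration is pure, a nonexistent path goes unnoticed here; in this sketch, any validity check would have to happen later, when the source is actually read.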

jsonInfer :: SparkPath -> SparkState DataFrame

Reads a source of data expected to be in the JSON format.

No schema is required: Spark infers it from the source. However, all the data contained in the source may end up being read in the process.