karps-0.2.0.0: Haskell bindings for Spark Dataframes and Datasets

Safe Haskell: None
Language: Haskell2010

Spark.Core.Internal.Groups

Documentation

data GroupData key val

A dataset that has been partitioned according to some given field.

Instances

Show (GroupData key val)

Methods

showsPrec :: Int -> GroupData key val -> ShowS

show :: GroupData key val -> String

showList :: [GroupData key val] -> ShowS

type LogicalGroupData = Try UntypedGroupData

groupByKey :: HasCallStack => Column ref key -> Column ref val -> GroupData key val

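Logically groups data by key. The first column supplies the keys and the second supplies the values; both columns share the same ref type parameter.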
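As a rough mental model only, grouping by key can be pictured as an in-memory map from each key to the values that share it. The sketch below is plain Haskell over lists. ToyGroup and toyGroupByKey are hypothetical names introduced for illustration and are not part of karps, which works on Spark columns rather than local lists.

```haskell
import qualified Data.Map.Strict as Map

-- Toy stand-in for GroupData (an assumption, not the karps representation):
-- each key maps to the values that share it, in input order.
type ToyGroup key val = Map.Map key [val]

-- Hypothetical local analogue of groupByKey: one extractor plays the role
-- of the key column, the other the role of the value column.
toyGroupByKey :: Ord key => (row -> key) -> (row -> val) -> [row] -> ToyGroup key val
toyGroupByKey keyOf valOf rows =
  -- fromListWith calls the combining function as (new, old); flipping (++)
  -- appends new values after old ones, which preserves input order.
  Map.fromListWith (flip (++)) [(keyOf r, [valOf r]) | r <- rows]
```

With this model, toyGroupByKey fst snd [("a", 1), ("b", 2), ("a", 3)] produces a map that binds "a" to [1, 3] and "b" to [2].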

mapGroup :: GroupData key val -> (forall ref. Column ref val -> Column ref val') -> GroupData key val'

Transforms the values in a group.
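To picture what transforming the values in a group means, the following sketch applies a function to every value while leaving the keys and the grouping untouched. It uses a local map-of-lists model. ToyGroup and toyMapGroup are hypothetical names and do not reflect how karps implements mapGroup on columns.

```haskell
import qualified Data.Map.Strict as Map

-- Toy stand-in for GroupData (an assumption, not the karps representation).
type ToyGroup key val = Map.Map key [val]

-- Hypothetical local analogue of mapGroup: the keys and the group
-- membership are unchanged, only the values are transformed.
toyMapGroup :: (val -> val') -> ToyGroup key val -> ToyGroup key val'
toyMapGroup f = Map.map (map f)
```

For instance, doubling every value in a group that binds "a" to [1, 3] and "b" to [2] gives "a" bound to [2, 6] and "b" bound to [4].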

aggKey :: HasCallStack => GroupData key val -> (forall ref. Column ref val -> LocalData val') -> Dataset (key, val')

The generalized value transform.

This generalizes mapGroup to allow more complex transforms involving joins, groups, etc.

Given a group and an aggregation function, aggregates the data.

Note: not every reduction function can be used here. The analyzer fails if the function is not universal.
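The shape of the result, one (key, aggregate) row per group, can be sketched with the same kind of local model. ToyGroup and toyAggKey are hypothetical names. The sketch accepts any list reduction and does not model the analyzer's check for universal functions, which the source does not define.

```haskell
import qualified Data.Map.Strict as Map

-- Toy stand-in for GroupData (an assumption, not the karps representation).
type ToyGroup key val = Map.Map key [val]

-- Hypothetical local analogue of aggKey: reduce the values of each group
-- to a single result and emit one (key, result) pair per group,
-- in ascending key order.
toyAggKey :: ToyGroup key val -> ([val] -> val') -> [(key, val')]
toyAggKey groups agg = Map.toList (Map.map agg groups)
```

Summing a group that binds "a" to [1, 3] and "b" to [2] with toyAggKey groups sum gives [("a", 4), ("b", 2)].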

groupAsDS :: forall key val. GroupData key val -> Dataset (key, val)

Creates a group by expanding a value into a potentially large collection.

Note on performance: this function is optimized to work at any scale and may not be the most efficient when the generated collections are small (a few elements).

Builds groups within groups.

This function allows groups to be constructed from each collection inside a group.

This function is usually not used directly by the user, but rather as part of more complex pipelines that may involve multiple levels of nesting.

Reduces a group within a group into a single group.

Returns the collapsed representation of a grouped dataset, discarding group information.
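Collapsing a grouped dataset can be pictured as flattening the groups back into plain (key, value) rows. The sketch below does this for a local map-of-lists model. ToyGroup and toyGroupAsDS are hypothetical names for illustration, not the karps implementation.

```haskell
import qualified Data.Map.Strict as Map

-- Toy stand-in for GroupData (an assumption, not the karps representation).
type ToyGroup key val = Map.Map key [val]

-- Hypothetical local analogue of groupAsDS: emit one (key, value) pair per
-- grouped value, dropping the grouping structure itself.
toyGroupAsDS :: ToyGroup key val -> [(key, val)]
toyGroupAsDS groups = [(k, v) | (k, vs) <- Map.toList groups, v <- vs]
```

Applied to a group that binds "a" to [1, 3] and "b" to [2], this returns [("a", 1), ("a", 3), ("b", 2)].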