Frames-map-reduce-0.3.0.0: Frames wrapper for map-reduce-folds and some extra folds helpers.

Copyright: (c) Adam Conner-Sax 2019
License: BSD
Maintainer: adam_conner_sax@yahoo.com
Stability: experimental
Safe Haskell: None
Language: Haskell2010

Frames.Aggregation.General

Description

Frames.Aggregation.General contains types and functions to support a specific map/reduce operation. Frequently, data is given with more specificity than downstream operations require. Perhaps an age is given in years and we only need to know the age band. Assuming we know how to aggregate the data columns, we want to perform that aggregation on all the subsets required to build the data set with the simpler key, while perhaps leaving some other columns alone. aggregateFold does this.

Type-alias for maps from one record key to another

type RecordKeyMap record f k k' = record (f :. ElField) k -> record (f :. ElField) k'

Type-alias for key aggregation functions.

Aggregation Function combinators

combineKeyAggregations :: forall (a :: [(Symbol, Type)]) b a' b' record f. (a ⊆ (a ++ b), b ⊆ (a ++ b), Disjoint a' b' ~ True, RCastC a (a ++ b) record f, RCastC b (a ++ b) record f, IsoRec a' record f, IsoRec b' record f, IsoRec (a' ++ b') record f) => RecordKeyMap record f a a' -> RecordKeyMap record f b b' -> RecordKeyMap record f (a ++ b) (a' ++ b')

Combine two key aggregation functions over disjoint columns.
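The idea of combining key maps can be sketched without any Frames machinery. In the minimal analogue below, plain tuples stand in for records, and every name (`AgeBand`, `Region`, `ageToBand`, `stateToRegion`, `combineKeyMaps`) is hypothetical and not part of this package. It only illustrates the shape of the operation: two independent maps on separate key columns become one map on the joint key.

```haskell
-- Plain-Haskell analogue of combining two key maps.
-- Tuples stand in for Frames records; all names are hypothetical.

data AgeBand = Under45 | Over45 deriving (Show, Eq, Ord)
data Region = East | West deriving (Show, Eq, Ord)

-- Key map for the first "column": age in years to an age band.
ageToBand :: Int -> AgeBand
ageToBand age = if age < 45 then Under45 else Over45

-- Key map for the second "column": state abbreviation to a region.
stateToRegion :: String -> Region
stateToRegion s = if s `elem` ["NY", "MA"] then East else West

-- Combine two independent key maps into one map over the joint key.
combineKeyMaps :: (a -> a') -> (b -> b') -> (a, b) -> (a', b')
combineKeyMaps f g (x, y) = (f x, g y)

main :: IO ()
main = print (combineKeyMaps ageToBand stateToRegion (52, "NY"))
-- prints (Over45,East)
```

In the library itself, the disjointness constraint on the output columns plays the role that the two distinct tuple positions play here.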

keyMap :: forall a b record f. (KnownField a, KnownField b, RecGetFieldC a record f '[a], IsoRec '[b] record f, Applicative f) => (Snd a -> Snd b) -> RecordKeyMap record f '[a] '[b]

Promote an ordinary function a -> b to a RecordKeyMap aCol bCol where aCol holds values of type a and bCol holds values of type b.

Aggregation folds

aggregateAllFold

Arguments

:: forall (ak :: [(Symbol, Type)]). ((ak' ++ d) ⊆ ((ak ++ d) ++ ak'), ak ⊆ (ak ++ d), ak' ⊆ (ak' ++ d), d ⊆ (ak' ++ d), Ord (record (f :. ElField) ak'), Ord (record (f :. ElField) ak), RCastC (ak' ++ d) ((ak ++ d) ++ ak') record f, RCastC ak (ak ++ d) record f, RCastC ak' (ak' ++ d) record f, RCastC d (ak' ++ d) record f, IsoRec d record f, IsoRec (ak ++ d) record f, IsoRec (ak' ++ d) record f, IsoRec ak' record f, IsoRec ((ak ++ d) ++ ak') record f)
=> RecordKeyMap record f ak ak'   -- get aggregated key from key
-> Fold (record (f :. ElField) d) (record (f :. ElField) d)   -- aggregate data
-> Fold (record (f :. ElField) (ak ++ d)) [record (f :. ElField) (ak' ++ d)]

Given some group keys in columns k, some keys to aggregate over in columns ak, some keys to aggregate into in (new) columns ak', a (hopefully surjective) map from records of ak to records of ak', and a fold over the data in columns d that aggregates over the rows where ak was distinct but ak' is not, produce a fold that transforms data keyed by k and ak into data keyed by k and ak', with the appropriate aggregations done in the d columns.

For example, suppose you have voter turnout data for all 50 US states, keyed by state and by voter age in years. The data is two columns: total votes cast and turnout as a percentage. You want to aggregate the ages into two bands, over and under some age. So k is the state column, ak is the age column, and ak' is a new column whose type indicates over or under. The Fold has to sum the total votes and compute a weighted sum over the percentages.
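The turnout example can be sketched in plain Haskell with the `foldl` and `containers` packages. This is an analogue under stated assumptions, not this module's implementation: tuples and an ordinary record replace Frames records, `aggregateAll`, `Turnout`, `AgeBand` and `ageToBand` are hypothetical names, turnout is stored as a fraction rather than a percentage, and the weights are assumed to be the implied number of eligible voters (votes divided by turnout, with turnout assumed non-zero).

```haskell
import qualified Control.Foldl as L
import qualified Data.Map.Strict as M

data AgeBand = Under45 | Over45 deriving (Show, Eq, Ord)

-- Data columns: total votes cast and turnout as a fraction (hypothetical type).
data Turnout = Turnout { votes :: Double, turnout :: Double }
  deriving (Show, Eq)

-- Key map: age in years to an age band.
ageToBand :: Int -> AgeBand
ageToBand age = if age < 45 then Under45 else Over45

-- Data fold: sum the votes, and recompute turnout as total votes over
-- total implied eligible voters (a weighted average of the turnouts).
-- Assumes every input row has a non-zero turnout.
turnoutFold :: L.Fold Turnout Turnout
turnoutFold =
  mk <$> L.premap votes L.sum
     <*> L.premap (\t -> votes t / turnout t) L.sum
  where
    mk v e = Turnout v (if e == 0 then 0 else v / e)

-- Group rows by the new key, then run the data fold within each group.
aggregateAll :: Ord k' => (k -> k') -> L.Fold d d -> [(k, d)] -> [(k', d)]
aggregateAll toNewKey dataFold rows =
  M.toList (fmap (L.fold dataFold) grouped)
  where
    grouped = M.fromListWith (flip (++)) [(toNewKey k, [d]) | (k, d) <- rows]

-- Made-up sample rows keyed by age in years.
sampleRows :: [(Int, Turnout)]
sampleRows =
  [ (30, Turnout 100 0.25)
  , (40, Turnout 300 0.75)
  , (50, Turnout 200 0.5)
  , (60, Turnout 100 0.25)
  ]

main :: IO ()
main = mapM_ print (aggregateAll ageToBand turnoutFold sampleRows)
```

With the sample rows, the two under-45 rows collapse into one row with 400 votes and the two over-45 rows into one row with 300 votes, each with a turnout recomputed from the combined totals.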

aggregateFold

Arguments

:: forall (k :: [(Symbol, Type)]). ((ak' ++ d) ⊆ ((ak ++ d) ++ ak'), ak ⊆ (ak ++ d), ak' ⊆ (ak' ++ d), d ⊆ (ak' ++ d), Ord (record (f :. ElField) ak'), Ord (record (f :. ElField) ak), (k ++ (ak' ++ d)) ~ ((k ++ ak') ++ d), Ord (record (f :. ElField) k), k ⊆ ((k ++ ak') ++ d), k ⊆ ((k ++ ak) ++ d), (ak ++ d) ⊆ ((k ++ ak) ++ d), RCastC ak (ak ++ d) record f, RCastC ak' (ak' ++ d) record f, RCastC d (ak' ++ d) record f, RCastC k ((k ++ ak) ++ d) record f, RCastC (ak ++ d) ((k ++ ak) ++ d) record f, RCastC (ak' ++ d) ((ak ++ d) ++ ak') record f, IsoRec k record f, IsoRec d record f, IsoRec ((k ++ ak') ++ d) record f, IsoRec (ak ++ d) record f, IsoRec (ak' ++ d) record f, IsoRec ak' record f, IsoRec ((ak ++ d) ++ ak') record f)
=> RecordKeyMap record f ak ak'   -- get aggregated key from key
-> Fold (record (f :. ElField) d) (record (f :. ElField) d)   -- aggregate data
-> Fold (record (f :. ElField) ((k ++ ak) ++ d)) [record (f :. ElField) ((k ++ ak') ++ d)]

Aggregate key columns ak into ak' while leaving key columns k alone. Allows aggregation over only some fields. Will often require a type application to specify what k is.
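Leaving k alone while aggregating ak can be illustrated with a small plain-Haskell analogue. As before, this is a sketch using `foldl` and `containers` with tuples in place of Frames records; `aggregateWithin`, `AgeBand`, `ageToBand` and `stateRows` are hypothetical names, and the data are made up. The untouched key is the first tuple component, and the aggregated key is the second.

```haskell
import qualified Control.Foldl as L
import qualified Data.Map.Strict as M

data AgeBand = Under45 | Over45 deriving (Show, Eq, Ord)

ageToBand :: Int -> AgeBand
ageToBand age = if age < 45 then Under45 else Over45

-- Aggregate the second key component, leave the first one untouched.
aggregateWithin
  :: (Ord k, Ord ak')
  => (ak -> ak') -> L.Fold d d -> [((k, ak), d)] -> [((k, ak'), d)]
aggregateWithin toNewKey dataFold rows =
  M.toList (fmap (L.fold dataFold) grouped)
  where
    grouped =
      M.fromListWith (flip (++))
        [((k, toNewKey ak), [d]) | ((k, ak), d) <- rows]

-- Made-up vote counts keyed by (state, age in years).
stateRows :: [((String, Int), Int)]
stateRows =
  [ (("NY", 30), 10), (("NY", 40), 20), (("NY", 50), 5)
  , (("TX", 30), 7), (("TX", 60), 8)
  ]

main :: IO ()
main = mapM_ print (aggregateWithin ageToBand L.sum stateRows)
-- prints:
-- (("NY",Under45),30)
-- (("NY",Over45),5)
-- (("TX",Under45),7)
-- (("TX",Over45),8)
```

In this tuple version the compiler can infer which part of the key is left alone from the tuple structure; with type-level column lists that split is not always inferable, which is consistent with the note above about needing a type application for k.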

mergeDataFolds :: forall (a :: (Symbol, Type)) b d record f. (IsoRec '[b] record f, IsoRec '[a] record f, IsoRec '[a, b] record f) => Fold (record (f :. ElField) d) (record (f :. ElField) '[a]) -> Fold (record (f :. ElField) d) (record (f :. ElField) '[b]) -> Fold (record (f :. ElField) d) (record (f :. ElField) '[a, b])
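Judging from its type alone, mergeDataFolds takes two folds over the same input columns, each producing a single output column, and returns one fold producing both columns. A rough analogue in plain `foldl`, with a pair standing in for the two-column record, uses the Applicative instance of Fold. The names `Row`, `totalVotes`, `totalEligible` and `merged` are hypothetical, and this is not a claim about how the library implements the function.

```haskell
import qualified Control.Foldl as L

-- Hypothetical input row: votes cast and eligible voters.
data Row = Row { votes :: Int, eligible :: Int }

-- Fold producing the first output "column".
totalVotes :: L.Fold Row Int
totalVotes = L.premap votes L.sum

-- Fold producing the second output "column".
totalEligible :: L.Fold Row Int
totalEligible = L.premap eligible L.sum

-- Merge the two folds; both run in a single pass over the input.
merged :: L.Fold Row (Int, Int)
merged = (,) <$> totalVotes <*> totalEligible

main :: IO ()
main = print (L.fold merged [Row 100 400, Row 300 400])
-- prints (400,800)
```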