Safe Haskell | None |
---|---|
Language | Haskell2010 |
- type ColumnLineagePlus = Map (Either FQTN FQCN) ColumnPlusSet
- class HasColumnLineage q where
- emptyLineage :: [FQColumnName ()] -> ColumnLineagePlus
- data ColumnLineage
- data ColumnPlusSet = ColumnPlusSet {
- columnPlusColumns :: Map FQCN (Map FieldChain (Set Range))
- columnPlusTables :: Map FQTN (Set Range)
- emptyColumnPlusSet :: ColumnPlusSet
- singleColumnSet :: Range -> FullyQualifiedColumnName -> ColumnPlusSet
- singleTableSet :: Range -> FullyQualifiedTableName -> ColumnPlusSet
- mergeLineages :: Writer ColumnPlusSet [ColumnPlusSet] -> Writer ColumnPlusSet [ColumnPlusSet] -> EvalT ColumnLineage TableContext Identity (Writer ColumnPlusSet [ColumnPlusSet])
- ancestorsForTableName :: RTableName Range -> Maybe (RecordSet ColumnLineage)
- truncateTableLineage :: FQTableName a -> [FQColumnName ()] -> ColumnLineagePlus
- evalDefaultExpr :: DefaultExpr ResolvedNames Range -> EvalResult ColumnLineage (Expr ResolvedNames Range)
- returnNothing :: ColumnLineagePlus -> (RecordSet ColumnLineage, ColumnLineagePlus)
- columnLineage :: Statement d ResolvedNames Range -> (RecordSet ColumnLineage, ColumnLineagePlus)
Documentation
type ColumnLineagePlus = Map (Either FQTN FQCN) ColumnPlusSet Source #
ColumnLineagePlus is a set of descendants, each with an associated set of ancestors. A descendant may be a column (representing the values in that column) or a table (representing its row count).
Ancestors are the same, except that ancestor columns may be further specialized by field path.
Tracking impacts on row-count is necessary because row-count can have impacts on data without any columns being involved. For a clear example, consider `CREATE TABLE foo AS SELECT COUNT(1) FROM BAR;`
It also gives us something coherent to speak about with respect to `EXISTS`: the value depends on the row count of the subquery.
N.B. While it looks like we're talking about "tables", this is *not* the same thing as table-level lineage. Changes to values in existing rows do not affect row count. Following an UPDATE, if we ask "has this table changed, such that we need to rerun things downstream?", the answer is clearly yes. If we ask "has the row count of this table changed, such that we need to rerun things that depend only on row count?", the answer is clearly no.
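The shape described above can be illustrated with a minimal sketch. It does not use this module's real types: `Table`, `Column`, `Ancestors`, `Lineage`, and `dependsOnRowCountOf` are hypothetical stand-ins for `FQTN`, `FQCN`, `ColumnPlusSet`, and `ColumnLineagePlus`, with field chains and source ranges left out. The output column name `count` and the empty ancestor set for the row count of `foo` are also assumptions made for the example, not output taken from the library.

```haskell
import           Data.Map (Map)
import qualified Data.Map as Map
import           Data.Set (Set)
import qualified Data.Set as Set

-- Hypothetical, simplified stand-ins for the library's FQTN and FQCN.
newtype Table = Table String deriving (Eq, Ord, Show)
data Column = Column Table String deriving (Eq, Ord, Show)

-- Simplified ancestor set: which column values and which table row counts
-- a descendant depends on (no field chains, no source ranges).
data Ancestors = Ancestors
  { ancestorColumns :: Set Column
  , ancestorTables  :: Set Table
  } deriving (Eq, Show)

-- Left  = the row count of a table
-- Right = the values in a column
type Lineage = Map (Either Table Column) Ancestors

foo, bar :: Table
foo = Table "foo"
bar = Table "bar"

-- Hand-written lineage for: CREATE TABLE foo AS SELECT COUNT(1) FROM bar;
-- The single output column depends on no column of bar, only on bar's
-- row count. The row count of foo is assumed here to have no ancestors,
-- because an ungrouped COUNT always yields exactly one row.
countExample :: Lineage
countExample = Map.fromList
  [ (Right (Column foo "count"), Ancestors Set.empty (Set.singleton bar))
  , (Left foo,                   Ancestors Set.empty Set.empty)
  ]

-- Does the given descendant depend on the row count of the given table?
dependsOnRowCountOf :: Either Table Column -> Table -> Lineage -> Bool
dependsOnRowCountOf descendant table lineage =
  case Map.lookup descendant lineage of
    Nothing        -> False
    Just ancestors -> Set.member table (ancestorTables ancestors)

main :: IO ()
main = do
  print (dependsOnRowCountOf (Right (Column foo "count")) bar countExample)
  print (dependsOnRowCountOf (Left foo) bar countExample)
```

In this sketch a column descendant can have a table as its only ancestor, which is the case that pure column-to-column lineage cannot express.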
class HasColumnLineage q where Source #
getColumnLineage :: q -> (RecordSet ColumnLineage, ColumnLineagePlus) Source #
emptyLineage :: [FQColumnName ()] -> ColumnLineagePlus Source #
data ColumnLineage Source #
Evaluation ColumnLineage Source # | |
type EvalValue ColumnLineage Source # | |
type EvalRow ColumnLineage Source # | |
type EvalMonad ColumnLineage Source # | |
data ColumnPlusSet Source #
ColumnPlusSet
- columnPlusColumns :: Map FQCN (Map FieldChain (Set Range))
- columnPlusTables :: Map FQTN (Set Range)
mergeLineages :: Writer ColumnPlusSet [ColumnPlusSet] -> Writer ColumnPlusSet [ColumnPlusSet] -> EvalT ColumnLineage TableContext Identity (Writer ColumnPlusSet [ColumnPlusSet]) Source #
truncateTableLineage :: FQTableName a -> [FQColumnName ()] -> ColumnLineagePlus Source #