-- | Constructed Product Result analysis. Identifies functions that surely -- return heap-allocated records on every code path, so that we can eliminate -- said heap allocation by performing a worker/wrapper split. -- -- See https://www.microsoft.com/en-us/research/publication/constructed-product-result-analysis-haskell/. -- CPR analysis should happen after strictness analysis. -- See Note [Phase ordering]. module GHC.Core.Opt.CprAnal ( cprAnalProgram ) where import GHC.Prelude import GHC.Driver.Session import GHC.Builtin.Names ( runRWKey ) import GHC.Types.Var.Env import GHC.Types.Basic import GHC.Types.Id import GHC.Types.Id.Info import GHC.Types.Demand import GHC.Types.Cpr import GHC.Types.Unique.MemoFun import GHC.Core.FamInstEnv import GHC.Core.DataCon import GHC.Core.Type import GHC.Core.Utils import GHC.Core import GHC.Core.Seq import GHC.Core.Opt.WorkWrap.Utils import GHC.Data.Graph.UnVar -- for UnVarSet import GHC.Utils.Outputable import GHC.Utils.Misc import GHC.Utils.Panic import GHC.Utils.Panic.Plain import GHC.Utils.Logger ( Logger, putDumpFileMaybe, DumpFormat (..) ) import Data.List ( mapAccumL ) {- Note [Constructed Product Result] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ The goal of Constructed Product Result analysis is to identify functions that surely return heap-allocated records on every code path, so that we can eliminate said heap allocation by performing a worker/wrapper split (via 'GHC.Core.Opt.WorkWrap.Utils.mkWWcpr_entry'). `swap` below is such a function: ``` swap (a, b) = (b, a) ``` A `case` on an application of `swap`, like `case swap (10, 42) of (a, b) -> a + b` could cancel away (by case-of-known-constructor) if we \"inlined\" `swap` and simplified. We then say that `swap` has the CPR property. We can't inline recursive functions, but similar reasoning applies there: ``` f x n = case n of 0 -> (x, 0) _ -> f (x+1) (n-1) ``` Inductively, `case f 1 2 of (a, b) -> a + b` could cancel away the constructed product with the case. So `f`, too, has the CPR property. But we can't really "inline" `f`, because it's recursive. Also, non-recursive functions like `swap` might be too big to inline (or even marked NOINLINE). We still want to exploit the CPR property, and that is exactly what the worker/wrapper transformation can do for us: ``` $wf x n = case n of 0 -> case (x, 0) of -> (a, b) -> (# a, b #) _ -> case f (x+1) (n-1) of (a, b) -> (# a, b #) f x n = case $wf x n of (# a, b #) -> (a, b) ``` where $wf readily simplifies (by case-of-known-constructor and inlining `f`) to: ``` $wf x n = case n of 0 -> (# x, 0 #) _ -> $wf (x+1) (n-1) ``` Now, a call site like `case f 1 2 of (a, b) -> a + b` can inline `f` and eliminate the heap-allocated pair constructor. Note [Nested CPR] ~~~~~~~~~~~~~~~~~ We can apply Note [Constructed Product Result] deeper than just the top-level result constructor of a function, e.g., ``` g x | even x = (x+1,x+2) :: (Int, Int) | odd x = (x+2,x+3) ``` Not only does `g` return a constructed pair, the pair components /also/ have the CPR property. We can split `g` for its /nested/ CPR property, as follows: ``` $wg (x :: Int#) | .. x .. = (# x +# 1#, x +# 2# #) :: (# Int#, Int# #) | .. x .. = (# x +# 2#, x +# 3# #) g (I# x) = case $wf x of (# y, z #) -> (I# y, I# z) ``` Note however that in the following we will only unbox the second component, even if `foo` has the CPR property: ``` h x | even x = (foo x, x+2) :: (Int, Int) | odd x = (x+2, x+3) -- where `foo` has the CPR property ``` Why can't we also unbox `foo x`? Because in order to do so, we have to evaluate it and that might diverge, so we cannot give `h` the nested CPR property in the first component of the result. The Right Thing is to do a termination analysis, to see if we can guarantee that `foo` terminates quickly, in which case we can speculatively evaluate `foo x` and hence give `h` a nested CPR property. That is done in !1866. But for now we have an incredibly simple termination analysis; an expression terminates fast iff it is in HNF: see `exprTerminates`. We call `exprTerminates` in `cprTransformDataConWork`, which is the main function figuring out whether it's OK to propagate nested CPR info (in `extract_nested_cpr`). In addition to `exprTerminates`, `extract_nested_cpr` also looks at the `StrictnessMark` of the corresponding constructor field. Example: ``` data T a = MkT !a h2 x | even x = MkT (foo x) :: T Int | odd x = MkT (x+2) -- where `foo` has the CPR property ``` Regardless of whether or not `foo` terminates, we may unbox the strict field, because it has to be evaluated (the Core for `MkT (foo x)` will look more like `case foo x of y { __DEFAULT -> MkT y }`). Surprisingly, there are local binders with a strict demand that *do not* terminate quickly in a sense that is useful to us! The following function demonstrates that: ``` j x = (let t = x+1 in t+t, 42) ``` Here, `t` is used strictly, *but only within its scope in the first pair component*. `t` satisfies Note [CPR for binders that will be unboxed], so it has the CPR property, nevertheless we may not unbox `j` deeply lest evaluation of `x` diverges. The termination analysis must say "Might diverge" for `t` and we won't unbox the first pair component. There are a couple of tests in T18174 that show case Nested CPR. Some of them only work with the termination analysis from !1866. Giving the (Nested) CPR property to deep data structures can lead to loss of sharing; see Note [CPR for data structures can destroy sharing]. Note [Phase ordering] ~~~~~~~~~~~~~~~~~~~~~ We need to perform strictness analysis before CPR analysis, because that might unbox some arguments, in turn leading to more constructed products. Ideally, we would want the following pipeline: 1. Strictness 2. worker/wrapper (for strictness) 3. CPR 4. worker/wrapper (for CPR) Currently, we omit 2. and anticipate the results of worker/wrapper. See Note [CPR for binders that will be unboxed]. An additional w/w pass would simplify things, but probably add slight overhead. So currently we have 1. Strictness 2. CPR 3. worker/wrapper (for strictness and CPR) -} -- -- * Analysing programs -- cprAnalProgram :: Logger -> FamInstEnvs -> CoreProgram -> IO CoreProgram cprAnalProgram :: Logger -> FamInstEnvs -> CoreProgram -> IO CoreProgram cprAnalProgram Logger logger FamInstEnvs fam_envs CoreProgram binds = do let env :: AnalEnv env = FamInstEnvs -> AnalEnv emptyAnalEnv FamInstEnvs fam_envs let binds_plus_cpr :: CoreProgram binds_plus_cpr = (AnalEnv, CoreProgram) -> CoreProgram forall a b. (a, b) -> b snd ((AnalEnv, CoreProgram) -> CoreProgram) -> (AnalEnv, CoreProgram) -> CoreProgram forall a b. (a -> b) -> a -> b $ (AnalEnv -> CoreBind -> (AnalEnv, CoreBind)) -> AnalEnv -> CoreProgram -> (AnalEnv, CoreProgram) forall (t :: * -> *) a b c. Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c) mapAccumL AnalEnv -> CoreBind -> (AnalEnv, CoreBind) cprAnalTopBind AnalEnv env CoreProgram binds Logger -> DumpFlag -> String -> DumpFormat -> SDoc -> IO () putDumpFileMaybe Logger logger DumpFlag Opt_D_dump_cpr_signatures String "Cpr signatures" DumpFormat FormatText (SDoc -> IO ()) -> SDoc -> IO () forall a b. (a -> b) -> a -> b $ Bool -> (IdInfo -> SDoc) -> CoreProgram -> SDoc dumpIdInfoOfProgram Bool False (CprSig -> SDoc forall a. Outputable a => a -> SDoc ppr (CprSig -> SDoc) -> (IdInfo -> CprSig) -> IdInfo -> SDoc forall b c a. (b -> c) -> (a -> b) -> a -> c . IdInfo -> CprSig cprSigInfo) CoreProgram binds_plus_cpr -- See Note [Stamp out space leaks in demand analysis] in GHC.Core.Opt.DmdAnal CoreProgram -> () seqBinds CoreProgram binds_plus_cpr () -> IO CoreProgram -> IO CoreProgram `seq` CoreProgram -> IO CoreProgram forall (m :: * -> *) a. Monad m => a -> m a return CoreProgram binds_plus_cpr -- Analyse a (group of) top-level binding(s) cprAnalTopBind :: AnalEnv -> CoreBind -> (AnalEnv, CoreBind) cprAnalTopBind :: AnalEnv -> CoreBind -> (AnalEnv, CoreBind) cprAnalTopBind AnalEnv env (NonRec CoreBndr id Expr CoreBndr rhs) = (AnalEnv env', CoreBndr -> Expr CoreBndr -> CoreBind forall b. b -> Expr b -> Bind b NonRec CoreBndr id' Expr CoreBndr rhs') where (CoreBndr id', Expr CoreBndr rhs', AnalEnv env') = AnalEnv -> CoreBndr -> Expr CoreBndr -> (CoreBndr, Expr CoreBndr, AnalEnv) cprAnalBind AnalEnv env CoreBndr id Expr CoreBndr rhs cprAnalTopBind AnalEnv env (Rec [(CoreBndr, Expr CoreBndr)] pairs) = (AnalEnv env', [(CoreBndr, Expr CoreBndr)] -> CoreBind forall b. [(b, Expr b)] -> Bind b Rec [(CoreBndr, Expr CoreBndr)] pairs') where (AnalEnv env', [(CoreBndr, Expr CoreBndr)] pairs') = AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) cprFix AnalEnv env [(CoreBndr, Expr CoreBndr)] pairs -- -- * Analysing expressions -- -- | The abstract semantic function ⟦_⟧ : Expr -> Env -> A from -- "Constructed Product Result Analysis for Haskell" cprAnal, cprAnal' :: AnalEnv -> CoreExpr -- ^ expression to be denoted by a 'CprType' -> (CprType, CoreExpr) -- ^ the updated expression and its 'CprType' cprAnal :: AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr e = -- pprTraceWith "cprAnal" (\res -> ppr (fst (res)) $$ ppr e) $ AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal' AnalEnv env Expr CoreBndr e cprAnal' :: AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal' AnalEnv _ (Lit Literal lit) = (CprType topCprType, Literal -> Expr CoreBndr forall b. Literal -> Expr b Lit Literal lit) cprAnal' AnalEnv _ (Type Type ty) = (CprType topCprType, Type -> Expr CoreBndr forall b. Type -> Expr b Type Type ty) -- Doesn't happen, in fact cprAnal' AnalEnv _ (Coercion Coercion co) = (CprType topCprType, Coercion -> Expr CoreBndr forall b. Coercion -> Expr b Coercion Coercion co) cprAnal' AnalEnv env (Cast Expr CoreBndr e Coercion co) = (CprType cpr_ty, Expr CoreBndr -> Coercion -> Expr CoreBndr forall b. Expr b -> Coercion -> Expr b Cast Expr CoreBndr e' Coercion co) where (CprType cpr_ty, Expr CoreBndr e') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr e cprAnal' AnalEnv env (Tick CoreTickish t Expr CoreBndr e) = (CprType cpr_ty, CoreTickish -> Expr CoreBndr -> Expr CoreBndr forall b. CoreTickish -> Expr b -> Expr b Tick CoreTickish t Expr CoreBndr e') where (CprType cpr_ty, Expr CoreBndr e') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr e cprAnal' AnalEnv env e :: Expr CoreBndr e@(Var{}) = AnalEnv -> Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> (CprType, Expr CoreBndr) cprAnalApp AnalEnv env Expr CoreBndr e [] cprAnal' AnalEnv env e :: Expr CoreBndr e@(App{}) = AnalEnv -> Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> (CprType, Expr CoreBndr) cprAnalApp AnalEnv env Expr CoreBndr e [] cprAnal' AnalEnv env (Lam CoreBndr var Expr CoreBndr body) | CoreBndr -> Bool isTyVar CoreBndr var , (CprType body_ty, Expr CoreBndr body') <- AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr body = (CprType body_ty, CoreBndr -> Expr CoreBndr -> Expr CoreBndr forall b. b -> Expr b -> Expr b Lam CoreBndr var Expr CoreBndr body') | Bool otherwise = (CprType lam_ty, CoreBndr -> Expr CoreBndr -> Expr CoreBndr forall b. b -> Expr b -> Expr b Lam CoreBndr var Expr CoreBndr body') where -- See Note [CPR for binders that will be unboxed] env' :: AnalEnv env' = AnalEnv -> CoreBndr -> AnalEnv extendSigEnvForArg AnalEnv env CoreBndr var (CprType body_ty, Expr CoreBndr body') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env' Expr CoreBndr body lam_ty :: CprType lam_ty = CprType -> CprType abstractCprTy CprType body_ty cprAnal' AnalEnv env (Case Expr CoreBndr scrut CoreBndr case_bndr Type ty [Alt CoreBndr] alts) = (CprType res_ty, Expr CoreBndr -> CoreBndr -> Type -> [Alt CoreBndr] -> Expr CoreBndr forall b. Expr b -> b -> Type -> [Alt b] -> Expr b Case Expr CoreBndr scrut' CoreBndr case_bndr Type ty [Alt CoreBndr] alts') where (CprType scrut_ty, Expr CoreBndr scrut') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr scrut env' :: AnalEnv env' = AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr case_bndr (CprType -> CprSig CprSig CprType scrut_ty) ([CprType] alt_tys, [Alt CoreBndr] alts') = (Alt CoreBndr -> (CprType, Alt CoreBndr)) -> [Alt CoreBndr] -> ([CprType], [Alt CoreBndr]) forall a b c. (a -> (b, c)) -> [a] -> ([b], [c]) mapAndUnzip (AnalEnv -> CprType -> Alt CoreBndr -> (CprType, Alt CoreBndr) cprAnalAlt AnalEnv env' CprType scrut_ty) [Alt CoreBndr] alts res_ty :: CprType res_ty = (CprType -> CprType -> CprType) -> CprType -> [CprType] -> CprType forall (t :: * -> *) b a. Foldable t => (b -> a -> b) -> b -> t a -> b foldl' CprType -> CprType -> CprType lubCprType CprType botCprType [CprType] alt_tys cprAnal' AnalEnv env (Let (NonRec CoreBndr id Expr CoreBndr rhs) Expr CoreBndr body) = (CprType body_ty, CoreBind -> Expr CoreBndr -> Expr CoreBndr forall b. Bind b -> Expr b -> Expr b Let (CoreBndr -> Expr CoreBndr -> CoreBind forall b. b -> Expr b -> Bind b NonRec CoreBndr id' Expr CoreBndr rhs') Expr CoreBndr body') where (CoreBndr id', Expr CoreBndr rhs', AnalEnv env') = AnalEnv -> CoreBndr -> Expr CoreBndr -> (CoreBndr, Expr CoreBndr, AnalEnv) cprAnalBind AnalEnv env CoreBndr id Expr CoreBndr rhs (CprType body_ty, Expr CoreBndr body') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env' Expr CoreBndr body cprAnal' AnalEnv env (Let (Rec [(CoreBndr, Expr CoreBndr)] pairs) Expr CoreBndr body) = CprType body_ty CprType -> (CprType, Expr CoreBndr) -> (CprType, Expr CoreBndr) `seq` (CprType body_ty, CoreBind -> Expr CoreBndr -> Expr CoreBndr forall b. Bind b -> Expr b -> Expr b Let ([(CoreBndr, Expr CoreBndr)] -> CoreBind forall b. [(b, Expr b)] -> Bind b Rec [(CoreBndr, Expr CoreBndr)] pairs') Expr CoreBndr body') where (AnalEnv env', [(CoreBndr, Expr CoreBndr)] pairs') = AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) cprFix AnalEnv env [(CoreBndr, Expr CoreBndr)] pairs (CprType body_ty, Expr CoreBndr body') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env' Expr CoreBndr body cprAnalAlt :: AnalEnv -> CprType -- ^ CPR type of the scrutinee -> Alt Var -- ^ current alternative -> (CprType, Alt Var) cprAnalAlt :: AnalEnv -> CprType -> Alt CoreBndr -> (CprType, Alt CoreBndr) cprAnalAlt AnalEnv env CprType scrut_ty (Alt AltCon con [CoreBndr] bndrs Expr CoreBndr rhs) = (CprType rhs_ty, AltCon -> [CoreBndr] -> Expr CoreBndr -> Alt CoreBndr forall b. AltCon -> [b] -> Expr b -> Alt b Alt AltCon con [CoreBndr] bndrs Expr CoreBndr rhs') where env_alt :: AnalEnv env_alt | DataAlt DataCon dc <- AltCon con , let ids :: [CoreBndr] ids = (CoreBndr -> Bool) -> [CoreBndr] -> [CoreBndr] forall a. (a -> Bool) -> [a] -> [a] filter CoreBndr -> Bool isId [CoreBndr] bndrs , CprType Arity arity Cpr cpr <- CprType scrut_ty , Bool -> Bool -> Bool forall a. HasCallStack => Bool -> a -> a assert (Arity arity Arity -> Arity -> Bool forall a. Eq a => a -> a -> Bool == Arity 0 ) Bool True = case DataCon -> Cpr -> UnpackConFieldsResult unpackConFieldsCpr DataCon dc Cpr cpr of AllFieldsSame Cpr field_cpr | let sig :: CprSig sig = Arity -> Cpr -> CprSig mkCprSig Arity 0 Cpr field_cpr -> AnalEnv -> [CoreBndr] -> CprSig -> AnalEnv extendSigEnvAllSame AnalEnv env [CoreBndr] ids CprSig sig ForeachField [Cpr] field_cprs | let sigs :: [CprSig] sigs = (CoreBndr -> Cpr -> CprSig) -> [CoreBndr] -> [Cpr] -> [CprSig] forall a b c. (a -> b -> c) -> [a] -> [b] -> [c] zipWith (Arity -> Cpr -> CprSig mkCprSig (Arity -> Cpr -> CprSig) -> (CoreBndr -> Arity) -> CoreBndr -> Cpr -> CprSig forall b c a. (b -> c) -> (a -> b) -> a -> c . CoreBndr -> Arity idArity) [CoreBndr] ids [Cpr] field_cprs -> AnalEnv -> [(CoreBndr, CprSig)] -> AnalEnv extendSigEnvList AnalEnv env (String -> [CoreBndr] -> [CprSig] -> [(CoreBndr, CprSig)] forall a b. String -> [a] -> [b] -> [(a, b)] zipEqual String "cprAnalAlt" [CoreBndr] ids [CprSig] sigs) | Bool otherwise = AnalEnv env (CprType rhs_ty, Expr CoreBndr rhs') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env_alt Expr CoreBndr rhs -- -- * CPR transformer -- data TermFlag -- Better than using a Bool = Terminates | MightDiverge -- See Note [Nested CPR] exprTerminates :: CoreExpr -> TermFlag exprTerminates :: Expr CoreBndr -> TermFlag exprTerminates Expr CoreBndr e | Expr CoreBndr -> Bool exprIsHNF Expr CoreBndr e = TermFlag Terminates -- A /very/ simple termination analysis. | Bool otherwise = TermFlag MightDiverge cprAnalApp :: AnalEnv -> CoreExpr -> [(CprType, CoreArg)] -> (CprType, CoreExpr) -- Main function that takes care of /nested/ CPR. See Note [Nested CPR] cprAnalApp :: AnalEnv -> Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> (CprType, Expr CoreBndr) cprAnalApp AnalEnv env Expr CoreBndr e [(CprType, Expr CoreBndr)] arg_infos = Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> [Expr CoreBndr] -> (CprType, Expr CoreBndr) go Expr CoreBndr e [(CprType, Expr CoreBndr)] arg_infos [] where go :: Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> [Expr CoreBndr] -> (CprType, Expr CoreBndr) go Expr CoreBndr e [(CprType, Expr CoreBndr)] arg_infos [Expr CoreBndr] args' -- Collect CprTypes for (value) args (inlined collectArgs): | App Expr CoreBndr fn Expr CoreBndr arg <- Expr CoreBndr e, Expr CoreBndr -> Bool forall b. Expr b -> Bool isTypeArg Expr CoreBndr arg -- Don't analyse Type args = Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> [Expr CoreBndr] -> (CprType, Expr CoreBndr) go Expr CoreBndr fn [(CprType, Expr CoreBndr)] arg_infos (Expr CoreBndr argExpr CoreBndr -> [Expr CoreBndr] -> [Expr CoreBndr] forall a. a -> [a] -> [a] :[Expr CoreBndr] args') | App Expr CoreBndr fn Expr CoreBndr arg <- Expr CoreBndr e , arg_info :: (CprType, Expr CoreBndr) arg_info@(CprType _arg_ty, Expr CoreBndr arg') <- AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr arg -- See Note [Nested CPR] on the need for termination analysis = Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> [Expr CoreBndr] -> (CprType, Expr CoreBndr) go Expr CoreBndr fn ((CprType, Expr CoreBndr) arg_info(CprType, Expr CoreBndr) -> [(CprType, Expr CoreBndr)] -> [(CprType, Expr CoreBndr)] forall a. a -> [a] -> [a] :[(CprType, Expr CoreBndr)] arg_infos) (Expr CoreBndr arg'Expr CoreBndr -> [Expr CoreBndr] -> [Expr CoreBndr] forall a. a -> [a] -> [a] :[Expr CoreBndr] args') | Var CoreBndr fn <- Expr CoreBndr e = (AnalEnv -> CoreBndr -> [(CprType, Expr CoreBndr)] -> CprType cprTransform AnalEnv env CoreBndr fn [(CprType, Expr CoreBndr)] arg_infos, Expr CoreBndr -> [Expr CoreBndr] -> Expr CoreBndr forall b. Expr b -> [Expr b] -> Expr b mkApps Expr CoreBndr e [Expr CoreBndr] args') | (CprType e_ty, Expr CoreBndr e') <- AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr e -- e is not an App and not a Var = (CprType -> Arity -> CprType applyCprTy CprType e_ty ([(CprType, Expr CoreBndr)] -> Arity forall (t :: * -> *) a. Foldable t => t a -> Arity length [(CprType, Expr CoreBndr)] arg_infos), Expr CoreBndr -> [Expr CoreBndr] -> Expr CoreBndr forall b. Expr b -> [Expr b] -> Expr b mkApps Expr CoreBndr e' [Expr CoreBndr] args') cprTransform :: AnalEnv -- ^ The analysis environment -> Id -- ^ The function -> [(CprType, CoreArg)] -- ^ info about incoming /value/ arguments -> CprType -- ^ The demand type of the application cprTransform :: AnalEnv -> CoreBndr -> [(CprType, Expr CoreBndr)] -> CprType cprTransform AnalEnv env CoreBndr id [(CprType, Expr CoreBndr)] args -- Any local binding, except for data structure bindings -- See Note [Efficient Top sigs in SigEnv] | Just CprSig sig <- AnalEnv -> CoreBndr -> Maybe CprSig lookupSigEnv AnalEnv env CoreBndr id = CprType -> Arity -> CprType applyCprTy (CprSig -> CprType getCprSig CprSig sig) ([(CprType, Expr CoreBndr)] -> Arity forall (t :: * -> *) a. Foldable t => t a -> Arity length [(CprType, Expr CoreBndr)] args) -- See Note [CPR for data structures] | Just Expr CoreBndr rhs <- CoreBndr -> Maybe (Expr CoreBndr) cprDataStructureUnfolding_maybe CoreBndr id = (CprType, Expr CoreBndr) -> CprType forall a b. (a, b) -> a fst ((CprType, Expr CoreBndr) -> CprType) -> (CprType, Expr CoreBndr) -> CprType forall a b. (a -> b) -> a -> b $ AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr rhs -- Some (mostly global, known-key) Ids have bespoke CPR transformers | Just CprType cpr_ty <- CoreBndr -> [(CprType, Expr CoreBndr)] -> Maybe CprType cprTransformBespoke CoreBndr id [(CprType, Expr CoreBndr)] args = CprType cpr_ty -- Other local Ids that respond True to 'isDataStructure' but don't have an -- expandable unfolding, such as NOINLINE bindings. They all get a top sig | CoreBndr -> Bool isLocalId CoreBndr id = Bool -> SDoc -> CprType -> CprType forall a. HasCallStack => Bool -> SDoc -> a -> a assertPpr (CoreBndr -> Bool isDataStructure CoreBndr id) (CoreBndr -> SDoc forall a. Outputable a => a -> SDoc ppr CoreBndr id) CprType topCprType -- See Note [CPR for DataCon wrappers] | CoreBndr -> Bool isDataConWrapId CoreBndr id, let rhs :: Expr CoreBndr rhs = Unfolding -> Expr CoreBndr uf_tmpl (CoreBndr -> Unfolding realIdUnfolding CoreBndr id) = (CprType, Expr CoreBndr) -> CprType forall a b. (a, b) -> a fst ((CprType, Expr CoreBndr) -> CprType) -> (CprType, Expr CoreBndr) -> CprType forall a b. (a -> b) -> a -> b $ AnalEnv -> Expr CoreBndr -> [(CprType, Expr CoreBndr)] -> (CprType, Expr CoreBndr) cprAnalApp AnalEnv env Expr CoreBndr rhs [(CprType, Expr CoreBndr)] args -- DataCon worker | Just DataCon con <- CoreBndr -> Maybe DataCon isDataConWorkId_maybe CoreBndr id = AnalEnv -> DataCon -> [(CprType, Expr CoreBndr)] -> CprType cprTransformDataConWork AnalEnv env DataCon con [(CprType, Expr CoreBndr)] args -- Imported function | Bool otherwise = CprType -> Arity -> CprType applyCprTy (CprSig -> CprType getCprSig (CoreBndr -> CprSig idCprSig CoreBndr id)) ([(CprType, Expr CoreBndr)] -> Arity forall (t :: * -> *) a. Foldable t => t a -> Arity length [(CprType, Expr CoreBndr)] args) -- | Precise, hand-written CPR transformers for select Ids cprTransformBespoke :: Id -> [(CprType, CoreArg)] -> Maybe CprType cprTransformBespoke :: CoreBndr -> [(CprType, Expr CoreBndr)] -> Maybe CprType cprTransformBespoke CoreBndr id [(CprType, Expr CoreBndr)] args -- See Note [Simplification of runRW#] in GHC.CoreToStg.Prep | CoreBndr -> Unique idUnique CoreBndr id Unique -> Unique -> Bool forall a. Eq a => a -> a -> Bool == Unique runRWKey -- `runRW (\s -> e)` , [(CprType arg_ty, Expr CoreBndr _arg)] <- [(CprType, Expr CoreBndr)] args -- `\s -> e` has CPR type `arg` (e.g. `. -> 2`) = CprType -> Maybe CprType forall a. a -> Maybe a Just (CprType -> Maybe CprType) -> CprType -> Maybe CprType forall a b. (a -> b) -> a -> b $ CprType -> Arity -> CprType applyCprTy CprType arg_ty Arity 1 -- `e` has CPR type `2` | Bool otherwise = Maybe CprType forall a. Maybe a Nothing -- | Get a (possibly nested) 'CprType' for an application of a 'DataCon' worker, -- given a saturated number of 'CprType's for its field expressions. -- Implements the Nested part of Note [Nested CPR]. cprTransformDataConWork :: AnalEnv -> DataCon -> [(CprType, CoreArg)] -> CprType cprTransformDataConWork :: AnalEnv -> DataCon -> [(CprType, Expr CoreBndr)] -> CprType cprTransformDataConWork AnalEnv env DataCon con [(CprType, Expr CoreBndr)] args | [CoreBndr] -> Bool forall (t :: * -> *) a. Foldable t => t a -> Bool null (DataCon -> [CoreBndr] dataConExTyCoVars DataCon con) -- No existentials , Arity wkr_arity Arity -> Arity -> Bool forall a. Ord a => a -> a -> Bool <= Arity mAX_CPR_SIZE -- See Note [Trimming to mAX_CPR_SIZE] , [(CprType, Expr CoreBndr)] args [(CprType, Expr CoreBndr)] -> Arity -> Bool forall a. [a] -> Arity -> Bool `lengthIs` Arity wkr_arity , AnalEnv -> DataCon -> IsRecDataConResult ae_rec_dc AnalEnv env DataCon con IsRecDataConResult -> IsRecDataConResult -> Bool forall a. Eq a => a -> a -> Bool /= IsRecDataConResult DefinitelyRecursive -- See Note [CPR for recursive data constructors] -- , pprTrace "cprTransformDataConWork" (ppr con <+> ppr wkr_arity <+> ppr args) True = Arity -> Cpr -> CprType CprType Arity 0 (Arity -> [Cpr] -> Cpr ConCpr (DataCon -> Arity dataConTag DataCon con) (((CprType, Expr CoreBndr) -> StrictnessMark -> Cpr) -> [(CprType, Expr CoreBndr)] -> [StrictnessMark] -> [Cpr] forall a b c. (a -> b -> c) -> [a] -> [b] -> [c] strictZipWith (CprType, Expr CoreBndr) -> StrictnessMark -> Cpr extract_nested_cpr [(CprType, Expr CoreBndr)] args [StrictnessMark] wkr_str_marks)) | Bool otherwise = CprType topCprType where wkr_arity :: Arity wkr_arity = DataCon -> Arity dataConRepArity DataCon con wkr_str_marks :: [StrictnessMark] wkr_str_marks = DataCon -> [StrictnessMark] dataConRepStrictness DataCon con -- See Note [Nested CPR] extract_nested_cpr :: (CprType, Expr CoreBndr) -> StrictnessMark -> Cpr extract_nested_cpr (CprType Arity 0 Cpr cpr, Expr CoreBndr arg) StrictnessMark str | StrictnessMark MarkedStrict <- StrictnessMark str = Cpr cpr | TermFlag Terminates <- Expr CoreBndr -> TermFlag exprTerminates Expr CoreBndr arg = Cpr cpr extract_nested_cpr (CprType, Expr CoreBndr) _ StrictnessMark _ = Cpr topCpr -- intervening lambda or doesn't terminate -- | See Note [Trimming to mAX_CPR_SIZE]. mAX_CPR_SIZE :: Arity mAX_CPR_SIZE :: Arity mAX_CPR_SIZE = Arity 10 -- -- * Bindings -- -- Recursive bindings cprFix :: AnalEnv -- Does not include bindings for this binding -> [(Id,CoreExpr)] -> (AnalEnv, [(Id,CoreExpr)]) -- Binders annotated with CPR info cprFix :: AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) cprFix AnalEnv orig_env [(CoreBndr, Expr CoreBndr)] orig_pairs = Arity -> AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) loop Arity 1 AnalEnv init_env [(CoreBndr, Expr CoreBndr)] init_pairs where init_sig :: CoreBndr -> CprSig init_sig CoreBndr id -- See Note [CPR for data structures] | CoreBndr -> Bool isDataStructure CoreBndr id = CprSig topCprSig | Bool otherwise = Arity -> Cpr -> CprSig mkCprSig Arity 0 Cpr botCpr -- See Note [Initialising strictness] in GHC.Core.Opt.DmdAnal orig_virgin :: Bool orig_virgin = AnalEnv -> Bool ae_virgin AnalEnv orig_env init_pairs :: [(CoreBndr, Expr CoreBndr)] init_pairs | Bool orig_virgin = [(CoreBndr -> CprSig -> CoreBndr setIdCprSig CoreBndr id (CoreBndr -> CprSig init_sig CoreBndr id), Expr CoreBndr rhs) | (CoreBndr id, Expr CoreBndr rhs) <- [(CoreBndr, Expr CoreBndr)] orig_pairs ] | Bool otherwise = [(CoreBndr, Expr CoreBndr)] orig_pairs init_env :: AnalEnv init_env = AnalEnv -> [CoreBndr] -> AnalEnv extendSigEnvFromIds AnalEnv orig_env (((CoreBndr, Expr CoreBndr) -> CoreBndr) -> [(CoreBndr, Expr CoreBndr)] -> [CoreBndr] forall a b. (a -> b) -> [a] -> [b] map (CoreBndr, Expr CoreBndr) -> CoreBndr forall a b. (a, b) -> a fst [(CoreBndr, Expr CoreBndr)] init_pairs) -- The fixed-point varies the idCprSig field of the binders and and their -- entries in the AnalEnv, and terminates if that annotation does not change -- any more. loop :: Int -> AnalEnv -> [(Id,CoreExpr)] -> (AnalEnv, [(Id,CoreExpr)]) loop :: Arity -> AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) loop Arity n AnalEnv env [(CoreBndr, Expr CoreBndr)] pairs | Bool found_fixpoint = (AnalEnv reset_env', [(CoreBndr, Expr CoreBndr)] pairs') | Bool otherwise = Arity -> AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) loop (Arity nArity -> Arity -> Arity forall a. Num a => a -> a -> a +Arity 1) AnalEnv env' [(CoreBndr, Expr CoreBndr)] pairs' where -- In all but the first iteration, delete the virgin flag -- See Note [Initialising strictness] in GHC.Core.Opt.DmdAnal (AnalEnv env', [(CoreBndr, Expr CoreBndr)] pairs') = AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) step (Bool -> (AnalEnv -> AnalEnv) -> AnalEnv -> AnalEnv forall a. Bool -> (a -> a) -> a -> a applyWhen (Arity nArity -> Arity -> Bool forall a. Eq a => a -> a -> Bool /=Arity 1) AnalEnv -> AnalEnv nonVirgin AnalEnv env) [(CoreBndr, Expr CoreBndr)] pairs -- Make sure we reset the virgin flag to what it was when we are stable reset_env' :: AnalEnv reset_env' = AnalEnv env'{ ae_virgin :: Bool ae_virgin = Bool orig_virgin } found_fixpoint :: Bool found_fixpoint = ((CoreBndr, Expr CoreBndr) -> CprSig) -> [(CoreBndr, Expr CoreBndr)] -> [CprSig] forall a b. (a -> b) -> [a] -> [b] map (CoreBndr -> CprSig idCprSig (CoreBndr -> CprSig) -> ((CoreBndr, Expr CoreBndr) -> CoreBndr) -> (CoreBndr, Expr CoreBndr) -> CprSig forall b c a. (b -> c) -> (a -> b) -> a -> c . (CoreBndr, Expr CoreBndr) -> CoreBndr forall a b. (a, b) -> a fst) [(CoreBndr, Expr CoreBndr)] pairs' [CprSig] -> [CprSig] -> Bool forall a. Eq a => a -> a -> Bool == ((CoreBndr, Expr CoreBndr) -> CprSig) -> [(CoreBndr, Expr CoreBndr)] -> [CprSig] forall a b. (a -> b) -> [a] -> [b] map (CoreBndr -> CprSig idCprSig (CoreBndr -> CprSig) -> ((CoreBndr, Expr CoreBndr) -> CoreBndr) -> (CoreBndr, Expr CoreBndr) -> CprSig forall b c a. (b -> c) -> (a -> b) -> a -> c . (CoreBndr, Expr CoreBndr) -> CoreBndr forall a b. (a, b) -> a fst) [(CoreBndr, Expr CoreBndr)] pairs step :: AnalEnv -> [(Id, CoreExpr)] -> (AnalEnv, [(Id, CoreExpr)]) step :: AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) step AnalEnv env [(CoreBndr, Expr CoreBndr)] pairs = (AnalEnv -> (CoreBndr, Expr CoreBndr) -> (AnalEnv, (CoreBndr, Expr CoreBndr))) -> AnalEnv -> [(CoreBndr, Expr CoreBndr)] -> (AnalEnv, [(CoreBndr, Expr CoreBndr)]) forall (t :: * -> *) a b c. Traversable t => (a -> b -> (a, c)) -> a -> t b -> (a, t c) mapAccumL AnalEnv -> (CoreBndr, Expr CoreBndr) -> (AnalEnv, (CoreBndr, Expr CoreBndr)) go AnalEnv env [(CoreBndr, Expr CoreBndr)] pairs where go :: AnalEnv -> (CoreBndr, Expr CoreBndr) -> (AnalEnv, (CoreBndr, Expr CoreBndr)) go AnalEnv env (CoreBndr id, Expr CoreBndr rhs) = (AnalEnv env', (CoreBndr id', Expr CoreBndr rhs')) where (CoreBndr id', Expr CoreBndr rhs', AnalEnv env') = AnalEnv -> CoreBndr -> Expr CoreBndr -> (CoreBndr, Expr CoreBndr, AnalEnv) cprAnalBind AnalEnv env CoreBndr id Expr CoreBndr rhs {- Note [The OPAQUE pragma and avoiding the reboxing of results] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Consider: {-# OPAQUE f #-} f x = (x,y) g True = f 2 x g False = (0,0) Where if we didn't strip the CPR info from 'f' we would end up with the following W/W pair for 'g': $wg True = case f 2 of (x, y) -> (# x, y #) $wg False = (# 0, 0 #) g b = case wg$ b of (# x, y #) -> (x, y) Where the worker unboxes the result of 'f', only for wrapper to box it again. That's because the non-stripped CPR signature of 'f' is saying to W/W-transform 'f'. However, OPAQUE-annotated binders aren't W/W transformed (see Note [OPAQUE pragma]), so we should strip 'f's CPR signature. -} -- | Process the RHS of the binding for a sensible arity, add the CPR signature -- to the Id, and augment the environment with the signature as well. cprAnalBind :: AnalEnv -> Id -> CoreExpr -> (Id, CoreExpr, AnalEnv) cprAnalBind :: AnalEnv -> CoreBndr -> Expr CoreBndr -> (CoreBndr, Expr CoreBndr, AnalEnv) cprAnalBind AnalEnv env CoreBndr id Expr CoreBndr rhs | CoreBndr -> Bool isDFunId CoreBndr id -- Never give DFuns the CPR property; we'll never save allocs. = (CoreBndr id, Expr CoreBndr rhs, AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr id CprSig topCprSig) -- See Note [CPR for data structures] | CoreBndr -> Bool isDataStructure CoreBndr id = (CoreBndr id, Expr CoreBndr rhs, AnalEnv env) -- Data structure => no code => no need to analyse rhs | Bool otherwise = (CoreBndr id', Expr CoreBndr rhs', AnalEnv env') where (CprType rhs_ty, Expr CoreBndr rhs') = AnalEnv -> Expr CoreBndr -> (CprType, Expr CoreBndr) cprAnal AnalEnv env Expr CoreBndr rhs -- possibly trim thunk CPR info rhs_ty' :: CprType rhs_ty' -- See Note [CPR for thunks] | Bool stays_thunk = CprType -> CprType trimCprTy CprType rhs_ty | Bool otherwise = CprType rhs_ty -- See Note [Arity trimming for CPR signatures] sig :: CprSig sig = Arity -> CprType -> CprSig mkCprSigForArity (CoreBndr -> Arity idArity CoreBndr id) CprType rhs_ty' -- See Note [OPAQUE pragma] -- See Note [The OPAQUE pragma and avoiding the reboxing of results] sig' :: CprSig sig' | InlinePragma -> Bool isOpaquePragma (CoreBndr -> InlinePragma idInlinePragma CoreBndr id) = CprSig topCprSig | Bool otherwise = CprSig sig id' :: CoreBndr id' = CoreBndr -> CprSig -> CoreBndr setIdCprSig CoreBndr id CprSig sig' env' :: AnalEnv env' = AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr id CprSig sig' -- See Note [CPR for thunks] stays_thunk :: Bool stays_thunk = Bool is_thunk Bool -> Bool -> Bool && Bool not_strict is_thunk :: Bool is_thunk = Bool -> Bool not (Expr CoreBndr -> Bool exprIsHNF Expr CoreBndr rhs) Bool -> Bool -> Bool && Bool -> Bool not (CoreBndr -> Bool isJoinId CoreBndr id) not_strict :: Bool not_strict = Bool -> Bool not (Demand -> Bool isStrUsedDmd (CoreBndr -> Demand idDemandInfo CoreBndr id)) isDataStructure :: Id -> Bool -- See Note [CPR for data structures] isDataStructure :: CoreBndr -> Bool isDataStructure CoreBndr id = Bool -> Bool not (CoreBndr -> Bool isJoinId CoreBndr id) Bool -> Bool -> Bool && CoreBndr -> Arity idArity CoreBndr id Arity -> Arity -> Bool forall a. Eq a => a -> a -> Bool == Arity 0 Bool -> Bool -> Bool && Unfolding -> Bool isEvaldUnfolding (CoreBndr -> Unfolding idUnfolding CoreBndr id) -- | Returns an expandable unfolding -- (See Note [exprIsExpandable] in "GHC.Core.Utils") that has -- So effectively is a constructor application. cprDataStructureUnfolding_maybe :: Id -> Maybe CoreExpr cprDataStructureUnfolding_maybe :: CoreBndr -> Maybe (Expr CoreBndr) cprDataStructureUnfolding_maybe CoreBndr id -- There are only FinalPhase Simplifier runs after CPR analysis | Activation -> Bool activeInFinalPhase (CoreBndr -> Activation idInlineActivation CoreBndr id) , CoreBndr -> Bool isDataStructure CoreBndr id = Unfolding -> Maybe (Expr CoreBndr) expandUnfolding_maybe (CoreBndr -> Unfolding idUnfolding CoreBndr id) | Bool otherwise = Maybe (Expr CoreBndr) forall a. Maybe a Nothing {- Note [Arity trimming for CPR signatures] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Although it doesn't affect correctness of the analysis per se, we have to trim CPR signatures to idArity. Here's what might happen if we don't: f x = if expensive then \y. Box y else \z. Box z g a b = f a b The two lambdas will have a CPR type of @1m@ (so construct a product after applied to one argument). Thus, @f@ will have a CPR signature of @2m@ (constructs a product after applied to two arguments). But WW will never eta-expand @f@! In this case that would amount to possibly duplicating @expensive@ work. (Side note: Even if @f@'s 'idArity' happened to be 2, it would not do so, see Note [Don't eta expand in w/w].) So @f@ will not be worker/wrappered. But @g@ also inherited its CPR signature from @f@'s, so it *will* be WW'd: f x = if expensive then \y. Box y else \z. Box z $wg a b = case f a b of Box x -> x g a b = Box ($wg a b) And the case in @g@ can never cancel away, thus we introduced extra reboxing. Hence we always trim the CPR signature of a binding to idArity. Note [CPR for DataCon wrappers] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We used to give DataCon wrappers a (necessarily flat) CPR signature in 'GHC.Types.Id.Make.mkDataConRep'. Now we transform DataCon wrappers simply by analysing their unfolding. A few reasons for the change: 1. DataCon wrappers are generally inlined in the Final phase (so before CPR), all leftover occurrences are in a boring context like `f x y = $WMkT y x`. It's simpler to analyse the unfolding anew at every such call site, and the unfolding will be pretty cheap to analyse. Also they occur seldom enough that performance-wise it doesn't matter. 2. 'GHC.Types.Id.Make' no longer precomputes CPR signatures for DataCon *workers*, because their transformers need to adapt to CPR for their arguments in 'cprTransformDataConWork' to enable Note [Nested CPR]. Better keep it all in this module! The alternative would be that 'GHC.Types.Id.Make' depends on DmdAnal. 3. In the future, Nested CPR could take a better account of incoming args in cprAnalApp and do some beta-reduction on the fly, like !1866 did. If any of those args had the CPR property, then we'd even get Nested CPR for DataCon wrapper calls, for free. Not so if we simply give the wrapper a single CPR sig in 'GHC.Types.Id.Make.mkDataConRep'! Note [Trimming to mAX_CPR_SIZE] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We do not treat very big tuples as CPR-ish: a) For a start, we get into trouble because there aren't "enough" unboxed tuple types (a tiresome restriction, but hard to fix), b) More importantly, big unboxed tuples get returned mainly on the stack, and are often then allocated in the heap by the caller. So doing CPR for them may in fact make things worse, especially if the wrapper doesn't cancel away and we move to the stack in the worker and then to the heap in the wrapper. So we (nested) CPR for functions that would otherwise pass more than than 'mAX_CPR_SIZE' fields. That effect is exacerbated for the unregisterised backend, where we don't have any hardware registers to return the fields in. Returning everything on the stack results in much churn and increases compiler allocation by 15% for T15164 in a validate build. -} data AnalEnv = AE { AnalEnv -> SigEnv ae_sigs :: SigEnv -- ^ Current approximation of signatures for local ids , AnalEnv -> Bool ae_virgin :: Bool -- ^ True only on every first iteration in a fixed-point -- iteration. See Note [Initialising strictness] in "GHC.Core.Opt.DmdAnal" , AnalEnv -> FamInstEnvs ae_fam_envs :: FamInstEnvs -- ^ Needed when expanding type families and synonyms of product types. , AnalEnv -> DataCon -> IsRecDataConResult ae_rec_dc :: DataCon -> IsRecDataConResult -- ^ Memoised result of 'GHC.Core.Opt.WorkWrap.Utils.isRecDataCon' } instance Outputable AnalEnv where ppr :: AnalEnv -> SDoc ppr (AE { ae_sigs :: AnalEnv -> SigEnv ae_sigs = SigEnv env, ae_virgin :: AnalEnv -> Bool ae_virgin = Bool virgin }) = String -> SDoc text String "AE" SDoc -> SDoc -> SDoc <+> SDoc -> SDoc braces ([SDoc] -> SDoc vcat [ String -> SDoc text String "ae_virgin =" SDoc -> SDoc -> SDoc <+> Bool -> SDoc forall a. Outputable a => a -> SDoc ppr Bool virgin , String -> SDoc text String "ae_sigs =" SDoc -> SDoc -> SDoc <+> SigEnv -> SDoc forall a. Outputable a => a -> SDoc ppr SigEnv env ]) -- | An environment storing 'CprSig's for local Ids. -- Puts binders with 'topCprSig' in a space-saving 'IntSet'. -- See Note [Efficient Top sigs in SigEnv]. data SigEnv = SE { SigEnv -> UnVarSet se_tops :: !UnVarSet -- ^ All these Ids have 'topCprSig'. Like a 'VarSet', but more efficient. , SigEnv -> VarEnv CprSig se_sigs :: !(VarEnv CprSig) -- ^ Ids that have something other than 'topCprSig'. } instance Outputable SigEnv where ppr :: SigEnv -> SDoc ppr (SE { se_tops :: SigEnv -> UnVarSet se_tops = UnVarSet tops, se_sigs :: SigEnv -> VarEnv CprSig se_sigs = VarEnv CprSig sigs }) = String -> SDoc text String "SE" SDoc -> SDoc -> SDoc <+> SDoc -> SDoc braces ([SDoc] -> SDoc vcat [ String -> SDoc text String "se_tops =" SDoc -> SDoc -> SDoc <+> UnVarSet -> SDoc forall a. Outputable a => a -> SDoc ppr UnVarSet tops , String -> SDoc text String "se_sigs =" SDoc -> SDoc -> SDoc <+> VarEnv CprSig -> SDoc forall a. Outputable a => a -> SDoc ppr VarEnv CprSig sigs ]) emptyAnalEnv :: FamInstEnvs -> AnalEnv emptyAnalEnv :: FamInstEnvs -> AnalEnv emptyAnalEnv FamInstEnvs fam_envs = AE :: SigEnv -> Bool -> FamInstEnvs -> (DataCon -> IsRecDataConResult) -> AnalEnv AE { ae_sigs :: SigEnv ae_sigs = UnVarSet -> VarEnv CprSig -> SigEnv SE UnVarSet emptyUnVarSet VarEnv CprSig forall a. VarEnv a emptyVarEnv , ae_virgin :: Bool ae_virgin = Bool True , ae_fam_envs :: FamInstEnvs ae_fam_envs = FamInstEnvs fam_envs , ae_rec_dc :: DataCon -> IsRecDataConResult ae_rec_dc = (DataCon -> IsRecDataConResult) -> DataCon -> IsRecDataConResult forall k a. Uniquable k => (k -> a) -> k -> a memoiseUniqueFun (FamInstEnvs -> IntWithInf -> DataCon -> IsRecDataConResult isRecDataCon FamInstEnvs fam_envs IntWithInf fuel) } where fuel :: IntWithInf fuel = IntWithInf 3 -- If we can unbox more than 3 constructors to find a -- recursive occurrence, then we can just as well unbox it -- See Note [CPR for recursive data constructors], point (4) modifySigEnv :: (SigEnv -> SigEnv) -> AnalEnv -> AnalEnv modifySigEnv :: (SigEnv -> SigEnv) -> AnalEnv -> AnalEnv modifySigEnv SigEnv -> SigEnv f AnalEnv env = AnalEnv env { ae_sigs :: SigEnv ae_sigs = SigEnv -> SigEnv f (AnalEnv -> SigEnv ae_sigs AnalEnv env) } lookupSigEnv :: AnalEnv -> Id -> Maybe CprSig -- See Note [Efficient Top sigs in SigEnv] lookupSigEnv :: AnalEnv -> CoreBndr -> Maybe CprSig lookupSigEnv AE{ae_sigs :: AnalEnv -> SigEnv ae_sigs = SE UnVarSet tops VarEnv CprSig sigs} CoreBndr id | CoreBndr id CoreBndr -> UnVarSet -> Bool `elemUnVarSet` UnVarSet tops = CprSig -> Maybe CprSig forall a. a -> Maybe a Just CprSig topCprSig | Bool otherwise = VarEnv CprSig -> CoreBndr -> Maybe CprSig forall a. VarEnv a -> CoreBndr -> Maybe a lookupVarEnv VarEnv CprSig sigs CoreBndr id extendSigEnv :: AnalEnv -> Id -> CprSig -> AnalEnv -- See Note [Efficient Top sigs in SigEnv] extendSigEnv :: AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr id CprSig sig | CprSig -> Bool isTopCprSig CprSig sig = (SigEnv -> SigEnv) -> AnalEnv -> AnalEnv modifySigEnv (\SigEnv se -> SigEnv se{se_tops :: UnVarSet se_tops = CoreBndr -> UnVarSet -> UnVarSet extendUnVarSet CoreBndr id (SigEnv -> UnVarSet se_tops SigEnv se)}) AnalEnv env | Bool otherwise = (SigEnv -> SigEnv) -> AnalEnv -> AnalEnv modifySigEnv (\SigEnv se -> SigEnv se{se_sigs :: VarEnv CprSig se_sigs = VarEnv CprSig -> CoreBndr -> CprSig -> VarEnv CprSig forall a. VarEnv a -> CoreBndr -> a -> VarEnv a extendVarEnv (SigEnv -> VarEnv CprSig se_sigs SigEnv se) CoreBndr id CprSig sig}) AnalEnv env -- | Extend an environment with the (Id, CPR sig) pairs extendSigEnvList :: AnalEnv -> [(Id, CprSig)] -> AnalEnv extendSigEnvList :: AnalEnv -> [(CoreBndr, CprSig)] -> AnalEnv extendSigEnvList AnalEnv env [(CoreBndr, CprSig)] ids_cprs = (AnalEnv -> (CoreBndr, CprSig) -> AnalEnv) -> AnalEnv -> [(CoreBndr, CprSig)] -> AnalEnv forall (t :: * -> *) b a. Foldable t => (b -> a -> b) -> b -> t a -> b foldl' (\AnalEnv env (CoreBndr id, CprSig sig) -> AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr id CprSig sig) AnalEnv env [(CoreBndr, CprSig)] ids_cprs -- | Extend an environment with the CPR sigs attached to the ids extendSigEnvFromIds :: AnalEnv -> [Id] -> AnalEnv extendSigEnvFromIds :: AnalEnv -> [CoreBndr] -> AnalEnv extendSigEnvFromIds AnalEnv env [CoreBndr] ids = (AnalEnv -> CoreBndr -> AnalEnv) -> AnalEnv -> [CoreBndr] -> AnalEnv forall (t :: * -> *) b a. Foldable t => (b -> a -> b) -> b -> t a -> b foldl' (\AnalEnv env CoreBndr id -> AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr id (CoreBndr -> CprSig idCprSig CoreBndr id)) AnalEnv env [CoreBndr] ids -- | Extend an environment with the same CPR sig for all ids extendSigEnvAllSame :: AnalEnv -> [Id] -> CprSig -> AnalEnv extendSigEnvAllSame :: AnalEnv -> [CoreBndr] -> CprSig -> AnalEnv extendSigEnvAllSame AnalEnv env [CoreBndr] ids CprSig sig = (AnalEnv -> CoreBndr -> AnalEnv) -> AnalEnv -> [CoreBndr] -> AnalEnv forall (t :: * -> *) b a. Foldable t => (b -> a -> b) -> b -> t a -> b foldl' (\AnalEnv env CoreBndr id -> AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr id CprSig sig) AnalEnv env [CoreBndr] ids nonVirgin :: AnalEnv -> AnalEnv nonVirgin :: AnalEnv -> AnalEnv nonVirgin AnalEnv env = AnalEnv env { ae_virgin :: Bool ae_virgin = Bool False } -- | A version of 'extendSigEnv' for a binder of which we don't see the RHS -- needed to compute a 'CprSig' (e.g. lambdas and DataAlt field binders). -- In this case, we can still look at their demand to attach CPR signatures -- anticipating the unboxing done by worker/wrapper. -- See Note [CPR for binders that will be unboxed]. extendSigEnvForArg :: AnalEnv -> Id -> AnalEnv extendSigEnvForArg :: AnalEnv -> CoreBndr -> AnalEnv extendSigEnvForArg AnalEnv env CoreBndr id = AnalEnv -> CoreBndr -> CprSig -> AnalEnv extendSigEnv AnalEnv env CoreBndr id (CprType -> CprSig CprSig (Demand -> CprType argCprType (CoreBndr -> Demand idDemandInfo CoreBndr id))) -- | Produces a 'CprType' according to how a strict argument will be unboxed. -- Examples: -- -- * A head-strict demand @1!L@ would translate to @1@ -- * A product demand @1!P(1!L,L)@ would translate to @1(1,)@ -- * A product demand @1!P(1L,L)@ would translate to @1(,)@, -- because the first field will not be unboxed. argCprType :: Demand -> CprType argCprType :: Demand -> CprType argCprType Demand dmd = Arity -> Cpr -> CprType CprType Arity 0 (Demand -> Cpr go Demand dmd) where go :: Demand -> Cpr go (Card n :* SubDemand sd) | Card -> Bool isAbs Card n = Cpr topCpr | Prod Boxity Unboxed [Demand] ds <- SubDemand sd = Arity -> [Cpr] -> Cpr ConCpr Arity fIRST_TAG ((Demand -> Cpr) -> [Demand] -> [Cpr] forall a b. (a -> b) -> [a] -> [b] strictMap Demand -> Cpr go [Demand] ds) | Poly Boxity Unboxed Card _ <- SubDemand sd = Arity -> [Cpr] -> Cpr ConCpr Arity fIRST_TAG [] | Bool otherwise = Cpr topCpr {- Note [Safe abortion in the fixed-point iteration] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fixed-point iteration may fail to terminate. But we cannot simply give up and return the environment and code unchanged! We still need to do one additional round, to ensure that all expressions have been traversed at least once, and any unsound CPR annotations have been updated. Note [Efficient Top sigs in SigEnv] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ It's pretty common for binders in the SigEnv to have a 'topCprSig'. Wide records with 100 fields like in T9675 even will generate code where the majority of binders has Top signature. To save some allocations, we store those binders with a Top signature in a separate UnVarSet (which is an IntSet with a convenient Var-tailored API). Why store top signatures at all in the SigEnv? After all, when 'cprTransform' encounters a locally-bound Id without an entry in the SigEnv, it should behave as if that binder has a Top signature! Well, the problem is when case binders should have a Top signatures. They always have an unfolding and thus look to 'cprTransform' as if they bind a data structure, Note [CPR for data structures], and thus would always have the CPR property. So we need some mechanism to separate data structures from case binders with a Top signature, and the UnVarSet provides that in the least convoluted way I can think of. Note [CPR for binders that will be unboxed] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ If a lambda-bound variable will be unboxed by worker/wrapper (so it must be demanded strictly), then give it a CPR signature. Here's a concrete example ('f1' in test T10482a), assuming h is strict: f1 :: Int -> Int f1 x = case h x of A -> x B -> f1 (x-1) C -> x+1 If we notice that 'x' is used strictly, we can give it the CPR property; and hence f1 gets the CPR property too. It's sound (doesn't change strictness) to give it the CPR property because by the time 'x' is returned (case A above), it'll have been evaluated (by the wrapper of 'h' in the example). Moreover, if f itself is strict in x, then we'll pass x unboxed to f1, and so the boxed version *won't* be available; in that case it's very helpful to give 'x' the CPR property. This is all done in 'extendSigEnvForArg'. Note that * Whether or not something unboxes is decided by 'wantToUnboxArg', else we may get over-optimistic CPR results (e.g., from \(x :: a) -> x!). * If the demand unboxes deeply, we can give the binder a /nested/ CPR property, e.g. g :: (Int, Int) -> Int g p = case p of (x, y) | x < 0 -> 0 | otherwise -> x `x` should have the CPR property because it will be unboxed. We do so by giving `p` the Nested CPR property `1(1,)`, indicating that we not only have `p` available unboxed, but also its field `x`. Analysis of the Case will then transfer the CPR property to `x`. Before we were able to express Nested CPR, we used to guess which field binders should get the CPR property. See Historic Note [Optimistic field binder CPR]. * See Note [CPR examples] Note [CPR for thunks] ~~~~~~~~~~~~~~~~~~~~~ If the rhs is a thunk, we usually forget the CPR info, because it is presumably shared (else it would have been inlined, and so we'd lose sharing if w/w'd it into a function). E.g. let r = case expensive of (a,b) -> (b,a) in ... If we marked r as having the CPR property, then we'd w/w into let $wr = \() -> case expensive of (a,b) -> (# b, a #) r = case $wr () of (# b,a #) -> (b,a) in ... But now r is a thunk, which won't be inlined, so we are no further ahead. But consider f x = let r = case expensive of (a,b) -> (b,a) in if foo r then r else (x,x) Does f have the CPR property? Well, no. However, if the strictness analyser has figured out (in a previous iteration) that it's strict, then we DON'T need to forget the CPR info. Instead we can retain the CPR info and do the thunk-splitting transform (see WorkWrap.splitThunk). This made a big difference to PrelBase.modInt, which had something like modInt = \ x -> let r = ... -> I# v in ...body strict in r... r's RHS isn't a value yet; but modInt returns r in various branches, so if r doesn't have the CPR property then neither does modInt Another case I found in practice (in Complex.magnitude), looks like this: let k = if ... then I# a else I# b in ... body strict in k .... (For this example, it doesn't matter whether k is returned as part of the overall result; but it does matter that k's RHS has the CPR property.) Left to itself, the simplifier will make a join point thus: let $j k = ...body strict in k... if ... then $j (I# a) else $j (I# b) With thunk-splitting, we get instead let $j x = let k = I#x in ...body strict in k... in if ... then $j a else $j b This is much better; there's a good chance the I# won't get allocated. But what about botCpr? Consider lvl = error "boom" fac -1 = lvl fac 0 = 1 fac n = n * fac (n-1) fac won't have the CPR property here when we trim every thunk! But the assumption is that error cases are rarely entered and we are diverging anyway, so WW doesn't hurt. Should we also trim CPR on DataCon application bindings? See Note [CPR for data structures]! Note [CPR for data structures] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Long static data structures (whether top-level or not) like xs = x1 : xs1 xs1 = x2 : xs2 xs2 = x3 : xs3 should not get (nested) CPR signatures (#18154), because they * Never get WW'd, so their CPR signature should be irrelevant after analysis (in fact the signature might even be harmful for that reason) * Would need to be inlined/expanded to see their constructed product * BUT MOST IMPORTANTLY, Problem P1: Recording CPR on them blows up interface file sizes and is redundant with their unfolding. In case of Nested CPR, this blow-up can be quadratic! Reason: the CPR info for xs1 contains the CPR info for xs; the CPR info for xs2 contains that for xs1. And so on. By contrast, the size of unfoldings and types stays linear. That's why quadratic blowup is problematic; it makes an asymptotic difference. Hence (Solution S1) we don't give data structure bindings a CPR *signature* and hence don't to analyse them in 'cprAnalBind'. What do we mean by "data structure binding"? Answer: (1) idArity id == 0 (otherwise it's a function) (2) is eval'd (otherwise it's a thunk, Note [CPR for thunks] applies) (3) not (isJoinId id) (otherwise it's a function and its more efficient to analyse it just once rather than at each call site) But (S1) leads to a new Problem P2: We can't just stop giving DataCon application bindings the CPR *property*, for example the factorial function after FloatOut lvl = I# 1# fac 0 = lvl fac n = n * fac (n-1) lvl is a data structure, and hence (see above) will not have a CPR *signature*. But if lvl doesn't have the CPR *property*, fac won't either and we allocate a box for the result on every iteration of the loop. So (Solution S2) when 'cprAnal' meets a variable lacking a CPR signature to extrapolate into a CPR transformer, 'cprTransform' tries to get its unfolding (via 'cprDataStructureUnfolding_maybe'), and analyses that instead. The Result R1: Everything behaves as if there was a CPR signature, but without the blowup in interface files. There is one exception to (R1): x = (y, z); {-# NOINLINE x #-} f p = (y, z); {-# NOINLINE f #-} While we still give the NOINLINE *function* 'f' the CPR property (and WW accordingly, see Note [Worker/wrapper for NOINLINE functions]), we won't give the NOINLINE *data structure* 'x' the CPR property, because it lacks an unfolding. In particular, KindRep bindings are NOINLINE data structures (see the noinline wrinkle in Note [Grand plan for Typeable]). We'll behave as if the bindings had 'topCprSig', and that is fine, as a case on the binding would never cancel away after WW! It's also worth pointing out how ad-hoc (S1) is: If we instead had f1 x = x:[] f2 x = x : f1 x f3 x = x : f2 x ... we still give every function an ever deepening CPR signature. But it's very uncommon to find code like this, whereas the long static data structures from the beginning of this Note are very common because of GHC's strategy of ANF'ing data structure RHSs. Note [CPR for data structures can destroy sharing] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In Note [CPR for data structures], we argued that giving data structure bindings the CPR property is useful to give functions like fac the CPR property: lvl = I# 1# fac 0 = lvl fac n = n * fac (n-1) Worker/wrappering fac for its CPR property means we get a very fast worker function with type Int# -> Int#, without any heap allocation at all. But consider what happens if we call `map fac (replicate n 0)`, where the wrapper doesn't cancel away: Then we rebox the result of $wfac *on each call*, n times, instead of reusing the static thunk for 1, e.g. an asymptotic increase in allocations. If you twist it just right, you can actually write programs that that take O(n) space if you do CPR and O(1) if you don't: fac :: Int -> Int fac 0 = 1 -- this clause will trigger CPR and destroy sharing for O(n) space -- fac 0 = lazy 1 -- this clause will prevent CPR and run in O(1) space fac n = n * fac (n-1) const0 :: Int -> Int const0 n = signum n - 1 -- will return 0 for [1..n] {-# NOINLINE const0 #-} main = print $ foldl' (\acc n -> acc + lazy n) 0 $ map (fac . const0) [1..100000000] Generally, this kind of asymptotic increase in allocation can happen whenever we give a data structure the CPR property that is bound outside of a recursive function. So far we don't have a convincing remedy; giving fac the CPR property is just too attractive. #19309 documents a futile idea. #13331 tracks the general issue of WW destroying sharing and also contains above reproducer. #19326 is about CPR destroying sharing in particular. With Nested CPR, sharing can also be lost within the same "lambda level", for example: f (I# x) = let y = I# (x*#x) in (y, y) Nestedly unboxing would destroy the box shared through 'y'. (Perhaps we can call this "internal sharing", in contrast to "external sharing" beyond lambda or even loop levels above.) But duplicate occurrences like that are pretty rare and may never lead to an asymptotic difference in allocations of 'f'. Note [CPR for recursive data constructors] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Note [CPR for data structures can destroy sharing] gives good reasons not to give shared data structure bindings the CPR property. But we shouldn't even give *functions* that return *recursive* data constructor applications the CPR property. Here's an example for why: c = C# 'a' replicateC :: Int -> [Int] replicateC 1 = [c] replicateC n = c : replicateC (n-1) What happens if we give `replicateC` the (nested) CPR property? We get a WW split for 'replicateC', the wrapper of which is certain to inline, like this: replicateC (I# n) = case $wreplicateC n of (# x, xs #) -> C# x : xs $wreplicateC 1# = (# 'a', [] #) $wreplicateC n = (# 'a', replicateC (I# (n -# 1#)) #) Eliminating the shared 'c' binding in the process. And then * We *might* save allocation of the topmost (of most likely several) (:) constructor if it cancels away at the call site. Similarly for the 'C#' constructor. * But we will now re-allocate the C# box on every iteration of the loop, because we separated the character literal from the C# application. That means n times as many C# allocations as before. Yikes!! * We make all other call sites where the wrapper inlines a bit larger, most of them for no gain. But this shouldn't matter much. * The inlined wrapper may inhibit eta-expansion in some cases. Here's how: If the wrapper is inlined in a strict arg position, the Simplifier will transform as follows f (replicateC n) ==> { inline } f (case $wreplicateC n of (# x, xs #) -> (C# x, xs)) ==> { strict arg } case $wreplicateC n of (# x, xs #) -> f (C# x, xs) Now we can't float out the case anymore. In fact, we can't even float out `$wreplicateC n`, because it returns an unboxed tuple. This can inhibit eta-expansion if we later find out that `f` has arity > 1 (such as when we define `foldl` in terms of `foldr`). #19970 shows how abstaining from worker/wrappering made a difference of -20% in reptile. So while WW'ing for CPR didn't make the program slower directly, the resulting program got much harder to optimise because of the returned unboxed tuple (which can't easily float because unlifted). `replicateC` comes up in T5536, which regresses significantly if CPR'd nestedly. What can we do about it? A. Don't CPR functions that return a *recursive data type* (the list in this case). This is the solution we adopt. Rationale: the benefit of CPR on recursive data structures is slight, because it only affects the outer layer of a potentially massive data structure. B. Don't CPR any *recursive function*. That would be quite conservative, as it would also affect e.g. the factorial function. C. Flat CPR only for recursive functions. This prevents the asymptotic worsening part arising through unsharing the C# box, but it's still quite conservative. D. No CPR at occurrences of shared data structure in hot paths (e.g. the use of `c` in the second eqn of `replicateC`). But we'd need to know which paths were hot. We want such static branch frequency estimates in #20378. We adopt solution (A) It is ad-hoc, but appears to work reasonably well. Deciding what a "recursive data constructor" is is quite tricky and ad-hoc, too: See Note [Detecting recursive data constructors]. We don't have to be perfect and can simply keep on unboxing if unsure. Note [Detecting recursive data constructors] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ What qualifies as a "recursive data constructor" as per Note [CPR for recursive data constructors]? That is up to 'GHC.Core.Opt.WorkWrapW.Utils.isRecDataCon' to decide. It does a DFS search over the field types of the DataCon and looks for term-level recursion into the data constructor's type constructor. Assuming infinite fuel (point (4) below), it looks inside the following class of types, represented by `ty` (and responds `NonRecursiveOrUnsure` in all other cases): A. If `ty = forall v. ty'`, then look into `ty'` B. If `ty = Tc tc_args` and `Tc` is an `AlgTyCon`, look into the arg types of its data constructors and check `tc_args` for recursion. C. If `ty = F tc_args`, `F` is a `FamTyCon` and we can reduce `F tc_args` to `rhs`, look into the `rhs` type. A few perhaps surprising points: 1. It deems any function type as non-recursive, because it's unlikely that a recursion through a function type builds up a recursive data structure. 2. It doesn't look into kinds or coercion types because there's nothing to unbox. Same for promoted data constructors. 3. We don't care whether an AlgTyCon app `T tc_args` is fully saturated or not; we simply look at its definition/DataCons and its field tys and look for recursive occs in the `tc_args` we are given. This is so that we expand the `ST` in `StateT Int (ST s) a`. 4. We don't recurse deeper than 3 (at the moment of this writing) TyCons and assume the DataCon is non-recursive after that. One reason for this "fuel" approach is guaranteed constant-time efficiency; the other is that it's fair to say that a recursion over 3 or more TyCons doesn't really count as a list-like data structure anymore and a bit of unboxing doesn't hurt much. 5. It checks AlgTyCon apps like `T tc_args` by eagerly checking the `tc_args` *before* it looks into the expanded DataCons/NewTyCon, so that it terminates before doing a deep nest of expansions only to discover that the first level already contained a recursion. 6. As a result of keeping the implementation simple, it says "recursive" for `data T = MkT [T]`, even though we could argue that the inner recursion (through the `[]` TyCon) by way of which `T` is recursive will already be "broken" and thus never unboxed. Consequently, it might be OK to CPR a function returning `T`. Lacking arguments for or against the current simple behavior, we stick to it. 7. When the search hits an abstract TyCon (algebraic, but without visible DataCons, e.g., from an .hs-boot file), it returns 'NonRecursiveOrUnsure', the same as when we run out of fuel. If there is ever a recursion through an abstract TyCon, then it's not part of the same function we are looking at in CPR, so we can treat it as if it wasn't recursive. We handle stuck type and data families much the same. Here are a few examples of data constructors or data types with a single data con and the answers of our function: data T = T (Int, (Bool, Char)) NonRec (:) Rec [] NonRec data U = U [Int] NonRec data U2 = U2 [U2] Rec (see point (6)) data T1 = T1 T2; data T2 = T2 T1 Rec newtype Fix f = Fix (f (Fix f)) Rec data N = N (Fix (Either Int)) NonRec data M = M (Fix (Either M)) Rec data F = F (F -> Int) NonRec (see point (1)) data G = G (Int -> G) NonRec (see point (1)) newtype MyM s a = MyM (StateT Int (ST s) a NonRec type S = (Int, Bool) NonRec { type family E a where E Int = Char E (a,b) = (E a, E b) E Char = Blub data Blah = Blah (E (Int, (Int, Int))) NonRec data Blub = Blub (E (Char, Int)) Rec data Blub2 = Blub2 (E (Bool, Int)) } Unsure, because stuck (see point (7)) { data T1 = T1 T2; data T2 = T2 T3; ... data T5 = T5 T1 } Unsure (out of fuel) (see point (4)) { module A where -- A.hs-boot data T module B where import {-# SOURCE #-} A data U = MkU T f :: T -> U f t = MkU t Unsure (T is abstract) (see point (7)) module A where -- A.hs import B data T = MkT U } These examples are tested by the testcase RecDataConCPR. I've played with the idea to make points (1) through (3) of 'isRecDataCon' configurable like (4) to enable more re-use throughout the compiler, but haven't found a killer app for that yet, so ultimately didn't do that. Note [CPR examples] ~~~~~~~~~~~~~~~~~~~ Here are some examples (stranal/should_compile/T10482a) of the usefulness of Note [Optimistic field binder CPR]. The main point: all of these functions can have the CPR property. ------- f1 ----------- -- x is used strictly by h, so it'll be available -- unboxed before it is returned in the True branch f1 :: Int -> Int f1 x = case h x x of True -> x False -> f1 (x-1) ------- f3 ----------- -- h is strict in x, so x will be unboxed before it -- is rerturned in the otherwise case. data T3 = MkT3 Int Int f1 :: T3 -> Int f1 (MkT3 x y) | h x y = f3 (MkT3 x (y-1)) | otherwise = x Historic Note [Optimistic field binder CPR] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This Note describes how we used to guess whether fields have the CPR property before we were able to express Nested CPR for arguments. Consider data T a = MkT a f :: T Int -> Int f x = ... (case x of MkT y -> y) ... And assume we know from strictness analysis that `f` is strict in `x` and its field `y` and we unbox both. Then we give `x` the CPR property according to Note [CPR for binders that will be unboxed]. But `x`'s sole field `y` likewise will be unboxed and it should also get the CPR property. We'd need a *nested* CPR property here for `x` to express that and unwrap one level when we analyse the Case to give the CPR property to `y`. Lacking Nested CPR (hence this Note is historic now that we have Nested CPR), we have to guess a bit, by looking for (A) Flat CPR on the scrutinee (B) A variable scrutinee. Otherwise surely it can't be a parameter. (C) Strict demand on the field binder `y` (or it binds a strict field) While (A) is a necessary condition to give a field the CPR property, there are ways in which (B) and (C) are too lax, leading to unsound analysis results and thus reboxing in the wrapper: (b) We could scrutinise some other variable than a parameter, like in g :: T Int -> Int g x = let z = foo x in -- assume `z` has CPR property case z of MkT y -> y Lacking Nested CPR and multiple levels of unboxing, only the outer box of `z` will be available and a case on `y` won't actually cancel away. But it's simple, and nothing terrible happens if we get it wrong. e.g. #10694. (c) A strictly used field binder doesn't mean the function is strict in it. h :: T Int -> Int -> Int h !x 0 = 0 h x 0 = case x of MkT y -> y Here, `y` is used strictly, but the field of `x` certainly is not and consequently will not be available unboxed. Why not look at the demand of `x` instead to determine whether `y` is unboxed? Because the 'idDemandInfo' on `x` will not have been propagated to its occurrence in the scrutinee when CprAnal runs directly after DmdAnal. We used to give the case binder the CPR property unconditionally instead of deriving it from the case scrutinee. See Historic Note [Optimistic case binder CPR]. Historic Note [Optimistic case binder CPR] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We used to give the case binder the CPR property unconditionally, which is too optimistic (#19232). Here are the details: Inside the alternative, the case binder always has the CPR property, meaning that a case on it will successfully cancel. Example: f True x = case x of y { I# x' -> if x' ==# 3 then y else I# 8 } f False x = I# 3 By giving 'y' the CPR property, we ensure that 'f' does too, so we get f b x = case fw b x of { r -> I# r } fw True x = case x of y { I# x' -> if x' ==# 3 then x' else 8 } fw False x = 3 Of course there is the usual risk of re-boxing: we have 'x' available boxed and unboxed, but we return the unboxed version for the wrapper to box. If the wrapper doesn't cancel with its caller, we'll end up re-boxing something that we did have available in boxed form. -}