Copyright | (c) 2009 Bryan O'Sullivan |
---|---|
License | BSD3 |
Maintainer | bos@serpentine.com |
Stability | experimental |
Portability | portable |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |
Deprecated: Use Statistics.Sample.KernelDensity instead.
Kernel density estimation code, providing non-parametric ways to estimate the probability density function of a sample.
The techniques used by functions in this module are relatively fast, but they generally give inferior results to the KDE function in the main KernelDensity module (due to the oversmoothing documented for bandwidth below).
Synopsis
- epanechnikovPDF :: Vector v Double => Int -> v Double -> (Points, Vector Double)
- gaussianPDF :: Vector v Double => Int -> v Double -> (Points, Vector Double)
- newtype Points = Points {}
- choosePoints :: Vector v Double => Int -> Double -> v Double -> Points
- type Bandwidth = Double
- bandwidth :: Vector v Double => (Double -> Bandwidth) -> v Double -> Bandwidth
- epanechnikovBW :: Double -> Bandwidth
- gaussianBW :: Double -> Bandwidth
- type Kernel = Double -> Double -> Double -> Double -> Double
- epanechnikovKernel :: Kernel
- gaussianKernel :: Kernel
- estimatePDF :: Vector v Double => Kernel -> Bandwidth -> v Double -> Points -> Vector Double
- simplePDF :: Vector v Double => (Double -> Double) -> Kernel -> Double -> Int -> v Double -> (Points, Vector Double)
Simple entry points
epanechnikovPDF
:: Vector v Double | |
=> Int | Number of points at which to estimate |
-> v Double | Data sample |
-> (Points, Vector Double) |
Simple Epanechnikov kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
gaussianPDF
:: Vector v Double | |
=> Int | Number of points at which to estimate |
-> v Double | Data sample |
-> (Points, Vector Double) |
Simple Gaussian kernel density estimator. Returns the uniformly spaced points from the sample range at which the density function was estimated, and the estimates at those points.
Building blocks
Choosing points from a sample
newtype Points
Points from the range of a Sample.
Instances
- FromJSON Points
- ToJSON Points. Defined in Statistics.Sample.KernelDensity.Simple
- Data Points. Defined in Statistics.Sample.KernelDensity.Simple
  - gfoldl :: (forall d b. Data d => c (d -> b) -> d -> c b) -> (forall g. g -> c g) -> Points -> c Points
  - gunfold :: (forall b r. Data b => c (b -> r) -> c r) -> (forall r. r -> c r) -> Constr -> c Points
  - toConstr :: Points -> Constr
  - dataTypeOf :: Points -> DataType
  - dataCast1 :: Typeable t => (forall d. Data d => c (t d)) -> Maybe (c Points)
  - dataCast2 :: Typeable t => (forall d e. (Data d, Data e) => c (t d e)) -> Maybe (c Points)
  - gmapT :: (forall b. Data b => b -> b) -> Points -> Points
  - gmapQl :: (r -> r' -> r) -> r -> (forall d. Data d => d -> r') -> Points -> r
  - gmapQr :: forall r r'. (r' -> r -> r) -> r -> (forall d. Data d => d -> r') -> Points -> r
  - gmapQ :: (forall d. Data d => d -> u) -> Points -> [u]
  - gmapQi :: Int -> (forall d. Data d => d -> u) -> Points -> u
  - gmapM :: Monad m => (forall d. Data d => d -> m d) -> Points -> m Points
  - gmapMp :: MonadPlus m => (forall d. Data d => d -> m d) -> Points -> m Points
  - gmapMo :: MonadPlus m => (forall d. Data d => d -> m d) -> Points -> m Points
- Generic Points
- Read Points
- Show Points
- Binary Points
- Eq Points
- type Rep Points. Defined in Statistics.Sample.KernelDensity.Simple
choosePoints
:: Vector v Double | |
=> Int | Number of points to select, n |
-> Double | Sample bandwidth, h |
-> v Double | Input data |
-> Points |
Choose a uniform range of points at which to estimate a sample's probability density function.
If you are using a Gaussian kernel, multiply the sample's bandwidth by 3 before passing it to this function.
If this function is passed an empty vector, it returns values of positive and negative infinity.
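As a rough illustration of the description above, the following sketch builds a uniform grid with plain lists. It is not the library's code: `gridPoints` is a hypothetical name, and widening the sample range by `h` on each side is an assumption made here to show why the bandwidth is a parameter. It also assumes a non-empty sample.

```haskell
-- Hypothetical helper, not part of the library: n uniformly spaced
-- points covering the sample range widened by h on each side (assumed).
gridPoints :: Int -> Double -> [Double] -> [Double]
gridPoints n h xs
  | n < 1     = []
  | n == 1    = [lo]
  | otherwise = [lo + fromIntegral i * step | i <- [0 .. n - 1]]
  where
    lo   = minimum xs - h
    hi   = maximum xs + h
    step = (hi - lo) / fromIntegral (n - 1)
```

For example, `gridPoints 5 1 [0, 2]` evaluates to `[-1.0,0.0,1.0,2.0,3.0]`.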
Bandwidth estimation
bandwidth :: Vector v Double => (Double -> Bandwidth) -> v Double -> Bandwidth
Compute the optimal bandwidth from the observed data for the given kernel.
This function uses an estimate based on the standard deviation of a sample (due to Deheuvels), which performs reasonably well for unimodal distributions but leads to oversmoothing for more complex ones.
epanechnikovBW :: Double -> Bandwidth
Bandwidth estimator for an Epanechnikov kernel.
gaussianBW :: Double -> Bandwidth
Bandwidth estimator for a Gaussian kernel.
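The general shape of a standard-deviation-based bandwidth rule can be sketched as follows. This is an illustration under stated assumptions, not the library's code: `stdDevOf`, `normalReferenceBW`, and `ruleOfThumbBW` are hypothetical names, the standard deviation uses the n - 1 denominator, and the constant is the common normal-reference rule for a Gaussian kernel, h = σ (4 / 3n)^(1/5), which may differ from the constants used by `gaussianBW` and `epanechnikovBW`.

```haskell
-- Sample standard deviation (n - 1 denominator); needs at least two values.
stdDevOf :: [Double] -> Double
stdDevOf xs = sqrt (sum [(x - mean) ^ (2 :: Int) | x <- xs] / (n - 1))
  where
    n    = fromIntegral (length xs)
    mean = sum xs / n

-- Hypothetical kernel-specific factor as a function of the sample size n,
-- mirroring the Double -> Bandwidth shape of the estimators above.
normalReferenceBW :: Double -> Double
normalReferenceBW n = (4 / (3 * n)) ** 0.2

-- Bandwidth = spread of the data times the kernel-specific factor.
ruleOfThumbBW :: (Double -> Double) -> [Double] -> Double
ruleOfThumbBW factor xs = stdDevOf xs * factor (fromIntegral (length xs))
```

Because the factor shrinks only as n^(-1/5), the bandwidth narrows slowly as the sample grows, and a large standard deviation (for instance from two well-separated modes) widens it, which is one way to see the oversmoothing mentioned above.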
Kernels
type Kernel = Double -> Double -> Double -> Double -> Double
The convolution kernel. Its parameters are as follows:
- Scaling factor, 1/nh
- Bandwidth, h
- A point at which to sample the input, p
- One sample value, v
epanechnikovKernel :: Kernel
Epanechnikov kernel for probability density function estimation.
gaussianKernel :: Kernel
Gaussian kernel for probability density function estimation.
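To make the four-parameter shape concrete, here is a sketch of the textbook forms of both kernels written against that parameter order. The names `KernelFn`, `epanechnikovSketch`, and `gaussianSketch` are hypothetical, and the exact normalisation used by `epanechnikovKernel` and `gaussianKernel` may differ from these textbook definitions.

```haskell
-- Same parameter order as the Kernel type: scaling factor 1/nh,
-- bandwidth h, evaluation point p, one sample value v.
type KernelFn = Double -> Double -> Double -> Double -> Double

-- Textbook Epanechnikov kernel: 3/4 (1 - u^2) for |u| <= 1, zero outside.
epanechnikovSketch :: KernelFn
epanechnikovSketch f h p v
  | abs u <= 1 = f * 0.75 * (1 - u * u)
  | otherwise  = 0
  where
    u = (v - p) / h

-- Textbook Gaussian kernel: exp (-u^2 / 2) / sqrt (2 pi).
gaussianSketch :: KernelFn
gaussianSketch f h p v = f * exp (-0.5 * u * u) / sqrt (2 * pi)
  where
    u = (v - p) / h
```

The Epanechnikov form has compact support, so values further than `h` from `p` contribute nothing, while the Gaussian form gives every sample value a small non-zero weight.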
Low-level estimation
estimatePDF
:: Vector v Double | |
=> Kernel | Kernel function |
-> Bandwidth | Bandwidth, h |
-> v Double | Sample data |
-> Points | Points at which to estimate |
-> Vector Double |
Kernel density estimator, providing a non-parametric way of estimating the PDF of a random variable.
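The estimator amounts to summing, for each evaluation point, the kernel's contribution from every sample value. The sketch below shows that summation with plain lists; `KernelFn`, `estimateAt`, and `boxKernel` are hypothetical names, and the uniform (boxcar) kernel is used only because its results are easy to check by hand. It is not the library's implementation.

```haskell
type KernelFn = Double -> Double -> Double -> Double -> Double

-- Hypothetical list-based estimator: for each point p, sum the kernel's
-- contribution from every sample value, pre-scaled by 1/nh.
estimateAt :: KernelFn -> Double -> [Double] -> [Double] -> [Double]
estimateAt kernel h sample points =
  [sum [kernel f h p v | v <- sample] | p <- points]
  where
    f = 1 / (h * fromIntegral (length sample))

-- Uniform kernel: weight 1/2 within one bandwidth of p, zero outside.
boxKernel :: KernelFn
boxKernel f h p v
  | abs (v - p) <= h = f * 0.5
  | otherwise        = 0
```

With `boxKernel`, bandwidth 1, and the sample `[0, 0]`, the estimate at 0 is 0.5 and the estimate at 5 is 0.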
simplePDF
:: Vector v Double | |
=> (Double -> Double) | Bandwidth function |
-> Kernel | Kernel function |
-> Double | Bandwidth scaling factor (3 for a Gaussian kernel, 1 for all others) |
-> Int | Number of points at which to estimate |
-> v Double | Sample data |
-> (Points, Vector Double) |
A helper for creating a simple kernel density estimation function with automatically chosen bandwidth and estimation points.
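One way to picture such a helper is as a composition of three steps: pick a bandwidth, choose a grid, and estimate on that grid. The sketch below is a simplified illustration, not the library's code. `KernelFn` and `simpleSketch` are hypothetical names, the bandwidth rule is passed as a function of the whole sample rather than of its size, and widening the grid by the scaled bandwidth on each side is an assumption. It expects a non-empty sample and at least two points.

```haskell
type KernelFn = Double -> Double -> Double -> Double -> Double

-- Hypothetical composition of the three steps: bandwidth, grid, estimate.
simpleSketch
  :: ([Double] -> Double)   -- bandwidth rule applied to the sample
  -> KernelFn               -- kernel function
  -> Double                 -- bandwidth scaling factor for the grid
  -> Int                    -- number of points (at least 2)
  -> [Double]               -- non-empty sample
  -> ([Double], [Double])
simpleSketch bwRule kernel k n sample = (points, estimates)
  where
    h         = bwRule sample
    lo        = minimum sample - h * k   -- assumed grid padding
    hi        = maximum sample + h * k
    step      = (hi - lo) / fromIntegral (n - 1)
    points    = [lo + fromIntegral i * step | i <- [0 .. n - 1]]
    f         = 1 / (h * fromIntegral (length sample))
    estimates = [sum [kernel f h p v | v <- sample] | p <- points]
```

The scaling factor only affects how far the grid extends past the data; the estimates themselves always use the unscaled bandwidth.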
References
- Deheuvels, P. (1977) Estimation non paramétrique de la densité par histogrammes généralisés. <http://archive.numdam.org/article/RSA_1977__25_3_5_0.pdf>