Copyright: (c) 2017-18 Composewell Technologies
License: BSD3
Maintainer: harendra.kumar@gmail.com
Stability: experimental
Portability: GHC
Safe Haskell: None
Language: Haskell2010
BenchShow provides a DSL to quickly generate visual graphs or textual reports from a benchmark results file (CSV) produced by gauge or criterion. Reports and graphs can be formatted and presented in many useful ways. For example, we can prepare a graphical bar chart or a column-wise textual report comparing the performance of two packages, or comparing the performance regression in a package caused by a particular change. The absolute or percentage difference between sets of benchmarks can be presented and sorted based on the difference. This allows us to easily identify the worst affected benchmarks and fix them. The presentation is quite flexible and many more interesting things can be done with it.
Generating Graphs and Reports
The input is a CSV file generated by gauge --csv=results.csv or a similar output generated by criterion. The graph or the report function is invoked on the file with an appropriate Config to control various parameters of graph or report generation. In most cases defaultConfig should do the job and a specific config may not be required.
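As a minimal sketch, the two entry points can be driven from a small program like the following; the "results.csv" and "results-chart" names are illustrative, not from the source:

```haskell
import BenchShow

main :: IO ()
main = do
    -- Print a textual report to the console (second argument Nothing).
    -- Assumes a results.csv produced by `gauge --csv=results.csv`.
    report "results.csv" Nothing defaultConfig
    -- Write SVG chart file(s) using the "results-chart" name prefix.
    graph "results.csv" "results-chart" defaultConfig
```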
Fields, Groups and RunIds
In the documentation, field means a benchmarking field, e.g. time or maxrss, and group means a group of benchmarks. An input file may have benchmark results collected from multiple runs. By default each run is designated as a single benchmark group with the group name default. Benchmark groups from different runs are distinguished using a runId, which is the index of the run in the file, starting with 0. Benchmarks can be classified into multiple groups using classifyBenchmark; benchmarks from each run can be divided into multiple groups this way. In a multi-run input, a benchmark group can be fully specified using its group name (either default or as classified by classifyBenchmark) and its runId.
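A sketch of classification by name prefix, assuming hypothetical benchmark names such as "streamly/map" and "list/map" (these names are illustrative, not from the source):

```haskell
import Data.List (stripPrefix)
import BenchShow

-- Classify benchmarks into two groups by a name prefix. A benchmark
-- named "streamly/map" goes into the "streamly" group as "map", and
-- "list/map" into the "list" group. Returning Nothing drops the
-- benchmark from the report.
cfg :: Config
cfg = defaultConfig
    { classifyBenchmark = \b ->
        case stripPrefix "streamly/" b of
            Just name -> Just ("streamly", name)
            Nothing -> case stripPrefix "list/" b of
                Just name -> Just ("list", name)
                Nothing   -> Nothing
    }
```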
Presentation
We can present the results in a textual format using report or as a graphical chart using graph. Each report consists of a number of benchmarks as rows; the columns can be either benchmarking fields or groups of benchmarks depending on the Presentation setting. In a graphical chart we present multiple clusters, each cluster representing one column from the textual report; the rows (i.e. the benchmarks) are represented as bars in the cluster.
When the columns are groups, each report presents the results of a single benchmarking field for different benchmark groups. Using GroupStyle, we can further specify how we want to present the results of the groups: we can either present absolute values of the field for each group, or make the first group a baseline showing absolute values and present the difference from the baseline for the subsequent groups.
When the columns are fields, each report consists of results for a single benchmark group. Fields cannot be compared like groups because they are of different types and have different measurement units.
The units in the report are automatically determined based on the minimum value in the range of values present. The ranges for fields can be overridden using fieldRanges.
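As a sketch, a configuration that compares groups as a percentage difference from the baseline could look like this:

```haskell
import BenchShow

-- Present each selected field with groups as columns or clusters,
-- showing the baseline group as absolute values and the subsequent
-- groups as a percentage difference from it.
cfg :: Config
cfg = defaultConfig
    { presentation = Groups PercentDiff
    }
```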
Mean and Max
In a raw benchmark file (--csvraw=results.csv with gauge) we may have data for multiple iterations of each benchmark. BenchShow combines the results of all iterations depending on the field type. For example, if the field is time it takes the mean of all iterations, and if the field is maxrss it takes the maximum of all iterations.
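To report specific fields such as time and maxrss, the selection can be sketched like this (field names are matched case-insensitively here):

```haskell
import Data.Char (toLower)
import BenchShow

-- Restrict the report to the time and maxrss fields only.
cfg :: Config
cfg = defaultConfig
    { selectFields = filter ((`elem` ["time", "maxrss"]) . map toLower)
    }
```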
Tutorial and Examples
See the tutorial module BenchShow.Tutorial for sample charts and a comprehensive guide to generating reports and graphs. See the test directory for many usage examples; run the tests to see the charts they generate.
Synopsis
- data GroupStyle
- = Absolute
- | Diff
- | PercentDiff
- | Multiples
- data Presentation
- = Solo
- | Groups GroupStyle
- | Fields
- data TitleAnnotation
- data Estimator
- = Median
- | Mean
- | Regression
- data DiffStrategy
- data SortColumn
- = ColumnIndex Int
- | ColumnName (Either String (String, Int))
- data FieldTick
- data Config = Config {
- verbose :: Bool
- outputDir :: Maybe FilePath
- mkTitle :: Maybe (String -> String)
- title :: Maybe String
- titleAnnotations :: [TitleAnnotation]
- presentation :: Presentation
- estimator :: Estimator
- threshold :: Word
- diffStrategy :: DiffStrategy
- omitBaseline :: Bool
- selectFields :: [String] -> [String]
- fieldRanges :: [(String, Double, Double)]
- fieldTicks :: [(String, FieldTick)]
- classifyBenchmark :: String -> Maybe (String, String)
- selectGroups :: [(String, Int)] -> [(String, Int)]
- selectBenchmarks :: (SortColumn -> Maybe GroupStyle -> Either String [(String, Double)]) -> [String]
- }
- defaultConfig :: Config
- report :: FilePath -> Maybe FilePath -> Config -> IO ()
- graph :: FilePath -> FilePath -> Config -> IO ()
Documentation
data GroupStyle Source #
How to show the results for multiple benchmark groups presented in columns or bar chart clusters. In relative comparisons, the first group is considered the baseline and the subsequent groups are compared against it.
Definition changed in 0.3.0. Since: 0.2.0
Absolute | Show absolute values of the field for all groups |
Diff | Show the baseline group as absolute values and the subsequent groups as the difference from the baseline |
PercentDiff | If the value of the group being compared is higher than the baseline then display the difference as a percentage of the baseline, otherwise display the difference as a percentage of the group being compared. |
Multiples | If the value of the group being compared is higher than the baseline then display the value as a multiple of the baseline, otherwise display the negated multiple of the baseline with respect to the value. |
Instances
Eq GroupStyle, Read GroupStyle, Show GroupStyle (defined in BenchShow.Common)
data Presentation Source #
How to present the reports or graphs. Each report presents a number of benchmarks as rows; it may have (1) a single column presenting the values of a single field, (2) multiple columns presenting values of different fields, or (3) multiple columns presenting values of the same field for different groups.
Since: 0.2.0
Solo | Reports are generated for each group and for each field selected by the configuration. Each report presents benchmarks in a single group, with a single column presenting a single field. If there are m groups and n fields selected, m x n reports are generated. |
Groups GroupStyle | One report is generated for each field selected by the configuration. Each report presents a field with all the groups selected by the configuration as columns or clusters. Output files are named using the field name as a suffix. |
Fields | One report is generated for each group selected by the configuration. Each report presents a group with all the fields selected by the configuration as columns or clusters. Output files are named using the group name as a suffix. |
Instances
Eq Presentation, Read Presentation, Show Presentation (defined in BenchShow.Common)
data TitleAnnotation Source #
Deprecated: Please use mkTitle to make a custom title
Additional annotations that can be optionally added to the title of the report or graph.
Since: 0.2.2
Instances
Eq TitleAnnotation, Read TitleAnnotation, Show TitleAnnotation (defined in BenchShow.Common)
data Estimator Source #
The statistical estimator used to arrive at a single value for a benchmark when samples from multiple experiments are available.
Since: 0.2.0
Median | Report the median, outliers, and outlier variance using the box-plot method. This is the most robust indicator with respect to outliers when successive runs of benchmarks are compared. |
Mean | Report the mean and the standard deviation from the mean. This is less robust than the median but more precise. |
Regression | Report the coefficient of regression, discarding the constant factor, arrived at by linear regression using the ordinary least squares method. The R-squared goodness-of-fit estimate is also reported. It works better when a larger number of samples is taken. It cannot be used when the number of samples is less than 2; in that case the mean is reported instead. |
data DiffStrategy Source #
Strategy to compute the difference between two groups of benchmarks being compared.
Since: 0.2.0
SingleEstimator | Use a single estimator to compute the difference between the baseline and the candidate. The estimator provided in the configuration is used. |
MinEstimator | Use all the estimators (Median, Mean, Regression) and report the one that shows the minimum difference. |
Instances
Read DiffStrategy, Show DiffStrategy (defined in BenchShow.Common)
data SortColumn Source #
When sorting and filtering the benchmarks using selectBenchmarks, we can choose a column as the sort criterion. selectBenchmarks is provided with the data for the corresponding column, which can be used for sorting the benchmarks. The column could be a group or a field depending on the Presentation.
Since: 0.2.0
ColumnIndex Int | Specify the index of the sort column. Index 0 corresponds to the first column. |
ColumnName (Either String (String, Int)) | Specify the column using the name of the group or the field it represents; when the same group name occurs in multiple runs, qualify it with the runId using the (name, runId) form. |
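As a sketch, benchmarks can be selected sorted by the values in the first column, keeping only those above a cutoff; the 10.0 threshold here is purely illustrative:

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)
import BenchShow

-- Select benchmarks sorted in descending order by the value in the
-- first column, keeping only those whose value exceeds a cutoff.
cfg :: Config
cfg = defaultConfig
    { selectBenchmarks = \f ->
        case f (ColumnIndex 0) Nothing of
            Left err   -> error err
            Right vals ->
                [ name
                | (name, v) <- sortBy (comparing (Down . snd)) vals
                , v > 10.0
                ]
    }
```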
data Config Source #
Configuration governing generation of charts and reports. See defaultConfig for the default values of these fields.
Since: 0.2.0
defaultConfig :: Config Source #
Default configuration. Use this as the base configuration and modify the required fields. The defaults are:
verbose = False
mkTitle = Nothing
titleAnnotations = [TitleField]
outputDir = Nothing
presentation = Groups Absolute
estimator = Median
threshold = 3
diffStrategy = SingleEstimator
omitBaseline = False
selectFields = filter (flip elem ["time", "mean", "maxrss"] . map toLower)
fieldRanges = []
fieldTicks = []
classifyBenchmark = Just . ("default",)
selectGroups = id
selectBenchmarks = \f -> either error (map fst) $ f (ColumnIndex 0) Nothing
Since: 0.2.0
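Individual fields can be overridden on top of the defaults using record-update syntax; the title string and the chosen overrides below are illustrative:

```haskell
import BenchShow

-- Start from defaultConfig and override a few fields.
cfg :: Config
cfg = defaultConfig
    { title        = Just "Performance comparison" -- illustrative title
    , presentation = Groups Diff                   -- diff from baseline
    , estimator    = Mean
    }
```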
report :: FilePath -> Maybe FilePath -> Config -> IO () Source #
Presents the benchmark results in a CSV input file as textual reports according to the provided configuration. The first parameter is the input file name. The second parameter, when specified using Just, is the name prefix for the output report file(s); one or more output files may be generated with the given prefix depending on the Presentation setting. When the second parameter is Nothing the reports are printed on the console. The last parameter is the configuration to customize the report; you can start with defaultConfig as the base and override any of the fields that you may want to change.
For example:
report "bench-results.csv" Nothing defaultConfig
Since: 0.2.0
graph :: FilePath -> FilePath -> Config -> IO () Source #
Presents the benchmark results in a CSV input file as graphical bar charts according to the provided configuration. The first parameter is the input file name, the second parameter is the name prefix for the output SVG image file(s). One or more output files may be generated depending on the Presentation setting. The last parameter is the configuration to customize the graph; you can start with defaultConfig as the base and override any of the fields that you may want to change.
For example:
graph "bench-results.csv" "output-graph" defaultConfig
Since: 0.2.0