src/parsec/disk-image/parsec/parsec-benchmark/pkgs/libs/gsl/src/doc/statistics.texi - public/gem5-resources - Git at Google

 @cindex statistics
 @cindex mean
 @cindex standard deviation
 @cindex variance
 @cindex estimated standard deviation
 @cindex estimated variance
 @cindex t-test
 @cindex range
 @cindex min
 @cindex max

 This chapter describes the statistical functions in the library.  The
 basic statistical functions include routines to compute the mean,
 variance and standard deviation.  More advanced functions allow you to
 calculate absolute deviations, skewness, and kurtosis as well as the
 median and arbitrary percentiles.  The algorithms use recurrence
 relations to compute average quantities in a stable way, without large
 intermediate values that might overflow.

 The functions are available in versions for datasets in the standard
 floating-point and integer types.  The versions for double precision
 floating-point data have the prefix @code{gsl_stats} and are declared in
 the header file @file{gsl_statistics_double.h}.  The versions for integer
 data have the prefix @code{gsl_stats_int} and are declared in the header
 file @file{gsl_statistics_int.h}.

 @menu
 * Mean and standard deviation and variance::
 * Absolute deviation::
 * Higher moments (skewness and kurtosis)::
 * Autocorrelation::
 * Covariance::
 * Weighted Samples::
 * Maximum and Minimum values::
 * Median and Percentiles::
 * Example statistical programs::
 * Statistics References and Further Reading::
 @end menu

 @node Mean and standard deviation and variance
 @section Mean, Standard Deviation and Variance

 @deftypefun double gsl_stats_mean (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the arithmetic mean of @var{data}, a dataset of
 length @var{n} with stride @var{stride}.  The arithmetic mean, or
 @dfn{sample mean}, is denoted by @math{\Hat\mu} and defined as,
 @tex
 \beforedisplay
 $$
 {\Hat\mu} = {1 \over N} \sum x_i
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 \Hat\mu = (1/N) \sum x_i
 @end example

 @end ifinfo
 @noindent
 where @math{x_i} are the elements of the dataset @var{data}.  For
 samples drawn from a gaussian distribution the variance of
 @math{\Hat\mu} is @math{\sigma^2 / N}.
 @end deftypefun

 @deftypefun double gsl_stats_variance (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the estimated, or @dfn{sample}, variance of
 @var{data}, a dataset of length @var{n} with stride @var{stride}.  The
 estimated variance is denoted by @math{\Hat\sigma^2} and is defined by,
 @tex
 \beforedisplay
 $$
 {\Hat\sigma}^2 = {1 \over (N-1)} \sum (x_i - {\Hat\mu})^2
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 \Hat\sigma^2 = (1/(N-1)) \sum (x_i - \Hat\mu)^2
 @end example

 @end ifinfo
 @noindent
 where @math{x_i} are the elements of the dataset @var{data}.  Note that
 the normalization factor of @math{1/(N-1)} results from the derivation
 of @math{\Hat\sigma^2} as an unbiased estimator of the population
 variance @math{\sigma^2}.  For samples drawn from a gaussian distribution
 the variance of @math{\Hat\sigma^2} itself is @math{2 \sigma^4 / N}.

 This function computes the mean via a call to @code{gsl_stats_mean}.  If
 you have already computed the mean then you can pass it directly to
 @code{gsl_stats_variance_m}.
 @end deftypefun

 @deftypefun double gsl_stats_variance_m (const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{mean})
 This function returns the sample variance of @var{data} relative to the
 given value of @var{mean}.  The function is computed with @math{\Hat\mu}
 replaced by the value of @var{mean} that you supply,
 @tex
 \beforedisplay
 $$
 {\Hat\sigma}^2 = {1 \over (N-1)} \sum (x_i - mean)^2
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 \Hat\sigma^2 = (1/(N-1)) \sum (x_i - mean)^2
 @end example
 @end ifinfo
 @end deftypefun

 @deftypefun double gsl_stats_sd (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 @deftypefunx double gsl_stats_sd_m (const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{mean})
 The standard deviation is defined as the square root of the variance.
 These functions return the square root of the corresponding variance
 functions above.
 @end deftypefun

 @deftypefun double gsl_stats_variance_with_fixed_mean (const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{mean})
 This function computes an unbiased estimate of the variance of
 @var{data} when the population mean @var{mean} of the underlying
 distribution is known @emph{a priori}.  In this case the estimator for
 the variance uses the factor @math{1/N} and the sample mean
 @math{\Hat\mu} is replaced by the known population mean @math{\mu},
 @tex
 \beforedisplay
 $$
 {\Hat\sigma}^2 = {1 \over N} \sum (x_i - \mu)^2
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 \Hat\sigma^2 = (1/N) \sum (x_i - \mu)^2
 @end example
 @end ifinfo
 @end deftypefun


 @deftypefun double gsl_stats_sd_with_fixed_mean (const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{mean})
 This function calculates the standard deviation of @var{data} for a
 fixed population mean @var{mean}.  The result is the square root of the
 corresponding variance function.
 @end deftypefun

 @node Absolute deviation
 @section Absolute deviation

 @deftypefun double gsl_stats_absdev (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function computes the absolute deviation from the mean of
 @var{data}, a dataset of length @var{n} with stride @var{stride}.  The
 absolute deviation from the mean is defined as,
 @tex
 \beforedisplay
 $$
 absdev  = {1 \over N} \sum |x_i - {\Hat\mu}|
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 absdev  = (1/N) \sum |x_i - \Hat\mu|
 @end example

 @end ifinfo
 @noindent
 where @math{x_i} are the elements of the dataset @var{data}.  The
 absolute deviation from the mean provides a more robust measure of the
 width of a distribution than the variance.  This function computes the
 mean of @var{data} via a call to @code{gsl_stats_mean}.
 @end deftypefun

 @deftypefun double gsl_stats_absdev_m (const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{mean})
 This function computes the absolute deviation of the dataset @var{data}
 relative to the given value of @var{mean},
 @tex
 \beforedisplay
 $$
 absdev  = {1 \over N} \sum |x_i - mean|
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 absdev  = (1/N) \sum |x_i - mean|
 @end example

 @end ifinfo
 @noindent
 This function is useful if you have already computed the mean of
 @var{data} (and want to avoid recomputing it), or wish to calculate the
 absolute deviation relative to another value (such as zero, or the
 median).
 @end deftypefun

 @node Higher moments (skewness and kurtosis)
 @section Higher moments (skewness and kurtosis)

 @deftypefun double gsl_stats_skew (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function computes the skewness of @var{data}, a dataset of length
 @var{n} with stride @var{stride}.  The skewness is defined as,
 @tex
 \beforedisplay
 $$
 skew = {1 \over N} \sum
  {\left( x_i - {\Hat\mu} \over {\Hat\sigma} \right)}^3
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 skew = (1/N) \sum ((x_i - \Hat\mu)/\Hat\sigma)^3
 @end example

 @end ifinfo
 @noindent
 where @math{x_i} are the elements of the dataset @var{data}.  The skewness
 measures the asymmetry of the tails of a distribution.

 The function computes the mean and estimated standard deviation of
 @var{data} via calls to @code{gsl_stats_mean} and @code{gsl_stats_sd}.
 @end deftypefun

 @deftypefun double gsl_stats_skew_m_sd (const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{mean}, double @var{sd})
 This function computes the skewness of the dataset @var{data} using the
 given values of the mean @var{mean} and standard deviation @var{sd},
 @tex
 \beforedisplay
 $$
 skew = {1 \over N}
      \sum {\left( x_i - mean \over sd \right)}^3
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 skew = (1/N) \sum ((x_i - mean)/sd)^3
 @end example

 @end ifinfo
 @noindent
 These functions are useful if you have already computed the mean and
 standard deviation of @var{data} and want to avoid recomputing them.
 @end deftypefun

 @deftypefun double gsl_stats_kurtosis (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function computes the kurtosis of @var{data}, a dataset of length
 @var{n} with stride @var{stride}.  The kurtosis is defined as,
 @tex
 \beforedisplay
 $$
 kurtosis = \left( {1 \over N} \sum
  {\left(x_i - {\Hat\mu} \over {\Hat\sigma} \right)}^4
  \right)
  - 3
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 kurtosis = ((1/N) \sum ((x_i - \Hat\mu)/\Hat\sigma)^4)  - 3
 @end example

 @end ifinfo
 @noindent
 The kurtosis measures how sharply peaked a distribution is, relative to
 its width.  The kurtosis is normalized to zero for a gaussian
 distribution.
 @end deftypefun

 @deftypefun double gsl_stats_kurtosis_m_sd (const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{mean}, double @var{sd})
 This function computes the kurtosis of the dataset @var{data} using the
 given values of the mean @var{mean} and standard deviation @var{sd},
 @tex
 \beforedisplay
 $$
 kurtosis = {1 \over N}
   \left( \sum {\left(x_i - mean \over sd \right)}^4 \right)
   - 3
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 kurtosis = ((1/N) \sum ((x_i - mean)/sd)^4) - 3
 @end example

 @end ifinfo
 @noindent
 This function is useful if you have already computed the mean and
 standard deviation of @var{data} and want to avoid recomputing them.
 @end deftypefun

 @node Autocorrelation
 @section Autocorrelation

 @deftypefun double gsl_stats_lag1_autocorrelation (const double @var{data}[], const size_t @var{stride}, const size_t @var{n})
 This function computes the lag-1 autocorrelation of the dataset @var{data}.
 @tex
 \beforedisplay
 $$
 a_1 = {\sum_{i = 1}^{n} (x_{i} - \Hat\mu) (x_{i-1} - \Hat\mu)
 \over
 \sum_{i = 1}^{n} (x_{i} - \Hat\mu) (x_{i} - \Hat\mu)}
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 a_1 = @{\sum_@{i = 1@}^@{n@} (x_@{i@} - \Hat\mu) (x_@{i-1@} - \Hat\mu)
        \over
        \sum_@{i = 1@}^@{n@} (x_@{i@} - \Hat\mu) (x_@{i@} - \Hat\mu)@}
 @end example
 @end ifinfo
 @end deftypefun


 @deftypefun double gsl_stats_lag1_autocorrelation_m (const double @var{data}[], const size_t @var{stride}, const size_t @var{n}, const double @var{mean})
 This function computes the lag-1 autocorrelation of the dataset
 @var{data} using the given value of the mean @var{mean}.

 @end deftypefun

 @node Covariance
 @section Covariance
 @cindex covariance, of two datasets

 @deftypefun double gsl_stats_covariance (const double @var{data1}[], const size_t @var{stride1}, const double @var{data2}[], const size_t @var{stride2}, const size_t @var{n})
 This function computes the covariance of the datasets @var{data1} and
 @var{data2} which must both be of the same length @var{n}.
 @tex
 \beforedisplay
 $$
 covar = {1 \over (n - 1)} \sum_{i = 1}^{n} (x_{i} - \Hat x) (y_{i} - \Hat y)
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 covar = (1/(n - 1)) \sum_@{i = 1@}^@{n@} (x_i - \Hat x) (y_i - \Hat y)
 @end example
 @end ifinfo
 @end deftypefun

 @deftypefun double gsl_stats_covariance_m (const double @var{data1}[], const size_t @var{stride1}, const double @var{data2}[], const size_t @var{stride2}, const size_t @var{n}, const double @var{mean1}, const double @var{mean2})
 This function computes the covariance of the datasets @var{data1} and
 @var{data2} using the given values of the means, @var{mean1} and
 @var{mean2}.  This is useful if you have already computed the means of
 @var{data1} and @var{data2} and want to avoid recomputing them.
 @end deftypefun


 @node Weighted Samples
 @section Weighted Samples

 The functions described in this section allow the computation of
 statistics for weighted samples.  The functions accept an array of
 samples, @math{x_i}, with associated weights, @math{w_i}.  Each sample
 @math{x_i} is considered as having been drawn from a Gaussian
 distribution with variance @math{\sigma_i^2}.  The sample weight
 @math{w_i} is defined as the reciprocal of this variance, @math{w_i =
 1/\sigma_i^2}.  Setting a weight to zero corresponds to removing a
 sample from a dataset.

 @deftypefun double gsl_stats_wmean (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the weighted mean of the dataset @var{data} with
 stride @var{stride} and length @var{n}, using the set of weights @var{w}
 with stride @var{wstride} and length @var{n}.  The weighted mean is defined as,
 @tex
 \beforedisplay
 $$
 {\Hat\mu} = {{\sum w_i x_i} \over {\sum w_i}}
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 \Hat\mu = (\sum w_i x_i) / (\sum w_i)
 @end example
 @end ifinfo
 @end deftypefun


 @deftypefun double gsl_stats_wvariance (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the estimated variance of the dataset @var{data}
 with stride @var{stride} and length @var{n}, using the set of weights
 @var{w} with stride @var{wstride} and length @var{n}.  The estimated
 variance of a weighted dataset is defined as,
 @tex
 \beforedisplay
 $$
 \Hat\sigma^2 = {{\sum w_i} \over {(\sum w_i)^2 - \sum (w_i^2)}}
                 \sum w_i (x_i - \Hat\mu)^2
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 \Hat\sigma^2 = ((\sum w_i)/((\sum w_i)^2 - \sum (w_i^2)))
                 \sum w_i (x_i - \Hat\mu)^2
 @end example

 @end ifinfo
 @noindent
 Note that this expression reduces to an unweighted variance with the
 familiar @math{1/(N-1)} factor when there are @math{N} equal non-zero
 weights.
 @end deftypefun

 @deftypefun double gsl_stats_wvariance_m (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{wmean})
 This function returns the estimated variance of the weighted dataset
 @var{data} using the given weighted mean @var{wmean}.
 @end deftypefun

 @deftypefun double gsl_stats_wsd (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 The standard deviation is defined as the square root of the variance.
 This function returns the square root of the corresponding variance
 function @code{gsl_stats_wvariance} above.
 @end deftypefun

 @deftypefun double gsl_stats_wsd_m (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{wmean})
 This function returns the square root of the corresponding variance
 function @code{gsl_stats_wvariance_m} above.
 @end deftypefun

 @deftypefun double gsl_stats_wvariance_with_fixed_mean (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n}, const double @var{mean})
 This function computes an unbiased estimate of the variance of weighted
 dataset @var{data} when the population mean @var{mean} of the underlying
 distribution is known @emph{a priori}.  In this case the estimator for
 the variance replaces the sample mean @math{\Hat\mu} by the known
 population mean @math{\mu},
 @tex
 \beforedisplay
 $$
 \Hat\sigma^2 = {{\sum w_i (x_i - \mu)^2} \over {\sum w_i}}
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 \Hat\sigma^2 = (\sum w_i (x_i - \mu)^2) / (\sum w_i)
 @end example
 @end ifinfo
 @end deftypefun

 @deftypefun double gsl_stats_wsd_with_fixed_mean (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n}, const double @var{mean})
 The standard deviation is defined as the square root of the variance.
 This function returns the square root of the corresponding variance
 function above.
 @end deftypefun

 @deftypefun double gsl_stats_wabsdev (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function computes the weighted absolute deviation from the weighted
 mean of @var{data}.  The absolute deviation from the mean is defined as,
 @tex
 \beforedisplay
 $$
 absdev = {{\sum w_i |x_i - \Hat\mu|} \over {\sum w_i}}
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 absdev = (\sum w_i |x_i - \Hat\mu|) / (\sum w_i)
 @end example
 @end ifinfo
 @end deftypefun

 @deftypefun double gsl_stats_wabsdev_m (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{wmean})
 This function computes the absolute deviation of the weighted dataset
 @var{data} about the given weighted mean @var{wmean}.
 @end deftypefun

 @deftypefun double gsl_stats_wskew (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function computes the weighted skewness of the dataset @var{data}.
 @tex
 \beforedisplay
 $$
 skew = {{\sum w_i ((x_i - xbar)/\sigma)^3} \over {\sum w_i}}
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 skew = (\sum w_i ((x_i - xbar)/\sigma)^3) / (\sum w_i)
 @end example
 @end ifinfo
 @end deftypefun

 @deftypefun double gsl_stats_wskew_m_sd (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{wmean}, double @var{wsd})
 This function computes the weighted skewness of the dataset @var{data}
 using the given values of the weighted mean and weighted standard
 deviation, @var{wmean} and @var{wsd}.
 @end deftypefun

 @deftypefun double gsl_stats_wkurtosis (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function computes the weighted kurtosis of the dataset @var{data}.

 @tex
 \beforedisplay
 $$
 kurtosis = {{\sum w_i ((x_i - xbar)/sigma)^4} \over {\sum w_i}} - 3
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 kurtosis = ((\sum w_i ((x_i - xbar)/sigma)^4) / (\sum w_i)) - 3
 @end example
 @end ifinfo
 @end deftypefun

 @deftypefun double gsl_stats_wkurtosis_m_sd (const double @var{w}[], size_t @var{wstride}, const double @var{data}[], size_t @var{stride}, size_t @var{n}, double @var{wmean}, double @var{wsd})
 This function computes the weighted kurtosis of the dataset @var{data}
 using the given values of the weighted mean and weighted standard
 deviation, @var{wmean} and @var{wsd}.
 @end deftypefun

 @node Maximum and Minimum values
 @section Maximum and Minimum values

 The following functions find the maximum and minimum values of a
 dataset (or their indices).  If the data contains @code{NaN}s then a
 @code{NaN} will be returned, since the maximum or minimum value is
 undefined.  For functions which return an index, the location of the
 first @code{NaN} in the array is returned.

 @deftypefun double gsl_stats_max (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the maximum value in @var{data}, a dataset of
 length @var{n} with stride @var{stride}.  The maximum value is defined
 as the value of the element @math{x_i} which satisfies @c{$x_i \ge x_j$}
 @math{x_i >= x_j} for all @math{j}.

 If you want instead to find the element with the largest absolute
 magnitude you will need to apply @code{fabs} or @code{abs} to your data
 before calling this function.
 @end deftypefun

 @deftypefun double gsl_stats_min (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the minimum value in @var{data}, a dataset of
 length @var{n} with stride @var{stride}.  The minimum value is defined
 as the value of the element @math{x_i} which satisfies @c{$x_i \le x_j$}
 @math{x_i <= x_j} for all @math{j}.

 If you want instead to find the element with the smallest absolute
 magnitude you will need to apply @code{fabs} or @code{abs} to your data
 before calling this function.
 @end deftypefun

 @deftypefun void gsl_stats_minmax (double * @var{min}, double * @var{max}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function finds both the minimum and maximum values @var{min},
 @var{max} in @var{data} in a single pass.
 @end deftypefun

 @deftypefun size_t gsl_stats_max_index (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the index of the maximum value in @var{data}, a
 dataset of length @var{n} with stride @var{stride}.  The maximum value is
 defined as the value of the element @math{x_i} which satisfies
 @c{$x_i \ge x_j$}
 @math{x_i >= x_j} for all @math{j}.  When there are several equal maximum
 elements then the first one is chosen.
 @end deftypefun

 @deftypefun size_t gsl_stats_min_index (const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the index of the minimum value in @var{data}, a
 dataset of length @var{n} with stride @var{stride}.  The minimum value
 is defined as the value of the element @math{x_i} which satisfies
 @c{$x_i \ge x_j$}
 @math{x_i >= x_j} for all @math{j}.  When there are several equal
 minimum elements then the first one is chosen.
 @end deftypefun

 @deftypefun void gsl_stats_minmax_index (size_t * @var{min_index}, size_t * @var{max_index}, const double @var{data}[], size_t @var{stride}, size_t @var{n})
 This function returns the indexes @var{min_index}, @var{max_index} of
 the minimum and maximum values in @var{data} in a single pass.
 @end deftypefun

 @node Median and Percentiles
 @section Median and Percentiles

 The median and percentile functions described in this section operate on
 sorted data.  For convenience we use @dfn{quantiles}, measured on a scale
 of 0 to 1, instead of percentiles (which use a scale of 0 to 100).

 @deftypefun double gsl_stats_median_from_sorted_data (const double @var{sorted_data}[], size_t @var{stride}, size_t @var{n})
 This function returns the median value of @var{sorted_data}, a dataset
 of length @var{n} with stride @var{stride}.  The elements of the array
 must be in ascending numerical order.  There are no checks to see
 whether the data are sorted, so the function @code{gsl_sort} should
 always be used first.

 When the dataset has an odd number of elements the median is the value
 of element @math{(n-1)/2}.  When the dataset has an even number of
 elements the median is the mean of the two nearest middle values,
 elements @math{(n-1)/2} and @math{n/2}.  Since the algorithm for
 computing the median involves interpolation this function always returns
 a floating-point number, even for integer data types.
 @end deftypefun

 @deftypefun double gsl_stats_quantile_from_sorted_data (const double @var{sorted_data}[], size_t @var{stride}, size_t @var{n}, double @var{f})
 This function returns a quantile value of @var{sorted_data}, a
 double-precision array of length @var{n} with stride @var{stride}.  The
 elements of the array must be in ascending numerical order.  The
 quantile is determined by the @var{f}, a fraction between 0 and 1.  For
 example, to compute the value of the 75th percentile @var{f} should have
 the value 0.75.

 There are no checks to see whether the data are sorted, so the function
 @code{gsl_sort} should always be used first.

 The quantile is found by interpolation, using the formula
 @tex
 \beforedisplay
 $$
 \hbox{quantile} = (1 - \delta) x_i + \delta x_{i+1}
 $$
 \afterdisplay
 @end tex
 @ifinfo

 @example
 quantile = (1 - \delta) x_i + \delta x_@{i+1@}
 @end example

 @end ifinfo
 @noindent
 where @math{i} is @code{floor}(@math{(n - 1)f}) and @math{\delta} is
 @math{(n-1)f - i}.

 Thus the minimum value of the array (@code{data[0*stride]}) is given by
 @var{f} equal to zero, the maximum value (@code{data[(n-1)*stride]}) is
 given by @var{f} equal to one and the median value is given by @var{f}
 equal to 0.5.  Since the algorithm for computing quantiles involves
 interpolation this function always returns a floating-point number, even
 for integer data types.
 @end deftypefun


 @comment @node Statistical tests
 @comment @section Statistical tests

 @comment FIXME, do more work on the statistical tests

 @comment -@deftypefun double gsl_stats_ttest (const double @var{data1}[], double @var{data2}[], size_t @var{n1}, size_t @var{n2})
 @comment -@deftypefunx Statistics double gsl_stats_int_ttest (const double @var{data1}[], double @var{data2}[], size_t @var{n1}, size_t @var{n2})

 @comment The function @code{gsl_stats_ttest} computes the t-test statistic for
 @comment the two arrays @var{data1}[] and @var{data2}[], of lengths @var{n1} and
 @comment -@var{n2} respectively.

 @comment The t-test statistic measures the difference between the means of two
 @comment datasets.

 @node Example statistical programs
 @section Examples
 Here is a basic example of how to use the statistical functions:

 @example
 @verbatiminclude examples/stat.c
 @end example

 The program should produce the following output,

 @example
 @verbatiminclude examples/stat.out
 @end example


 Here is an example using sorted data,

 @example
 @verbatiminclude examples/statsort.c
 @end example

 This program should produce the following output,

 @example
 @verbatiminclude examples/statsort.out
 @end example

 @node Statistics References and Further Reading
 @section References and Further Reading

 The standard reference for almost any topic in statistics is the
 multi-volume @cite{Advanced Theory of Statistics} by Kendall and Stuart.

 @itemize @asis
 @item
 Maurice Kendall, Alan Stuart, and J. Keith Ord.
 @cite{The Advanced Theory of Statistics} (multiple volumes)
 reprinted as @cite{Kendall's Advanced Theory of Statistics}.
 Wiley, ISBN 047023380X.
 @end itemize

 @noindent
 Many statistical concepts can be more easily understood by a Bayesian
 approach.  The following book by Gelman, Carlin, Stern and Rubin gives a
 comprehensive coverage of the subject.

 @itemize @asis
 @item
 Andrew Gelman, John B. Carlin, Hal S. Stern, Donald B. Rubin.
 @cite{Bayesian Data Analysis}.
 Chapman & Hall, ISBN 0412039915.
 @end itemize

 @noindent
 For physicists the Particle Data Group provides useful reviews of
 Probability and Statistics in the ``Mathematical Tools'' section of its
 Annual Review of Particle Physics.

 @itemize @asis
 @item
 @cite{Review of Particle Properties}
 R.M. Barnett et al., Physical Review D54, 1 (1996)
 @end itemize

 @noindent
 The Review of Particle Physics is available online at
 the website @uref{http://pdg.lbl.gov/}.