PING  0.9
Statistical data handling and processing in production environment
quantile

Divide a given sample, possibly weighted, into a certain number of slices of equal size, with units ranked according to a variable of interest.

Algorithm

Nine quantile algorithms are available, as discussed in Hyndman and Fan's publication, plus two additional ones from the literature, due to Cunnane and Filliben (see references):

type description
1 inverted empirical CDF
2 inverted empirical CDF with averaging at discontinuities
3 observation numbered closest to N*p (piecewise linear function)
4 linear interpolation of the empirical CDF
5 Hazen's model (piecewise linear function)
6 Weibull quantile
7 interpolation points divide sample range into n-1 intervals
8 unbiased median (regardless of the distribution)
9 approximate unbiased estimate for a normal distribution
10 Cunnane's definition (approximately unbiased)
11 Filliben’s estimate

All sample quantiles are defined as weighted averages of consecutive order statistics. Sample quantiles of type i are defined for 1 <= i <= 11 by:

Q[i](p) = (1 - gamma) * x[j] + gamma *  x[j+1]

where x[j] is the j-th order statistic, N is the sample size, and j is such that (j-m)/N <= p < (j-m+1)/N. The value of gamma is a function of:

j = floor(N*p + m)
g = N*p + m - j

and m is a quantity determined by the sample quantile type (for some types it also depends on p, see the tables below).
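The definition above can be sketched in a few lines of Python. This is only an illustration, not the PING implementation: the function name `sample_quantile`, the optional `gamma` callable and the clamping of j to the range 1..N at the boundaries are assumptions made for this example. When no `gamma` rule is passed, gamma = g is used.

```python
import math

def sample_quantile(x, p, m, gamma=None):
    """Q(p) = (1 - gamma) * x[j] + gamma * x[j+1], with j = floor(N*p + m).

    `gamma` is a hypothetical callable (j, g) -> weight; by default gamma = g.
    """
    xs = sorted(x)                 # order statistics x[1] <= ... <= x[N]
    n = len(xs)
    h = n * p + m
    j = math.floor(h)
    g = h - j
    # Assumed boundary handling: fall back to the smallest / largest observation
    if j < 1:
        return xs[0]
    if j >= n:
        return xs[-1]
    w = g if gamma is None else gamma(j, g)
    # xs is 0-based, so x[j] is xs[j - 1] and x[j+1] is xs[j]
    return (1 - w) * xs[j - 1] + w * xs[j]

# Median of four values with m = 1 - p, the value listed for type 7
print(sample_quantile([4, 1, 3, 2], 0.5, m=0.5))  # prints 2.5
```

Here N*p + m = 2.5, so j = 2 and g = 0.5, and the result is the midpoint of the second and third order statistics.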

For types 1, 2 and 3, Q[i](p) is a discontinuous function:

type  p[k]       m    alphap  betap  gamma
1     k/N        0    0       1      1 if g>0, 0 if g=0
2     k/N        0    0       1      1 if g>0, 1/2 if g=0
3     (k+1/2)/N  -.5  -.5     1.5    0 if g=0 and j even, 1 otherwise
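As a rough illustration of the discontinuous case, the sketch below encodes the gamma rules of types 1 to 3 as given by Hyndman and Fan (type 2 averages at discontinuities, that is gamma = 1/2 when g = 0). The names `gamma_rule`, `discontinuous_quantile` and `EPS`, the tolerance used to decide that g is zero, and the clamping of indices to 1..N are assumptions of this example rather than part of the macro.

```python
import math

EPS = 1e-12                      # assumed tolerance for treating g as zero
M = {1: 0.0, 2: 0.0, 3: -0.5}    # m for types 1, 2 and 3

def gamma_rule(qtype, j, g):
    """Weight gamma for the discontinuous types 1, 2 and 3."""
    g_is_zero = abs(g) < EPS
    if qtype == 1:
        return 0.0 if g_is_zero else 1.0
    if qtype == 2:
        return 0.5 if g_is_zero else 1.0
    if qtype == 3:
        return 0.0 if (g_is_zero and j % 2 == 0) else 1.0
    raise ValueError("qtype must be 1, 2 or 3")

def discontinuous_quantile(x, p, qtype):
    xs = sorted(x)
    n = len(xs)
    h = n * p + M[qtype]
    j = math.floor(h)
    gam = gamma_rule(qtype, j, h - j)
    # Assumed boundary handling: clamp the indices j and j+1 to 1..N
    lo = min(max(j, 1), n)
    hi = min(max(j + 1, 1), n)
    return (1 - gam) * xs[lo - 1] + gam * xs[hi - 1]

for qtype in (1, 2, 3):
    print(qtype, discontinuous_quantile([1, 2, 3, 4], 0.5, qtype))
```

For the sample 1, 2, 3, 4 and p = 0.5, types 1 and 3 return the second order statistic, while type 2 averages the second and third.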

For types 4 through 11, Q[i](p) is a continuous function of p, with gamma and m given below. The sample quantiles can be obtained equivalently by linear interpolation between the points (p[k],x[k]) where x[k] is the k-th order statistic:

type  p[k]                m             alphap  betap  gamma
4     k/N                 0             0       1      g
5     (k-1/2)/N           .5            .5      .5     g
6     k/(N+1)             p             0       0      g
7     (k-1)/(N-1)         1-p           1       1      g
8     (k-1/3)/(N+1/3)     (1+p)/3       1/3     1/3    g
9     (k-3/8)/(N+1/4)     (2*p+3)/8     3/8     3/8    g
10    (k-.4)/(N+.2)       .2*p+.4       .4      .4     g
11    (k-.3175)/(N+.365)  .365*p+.3175  .3175   .3175  g

In the above tables, the (alphap,betap) pair is defined such that:

p[k] = (k - alphap)/(N + 1 - alphap - betap)
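The relation between (alphap, betap), p[k] and m can be checked numerically. Solving p = p[k] for k in the formula above gives k = N*p + alphap + p*(1 - alphap - betap), so that m = alphap + p*(1 - alphap - betap), which reproduces the m column for types 4 to 11. The sketch below is an illustration under these assumptions: the function names are hypothetical, N is at least 2, and values of p outside [p[1], p[N]] are mapped to the smallest or largest observation. It compares the weighted-average formula with linear interpolation between the points (p[k], x[k]).

```python
import math

# (alphap, betap) for the continuous types 4 to 11, copied from the table
ALPHA_BETA = {
    4: (0.0, 1.0), 5: (0.5, 0.5), 6: (0.0, 0.0), 7: (1.0, 1.0),
    8: (1 / 3, 1 / 3), 9: (3 / 8, 3 / 8), 10: (0.4, 0.4),
    11: (0.3175, 0.3175),
}

def plotting_positions(n, alphap, betap):
    """p[k] = (k - alphap) / (N + 1 - alphap - betap) for k = 1..N."""
    return [(k - alphap) / (n + 1 - alphap - betap) for k in range(1, n + 1)]

def m_from_alpha_beta(p, alphap, betap):
    return alphap + p * (1 - alphap - betap)

def quantile_by_formula(x, p, alphap, betap):
    xs = sorted(x)
    n = len(xs)
    h = n * p + m_from_alpha_beta(p, alphap, betap)
    j = math.floor(h)
    g = h - j                      # gamma = g for types 4 to 11
    if j < 1:
        return xs[0]
    if j >= n:
        return xs[-1]
    return (1 - g) * xs[j - 1] + g * xs[j]

def quantile_by_interpolation(x, p, alphap, betap):
    xs = sorted(x)
    pk = plotting_positions(len(xs), alphap, betap)
    if p <= pk[0]:
        return xs[0]
    if p >= pk[-1]:
        return xs[-1]
    for k in range(1, len(xs)):    # find the segment with pk[k-1] <= p < pk[k]
        if p < pk[k]:
            t = (p - pk[k - 1]) / (pk[k] - pk[k - 1])
            return xs[k - 1] + t * (xs[k] - xs[k - 1])
    return xs[-1]

sample = [2.0, 7.0, 1.0, 9.0, 4.0]
for qtype, (a, b) in ALPHA_BETA.items():
    q1 = quantile_by_formula(sample, 0.25, a, b)
    q2 = quantile_by_interpolation(sample, 0.25, a, b)
    print(qtype, round(q1, 6), round(q2, 6))
```

Both columns of the printed output should agree for every type, up to rounding error.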

References