Detailed algorithms
Eleven quantile algorithms are made available: 9 are discussed in Hyndman and Fan’s, 1 in Cunnane’s and 1 in Filliben’s articles (see references):
type |
description |
---|---|
1 | inverted empirical CDF |
2 | inverted empirical CDF with averaging at discontinuities |
3 | observation numberer closest to qN (piecewise linear function) |
4 | linear interpolation of the empirical CDF |
5 | Hazen’s model (piecewise linear function) |
6 | Weibull quantile |
7 | interpolation points divide sample range into n-1 intervals |
8 | unbiased median (regardless of the distribution) |
9 | approximate unbiased estimate for a normal distribution |
10 | Cunnane’s definition (approximately unbiased) |
11 | Filliben’s estimate |
All sample quantiles are defined as weighted averages of consecutive order statistics. Sample quantiles of type i
are defined for 1 <= i <= 10
by:
Q[i](p) = (1 - gamma) * x[j] + gamma * x[j+1]
where x[j]
, for (j-m)/N<=p<(j-m+1)/N
, is the j
-th order statistic, N
is the sample size, the value of gamma
is a function of:
j = floor(N*p + m)
g = N*p + m - j
and m
is a constant determined by the sample quantile type.
For types 1, 2 and 3, Q[i](p)
is a discontinuous function:
type |
p[k] |
m |
alphap |
betap |
gamma |
---|---|---|---|---|---|
1 | k/N |
0 | 0 | 1 | 1 if g>0 , 0 if g=0 |
2 | k/N |
0 | 0 | 1 | 1/2 if g>0 , 0 if g=0 |
3 | (k+1/2)/N |
-.5 | -.5 | -1.5 | 0 if g=0 and j even, 1 otherwise |
For types 4 through 11, Q[i](p)
is a continuous function of p
, with gamma
and m
given below. The sample quantiles can be obtained equivalently by linear interpolation between the points (p[k],x[k])
where x[k]
is the k
-th order statistic:
type |
p[k] |
m |
alphap |
betap |
gamma |
---|---|---|---|---|---|
4 | k/N |
0 | 0 | 1 | g |
5 | (k-1/2)/N |
.5 | .5 | .5 | g |
6 | k/(N+1) |
p |
0 | 0 | g |
7 | (k-1)/(N-1) |
1-p |
1 | 1 | g |
8 | (k-1/3)/(N+1/3) |
(1+p)3 |
1/3 | 1/3 | g |
9 | (k-3/8)/(N+1/4) |
(2*p+3)/8 |
3/8 | 3/8 | g |
10 | (k-.4)/(N+.2) |
.2*p+.4 |
.4 | .4 | g |
11 | (k-.3175)/(N+.365) |
.365*p+.3175 |
.3175 | .3175 | g |
In the above tables, the (alphap,betap)
pair is defined such that:
p[k] = (k - alphap)/(N + 1 - alphap - betap)
References
- Makkonen L. and Pajari M. (2014): Defining sample quantiles by the true rank probability, Journal of Probability and Statistics, vol. 2014, Article ID 326579, doi:10.1155/2014/326579
- Hyndman R.J. and Fan Y. (1996): Sample quantiles in statistical packages, The American Statistician, 50(4):361-365, doi:10.2307/2684934
- Cunnane C. (1978): Unbiased plotting positions: a review, Journal of Hydrology, 37(3-4):205-222, doi:10.1016/0022-1694(78)90017-3.
- Barnett V. (1975): Probability plotting methods and order statistics, Journal of the Royal Statistical Society. Series C (Applied Statistics), 24(1):95-108, doi:10.2307/2346708 .
- Filliben J.J. (1975): The probability plot correlation coefficient test for normality, Technometrics, 17(1):111-117, doi:10.2307/1268008.