PING
0.9
Statistical data handling and processing in production environment
Compute empirical quantiles of a variable with sample data corresponding to given probabilities.
var
: data whose sample quantiles are estimated; when this is passed as the name of a variable in a dataset, the parameter idsn (see below) should be set;
weights
: (option) name of the variable containing the weights, for the case where the quantiles have to be computed on survey data. Please note that, so far, only the methods available in PROC UNIVARIATE are supported.
probs
: (option) list of probabilities with values in [0,1]; the smallest observation corresponds to a probability of 0 and the largest to a probability of 1; in the case method=INHERIT (see below), these values are multiplied by 100 in order to be used by PROC UNIVARIATE; default: probs=0 0.25 0.5 0.75 1, so as to match the default values seq(0, 1, 0.25) used in R quantile;
type
: (option) an integer between 1 and 11 selecting the quantile algorithm to be used among those detailed below; types 1 to 9 are the nine algorithms discussed in Hyndman and Fan's article (see references);

type | description | PCTLDEF |
---|---|---|
1 | inverted empirical CDF | 3 |
2 | inverted empirical CDF with averaging at discontinuities | 5 |
3 | observation numbered closest to qN (piecewise linear function) | 2 |
4 | linear interpolation of the empirical CDF | 1 |
5 | Hazen's model (piecewise linear function) | n.a. |
6 | Weibull quantile | 4 |
7 | interpolation points divide sample range into n-1 intervals | n.a. |
8 | unbiased median (regardless of the distribution) | n.a. |
9 | approximate unbiased estimate for a normal distribution | n.a. |
10 | Cunnane's definition (approximately unbiased) | n.a. |
11 | Filliben's estimate | n.a. |

default: type=7 (likewise R quantile);
method
: (option) choice of the implementation of the quantile estimation method; this can be either:
INHERIT
for an estimation based on the PROC UNIVARIATE procedure already implemented in SAS,
DIRECT
for a canonical implementation based on the direct transcription of the various quantile estimation algorithms (see below) into SAS language;
note that the former (method=INHERIT) is incompatible with any type other than 1, 2, 3, 4 or 6, since PROC UNIVARIATE does not support the other quantile definitions (see table above); for any other type (5, or 7 to 11), method is therefore set to DIRECT; default: method=DIRECT;
idsn
: (option) when input data is passed as a variable name, idsn represents the dataset in which to look for the variable var (see above);
ilib
: (option) name of the input library; by default: empty, i.e. WORK is used if idsn is set;
olib
: (option) name of the output library (see names below); by default: empty, i.e. WORK is also used when odsn is set;
na_rm
: (obsolete) logical; if true (yes), any NA and NaN values are removed from var before the quantiles are computed.

Return estimates of underlying distribution quantiles based on one or two order statistics from the supplied elements in var at probabilities in probs, following the quantile estimation algorithm defined by type. The output sample quantiles are stored either in a list or as a table, through:
_quantiles_
: (option) name of the output numeric list where quantiles are stored in increasing probs order; incompatible with parameters odsn and names below;
odsn, names
: (option) respective names of the output dataset and variable where quantiles are stored; if both odsn and names are set, the quantiles are saved in the names variable of the odsn dataset; if just odsn is set, then they are stored in a variable named QUANT; if instead only names is set, then the dataset will also be named after names.

See also: %io_quantile, UNIVARIATE, quantile (R), mquantiles (scipy), gsl_stats_quantile* (C).
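
The quantile definitions tabulated above can be illustrated outside SAS. The following is a minimal Python sketch, not the macro's actual code: it assumes that types 1 to 9 follow the same formulas as documented for R quantile, and it leaves out types 10 and 11. The names quantile, resolve_method, qtype, PCTLDEF and AB are hypothetical and chosen for this illustration only; resolve_method mirrors the documented fallback from method=INHERIT to DIRECT for types without a PCTLDEF equivalent.

```python
import math
import sys

# PCTLDEF equivalents copied from the table above; other types have none.
PCTLDEF = {1: 3, 2: 5, 3: 2, 4: 1, 6: 4}

# Plotting-position constants (a, b) for the continuous types 4 to 9, as
# documented for R quantile (assumed here to match the DIRECT method).
AB = {4: (0.0, 1.0), 5: (0.5, 0.5), 6: (0.0, 0.0),
      7: (1.0, 1.0), 8: (1 / 3, 1 / 3), 9: (3 / 8, 3 / 8)}

# Small tolerance guarding against rounding error in the position.
FUZZ = 4 * sys.float_info.epsilon


def resolve_method(qtype, method="DIRECT"):
    """Fall back to DIRECT when the type has no PCTLDEF equivalent."""
    if method == "INHERIT" and qtype not in PCTLDEF:
        return "DIRECT"
    return method


def quantile(values, probs=(0, 0.25, 0.5, 0.75, 1), qtype=7):
    """Sample quantiles of values at probs, for types 1 to 9 only."""
    if not 1 <= qtype <= 9:
        raise ValueError("this sketch only covers types 1 to 9")
    xs = sorted(values)
    n = len(xs)
    if n == 0:
        raise ValueError("no data")

    def order_stat(k):
        # k-th smallest value (1-based), clamped to the sample range
        return xs[min(max(k, 1), n) - 1]

    out = []
    for p in probs:
        if not 0 <= p <= 1:
            raise ValueError("probabilities must lie in [0, 1]")
        if qtype <= 3:
            # Discontinuous definitions: the weight h is 0, 0.5 or 1.
            pos = n * p - 0.5 if qtype == 3 else n * p
            j = math.floor(pos + FUZZ)
            above = pos > j
            if qtype == 1:
                h = 1.0 if above else 0.0
            elif qtype == 2:
                h = 1.0 if above else 0.5
            else:
                h = 1.0 if (pos != j or j % 2 == 1) else 0.0
        else:
            # Continuous definitions: linear interpolation between the
            # j-th and (j+1)-th order statistics.
            a, b = AB[qtype]
            pos = a + p * (n + 1 - a - b)
            j = math.floor(pos + FUZZ)
            h = pos - j
            if abs(h) < FUZZ:
                h = 0.0
        out.append((1 - h) * order_stat(j) + h * order_stat(j + 1))
    return out
```

For instance, quantile([1, 2, 3, 4], probs=[0.25], qtype=7) gives [1.75], which agrees with R's quantile(1:4, 0.25), and resolve_method(7, "INHERIT") returns "DIRECT" because type 7 has no PCTLDEF counterpart in the table.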