PING
0.9
Statistical data handling and processing in production environment
|
Check how many observations (rows) of a dataset verify a given condition.
dsn
: a dataset, for which the condition has to be verified;where
: (option) SAS expression used to further refine the selection (WHERE
option); should be passed with %str
; default: empty;pct
: (option) a boolean flag (yes/no
) set to return the result as a percentage of the total observations in dsn
that verify the condition cond
above; default: pct=yes
, i.e. result is returned as a percentage [0,100] of the total numbers of observations;distinct
: (option) boolean flag (yes/no
) set to count only distinct values; in practice, runs a SQL SELECT DISTINCT
process instead of a simple SELECT
; default: no
, i.e. all values are counted;lib
: (option) the library in which the dataset dsn
is stored._ans_
: name of the macro variable used to store the (quantitative) output of the test, which is, depending on the value of the flag pct
:
n
, the number of observations that verify the condition cond
when pct=yes
;n/N
, where N
is the total number of observations in the dataset dsn
, and n
is like above;hence the nul result corresponds to the situation n=0
. where no observation in the dataset verifies the input condition.
Let's perform some test on the values of test datatest #1000 (with 1000 observations sequentially enumerated), e.g.:
returns ans=0
, while:
returns ans=100
, and:
returns ans=40
.
Run %_example_obs_count
for more examples.
pct=yes
is relative to the precision of your machine. In practice, for tables with more than 1E9 observations where all but 1 verify the condition cond
, the percentage calculated may still be equal to 100 (instead of a value<100). In that case, it is preferred to set the flag pct
to no
(see %_example_obs_count
).%str
(or %quote
) so as to express a condition.Gupta, S. (2006): "WHERE vs. IF statements: Knowing the difference in how and when to apply".