PING 0.9: Statistical data handling and processing in production environment
Check how many observations (rows) of a dataset verify a given condition.
- dsn : a dataset, for which the condition has to be verified;
- where : (option) SAS expression used to further refine the selection (WHERE option); should be passed with %str; default: empty;
- pct : (option) a boolean flag (yes/no) set to return the result as a percentage of the total observations in dsn that verify the condition passed in where; default: pct=yes, i.e. the result is returned as a percentage in [0,100] of the total number of observations;
- distinct : (option) boolean flag (yes/no) set to count only distinct values; in practice, runs a SQL SELECT DISTINCT process instead of a simple SELECT; default: no, i.e. all values are counted;
- lib : (option) the library in which the dataset dsn is stored.

_ans_ : name of the macro variable used to store the (quantitative) output of the test, which is, depending on the value of the flag pct:
- n, the number of observations that verify the condition, when pct=no;
- n/N expressed as a percentage (i.e. 100*n/N), where N is the total number of observations in the dataset dsn and n is as above, when pct=yes;

hence the null result corresponds to the situation n=0, where no observation in the dataset verifies the input condition.
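The two possible outputs can be illustrated with plain PROC SQL. This is only a minimal sketch of the quantities n and N described above, not the macro's actual implementation; the dataset name work.demo, the variable i, the condition i le 400 and the macro variable names are all assumptions made for the illustration.

```sas
/* Hypothetical test data: 1000 rows with i = 1..1000 (names are assumptions). */
data work.demo;
    do i = 1 to 1000;
        output;
    end;
run;

proc sql noprint;
    /* N: total number of observations in the dataset */
    select count(*) into :n_total trimmed from work.demo;
    /* n: number of observations that verify the condition */
    select count(*) into :n_match trimmed from work.demo where i le 400;
quit;

/* pct=no corresponds to n, pct=yes corresponds to 100*n/N */
%let ans_count = &n_match;
%let ans_pct   = %sysevalf(100 * &n_match / &n_total);
%put ans_count=&ans_count ans_pct=&ans_pct;
```

With this test data the log should show ans_count=400 and ans_pct=40. Note that the two macro variables are deliberately not named n and N: macro variable names are case-insensitive in SAS, so the second would overwrite the first.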
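Let's perform some tests on the values of test dataset #1000 (with 1000 observations sequentially enumerated). With the default pct=yes, a condition that no observation verifies returns ans=0, a condition that every observation verifies returns ans=100, and a condition that 40% of the observations verify returns ans=40.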
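Calls producing results like these might look as follows. This is a hedged sketch only: it assumes that the test dataset is available in WORK as _dstest1000 with a numeric variable i running from 1 to 1000, and that %obs_count takes dsn as a positional parameter and the remaining arguments as keywords. None of these details is confirmed by the text above, and the three conditions are chosen purely for illustration.

```sas
/* Assumptions: WORK._dstest1000 exists with i = 1..1000, and %obs_count
   accepts dsn positionally and where=, _ans_=, pct= as keyword parameters. */
%let ans=;

/* no observation has i <= 0: expected to give ans=0 */
%obs_count(_dstest1000, where=%str(i le 0), _ans_=ans, pct=yes);
%put ans=&ans;

/* every observation has i > 0: expected to give ans=100 */
%obs_count(_dstest1000, where=%str(i gt 0), _ans_=ans, pct=yes);
%put ans=&ans;

/* 400 of the 1000 observations have i <= 400: expected to give ans=40 */
%obs_count(_dstest1000, where=%str(i le 400), _ans_=ans, pct=yes);
%put ans=&ans;
```

Wrapping each condition in %str keeps the SAS expression from being interpreted by the macro processor before it reaches the WHERE clause.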
Run %_example_obs_count for more examples.
- The result returned with pct=yes is relative to the precision of your machine. In practice, for tables with more than 1E9 observations where all but 1 verify the condition, the percentage calculated may still be equal to 100 (instead of a value <100). In that case, it is preferable to set the flag pct to no (see %_example_obs_count).
- The where condition should be passed using %str (or %quote) so as to express a condition.

Gupta, S. (2006): "WHERE vs. IF statements: Knowing the difference in how and when to apply".