PING
0.9
Statistical data handling and processing in production environment
Cast a given character variable into a numeric variable, where each category is assigned a sequential number according to how frequently it occurs in the character variable.
Arguments:
- idsn : input reference dataset, whose variable shall be cast;
- var : name of the character variable that should be cast, i.e. all categories in var will be converted into numbers;
- suff : (option) suffix to be added to the name of the cast variable; default: suff=_new, i.e. the cast version of the variable a in idsn will be named a_new;
- odsn : (option) name of the output dataset; default: odsn=idsn, so that the input dataset is in practice updated;
- ilib : (option) name of the input library; by default: empty, i.e. WORK is used.

Returns:
- odsn : output dataset (stored in the olib library), containing exactly the same data as idsn, plus an additional new variable (named by concatenating the original var name and suff) where all the categories of the variable defined by var are cast into a numeric variable;
- olib : (option) name of the output library; by default: empty, and the value of ilib is used.

Let us consider test dataset #31 in the WORK library:
geo | value | unit |
---|---|---|
BE | 0 | EUR |
AT | 0.1 | EUR |
BG | 0.2 | NAC |
LU | 0.3 | EUR |
FR | 0.4 | NAC |
IT | 0.5 | EUR |
then calling the macro on this dataset with var=unit (and the default suffix suff=_new) will return the updated dataset below:
geo | value | unit | unit_new |
---|---|---|---|
BE | 0 | EUR | 1 |
AT | 0.1 | EUR | 1 |
BG | 0.2 | NAC | 2 |
LU | 0.3 | EUR | 1 |
FR | 0.4 | NAC | 2 |
IT | 0.5 | EUR | 1 |
Run macro %_example_var_numcast for more examples.
The values in the new variable are assigned in sequential order, from the most frequent to the least frequent category in var.
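The numbering rule can be illustrated outside SAS with a minimal Python sketch. This is not the macro's implementation: the function name numcast is hypothetical, and the handling of ties (categories with equal frequency keep their order of first appearance) is an assumption, since the documentation does not say how ties are resolved.

```python
from collections import Counter


def numcast(rows, var, suff="_new"):
    """Return copies of rows with an extra column var+suff holding frequency ranks.

    Hypothetical helper for illustration only; it mirrors the documented
    behaviour (most frequent category gets 1, next gets 2, and so on).
    """
    counts = Counter(row[var] for row in rows)
    # most_common() sorts by decreasing count; categories with equal counts
    # stay in first-seen order (an assumption about tie-breaking).
    codes = {
        category: rank
        for rank, (category, _) in enumerate(counts.most_common(), start=1)
    }
    # Build new rows so that the input list is left unchanged.
    return [{**row, var + suff: codes[row[var]]} for row in rows]


# Same content as the test dataset shown in the example above.
rows = [
    {"geo": "BE", "value": 0.0, "unit": "EUR"},
    {"geo": "AT", "value": 0.1, "unit": "EUR"},
    {"geo": "BG", "value": 0.2, "unit": "NAC"},
    {"geo": "LU", "value": 0.3, "unit": "EUR"},
    {"geo": "FR", "value": 0.4, "unit": "NAC"},
    {"geo": "IT", "value": 0.5, "unit": "EUR"},
]

result = numcast(rows, "unit")
print([r["unit_new"] for r in result])  # [1, 1, 2, 1, 2, 1]
```

EUR occurs four times and NAC twice, so EUR is coded 1 and NAC is coded 2, which matches the unit_new column in the example output.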
Wright, W.L. (2007): "Creating a format from raw data or a SAS dataset".