PING 0.9
Statistical data handling and processing in production environment
Create dummy variables in a dataset, i.e. variables with labels used to describe membership in a category with binary coding.
- `idsn` : a dataset reference;
- `var` : list of variables to be "dummied";
- `prefix` : (option) prefix(es) used to create the names of the dummy variables; you can give one or more strings, in an order corresponding to the `var` variables; `prefix=_VARNAME_`, which uses the name of the corresponding variable followed by an underscore, and `prefix=_BLANK_`, which makes the prefix a null string (similar to specifying a null string in the macro argument), are also accepted; default: `prefix=D_`;
- `name` : (option) if `name=_VAL_`, the dummy variables are named by appending the value of the `var` variables to the prefix; otherwise, they are named by appending numbers 1, 2, ... to the prefix; note that the resulting name must be 8 characters or fewer; default: `name=_VAL_`;
- `base` : (option) level of the baseline category, which is given values of 0 on all the dummy variables; you can give one or more strings, in an order corresponding to the `var` variables; `base=_FIRST_` or `base=_LOW_` makes the lowest value of the variable the baseline group; `base=_LAST_` or `base=_HIGH_` makes the highest value the baseline group; otherwise, specify `base=<value>` to make a different value the baseline group; for a character variable, the value must be enclosed in quotes, e.g. `base='M'`; default: `base=_LAST_`;
- `format` : (option) user formats may be used for two purposes: [...] `var` list;
- `fullrank` : (option) boolean flag (yes/no), set to yes to indicate that the indicator for the base category is eliminated; default: `fullrank=yes`;
- `ilib` : (option) name of the input library; by default: empty, i.e. WORK is used;
- `odsn` : (option) name of the output dataset; if not specified, the new variables are appended to the input dataset `idsn`;
- `olib` : (option) name of the output library; by default: empty, i.e. WORK is also used.

With the input data set:
| y | group | sex |
|---|---|---|
| 10 | A | M |
| 12 | A | F |
| 13 | A | M |
| 18 | B | M |
| 19 | B | M |
| 16 | C | F |
| 21 | C | M |
| 19 | C | F |
the macro statement applied to the variable group, with all other options left at their defaults, produces two new variables, D_A and D_B, in the table test:
| y | group | sex | D_A | D_B |
|---|---|---|---|---|
| 10 | A | M | 1 | 0 |
| 12 | A | F | 1 | 0 |
| 13 | A | M | 1 | 0 |
| 18 | B | M | 0 | 1 |
| 19 | B | M | 0 | 1 |
| 16 | C | F | 0 | 0 |
| 21 | C | M | 0 | 0 |
| 19 | C | F | 0 | 0 |
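since group C is the baseline category (corresponding to `base=_LAST_`).

With the same input dataset, a second macro statement (the resulting names are consistent with a user format applied to sex under a blank prefix, and with `prefix=_VARNAME_` for group) produces a dummy for sex named FEMALE, and two dummies for group: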
| y | group | sex | FEMALE | GROUP_A | GROUP_B |
|---|---|---|---|---|---|
| 10 | A | M | 0 | 1 | 0 |
| 12 | A | F | 1 | 1 | 0 |
| 13 | A | M | 0 | 1 | 0 |
| 18 | B | M | 0 | 0 | 1 |
| 19 | B | M | 0 | 0 | 1 |
| 16 | C | F | 1 | 0 | 0 |
| 21 | C | M | 0 | 0 | 0 |
| 19 | C | F | 1 | 0 | 0 |
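
The first example above can also be reproduced by hand, which makes the coding rule explicit. The sketch below is plain SAS rather than the macro itself: it rebuilds the example table `test` and derives `D_A` and `D_B` in a data step. The output dataset name `test_manual` is hypothetical, and the macro call shown in the comment is an assumed form based on the argument list (positional `idsn` and `var`), not one confirmed by this page.

```sas
/* Recreate the example input table (values taken from the text). */
data test;
   input y group $ sex $;
   datalines;
10 A M
12 A F
13 A M
18 B M
19 B M
16 C F
21 C M
19 C F
;
run;

/* Assumed macro call, shown for orientation only:
      %var_dummy(test, group);
   The data step below is a hand-written equivalent of its result. */
data test_manual;   /* hypothetical output name */
   set test;
   /* A comparison evaluates to 1 (true) or 0 (false) in SAS. */
   D_A = (group = 'A');
   D_B = (group = 'B');
   /* No indicator for C: it is the baseline, so both D_A and D_B are 0. */
run;

proc print data=test_manual noobs;
run;
```

Because C is the highest level of group and `base=_LAST_` is the default, rows with group C carry 0 in both indicators.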
`%var_dummy` is a wrapper around M. Friendly's original `%dummy` macro. The original source code (no license, no disclaimer) is available at http://www.medicine.mcgill.ca/epidemiology/joseph/pbelisle/multitranspose.html. See also the resources available at DataVis.ca.

Given a character or discrete numerical variable, the `%var_dummy` macro creates dummy (0/1) variables to represent the levels of the original variable. If the original variable has c levels, then (c-1) new variables are produced (or c variables, if `fullrank=no`).
When the original variable is missing, all dummy variables will be missing (V7+ only).
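
The level count and the missing-value rule can be illustrated in the same hand-written way. The following is a minimal sketch in plain SAS, not the macro's implementation: the datasets `demo` and `demo_dummy` are hypothetical, and `D_C` stands for the extra indicator that appears when the indicator for the baseline category is not eliminated.

```sas
/* Hypothetical input: three levels (A, B, C) plus one missing value.
   In list input, a single period is read as a missing character value. */
data demo;
   input y group $;
   datalines;
10 A
18 B
16 C
20 .
;
run;

data demo_dummy;
   set demo;
   /* c = 3 levels: D_A and D_B are the (c-1) indicators of the default coding. */
   D_A = (group = 'A');
   D_B = (group = 'B');
   /* Third indicator, present only when the baseline indicator is kept. */
   D_C = (group = 'C');
   /* A missing source value makes every dummy missing instead of 0. */
   if missing(group) then call missing(D_A, D_B, D_C);
run;

proc print data=demo_dummy noobs;
run;
```

Dropping `D_C` from this output gives the (c-1)-variable coding, in which the C row is identified by zeros in all remaining indicators, while the row with the missing group stays missing throughout.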
http://www.math.yorku.ca/SCS/sasmac/dummy.html