PING  0.9
Statistical data handling and processing in production environment
ds_transpose

Tanspose many variables in a multiple-row-per-subject table to a one-row-per-subject table

%ds_transpose(idsn, odsn, var=, by=, pivot=, copy=, missing=no, numeric=no, ilib=WORK, olib=WORK, llib=WORK);

Arguments

  • idsn : name of the input dataset;
  • var : lists of variables to be transposed;
  • by : name of the variable(s) that identifies(y) a subject;
  • pivot : (option) name of the variable from input dataset for which each row value should lead to series of variables (one series per variable listed in var, above) in output dataset; when passed, there should be only one variable named in pivot argument; when empty, a common PROC TRANSPOSE is applied;
  • copy : (option) list of variables that occur repeatedly with each observation for a subject and will be copied to the resulting dataset; default: not taken into account;
  • missing : (option) boolean flag (yes/no) set to keep (no) or drop (yes) the observations that correspond to a missing value in pivot from the input dataset before transposing; default: missing=no;
  • numeric : (option) boolean flag (yes/no) set to replace the observed character string value for variable used as pivot (when it is indeed a string variable) by numbers as suffixes in transposed variable names or not; default: numeric=no (i.e. character string values observed in pivot variable will be used as suffixes in new variable names); this is relevant only when pivot variable is literal;
  • ilib : (option) name of the input library; by default: empty, i.e. WORK is used;
  • olib : (option) name of the output library; by default: empty, i.e. the value of ilib will be used;
  • llib : (option) name of the library where value labels are defined (if any variables of interest in the input dataset were categorical/attached value labels); by default: empty, i.e. WORK is used.

Returns

  • odsn : name of the output dataset.

Examples

We provide here the examples already used in the original code (see note below).

Given the table dsn in WORKing library ("example1"):

centre subjectno gender visit sbp wt
1 1 female A 121.667 75.4
1 1 female baseline120 75
1 1 female week 1 125 75.5
1 1 female week 4 120 75.7
1 2 male A 142.5 71.5
1 2 male baseline140 70
1 2 male week 1 145 73
2 1 female A 153.333 90.6667
2 1 female baseline155 90
2 1 female week 1 150 90.8
2 1 female week 4 155 91.2

then running:

%ds_transpose(dsn, out, var=sbp wt, by=centre subjectno, pivot=visit);

will set the output table out to:

centresubjectnosbpA wtA sbpbaselinewtbaselinesbpweek_1wtweek_1sbpweek_4wtweek_4
1 1 121.66775.4 120 75 125 75.5 120 75.7
1 2 142.5 71.5 140 70 145 73 . .
2 1 153.33390.6667155 90 150 90.8 155 91.2

and running:

%ds_transpose(dsn, out, var=sbp wt, by=centre subjectno, pivot=visit, copy=gender);

will set the output table out to:

vcentresubjectnosbpA wtA sbpbaselinewtbaselinesbpweek_1wtweek_1sbpweek_4wtweek_4gender
1 1 121.66775.4 120 75 125 75.5 120 75.7 female
1 2 142.5 71.5 140 70 145 73 . . male
2 1 153.33390.6667155 90 150 90.8 155 91.2 female

Run macro %_example_ds_transpose for examples.

Notes

  1. The macro %ds_transpose is a wrapper to L. Joseph's original %MultiTranspose macro. Original source code (no license, no disclaimer) is available at http://www.medicine.mcgill.ca/epidemiology/joseph/pbelisle/multitranspose.html.
  2. This macro transposes multiple variables in a dataset as if multiple calls to PROC TRANSPOSE were performed. Indeed, it is useful when data need to be restructured from a multiple-row-per-subject structure into a one-row-per-subject structure. While when only one variable needs to be transposed, PROC TRANSPOSE can perform the task directly, the same operation may be time consuming when two or more variables need to be transposed, since it is then necessary to transpose each variable separately and then merge the transposed datasets.
  3. The following conditions need to be satisfied:
    • a column used as a pivot cannot have any duplicate values for a given series of values found in by variables,
    • copy variables have the same values within each series of by values.
  4. Use the following options:
    • numeric=yes to use numbers as suffixes in transposed variable names rather than characters,
    • missing=yes if you'd rather not drop observations where variable defined as pivot has a missing value.

References

  1. Stuelpner, J. (2006): "The TRANPOSE procedure or how to turn it Around".
  2. Zdeb, M. (2006): "An introduction to reshaping (TRANSPOSE) and combining (MATCH-MERGE) SAS datasets".
  3. Conway, T. (2008): "It's a bird, it's a plane, it's SQL TRANSPOSE!".
  4. Zirbel, D. (2009): "Learn the basics of PROC TRANSPOSE".
  5. Tabachneck, A.S., Shan, X.K., Virgile, R. and Whitehurst, J. (2013) : "A better way to flip (transpose) a SAS dataset".

See also

PROC TRANSPOSE, MultiTranspose, MAKEWIDE/MAKELONG.