![]() |
PING
0.9
Statistical data handling and processing in production environment
|
Return the observations of a given variable in a dataset into a formatted (e.g., parentheses-enclosed, comma-separated and/or quote-enhanced) list of strings.
dsn : a dataset reference;var : either a field name, or the position of a field name in, in dsn, whose values (observations) will be converted into a list;by : (option) a variable to further split the dataset into subsets (using %ds_split), one for each value of by prior to extract the observations; the final list (see _varclst_) is obtained as the concatenation of the different extractions obtained for each subset, in the order of appearance of by values in the dataset dsn; by must differ from var; default: empty, i.e. by is not used;distinct : (option) boolean flag (yes/no) set to return in the list only distinct values from var variable; in practice, runs a SQL SELECT DISTINCT process prior to the values' extraction; default: no, i.e. all values are returned;na_rm : (option) boolean flag (yes/no) set to remove missing (NA) values from the observations; default: na_rm=yes, therefore all missing (. or ' ') values will be discarded in the output list;mark, sep : (option) characters/strings used respectively as a "quote" and a separator in the input list; default: mark=str(%"), and" sep= %quote(,), i.e. the input clist is a comma-separated list of quote-enhanced items; see %clist_unquote for further details; note in particular the use of mark=_EMPTY_ to actually set mark=%quote();lib : (option) output library; default (not passed or ' '): lib is set to WORK._varclst_ : name of the macro variable used to store the output formatted list, i.e. the list of (comma-separated) main observations in between quotes.
Let us consider the test dataset #32 in WORK.dsn:
| geo | value |
|---|---|
| BE | 0 |
| AT | 0.1 |
| BG | 0.2 |
| LU | 0.3 |
| FR | 0.4 |
| IT | 0.5 |
then running the macro:
will both return: ctry=("BE","AT","BG","LU","FR","IT"), while:
will return: val=("0","0.1","0.2","0.3","0.4","0.5"). Let us know consider the table _dstest3:
| color | value |
|---|---|
| blue | 1 |
| blue | 2 |
| blue | 3 |
| green | 3 |
| green | 4 |
| red | 2 |
| red | 3 |
| red | 4 |
it is possible to retrieve the observations by variable using the following instructions:
which returns the list val1=(1 2 3,3 4,2 3 4) of value observations for distinct color observations (say it otherwise: the first sequence of numbers is the list of value observations for blue, ibid the second for green, ibid the third for red), and:
which returns the list val2=("blue","blue red","blue green red","green red") of color observations for distinct value observations.
Run macro %_example_var_to_clist for more examples.
The option by is not available for the macro %var_to_list which var_to_clist derives from.
%var_to_list, %clist_to_var, %var_info, %list_quote, %ds_split.