PING  0.9
Statistical data handling and processing in production environment
silc_agg_compute

Legacy _"EUVALS"_-based code that calculates _(i)_ the EU aggregate of _(ii)_ an indicator _(iii)_ during a given year, possibly imputing data for missing countries from past years.

%silc_agg_compute(geo, time, idsn, odsn, ctrylst=,
max_yback=0, thr_min=0.7, thr_cum=0, agg_only=yes, force_Nwgh=NO,
ilib=WORK, olib=WORK, pdsn=META_POPULATIONxCOUNTRY, plib=G_PING_LIBCFG);

Arguments

  • geo : a given geographical area, e.g. EU28, EA, ...;
  • time : year of interest;
  • idsn : name of the dataset storing the indicator for which an aggregated value is estimated over the geo area and during the time year;
  • ctrylst : (option) list of (blank-separated, no quote) strings representing the ISO-codes of all the countries supposed to belong to geo; when not provided, it is automatically determined from geo and time (see macro %zone_to_ctry);
  • max_yback : (option) number of years used for imputation of missing data; it tells how far to look backward in time, i.e. consider the max_yback years prior to the estimated year; max_yback can also be set to _ALL_ so as to take all available data from the input dataset, whatever the year considered: in that case, the other argument(s) normally used for building the list of countries (see below: thr_min) are ignored; default: max_yback=0, i.e. only data available for the current year shall be considered;
  • thr_min : (option) value (in range [0,1]) of the threshold used to test whether currently (i.e. for the year time under investigation): available population [time] / global population [time] >= thr_min ? default: thr_min=0.7, i.e. the available population should be at least 70% of the global population of the geo area;
  • thr_cum : (option) value (in range [0,1]) of the threshold used to test the cumulated available population, i.e. whether: available population [time-max_yback,time] / global population [time] >= thr_cum ? default: thr_cum=0, i.e. there is no further test on the cumulated population once the thr_min test on currently available population is passed;
  • grpdim : (option) list (blank separated, no comma) of dimensions used by the indicator; if not set (default), it is retrieved automatically from the input table using %ds_contents and considering the standard format of EU-SILC tables (see also %silc_ind_create);
  • agg_only : (option) boolean flag (yes/no) set to keep in the output table the aggregate geo only; when set to no, then all data used for the aggregate estimation are kept in the output table odsn (see below); default: agg_only=yes, i.e. only the aggregate will be stored in odsn;
  • flag : (option) purpose not documented;
  • mode : (option) flag (char) setting the mode of data output; it is either UPDATE (e.g., for primary RDB indicators, default) or INSERT (e.g., for secondary RDB2 indicators);
  • force_Nwgh : (option) additional boolean flag (yes/no) set when an additional variable nwgh (representing the weighted sample) is present in the output dataset; used in EUvals, although this option is not foreseen in the original EUvals implementation; default: force_Nwgh=no;
  • pdsn : (option) name of the dataset storing total populations per country; default: META_POPULATIONxCOUNTRY;
  • plib : (option) name of the library storing the population dataset pdsn; default: plib is associated with the folder G_PING_LIBCFG commonly used to store this file;
  • ilib : (option) input dataset library; default (not passed or ' '): ilib=WORK.
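
As an illustration of how the imputation arguments combine, the call below is a sketch only. It assumes that the PING macros are loaded and that a library rdb holds an indicator table PEPS01 (the setup used in the Notes); the output name AGG_PEPS01 and the value thr_cum=0.9 are hypothetical choices made for the example.

```sas
/* Sketch under stated assumptions: library rdb and table PEPS01 exist;
   AGG_PEPS01 is a hypothetical output name.
   - max_yback=2 : consider the 2 years prior to 2016 for missing countries
   - thr_min=0.7 : the population available for 2016 must reach at least
                   70% of the EU28 population (documented default)
   - thr_cum=0.9 : hypothetical value; the cumulated available population
                   must reach at least 90% of the EU28 population
   - agg_only=no : keep in the output all data used for the estimation */
%silc_agg_compute(EU28, 2016, PEPS01, AGG_PEPS01,
                  max_yback=2, thr_min=0.7, thr_cum=0.9, agg_only=no,
                  ilib=rdb, olib=WORK);
```

With agg_only=no, the output table is expected to contain the country data alongside the EU28 aggregate, which can help when checking which values were imputed.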

Returns

  • odsn : (generic) name of the output datasets; two tables are actually created: the table odsn stores all the calculations with the aggregated indicator;
  • CTRY_&odsn : a second table is also created, which stores, for each country, the year of extraction of the data used for the calculation of the aggregate in year time; for instance, for an aggregate calculated at time=2015, where the latest available data are from 2013 for BG, 2014 for CY, 2012 for DE, 2014 for ES, etc., this table will look like:
    geo time
    AT 2015
    BE 2015
    BG 2013
    CY 2014
    CZ 2015
    DE 2012
    DK 2015
    EE 2015
    EL 2015
    ES 2014
    .. ....
  • olib : (option) output dataset library; default (not passed or ' '): olib=WORK.
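
To see which countries were imputed from an earlier year, the CTRY_&odsn table can be queried directly. The following is a minimal sketch, assuming a previous call with the hypothetical output name AGG_PEPS01 for year 2015, that the companion table is written to WORK (i.e. olib=WORK), and that time is stored as a numeric variable.

```sas
/* Sketch under stated assumptions: odsn=AGG_PEPS01 (hypothetical name),
   companion table located in WORK, variables geo and time as in the
   layout shown above, time stored as a number. */
%let odsn = AGG_PEPS01;
%let year = 2015;

PROC SQL;
    /* list the countries whose data come from a year before &year */
    SELECT geo, time
    FROM WORK.CTRY_&odsn
    WHERE time < &year
    ORDER BY time, geo;
QUIT;
```

For the table shown above, such a query would be expected to list BG, CY, DE and ES, among others.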

Example

Run macro %_example_silc_agg_compute.

Notes

  1. The computed aggregate is not inserted into the input dataset idsn but into the output dataset odsn passed as an argument. If you want to actually update the input dataset, you need to do so explicitly. For instance, say you want to calculate the 2016 EU28 aggregate of the PEPS01 indicator from the so-called rdb library:
%silc_agg_compute(EU28, 2016, PEPS01, PEPS01, ilib=rdb, olib=WORK);
DATA rdb.PEPS01;
    SET rdb.PEPS01(WHERE=(not(time=2016 and geo="EU28")))
        WORK.PEPS01;
run;
%work_clean(PEPS01);
  2. For that reason, the input dataset (idsn in ilib) and the output dataset (odsn in olib) must be different!
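
The constraint on idsn and odsn can be checked before the macro is called. The wrapper below is a sketch, not part of PING: the name agg_compute_checked is hypothetical, and it assumes that "different" means that the library-qualified names (ilib.idsn and olib.odsn) must not coincide.

```sas
/* Hypothetical wrapper (not part of PING): refuses to run when the input
   and output datasets resolve to the same library-qualified name. */
%macro agg_compute_checked(geo, time, idsn, odsn, ilib=WORK, olib=WORK);
    %if %upcase(&ilib..&idsn) = %upcase(&olib..&odsn) %then %do;
        %put ERROR: input &ilib..&idsn and output &olib..&odsn must be different.;
        %return;
    %end;
    /* names differ: delegate to the documented macro */
    %silc_agg_compute(&geo, &time, &idsn, &odsn, ilib=&ilib, olib=&olib);
%mend agg_compute_checked;

/* usage: same table name, but different libraries, so the check passes */
%agg_compute_checked(EU28, 2016, PEPS01, PEPS01, ilib=rdb, olib=WORK);
```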

References

  1. World Bank aggregation rules.
  2. Eurostat geography glossary.

See also

%silc_EUvals, %silc_agg_process, %silc_agg_list, %ctry_select, %zone_to_ctry, %var_to_list, %ds_contents.