Creates a filter table from the filters and date_filter string parameters of the get_eurostat_data function. The table is used for filtering by query or, with the filter_raw_data function, on the local computer.
create_filter_table(
  filters,
  date_filter = FALSE,
  dsd = NULL,
  exact_match = TRUE,
  verbose = FALSE,
  ...
)
filters
a string, a character or numeric vector, or a named list containing words to filter by the different concepts, geographical locations or time values.
The words can be any word, Eurostat variable code, or value found in the Data Structure Definition (DSD), which can be retrieved with the search_eurostat_dsd function.
If a named list is used, the names of the list elements should be concepts from the DSD, and the provided values are used to filter the dataset for the given concept.
The default is NULL, in which case no filter table is created. To filter by time, see date_filter below.
When filtering for time values, the date must be given as a character string in the format yyyy[-mm][-dd], where the month and day parts are optional.
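As a quick illustration of the yyyy[-mm][-dd] format described above, the following base R sketch checks only the shape of a date string. The helper is_date_string is hypothetical and not part of restatapi, and it does not verify that the month or day values are valid calendar values.

```r
# Hypothetical helper (not part of restatapi): tests whether a string has the
# shape yyyy, yyyy-mm or yyyy-mm-dd. Only the layout is checked, not whether
# the month or day is a real calendar value.
is_date_string <- function(x) {
  grepl("^[0-9]{4}(-[0-9]{2}){0,2}$", x)
}

candidates <- c("2017", "2017-03", "2017-03-15", "20912", "2017/03")
is_date_string(candidates)
# TRUE TRUE TRUE FALSE FALSE
```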
date_filter
a logical value. If TRUE, the filter table is generated only for the time dimension. The default is FALSE, in which case a dsd should be provided, which will be searched for the values given in filters.
dsd
a table containing the DSD of a Eurostat dataset, which can be retrieved with the get_eurostat_dsd function.
exact_match
a logical value. If TRUE (the default), the strings provided in filters are matched exactly as given; if FALSE, they are matched as patterns in the DSD.
verbose
a logical value with the default FALSE, so detailed messages (for debugging) are not printed.
It can also be set with options(restatapi_verbose=TRUE).
...
further arguments for the search_eurostat_dsd function, e.g. ignore.case or name.
ignore.case has the default value FALSE, in which case the strings provided in filters are matched as is; otherwise the case of the letters is ignored.
If name=FALSE, the pattern(s) provided in the filters argument are searched only in the code column of the DSD, and the names of the codes are not searched.
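The interplay of exact_match, ignore.case and name described above can be mimicked on a toy table. The sketch below is only an approximation written in base R: toy_dsd and match_dsd are hypothetical names, the toy rows are made up for illustration, and the actual search in restatapi is done by search_eurostat_dsd, whose internals may differ.

```r
# Toy stand-in for a DSD with the concept/code/name layout (illustrative rows only).
toy_dsd <- data.frame(
  concept = c("FREQ", "FREQ", "TRA_MEAS"),
  code    = c("M", "Q", "PAS_BRD"),
  name    = c("Monthly", "Quarterly", "Passengers on board"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not the restatapi implementation): returns the rows of
# the toy DSD where the pattern is found under the given options.
match_dsd <- function(pattern, dsd, exact_match = TRUE,
                      ignore.case = FALSE, name = TRUE) {
  # With name = FALSE only the code column is searched.
  cols <- if (name) c("code", "name") else "code"
  hit <- rep(FALSE, nrow(dsd))
  for (col in cols) {
    values <- dsd[[col]]
    if (exact_match) {
      if (ignore.case) {
        hit <- hit | (tolower(values) == tolower(pattern))
      } else {
        hit <- hit | (values == pattern)
      }
    } else {
      # Pattern matching: the string may appear anywhere in the value.
      hit <- hit | grepl(pattern, values, ignore.case = ignore.case)
    }
  }
  dsd[hit, ]
}

match_dsd("quarterly", toy_dsd)                      # no rows: case differs
match_dsd("quarterly", toy_dsd, ignore.case = TRUE)  # the row with code "Q"
match_dsd("uarter", toy_dsd, exact_match = FALSE)    # the row with code "Q"
match_dsd("Monthly", toy_dsd, name = FALSE)          # no rows: names not searched
```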
a data.table containing in each row a distinct filtering condition to be applied to a raw Eurostat data table or used to generate a specific query.
If date_filter=TRUE, the output data table contains two columns with the following names:
sd | Starting date of the period to be included in the filtered dataset, where the date is formatted yyyy[-mm][-dd] |
ed | End date of the period to be included in the filtered dataset, where the date is formatted yyyy[-mm][-dd] |
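To make the meaning of the sd and ed columns concrete, here is a minimal base R sketch that applies such ranges to a vector of dates. It assumes that both bounds are inclusive and that an sd of "0" or an ed of "Inf" marks an open-ended range, as the example output at the end of this page suggests. The table ft, the dates in obs and the helper parse_bound are all hypothetical; in restatapi the actual filtering is done by filter_raw_data.

```r
# Illustrative filter table in the sd/ed layout (made-up values).
ft <- data.frame(
  sd = c("0", "2017-03-01"),
  ed = c("2000-07-01", "2017-03-31"),
  stringsAsFactors = FALSE
)

# Hypothetical observation dates to be filtered.
obs <- as.Date(c("1999-12-31", "2005-06-30", "2017-03-15", "2018-01-01"))

# Hypothetical helper: converts bounds to numbers, mapping the open-ended
# markers "0" and "Inf" to the supplied infinite value (an assumption).
parse_bound <- function(x, open_value) {
  out <- rep(open_value, length(x))
  closed <- !(x %in% c("0", "Inf"))
  out[closed] <- as.numeric(as.Date(x[closed]))
  out
}

lower <- parse_bound(ft$sd, -Inf)
upper <- parse_bound(ft$ed, Inf)

# Keep a date if it falls inside at least one [sd, ed] range.
keep <- vapply(as.numeric(obs),
               function(d) any(d >= lower & d <= upper),
               logical(1))
obs[keep]
# keeps 1999-12-31 and 2017-03-15
```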
If date_filter=FALSE, the output table has the following four columns:
pattern | Containing those parts of the filters string where the string part (pattern) was found in the dsd |
concept | The name of the concepts corresponding to the result in the code/name column where the pattern was found in the data structure definition |
code | The list of codes where the pattern was found, or the code of a name (description of the code) where the pattern appears |
name | The name (description of the code) which can be used as label for the code where the pattern was found, or the name (description of the code) of the code where the pattern appears |
This is a sub-function used in get_eurostat_data to generate the URL for the filters and date_filter given in that function. The output can also be used for filtering data on the local computer with the get_eurostat_raw and filter_raw_data functions, if the direct response from the REST API did not provide data because the data set was too large.
# \donttest{
if (!(grepl("amzn|-aws|-azure ",Sys.info()['release']))) options(timeout=2)
dsd<-get_eurostat_dsd("avia_par_me")
#> get_eurostat_dsd - There is a warning by the download of the DSD file. Run the same command with verbose=TRUE option to get more info on the issue.
#> get_eurostat_dsd - There is an error by the reading of the downloaded DSD file. Run the same command with verbose=TRUE option to get more info on the issue.
create_filter_table(c("KYIV","hu","Quarterly"),dsd=dsd,exact_match=FALSE,ignore.case=TRUE)
#> The DSD is missing from the create_filter_table function.
#> NULL
create_filter_table(c("KYIV","LHBP","Monthly"),dsd=dsd,exact_match=FALSE,name=FALSE)
#> The DSD is missing from the create_filter_table function.
#> NULL
create_filter_table(c("2017-03",
                      "2001-03:2005",
                      "<2000-07-01",
                      2012:2014,
                      "2018<",
                      20912,
                      "<3452<",
                      ":2018-04>",
                      "2<034v",
                      "2008:2013"),
                    date_filter=TRUE,
                    verbose=TRUE)
#> create_filter_table - filters class: character; size: 12; filters:2017-032001-03:2005<2000-07-012012201320142018<20912<3452<:2018-04>2<034v2008:2013
#> create_filter_table - filters: 2017-032001-03:2005<2000-07-012012201320142018<20912<3452<:2018-04>2<034v2008:2013; is numeric: FALSE; call parents: 46
#> create_filter_table - length df: 12 content df: 2017-03, 2001-03:2005, <2000-07-01, 2012, 2013, 2014, 2018<, 20912, <3452<, :2018-04>, 2<034v, 2008:2013
#> The date filter had invalid character (not 0-9, '-', '<', '>' or ':'). Those characters removed from the date filter.
#> create_filter_table - 2017-03, create_filter_table - 2001-03:2005, create_filter_table - <2000-07-01, create_filter_table - 2012, create_filter_table - 2013, create_filter_table - 2014, create_filter_table - 2018<, create_filter_table - 20912, create_filter_table - <3452<, create_filter_table - :2018-04>, create_filter_table - 2<034, create_filter_table - 2008:2013 date filter length: 12, nchar date_filter: 7,12,11,4,4,4,5,5,6,9,5,9
#> create_filter_table - Could not parse date filter: '20912' not in [<>]yyyy[-mm][-dd][<>] format or incorrect date value. The date filter is ignored.
#> Could not parse date filter: '<3452<'. This date filter is ignored.
#> Could not parse date filter: ':2018-04>'. This date filter is ignored.
#> create_filter_table - Could not parse date filter: '2<034' not in [<>]yyyy[-mm][-dd][<>] format or incorrect date value. The date filter is ignored.
#>            sd         ed
#>        <char>     <char>
#> 1:          0 2000-07-01
#> 2: 2001-03-01 2005-12-31
#> 3: 2008-01-01 2014-12-31
#> 4: 2017-03-01 2017-03-31
#> 5: 2018-01-01        Inf
options(timeout=60)
# }