Filter a downloaded full raw dataset on the local computer when get_eurostat_data
could not provide the data because the dataset is too large for the REST API.
filter_raw_data(raw_data = NULL, filter_table = NULL, date_filter = FALSE)
raw_data | An input data.table dataset returned by a call to the get_eurostat_raw function |
filter_table | A data table with the values of the concepts or the time periods to be filtered, which can be generated by the create_filter_table function |
date_filter | A logical value. If TRUE, the filter table is applied to the time column of the raw_data. The default is FALSE, in which case the filters are applied to the other columns of the raw_data. |
a filtered data.table containing only the rows of raw_data
that fulfill the conditions in the filter_table
It is a sub-function used by get_eurostat_data
to filter data on the local computer when the direct response from the REST API did not provide data
because the dataset is too large (more than 30 thousand observations).
The filter_table
always contains at least two columns. If date_filter=TRUE,
the two columns should have the following names, and
the provided conditions are applied to the time column of the raw_data
data.table.
sd | Starting date to be included, where date is formatted as yyyy[-mm][-dd] (the month and day are optional) |
ed | End date of the period to be included in the dataset formatted as yyyy[-mm][-dd] (the month and day are optional) |
If date_filter=FALSE,
the columns should have the following names:
concept | A concept name, which is a column name in the raw_data data.table |
code | A possible code under the given concept, which is a value in the column of the raw_data data.table defined by the concept |
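To make the two filter table layouts concrete, the following is a minimal sketch that builds them by hand with data.table and applies the concept/code variant to a toy dataset. The toy values are hypothetical, and the filtering loop is only an illustration of the intended result under the assumption that codes listed under different concepts must all match; it is not the implementation used inside filter_raw_data.

```r
library(data.table)

# Toy stand-in for raw_data (hypothetical values, columns named like a raw Eurostat table)
raw <- data.table(
  unit   = c("PTP_RT", "TIME_SP", "TIME_SP", "TIME_SP"),
  sex    = c("F", "T", "T", "M"),
  geo    = c("BE", "HU", "PL", "HU"),
  time   = c("2000", "2010", "2010", "2010"),
  values = c("100.0", "24:00", "24:00", "23:10")
)

# Filter table for date_filter=FALSE: one row per (concept, code) pair
ft <- data.table(
  concept = c("unit", "sex", "geo"),
  code    = c("TIME_SP", "T", "HU")
)

# Filter table layout for date_filter=TRUE: start and end dates as yyyy[-mm][-dd]
dft <- data.table(sd = "2005", ed = "2010-12-31")

# Assumed semantics: keep a row only if, for every concept in ft,
# its value in that column is one of the codes listed for the concept
keep <- rep(TRUE, nrow(raw))
for (cn in unique(ft$concept)) {
  keep <- keep & raw[[cn]] %in% ft$code[ft$concept == cn]
}
filtered <- raw[keep]
print(filtered)
```

In real use the filter table would normally come from create_filter_table rather than being typed by hand, so that labels such as country names can be resolved to codes through the DSD.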
# \donttest{
id<-"tus_00age"
if (!(grepl("amzn|-aws|-azure ",Sys.info()['release']))) options(timeout=2)
rd<-get_eurostat_raw(id)
dsd<-get_eurostat_dsd(id)
#> get_eurostat_dsd - There is a warning by the download of the DSD file. Run the same command with verbose=TRUE option to get more info on the issue.
#> get_eurostat_dsd - There is an error by the reading of the downloaded DSD file. Run the same command with verbose=TRUE option to get more info on the issue.
ft<-create_filter_table(c("TIME_SP","Hungary",'T'),FALSE,dsd)
#> The DSD is missing from the create_filter_table function.
filter_raw_data(rd,ft)
#> freq unit sex age acl00 geo time values
#> <char> <char> <char> <char> <char> <char> <char> <char>
#> 1: A PTP_RT F TOTAL AC0 BE 2000 100.0
#> 2: A PTP_RT F TOTAL AC0 BG 2000 100.0
#> 3: A PTP_RT F TOTAL AC0 DE 2000 100.0
#> 4: A PTP_RT F TOTAL AC0 EE 2000 100.0
#> 5: A PTP_RT F TOTAL AC0 ES 2000 100.0
#> ---
#> 98375: A TIME_SP T Y_GE65 TOTAL PL 2010 24:00
#> 98376: A TIME_SP T Y_GE65 TOTAL RO 2010 24:00
#> 98377: A TIME_SP T Y_GE65 TOTAL RS 2010 24:00
#> 98378: A TIME_SP T Y_GE65 TOTAL TR 2010 24:00
#> 98379: A TIME_SP T Y_GE65 TOTAL UK 2010 24:00
options(timeout=60)
# }