Filters the downloaded full raw dataset on the local computer when get_eurostat_data could not return data because the dataset is too large for the REST API.

filter_raw_data(raw_data = NULL, filter_table = NULL, date_filter = FALSE)

Arguments

raw_data

an input data.table dataset resulting from a call to the get_eurostat_raw function

filter_table

a data table with the values of the concepts or time periods to be kept, which can be generated by the create_filter_table function

date_filter

a logical value. If TRUE, the filter table is applied to the time column of the raw_data. The default is FALSE, in which case the filters are applied to the other columns of the raw_data.

Value

a filtered data.table containing only the rows of raw_data that fulfil the conditions in the filter_table

Details

This is a sub-function used in get_eurostat_data to filter data on the local computer if the direct response from the REST API did not provide data because the dataset is too large (more than 30 thousand observations). The filter_table always contains at least two columns. If date_filter=TRUE, the two columns should have the following names, and the provided conditions are applied to the time column of the raw_data data.table:

sd    Starting date to be included, where date is formatted as yyyy[-mm][-dd] (the month and day are optional)
ed    End date of the period to be included in the dataset formatted as yyyy[-mm][-dd] (the month and day are optional)
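
The role of the sd and ed columns can be illustrated with a small base R sketch. It is not the package's implementation: the toy data, the helper in_period, and the assumption that rows are kept when their time value falls inside at least one [sd, ed] period are all hypothetical. It also compares plain year strings, which only works when the time values and the bounds share the same format.

```r
# Toy stand-in for a raw dataset (hypothetical values, base R only).
raw <- data.frame(
  geo    = c("HU", "HU", "PL", "PL"),
  time   = c("2000", "2010", "2000", "2010"),
  values = c("1.5", "2.5", "3.5", "4.5"),
  stringsAsFactors = FALSE
)

# Hypothetical date filter table: one row per period, columns sd and ed.
date_ft <- data.frame(sd = "2005", ed = "2010", stringsAsFactors = FALSE)

# Assumption: a row is kept if its time lies in at least one period.
# Same-length year strings compare correctly as text.
in_period <- function(tm) any(tm >= date_ft$sd & tm <= date_ft$ed)
keep <- vapply(raw$time, in_period, logical(1), USE.NAMES = FALSE)

filtered <- raw[keep, ]
print(filtered)
```

In this sketch only the two rows with time "2010" remain. With the package itself, the equivalent call would pass such a table as filter_table together with date_filter = TRUE.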

If date_filter=FALSE, the columns should have the following names:

concept    Concept name, which is a column name in the raw_data data.table
code       A possible code under the given concept, which is a value in the column of the raw_data data.table defined by the concept
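
A concept/code filter table can be pictured with the following base R sketch. It is an illustration under stated assumptions, not the package's code: the toy data are invented, the codes are assumed to be already resolved (for example "HU" rather than the label "Hungary"), and the sketch assumes that codes under the same concept are alternatives while different concepts must all match.

```r
# Toy stand-in for a raw dataset (hypothetical values, base R only).
raw <- data.frame(
  unit = c("PTP_RT", "TIME_SP", "TIME_SP", "TIME_SP"),
  sex  = c("F", "T", "T", "F"),
  geo  = c("BE", "HU", "PL", "HU"),
  time = c("2010", "2010", "2010", "2010"),
  stringsAsFactors = FALSE
)

# Hypothetical filter table: each row pairs a column name (concept)
# with one accepted value (code) in that column.
ft <- data.frame(
  concept = c("unit", "geo", "sex"),
  code    = c("TIME_SP", "HU", "T"),
  stringsAsFactors = FALSE
)

# Assumption: OR within a concept, AND across concepts.
keep <- rep(TRUE, nrow(raw))
for (cn in unique(ft$concept)) {
  keep <- keep & raw[[cn]] %in% ft$code[ft$concept == cn]
}

filtered <- raw[keep, ]
print(filtered)
```

Only the row with unit TIME_SP, geo HU and sex T survives here. Adding a second row for the same concept, such as concept "geo" with code "PL", would widen the selection under the stated assumption.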

Examples

# \donttest{
id<-"tus_00age"
if (!(grepl("amzn|-aws|-azure ",Sys.info()['release']))) options(timeout=2)
rd<-get_eurostat_raw(id)
dsd<-get_eurostat_dsd(id)
#> get_eurostat_dsd - There is a warning by the download of the DSD file. Run the same command with verbose=TRUE option to get more info on the issue.
#> get_eurostat_dsd - There is an error by the reading of the downloaded DSD file. Run the same command with verbose=TRUE option to get more info on the issue.
ft<-create_filter_table(c("TIME_SP","Hungary",'T'),FALSE,dsd)
#> The DSD is missing from the create_filter_table function.
filter_raw_data(rd,ft)
#>          freq    unit    sex    age  acl00    geo   time values
#>        <char>  <char> <char> <char> <char> <char> <char> <char>
#>     1:      A  PTP_RT      F  TOTAL    AC0     BE   2000  100.0
#>     2:      A  PTP_RT      F  TOTAL    AC0     BG   2000  100.0
#>     3:      A  PTP_RT      F  TOTAL    AC0     DE   2000  100.0
#>     4:      A  PTP_RT      F  TOTAL    AC0     EE   2000  100.0
#>     5:      A  PTP_RT      F  TOTAL    AC0     ES   2000  100.0
#>    ---                                                         
#> 98375:      A TIME_SP      T Y_GE65  TOTAL     PL   2010  24:00
#> 98376:      A TIME_SP      T Y_GE65  TOTAL     RO   2010  24:00
#> 98377:      A TIME_SP      T Y_GE65  TOTAL     RS   2010  24:00
#> 98378:      A TIME_SP      T Y_GE65  TOTAL     TR   2010  24:00
#> 98379:      A TIME_SP      T Y_GE65  TOTAL     UK   2010  24:00
options(timeout=60)
# }