NetworkQualityWarnings Configuration
To initialise and run the component, two configuration files are used: general_config.ini and network_quality_warnings.ini. In general_config.ini, all paths to the corresponding data objects shall be specified. Example:
[Paths.Silver]
...
network_syntactic_quality_metrics_by_column = ${Paths:silver_quality_metrics_dir}/network_syntactic_quality_metrics_by_column
network_syntactic_quality_warnings_log_table = ${Paths:silver_quality_warnings_dir}/network_syntactic_quality_warnings_log_table
network_syntactic_quality_warnings_line_plot_data = ${Paths:silver_quality_warnings_dir}/network_syntactic_quality_warnings_line_plot_data
network_syntactic_quality_warnings_pie_plot_data = ${Paths:silver_quality_warnings_dir}/network_syntactic_quality_warnings_pie_plot_data
...
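The `${Paths:...}` references in the example follow the syntax of Python's `configparser.ExtendedInterpolation`. The sketch below shows how such a reference would resolve, assuming the file is read with that interpolation mode. The `[Paths]` section contents and the directory `/data/silver/quality_metrics` are placeholders invented for illustration, not values from the project.

```python
from configparser import ConfigParser, ExtendedInterpolation

# Hypothetical excerpt of general_config.ini; the directory value is a placeholder.
GENERAL_CONFIG = """
[Paths]
silver_quality_metrics_dir = /data/silver/quality_metrics

[Paths.Silver]
network_syntactic_quality_metrics_by_column = ${Paths:silver_quality_metrics_dir}/network_syntactic_quality_metrics_by_column
"""

# ExtendedInterpolation resolves ${section:option} references across sections.
parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(GENERAL_CONFIG)

resolved = parser.get("Paths.Silver", "network_syntactic_quality_metrics_by_column")
print(resolved)
```

With this mode, each entry under `[Paths.Silver]` expands to a full path when it is read, so only the base directories need to change between environments.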
The expected parameters in network_quality_warnings.ini are as follows:
- date_of_study: string, in the format specified by date_format (e.g., 2023-01-01 for %Y-%m-%d). It is the first date for which data will be processed by the component. All dates between this one and the one specified in data_period_end will be processed (both inclusive).
- date_format: string, indicates the format expected in date_of_study. For example, use %Y-%m-%d for the usual "2023-01-01" format, separated by -.
- lookback_period: string, indicates the length of the lookback period used to compare the metrics of the date of study with past data volume and error rates. Three values are accepted: week, month, and quarter.
- thresholds: dictionary, indicating the different thresholds to be used for raising warnings.
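As a rough illustration of how the date parameters above fit together, the following sketch parses date_of_study with date_format and derives a lookback window. The function name `parse_study_settings` and the day counts in `LOOKBACK_DAYS` (7, 30, 90) are assumptions made for this example; the component may define the periods differently.

```python
from datetime import date, datetime, timedelta

# Assumed lengths, in days, of each accepted lookback_period value.
# These numbers are illustrative and not taken from the component.
LOOKBACK_DAYS = {"week": 7, "month": 30, "quarter": 90}


def parse_study_settings(
    date_of_study: str, date_format: str, lookback_period: str
) -> tuple[date, date, date]:
    """Return (study_date, lookback_start, lookback_end) for the given settings."""
    if lookback_period not in LOOKBACK_DAYS:
        raise ValueError(
            f"lookback_period must be one of {sorted(LOOKBACK_DAYS)}, got {lookback_period!r}"
        )
    # strptime raises ValueError if date_of_study does not match date_format.
    study_date = datetime.strptime(date_of_study, date_format).date()
    lookback_start = study_date - timedelta(days=LOOKBACK_DAYS[lookback_period])
    lookback_end = study_date - timedelta(days=1)
    return study_date, lookback_start, lookback_end


study, start, end = parse_study_settings("2023-01-08", "%Y-%m-%d", "week")
print(study, start, end)
```

Under these assumptions, a date_of_study of 2023-01-08 with a lookback_period of week is compared against the seven preceding days.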
Thresholds
The thresholds parameter is a dictionary of dictionaries that specifies the thresholds to be used for each type of warning. If one of the thresholds is missing, a default value is used instead. The default values are contained in multimno.core.constants.network_default_thresholds.NETWORK_DEFAULT_THRESHOLDS.
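The fallback to defaults can be pictured as a recursive merge of the user-supplied dictionary over the default one. The sketch below is one possible way to do this and is not taken from the component's code. `EXAMPLE_DEFAULTS` is a small stand-in that copies a few of the documented default values, and `merge_with_defaults` is a hypothetical helper name.

```python
from copy import deepcopy

# Stand-in for NETWORK_DEFAULT_THRESHOLDS, limited to two entries whose
# default values are stated in this document.
EXAMPLE_DEFAULTS = {
    "TOTAL_ERROR_RATE": {
        "OVER_AVERAGE": 30,
        "VARIABILITY": 2,
        "ABS_VALUE_UPPER_LIMIT": 20,
    },
    "Parsing_error_RATE": {
        "valid_date_start": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
    },
}


def merge_with_defaults(user: dict, defaults: dict) -> dict:
    """Return a copy of defaults with every key present in user overridden."""
    merged = deepcopy(defaults)
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Descend so that sibling thresholds keep their default values.
            merged[key] = merge_with_defaults(value, merged[key])
        else:
            merged[key] = value
    return merged


# Only one threshold is overridden; everything else falls back to the defaults.
thresholds = merge_with_defaults({"TOTAL_ERROR_RATE": {"VARIABILITY": 3}}, EXAMPLE_DEFAULTS)
print(thresholds["TOTAL_ERROR_RATE"])
```

A configuration can then list only the thresholds it wants to change and leave the rest out.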
The dictionary structure is as follows:
- "SIZE_RAW_DATA": refers to the size of the input data.
  - "OVER_AVERAGE": integer or float, the threshold for raising a warning when the number of rows in the input raw file exceeds its average over the lookback period by more than this percentage. By default, the value is 30.
  - "UNDER_AVERAGE": integer or float, the threshold for raising a warning when the number of rows in the input raw file falls below its average over the lookback period by more than this percentage. By default, the value is 30.
  - "VARIABILITY": integer or float, the threshold for raising a warning when the number of rows in the input raw file differs from its average over the lookback period by more than its standard deviation multiplied by this threshold. That is, this value is the number of standard deviations by which a value may differ from its average over the lookback period, either over or under the average. By default, the value is 2.
  - "ABS_VALUE_UPPER_LIMIT": integer or float, the threshold for raising a warning when the number of rows in the input raw file is greater than this value. By default, it is equal to the threshold calculated from the VARIABILITY parameter over the average.
  - "ABS_VALUE_LOWER_LIMIT": integer or float, the threshold for raising a warning when the number of rows in the input raw file is lower than this value. By default, it is equal to the threshold calculated from the VARIABILITY parameter under the average.
- "SIZE_CLEAN_DATA": refers to the size of the output data.
  - "OVER_AVERAGE": integer or float, the threshold for raising a warning when the number of rows in the output processed file exceeds its average over the lookback period by more than this percentage. By default, the value is 30.
  - "UNDER_AVERAGE": integer or float, the threshold for raising a warning when the number of rows in the output processed file falls below its average over the lookback period by more than this percentage. By default, the value is 30.
  - "VARIABILITY": integer or float, the threshold for raising a warning when the number of rows in the output processed file differs from its average over the lookback period by more than its standard deviation multiplied by this threshold. That is, this value is the number of standard deviations by which a value may differ from its average over the lookback period, either over or under the average. By default, the value is 2.
  - "ABS_VALUE_UPPER_LIMIT": integer or float, the threshold for raising a warning when the number of rows in the output processed file is greater than this value. By default, it is equal to the threshold calculated from the VARIABILITY parameter over the average.
  - "ABS_VALUE_LOWER_LIMIT": integer or float, the threshold for raising a warning when the number of rows in the output processed file is lower than this value. By default, it is equal to the threshold calculated from the VARIABILITY parameter under the average.
- "TOTAL_ERROR_RATE": refers to the percentage of rows preserved from the input file, i.e., the rows that passed the cleaning/check procedure.
  - "OVER_AVERAGE": integer or float, the threshold for raising a warning when the error rate exceeds its average over the lookback period by more than this percentage. By default, the value is 30.
  - "VARIABILITY": integer or float, the threshold for raising a warning when the error rate differs from its average over the lookback period by more than its standard deviation multiplied by this threshold. That is, this value is the number of standard deviations by which a value may differ from its average over the lookback period, only over the average. By default, the value is 2.
  - "ABS_VALUE_UPPER_LIMIT": integer or float, the threshold for raising a warning when the error rate is greater than this value. By default, the value is 20.
- "Missing_value_RATE": refers to the percentage of missing/null values of a given field in the input data. The value of this dictionary key is a dictionary whose keys are field names and whose values are dictionaries containing the thresholds for this error type and field.
  - Admitted keys (i.e., fields) are: cell_id, valid_date_start, latitude, longitude, altitude, antenna_height, directionality, azimuth_angle, elevation_angle, horizontal_beam_width, vertical_beam_width, power, frequency, technology, and cell_type.
  - Each key (i.e., field) has the following thresholds:
    - "AVERAGE": integer or float, the threshold for raising a warning when this error rate exceeds its average over the lookback period by more than this percentage. By default, the value is 30 for cell_id, latitude, and longitude, and 60 otherwise.
    - "VARIABILITY": integer or float, the threshold for raising a warning when this error rate differs from its average over the lookback period by more than its standard deviation multiplied by this threshold. That is, this value is the number of standard deviations by which a value may differ from its average over the lookback period, only over the average. By default, the value is 2 for cell_id, latitude, and longitude, and 3 otherwise.
    - "ABS_VALUE_UPPER_LIMIT": integer or float, the threshold for raising a warning when this error rate is greater than this value. By default, the value is 20 for cell_id, latitude, and longitude, and 50 otherwise.
- "Out_of_range_RATE": refers to the percentage of out-of-bounds, out-of-range, or invalid values of a given field in the input data. The value of this dictionary key is a dictionary whose keys are field names and whose values are dictionaries containing the thresholds for this error type and field.
  - Admitted keys (i.e., fields) are: "cell_id", "latitude", "longitude", "antenna_height", "directionality", "azimuth_angle", "elevation_angle", "horizontal_beam_width", "vertical_beam_width", "power", "frequency", "technology", and "cell_type". Exceptionally, the None value is also accepted, referring to the specific error where valid_date_end is a point in time earlier than valid_date_start.
  - Each key has the following thresholds:
    - "AVERAGE": integer or float, the threshold for raising a warning when this error rate exceeds its average over the lookback period by more than this percentage. By default, the value is 20 for cell_id, latitude, and longitude, and 60 otherwise.
    - "VARIABILITY": integer or float, the threshold for raising a warning when this error rate differs from its average over the lookback period by more than its standard deviation multiplied by this threshold. That is, this value is the number of standard deviations by which a value may differ from its average over the lookback period, only over the average. By default, the value is 2 for cell_id, latitude, and longitude, and 3 otherwise.
    - "ABS_VALUE_UPPER_LIMIT": integer or float, the threshold for raising a warning when this error rate is greater than this value. By default, the value is 20 for cell_id, latitude, and longitude, and 50 otherwise.
- "Parsing_error_RATE": refers to values that could not be parsed.
  - "valid_date_start":
    - "AVERAGE": integer or float, the threshold for raising a warning when this error rate exceeds its average over the lookback period by more than this percentage. By default, the value is 60.
    - "VARIABILITY": integer or float, the threshold for raising a warning when this error rate differs from its average over the lookback period by more than its standard deviation multiplied by this threshold. That is, this value is the number of standard deviations by which a value may differ from its average over the lookback period, only over the average. By default, the value is 3.
    - "ABS_VALUE_UPPER_LIMIT": integer or float, the threshold for raising a warning when this error rate is greater than this value. By default, the value is 50.
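To make the threshold semantics above concrete, here is a minimal sketch of how the size checks could be evaluated for one day against its lookback history. It is one reading of the descriptions, not the component's actual implementation. The function name `size_warnings`, the use of the sample standard deviation, and the sample row counts are all assumptions.

```python
from statistics import mean, stdev


def size_warnings(today: float, history: list[float], thresholds: dict) -> list[str]:
    """Return the names of the size thresholds that `today` violates."""
    avg = mean(history)
    sd = stdev(history)  # assumption: sample standard deviation over the lookback period
    band = thresholds["VARIABILITY"] * sd
    raised = []

    # Percentage difference with respect to the lookback average.
    if avg != 0:
        pct_diff = 100 * (today - avg) / avg
        if pct_diff > thresholds["OVER_AVERAGE"]:
            raised.append("OVER_AVERAGE")
        if -pct_diff > thresholds["UNDER_AVERAGE"]:
            raised.append("UNDER_AVERAGE")

    # Distance from the average, measured in standard deviations.
    if abs(today - avg) > band:
        raised.append("VARIABILITY")

    # Absolute limits fall back to the VARIABILITY band when set to None or omitted.
    upper = thresholds.get("ABS_VALUE_UPPER_LIMIT")
    lower = thresholds.get("ABS_VALUE_LOWER_LIMIT")
    upper = avg + band if upper is None else upper
    lower = avg - band if lower is None else lower
    if today > upper:
        raised.append("ABS_VALUE_UPPER_LIMIT")
    if today < lower:
        raised.append("ABS_VALUE_LOWER_LIMIT")
    return raised


# Hypothetical daily row counts for a one-week lookback period (average is 100).
history = [100, 110, 90, 105, 95, 100, 100]
limits = {"OVER_AVERAGE": 30, "UNDER_AVERAGE": 30, "VARIABILITY": 2}
print(size_warnings(140, history, limits))
print(size_warnings(100, history, limits))
```

The error-rate thresholds follow the same pattern but, as described above, are only checked on the upper side of the average.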
Configuration example
[Spark]
session_name = NetworkQualityWarnings
[NetworkQualityWarnings]
date_format = %Y-%m-%d
date_of_study = 2023-01-08
lookback_period = week
# All values must be numeric
# Missing parameter will take the default value
# Incorrect value will throw an error
thresholds = {
    "SIZE_RAW_DATA": {
        "OVER_AVERAGE": 30,
        "UNDER_AVERAGE": 30,
        "VARIABILITY": 2,
        "ABS_VALUE_UPPER_LIMIT": None,
        "ABS_VALUE_LOWER_LIMIT": None,
    },
    "SIZE_CLEAN_DATA": {
        "OVER_AVERAGE": 30,
        "UNDER_AVERAGE": 30,
        "VARIABILITY": 2,
        "ABS_VALUE_UPPER_LIMIT": None,
        "ABS_VALUE_LOWER_LIMIT": None,
    },
    "TOTAL_ERROR_RATE": {
        "OVER_AVERAGE": 30,
        "VARIABILITY": 2,
        "ABS_VALUE_UPPER_LIMIT": 20,
    },
    "Missing_value_RATE": {
        "cell_id": {
            "AVERAGE": 30,
            "VARIABILITY": 2,
            "ABS_VALUE_UPPER_LIMIT": 20,
        },
        "valid_date_start": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "latitude": {
            "AVERAGE": 30,
            "VARIABILITY": 2,
            "ABS_VALUE_UPPER_LIMIT": 20,
        },
        "longitude": {
            "AVERAGE": 30,
            "VARIABILITY": 2,
            "ABS_VALUE_UPPER_LIMIT": 20,
        },
        "altitude": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "antenna_height": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "directionality": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "azimuth_angle": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "elevation_angle": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "horizontal_beam_width": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "vertical_beam_width": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "power": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "frequency": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "technology": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "cell_type": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
    },
    "Out_of_range_RATE": {
        "cell_id": {
            "AVERAGE": 30,
            "VARIABILITY": 2,
            "ABS_VALUE_UPPER_LIMIT": 20,
        },
        "latitude": {
            "AVERAGE": 30,
            "VARIABILITY": 2,
            "ABS_VALUE_UPPER_LIMIT": 20,
        },
        "longitude": {
            "AVERAGE": 30,
            "VARIABILITY": 2,
            "ABS_VALUE_UPPER_LIMIT": 20,
        },
        "antenna_height": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "directionality": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "azimuth_angle": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "elevation_angle": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "horizontal_beam_width": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "vertical_beam_width": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "power": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "frequency": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "technology": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "cell_type": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        None: {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
    },
    "Parsing_error_RATE": {
        "valid_date_start": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        },
        "valid_date_end": {
            "AVERAGE": 60,
            "VARIABILITY": 3,
            "ABS_VALUE_UPPER_LIMIT": 50,
        }
    }
    }
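The thresholds value in the example is written as a Python dictionary literal (note the None entries and trailing commas), so one plausible way to load it is `ast.literal_eval` on the string read by `configparser`. The sketch below assumes that approach. It also assumes that continuation lines of the multi-line value are indented, as `configparser` requires, and that None means "use the default". The helper `validate_thresholds` is hypothetical and only illustrates the "all values must be numeric" rule from the comments in the example.

```python
import ast
from configparser import ConfigParser

# Shortened, hypothetical network_quality_warnings.ini. Continuation lines of a
# multi-line value must be indented for configparser to accept them.
CONFIG_TEXT = """
[NetworkQualityWarnings]
date_format = %Y-%m-%d
lookback_period = week
thresholds = {
    "TOTAL_ERROR_RATE": {
        "OVER_AVERAGE": 30,
        "VARIABILITY": 2,
        "ABS_VALUE_UPPER_LIMIT": 20,
    },
    "SIZE_RAW_DATA": {
        "ABS_VALUE_UPPER_LIMIT": None,
    },
    }
"""


def validate_thresholds(node, path="thresholds"):
    """Raise ValueError if any leaf is neither numeric nor None."""
    if isinstance(node, dict):
        for key, value in node.items():
            validate_thresholds(value, f"{path}[{key!r}]")
    elif node is None:
        return  # assumption: None means "fall back to the default"
    elif isinstance(node, bool) or not isinstance(node, (int, float)):
        raise ValueError(f"{path} must be numeric, got {node!r}")


# interpolation=None keeps '%Y-%m-%d' from being treated as an interpolation token.
parser = ConfigParser(interpolation=None)
parser.read_string(CONFIG_TEXT)

# literal_eval only accepts Python literals, so arbitrary code cannot run.
thresholds = ast.literal_eval(parser["NetworkQualityWarnings"]["thresholds"])
validate_thresholds(thresholds)
print(thresholds["TOTAL_ERROR_RATE"])
```

Validating the parsed dictionary up front reports a non-numeric value together with its full key path, before any data is processed.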