foehnix.foehnix_filter

foehnix.foehnix_filter(x, filter_method=None, cols=None)

Evaluates Data Filter Rules for foehnix Mixture Model Calls

foehnix.Foehnix models allow an optional foehnix.foehnix_filter() to be specified. If a filter is given, only a subset of the data set provided to foehnix.Foehnix is used for the foehn classification.

A typical example is a wind direction filter: only those observations (times) are used where the observed wind direction lies within a user-defined wind sector that corresponds to the wind direction during foehn events at a specific location.

However, the filter option also allows complex filter rules to be implemented if required. The ‘Details’ section contains further information and examples of how these filter rules can be used.

The most common filter rule: The filter is a dict where the dict-keys are column-names of the DataFrame x. The dict-values are lists of length 2 and define the range which should be used to filter the data. Example:

filter_method = {'dd': [43, 223]}

This keeps all observations whose wind direction dd is between 43 and 223 degrees.

The dict can contain several items to filter, e.g. to also limit the wind direction range at a crest station. Example:

filter_method = {'dd': [43, 223], 'dd_crest': [90, 270]}
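The effect of such a range dict can be illustrated with plain pandas. The following is a minimal sketch rather than the foehnix implementation: the observations are made up, and it assumes that the bounds are inclusive and that all rules must hold at the same time.

```python
import pandas as pd

# Hypothetical observations; the column names follow the examples above.
x = pd.DataFrame(
    {
        "dd": [10.0, 43.0, 180.0, 223.0, 300.0],
        "dd_crest": [100.0, 80.0, 200.0, 270.0, 150.0],
    }
)

filter_method = {"dd": [43, 223], "dd_crest": [90, 270]}

# Assumption: bounds are inclusive and every rule must be satisfied.
mask = pd.Series(True, index=x.index)
for col, (lower, upper) in filter_method.items():
    mask &= x[col].between(lower, upper)

# Row labels of the observations that pass both rules.
kept = x.index[mask].tolist()
print(kept)
```

Under these assumptions only the rows where both dd and dd_crest fall inside their ranges remain.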

Parameters:
x : pandas.DataFrame

containing the observations

filter_method : None, custom function or dict

Can be one of the following:

  • None: No filter will be applied
  • func: A custom function which will be applied on x
  • dict: Keys must be columns of x, values can either be a custom function on x[key] or a list of length two.
cols : list of strings

These strings must be contained in the columns of x and specify which columns are not allowed to contain missing values. If None is passed, all elements of x have to be non-missing.

Returns:
dict

A dictionary containing the following items:

  • dict['good']: all indices of x within the filter values
  • dict['bad']: all indices of x outside the filter values
  • dict['ugly']: all indices where one of the filter variables is NaN
  • dict['total']: length of data
  • dict['call']: the filter_method being provided and used to filter
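
To make the structure of the returned dictionary concrete, the sketch below mimics it with a hypothetical helper, sketch_filter, written in plain pandas. It is not the foehnix implementation. It assumes inclusive range bounds, assumes that a custom function returns one boolean per element of x[key], and assumes that rows with a missing filter variable are reported only under 'ugly'. The cols argument is not modelled.

```python
import numpy as np
import pandas as pd


def sketch_filter(x, filter_method):
    """Illustrative stand-in for the documented return value (hypothetical)."""
    inside = pd.Series(True, index=x.index)
    missing = pd.Series(False, index=x.index)
    for col, rule in filter_method.items():
        # Track rows where this filter variable is missing.
        missing |= x[col].isna()
        if callable(rule):
            # Assumption: the callable returns one boolean per element.
            result = pd.Series(rule(x[col]), index=x.index).astype(bool)
        else:
            # Assumption: a list of length two is an inclusive range.
            lower, upper = rule
            result = x[col].between(lower, upper)
        inside &= result
    good = inside & ~missing
    bad = ~inside & ~missing
    return {
        "good": x.index[good].tolist(),
        "bad": x.index[bad].tolist(),
        "ugly": x.index[missing].tolist(),
        "total": len(x),
        "call": filter_method,
    }


# Made-up data: wind direction dd and a second variable ff.
x = pd.DataFrame(
    {"dd": [10.0, 100.0, np.nan, 200.0], "ff": [5.0, 1.0, 3.0, 8.0]}
)
result = sketch_filter(x, {"dd": [43, 223], "ff": lambda s: s > 2.0})
print(result["good"], result["bad"], result["ugly"], result["total"])
```

In this sketch the three index lists are disjoint and together cover every row of x, so their lengths add up to 'total'.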