Filters the log based on frequency of resources

Filtering the event log based on resource frequency can be done in two ways: using an interval of allowed frequencies, or specifying a coverage percentage.

  • percentage: When filtering using a percentage p%, the filter will return p% of the activity instances, starting from the resource labels with the highest frequency. The filter will retain additional resource labels as long as the number of activity instances does not exceed the percentage threshold.

  • interval: When filtering using an interval, resource labels will be retained when their absolute frequency falls within this interval. The interval is specified using a numeric vector of length 2. Half-open intervals can be created by using NA. E.g., `c(10, NA)` will select resource labels which occur 10 times or more.
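
The interval rule can be illustrated with a small base R sketch. This is not the package's implementation: the helper `keep_by_interval` and the toy data are hypothetical, each row is assumed to represent one activity instance, and the upper bound is assumed to be inclusive (the text above only confirms that the lower bound is).

```r
# Toy data: one row per activity instance (hypothetical, not a real eventlog object)
instances <- data.frame(
  resource = c("r1", "r1", "r1", "r2", "r2", "r3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the resource labels whose absolute frequency
# lies in `interval`; an NA bound leaves that side of the interval open
keep_by_interval <- function(resources, interval, reverse = FALSE) {
  stopifnot(length(interval) == 2)
  freq <- table(resources)
  lower <- if (is.na(interval[1])) -Inf else interval[1]
  upper <- if (is.na(interval[2])) Inf else interval[2]
  # Assumption: both bounds are inclusive
  in_range <- as.vector(freq >= lower & freq <= upper)
  if (reverse) in_range <- !in_range
  names(freq)[in_range]
}

# Resources occurring 2 times or more
kept <- keep_by_interval(instances$resource, c(2, NA))
filtered <- instances[instances$resource %in% kept, , drop = FALSE]
```

With `reverse = TRUE`, the same helper returns the complementary set of resource labels, mirroring the `reverse` argument described under Arguments.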

filter_resource_frequency(eventlog, interval, percentage, reverse, ...)

# S3 method for eventlog
filter_resource_frequency(eventlog, interval = NULL,
  percentage = NULL, reverse = FALSE, ...)

# S3 method for grouped_eventlog
filter_resource_frequency(eventlog,
  interval = NULL, percentage = NULL, reverse = FALSE, ...)

ifilter_resource_frequency(eventlog)

Arguments

eventlog

The dataset to be used. Should be a (grouped) eventlog object.

interval

A resource frequency interval (numeric vector of length 2). Half-open intervals can be created using NA.

percentage

The target coverage of activity instances. A percentage of 0.9 will return the most common resource labels of the eventlog, which account for at least 90% of the activity instances.

reverse

Logical, indicating whether the selection should be reversed.

...

Deprecated arguments.

Value

When given an eventlog, it will return a filtered eventlog. When given a grouped eventlog, the filter will be applied in a stratified way (i.e. separately for each group). The returned eventlog will be grouped on the same variables as the original event log.

Methods (by class)

  • eventlog: Filter event log

  • grouped_eventlog: Filter grouped event logs
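
A minimal usage sketch, following the signature shown above. It assumes that the `eventdataR` package is installed and that its `patients` dataset is an eventlog object; neither is stated on this page.

```r
library(edeaR)
library(eventdataR)  # assumption: provides the `patients` example event log

# Keep resources that occur 10 times or more (half-open interval)
frequent <- filter_resource_frequency(patients, interval = c(10, NA))

# Keep the most frequent resources, targeting 90% coverage of activity instances
covered <- filter_resource_frequency(patients, percentage = 0.9)

# Reverse the interval selection
rare <- filter_resource_frequency(patients, interval = c(10, NA), reverse = TRUE)
```

Only one of `interval` and `percentage` is supplied per call here, matching the two filtering modes described at the top of the page.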

See also