Filters the log based on frequency of activities.

filter_activity_frequency(eventlog, interval, percentage, reverse, ...)

# S3 method for eventlog
filter_activity_frequency(eventlog, interval = NULL,
  percentage = NULL, reverse = FALSE, ...)

# S3 method for grouped_eventlog
filter_activity_frequency(eventlog,
  interval = NULL, percentage = NULL, reverse = FALSE, ...)




The dataset to be used. Should be a (grouped) eventlog object.


An activity frequency interval (numeric vector of length 2). Half-open intervals can be created using NA.


The target coverage of activity instances. A percentage of 0.9 will return the most common activity types of the eventlog, which account for at least 90% of the activity instances.


Logical, indicating whether the selection should be reversed.


Deprecated arguments.


When given an eventlog, it will return a filtered eventlog. When given a grouped eventlog, the filter will be applied in a stratified way (i.e. separately for each group). The returned eventlog will be grouped on the same variables as the original event log.


Filtering the event log based on activity frequency can be done in two ways: by using an interval of allowed frequencies, or by specifying a coverage percentage.

  • percentage: When filtering using a percentage p%, the filter will return p% of the activity instances, starting from the activity labels with the highest frequency. The filter will retain additional activity labels as long as the number of activity instances does not exceed the percentage threshold.

  • interval: When filtering using an interval, activity labels will be retained when their absolute frequency falls in this interval. The interval is specified using a numeric vector of length 2. Half-open intervals can be created by using NA. E.g., `c(10, NA)` will select activity labels which occur 10 times or more.
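
The interval rule described above can be illustrated with a small base R sketch. This is not the package's implementation: `labels_in_interval` is a hypothetical helper, the activity labels are made-up toy data, and treating the upper bound as inclusive is an assumption (the documentation only states that the lower bound of `c(10, NA)` is inclusive).

```r
# Toy data: one activity label per activity instance (hypothetical values).
activity <- c(rep("register", 12), rep("triage", 10), rep("x-ray", 3), "discharge")

# Hypothetical helper, not part of the package: return the activity labels
# whose absolute frequency lies in `interval`. NA means "no bound on that side".
# Assumption: both bounds are treated as inclusive.
labels_in_interval <- function(activity, interval, reverse = FALSE) {
  stopifnot(length(interval) == 2)
  freq <- table(activity)                       # absolute frequency per label
  lower <- if (is.na(interval[1])) -Inf else interval[1]
  upper <- if (is.na(interval[2])) Inf else interval[2]
  keep <- as.vector(freq >= lower & freq <= upper)
  if (reverse) keep <- !keep                    # mimic the `reverse` argument
  names(freq)[keep]
}

# Labels occurring 10 times or more (half-open interval).
kept <- labels_in_interval(activity, c(10, NA))

# Labels outside that interval, i.e. the reversed selection.
dropped <- labels_in_interval(activity, c(10, NA), reverse = TRUE)

# Keep only the activity instances that belong to a retained label.
filtered <- activity[activity %in% kept]
```

On an actual eventlog object, the corresponding call would presumably be `filter_activity_frequency(eventlog, interval = c(10, NA))`, following the usage shown at the top of this page.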

Methods (by class)

  • eventlog: Filter eventlog on activity frequency

  • grouped_eventlog: Stratified filter for grouped eventlog

See also