metrics.Rmd
library(bupaR)
## Loading required package: edeaR
## Loading required package: eventdataR
## Loading required package: processmapR
## Loading required package: xesreadR
##
## Attaching package: 'bupaR'
## The following object is masked from 'package:stats':
##
## filter
## The following object is masked from 'package:utils':
##
## timestamp
library(edeaR)
library(eventdataR)
The metrics for exploring and describing event data which are available are based on literature in the field of operational excellence and are organized in the following (sub)categories
The idle time is the time that there is no activity in a case or for a resource. It can only be calculated when there are both start and end timestamps available for activity instances. It can be computed at the levels trace, resource, case and log, and using different time units.
patients %>%
idle_time("resource", units = "days")
## # A tibble: 7 x 2
## employee idle_time
## <fct> <dbl>
## 1 r1 450.
## 2 r2 215.
## 3 r3 429.
## 4 r4 443.
## 5 r5 430.
## 6 r6 426.
## 7 r7 464.
The output of all metrics in edeaR can be visualized by supplying it to the plot function.
The processing time can be computed at the levels log, trace, case, activity and resource-activity. It can only be calculated when there are both start and end timestamps available for activity instances.
patients %>%
processing_time("activity") %>%
plot
The throughput time is the time form the very first event to the last event of a case. The levels at which it can be computed are log, trace, or case.
patients %>%
throughput_time("log") %>%
plot()
The resource frequency metric allows the computation of the number/frequency of resources at the levels of log, case, activity, resource, and resource-activity.
patients %>%
resource_frequency("resource")
## # A tibble: 7 x 3
## employee absolute relative
## <fct> <int> <dbl>
## 1 r1 500 0.184
## 2 r2 500 0.184
## 3 r6 495 0.182
## 4 r7 492 0.181
## 5 r5 261 0.0959
## 6 r3 237 0.0871
## 7 r4 236 0.0867
Resource involvement refers to the notion of the number of cases in which a resource is involved. It can be computed at levels case, resource, and resource-activity.
patients %>%
resource_involvement("resource") %>% plot
It this example it shows that only r1 and r2 are involved in all cases, r6 and r7 are involved in most of the cases, while the others are only involved in half of the cases, more or less.
The resource specalization metric shows whether resources are specialized in certain activities or not. It can be calculated at the levels log, case, resource and activity.
patients %>%
resource_specialisation("resource")
## # A tibble: 7 x 3
## employee absolute relative
## <fct> <int> <dbl>
## 1 r1 1 0.143
## 2 r2 1 0.143
## 3 r3 1 0.143
## 4 r4 1 0.143
## 5 r5 1 0.143
## 6 r6 1 0.143
## 7 r7 1 0.143
In the simple patients event log, each resource is performing exactly one activity, and is therefore 100% specialized.
Activity presence shows in what percentage of cases an activity is present. It has no level-argument.
patients %>% activity_presence() %>%
plot
The frequency of activities can be calculated using the activity_frequency function, at the levels log, trace and activity.
patients %>%
activity_frequency("activity")
## # A tibble: 7 x 3
## handling absolute relative
## <fct> <int> <dbl>
## 1 Blood test 237 0.0871
## 2 Check-out 492 0.181
## 3 Discuss Results 495 0.182
## 4 MRI SCAN 236 0.0867
## 5 Registration 500 0.184
## 6 Triage and Assessment 500 0.184
## 7 X-Ray 261 0.0959
The start of cases can be described using the start_activities function. Available levels are activity, case, log, resource and resource activity.
patients %>%
start_activities("resource-activity")
## # A tibble: 1 x 5
## employee handling absolute relative cum_sum
## <fct> <fct> <int> <dbl> <dbl>
## 1 r1 Registration 500 1. 1.
This shows that in this event log, all cases are started with the Registration by resource r1.
Conversely, the end_activities functions describes the end of cases, using the same levels: log, case, activity, resource and resource-activity.
patients %>%
end_activities("resource-activity")
## # A tibble: 5 x 5
## employee handling absolute relative cum_sum
## <fct> <fct> <int> <dbl> <dbl>
## 1 r7 Check-out 492 0.984 0.984
## 2 r6 Discuss Results 3 0.00600 0.990
## 3 r2 Triage and Assessment 2 0.00400 0.994
## 4 r5 X-Ray 2 0.00400 0.998
## 5 r3 Blood test 1 0.00200 1.00
In contract to the start of cases, the end of cases seems to differ more frequently, although it is mostly the Check-Out activity.
The trace coverage metric shows the relationship between the number of different activity sequences (i.e. traces) and the number of cases they cover.
patients %>%
trace_coverage("trace") %>%
plot()
In the patients log, there are only 7 different traces, and 2 of them cover nearly 100% of the event log.
The trace length metric describes the length of traces, i.e. the number of activity instances for each case. It can be computed at the levels case, trace and log.
patients %>%
trace_length("log") %>%
plot
It can be seen that in this simple event log, most cases have a trace length of 5 or 6, while a minority has a trace length lower than 5.