Process Discovery Contest @ BPM 2016




Contestants:
1. Kingsley Okoye, University of East London, London, United Kingdom
2. Abdel-Rahman H. Tawil, University of East London, London, United Kingdom
3. Usman Naeem, University of East London, London, United Kingdom
4. Elyes Lamine, Université de Toulouse, Mines-Albi, Cedex, France

E-mails: 1. [email protected], 2. [email protected], 3. [email protected], 4. [email protected]

Problem Statement: The process discovery approach described in this document targets the discovery of process models from a Training event log representing 10 different real time business process executions, and the cross-validation of the derived models against two sets of Test event logs provided for evaluating the process discovery technique. Each of the Test event logs (test_log_april_1 to test_log_april_10, and test_log_may_1 to test_log_may_10) represents part of the model from the Training Log and contains a total of 20 traces: 10 traces that can be replayed (allowed) and 10 traces that cannot be replayed (disallowed) by the model. The total number of traces in the Test event logs (i.e. the April logs and the May logs) is therefore (10 logs x 20 traces) x 2 = 400 traces. Our aim is to carry out a classification task over the 400 individual traces that make up the two sets of Test event logs, and then to provide a Petri net representation of the Training model, as well as a Business Process Model and Notation (BPMN) mapping, that allows the behaviours (traces) recorded in the Test logs to be tested and evaluated. The objective of the proposed approach is to discover process models that match the original process models in terms of the balance between “overfitting” and “underfitting”. A process model overfits (the event log) if it is too restrictive, disallowing behaviour that is part of the underlying process. Conversely, it underfits (the reality) if it is not restrictive enough, allowing behaviour that is not part of the underlying process. Following this challenge, we aim to provide a model that balances “overfitting” and “underfitting” well enough to correctly classify the traces of the “test” event logs that can be replayed. Thus:

- Given a trace (t) representing real process behaviour, the process model (m) classifies it as allowed; or
- Given a trace (t) representing behaviour not related to the process, the process model (m) classifies it as disallowed.

This document contains the classification attempts for the event logs provided and discusses the replay semantics of the process modelling notation employed. In other words, we discuss how, given any trace t (from the Test event log) and process model m (discovered from the Training log) in the Petri net and BPMN notation, it can be unambiguously determined whether or not trace t can be replayed on model m. We also describe the tools used to discover the process models and to check the results of the classification task. The approach we use to solve the process discovery contest is supported by some of the definitions and techniques described in [1].

Definition of Methods and Classification Task for Event Logs: In this section we describe the approach we used for the classification task on the Test event logs, to generate the traces that make up each of the process executions. We also show how we used the Disco tool [2], which is based on the Fuzzy Miner algorithm, to generate and map the 20 process models (April and May) from the test event logs for conformance checking. This allows us to see the individual cases (20 for each log) and the sequences of activity executions (traces).

- A process consists of cases.
- A case consists of events such that each event relates to precisely one case.
- Events within a case are ordered.
- Events can have attributes, e.g. activity, time, cost, resources, etc.
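The structure listed above can be sketched as a small data model. This is an illustrative sketch only, not the tooling used for the contest: the `Event` class and the `group_by_case` helper are hypothetical names, and the sample events are made up. Only the `case_id` and `act_name` attribute names come from the logs.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    """One recorded event; every event relates to exactly one case."""
    case_id: str
    act_name: str
    # Optional extra attributes such as time, cost or resource.
    attributes: dict = field(default_factory=dict)


def group_by_case(events):
    """Return {case_id: [events]} keeping the recorded order within each case."""
    cases = defaultdict(list)
    for event in events:
        cases[event.case_id].append(event)
    return dict(cases)


# Made-up events from two interleaved cases.
events = [
    Event("1", "b"),
    Event("2", "g"),
    Event("1", "g"),
    Event("1", "e", {"cost": 10}),
    Event("2", "h"),
]
cases = group_by_case(events)
```

Here `cases["1"]` holds the three events of case 1 in the order they were recorded, which is the ordering the later trace definitions rely on.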

The event log provided for the contest contains the typical information needed for process mining (in our case, process discovery). The event data consists of event logs generated from a business process model to show different behavioural characteristics. We assume that each event log contains data related to a single process, and that each event refers to a single process instance (case) and can be related to some activity. According to [1], a “Case ID” and an “Activity” are the minimum requirements for any process mining technique. The given event logs contain the two attributes case_id and act_name, which satisfy the requirements for applying the process discovery technique following Definition 4.1 in [1]. We assume the following standard attributes:

- #𝑐𝑎𝑠𝑒_𝑖𝑑(𝑒) is the case associated with event 𝑒.
- #𝑎𝑐𝑡_𝑛𝑎𝑚𝑒(𝑒) is the activity associated with event 𝑒.

These definitions are necessary because, in our approach, the activities play an important role in the discovered model and correspond to the traces (cases) within the discovered Petri net model as well as the BPMN model. As there are multiple events referring to the same activity, we support the filtering of the 400 individual traces that make up the test event logs with a classifier (see Definition 4.2 in [1]). A classifier is a function that maps the attributes of an event onto a label used in the resulting process model. If we use the notation $\underline{e}$ to refer to the event name used in the process model, then for any event $e \in \mathcal{E}$ in the given log, $\underline{e}$ is the name of the event. Since the events are identified simply by their activity name (𝑎𝑐𝑡_𝑛𝑎𝑚𝑒), we assume $\underline{e} = \#_{act\_name}(e)$.

We apply this classifier to the event logs provided to obtain a simple event log (see Definition 4.4 in [1]). Applying that definition: let A be a set of activity names (𝑎𝑐𝑡_𝑛𝑎𝑚𝑒). A simple trace 𝜎 is a sequence of activities, i.e., 𝜎 ∈ A*. A simple event log 𝐿 is a multiset of traces over A. Thus,

𝐿 ∈ 𝔹(A*).

For the Training Log there are 1000 cases (traces) that define the log. However, our focus is to identify the set of 400 traces (200 for the April logs and 200 for the May logs) that characterize the Test Log, for use in validating the model. Let $L \subseteq \mathcal{C}$ be the event log for the Test Log, and assume that the classifier of Definition 4.2 is applied to sequences of events, so that, following Definition 4.5 in [1],

$\underline{\langle e_1, e_2, \ldots, e_n \rangle} = \langle \underline{e_1}, \underline{e_2}, \ldots, \underline{e_n} \rangle$

Then $\underline{L} = [\, \underline{\hat{c}} \mid c \in L \,]$ is the simple event log corresponding to the Test Log. All the cases in the Test Log are converted into sequences of activities (𝑎𝑐𝑡_𝑛𝑎𝑚𝑒) using the classifier. Hence:

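- A case $c \in L$ is an identifier from the case universe $\mathcal{C}$.
- $\hat{c} = \#_{trace}(c) = \langle e_1, e_2, \ldots, e_n \rangle \in \mathcal{E}^*$ is the sequence of events executed for $c$.
- $\underline{\hat{c}} = \langle \underline{e_1}, \underline{e_2}, \ldots, \underline{e_n} \rangle$ maps these events onto the activity names (𝑎𝑐𝑡_𝑛𝑎𝑚𝑒) using the classifier.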
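The conversion from cases to a simple event log can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: events are modelled as plain dictionaries, the function names `classify` and `simple_event_log` are hypothetical, and the three sample cases are made up for illustration. A `Counter` over tuples stands in for the multiset of traces.

```python
from collections import Counter


def classify(event):
    """Classifier: label an event by its act_name attribute."""
    return event["act_name"]


def simple_event_log(log, classifier=classify):
    """Map every case to its tuple of labels and count identical traces."""
    return Counter(
        tuple(classifier(event) for event in events)
        for events in log.values()
    )


# Made-up log: case identifier -> ordered list of events.
log = {
    "case_1": [{"act_name": "b"}, {"act_name": "g"}, {"act_name": "e"}],
    "case_2": [{"act_name": "g"}, {"act_name": "h"}],
    "case_3": [{"act_name": "b"}, {"act_name": "g"}, {"act_name": "e"}],
}
simple_log = simple_event_log(log)
```

Two of the three cases share the same activity sequence, so the multiset records that trace with multiplicity 2, which a plain set of traces would lose.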

Using the described classifier ($\underline{e} = \#_{act\_name}(e)$), we obtain the first set of 200 traces from the Test event logs test_log_april_1 to test_log_april_10, i.e., 20 traces for each log, as follows:
𝐿 (test_log_april_1) = [〈𝑏, 𝑔, 𝑒, 𝑞, ℎ, 𝑖, 𝑙, 𝑟, 𝑚, 𝑜, 𝑑, 𝑓, 𝑝〉, 〈𝑏, 𝑏, 𝑐, 𝑛, ℎ, 𝑒, 𝑖, 𝑞, 𝑟, 𝑙, 𝑚, 𝑓, 𝑜, 𝑑, 𝑝〉, 〈𝑔, ℎ, 𝑖, 𝑞, 𝑞, 𝑚, 𝑟, 𝑜, 𝑒, 𝑑, 𝑝〉, 〈𝑗, 𝑎, 𝑘, 𝑏, 𝑏, 𝑔, 𝑒, ℎ, 𝑞, 𝑙, 𝑟, 𝑖, 𝑚, 𝑑, 𝑓, 𝑜, 𝑝〉, 〈𝑏, 𝑔, ℎ, 𝑖, 𝑞, 𝑖, 𝑟, 𝑚, 𝑜, 𝑑, 𝑝, 𝑓〉, 〈𝑒, 𝑒, 𝑒, 𝑞, ℎ, 𝑟, 𝑑, 𝑜, 𝑟, 𝑝〉, 〈𝑔, ℎ, 𝑒, 𝑖, 𝑖, 𝑞, 𝑙, 𝑚, 𝑜, 𝑓, 𝑝, 𝑑〉, 〈𝑏, 𝑎, 𝑗, 𝑘, 𝑔, 𝑒, 𝑞, ℎ, 𝑙, 𝑖, 𝑟, 𝑚, 𝑜, 𝑓, 𝑑, 𝑝〉, 〈𝑔, 𝑖, 𝑒, 𝑟, 𝑙, 𝑖, 𝑚, 𝑑, 𝑜, 𝑝, 𝑑, 𝑝〉, 〈𝑏, 𝑏, 𝑔, 𝑒, 𝑙, 𝑙, ℎ, 𝑞, 𝑟, 𝑟, 𝑟, 𝑑, 𝑜, 𝑜, 𝑝, 𝑓〉, 〈𝑏, 𝑔, 𝑒, ℎ, 𝑖, 𝑞, 𝑙, 𝑟, 𝑚, 𝑑, 𝑝, 𝑜, 𝑓〉, 〈𝑏, 𝑞, 𝑔, ℎ, 𝑖, ℎ, 𝑙, 𝑚, 𝑚, 𝑟, 𝑝, 𝑓〉, 〈ℎ, 𝑔, ℎ, 𝑒, 𝑟, 𝑙, 𝑞, 𝑖, 𝑓, 𝑓, 𝑝〉,

〈𝑏, 𝑗, 𝑎, 𝑘, 𝑔, 𝑞, 𝑒, 𝑖, ℎ, 𝑙, 𝑟, 𝑓, 𝑑, 𝑜, 𝑝〉, 〈𝑐, 𝑛, 𝑞, 𝑒, 𝑖, ℎ, 𝑟, 𝑑, 𝑚, 𝑜, 𝑝, 𝑓, 𝑝〉, 〈𝑏, 𝑔, ℎ, 𝑖, 𝑒, 𝑞, 𝑟, 𝑙, 𝑚, 𝑑, 𝑜, 𝑝, 𝑓〉, 〈𝑔, 𝑖, ℎ, 𝑒, 𝑟, 𝑞, 𝑚, 𝑙, 𝑜, 𝑑, 𝑓, 𝑝〉, 〈𝑘, 𝑏, 𝑛, 𝑛, 𝑐, ℎ, ℎ, 𝑒, 𝑞, 𝑙, 𝑞, 𝑟, 𝑟, 𝑖, 𝑚, 𝑓, 𝑓, 𝑖, 𝑝〉, 〈𝑏, 𝑏, 𝑏, 𝑔, 𝑞, 𝑖, ℎ, 𝑒, 𝑟, 𝑙, 𝑚, 𝑓, 𝑜, 𝑑, 𝑝〉, 〈𝑏, 𝑏, 𝑔, 𝑞, 𝑒, ℎ, 𝑖, 𝑟, 𝑚, 𝑙, 𝑑, 𝑜, 𝑝, 𝑓〉] 𝐿 (test_log_april_2) = [〈𝑓, 𝑔, 𝑔, 𝑔, 𝑖, 𝑘, 𝑠, 𝑘, ℎ, ℎ, 𝑒, 𝑛, 𝑒, 𝑗, 𝑑, 𝑑〉, 〈𝑓, 𝑏, 𝑟, 𝑙, 𝑝, 𝑚, 𝑡, 𝑞, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, 𝑒, 𝑙, 𝑗, 𝑛, 𝑑, 𝑑〉, 〈𝑎, ℎ, 𝑜, 𝑚, 𝑞, 𝑡, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, ℎ, 𝑚, 𝑟, 𝑚, 𝑞, 𝑡, 𝑒, 𝑛, 𝑗, 𝑗, 𝑑〉, 〈𝑎, 𝑖, 𝑜, 𝑙, 𝑚, 𝑡, 𝑒, 𝑛, 𝑑, 𝑗〉, 〈𝑖, 𝑖, 𝑓, ℎ, 𝑗〉, 〈𝑓, 𝑒, 𝑙, 𝑛, 𝑗〉, 〈𝑎, 𝑎, 𝑜, 𝑎, 𝑚, 𝑙, 𝑖, 𝑞, 𝑒, 𝑛, 𝑡, 𝑛, 𝑑〉, 〈𝑓, 𝑏, ℎ, 𝑚, 𝑞, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, 𝑠, 𝑖, 𝑘, ℎ, 𝑔, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, ℎ, 𝑠, 𝑝, 𝑘, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, 𝑔, 𝑝, 𝑐, 𝑙, 𝑒, 𝑛, 𝑗, 𝑑 〉, 〈𝑎, ℎ, 𝑜, 𝑔, 𝑚, 𝑝, 𝑡, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, ℎ, 𝑔, 𝑖, 𝑒, 𝑗, 𝑛, 𝑑〉, 〈𝑓, 𝑐, 𝑙, 𝑔, 𝑖, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, 𝑙, 𝑟, 𝑚, 𝑡, 𝑡, 𝑒, 𝑡, 𝑒, 𝑒, 𝑛, 𝑑〉, 〈𝑎, 𝑜, ℎ, 𝑚, 𝑡, 𝑞, 𝑒, 𝑛, 𝑗, 𝑑〉,

〈ℎ, 𝑎, 𝑜, 𝑜, 𝑜, 𝑡, 𝑚, 𝑒, 𝑒, 𝑗, 𝑗, 𝑑〉, 〈𝑎, ℎ, 𝑠, 𝑘, 𝑖, 𝑒, 𝑛, 𝑗, 𝑑〉] 𝐿 (test_log_april_3) = [〈𝑐, 𝑦, 𝑎, 𝑒, 𝑏, 𝑝, 𝑠, 𝑗, 𝑞〉, 〈𝑐, 𝑒, 𝑒, 𝑦, 𝑦, 𝑠, 𝑠, 𝑠, 𝑞〉, 〈𝑏, 𝑏, 𝑐, 𝑦, 𝑎, 𝑝, 𝑝, 𝑠, 𝑠, 𝑗, 𝑞〉, 〈𝑒, 𝑦, 𝑦, 𝑣, 𝑐, 𝑏, 𝑝, 𝑎, 𝑓, 𝑛, 𝑠, 𝑗, 𝑖, 𝑗, 𝑜, 𝑞〉, 〈𝑏, 𝑏, 𝑦, 𝑛, 𝑓, 𝑣, 𝑖, 𝑐, 𝑠, 𝑜, 𝑗, 𝑞 〉, 〈𝑦, 𝑐, 𝑎, 𝑏, 𝑝, 𝑒, 𝑠, 𝑗, 𝑞〉, 〈𝑣, 𝑒, 𝑏, 𝑓, 𝑐, 𝑎, 𝑝, 𝑦, 𝑛, 𝑠, 𝑖, 𝑗, 𝑘, 𝑜〉, 〈𝑎, 𝑦, 𝑒, 𝑝, 𝑐, 𝑠, 𝑏, 𝑗, 𝑞〉, 〈𝑐, 𝑎, 𝑦, 𝑒, 𝑏, 𝑏, 𝑏, 𝑏, 𝑞, 𝑗〉, 〈𝑝, 𝑦, 𝑦, 𝑐, 𝑏, 𝑒, 𝑠, 𝑗, 𝑎, 𝑞〉, 〈𝑐, 𝑏, 𝑦, 𝑝, 𝑒, 𝑎, 𝑠, 𝑗, 𝑞〉, 〈𝑐, 𝑏, 𝑦, 𝑒, 𝑎, 𝑝, 𝑠, 𝑞, 𝑗〉, 〈𝑒, 𝑝, 𝑏, 𝑐, 𝑦, 𝑎, 𝑠, 𝑗, 𝑞〉, 〈𝑙, 𝑒, 𝑏, 𝑘, 𝑎, 𝑦, 𝑝, 𝑐, 𝑠, 𝑗, 𝑞〉, 〈𝑏, 𝑐, 𝑎, 𝑝, 𝑒, 𝑦, 𝑠, 𝑗, 𝑞 〉, 〈𝑝, 𝑎, 𝑏, 𝑦, 𝑠, 𝑐, 𝑒, 𝑗, 𝑞〉, 〈𝑣, 𝑦, 𝑓, 𝑐, 𝑐, 𝑒, 𝑝, 𝑎, 𝑛, 𝑏, 𝑖, 𝑗, 𝑞, 𝑜〉, 〈𝑎, 𝑏, 𝑐, 𝑦, 𝑠, 𝑗, 𝑞〉, 〈𝑐, 𝑒, 𝑔, 𝑎〉, 〈𝑏, 𝑏, 𝑒, 𝑦, 𝑝, 𝑝, 𝑎, 𝑠〉] 𝐿 (test_log_april_4) = [〈𝑔, 𝑖, 𝑡, 𝑏, 𝑙, 𝑣, 𝑢, 𝑒, 𝑒, 𝑘, 𝑐, ℎ, 𝑞〉, 〈𝑛, 𝑖, 𝑓, 𝑏, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉,

〈𝑎, 𝑔, 𝑎, 𝑓, 𝑡, 𝑖, 𝑏, 𝑣, 𝑙, 𝑢, 𝑢, 𝑐, 𝑘, ℎ, 𝑞, 𝑞, 𝑞, 𝑞〉, 〈𝑔, 𝑎, 𝑓, 𝑓, 𝑏, 𝑓, 𝑢, 𝑖, 𝑒, ℎ, ℎ, ℎ, 𝑞〉, 〈𝑗, 𝑗, 𝑜, 𝑟, 𝑟, 𝑜, 𝑠, 𝑝, 𝑏, 𝑡, 𝑖, 𝑡, 𝑡, 𝑣, 𝑣, 𝑢, 𝑒, 𝑐, 𝑘〉, 〈𝑜, 𝑝, 𝑝, 𝑖, 𝑣, 𝑙, 𝑣, 𝑢, 𝑘, 𝑘 〉, 〈𝑗, 𝑜, 𝑟, 𝑎, 𝑠, 𝑝, 𝑖, 𝑏, 𝑓, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑎, 𝑓, 𝑡, 𝑏, 𝑖, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, ℎ, 𝑞〉, 〈𝑗, 𝑎, 𝑠, 𝑜, 𝑝, 𝑟, 𝑖, 𝑏, 𝑓, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑓, 𝑔, 𝑖, 𝑖, 𝑣, 𝑙, 𝑣, 𝑚, 𝑐, ℎ, ℎ, 𝑘, 𝑞, ℎ〉, 〈𝑎, 𝑔, 𝑓, 𝑏, 𝑖, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑎, 𝑜, 𝑟, 𝑗, 𝑠, 𝑝, 𝑓, 𝑏, 𝑖, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑜, 𝑎, 𝑗, 𝑟, 𝑠, 𝑝, 𝑓, 𝑖, 𝑏, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑎, 𝑏, 𝑔, 𝑏, 𝑓, 𝑡, 𝑙, 𝑣, 𝑖, 𝑢, 𝑢, 𝑘, 𝑒, 𝑐, ℎ, 𝑞〉, 〈𝑎, 𝑜, 𝑟, 𝑗, 𝑠, 𝑝, 𝑖, 𝑏, 𝑓, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑔, 𝑔, 𝑔, 𝑎, 𝑎, 𝑏, 𝑙, 𝑓, 𝑙, 𝑢, 𝑒, 𝑘, 𝑘 , 𝑑, 𝑐, 𝑐, 𝑞, ℎ〉, 〈𝑎, 𝑔, 𝑖, 𝑏, 𝑣, 𝑒, 𝑘, 𝑐, 𝑙, 𝑐, 𝑞〉, 〈𝑗, 𝑜, 𝑠, 𝑟, 𝑝, 𝑓, 𝑏, 𝑖, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉 〈𝑗, 𝑜, 𝑎, 𝑠, 𝑟, 𝑝, 𝑓, 𝑡, 𝑏, 𝑖, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑜, 𝑎, 𝑗, 𝑟, 𝑠, 𝑝, 𝑓, 𝑖, 𝑡, 𝑏, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉] 𝐿 (test_log_april_5) = [〈ℎ, 𝑓, 𝑠, 𝑖, 𝑐, 𝑡, 𝑑, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔〉, 𝑓, 𝑒, 𝑒, 𝑐, 𝑡, 𝑧, 𝑧, 𝑧, 𝑎𝑎, 𝑏, 𝑣, 𝑥, 𝑎𝑎, 𝑎𝑎, 𝑘, 𝑏, 𝑎𝑎, 𝑏, 𝑎𝑏, 𝑢, 𝑎𝑏, 𝑡, 𝑐, 𝑡, 𝑐, 𝑜, 𝑚, 𝑤, 𝑤, 𝑘, 𝑘, 𝑘, 〈 〉, 𝑦, 𝑛, 𝑦, 𝑦, 𝑦, 𝑦, 𝑦, 𝑡, 𝑡, 𝑜, 𝑜, 𝑜, 𝑚, 𝑤, 𝑘, 𝑛, 𝑛, 𝑜, 𝑡, 𝑜, 𝑚, 𝑜, 𝑚, 𝑤, 𝑚, 𝑦, 𝑛, 𝑦, 𝑜, 𝑡, 𝑤, 𝑚, 𝑔 𝑎, 𝑓, 𝑠, 𝑖, 𝑐, 𝑡, 𝑑, 𝑜, 𝑤, 𝑚, 𝑘, 𝑛, 𝑢, 𝑡, 𝑐, 𝑜, 𝑚, 𝑤, 𝑘, 𝑛, 𝑢, 𝑐, 𝑡, 𝑚, 𝑜, 𝑤, 𝑘, 𝑛, 𝑢, 𝑡, 𝑐, 𝑜, 𝑚, 𝑤, 〈 〉, 𝑘, 𝑛, 𝑢, 𝑐, 𝑡, 𝑜, 𝑚, 𝑤, 𝑙, 𝑔 〈𝑎, 𝑞, 𝑟, 𝑓, 𝑒, 𝑐, 𝑡, 𝑧, 𝑏, 𝑜, 𝑎𝑏, 𝑗, 𝑚, 𝑣, 𝑤, 𝑝, 𝑥, 𝑘, 𝑛, 𝑢, 𝑡, 𝑐, 𝑜, 𝑚, 𝑤, 𝑙, 𝑔〉, 〈ℎ, ℎ, ℎ, ℎ, ℎ, ℎ, 𝑓, ℎ, 𝑏, 𝑏, 𝑓, 𝑏, 𝑏, 𝑏, 𝑏, 𝑐, 𝑐, 𝑜, 𝑎𝑏, 𝑚, 𝑘, 𝑛, 𝑢, 𝑡, 𝑚, 𝑜, 𝑡, 𝑤, 𝑙, 𝑤〉, 〈𝑞, 𝑟, 𝑓, 𝑟, 𝑓, 𝑒, 𝑗, 𝑜, 𝑚, 𝑚, 𝑤, 𝑥, 𝑣, 𝑥, 𝑙, 𝑔〉,

𝑐, 𝑓, 𝑖, 𝑑, 𝑡, 𝑠, 𝑤, 𝑜, 𝑤, 𝑛, 𝑘, 𝑦, 𝑛, 𝑡, 𝑐, 𝑜, 𝑚, 𝑤, 𝑤, 𝑤, 𝑘, 𝑤, 𝑘, 𝑛, 𝑚, 𝑐, 𝑚, 𝑚, 𝑚, 𝑘, 𝑘, 〈 〉, 𝑛, 𝑦, 𝑐, 𝑡, 𝑜, 𝑤, 𝑦, 𝑦, 𝑦, 𝑜, 𝑜, 𝑚, 𝑙, 𝑡 〈ℎ, 𝑓, 𝑡, 𝑧, 𝑎𝑏, 𝑏, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔〉, 〈ℎ, ℎ, ℎ, 𝑡, 𝑡, 𝑎𝑏, 𝑔, 𝑔, 𝑔, 𝑔, 𝑔, 𝑜〉, 〈𝑎, 𝑞, 𝑓, 𝑐, 𝑏, 𝑏, 𝑧, 𝑎𝑏, 𝑡, 𝑜, 𝑚, 𝑤, 𝑘, 𝑛, 𝑢, 𝑐, 𝑜, 𝑚, 𝑙, 𝑔〉, 〈𝑎, 𝑓, 𝑐, 𝑖, 𝑠, 𝑡, 𝑑, 𝑚, 𝑜, 𝑤, 𝑘, 𝑛, 𝑦, 𝑐, 𝑡, 𝑚, 𝑜, 𝑤, 𝑘, 𝑛, 𝑦, 𝑐, 𝑡, 𝑜, 𝑚, 𝑤, 𝑙, 𝑔〉 〈𝑎, ℎ, 𝑓, 𝑡, 𝑐, 𝑏, 𝑚, 𝑧, 𝑎𝑎, 𝑜, 𝑤, 𝑏, 𝑘, 𝑎𝑎, 𝑛, 𝑏, 𝑎𝑏, 𝑦, 𝑡, 𝑐, 𝑚, 𝑜, 𝑤, 𝑘, 𝑛, 𝑢, 𝑐, 𝑡, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔〉, 〈𝑎, ℎ, 𝑓, 𝑒, 𝑡, 𝑏, 𝑐, 𝑗, 𝑎𝑎, 𝑣, 𝑧, 𝑜, 𝑚, 𝑥, 𝑝, 𝑏, 𝑤, 𝑙, 𝑔, 𝑎𝑏〉, 〈𝑞, 𝑟, 𝑎, 𝑓, 𝑐, 𝑡, 𝑧, 𝑚, 𝑜, 𝑏, 𝑤, 𝑎𝑏, 𝑙, 𝑔〉, 〈𝑞, 𝑎, 𝑟, 𝑓, 𝑐, 𝑧, 𝑒, 𝑡, 𝑏, 𝑜, 𝑤, 𝑗, 𝑚, 𝑙, 𝑣, 𝑔, 𝑎𝑎, 𝑝, 𝑏, 𝑎𝑏〉, 〈𝑓, 𝑐, 𝑒, 𝑚, 𝑚, 𝑣, 𝑗, 𝑜, 𝑝, 𝑘, 𝑤, 𝑛, 𝑢, 𝑡, 𝑡, 𝑡, 𝑚, 𝑐, 𝑤, 𝑘, 𝑦, 𝑛, 𝑐, 𝑡, 𝑜, 𝑚, 𝑙〉, 〈𝑞, 𝑎, 𝑟, 𝑓, 𝑐, 𝑧, 𝑏, 𝑡, 𝑚, 𝑜, 𝑎𝑎, 𝑏, 𝑤, 𝑎𝑎, 𝑏, 𝑙, 𝑎𝑏, 𝑔〉, 〈ℎ, 𝑑, 𝑖, 𝑐, 𝑠, 𝑡, 𝑚, 𝑙, 𝑔〉, 〈𝑎, 𝑓, 𝑡, 𝑐, 𝑠, 𝑑, 𝑜, 𝑤, 𝑚, 𝑙, 𝑙, 𝑙, 𝑔, 𝑔〉, 〈𝑞, 𝑟, 𝑓, 𝑑, 𝑠, 𝑐, 𝑡, 𝑖, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔〉] 𝐿 (test_log_april_6) = [〈𝑑, 𝑚, 𝑐, 𝑒〉, 〈𝑎, 𝑏, 𝑛, 𝑔, 𝑘, 𝑗, 𝑒, 𝑐〉, 〈𝑑, 𝑓, 𝑚, 𝑟, 𝑗, 𝑡, 𝑝, 𝑙, 𝑒, ℎ, 𝑗, 𝑒〉, 〈𝑛, 𝑡, 𝑒, 𝑒, 𝑝〉, 〈𝑎, 𝑎, 𝑎, 𝑎, 𝑑, 𝑡, 𝑟, 𝑝, 𝑒, ℎ, ℎ, 𝑡, 𝑗, 𝑗, 𝑗, 𝑒, 𝑙, 𝑒, 𝑒〉, 〈𝑚, 𝑑, 𝑡, 𝑟, 𝑒, 𝑡, 𝑝, 𝑒〉, 〈𝑏, 𝑚, 𝑛, 𝑗, 𝑐, 𝑜, 𝑙, 𝑒, 𝑒, 𝑒, 𝑒, 𝑙, ℎ, 𝑗, 𝑒〉, 〈𝑏, 𝑚, 𝑛, 𝑔, 𝑘, 𝑡, 𝑗, 𝑝, ℎ, 𝑒, 𝑒〉, 〈𝑑, 𝑓, 𝑎, 𝑟, 𝑝, 𝑡, 𝑒, 𝑒, 〉, 〈𝑚, 𝑏, 𝑚, 𝑏, 𝑜, 𝑡, 𝑝, 𝑔, 𝑗, 𝑒, 𝑒〉, 〈𝑎, 𝑑, 𝑓, 𝑟, 𝑐, 𝑗, ℎ, 𝑗, 𝑙, 𝑗, 𝑒〉,

〈𝑎, 𝑏, 𝑛, 𝑔, 𝑜, 𝑐, 𝑗, 𝑙, 𝑗, 𝑒〉, 〈𝑞, 𝑑, 𝑓, 𝑟, 𝑟, 𝑗, 𝑝, 𝑡, 𝑒〉, 〈𝑏, 𝑛, 𝑚, 𝑔, 𝑘, 𝑝, 𝑡, 𝑒, ℎ, 𝑒, 𝑒〉, 〈𝑎, 𝑒, 𝑒, 𝑒, 𝑒, 𝑙, 𝑝, 𝑙, 𝑗, 𝑒〉, 〈𝑞, 𝑑, 𝑟, 𝑓, 𝑟, 𝑝, 𝑡, 𝑒, 𝑒〉, 〈𝑡, 𝑟, 𝑗, 𝑐, 𝑐, 𝑐, ℎ, 𝑒, 𝑒〉, 〈𝑘, 𝑡, 𝑎, 𝑝, 𝑒, 𝑝, 𝑒, ℎ, 𝑒, ℎ, ℎ, ℎ, ℎ, ℎ, 𝑗, 𝑗, 𝑒, 𝑒〉, 〈𝑓, 𝑚, 𝑑, 𝑟, 𝑟, 𝑗, 𝑐, 𝑒〉, 〈𝑏, 𝑔, 𝑔, 𝑛, 𝑔, 𝑜, 𝑜, 𝑝, 𝑡, 𝑝, 𝑝, 𝑗, ℎ, 𝑝, 𝑒〉] 𝐿 (test_log_april_7) = [〈𝑓, 𝑐, 𝑡, 𝑢, 𝑦, 𝑎𝑎, 𝑛, 𝑥, 𝑐, 𝑧, 𝑚, 𝑒〉, 〈𝑟, 𝑡, 𝑎𝑏, 𝑐, 𝑐, 𝑑, 𝑎𝑎, 𝑧, 𝑧, 𝑧, 𝑙, 𝑠, 𝑚, ℎ, 𝑔, 𝑒〉, 〈𝑡, 𝑛, 𝑎𝑎, 𝑓, 𝑐, 𝑞, 𝑧, 𝑤, 𝑚, 𝑛, 𝑒, 𝑤, 𝑛, 𝑤, 𝑛, 𝑤, 𝑛, 𝑥〉, 〈𝑟, 𝑑, 𝑎𝑏, 𝑡, 𝑐, 𝑦, 𝑎𝑎, 𝑙, 𝑐, 𝑧, 𝑠, ℎ, 𝑚, 𝑒, 𝑔〉, 〈𝑓, 𝑓, 𝑓, 𝑝, 𝑝, 𝑓, 𝑝, 𝑐, 𝑝, 𝑡, 𝑞, 𝑞, 𝑎𝑎, 𝑐, 𝑎𝑎, 𝑎𝑎, 𝑚, 𝑚, 𝑒〉, 〈𝑐, 𝑡, 𝑑, 𝑑, 𝑢, 𝑢, 𝑧, 𝑒, 𝑒, 𝑚〉, 〈𝑎, 𝑞, 𝑓, 𝑦, 𝑡, 𝑡, 𝑝, 𝑐, 𝑦, 𝑎𝑎, 𝑦, 𝑦, 𝑐, 𝑦, 𝑦, 𝑐, 𝑦, 𝑐, 𝑐, 𝑧, 𝑒〉, 〈𝑡, 𝑏, 𝑏, 𝑎𝑎, 𝑚, 𝑧, 𝑖, 𝑒, 𝑔〉, 〈𝑟, 𝑡, 𝑣, 𝑐, 𝑎𝑏, 𝑐, 𝑎𝑎, 𝑠, 𝑙, 𝑧, 𝑚, 𝑠, 𝑔〉, 〈𝑐, 𝑐, 𝑏, 𝑡, 𝑙, 𝑝, 𝑎, 𝑠, 𝑧, 𝑎𝑎, 𝑖, 𝑣, 𝑖, 𝑘, 𝑚, 𝑔, 𝑒〉, 〈𝑏, 𝑡, 𝑑, 𝑐, 𝑙, 𝑠, 𝑎𝑎, 𝑦, 𝑖, 𝑜, 𝑐, 𝑔, 𝑦, 𝑐, 𝑧, 𝑚, 𝑒〉, 〈𝑡, 𝑐, 𝑎𝑎, 𝑎, 𝑎, 𝑎, 𝑢, 𝑢, 𝑓, 𝑦, 𝑐, 𝑣, 𝑦, 𝑐, 𝑚, 𝑒〉, 〈𝑎𝑏, 𝑛, 𝑡, 𝑐, 𝑎𝑎, 𝑧, 𝑟, 𝑚, 𝑥, 𝑙, 𝑠, 𝑒, ℎ, 𝑔〉, 〈𝑎, 𝑝, 𝑞, 𝑐, 𝑐, 𝑡, 𝑧, 𝑣, 𝑎𝑎, 𝑚, 𝑒〉, 〈𝑡, 𝑏, 𝑐, 𝑛, 𝑥, 𝑙, 𝑎𝑎, 𝑦, 𝑠, 𝑐, 𝑗, 𝑖, 𝑦, 𝑐, 𝑔, 𝑧, 𝑚, 𝑒〉, 〈𝑞, 𝑛, 𝑡, 𝑓, 𝑐, 𝑤, 𝑦, 𝑛, 𝑐, 𝑥, 𝑎𝑎, 𝑧, 𝑚, 𝑒〉,

〈𝑓, 𝑎, 𝑢, 𝑝, 𝑡, 𝑐, 𝑣, 𝑎𝑎, 𝑧, 𝑚, 𝑒〉, 〈𝑐, 𝑛, 𝑡, 𝑢, 𝑤, 𝑓, 𝑦, 𝑎𝑎, 𝑛, 𝑐, 𝑥, 𝑦, 𝑐, 𝑧, 𝑚, 𝑒〉, 〈𝑡, 𝑑, 𝑎𝑎, 𝑏, 𝑙, 𝑐, 𝑠, 𝑧, 𝑚, 𝑗, 𝑒, 𝑖, 𝑔〉, 〈𝑐, 𝑓, 𝑞, 𝑛, 𝑦, 𝑐, 𝑐, 𝑐, 𝑐, 𝑐, 𝑧, 𝑎𝑎, 𝑚, 𝑒〉] 𝐿 (test_log_april_8) = [〈𝑐, 𝑎, 𝑜, 𝑔, 𝑝, 𝑛, 𝑗〉, 〈𝑡, 𝑐, 𝑒, 𝑒, 𝑔, 𝑞, 𝑝, 𝑞〉, 〈𝑐, 𝑔, 𝑎, 𝑝, 𝑖, 𝑛, 𝑗〉, 〈𝑐, 𝑡, 𝑒, 𝑔, 𝑝, 𝑒, 𝑙, ℎ, 𝑢〉, 〈𝑜, 𝑐, 𝑎, 𝑖, 𝑖, 𝑟, 𝑟, 𝑗, 𝑢〉, 〈𝑒, 𝑐, 𝑔, 𝑞, 𝑝, 𝑢〉, 〈𝑐, 𝑎, 𝑔, 𝑜, 𝑝, 𝑖, 𝑡, 𝑗〉, 〈 𝑐, 𝑔, 𝑔, 𝑖, 𝑔, 𝑘, 𝑔, 𝑙, 𝑐〉, 〈𝑔, 𝑙, 𝑐〉, 〈𝑐, 𝑓, 𝑔, 𝑝, 𝑝, ℎ, 𝑢, 𝑙, 𝑐, 𝑔, 𝑓, 𝑝, 𝑢, ℎ, 𝑙, 𝑙, 𝑙〉, 〈𝑡, 𝑒, 𝑐, 𝑔, 𝑖, 𝑝, 𝑟, 𝑢, 𝑖, 𝑗〉, 〈𝑒, 𝑡, 𝑐, 𝑔, 𝑝, 𝑞, 𝑢, 𝑒〉, 〈𝑐, 𝑜, 𝑔, 𝑙, 𝑝, 𝑢, 𝑞, 𝑒〉, 〈𝑐, 𝑔, 𝑓, 𝑝, 𝑒, 𝑤〉, 〈𝑒, 𝑐, 𝑡, 𝑔, 𝑙, 𝑞, 𝑝, 𝑢, 𝑒〉, 〈𝑎, 𝑐, 𝑜, 𝑔, 𝑖, 𝑝, 𝑖, 𝑢, 𝑟, 𝑗〉, 〈𝑔, 𝑢, ℎ, 𝑒〉, 〈𝑐, 𝑎, 𝑔, 𝑞, 𝑝, 𝑢〉, 〈𝑐, 𝑐, 𝑐, 𝑐, 𝑜, 𝑐, 𝑖, 𝑖, 𝑛, 𝑖, 𝑗〉, 〈𝑎, 𝑐, 𝑜, 𝑔, 𝑝, 𝑒, 𝑢〉]

𝐿 (test_log_april_9) = [〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑠, 𝑚, 𝑐, 𝑜, 𝑛, 𝑝, 𝑑, 𝑏, 𝑡, 𝑟, 𝑘, 𝑗〉, 〈ℎ, ℎ, 𝑔, 𝑞, 𝑖, 𝑐, 𝑛, 𝑚, 𝑒, 𝑡, 𝑟, 𝑓〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑠, 𝑚, 𝑛, 𝑐, 𝑜, 𝑝, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, 𝑞, ℎ, 𝑙, 𝑖, 𝑚, 𝑠, 𝑛, 𝑑, 𝑏, 𝑡, 𝑟, 𝑘, 𝑗 〉, 〈𝑔, 𝑎, 𝑔, 𝑞, 𝑞, 𝑖, 𝑙, 𝑐, 𝑐, 𝑝, 𝑜, 𝑜, 𝑜, 𝑒, 𝑜, 𝑓〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑙, 𝑐, 𝑜, 𝑝, 𝑏, 𝑡, 𝑡, 𝑟, 𝑗, 𝑓〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑚, 𝑠, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑘, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑐, 𝑜, 𝑠, 𝑝, 𝑚, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑚, 𝑐, 𝑜, 𝑠, 𝑝, 𝑛, 𝑒, 𝑡, 𝑏, 𝑟, 𝑓, 𝑗〉, 〈𝑔, 𝑔, 𝑞, 𝑙, 𝑙, 𝑖, 𝑖, 𝑜, 𝑜, 𝑝, 𝑝, 𝑝, 𝑑, 𝑟, 𝑟, 𝑘〉, 〈𝑎, 𝑎, 𝑔, 𝑞, ℎ, 𝑙, 𝑐, 𝑖, 𝑠, 𝑚, 𝑝, 𝑏, 𝑜, 𝑡, 𝑡, 𝑟, 𝑘, 𝑘, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑠, 𝑚, 𝑐, 𝑛, 𝑝, 𝑜, 𝑒, 𝑏, 𝑡, 𝑟, 𝑘, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑐, 𝑠, 𝑝, 𝑜, 𝑚, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, 𝑞, ℎ, 𝑙, 𝑖, 𝑠, 𝑐, 𝑚, 𝑝, 𝑜, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑎, 𝑎, 𝑞, 𝑖, 𝑔, 𝑚, 𝑠, 𝑠, 𝑜, 𝑜, 𝑝, 𝑛, 𝑛, 𝑛, 𝑒, 𝑡, 𝑓, 𝑏, 𝑗, 𝑗〉, 〈𝑎, ℎ, ℎ, ℎ, 𝑔, 𝑞, 𝑖, 𝑙, 𝑝, 𝑜, 𝑚, 𝑠, 𝑠, 𝑑, 𝑏, 𝑡, 𝑟, 𝑟, 𝑘, 𝑗〉, 〈𝑎, 𝑎, 𝑔, 𝑔, 𝑔, 𝑙, 𝑔, 𝑙, 𝑖, 𝑐, 𝑜, 𝑐, 𝑠, 𝑠, 𝑛, 𝑛, 𝑒, 𝑡, 𝑏, 𝑗, 𝑘〉, 〈𝑎, ℎ, ℎ, 𝑞, 𝑔, 𝑚, 𝑖, 𝑐, 𝑜, 𝑐, 𝑝, 𝑜, 𝑠, 𝑒, 𝑛, 𝑡, 𝑡, 𝑓〉, 〈𝑔, 𝑎, ℎ, 𝑞, 𝑖, 𝑐, 𝑐, 𝑐, 𝑛, 𝑡, 𝑘, 𝑏, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑐, 𝑝, 𝑜, 𝑒, 𝑏, 𝑡, 𝑟, 𝑘, 𝑗〉] 𝐿 (test_log_april_10) = [〈𝑚, 𝑣, 𝑚, 𝑝, 𝑜, 𝑑, 𝑡, 𝑓, 𝑘, 𝑢, ℎ, 𝑠〉, 〈𝑚, 𝑣, 𝑝, 𝑚, 𝑡, 𝑜, 𝑓, 𝑘, 𝑑, 𝑢, 𝑠, ℎ〉, 〈𝑣, 𝑚, 𝑝, 𝑚, 𝑜, 𝑑, 𝑑, 𝑓, 𝑓, 𝑘, 𝑘, 𝑢, ℎ〉, 〈𝑚, 𝑚, 𝑝, 𝑣, 𝑡, 𝑜, 𝑘, 𝑓, 𝑑, 𝑢, 𝑠, ℎ〉,

〈𝑣, 𝑚, 𝑡, 𝑚, 𝑝, 𝑑, 𝑜, 𝑘, 𝑓, 𝑓, 𝑢, 𝑠, ℎ〉, 〈𝑚, 𝑣, 𝑝, 𝑚, 𝑡, 𝑜, 𝑓, 𝑑, 𝑘, 𝑟, 𝑠, ℎ 〉, 〈𝑣, 𝑚, 𝑚, 𝑝, 𝑡, 𝑜, 𝑑, 𝑘, 𝑢, 𝑓, ℎ, 𝑠〉, 〈𝑣, 𝑚, 𝑚, 𝑝, 𝑜, 𝑡, 𝑑, 𝑘, 𝑓, 𝑢, 𝑠, ℎ〉, 〈𝑚, 𝑣, 𝑝, 𝑚, 𝑜, 𝑡, 𝑓, 𝑑, 𝑘, 𝑢, ℎ, 𝑠〉, 〈𝑚, 𝑚, 𝑜, 𝑣, 𝑓, 𝑑, 𝑑, 𝑘, 𝑠, 𝑡〉, 〈𝑚, 𝑣, 𝑚, 𝑝, 𝑡, 𝑜, 𝑘, 𝑓, 𝑑, 𝑢, ℎ, 𝑠〉, 〈𝑚, 𝑝, 𝑚, 𝑡, 𝑝, 𝑜, 𝑑, 𝑓, 𝑘, 𝑠, 𝑢, ℎ〉, 〈𝑚, 𝑣, 𝑚, 𝑡, 𝑝, 𝑜, 𝑜, 𝑘, 𝑓, 𝑢, ℎ, 𝑠〉, 〈𝑣, 𝑣, 𝑣, 𝑚, 𝑝, 𝑜, 𝑑, 𝑜, 𝑘, 𝑓, 𝑢, ℎ〉, 〈𝑣, 𝑚, 𝑚, 𝑝, 𝑝, 𝑜, 𝑑, 𝑑, 𝑟, 𝑓, ℎ〉, 〈𝑣, 𝑣, 𝑚, 𝑚, 𝑡, 𝑡, 𝑝, 𝑜, 𝑓, 𝑑, 𝑘, ℎ〉, 〈𝑤, 𝑎, 𝑗, 𝑗, 𝑤, 𝑗, 𝑗〉, 〈𝑝, 𝑜, 𝑑, 𝑚, 𝑘, 𝑢, 𝑢, 𝑓, 𝑠, ℎ〉, 〈𝑚, 𝑚, 𝑡, 𝑡, 𝑝, 𝑜, 𝑑, 𝑘, 𝑢, ℎ, 𝑠〉, 〈𝑚, 𝑝, 𝑚, 𝑡, 𝑝, 𝑜, 𝑘, 𝑑, 𝑓, 𝑢, 𝑠, ℎ〉]

Using the same classifier ($\underline{e} = \#_{act\_name}(e)$), we further obtain the second set of 200 traces from the Test event logs test_log_may_1 to test_log_may_10, i.e., 20 traces for each log, as follows:
𝐿 (test_log_may_1) = [〈𝑔, 𝑖, 𝑒, 𝑞, ℎ, 𝑙, 𝑟, 𝑚, 𝑑, 𝑝, 𝑓, 𝑜〉, 〈𝑔, 𝑞, ℎ, ℎ, 𝑟, 𝑟, 𝑒, 𝑖, 𝑚, 𝑙, 𝑓, 𝑑, 𝑜, 𝑝〉, 〈𝑏, 𝑏, 𝑔, ℎ, 𝑖, 𝑟, 𝑚, 𝑞, 𝑒, 𝑙, 𝑑, 𝑝, 𝑜, 𝑓〉, 〈𝑏, ℎ, 𝑖, ℎ, 𝑞, 𝑚, 𝑚, 𝑟, 𝑜, 𝑝, 𝑓, 𝑓〉, 〈𝑏, 𝑔, 𝑒, 𝑞, ℎ, 𝑟, 𝑙, 𝑖, 𝑚, 𝑓, 𝑑, 𝑜, 𝑝〉, 〈𝑏, 𝑔, 𝑒, 𝑖, 𝑙, 𝑞, ℎ, 𝑟, 𝑑, 𝑝, 𝑜, 𝑜〉,

〈𝑔, 𝑞, ℎ, 𝑒, 𝑙, 𝑟, 𝑖, 𝑚, 𝑓, 𝑑, 𝑝, 𝑜〉, 〈𝑔, 𝑞, 𝑖, ℎ, 𝑒, 𝑟, 𝑙, 𝑑, 𝑝, 𝑜, 𝑓〉, 〈𝑏, 𝑒, 𝑒, 𝑖, 𝑙, 𝑞, 𝑟, 𝑜, 𝑓, 𝑑, 𝑝〉, 〈𝑏, 𝑏, 𝑐, 𝑛, 𝑖, 𝑒, 𝑞, ℎ, 𝑚, 𝑑, 𝑓, 𝑜, 𝑝, 𝑜〉, 〈𝑔, ℎ, 𝑒, 𝑖, 𝑙, 𝑟, 𝑞, 𝑚, 𝑑, 𝑓, 𝑜, 𝑜, 𝑝〉, 〈𝑐, 𝑛, ℎ, 𝑒, 𝑞, 𝑞, 𝑖, 𝑙, 𝑟, 𝑚, 𝑚, 𝑑, 𝑑, 𝑜, 𝑝〉, 〈𝑏, 𝑎, 𝑗, 𝑘, 𝑏, 𝑏, 𝑔, ℎ, 𝑖, 𝑞, 𝑒, 𝑟, 𝑚, 𝑙, 𝑓, 𝑜, 𝑑, 𝑝〉, 〈𝑏, 𝑗, 𝑎, 𝑘, 𝑏, 𝑏, 𝑏, 𝑏, 𝑏, 𝑏, 𝑏, 𝑔, 𝑒, 𝑞, ℎ, 𝑖, 𝑟, 𝑚, 𝑙, 𝑜, 𝑑, 𝑓, 𝑝〉, 〈𝑛, 𝑛, 𝑖, 𝑖, 𝑖, 𝑞, ℎ, ℎ, 𝑒, 𝑟, 𝑚, 𝑜, 𝑓, 𝑝, 𝑜〉, 〈𝑔, 𝑒, ℎ, 𝑟, 𝑙, 𝑚, 𝑚, 𝑓, 𝑜, 𝑜, 𝑑, 𝑝〉, 〈𝑔, 𝑔, 𝑞, 𝑖, 𝑒, ℎ, 𝑟, 𝑙, 𝑚, 𝑝, 𝑑, 𝑜〉, 〈𝑏, 𝑔, ℎ, 𝑖, 𝑒, 𝑞, 𝑟, 𝑙, 𝑚, 𝑜, 𝑓, 𝑑, 𝑝〉, 〈𝑏, 𝑔, 𝑖, 𝑒, 𝑙, 𝑞, ℎ, 𝑟, 𝑚, 𝑓, 𝑑, 𝑜, 𝑝〉, 〈𝑎, 𝑘, 𝑗, 𝑔, 𝑖, 𝑞, 𝑒, ℎ, 𝑙, 𝑟, 𝑚, 𝑑, 𝑓, 𝑜, 𝑝〉] 𝐿 (test_log_may_2) = [〈𝑓, 𝑐, 𝑜, 𝑔, 𝑙, 𝑚, 𝑡, 𝑒, 𝑛, 𝑗, 𝑑, 𝑓, 𝑐, 𝑔, 𝑙, 𝑚, 𝑡, 𝑒, 𝑗, 𝑑〉, 〈𝑎, ℎ, 𝑖, 𝑔, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, 𝑎, 𝑝, 𝑒, 𝑝, 𝑛〉, 〈𝑎, 𝑠, ℎ, 𝑘, 𝑒, 𝑗, 𝑑, 𝑗〉, 〈𝑓, 𝑙, 𝑔, 𝑖, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, 𝑙, 𝑘, 𝑠, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, 𝑘, ℎ, 𝑠, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, ℎ, 𝑒, 𝑛, 𝑑, 𝑗〉, 〈𝑎, 𝑜, ℎ, 𝑝, 𝑚, 𝑞, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, 𝑘, 𝑔, 𝑠, 𝑙, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, ℎ, ℎ, ℎ, 𝑝, 𝑡, 𝑏, 𝑞, 𝑒, 𝑛, 𝑑〉,

〈𝑎, 𝑔, 𝑐, 𝑐, 𝑝, 𝑙, 𝑜, 𝑞, 𝑚, 𝑛, 𝑒, 𝑗, 𝑑 〉, 〈𝑎, 𝑘, 𝑠, 𝑝, 𝑙, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, 𝑔, 𝑐, 𝑠, 𝑘, 𝑙, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑓, 𝑠, ℎ, 𝑘, 𝑒, 𝑝, 𝑑〉, 〈𝑓, ℎ, 𝑝, 𝑒, 𝑛, 𝑛, 𝑗, 𝑑〉, 〈𝑓, 𝑓〉, 〈𝑎, 𝑔〉 〈𝑎, 𝑖, 𝑐, 𝑔, ℎ, 𝑒, 𝑛, 𝑗, 𝑑〉, 〈𝑎, 𝑎〉] 𝐿 (test_log_may_3) = [〈𝑝, 𝑦, 𝑠, 𝑎, 𝑒, 𝑐, 𝑏, 𝑗, 𝑞〉, 〈𝑣, 𝑦, 𝑎, 𝑝, 𝑏, 𝑒, 𝑓, 𝑛, 𝑛, 𝑖, 𝑐, 𝑠, 𝑜, 𝑜, 𝑗, 𝑞, 𝑞〉, 〈𝑏, 𝑦, 𝑝, 𝑠, 𝑒, 𝑐, 𝑎, 𝑎, 𝑎, 𝑝〉, 〈𝑦, 𝑦, 𝑐, 𝑝, 𝑏, 𝑒, 𝑠, 𝑎, 𝑗, 𝑎, 𝑗, 𝑞, 𝑞〉, 〈𝑎, 𝑝, 𝑏, 𝑦, 𝑒, 𝑐, 𝑠, 𝑗, 𝑞〉, 〈𝑝, 𝑏, 𝑘, 𝑦, 𝑐, 𝑙, 𝑒, 𝑠, 𝑗, 𝑞〉, 〈𝑘, 𝑒, 𝑝, 𝑦, 𝑙, 𝑠, 𝑐, 𝑏, 𝑗, 𝑎, 𝑞〉, 〈𝑒, 𝑏, 𝑝, 𝑎, 𝑐, 𝑦, 𝑠, 𝑗, 𝑞〉, 〈𝑒, 𝑐, 𝑔, 𝑎〉, 〈𝑎, 𝑝, 𝑦, 𝑒, 𝑠, 𝑐, 𝑏, 𝑗, 𝑞, 𝑞〉, 〈𝑔, 𝑒, 𝑎, 𝑎, 𝑐, 𝑔, 𝑒, 𝑎, 𝑐〉, 〈𝑦, 𝑏, 𝑐, 𝑒, 𝑎, 𝑝, 𝑠, 𝑗, 𝑘〉, 〈𝑏, 𝑦, 𝑎, 𝑒, 𝑐, 𝑝, 𝑠, 𝑗, 𝑞〉, 〈𝑎, 𝑦, 𝑝, 𝑏, 𝑠, 𝑐, 𝑒, 𝑗, 𝑞〉, 〈𝑦, 𝑎, 𝑐, 𝑝, 𝑏, 𝑒, 𝑠, 𝑗, 𝑞〉, 〈𝑏, 𝑒, 𝑦, 𝑣, 𝑝, 𝑐, 𝑠, 𝑎, 𝑛, 𝑖, 𝑞, 𝑜〉,

〈𝑦, 𝑎, 𝑐, 𝑝, 𝑏, 𝑒, 𝑘, 𝑠, 𝑘〉, 〈𝑐, 𝑎, 𝑎, 𝑒, 𝑣, 𝑣, 𝑝, 𝑝, 𝑝, 𝑛, 𝑏, 𝑏, 𝑖, 𝑦, 𝑠, 𝑗, 𝑜, 𝑞〉, 〈𝑎, 𝑦, 𝑐, 𝑝, 𝑏, 𝑒, 𝑠, 𝑗, 𝑞〉, 〈𝑏, 𝑏, 𝑦, 𝑝, 𝑐, 𝑠, 𝑗, 𝑞〉] 𝐿 (test_log_may_4) = [〈𝑛, 𝑖, 𝑓, 𝑏, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈 𝑓, 𝑏, 𝑡, 𝑙, 𝑣, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑗, 𝑜, 𝑠, 𝑝, 𝑟, 𝑖, 𝑓, 𝑏, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑏, 𝑖, 𝑓, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑖, 𝑓, 𝑡, 𝑏, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑏, 𝑓, 𝑖, 𝑙, 𝑣, 𝑒, 𝑚, ℎ, 𝑐, 𝑞〉, 〈𝑔, 𝑖, 𝑏, 𝑓, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑎, 𝑎, 𝑎, 𝑔, 𝑡, 𝑙, 𝑖, 𝑙, 𝑙, 𝑣, 𝑢, 𝑘, 𝑐, 𝑑, ℎ, 𝑞〉, 〈𝑔, 𝑓, 𝑖, 𝑡, 𝑙, 𝑏, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑓, 𝑔, 𝐼, 𝑏, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑎, 𝑔, 𝑖, 𝑓, 𝑏, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑏, 𝑓, 𝑡, 𝑖, 𝑙, 𝑣, 𝑢, 𝑢, 𝑒, 𝑘, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑎, 𝑓, 𝑏, 𝑖, 𝑡, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑓, 𝑔, 𝑡, 𝑏, 𝑖, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑓, 𝑖, 𝑡, 𝑏, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑓, 𝑓, 𝑓, 𝑡, 𝑏, 𝑙, 𝑖, 𝑒, 𝑣, 𝑘, 𝑚, 𝑐, ℎ, ℎ, 𝑞〉, 〈𝑔, 𝑖, 𝑎, 𝑏, 𝑓, 𝑡, 𝑙, 𝑑, 𝑒, 𝑐, 𝑐, ℎ, 𝑞〉, 〈𝑎, 𝑔, 𝑖, 𝑓, 𝑡, 𝑏, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑑, 𝑐, ℎ, 𝑞〉, 〈𝑓, 𝑡, 𝑏, 𝑣, 𝑙, 𝑢, 𝑢, 𝑢, 𝑢, 𝑒, 𝑚, 𝑘, 𝑐, ℎ, 𝑞〉, 〈𝑔, 𝑓, 𝑡, 𝑏, 𝑖, 𝑙, 𝑣, 𝑢, 𝑒, 𝑘, 𝑚, 𝑐, ℎ, 𝑞〉]

𝐿 (test_log_may_5) = [〈𝑎, ℎ, 𝑓, 𝑡, 𝑏, 𝑎𝑏, 𝑒, 𝑐, 𝑗, 𝑧, 𝑚, 𝑣, 𝑜, 𝑤, 𝑝, 𝑙, 𝑔〉, 〈𝑎, 𝑞, 𝑟, 𝑓, 𝑒, 𝑐, 𝑡, 𝑜, 𝑗, 𝑚, 𝑤, 𝑣, 𝑥, 𝑙, 𝑔, 𝑝〉, 〈ℎ, 𝑎, 𝑓, 𝑏, 𝑐, 𝑡, 𝑎𝑎, 𝑚, 𝑧, 𝑜, 𝑗, 𝑎𝑎, 𝑤, 𝑎𝑎, 𝑣, 𝑝, 𝑙, 𝑏, 𝑏, 𝑏〉, 〈𝑎, 𝑓, 𝑐, 𝑑, 𝑡, 𝑚, 𝑠, 𝑠, 𝑜, 𝑜, 𝑤, 𝑙, 𝑔, 𝑔〉, 〈𝑞, 𝑟, 𝑓, 𝑑, 𝑐, 𝑑, 𝑡, 𝑠, 𝑖, 𝑜, 𝑤, 𝑙, 𝑔, 𝑔〉, 〈𝑎, 𝑎, 𝑓, 𝑡, 𝑧, 𝑏, 𝑎𝑏, 𝑐, 𝑒, 𝑚, 𝑜, 𝑤, 𝑙, 𝑗, 𝑝〉, 𝑓, 𝑎, 𝑓, 𝑐, 𝑑, 𝑡, 𝑠, 𝑜, 𝑖, 𝑤, 𝑚, 𝑘, 𝑛, 𝑦, 𝑡, 𝑐, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔, 𝑎, 𝑑, 𝑓, 𝑑, 𝑐, 𝑠, 𝑡, 〈 〉, 𝑜, 𝑖, 𝑤, 𝑤, 𝑚, 𝑘, 𝑛, 𝑦, 𝑐, 𝑡, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔 〈ℎ, 𝑎, 𝑓, 𝑧, 𝑏, 𝑐, 𝑡, 𝑎𝑏, 𝑜, 𝑚, 𝑤, 𝑙, 𝑔〉, 〈𝑎, 𝑓, 𝑓, 𝑧, 𝑡, 𝑐, 𝑏, 𝑚, 𝑎𝑏, 𝑜, 𝑜, 𝑤, 𝑙, 𝑎, 𝑔, 𝑧, 𝑓, 𝑡, 𝑡, 𝑐, 𝑚, 𝑏, 𝑎𝑏, 𝑜, 𝑤, 𝑔, 𝑙〉, 〈𝑎, 𝑓, 𝑏, 𝑡, 𝑐, 𝑧, 𝑎𝑏, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔〉, 〈𝑞, 𝑟, 𝑟, 𝑓, 𝑐, 𝑠, 𝑑, 𝑖, 𝑡, 𝑜, 𝑜, 𝑚, 𝑤, 𝑤, 𝑙, 𝑔〉, 〈𝑞, 𝑎, 𝑟, 𝑓, 𝑐, 𝑧, 𝑏, 𝑡, 𝑚, 𝑎𝑏, 𝑜, 𝑤, 𝑘, 𝑛, 𝑦, 𝑡, 𝑐, 𝑜, 𝑤, 𝑚, 𝑙, 𝑔〉, 〈𝑞, 𝑟, 𝑓, 𝑡, 𝑒, 𝑗, 𝑣, 𝑏, 𝑥, 𝑎𝑏, 𝑐, 𝑚, 𝑧, 𝑝, 𝑜, 𝑤, 𝑘, 𝑛, 𝑢, 𝑡, 𝑐, 𝑜, 𝑤, 𝑚, 𝑘, 𝑛, 𝑢, 𝑡, 𝑐, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔〉, 〈ℎ, 𝑎, 𝑓, 𝑠, 𝑡, 𝑐, 𝑜, 𝑑, 𝑖, 𝑚, 𝑤, 𝑙, 𝑔〉, 〈𝑞, 𝑟, 𝑓, 𝑡, 𝑧, 𝑐, 𝑏, 𝑜, 𝑒, 𝑎𝑎, 𝑚, 𝑗, 𝑏, 𝑤, 𝑙, 𝑣, 𝑥, 𝑎𝑎, 𝑏, 𝑔, 𝑝, 𝑎𝑏〉, 〈ℎ, 𝑓, 𝑒, 𝑏, 𝑗, 𝑡, 𝑐, 𝑜, 𝑧, 𝑣, 𝑥, 𝑚, 𝑝, 𝑎𝑎, 𝑤, 𝑏, 𝑎𝑎, 𝑙, 𝑏, 𝑎𝑏, 𝑔〉, 〈𝑞, 𝑎, 𝑟, 𝑓, 𝑠, 𝑡, 𝑐, 𝑖, 𝑜, 𝑑, 𝑚, 𝑤, 𝑘, 𝑛, 𝑢, 𝑐, 𝑡, 𝑚, 𝑜, 𝑤, 𝑙, 𝑔〉, 〈𝑎, 𝑎, 𝑚, 𝑒, 𝑗, 𝑗, 𝑤, 𝑘, 𝑣, 𝑣, 𝑣, 𝑣, 𝑣, 𝑣, 𝑣, 𝑐, 𝑢, 𝑡, 𝑚, 𝑜, 𝑙, 𝑔〉, 〈 𝑓, 𝑠, 𝑡, 𝑑, 𝑐, 𝑖, 𝑜, 𝑚, 𝑤, 𝑙, 𝑔〉, 𝑞, 𝑎, 𝑟, 𝑡, 𝑓, 𝑒, 𝑐, 𝑗, 𝑣, 𝑜, 𝑚, 𝑝, 𝑤, 𝑦, 𝑡, 𝑡, 𝑐, 𝑚, 𝑜, 𝑜, 𝑜, 𝑤, 𝑛, 𝑘, 𝑦, 𝑡, 〈 〉] 𝑐, 𝑤, 𝑚, 𝑘, 𝑛, 𝑦, 𝑐, 𝑜, 𝑡, 𝑚, 𝑤, 𝑙, 𝑔 𝐿 (test_log_may_6) = [〈𝑏, 𝑞, 𝑛, 𝑔, 𝑟, 𝑜, 𝑐, 𝑗, 𝑒〉, 〈𝑎, 𝑗, 𝑟, 𝑡, ℎ, ℎ, ℎ, 𝑗, 𝑒〉, 〈𝑞, 𝑑, 𝑓, 𝑟, 𝑟, 𝑝, 𝑡, 𝑗, ℎ, 𝑒, ℎ, 𝑗, 𝑒〉,

〈𝑏, 𝑛, 𝑎, 𝑔, 𝑜, 𝑒, 𝑐, 𝑙, 𝑒, 𝑒〉, 〈𝑎, 𝑏, 𝑛, 𝑔, 𝑘, 𝑒, 𝑐, ℎ, 𝑗, 𝑒〉, 〈𝑎, 𝑏, 𝑛, 𝑔, 𝑜, 𝑡, 𝑝, 𝑒, 𝑒〉, 〈𝑚, 𝑑, 𝑡, 𝑟, 𝑗, 𝑗, ℎ, 𝑒, 𝑝, 𝑝, 𝑒, 𝑡〉, 〈𝑑, 𝑞, 𝑟, 𝑟, 𝑡, 𝑒, 𝑡, ℎ, ℎ, 𝑗, ℎ, 𝑒, 𝑙, 𝑒, 𝑒〉, 〈𝑚, 𝑏, 𝑚, 𝑛, 𝑔, 𝑜, 𝑜, 𝑐, ℎ, 𝑐, 𝑒〉, 〈𝑏, 𝑚, 𝑏, 𝑛, 𝑔, 𝑜, 𝑝, 𝑡, ℎ, ℎ, ℎ, 𝑗, 𝑒〉, 〈𝑚, 𝑏, 𝑛, 𝑔, 𝑘, 𝑐, 𝑗, 𝑒〉, 〈𝑏, 𝑚, 𝑛, 𝑔, 𝑜, 𝑗, 𝑙, 𝑐, 𝑗, 𝑒〉, 〈𝑏, 𝑞, 𝑟, 𝑛, 𝑔, 𝑜, 𝑝, 𝑗, 𝑙, 𝑡, 𝑒, ℎ, 𝑒, 𝑒〉, 〈𝑏, 𝑛, 𝑞, 𝑔, 𝑟, 𝑜, 𝑐, 𝑒, ℎ, 𝑒, 𝑒〉, 〈𝑎, 𝑏, 𝑛, 𝑜, 𝑐, 𝑐, 𝑒, 𝑒〉, 〈𝑏, 𝑎, 𝑛, 𝑔, 𝑘, 𝑒, 𝑡, 𝑝, 𝑒〉, 〈𝑏, 𝑞, 𝑔, 𝑟, 𝑘, 𝑡, 𝑡, 𝑒, 𝑝, ℎ, 𝑗, 𝑗, 𝑒, 𝑒, 𝑒〉, 〈𝑏, 𝑞, 𝑟, 𝑛, 𝑔, 𝑘, 𝑐, 𝑐, 𝑗, 𝑙, 𝑗, ℎ, 𝑒, 𝑒〉, 〈𝑑, 𝑡, 𝑟, 𝑟, 𝑒, 𝑒, 𝑐〉, 〈𝑏, 𝑛, 𝑛, 𝑛, 𝑚, 𝑔, 𝑡, 𝑘, 𝑝, 𝑗, 𝑒〉] 𝐿 (test_log_may_7) = [〈𝑡, 𝑟, 𝑎𝑎, 𝑎𝑏, 𝑙, 𝑠, 𝑛, 𝑐, 𝑧, 𝑤, ℎ, 𝑔, 𝑛, 𝑥, 𝑒〉, 〈𝑎, 𝑏, 𝑡, 𝑣, 𝑐, 𝑠, 𝑘, 𝑘, 𝑚, 𝑘, 𝑔〉, 〈𝑏, 𝑐, 𝑡, 𝑎𝑎, 𝑙, 𝑠, 𝑑, 𝑦, 𝑘, 𝑖, 𝑐, 𝑔, 𝑧, 𝑚, 𝑒〉, 〈𝑝, 𝑡, 𝑎𝑎, 𝑐, 𝑣, 𝑎, 𝑏, 𝑦, 𝑙, 𝑠, 𝑘, 𝑐, 𝑖, 𝑦, 𝑐, 𝑔, 𝑧, 𝑚, 𝑒〉, 〈𝑎𝑏, 𝑐, 𝑡, 𝑑, 𝑟, 𝑎𝑎, 𝑧, 𝑚, 𝑙, 𝑒, 𝑠, ℎ〉, 〈𝑎𝑏, 𝑟, 𝑐, 𝑙, 𝑑, 𝑡, 𝑎𝑎, 𝑠, 𝑧, 𝑚, ℎ, 𝑒, 𝑔〉, 〈𝑑, 𝑡, 𝑎𝑎, 𝑓, 𝑓, 𝑢, 𝑐, 𝑧, 𝑧, 𝑚〉, 〈𝑐, 𝑦, 𝑛, 𝑏, 𝑙, 𝑡, 𝑠, 𝑥, 𝑎𝑎, 𝑐, 𝑖, 𝑧, 𝑗, 𝑔, 𝑚, 𝑒〉,

〈𝑛, 𝑏, 𝑐, 𝑡, 𝑙, 𝑎𝑎, 𝑤, 𝑧, 𝑚, 𝑚, 𝑠, 𝑒, 𝑛, 𝑜, 𝑖, 𝑤, 𝑔, 𝑛, 𝑤, 𝑛, 𝑥〉, 〈𝑛, 𝑐, 𝑞, 𝑧, 𝑥, 𝑡, 𝑓, 𝑎𝑎, 𝑚, 𝑒〉, 〈𝑝, 𝑏, 𝑡, 𝑐, 𝑎𝑎, 𝑙, 𝑎, 𝑣, 𝑦, 𝑠, 𝑐, 𝑗, 𝑖, 𝑧, 𝑚, 𝑔, 𝑒〉, 〈𝑐, 𝑛, 𝑎𝑏, 𝑡, 𝑤, 𝑟, 𝑧, 𝑛, 𝑎𝑎, 𝑤, 𝑠, 𝑚, 𝑛, ℎ, 𝑥, ℎ, 𝑒, 𝑔〉, 〈𝑝, 𝑐, 𝑞, 𝑓, 𝑣, 𝑡, 𝑎, 𝑦, 𝑎𝑎, 𝑐, 𝑦, 𝑐, 𝑦, 𝑐, 𝑦, 𝑐, 𝑦, 𝑐, 𝑦, 𝑐, 𝑦, 𝑐, 𝑦, 𝑐, 𝑧, 𝑚, 𝑒〉, 〈𝑐, 𝑦, 𝑡, 𝑞, 𝑛, 𝑤, 𝑐, 𝑛, 𝑧, 𝑎𝑎, 𝑥, 𝑚, 𝑒〉, 〈𝑓, 𝑡, 𝑎𝑎, 𝑎𝑎, 𝑝, 𝑣, 𝑎, 𝑐, 𝑦, 𝑧, 𝑐, 𝑚, 𝑒〉, 〈𝑦, 𝑎, 𝑡, 𝑢, 𝑝, 𝑐, 𝑎𝑎, 𝑎𝑎, 𝑓, 𝑣, 𝑦, 𝑐, 𝑐, 𝑦, 𝑐, 𝑧, 𝑒〉, 〈𝑎𝑏, 𝑐, 𝑎, 𝑡, 𝑟, 𝑙, 𝑝, 𝑦, 𝑠, 𝑎𝑎, ℎ, 𝑣, 𝑔, 𝑐, 𝑧, 𝑚, 𝑒〉, 〈𝑎𝑏, 𝑐, 𝑡, 𝑦, 𝑛, 𝑐, 𝑟, 𝑙, 𝑎𝑎, 𝑎𝑎, 𝑧, 𝑤, 𝑠, 𝑠, 𝑛, 𝑛, 𝑛, 𝑚, ℎ, 𝑤, 𝑒, 𝑛, 𝑔, 𝑥〉, 〈𝑑, 𝑐, 𝑏, 𝑡, 𝑙, 𝑦, 𝑎𝑎, 𝑐, 𝑠, 𝑧, 𝑜, 𝑖, 𝑚, 𝑒, 𝑔〉, 〈𝑟, 𝑡, 𝑑, 𝑎𝑏, 𝑐, 𝑦, 𝑙, 𝑎𝑎, 𝑐, 𝑧, 𝑠, 𝑚, 𝑒, ℎ, 𝑔〉] 𝐿 (test_log_may_8) = [〈𝑒, 𝑡, 𝑔, 𝑖, 𝑝, 𝑡, 𝑡, 𝑗〉, 〈𝑡, 𝑡, 𝑐, 𝑒, 𝑔, 𝑢, 𝑖, 𝑗, 𝑟, 𝑗〉, 〈𝑎, 𝑐, 𝑔, 𝑝, 𝑖, 𝑟, 𝑖, 𝑖, 𝑢, 𝑗, 𝑗〉, 〈𝑒, 𝑡, 𝑐, 𝑔, 𝑤, 𝑝, 𝑢, 𝑢, 𝑙, 𝑒, 𝑡, 𝑐, 𝑔, 𝑤, 𝑝, 𝑢, 𝑙, 𝑒〉, 〈𝑜, 𝑐, 𝑎, 𝑔, 𝑝, 𝑒, 𝑙, 𝑤, 𝑢〉, 〈𝑐, 𝑎, 𝑔, 𝑒, 𝑝, 𝑞, 𝑒, 𝑢〉, 〈𝑐, 𝑓, 𝑔, 𝑙, 𝑝, 𝑞, 𝑒〉, 〈𝑐, 𝑓, 𝑔, 𝑤, 𝑙〉, 〈𝑓, 𝑐, 𝑖, 𝑖, 𝑝, 𝑢, 𝑟〉, 〈𝑓, 𝑐, 𝑔, 𝑙, 𝑞, 𝑒, 𝑝〉, 〈𝑐, 𝑓, 𝑔, ℎ, 𝑙〉, 〈𝑜, 𝑜, 𝑐, 𝑝, 𝑝, 𝑞, 𝑙, 𝑒〉, 〈𝑐, 𝑓, 𝑔, 𝑝, 𝑒, 𝑤〉,

〈𝑓, 𝑐, 𝑔, 𝑙, ℎ〉, 〈𝑐, 𝑓, 𝑔, 𝑞, 𝑙〉, 〈𝑡, 𝑒, 𝑐, 𝑔, ℎ, 𝑙, 𝑝, 𝑢, 𝑒〉, 〈𝑜, 𝑎, 𝑔, 𝑙, ℎ〉, 〈𝑜, 𝑐, 𝑔, 𝑎, 𝑝, 𝑢, 𝑙, 𝑞〉, 〈𝑜, 𝑐, 𝑔, 𝑔, 𝑛, 𝑖, 𝑛, 𝑗〉, 〈𝑓, 𝑐, 𝑔, 𝑞, 𝑝, 𝑙, 𝑒, 𝑢〉] 𝐿 (test_log_may_9) = [〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑖, 𝑠, 𝑚, 𝑛, 𝑒, 𝑡, 𝑏, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑞, 𝑞, 𝑙, 𝑖, 𝑙, 𝑖, 𝑖, 𝑛, 𝑚, 𝑡, 𝑒, 𝑏, 𝑏, 𝑟, 𝑓, 𝑗, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑛, 𝑒, 𝑡, 𝑏, 𝑟, 𝑘, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑠, 𝑚, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑙, 𝑖, 𝑐, 𝑜, 𝑒, 𝑝, 𝑒, 𝑡, 𝑏, 𝑟, 𝑓, 𝑗 〉, 〈𝑔, 𝑎, ℎ, 𝑞, 𝑖, 𝑐, 𝑚, 𝑚, 𝑝, 𝑠, 𝑠, 𝑜, 𝑠, 𝑛, 𝑒, 𝑡, 𝑏, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑜, 𝑏, 𝑝, 𝑏, 𝑘, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑠, 𝑐, 𝑝, 𝑚, 𝑛, 𝑜, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑐, 𝑠, 𝑚, 𝑜, 𝑝, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑚, 𝑠, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, 𝑞, 𝑙, ℎ, 𝑖, 𝑠, 𝑐, 𝑝, 𝑚, 𝑜, 𝑛, 𝑒, 𝑡, 𝑏, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑚, 𝑛, 𝑑, 𝑏, 𝑡, 𝑟, 𝑘〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑐, 𝑝, 𝑜, 𝑒, 𝑡, 𝑏, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑐, 𝑚, 𝑠, 𝑜, 𝑝, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑐, 𝑝, 𝑚, 𝑜, 𝑠, 𝑛, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑐, 𝑜, 𝑝, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈𝑎, 𝑔, ℎ, 𝑞, 𝑙, 𝑖, 𝑠, 𝑚, 𝑐, 𝑛, 𝑝, 𝑜, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉, 〈ℎ, 𝑎, 𝑞, 𝑙, 𝑔, 𝑙, 𝑖, 𝑐, 𝑝, 𝑒, 𝑏, 𝑟, 𝑓, 𝑡, 𝑗〉,

〈𝑎, ℎ, 𝑔, 𝑞, 𝑖, 𝑙, 𝑐, 𝑐, 𝑝, 𝑜, 𝑑, 𝑡, 𝑏, 𝑟, 𝑘, 𝑗〉, 〈𝑎, ℎ, 𝑔, 𝑞, 𝑙, 𝑖, 𝑐, 𝑝, 𝑠, 𝑜, 𝑛, 𝑒, 𝑒, 𝑒, 𝑏, 𝑡, 𝑟, 𝑓, 𝑗〉] 𝐿 (test_log_may_10) = [〈𝑚, 𝑚, 𝑣, 𝑡, 𝑝, 𝑜, 𝑓, 𝑑, 𝑢, 𝑘, ℎ, 𝑠〉, 〈𝑣, 𝑚, 𝑡, 𝑝, 𝑜, 𝑓, 𝑘, 𝑑, 𝑠, 𝑢, ℎ〉, 〈𝑣, 𝑡, 𝑚, 𝑚, 𝑝, 𝑜, 𝑑, 𝑓, 𝑘, 𝑢, 𝑠, ℎ〉, 〈𝑚, 𝑚, 𝑣, 𝑡, 𝑜, 𝑘, 𝑠, 𝑓, 𝑑, 𝑢, ℎ〉, 〈𝑣, 𝑚, 𝑚, 𝑡, 𝑝, 𝑜, 𝑑, 𝑘, 𝑢, ℎ, 𝑓, 𝑠〉, 〈𝑚, 𝑚, 𝑝, 𝑝, 𝑝, 𝑡, 𝑜, 𝑜, 𝑑, 𝑓, 𝑓, 𝑓, 𝑢, 𝑘, 𝑠, 𝑢, ℎ, ℎ〉, 〈𝑚, 𝑚, 𝑣, 𝑡, 𝑜, 𝑑, 𝑓, 𝑢, 𝑓, 𝑠, ℎ〉, 〈𝑚, 𝑚, 𝑝, 𝑣, 𝑜, 𝑘, 𝑑, 𝑓, 𝑡, 𝑠, 𝑢, ℎ 〉, 〈𝑚, 𝑝, 𝑚, 𝑡, 𝑝, 𝑜, 𝑓, 𝑘, 𝑠, 𝑑, 𝑢, ℎ〉, 〈𝑚, 𝑣, 𝑚, 𝑝, 𝑜, 𝑡, 𝑑, 𝑓, 𝑢, 𝑘, ℎ, 𝑠〉 〈𝑐, 𝑘, 𝑠, 𝑑, 𝑓, 𝑣, 𝑣, 𝑢, ℎ, 𝑡〉, 〈𝑚, 𝑚, 𝑝, 𝑝, 𝑜, 𝑡, 𝑑, 𝑓, 𝑘, 𝑠, 𝑢, ℎ〉, 〈𝑚, 𝑝, 𝑣, 𝑣, 𝑜, 𝑡, 𝑑, 𝑘, 𝑓, 𝑢, ℎ, 𝑠〉, 〈𝑣, 𝑡, 𝑚, 𝑚, 𝑝, 𝑜, 𝑑, 𝑘, 𝑓, 𝑟, 𝑠, ℎ〉, 〈𝑚, 𝑣, 𝑚, 𝑡, 𝑝, 𝑜, 𝑘, 𝑠, 𝑓, 𝑑, 𝑢, ℎ〉 〈𝑎, 𝑎, 𝑤, 𝑗, 𝑒〉, 〈𝑚, 𝑚, 𝑣, 𝑡, 𝑡, 𝑜, 𝑜, 𝑢, 𝑓, 𝑘, 𝑠, ℎ〉, 〈𝑚, 𝑣, 𝑡, 𝑚, 𝑝, 𝑝, 𝑜, 𝑑, 𝑢, 𝑘, ℎ, 𝑠〉 〈𝑐, 𝑣, 𝑘, 𝑡, 𝑚, 𝑑, 𝑓, 𝑠, 𝑢, ℎ〉, 〈𝑚, 𝑚, 𝑝, 𝑝, 𝑝, 𝑝, 𝑜, 𝑑, 𝑢, 𝑘, ℎ, ℎ, 𝑓〉]

Checking the Individual Cases (Traces): XES Format – Fuzzy Mining
Currently, the most widely used standard for storing and exchanging event logs across different process mining platforms is XES (eXtensible Event Stream), because the XES format is less restrictive and truly extensible. XES was adopted by the IEEE Task Force on Process Mining in 2010 as the standard format for process mining, and is supported by tools such as ProM, Disco, XESame and OpenXES. The Training Log and Test Logs for the process discovery contest have been provided in XES format. An XES document contains logs, which consist of traces. Each trace describes a sequential list of events corresponding to a particular case in terms of the concept:name attribute (𝑐𝑎𝑠𝑒_𝑖𝑑 and 𝑎𝑐𝑡_𝑛𝑎𝑚𝑒). The XES files refer to extensions to provide semantics for the logs. As shown in Figure 1, attributes can be of five core types: String, Date, Int, Float, and Boolean. The 𝑐𝑎𝑠𝑒_𝑖𝑑 and 𝑎𝑐𝑡_𝑛𝑎𝑚𝑒 attributes, for example, are of type String.

Fig 1. Attributes declaration in XES

The extensions give semantics to particular attributes. This extension corresponds to the #𝑎𝑐𝑡_𝑛𝑎𝑚𝑒(𝑒) attribute, which we used to classify the traces for the test logs. There are three classifiers defined by XES, namely:
i. Classifier Activity (concept:name),
ii. Classifier Resource (org:resource),
iii. Classifier Both (concept:name and org:resource).

However, for the purpose of the work in this document we focus on the Classifier Activity, because our objective is to classify the events in the test log based on the concept:name attributes (i.e. 𝑎𝑐𝑡_𝑛𝑎𝑚𝑒 for Event Name, and 𝑐𝑎𝑠𝑒_𝑖𝑑 for Lifecycle Transition) and then cross-validate the resulting traces with the Training Model. XES supports the classifier concept and as such helps in specifying the list of attributes associated with concept:name, as shown in Figure 2.

Fig 2. Fragment of the XES file format for the test event log
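Reading the traces out of an XES file can be sketched with Python's standard XML parser. This is a hedged sketch, not the procedure used with Disco or ProM: the `SAMPLE_XES` fragment is a made-up, minimal example (real XES files also carry extension, global and classifier declarations), and the helper names `local`, `concept_name` and `read_traces` are hypothetical. It assumes that both the case identifier and the activity name are stored under the `concept:name` key, as described above.

```python
import xml.etree.ElementTree as ET

# Made-up minimal XES fragment: two traces identified by concept:name.
SAMPLE_XES = """<log xes.version="1.0">
  <trace>
    <string key="concept:name" value="1"/>
    <event><string key="concept:name" value="b"/></event>
    <event><string key="concept:name" value="g"/></event>
  </trace>
  <trace>
    <string key="concept:name" value="2"/>
    <event><string key="concept:name" value="g"/></event>
  </trace>
</log>"""


def local(tag):
    """Strip an XML namespace prefix such as '{http://...}trace'."""
    return tag.rsplit("}", 1)[-1]


def concept_name(element):
    """Return the value of the direct child attribute keyed 'concept:name'."""
    for child in element:
        if local(child.tag) == "string" and child.get("key") == "concept:name":
            return child.get("value")
    return None


def read_traces(xes_text):
    """Return {case id: [activity names in recorded order]}."""
    root = ET.fromstring(xes_text)
    traces = {}
    for element in root.iter():
        if local(element.tag) != "trace":
            continue
        events = [child for child in element if local(child.tag) == "event"]
        traces[concept_name(element)] = [concept_name(e) for e in events]
    return traces


traces = read_traces(SAMPLE_XES)
```

The result maps each case to its activity sequence, which is the same view that the case listing in a mining tool provides and can be compared directly against the classified traces.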

Following the technique described in the sections above, we classified the test event logs. We also imported the XES files for the Test Logs into Disco [2] to see in detail how these processes have been performed (process mapping) and, more importantly, to determine the individual cases (traces) that make up each process, in order to check whether they match the classified traces.

Fig. 3(a) Event Log mining and analysis using Fuzzy miner algorithm in Disco.

In Fig. 3(a) we assigned the ID tag to the first column (case_id) in order to identify the events, and the Activity tag to the second column (act_name), which holds the set of activities that make up the process. The result is a fuzzy model that represents the various cases and the mapping of activity sequences, as shown in Fig. 3(b) and 3(c).

Fig. 3(b) Case View for the test_log_april_1 showing the 20 cases and graph for activities sequence

Fig 3(c) Case view for the test_log_april_1 showing the 20 cases with an example of case 1 (trace) with 13 events and table of the set of Activities for trace 1.

The approach described above is what we used to check the results of our classification task, to see whether they conform to the given event logs.
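The kind of activity-sequence map shown in the case view can be approximated by counting which activity directly follows which. The sketch below is a simplified stand-in and an assumption on our part: it does not reproduce Disco's Fuzzy Miner, which applies further abstraction and filtering, and the function name `directly_follows` is hypothetical. The two input traces are the first two traces of test_log_april_1 as listed earlier in this document.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity x is immediately followed by activity y."""
    counts = Counter()
    for trace in traces:
        counts.update(zip(trace, trace[1:]))
    return counts


# First two traces of test_log_april_1 (13 and 15 events).
traces = [
    ["b", "g", "e", "q", "h", "i", "l", "r", "m", "o", "d", "f", "p"],
    ["b", "b", "c", "n", "h", "e", "i", "q", "r", "l", "m", "f", "o", "d", "p"],
]
dfg = directly_follows(traces)
```

Each key of `dfg` is an arc of the map and each value is its frequency; for instance the arc from `o` to `d` occurs once in each of the two traces.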

Process Models Discovery
To discover the process models for the 10 training logs provided, we used the Fuzzy Miner algorithm in Disco [2] and the Inductive Miner algorithm in ProM [3] to process the data. The results are a Fuzzy model and a Petri net, respectively. We decided to use the Petri net for further analysis of our data because Fuzzy models are ambiguous: in essence they show the sequence of activities but do not describe the semantics behind the activity labels/tags [4]. According to Definition 2.2 in [1], a Petri net is a triplet 𝑁 = (𝑃, 𝑇, 𝐹), where 𝑃 is a set of places, 𝑇 is a set of transitions and 𝐹 is the flow relation between them. Petri nets are the oldest and best investigated process modelling language that allows for concurrency. Their behaviour is governed by a token-based firing rule, with AND, XOR and OR split and join notations. We imported the XES file for each training log into ProM 6.6 [3] to carry out discovery and further analysis of the event data. Fig. 4(a) and 4(b) describe the steps we took to discover the Petri net model for the event logs. Fig. 5(a) to 5(j) show the Fuzzy process models discovered using the Fuzzy Miner algorithm in Disco [2], while Fig. 6(a) to 6(j) show the Petri net models discovered using the Inductive Miner algorithm in the ProM workbench [3].
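The triplet 𝑁 = (𝑃, 𝑇, 𝐹) and its firing rule can be made concrete with a short sketch. The class below is illustrative only and is not how ProM represents nets; the class name `PetriNet`, its methods and the small example net are all hypothetical. The example net has an AND-split after `a` and an AND-join before `d`, so `b` and `c` may occur in either order.

```python
from collections import Counter


class PetriNet:
    """N = (P, T, F): places, transitions and a flow relation of arcs."""

    def __init__(self, places, transitions, flow):
        self.places = set(places)
        self.transitions = set(transitions)
        self.flow = set(flow)  # arcs (source, target)

    def preset(self, t):
        """Input places of transition t."""
        return {src for src, dst in self.flow if dst == t}

    def postset(self, t):
        """Output places of transition t."""
        return {dst for src, dst in self.flow if src == t}

    def enabled(self, marking, t):
        """t is enabled if every input place holds at least one token."""
        return all(marking[p] >= 1 for p in self.preset(t))

    def fire(self, marking, t):
        """Consume one token per input place, produce one per output place."""
        if not self.enabled(marking, t):
            raise ValueError(f"transition {t!r} is not enabled")
        new_marking = Counter(marking)
        for p in self.preset(t):
            new_marking[p] -= 1
        for p in self.postset(t):
            new_marking[p] += 1
        return +new_marking  # unary plus drops places with zero tokens


# Hypothetical example net: a ; (b || c) ; d
net = PetriNet(
    places={"start", "p1", "p2", "p3", "p4", "end"},
    transitions={"a", "b", "c", "d"},
    flow={
        ("start", "a"), ("a", "p1"), ("a", "p2"),
        ("p1", "b"), ("p2", "c"),
        ("b", "p3"), ("c", "p4"),
        ("p3", "d"), ("p4", "d"), ("d", "end"),
    },
)

marking = Counter({"start": 1})
for transition in ["a", "c", "b", "d"]:
    marking = net.fire(marking, transition)
```

After firing the four transitions, the only remaining token sits in `end`, so the sequence a, c, b, d is a complete firing sequence of this example net.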

Fig. 4(a) XES file import in ProM

Fig. 4(b) Inductive Miner Algorithm in ProM

Fig. 5(a) Fuzzy Model for training_log_1

Fig. 6(a) Petri net Model for training_log_1

Fig. 5(b) Fuzzy Model for training_log_2

Fig. 6(b) Petri net Model for training_log_2

Fig. 5(c) Fuzzy Model for training_log_3

Fig. 6(c) Petri net Model for training_log_3

Fig. 5(d) Fuzzy Model for training_log_4

Fig. 6(d) Petri net Model for training_log_4

Fig. 5(e) Fuzzy Model for training_log_5

Fig. 6(e) Petri net Model for training_log_5

Fig. 5(f) Fuzzy Model for training_log_6

Fig. 6(f) Petri net Model for training_log_6

Fig. 5(g) Fuzzy Model for training_log_7

Fig. 6(g) Petri net Model for training_log_7

Fig. 5(h) Fuzzy Model for training_log_8

Fig. 6(h) Petri net Model for training_log_8

Fig. 5(i) Fuzzy Model for training_log_9

Fig. 6(i) Petri net Model for training_log_9

Fig. 5(j) Fuzzy Model for training_log_10

Fig. 6(j) Petri net Model for training_log_10

Process Analysis, Trace Replay and Fitness Evaluation
Process mining aims to establish a direct connection between the discovered process models and the actual low-level event data about the processes. Process discovery techniques allow the same reality to be viewed from different angles and at different levels of abstraction. To evaluate and cross-validate the classification of the test logs (April and May) against the Training model, we base our technique on balancing overfitting and underfitting as described in section 5.4.3 in [1], which measures model quality using four quality criteria: Fitness, Precision, Generalisation and Simplicity, as shown in Figure 7.

Fig 7. Four competing quality criteria for evaluation of process models [1]

In Fig. 7 we consider the four quality criteria to explain the fitness of our discovered model, as defined in section 3.6 in [1], in order to determine which fraction of the traces in the test log can be fully replayed or are disallowed.

- Fitness: the discovered model should allow for the behaviour seen in the event log. Is the event log possible according to the discovered model?
- Precision (avoid underfitting): the discovered model should not allow for behaviour completely unrelated to what was seen in the event log. Is the model not underfitting, i.e. does it not allow for too much?
- Generalization (avoid overfitting): the discovered model should generalize the example behaviour seen in the event log. Is the model not overfitting, i.e. does it allow for more than the particular examples?
- Simplicity (Occam’s razor principle): the discovered model should be as simple as possible. Is the discovered model the simplest? “One should not increase, beyond what is necessary, the number of entities required to explain anything”, i.e., one should look for the “simplest model” that can explain what is observed in the data set.

The fitness of the model discovered from the training log is judged against the classification results for the test log (see Fig. 8), an approach referred to as cross-validation in section 3.6.2 of [1].

Figure 8. Cross-validation using test set and training set [1]

According to [1], conformance checking is closely related to measuring the fitness of the discovered model, and it can also be used to evaluate and compare process discovery algorithms. Section 7.2 of [1] discusses the replay semantics (token replay) for process models with respect to the four quality criteria. Token replay shows how the notion of event log fitness can be quantified, i.e. the proportion of behaviour in the event log that is possible according to the discovered model, and it is used to establish a tight coupling between the model and the event log. In the following section, we show step by step how we used the Inductive Visual Miner plugin in ProM (Fig. 9(a) to 9(e)) to analyse the discovered Petri net model. According to [4] and [5], process deviations are a crucial part of evaluation: they show precisely which parts of the model deviate from the event log and are visualised to show which parts of the model fit well and which do not. This is important for drawing reliable conclusions. In Fig. 10(a) to 10(j) we show the visual models generated by the Inductive Visual Miner algorithm, which are useful for process instantiation, replay, and test log fitness analysis.
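To make the token replay idea more concrete, the following is a minimal sketch in Python. It is not the ProM implementation and not the net discovered in this report: the toy net NET, its place names and the helper names token_replay, fitness and fits are all hypothetical. The sketch counts produced (p), consumed (c), missing (m) and remaining (r) tokens while replaying a trace and combines them with the commonly used token-based fitness formula 1/2(1 - m/c) + 1/2(1 - r/p).

```python
from collections import Counter

# Hypothetical toy Petri net: transition -> (input places, output places).
# "a" splits into two parallel branches ("b" and "c"), which "d" joins again.
NET = {
    "a": (["start"], ["p1", "p2"]),
    "b": (["p1"], ["p3"]),
    "c": (["p2"], ["p4"]),
    "d": (["p3", "p4"], ["end"]),
}

def token_replay(trace, net=NET, source="start", sink="end"):
    """Replay one trace and return (produced, consumed, missing, remaining)."""
    marking = Counter({source: 1})
    produced, consumed, missing = 1, 0, 0  # the environment produces the initial token
    for activity in trace:  # activities absent from the net raise KeyError
        inputs, outputs = net[activity]
        for place in inputs:
            if marking[place] == 0:
                missing += 1          # record the problem and add an artificial token
                marking[place] += 1
            marking[place] -= 1
            consumed += 1
        for place in outputs:
            marking[place] += 1
            produced += 1
    # At the end, the environment consumes one token from the sink place.
    if marking[sink] == 0:
        missing += 1
        marking[sink] += 1
    marking[sink] -= 1
    consumed += 1
    remaining = sum(marking.values())  # tokens left behind in the net
    return produced, consumed, missing, remaining

def fitness(trace):
    """Token-based fitness of a single trace, between 0 and 1."""
    p, c, m, r = token_replay(trace)
    return 0.5 * (1 - m / c) + 0.5 * (1 - r / p)

def fits(trace):
    """A trace is classified as fitting only if it replays without problems."""
    _, _, m, r = token_replay(trace)
    return m == 0 and r == 0

print(fits(["a", "b", "c", "d"]), fitness(["a", "b", "c", "d"]))
print(fits(["a", "b", "d"]), fitness(["a", "b", "d"]))
```

In this toy net the trace a, b, d skips the parallel branch c, so one token is missing when d fires and one token remains in p2; the trace is therefore classified as not fitting, in the same TRUE/FALSE sense used for the test logs.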

Fig. 9(a) Inductive Visual Miner in ProM

Fig. 9(b) Resulting Visual Model

Fig. 9(c) Operators for the discovered training_log_1 process model

Fig. 9(d) Pre-mining filter configuration and settings

Fig. 9(e) Trace View for the individual cases for training_log_1

Figures 10(a) to 10(j) show the visual models discovered from the training logs using the Inductive Visual Miner plugin in ProM.

Fig. 10(a) Inductive Visual Model for training_log_1

Fig. 10(b) Inductive Visual Model for training_log_2

Fig. 10(c) Inductive Visual Model for training_log_3

Fig. 10(d) Inductive Visual Model for training_log_4

Fig. 10(e) Inductive Visual Model for training_log_5

Fig. 10(f) Inductive Visual Model for training_log_6

Fig. 10(g) Inductive Visual Model for training_log_7

Fig. 10(h) Inductive Visual Model for training_log_8

Fig. 10(i) Inductive Visual Model for training_log_9

Fig. 10(j) Inductive Visual Model for training_log_10

The final step of our approach is to determine the fitness of the individual traces from the test event logs (April and May) against the process models discovered from the training logs. To achieve this, it was necessary to construct a BPMN model with notational elements capable of describing the nesting of individual activities (referred to as Tasks), using the event-based (AND, XOR, OR) split and join gateways shown in Figure 11(a). According to [1], an event in a BPMN model is comparable to a place in a Petri net and, like Petri nets, BPMN models have token-based semantics that can be used to replay a particular trace within the discovered process model. As shown in Figure 11(b), we used the Convert Petri net to BPMN plugin in ProM to obtain the BPMN models for the training logs. Figures 12(a) to 12(j) show the resulting BPMN diagrams for training_log_1 to training_log_10.

Fig. 11(a) BPMN Gateway Notations

Fig. 11(b) Conversion of the resulting Petri net model to BPMN Diagram.

Fig. 12(a) BPMN model for training_log_1

Fig. 12(b) BPMN model for training_log_2

Fig. 12(c) BPMN model for training_log_3

Fig. 12(d) BPMN model for training_log_4

Fig. 12(e) BPMN model for training_log_5

Fig. 12(f) BPMN model for training_log_6

Fig. 12(g) BPMN model for training_log_7

Fig. 12(h) BPMN model for training_log_8

Fig. 12(i) BPMN model for training_log_9

Fig. 12(j) BPMN model for training_log_10

In Tables 1 and 2, we provide the classification attempt for the event logs (test_log_april_1 to test_log_april_10) and (test_log_may_1 to test_log_may_10) respectively. Each cell indicates whether the discovered model classifies the corresponding trace as fitting (TRUE) or not fitting (FALSE). The columns represent the process models for the training logs, while the rows represent the individual traces of the test event logs. For example, the cell at (row Trace_3; column Training model_5) contains the classification attempt for the 3rd trace from test_log_april_5, cross-validated against the model for training_log_5.

TABLE 1. TRACE FITNESS FORM FOR THE TEST EVENT LOGS (TEST_LOG_APRIL_1 TO TEST_LOG_APRIL_10)

            Training model
Trace       1      2      3      4      5      6      7      8      9      10
Trace_1     TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_2     TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_3     FALSE  FALSE  TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_4     TRUE   TRUE   TRUE   FALSE  TRUE   FALSE  TRUE   TRUE   TRUE   TRUE
Trace_5     TRUE   TRUE   TRUE   FALSE  FALSE  TRUE   TRUE   FALSE  FALSE  TRUE
Trace_6     FALSE  FALSE  TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_7     TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_8     TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   FALSE  TRUE   TRUE
Trace_9     TRUE   FALSE  TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE
Trace_10    TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_11    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_12    FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_13    FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_14    TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_15    FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_16    TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_17    TRUE   FALSE  TRUE   FALSE  TRUE   FALSE  TRUE   TRUE   FALSE  FALSE
Trace_18    FALSE  TRUE   TRUE   TRUE   FALSE  FALSE  TRUE   TRUE   FALSE  TRUE
Trace_19    TRUE   FALSE  TRUE   TRUE   TRUE   FALSE  TRUE   FALSE  FALSE  TRUE
Trace_20    TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE

TABLE 2. TRACE FITNESS FORM FOR THE TEST EVENT LOGS (TEST_LOG_MAY_1 TO TEST_LOG_MAY_10)

            Training model
Trace       1      2      3      4      5      6      7      8      9      10
Trace_1     TRUE   FALSE  FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_2     TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_3     TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_4     FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_5     TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_6     TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_7     TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   FALSE  TRUE
Trace_8     TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_9     FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_10    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_11    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_12    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_13    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_14    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_15    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_16    TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_17    TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
Trace_18    TRUE   TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   FALSE  TRUE
Trace_19    TRUE   TRUE   TRUE   FALSE  TRUE   TRUE   TRUE   FALSE  TRUE   TRUE
Trace_20    TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE   TRUE
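The way a single cell of Tables 1 and 2 is read can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names FALSE_TRACES, classification and fitting_count are hypothetical, and only the Training model_5 and Training model_7 entries of Table 1 are encoded, by listing the trace numbers marked FALSE.

```python
# Trace numbers (1-based) marked FALSE in Table 1 for two of the training models.
# Every other trace for these two models is marked TRUE in the table.
FALSE_TRACES = {
    "Training model_5": {5, 9, 18},
    "Training model_7": set(),
}
N_TRACES = 20  # each test log contains 20 traces

def classification(model, trace_no):
    """Return True if the table marks the trace as fitting for this model."""
    return trace_no not in FALSE_TRACES[model]

def fitting_count(model):
    """Number of traces the table marks as fitting for this model."""
    return sum(classification(model, t) for t in range(1, N_TRACES + 1))

# The example cell from the text: row Trace_3, column Training model_5.
print(classification("Training model_5", 3))
print(fitting_count("Training model_5"), fitting_count("Training model_7"))
```

Since each test log is stated to contain 10 traces that can be replayed and 10 that cannot, the per-model count from fitting_count can be compared against 10 as a quick sanity check on a column.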

Final Results of the Contest

The contest committee published on the website [6] (a) 10 test logs, each containing 20 traces, which are used to score the submissions; and (b) 10 reference process models in BPMN that have been generated from the original event logs that were kept secret. Tables 3, 4 and 5 show the results of the scoring by the contest committee: the model with the real classifications, the model with the provided classifications, and the final result after scoring, respectively.

TABLE 3. MODEL WITH THE REAL CLASSIFICATION

TABLE 4. MODEL WITH THE PROVIDED CLASSIFICATION

TABLE 5. FINAL RESULT AFTER SCORING BY THE CONTEST COMMITTEE

The final result after scoring by the committee shows that the process mining approach we employed has correctly classified 85.5% of the traces in the original process model.
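A score of this kind can be computed by comparing the provided classifications with the real ones, trace by trace. The sketch below is an assumption about how such a percentage is obtained, not the committee's actual scoring script; the function name accuracy and the four-trace example data are hypothetical.

```python
def accuracy(provided, real):
    """Fraction of traces whose provided classification equals the real one."""
    if len(provided) != len(real):
        raise ValueError("both classification lists must cover the same traces")
    correct = sum(p == r for p, r in zip(provided, real))
    return correct / len(real)

# Hypothetical example with four traces (True = fitting, False = not fitting).
provided = [True, False, True, True]
real = [True, True, True, True]
print(accuracy(provided, real))  # three of the four classifications agree
```

If the 85.5% is taken over the 10 x 20 = 200 traces of the committee's test logs, it corresponds to 171 correctly classified traces.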

Presently, the only other contest related to process mining is the annual Business Process Intelligence Challenge (BPIC). The BPIC uses real-life data without objective evaluation criteria: it is about the perceived value of the analysis and is not limited to the process discovery task; it also covers conformance checking, performance analysis, etc. The report is evaluated by a jury. The Process Discovery Contest is different: the focus is on process discovery, and synthetic data are used so that there is an objective “proper” answer. Process discovery is turned into a classification task with a training set and a test set, in which a process model needs to decide whether traces are fitting or not.

References

[1] W.M.P. van der Aalst (2011) “Process Mining: Discovery, Conformance and Enhancement of Business Processes”, Springer. http://www.springer.com/us/book/9783642434952

[2] Disco Software, http://fluxicon.com/disco/, User Guide: https://fluxicon.com/disco/files/Disco-User-Guide.pdf

[3] ProM Tool, http://www.processmining.org/prom/start

[4] S.J.J. Leemans, D. Fahland and W.M.P. van der Aalst (2015) Exploring Processes and Deviations. In Business Process Management Workshops, Volume 202 of the series Lecture Notes in Business Information Processing, pp. 304-316. Available at: http://wwwis.win.tue.nl/~wvdaalst/publications/z4.pdf

[5] S.J.J. Leemans, D. Fahland and W.M.P. van der Aalst (2014) Process and Deviation Exploration with Inductive visual Miner [Online] Available at: http://www.processmining.org/_media/blogs/pub2014/bpmdemoleemans.pdf (Accessed June, 2016)

[6] Process Discovery Contest @ BPM 2016: during the BPI workshop that is co-located with the BPM 2016 conference, Rio de Janeiro, Brazil, September 18th, 2016. [Online] Available at: https://www.win.tue.nl/ieeetfpm/doku.php?id=shared:process_discovery_contest
