In this third assignment we work on a real-life event log taken from a Dutch financial institution. This log contains some 262,200 events in 13,087 cases. Apart from some anonymization, the log contains all data as it came from the financial institution. The process represented in the event log is an application process for a personal loan or overdraft within a global financing organization. Your task is to analyze, visualize and mine the loan application process using ProM.
This is Assignment 3 of the Business Intelligence and Process Modelling course taught at Leiden University.
For each part of the assignment, the number of points awarded for a perfect answer is listed between brackets; together the parts sum to 100 points plus 10 bonus points. Your assignment grade (between 1 and 10, bounds included) is computed by dividing your number of points by 10 and rounding the result to the nearest half. If your grade is insufficient, you can retake the assignment by meeting the assignment retake deadline. Please hand in your work on time: a late submission counts as a failed assignment and automatically uses the retake deadline. You are allowed to work in teams consisting of exactly two people. For each question, clearly describe how you obtained your answer, and write down any non-trivial assumptions. This assignment can be done on the student workstations. Hand in your final assignment report (in PDF, generated using LaTeX) via the link on the course website.
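As a small illustration of the grading rule above, the following sketch converts points into a grade. The function name `assignment_grade` is hypothetical, and clamping the result to the range 1 to 10 is an assumption based on the stated bounds (for instance when bonus points push the total above 100).

```python
import math


def assignment_grade(points: float) -> float:
    """Convert assignment points (bonus included) into a grade between 1 and 10."""
    raw = points / 10
    # Round to the nearest half; floor(x + 0.5) avoids Python's round-half-to-even.
    rounded = math.floor(raw * 2 + 0.5) / 2
    # Assumption: results outside the stated bounds are clamped to [1, 10].
    return min(10.0, max(1.0, rounded))


print(assignment_grade(73))   # 7.5
print(assignment_grade(108))  # 10.0
```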
Questions or remarks? Contact the lecturer at email@example.com, walk by Snellius room 157b, or ask your questions during one of the weekly lectures or "werkcolleges".
Warning: Before getting started with this assignment, it is highly recommended that you do the ProM Getting Started Tutorial and walk through the Exercises, focussing on the part on the discovery of Petri nets (and not so much on the other models and techniques).
To get ProM running under UNIX, edit ProM641.sh and change JAVA=java to JAVA=/usr/lib/jvm/java-1.7.0-openjdk-amd64/jre/bin/java
The datafile can be found here: BPI_Challenge_2012.xes.gz (unzip and import this file in ProM). The amount (size of the loan) requested by the customer is indicated in the case attribute AMOUNT_REQ, which is global, i.e. every case contains this attribute. The event log is a merger of three intertwined sub processes. The first letter of each task name identifies the sub process (source) from which it originated. Feel free to run analyses on the process as a whole, on selections of the whole process and/or on the individual sub processes. Event types are explained in the table below.
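Outside ProM, a quick sanity check of the log can be sketched in Python with the standard library only. The sketch below assumes the usual XES layout (`trace` elements containing `event` elements, with attributes stored as `key`/`value` pairs) and assumes that AMOUNT_REQ holds a numeric value; the function names are hypothetical.

```python
import gzip
import xml.etree.ElementTree as ET


def local(tag: str) -> str:
    """Strip an XML namespace prefix such as '{http://www.xes-standard.org/}'."""
    return tag.rsplit("}", 1)[-1]


def summarize_xes(stream):
    """Return (number of cases, number of events, list of AMOUNT_REQ values)."""
    cases, events, amounts = 0, 0, []
    for _, elem in ET.iterparse(stream, events=("end",)):
        if local(elem.tag) != "trace":
            continue
        cases += 1
        for child in elem:
            if local(child.tag) == "event":
                events += 1
            elif child.get("key") == "AMOUNT_REQ":
                # Assumption: the requested amount is stored as a numeric string.
                amounts.append(float(child.get("value")))
        elem.clear()  # release the processed trace to keep memory use low
    return cases, events, amounts


def summarize_gz(path: str):
    """Read a gzip-compressed XES file, e.g. 'BPI_Challenge_2012.xes.gz'."""
    with gzip.open(path, "rb") as stream:
        return summarize_xes(stream)
```

Calling `summarize_gz("BPI_Challenge_2012.xes.gz")` should then report case and event counts that can be compared with the figures quoted in the introduction.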
Informal process description: An application is submitted through a webpage. Then, some automatic checks are performed, after which the application is complemented with additional information. This information is obtained through contacting the customer by phone. If an applicant is eligible, an offer is sent to the client by mail. After this offer is received back, it is assessed. When it is incomplete, missing information is added by again contacting the customer. Then a final assessment is done, after which the application is approved and activated.
|Event type|Meaning|
|A_|States of the application|
|O_|States of the offer belonging to the application|
|W_|States of the work item belonging to the application|
|COMPLETE|The task (of type ‘A_’ or ‘O_’) is completed|
|SCHEDULE|The work item (of type ‘W_’) is created in the queue (automatic step following manual actions)|
|START|The work item (of type ‘W_’) is obtained by the resource|
|COMPLETE|The work item (of type ‘W_’) is released by the resource and put back in the queue or transferred to another queue (SCHEDULE)|
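The naming convention in the table can be turned into a small helper for selecting individual sub processes. This is a minimal sketch: the function names are hypothetical, and the activity names in the usage lines are illustrative only.

```python
from collections import defaultdict

# Prefix-to-source mapping following the event type table.
SOURCES = {"A": "application", "O": "offer", "W": "work item"}

# Lifecycle meanings for work items ('W_'), following the table.
WORK_ITEM_MEANING = {
    "SCHEDULE": "work item created in the queue",
    "START": "work item obtained by the resource",
    "COMPLETE": "work item released by the resource",
}


def source_of(activity: str) -> str:
    """Identify the sub process from the first letter of the task name."""
    return SOURCES.get(activity[:1], "unknown")


def meaning(activity: str, transition: str) -> str:
    """Describe an event given its task name and lifecycle transition."""
    source = source_of(activity)
    transition = transition.upper()
    if source == "work item":
        return WORK_ITEM_MEANING.get(transition, "unknown")
    if source in ("application", "offer") and transition == "COMPLETE":
        return "task completed"
    return "unknown"


def split_by_source(events):
    """Group (activity, transition) pairs by the sub process they belong to."""
    groups = defaultdict(list)
    for activity, transition in events:
        groups[source_of(activity)].append((activity, transition))
    return dict(groups)


print(meaning("W_Completeren aanvraag", "START"))  # work item obtained by the resource
print(source_of("O_SENT"))                         # offer
```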
The goal of the assignment is to become familiar with a tool such as ProM to analyze business processes based on event data. You will need to write a report on your activities, addressing both technical (Process Mining) and domain-specific (Process Analysis) aspects:
Process Mining [60p]
Analyze the event logs using ProM in at least four different ways:
For each technique, report the most important findings and results. Which steps did you take in the ProM tool to obtain the desired results? What settings and parameters did you tune? Please do include plenty of screenshots and diagrams.
[20p] Explain the differences between the different analysis techniques in terms of what information they conceptually provide, and how they work in practice on larger datasets such as the one provided in this assignment.
The bank is interested in all valuable information hidden in the event data. The main question is: what does the process model look like, and what can we learn from it?
[20p] Try to answer some of the questions below to the best of your ability. [20p] Make sure to relate them to the financial application domain, explaining how answering these questions may improve the bank's business processes:
Think of plotting some relevant values, averages and distributions. Remember to always provide an answer which is based on the data, and always explain how you obtained your answer.
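As one possible starting point for such values, the sketch below computes the throughput time per case and a simple histogram that could be fed into any plotting tool. It assumes that events have already been extracted as (case id, ISO 8601 timestamp) pairs and that requested amounts are available as numbers; the function names and the bin width are hypothetical choices.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean, median


def case_durations_hours(events):
    """Map each case id to the hours between its first and last event.

    `events` is an iterable of (case_id, iso_timestamp) pairs.
    """
    first, last = {}, {}
    for case_id, stamp in events:
        moment = datetime.fromisoformat(stamp)
        if case_id not in first or moment < first[case_id]:
            first[case_id] = moment
        if case_id not in last or moment > last[case_id]:
            last[case_id] = moment
    return {c: (last[c] - first[c]).total_seconds() / 3600 for c in first}


def histogram(values, bin_width):
    """Count values per bin; each key is the lower bound of its bin."""
    counts = defaultdict(int)
    for value in values:
        counts[int(value // bin_width) * bin_width] += 1
    return dict(sorted(counts.items()))


# Illustrative input only, not taken from the actual log.
durations = case_durations_hours([
    ("1", "2011-10-01T00:00:00.000+02:00"),
    ("1", "2011-10-01T06:00:00.000+02:00"),
    ("2", "2011-10-02T12:00:00.000+02:00"),
    ("2", "2011-10-03T12:00:00.000+02:00"),
])
print(mean(durations.values()), median(durations.values()))  # 15.0 15.0
print(histogram([500, 4000, 5000, 12000], 5000))  # {0: 2, 5000: 1, 10000: 1}
```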
Optional: convert the .xes-data into .csv and also experiment with exploring this data using EventPad [up to 10p bonus points].
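One way to do the conversion is sketched below with the Python standard library. It assumes the standard XES attribute keys (`concept:name`, `lifecycle:transition`, `time:timestamp`, `org:resource`) and writes one row per event; the column layout is an assumption and may need adjusting to whatever EventPad expects.

```python
import csv
import xml.etree.ElementTree as ET

# Assumed output columns: case-level attributes first, then event-level ones.
COLUMNS = ["case", "AMOUNT_REQ", "concept:name", "lifecycle:transition",
           "time:timestamp", "org:resource"]


def local(tag: str) -> str:
    """Strip an XML namespace prefix from a tag name."""
    return tag.rsplit("}", 1)[-1]


def attributes(elem):
    """Collect the key/value attributes stored directly under a trace or event."""
    return {c.get("key"): c.get("value") for c in elem if c.get("key") is not None}


def xes_to_csv(xes_stream, csv_stream) -> int:
    """Write one CSV row per event and return the number of rows written."""
    writer = csv.writer(csv_stream)
    writer.writerow(COLUMNS)
    rows = 0
    for _, trace in ET.iterparse(xes_stream, events=("end",)):
        if local(trace.tag) != "trace":
            continue
        case = attributes(trace)
        for child in trace:
            if local(child.tag) != "event":
                continue
            event = attributes(child)
            writer.writerow([case.get("concept:name", ""),
                             case.get("AMOUNT_REQ", "")]
                            + [event.get(key, "") for key in COLUMNS[2:]])
            rows += 1
        trace.clear()  # release the processed trace to keep memory use low
    return rows


# Usage sketch (file names are placeholders):
#   import gzip
#   with gzip.open("BPI_Challenge_2012.xes.gz", "rb") as src, \
#           open("log.csv", "w", newline="") as dst:
#       xes_to_csv(src, dst)
```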
Good luck with the assignment! Ask questions. A lot, if you have to. The deadline is posted on the course website.
Full credit for this exercise and the data goes to the BPI Challenge 2012 held at the 8th International Workshop on Business Process Intelligence.