A Recommender System for Process Discovery J Ribeiro
A Recommender System for Process Discovery
J. Ribeiro (1), J. Carmona (1), M. Misir (2), M. Sebag (3)
(1) Universitat Politecnica de Catalunya, {jribeiro, jcarmona}@cs.upc.edu
(2) Singapore Management University, misir@gmail.com
(3) TAO (INRIA), Michele.Sebag@lri.fr
Recommending discovery algorithms
Plugin available in ProM under the Recommendation package
Outline
• Recommender Systems & Algorithm Selection
• Process Mining
• System Characterization
  – Training
  – Recommending
• Experiments
• Parameters Exploration
• Future Directions
Recommender Systems
• Predicting items that may be interesting to a specific user or customer.
• Idea: Feedback information is used to group objects sharing the same characteristics. Within a group, objects with known behavior are used to predict the unknown behavior of other objects.
• Possible applications:
  – E-commerce [2] (e.g., eBay)
  – Media providers [1] (e.g., Netflix)
  – Portfolio-based selection algorithms [3] (e.g., SATzilla)
[1] Bennett et al. [2] Schafer et al. [3] Xu et al.
Algorithm Selection
• Algorithm Selection: Find, in a portfolio, the best algorithm for a given problem (Rice, 1976)
Collaborative filtering for Algorithm Selection
• Collaborative Filtering: if users X and Y rate a number of movies similarly, then it is likely that X will like the movies Y liked.
• Feedback information is used to predict new cases.
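The collaborative-filtering idea can be sketched as a similarity-weighted vote. This is a minimal illustration only, not the ARS implementation: the rating values, the choice of cosine similarity, and the names `ratings`, `cosine_similarity`, and `predict` are all assumptions made for the example. For algorithm selection, event logs would play the role of users and miners the role of movies.

```python
from math import sqrt

# Hypothetical ratings (higher is better); None marks an unknown rating.
ratings = {
    "X": {"m1": 5.0, "m2": 1.0, "m3": 4.0, "m4": None},
    "Y": {"m1": 5.0, "m2": 1.0, "m3": 4.0, "m4": 5.0},
    "Z": {"m1": 1.0, "m2": 5.0, "m3": 2.0, "m4": 1.0},
}


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    common = [i for i in a if a[i] is not None and b.get(i) is not None]
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict(user, item):
    """Similarity-weighted average of the ratings other users gave to item."""
    numerator = 0.0
    denominator = 0.0
    for other, their_ratings in ratings.items():
        if other == user or their_ratings.get(item) is None:
            continue
        weight = cosine_similarity(ratings[user], their_ratings)
        numerator += weight * their_ratings[item]
        denominator += weight
    return numerator / denominator if denominator else None


# X agrees with Y on every commonly rated movie, so Y's opinion dominates.
print(predict("X", "m4"))
```

Because X and Y rate m1 to m3 identically, the prediction for X on m4 is pulled toward Y's rating rather than Z's.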
Process Mining
• Extracting non-trivial knowledge from process data.
  – Process Discovery
  – Conformance Checking
  – Extension (or Enhancement)
[Diagram: Event Log → Control-Flow Miner (? ? ?) → Process Model]
Recommender System for Process Discovery
• Motivation Example
Generating Performance Knowledge
[Diagram: a Control-Flow Miner turns an Event Log into a Process Model; a Conformance Checker (or Enhancer) produces a Report. Event Logs, Models, and Experiment Results are stored in the Experiment Database, which feeds the Prediction Models f(x).]
Server-Client Architecture
[Diagram: the Client (ProM) sends a Log as input to the Server and receives the Top-k Control-Flow Miners as output. Server components: CoBeFra, Training, Prediction Models f(x), Predicting, Repository (MySQL).]
System: Training
[Diagram: server-client architecture. Client: input Log, output Top-k Control-Flow Miners. Server: Training, Prediction Models f(x), Predicting, Repository.]
Training: Evaluation Framework
[Diagram: Management Tools and a Repository holding Event Logs, Control-Flow Miners, Process Models, Conformance Checkers, and Experiment Results; Process Discovery and Conformance Checking operate on the Repository.]
Training: Discovery
[Diagram: a Miner takes a Log and produces a Model; the Experiment Database holds Event Logs, Models, and Experiment Results.]
Control-Flow Miners
• Petri nets
  – Alpha Miner
  – ILP Miner
  – Inductive Miner
  – Passages Miner
  – Genet (on TS Miner)
  – Flower Miner
• Causal nets
  – Heuristics Miner
  – Flexible Heuristics Miner
  – Genetic Miner
• Transition systems
  – Transition System (TS) Miner
• Fuzzy Models
  – Fuzzy Miner
Training: Conformance
[Diagram: a Checker takes a Log and a Model and produces a Report; the Experiment Database holds Event Logs, Models, and Experiment Results.]
Conformance Checkers
• Fitness
  – Token-Based Fitness
  – Alignment-Based Fitness
  – Behavioral Recall
• Precision
  – ETC Precision
  – Alignment-Based Precision
  – Behavioral Precision
• Generalization
  – Negative Event Generalization
  – Alignment-Based Probabilistic Generalization
  – Behavioral Generalization
• Simplicity
  – Model’s Elements
  – Cut Vertices
  – Average Node Arc Degree
• Performance
  – Runtime
  – Used Memory
Training: Prediction Models
[Diagram: an Extractor computes features from each Log; together with the Experiment Results in the Database (Event Logs, Models), ARS builds the Prediction Models f(x) used by the Classifier.]
Feature Extractors
• Log
  – Concurrency
  – Density
  – Entropy
• Traces
  – Total Traces
  – Distinct Traces
  – Average Trace Length
  – Event Repetitions Intra Trace
• Event
  – Total Events
  – Distinct Events
  – Start Events
  – End Events
• Flow
  – Length-One Loops
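Several of the trace, event, and flow features can be computed with a few lines of code. The sketch below is an illustration under stated assumptions, not the plugin's feature extractors: the event log is modelled as a list of traces, each a list of activity names, and the exact feature definitions (for example, "Total Events" as the number of event occurrences and "Length-One Loops" as the number of activities that directly follow themselves) are guesses at the intended meaning. The function name `extract_features` is hypothetical.

```python
def extract_features(log):
    """Compute simple features of an event log given as a list of traces."""
    traces = [tuple(trace) for trace in log if trace]  # ignore empty traces
    total_events = sum(len(trace) for trace in traces)

    # Activities that directly follow themselves somewhere in the log.
    self_loops = {
        a
        for trace in traces
        for a, b in zip(trace, trace[1:])
        if a == b
    }

    return {
        "total_traces": len(traces),
        "distinct_traces": len(set(traces)),
        "average_trace_length": total_events / len(traces) if traces else 0.0,
        "total_events": total_events,
        "distinct_events": len({a for trace in traces for a in trace}),
        "start_events": len({trace[0] for trace in traces}),
        "end_events": len({trace[-1] for trace in traces}),
        "length_one_loops": len(self_loops),
    }


# Hypothetical toy log with four traces over the activities a, b, c.
toy_log = [
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["a", "b", "c"],
    ["a", "b", "b", "c"],
]
features = extract_features(toy_log)
print(features)
```

A feature vector of this kind is what a classifier would receive as input when predicting miner rankings for an unseen log.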
ARS: Algorithm Recommender System
[Diagram labels: Apply SVM or NN; Apply SVD]
ARS: Example
[Figure: Experiments Results (columns Log, Miner, Measure, Value; e.g., Fitness and Precision values between 0.5 and 0.9 for the miners ALPHA, FHM, FUZZY, and ILP on Log 1 to Log 3) are converted into Performance Knowledge, i.e., per-log rankings of the miners: Log 1: ILP > FUZZY = ALPHA; Log 2: FHM = ILP > ALPHA; Log 3: ALPHA > ILP > FUZZY. The rankings are stored in a log-by-miner rank matrix.]
System: Recommending
[Diagram: server-client architecture. Client: input Log, output Top-k Control-Flow Miners. Server: Training, Prediction Models f(x), Predicting, Repository.]
Recommending
[Diagram: Client; Log → Feature Extractor → Features → Classifier (Prediction Models f(x)) → Predictions → Retriever → Top-k Control-Flow Miners. Further diagram label: BPA 2.]
ARS: Algorithm Recommender System
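Predicting Rankings
[Table: each log is described by feature values (Traces, Events, Noise) and by the ranks of ALPHA, FHM, FUZZY, and ILP on a measure (Fitness). The ranks of the new Log X are predicted.]
• Log 1: feature values 20, 1500, 3%; known ranks: 3, 1
• Log 2: feature values 50, 500, 1%; ranks: 3, 1, 4, 1
• Log 3: feature values 15, 5000, 5%; ranks: 1, 2, 3, 2
• Log X: feature values 25, 2500, 2%; ranks: 3?, 1?, 4?, 2?
• ...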
Retrieving the Top-k Techniques
Rankings of the miners per measure (1 = best), with the weight of each measure:
#    Fitness        Precision       Runtime        Memory
     (weight 1.0)   (weight 0.75)   (weight 0.5)   (weight 0.1)
1    ILP            FHM             ALPHA          FHM
2    ALPHA          ILP             FHM            ALPHA
3    FHM            ALPHA           FUZZY          FUZZY
4    TS             TS              TS             TS
5    FUZZY          FUZZY           ILP            ILP
...
Global Score (weighted sum of rank positions; lower is better):
ILP:   5.50  = 1×1.0 + 2×0.75 + 5×0.5 + 5×0.1
FHM:   4.85  = 3×1.0 + 1×0.75 + 2×0.5 + 1×0.1
ALPHA: 4.95  = 2×1.0 + 3×0.75 + 1×0.5 + 2×0.1
TS:    9.40  = 4×1.0 + 4×0.75 + 4×0.5 + 4×0.1
FUZZY: 10.55 = 5×1.0 + 5×0.75 + 3×0.5 + 3×0.1
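The global score is a weighted sum of each miner's rank position across the measures, and the top-k techniques are the k miners with the lowest score. The sketch below reproduces the slide's figures under one assumption: that the four rankings with weights 1.0, 0.75, 0.5, and 0.1 correspond to Fitness, Precision, Runtime, and Memory, in that order. The function names `global_scores` and `top_k` are hypothetical.

```python
# Per-measure rankings, best miner first (assumed measure-to-weight mapping).
rankings = {
    "Fitness": ["ILP", "ALPHA", "FHM", "TS", "FUZZY"],
    "Precision": ["FHM", "ILP", "ALPHA", "TS", "FUZZY"],
    "Runtime": ["ALPHA", "FHM", "FUZZY", "TS", "ILP"],
    "Memory": ["FHM", "ALPHA", "FUZZY", "TS", "ILP"],
}
weights = {"Fitness": 1.0, "Precision": 0.75, "Runtime": 0.5, "Memory": 0.1}


def global_scores(rankings, weights):
    """Sum, per miner, of rank position times measure weight (lower is better)."""
    scores = {}
    for measure, ordered_miners in rankings.items():
        for position, miner in enumerate(ordered_miners, start=1):
            scores[miner] = scores.get(miner, 0.0) + position * weights[measure]
    return scores


def top_k(rankings, weights, k):
    """Return the k miners with the lowest global score."""
    scores = global_scores(rankings, weights)
    return sorted(scores, key=scores.get)[:k]


scores = global_scores(rankings, weights)
for miner in sorted(scores, key=scores.get):
    print(f"{miner}: {scores[miner]:.2f}")
print(top_k(rankings, weights, 3))
```

Changing the weights lets a user express which quality dimensions matter most; with the weights above, fitness and precision dominate, while memory has little influence on the final ordering.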
Experiments
• Training
  – 130 event logs (112 synthetic, 18 real life)
  – 1129 discovery experiments
    • 882 process models, 5475 measurements
• Recommending
  – 13 event logs (9 synthetic, 4 real life)
• Accuracy:
  – 1, if the predicted best-performing technique matches the measured best-performing one
  – 0, if the predicted best-performing technique matches the measured worst-performing one
  – between 0 and 1 (min-max normalization), otherwise
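The accuracy definition can be read as a min-max normalization of the measured value of the technique that was predicted to be best. The sketch below is one plausible reading, not the authors' evaluation code: the function name `accuracy`, the `higher_is_better` flag, and the measured values are assumptions made for illustration.

```python
def accuracy(measured, predicted_best, higher_is_better=True):
    """Min-max normalized quality of the technique predicted to be best.

    measured maps each technique to its measured value on one event log.
    Returns 1.0 for the measured best, 0.0 for the measured worst.
    """
    values = list(measured.values())
    best = max(values) if higher_is_better else min(values)
    worst = min(values) if higher_is_better else max(values)
    if best == worst:
        return 1.0  # all techniques perform the same, so any pick is optimal
    return (measured[predicted_best] - worst) / (best - worst)


# Hypothetical measured fitness values on one log (higher is better).
fitness = {"ILP": 0.9, "FHM": 0.8, "ALPHA": 0.6}
print(accuracy(fitness, "ILP"))    # predicted best is the measured best
print(accuracy(fitness, "ALPHA"))  # predicted best is the measured worst
print(accuracy(fitness, "FHM"))    # somewhere in between

# Hypothetical runtimes in seconds (lower is better).
runtime = {"ILP": 40.0, "FHM": 10.0, "ALPHA": 5.0}
print(accuracy(runtime, "FHM", higher_is_better=False))
```

For measures where lower values are better, such as runtime or memory, the roles of minimum and maximum are swapped, which the flag handles.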
Accuracy by event log
[Bar chart: accuracy (0 to 1) for each test log: real-life logs R1 to R4 and synthetic logs S1 to S9.]
Accuracy by measure category
[Bar chart: accuracy (0 to 1) for the Real Life and Synthetic logs, by measure category: Fitness, Generalization, Performance, Precision, Simplicity.]
Miner Parameters
[Diagram: Event Log → Control-Flow Miner (Parameter Values, e.g., p1 = 0) → Process Model]
• Flexible Heuristics Miner: 7 parameters
• Genetic Miner: 13 parameters
• ILP Miner: 9 parameters
• ...
• What is the best parameter setting?
• What parameters to focus on?
Elementary Effect Method
• Elementary Effect method
  – Sobol’s numbers
  – Radial OAT strategy
• Preliminary study
  – FHM: 7 parameters
  – 4 measures
[Diagram: parameter space with axes p1, p2, p3. For the points A = (1, 1, 2) and B = (2, 2, 0), the radial design evaluates (1, 1, 2), (2, 1, 2), (1, 2, 2), and (1, 1, 0).]
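In a radial one-at-a-time (OAT) design, each parameter of a base point A is replaced in turn by the corresponding value of a second point B, and the change in the output divided by the step gives that parameter's elementary effect. The sketch below uses the pair A = (1, 1, 2), B = (2, 2, 0) from the slide. The quality function `toy_measure` is a made-up stand-in for running a miner and measuring the result, and the function names are hypothetical. In a full study the pairs (A, B) would come from a Sobol sequence, and the absolute effects would be averaged over many pairs.

```python
def radial_points(a, b):
    """Points obtained from a by replacing one coordinate at a time with b's."""
    return [a[:i] + (b[i],) + a[i + 1:] for i in range(len(a))]


def elementary_effects(f, a, b):
    """Elementary effect of each parameter for one radial sample (a, b)."""
    base_value = f(a)
    effects = []
    for i, point in enumerate(radial_points(a, b)):
        step = b[i] - a[i]
        if step == 0:
            raise ValueError(f"a and b must differ in coordinate {i}")
        effects.append((f(point) - base_value) / step)
    return effects


def toy_measure(p):
    """Hypothetical quality measure of a miner run with parameters p."""
    return 2 * p[0] + p[1] * p[2]


A = (1, 1, 2)
B = (2, 2, 0)
print(radial_points(A, B))
print(elementary_effects(toy_measure, A, B))
```

Each radial sample costs one evaluation for the base point plus one per parameter, which keeps the number of miner runs manageable for miners with many parameters.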
Parameters Exploration: Results
[Stacked bar chart: share (0% to 100%) attributed to the parameters P1, P2, P3, and Others for each of the cases 1 to 8.]
Future Directions
• Parameters Exploration
• User Feedback Loop
• Recommender System as a Discovery Technique
• Extension of Techniques, Measures, and Features
• Submission of Logs and Techniques
• ...
RS4PD: A Tool for Recommending Control-Flow Algorithms
J. Ribeiro and J. Carmona, {jribeiro, jcarmona}@cs.upc.edu
Questions?
How to get it: Plugin available in ProM under the Recommendation package