Business Data Solution Using Clustering, Linear Programming, and Neural Nets
A presentation to the El Paso del Norte Software Association
Somnath (Shom) Mukhopadhyay
Information and Decision Sciences Department, The University of Texas at El Paso
August 27th, 2003
Outline of Presentation
- Data mining definition
- Introduction to neural nets
  - Physiological flavor
  - General framework
  - Classes of PDP models
  - Sigma-Pi units
  - Conclusion
Outline of Presentation (Continued)
- Examples of real-world application problems
- Organization of theoretical concepts
- Three methods used for classification
- A new LP-based method for the classification problem
- Application to a fictitious problem with four classes
- Comparing LP method results with the results from a neural network method
- Q&A
Data Mining: Definition
- Exploring relationships in large amounts of data
- Findings should generalize
- Findings should be empirically validated
- Examples:
  - Customer Relationship Management (CRM)
  - Credit scoring
  - Clinical decision support
PDP Models and the Brain: Physiological Flavor
- Representation and learning in PDP models
- Origins of PDP:
  - Jackson (1869) and Luria (1966)
  - Hebb (1950)
  - Rosenblatt (1959)
  - Grossberg (1970)
  - Rumelhart (1977)
General Framework for PDP
- A set of processing units
- A state of activation
- An output function for each unit
- A pattern of connectivity among units
- A propagation rule
- An activation rule
- A learning rule
- An operating environment
The Basic Components of a PDP System
Classes of PDP Models
- Simple linear models
- Linear threshold units
- Brain State in a Box (BSB), by J. A. Anderson
- Thermodynamic models
- Grossberg's models
- Connectionist modeling
Sigma-Pi Units
A Few Real-World Applications of Interest to Organizations and Individuals
1. Breast cancer detection
2. Heart disease diagnosis
3. Enemy submarine detection
4. Mortgage delinquency prediction
5. Stock market prediction
6. Japanese character recognition and conversion
What Is Classification?
- Identification of a set of mutually exclusive classes
- Identification of a set of meaningful attributes that discriminate among the classes
- Illustration: using a meaningful set of attributes, can we differentiate between frequent and infrequent occurrences?
Decision Boundaries of a Typical Classification Problem
Three Methods for Classification
Identifying decision boundaries for each class region:
- Linear discriminant analysis (Glover et al., 1988)
- Linear programming (Roy and Mukhopadhyay, 1991)
- Neural networks (Rumelhart, 1986)
A New LP-Based Method for the Classification Problem
Step 1. Identify and discard outliers using clustering.
Step 2. Form decision boundaries for each class region using LP.
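The slides do not show code for Step 1, so the following is a minimal sketch of one plausible reading of it: group the training patterns by class, treat each class's points as a cluster, and discard points that lie unusually far from their cluster centroid. The function name, the centroid-distance rule, and the `cutoff` parameter are illustrative assumptions, not the authors' exact procedure.

```python
# Sketch of Step 1 (outlier removal via clustering). Assumption: each class
# is treated as one cluster, and a point is an outlier when its distance to
# the class centroid exceeds the mean distance by `cutoff` standard deviations.
import math

def remove_outliers(points, labels, cutoff=2.0):
    """Return (points, labels) with per-class distance outliers discarded."""
    by_class = {}
    for p, c in zip(points, labels):
        by_class.setdefault(c, []).append(p)

    keep_pts, keep_lbls = [], []
    for c, pts in by_class.items():
        dim = len(pts[0])
        # Centroid of this class's points, coordinate by coordinate.
        centroid = [sum(p[i] for p in pts) / len(pts) for i in range(dim)]
        dists = [math.dist(p, centroid) for p in pts]
        mean = sum(dists) / len(dists)
        # Guard against zero spread so the threshold stays positive.
        std = math.sqrt(sum((d - mean) ** 2 for d in dists) / len(dists)) or 1.0
        for p, d in zip(pts, dists):
            if d <= mean + cutoff * std:
                keep_pts.append(p)
                keep_lbls.append(c)
    return keep_pts, keep_lbls
```

A tighter `cutoff` discards more aggressively; the surviving points are then passed on to the LP in Step 2.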
Step 2: Form Decision Boundaries
- Development of boundary functions
- Use convex functions to calibrate the boundary. One example function:
  f(x) = Σi ai Xi + Σi bi Xi² + Σi cij Xi Xj + d,  where j = i + 1
Step 2: Form Decision Boundaries (Contd.)
- One instance of the general function:
  fA(x) = a1 X1 + a2 X2 + b1 X1² + b2 X2² + d
Step 2: Form Decision Boundaries (Contd.)
- LP formulation of the previous problem instance:

  Minimize e
  s.t.  fA(x1)  >=  e
        ...
        fA(x8)  >=  e
        fA(x9)  <= -e
        ...
        fA(x18) <= -e
        e >= a small positive constant

- Expanding fA at the training patterns:

  Minimize e
  s.t.   a2 + b2 + d >= e                for pattern x1
         a1 + b1 + d >= e                for pattern x2
        -a2 + b2 + d >= e                for pattern x3
        -a1 + b1 + d >= e                for pattern x4
        ...
         a1 + a2 + b1 + b2 + d <= -e     for pattern x15
         a1 - a2 + b1 + b2 + d <= -e     for pattern x16
        -a1 - a2 + b1 + b2 + d <= -e     for pattern x17
        -a1 + a2 + b1 + b2 + d <= -e     for pattern x18
        e >= a small positive constant
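The slides do not name a solver, so as an assumption this sketch feeds the eight expanded constraints above to `scipy.optimize.linprog`. The decision variables are (a1, a2, b1, b2, d, e); class-A patterns x1..x4 sit on the unit circle and the non-A patterns x15..x18 sit at the corners (±1, ±1), exactly as the expanded rows imply.

```python
# Sketch of the Step 2 LP, solved with scipy (an assumption; the slides do
# not name a solver). Variable order: (a1, a2, b1, b2, d, e).
from scipy.optimize import linprog

EPS = 0.01  # the "small positive constant" lower bound on e

def coeffs(x1, x2):
    """Row (a1, a2, b1, b2, d) multiplying fA(x) = a1*X1 + a2*X2 + b1*X1^2 + b2*X2^2 + d."""
    return [x1, x2, x1 ** 2, x2 ** 2, 1.0]

class_a = [(0, 1), (1, 0), (0, -1), (-1, 0)]    # patterns x1..x4:  fA(x) >=  e
not_a   = [(1, 1), (1, -1), (-1, -1), (-1, 1)]  # patterns x15..x18: fA(x) <= -e

# linprog solves: min c.x  s.t.  A_ub @ x <= b_ub, so rewrite:
#   fA(x) >=  e  ->  -coeffs(x).v + e <= 0
#   fA(x) <= -e  ->   coeffs(x).v + e <= 0
A_ub, b_ub = [], []
for p in class_a:
    A_ub.append([-c for c in coeffs(*p)] + [1.0])
    b_ub.append(0.0)
for p in not_a:
    A_ub.append(coeffs(*p) + [1.0])
    b_ub.append(0.0)

c = [0, 0, 0, 0, 0, 1.0]                     # minimize e
bounds = [(None, None)] * 5 + [(EPS, None)]  # a's, b's, d free; e >= EPS
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)

a1, a2, b1, b2, d, e = res.x

def fA(x1, x2):
    return a1 * x1 + a2 * x2 + b1 * x1 ** 2 + b2 * x2 ** 2 + d
```

Since the objective only involves e, the solver drives e down to its lower bound; the recovered coefficients need not be unique, but any optimal solution separates the two pattern groups by the margin ±e.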
Step 2: Form Decision Boundaries (Contd.)
- Solving this LP formulation gives the decision boundary. Specifically, we get:
  a1 = 0, a2 = 0, b1 = -1, b2 = -1, d = 1 + e
- Therefore, the boundary function
  fA(x) = a1 X1 + a2 X2 + b1 X1² + b2 X2² + d
  translates into:
  fA(x) = 1 - X1² - X2² + e
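The closed-form boundary fA(x) = 1 - X1² - X2² + e can be sanity-checked directly: it is positive (class A) on and inside the unit circle and negative well outside it. This check uses only the coefficients stated on the slide; the value chosen for e is an illustrative small constant.

```python
# Check the recovered boundary fA(x) = 1 - X1^2 - X2^2 + e from the slide.
# e is the small positive constant from the LP; 0.01 is an assumed value.
E = 0.01

def fA(x1, x2):
    return 1.0 - x1 ** 2 - x2 ** 2 + E

# Inside the unit circle: class A (positive).
print(fA(0.0, 0.0))   # well inside
# On the unit circle (the class-A training patterns): fA = e, still class A.
print(fA(0.0, 1.0))
# At the non-A corner patterns (+/-1, +/-1): fA = e - 1, negative.
print(fA(1.0, 1.0))
```

So the decision boundary is (up to the margin e) the unit circle, matching the picture on the next slide.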
Step 2: Form Decision Boundaries (Contd.)
- Putting this result into a picture, we have the following decision boundary:
Step 2: Form Multiple Decision Boundaries
- A class does not have to be neatly packed within one boundary.
- For problems requiring multiple decision boundaries, the algorithm can find multiple disjoint regions for the same class. For example, a class called "corner seats" in a soccer stadium is scattered across four disjoint regions.
An example of a decision space for a fictitious problem (it has four classes: A, B, C, D)
Decision Boundary Identification Process for Class D Only
Six Decision Boundaries Found for Class B
Constructing an MLP from Masks
- Masking functions are placed on a network to exploit parallelism.
Neural Networks Method for Classification
- Neural networks:
  - develop non-linear functions to associate inputs with outputs
  - make no assumptions about the distribution of the data
  - handle missing data well (graceful degradation)
- Supervised neural networks: estimating and testing the model
  - Construct a training sample and a holdout sample
  - Estimate model parameters using the training sample
  - Test the estimated model's classification ability using the holdout sample
Comparison Between LP and NN Performance on Three Real-World Problems

  Problem                   Test Error Rate (%)     Total Number of Parameters Trained
                            LP        NN            LP        NN
  1. Breast cancer          1.7       2.96          19        990
  2. Heart disease          18.38     36.36         27        900
  3. Submarine detection    9.62      N/A           61        N/A
Future Research: Autonomous Learning
- Learns without outside intervention
- Performs class-dependent feature selection
- Derives simple if-then classification rules that humans can understand
- Develops non-linear functions to associate inputs with outputs
Q&A
Thank you.