3. Decision Tree (Rule Induction)

Poll: Which data mining technique ...?

Classification Process with 10 records
Step 1: Model Construction with 6 records
Training Data -> Classification Algorithms -> Classifier (Model)
IF rank = 'professor' OR years > 6 THEN tenured = 'yes'

Step 2: Test the model with 4 records and use the model in prediction
Classifier
• Testing Data
• Unseen Data: (Jeff, Professor, 4) -> Tenured?
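The classifier from Step 1 (IF rank = 'professor' OR years > 6 THEN tenured = 'yes') can be applied to the unseen record directly. The following is a minimal sketch; the function name `predict_tenure` and the case-insensitive comparison of the rank are assumptions made for illustration, not part of the slides.

```python
def predict_tenure(rank, years):
    """Apply the induced rule: rank = 'professor' OR years > 6 -> 'yes'."""
    # Assumption: rank is compared case-insensitively.
    if rank.lower() == "professor" or years > 6:
        return "yes"
    return "no"


# Unseen record from the slide: (Jeff, Professor, 4)
name, rank, years = ("Jeff", "Professor", 4)
print(name, "tenured?", predict_tenure(rank, years))
```

Jeff is predicted as tenured because the first condition (rank = 'professor') holds, even though years is not greater than 6.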

Who buys a notebook computer?
The training dataset is given below. It follows an example from Quinlan's ID3.

Tree Output: A Decision Tree for Credit Approval
[Figure: decision tree. Root node: age? with branches <=30, 30..40 and >40. The <=30 branch tests student? (no / yes), the 30..40 branch is a "yes" leaf, and the >40 branch tests credit rating? (excellent / fair).]

Extracting Classification Rules from Trees
• Represent the knowledge in the form of IF-THEN rules
• One rule is created for each path from the root to a leaf
• Each attribute-value pair along a path forms a conjunction
• The leaf node holds the class prediction
• Rules are easier for humans to understand
• Example
  IF age = "<=30" AND student = "no" THEN buys_computer = "no"
  IF age = "<=30" AND student = "yes" THEN buys_computer = "yes"
  IF age = "31…40" THEN buys_computer = "yes"
  IF age = ">40" AND credit_rating = "excellent" THEN buys_computer = "yes"
  IF age = ">40" AND credit_rating = "fair" THEN buys_computer = "no"
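The "one rule per root-to-leaf path" idea can be sketched in a few lines. The nested-dictionary encoding of the tree and the helper names `extract_rules` and `format_rule` are assumptions for illustration; the branch values and leaf labels are taken from the five example rules on this slide.

```python
# Tree encoded as {attribute: {branch_value: subtree_or_leaf_label}}.
tree = {
    "age": {
        "<=30": {"student": {"no": "no", "yes": "yes"}},
        "31…40": "yes",
        ">40": {"credit_rating": {"excellent": "yes", "fair": "no"}},
    }
}


def extract_rules(node, conditions=()):
    """Return one (conditions, label) pair per path from the root to a leaf."""
    if isinstance(node, str):          # a leaf holds the class prediction
        return [(list(conditions), node)]
    (attribute, branches), = node.items()
    rules = []
    for value, child in branches.items():
        # Each attribute-value pair along the path joins the conjunction.
        rules.extend(extract_rules(child, conditions + ((attribute, value),)))
    return rules


def format_rule(conditions, label, target="buys_computer"):
    body = " AND ".join(f'{a} = "{v}"' for a, v in conditions)
    return f'IF {body} THEN {target} = "{label}"'


rules = extract_rules(tree)
for conditions, label in rules:
    print(format_rule(conditions, label))
```

The tree has five leaves, so the sketch prints five rules, in the same order as the example list.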

An Example of 'Car Buyers' – Who buys Lexton?

no   Job   M/F   Area   Age   Y/N
1    NJ    M     N      35    N
2    NJ    F     N      51    N
3    OW    F     N      31    Y
4    EM    M     N      38    Y
5    EM    F     S      33    Y
6    EM    M     S      54    Y
7    OW    F     S      49    Y
8    NJ    F     N      32    N
9    NJ    M     N      32    Y
10   EM    M     S      35    Y
11   NJ    F     S      54    Y
12   OW    M     N      50    Y
13   OW    F     S      36    Y
14   EM    M     N      49    N

Job (14, 5, 9)
  Employee (5, 2, 3) -> Age
    Below 43 (3, 0, 3) -> Y
    Above 43 (2, 2, 0) -> N
  Owner (4, 0, 4) -> Y
  No Job (5, 3, 2) -> Res. Area
    South (2, 0, 2) -> Y
    North (3, 3, 0) -> N

* (a, b, c) means a: total # of records, b: 'N' counts, c: 'Y' counts

Lab on Decision Tree (1)
• SPSS Clementine, SAS Enterprise Miner
• See5/C5.0
• Download the See5/C5.0 2.02 Evaluation from http://www.rulequest.com

Lab on Decision Tree (2)
• From the initial screen below, choose File – Locate Data

Lab on Decision Tree (3)
• Select housing.data from the Samples folder and click Open.

Lab on Decision Tree (4)
• This data set is about determining house prices in the Boston area. It has 350 cases and 13 variables.

Lab on Decision Tree (5)
• Input variables
  • crime rate
  • proportion large lots: residential space
  • proportion industrial: ratio of commercial area
  • CHAS: dummy variable
  • nitric oxides ppm: pollution rate in ppm
  • av rooms per dwelling: number of rooms per dwelling
  • proportion pre-1940
  • distance to employment centers: distance to the city center
  • accessibility to radial highways: accessibility to highways
  • property tax rate per $10,000
  • pupil-teacher ratio: ratio of pupils to teachers
  • B: racial statistics
  • percentage low income earners: ratio of low-income people
• Decision variable
  • Top 20%, Bottom 80%

Lab on Decision Tree (6)
• For the analysis, click Construct Classifier, or choose Construct Classifier from the File menu

Lab on Decision Tree (7)
• Check (V) the Global pruning option. Then click OK.

Lab on Decision Tree (8)
[Screenshot panels: Decision Tree; Evaluation with Training data; Evaluation with Test data]

Lab on Decision Tree (9)
• Understanding the picture
• We can see that (av rooms per dwelling) is the most important variable in deciding house price.

Lab on Decision Tree (11)
• The rules are hard to read from the decision tree picture.
• To view the rules, close the current screen and click Construct Classifier again, or choose Construct Classifier from the File menu.

Lab on Decision Tree (12)
• Select Rulesets. Then click OK.

Lab on Decision Tree (13)

How a decision tree is derived from a data set: a case of predicting Play/Not Play with weather information

A sample problem
Predict Play or Not Play (ex: Playing Golf) with independent variables such as
• outlook
• temperature
• humidity
• windy

Decision Variables
Output variables (decision variables):
• Play (golf)
• Not Play (golf)

Data set

Sort the data by outlook
But the result still needs to be refined!
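Why sort by outlook first, and why does the result still need refining? One common way to answer both questions is the ID3-style information gain, sketched below. The slide's data table did not survive in this transcript, so the class counts used here are an assumption: they are the counts of the widely used 14-record textbook weather data (9 Play, 5 Not Play; sunny 2/3, overcast 4/0, rainy 3/2). The slides do not state which splitting criterion was actually used.

```python
import math


def entropy(counts):
    """Entropy (in bits) of a class distribution given as raw counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(parent_counts, child_counts):
    """Entropy of the parent minus the weighted entropy of the children."""
    total = sum(parent_counts)
    remainder = sum(sum(child) / total * entropy(child) for child in child_counts)
    return entropy(parent_counts) - remainder


# ASSUMED counts (Play, Not Play), not taken from the slides.
all_records = [9, 5]
by_outlook = {"sunny": [2, 3], "overcast": [4, 0], "rainy": [3, 2]}

gain = information_gain(all_records, list(by_outlook.values()))
print(f"gain(outlook) = {gain:.3f}")
for value, counts in by_outlook.items():
    print(value, counts, f"entropy = {entropy(counts):.3f}")
```

With these assumed counts, only the overcast group is pure (entropy 0). The sunny and rainy groups still mix both classes, which is why the tree has to be refined further on those branches.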

Final Decision Tree Induced from Data

4. Artificial Neural Networks

Table of Contents
I. Introduction of Neural Networks
II. Applications of Neural Networks
III. Theory of Neural Networks
IV. A Neural Network Demo

What are neural networks?
• http://www.youtube.com/watch?v=DG5UyRBQD4&feature=rellist&playnext=1&list=PL4FA5D71B0BA92C1C

I. Introduction of Neural Networks
• It is a simulation of the human brain
• It is one of the best-known artificial intelligence techniques
• We already use them: voice recognition systems, handwriting recognition, door locks, etc.
• It is called a black box

It is a simulator for the human brain
• Neural networks simulate the human brain
• Learning in the human brain
  • Neurons
  • Connections between neurons
• Neural networks as a simulator for the human brain
  • Processing elements or nodes
  • Weights

II. Applications of Neural Networks
• Prediction of outcomes
• Pattern detection in data
• Classification

Business ANN Applications - 1
• Accounting
  • Identify tax fraud
  • Enhance auditing by finding irregularities
• Finance
  • Signatures and bank note verifications
  • Foreign exchange rate forecasting
  • Bankruptcy prediction
  • Customer credit scoring
  • Credit card approval and fraud detection*
  • Stock and commodity selection and trading
  • Forecasting economic turning points
  • Pricing initial public offerings*
  • Loan approvals

Business ANN Applications - 2
• Human Resources
  • Predicting employees' performance and behavior
  • Determining personnel resource requirements
• Management
  • Corporate merger prediction
  • Country risk rating
• Marketing
  • Consumer spending pattern classification
  • Sales forecasts
  • Targeted marketing, …
• Operations
  • Vehicle routing
  • Production/job scheduling, …

III. Theory of Neural Networks
• Neural computing is a problem-solving methodology that attempts to mimic how the human brain functions
• Artificial Neural Networks (ANN)
• Machine Learning/Artificial Intelligence

The Biological Analogy
• Neurons: brain cells
  • Nucleus (at the center)
  • Dendrites provide inputs
  • Axons send outputs
• Synapses increase or decrease connection strength and cause excitation or inhibition of subsequent neurons

Artificial Neural Networks (ANN)

Biological   <->   Artificial
Soma         <->   Node
Dendrites    <->   Input
Axon         <->   Output
Synapse      <->   Weight

[Figure: Three Interconnected Artificial Neurons]

Basic structure of Neural Networks
• Network structure: layers, nodes and weights
[Figure: Input Layer -> Hidden Layer -> Output Layer]

ANN Fundamentals

ANN Fundamentals: how information is processed in an ANN
• Processing information by the network
  • Inputs
  • Outputs
  • Weights
  • Summation function
• Figure 15.5
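The summation function combines the inputs and weights of a single node into one number. The sketch below assumes the usual weighted sum (each input multiplied by its weight, then added up); the input and weight values are made up for illustration and do not come from Figure 15.5.

```python
def summation(inputs, weights):
    """Weighted sum of the inputs arriving at one node."""
    return sum(x * w for x, w in zip(inputs, weights))


# Hypothetical values for one node with three inputs.
inputs = [3.0, 1.0, 2.0]
weights = [0.2, 0.4, 0.1]

y = summation(inputs, weights)
print(round(y, 4))  # 3*0.2 + 1*0.4 + 2*0.1
```

In a full network, this sum is usually passed through a transfer (activation) function before it becomes the node's output.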

Learning in a NN (Neural Network) is finding the best numeric value (X) that represents the relationship between the input (4) and the output (8) (ex: 4 * X = 8)
1. Compute outputs
   * Try X = 1, X = 2, X = 3, ... When X = 2, it solves the problem.
2. Compare outputs with desired targets
3. Adjust the weights and repeat the process
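The three steps can be written as a small trial-and-error loop for the 4 * X = 8 example. The function name `find_weight` and the candidate range 1 to 9 are assumptions for illustration. A real network adjusts many weights by small amounts and does not enumerate whole numbers.

```python
def find_weight(input_value, target, candidates):
    """Try candidate weights until input * weight reproduces the target."""
    for x in candidates:
        output = input_value * x      # 1. compute the output
        if output == target:          # 2. compare with the desired target
            return x
        # 3. otherwise adjust the weight (here: move to the next candidate)
    return None


best_x = find_weight(4, 8, range(1, 10))
print("X =", best_x)
```

For input 4 and target 8 the loop stops at X = 2, because 4 * 2 = 8.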

Neural Network Architecture
• There are several ANN architectures: feedforward, recurrent, Hopfield, etc.

Neural Network Architecture
• Feedforward neural network: Multi-Layer Perceptron
  - Two, three, sometimes four or five layers, but three layers is the most common structure.

How a Network Learns
• A step function evaluates the summation of input values
• Calculate the outputs
• Measure the error (delta) between the outputs and the desired values
• Update the weights, reinforcing correct results
At any step in the process, for a neuron j, we get
Delta (Error) = Zj - Yj
where Z and Y are the desired and actual outputs, respectively
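The bullets above can be sketched for a single neuron with a step function. The update rule used here (weight += rate * delta * input, the classic perceptron rule), the learning rate, and the logical-OR training data are assumptions for illustration. Only Delta = Z - Y comes from the slide.

```python
def step(total, threshold=0.0):
    """Step function: fire (1) when the summed input exceeds the threshold."""
    return 1 if total > threshold else 0


def predict(weights, bias, inputs):
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def train_neuron(samples, rate=0.1, epochs=20):
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, desired in samples:
            actual = predict(weights, bias, inputs)   # calculate the output Y
            delta = desired - actual                  # Delta = Z - Y
            # Update the weights; nothing changes when the output is correct.
            weights = [w + rate * delta * x for w, x in zip(weights, inputs)]
            bias += rate * delta
    return weights, bias


# Hypothetical training data: logical OR of two binary inputs.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(or_samples)
for inputs, desired in or_samples:
    print(inputs, "->", predict(weights, bias, inputs), "desired:", desired)
```

Once every sample is classified correctly, delta is 0 for all of them and the weights stop changing.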

Backpropagation
1. Initialize the weights
2. Read the input vector
3. Generate the output
4. Compute the error: Error = Output – Desired output
5. Change the weights
Drawbacks:
• A large network can take a very long time to train
• May not converge
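The five steps can be sketched for a tiny network with one hidden layer. Everything beyond the step list is an assumption made for illustration: the sigmoid activation, the squared-error measure, the learning rate, the two hidden nodes, and the logical-OR training data. The code uses Error = Output - Desired output as in step 4, so the weights are moved against the error.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w_hidden, w_out):
    # The last weight in every row is a bias weight fed with a constant 1.0.
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = sigmoid(sum(w * v for w, v in zip(w_out, hidden + [1.0])))
    return hidden, output


def train(samples, n_hidden=2, rate=0.5, epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 1: initialize the weights with small random values.
    w_hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x, desired in samples:                        # Step 2: read the input vector
            hidden, output = forward(x, w_hidden, w_out)  # Step 3: generate the output
            error = output - desired                      # Step 4: compute the error
            total += error ** 2
            # Step 5: change the weights, passing the error back through the layers.
            grad_out = error * output * (1 - output)
            grad_hidden = [grad_out * w_out[j] * hidden[j] * (1 - hidden[j])
                           for j in range(n_hidden)]
            for j, v in enumerate(hidden + [1.0]):
                w_out[j] -= rate * grad_out * v
            for j in range(n_hidden):
                for i, v in enumerate(x + [1.0]):
                    w_hidden[j][i] -= rate * grad_hidden[j] * v
        history.append(total / len(samples))              # mean squared error per epoch
    return w_hidden, w_out, history


# Hypothetical training data: logical OR.
samples = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]
w_hidden, w_out, history = train(samples)
print("error, first epoch:", round(history[0], 4))
print("error, last epoch: ", round(history[-1], 4))
```

Every epoch passes over the whole training set, so training time grows with both the size of the network and the amount of data, which is the first drawback listed above.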

Training a Neural Network
• Neural networks learn from data
• Learning is finding the best weight values, which represent the input-output relationship in the neural network
• (ex: 4 * X = 8) -> finding the value of X

Training data set and test data set
• Collect data and separate it into sets
  • Training set (50%), Testing set (50%)
  • Training set (60%), Testing set (40%)
  • Training set (70%), Testing set (30%)
  • Training set (80%), Testing set (20%)
  • Training set (90%), Testing set (10%)
• Use the training data set to build the model
• Use the test data set to validate the trained network
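Separating the collected data can be sketched as follows. The function name `split_data`, the fixed random seed, and the ten placeholder records are assumptions for illustration. The 70%/30% ratio is one of the options listed above.

```python
import random


def split_data(records, train_fraction=0.7, seed=42):
    """Shuffle the records and cut them into a training and a testing set."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # fixed seed: the split is repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(1, 11))                # ten placeholder records
train_set, test_set = split_data(records, train_fraction=0.7)
print(len(train_set), "training records,", len(test_set), "testing records")
```

Shuffling before the cut keeps the two sets from inheriting any ordering in the collected data, for example records sorted by date or by class.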

Prediction with New Data
• If the neural network's performance on the test set is good, it can be used to predict the outcome of new, unseen data
• If the performance on the test set is not good, you should collect more data or add more input variables

How does a neural network make predictions?
Terms in Neural Networks

Demo – How does a neural network make predictions?

ANN Development Tools
• E-Miner
• Clementine
• NeuroSolutions
• Statistica Neural Network Toolkit
• Braincel (Excel Add-in)
• NeuralWorks
• Brainmaker
• PathFinder
• Trajan Neural Network Simulator
• NeuroShell Easy
• SPSS Neural Connector
• NeuroWare

Why use neural networks in prediction?
Major benefits of neural networks

Benefits of ANN
Advantages:
• A non-linear model can lead to better performance
• It generally works well when the data size is small
• It generally works well when there is noise in the data
• It generally works well when there are missing values in the data (incomplete data set)
• Fast decision making
Diverse applications:
• Pattern recognition
• Character, speech and visual recognition

Limitations of ANN
• Black box that is hard for humans to understand
• Lack of explanation capabilities
• Training time can be excessive and tedious

IV. A Neural Network Demo
• How do neural networks learn? By trial and error
  http://www.youtube.com/watch?v=0Str0Rdkxxo