Virtual University of Pakistan
Data Warehousing
Lecture-31: Supervised vs. Unsupervised Learning

Ahsan Abdullah
Assoc. Prof. & Head, Center for Agro-Informatics Research
www.nu.edu.pk/cairindex.asp
National University of Computers & Emerging Sciences, Islamabad
Email: ahsan101@yahoo.com

Data Structures in Data Mining
• Data matrix
  – Table or database
  – n records and m attributes
  – n >> m

    C1,1  C1,2  C1,3  …  C1,m
    C2,1  C2,2  C2,3  …  C2,m
    C3,1  C3,2  C3,3  …  C3,m
    …
    Cn,1  Cn,2  Cn,3  …  Cn,m

• Similarity matrix
  – Symmetric square matrix
  – n x n or m x m

    1     S1,2  S1,3  …  S1,n
    S2,1  1     S2,3  …  S2,n
    S3,1  S3,2  1     …  S3,n
    …
    Sn,1  Sn,2  Sn,3  …  1
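The relationship between the two structures can be sketched in a few lines of Python. This is only an illustration: the slide does not say which similarity measure is used, so the formula 1 / (1 + Euclidean distance) and the three sample records are assumptions chosen to give a symmetric matrix with 1 on the diagonal. The tiny table is for readability and does not reflect n >> m.

```python
import math

def similarity_matrix(data):
    """Build the n x n record-to-record similarity matrix of an n x m data matrix.

    Assumption: similarity = 1 / (1 + Euclidean distance). The slide does not
    name a measure; this one is used only because it yields 1 on the diagonal.
    """
    n = len(data)
    sim = [[1.0] * n for _ in range(n)]  # a record is fully similar to itself
    for i in range(n):
        for j in range(i + 1, n):
            dist = math.dist(data[i], data[j])
            sim[i][j] = sim[j][i] = 1.0 / (1.0 + dist)  # fill both halves: symmetric
    return sim

# Hypothetical data matrix: 3 records (rows) x 2 attributes (columns).
data = [
    [25.0, 30.0],
    [27.0, 32.0],
    [55.0, 90.0],
]
S = similarity_matrix(data)
for row in S:
    print(["%.3f" % value for value in row])
```

Transposing the data matrix before the call would give the m x m attribute-to-attribute version instead.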

Main types of DATA MINING

Supervised
• Bayesian Modeling
• Decision Trees
• Neural Networks
• Etc.
Type and number of classes are known in advance

Unsupervised
• One-way Clustering
• Two-way Clustering
Type and number of classes are NOT known in advance

Clustering: Min-Max Distance
• Intra-cluster distances are minimized
• Inter-cluster distances are maximized
[Figure: scatter plot of Salary vs. Age showing clusters and an outlier]
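The min-max idea can be checked numerically. The sketch below assumes Euclidean distance and uses made-up (age, salary) points, since the values plotted on the slide are not recoverable. It compares the average distance between points in the same cluster with the average distance between points in different clusters.

```python
import math
from itertools import combinations

def mean_intra_cluster_distance(clusters):
    """Average distance over all pairs of points that share a cluster."""
    dists = [math.dist(p, q) for c in clusters for p, q in combinations(c, 2)]
    return sum(dists) / len(dists)

def mean_inter_cluster_distance(clusters):
    """Average distance over all pairs of points from different clusters."""
    dists = [
        math.dist(p, q)
        for a, b in combinations(clusters, 2)
        for p in a
        for q in b
    ]
    return sum(dists) / len(dists)

# Hypothetical (age, salary) points, already grouped into two clusters.
younger = [(22, 18), (25, 20), (24, 22)]
older = [(55, 60), (58, 58), (60, 62)]
clusters = [younger, older]

intra = mean_intra_cluster_distance(clusters)
inter = mean_inter_cluster_distance(clusters)
print("mean intra-cluster distance:", round(intra, 2))
print("mean inter-cluster distance:", round(inter, 2))
```

A good clustering in the min-max sense shows a small first number and a large second one.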

How does clustering work?

One-way clustering example
[Figure: INPUT and OUTPUT panels]
• Black spots are noise
• White spots are missing data

Data Mining: Agriculture data clusters
[Figure: INPUT panel and clustered OUTPUT panel]

Classification
[Figure: Unseen Data -> Classifier (model) -> Which class?]

How does classification work?
[Figure: Inputs -> Output, with a Confidence Level]

Classification Process (1): Model Construction
Relationship between shopping time and items bought
[Figure: Training Data (observations, measurements, etc.) -> Classification Algorithms -> Classifier (Model)]
Classifier (Model): IF time/items >= 6 THEN gender = 'F'
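One way such a rule could be learned is sketched below. This is an assumption-laden illustration, not the lecture's algorithm: the training records are invented, the helper name `fit_ratio_threshold` is hypothetical, and the method is a simple one-split search (a decision stump) over the time/items ratio. The slide only shows the resulting rule.

```python
def fit_ratio_threshold(records):
    """Pick the time/items threshold that classifies the most records correctly.

    records: list of (time, items, gender) tuples with gender in {"F", "M"}.
    Assumed rule shape: ratio >= threshold -> "F", otherwise "M".
    """
    candidates = sorted({time / items for time, items, _ in records})
    best_threshold, best_correct = None, -1
    for threshold in candidates:
        correct = sum(
            ("F" if time / items >= threshold else "M") == gender
            for time, items, gender in records
        )
        if correct > best_correct:  # keep the first threshold with the top score
            best_threshold, best_correct = threshold, correct
    return best_threshold

# Hypothetical training data: (shopping time, items bought, gender).
training = [
    (30, 3, "F"),
    (24, 4, "F"),
    (16, 2, "F"),
    (20, 5, "M"),
    (10, 5, "M"),
    (9, 3, "M"),
]
threshold = fit_ratio_threshold(training)
print("IF time/items >= %g THEN gender = 'F'" % threshold)
```

With this toy data the search lands on the same threshold as the slide's rule, but only because the records were chosen to make that happen.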

Classification Process (2): Use the Model in Prediction
[Figure: Testing Data and Unseen Data are fed to the Classifier]
Unseen Data: (Firdous, Time = 15, Items = 1)
Gender?
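Applying the rule from the model-construction slide (IF time/items >= 6 THEN gender = 'F') to the unseen record can be sketched as follows. The function name `predict_gender` is hypothetical, and the slide gives no ELSE branch, so returning 'M' in that case is an assumption.

```python
def predict_gender(time, items):
    """Apply the slide's rule: IF time/items >= 6 THEN gender = 'F'.

    Assumption: any record that fails the rule is labeled 'M'; the slide
    does not state what happens in that case.
    """
    return "F" if time / items >= 6 else "M"

# The unseen record from the slide.
unseen = {"name": "Firdous", "time": 15, "items": 1}
prediction = predict_gender(unseen["time"], unseen["items"])
print(unseen["name"], "->", prediction)
```

Here time/items = 15 / 1 = 15, which is at least 6, so the rule predicts 'F'.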

Clustering vs. Cluster Detection

Clustering vs. Cluster Detection: Example
[Figure: panels A and B]

The K-Means Clustering

The K-Means Clustering: Example
[Figure: four scatter-plot panels labeled A, B, C and D, each with both axes running from 0 to 10]
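The points plotted in the panels are not recoverable from the slide, so the sketch below runs the standard K-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its points) on made-up points in the same 0 to 10 range. The function name `k_means`, the data and the deliberately poor starting centroids are all assumptions for illustration.

```python
import math

def k_means(points, centroids, max_iter=100):
    """Standard K-means (Lloyd's algorithm) with caller-supplied starting centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        # Assignment step: put each point in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(
                range(len(centroids)),
                key=lambda k: math.dist(p, centroids[k]),
            )
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[k]
            for k, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:  # nothing moved: converged
            break
        centroids = new_centroids
    return centroids, clusters

# Hypothetical points on a 0-10 grid, forming two visible groups.
points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
# Both starting centroids sit in the same group on purpose.
centroids, clusters = k_means(points, centroids=[(1, 1), (2, 1)])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

On this toy data the loop recovers the two groups despite the poor start, but in general the result of K-means depends on the initial centroids and on the chosen number of clusters.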

The K-Means Clustering: Comment