Emerging Database course: Classification of Data Mining Techniques
Recently, universities, research institutions, and commercial vendors have produced a large body of work on data mining. This work can be classified by several criteria. The most common criterion is the type of knowledge to be obtained. By this criterion, data mining techniques are classified into characterization, classification, clustering, association discovery, sequential pattern discovery, prediction, and so on.

Characterization is the activity of obtaining generalized descriptions that represent a large number of detailed data records. Classification is the activity of finding rules that discriminate one group of objects from others, whereas clustering groups similar objects according to certain similarity measures. Association rules represent the tendency of multiple events to co-occur. A rule such as "if an event A occurs, it is very likely that an event B occurs simultaneously" is an example of an association rule. A sequential pattern is a variation of the association rule that also considers the relative order of occurrence. Prediction is the activity of interpolating or extrapolating an unknown value of interest from known values. Linear regression is a typical example of prediction.
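To make two of these knowledge types concrete, the following is a minimal sketch in Python. It assumes the usual textbook measures for an association rule (support and confidence, which the text above does not name) and an ordinary least-squares line for prediction. The function names `support`, `confidence`, and `fit_line`, as well as the toy data, are hypothetical and chosen only for illustration.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate how likely `consequent` is, given that `antecedent` occurred."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    both = support(transactions, set(antecedent) | set(consequent))
    return both / base


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical toy data: each set records the events observed together.
transactions = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A"},
    {"B", "C"},
]

# Association rule "if A occurs, B is likely to occur simultaneously".
rule_confidence = confidence(transactions, {"A"}, {"B"})

# Prediction: fit a line to known values, then extrapolate to x = 5.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
predicted = slope * 5 + intercept

print(f"confidence(A -> B) = {rule_confidence:.2f}")
print(f"predicted value at x = 5: {predicted:.1f}")
```

In this toy data, A occurs in three of the four transactions and B accompanies it in two of them, so the rule "A implies B" holds with confidence 2/3. A sequential pattern miner would additionally need the order of events, which a plain set does not record.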
To the best of our knowledge, there is no notion of completeness for the types of knowledge to be obtained. This means that new types of knowledge may well be developed as application requirements continue to evolve. Another criterion for classifying data mining techniques is the type of input data. Although much input data is currently stored in relational databases, useful data can also reside in legacy file systems, object-oriented databases, object-relational databases, spatial databases, multimedia databases, Internet information bases, and so on. Different types of input data may require different data mining techniques. In particular, the Internet information base has recently come to be regarded as a fruitful resource for data mining. Meanwhile, since data mining solutions combine techniques from several disciplines, such as machine learning, statistics, and database management, data mining techniques can also be classified by the type of technique adopted. For example, techniques based on symbolic artificial intelligence are likely to produce human-understandable knowledge, while neural network techniques may not.