Novelty Detection Based on Information Matrix
Alexander N. Dolia (ad@ecs.soton.ac.uk)
17 March, 2005
School of Electronics and Computer Sciences, University of Southampton, UK
2 Motivation
Dr. A. N. Dolia, www.difdtc.com
3 Novelty Detection: definitions
§ What is a PATTERN? What is an OUTLIER?
§ “For novelty detection, the description of normality is learnt by fitting a model to the set of normal examples, and previously unseen patterns are then tested by comparing their novelty score (as defined by the model)” (Nairac, Corbett-Clark, Townsend, Tarassenko, 1997)
§ “An outlier would be an observation that deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism” (Hawkins, 1980)
4 Example 1
[Figure: one example labelled “Normal” and two labelled “Novel”]
5 Training set: Researchers in Machine Learning
Potential problems:
§ Bad features
§ Missing data, outliers in a training set
§ Data non-stationarity
Is it a known researcher or novice/outlier?
Hint: KPCA…
Hint: Convex…normalised cut
6 Possible approaches
Density Estimation: Estimate a density based on training data
Quantile Estimation: Estimate a quantile of the distribution underlying the training data: for a fixed constant, attempt to find a small set such that
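The symbols of the quantile-estimation statement did not survive on this slide. One common way to write the goal, given here as an assumption rather than as the slide's own notation, uses a constant \nu for the allowed fraction of outliers and S for the region being estimated:

```latex
\text{for fixed } \nu \in (0, 1], \text{ find a small set } S
\text{ such that } \Pr\{ x \notin S \} \le \nu .
```

"Small" is usually measured by volume or by a regulariser on the function describing S; which measure the talk used is not recoverable from the text.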
7 Experimental design
Where is the cost of an observation taken at point x
Information matrix
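A minimal sketch of the information matrix for a discrete design, assuming the linear-regression setting used on the later ellipsoid slides: the design puts weight p_i on point x_i and M(p) is the weighted sum of outer products x_i x_i^T. The cost term mentioned on the slide is not recoverable, so unit cost per observation is assumed; the function names are illustrative.

```python
import numpy as np

def information_matrix(X, p):
    """M(p) = sum_i p_i * x_i x_i^T for the rows x_i of X (unit cost assumed)."""
    X = np.asarray(X, dtype=float)
    p = np.asarray(p, dtype=float)
    return (X * p[:, None]).T @ X

def log_det_criterion(X, p):
    """log det M(p), the D-optimality objective (larger is better)."""
    sign, logdet = np.linalg.slogdet(information_matrix(X, p))
    return logdet if sign > 0 else -np.inf

# Three design points in the plane with equal weights.
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
p = np.full(3, 1.0 / 3.0)
M = information_matrix(X, p)
```

Other criteria, such as the trace of the information matrix mentioned in the conclusions, can be evaluated on the same matrix M.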
8 Equivalence Theorem
1. the design 2. The design 3. maximize [minimize (Kiefer, 1961)], are equivalent.
The information matrices of all designs satisfying (1)-(3) coincide among themselves.
Any linear combination of designs satisfying (1)-(3) also satisfies (1)-(3).
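The three conditions are not legible on this slide. For orientation, the classical D-optimality form of the equivalence theorem (Kiefer and Wolfowitz) is sketched below. Whether the talk stated this special case or a more general criterion is an assumption, and the symbols \xi, f, m and d(x, \xi) are introduced here for illustration only.

```latex
M(\xi) = \int f(x) f(x)^{\top} \, \xi(dx), \qquad
d(x, \xi) = f(x)^{\top} M(\xi)^{-1} f(x), \qquad f(x) \in \mathbb{R}^{m}.

\text{The following are equivalent:}
\quad (1)\; \xi^{*} \text{ maximizes } \det M(\xi);
\quad (2)\; \xi^{*} \text{ minimizes } \max_{x} d(x, \xi);
\quad (3)\; \max_{x} d(x, \xi^{*}) = m .
```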
9 Minimum Covering Ellipsoid
The problem of computing the minimum covering ellipsoid (MCE) for a set of points can be regarded as the dual of a problem in optimal design for parameter estimation in linear regression, with the data set as the design space (Titterington, 1975)
10 Minimum Covering Ellipsoid in
• Ellipsoid centred in the origin where (Titterington, 1975)
The theory of optimal experimental design suggests:
§ at least k+1 of the weights are non-zero;
§ at most k(k+3)/2 are non-zero;
§ if exactly k+1 are non-zero, they are all equal;
§ the point has positive weight only if it lies on the surface of the ellipsoid, that is, only if
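The formulas on this slide are lost. A standard way to write the origin-centred covering ellipsoid in terms of a weighted information matrix is sketched below, under the assumption that each data point x_i in R^k is lifted to z_i = (1, x_i) in R^d with d = k + 1, which is consistent with the k+1 appearing in the list above. The symbols p_i, z_i and d are illustrative.

```latex
M(p) = \sum_{i=1}^{N} p_i \, z_i z_i^{\top}, \qquad p_i \ge 0, \quad \sum_{i=1}^{N} p_i = 1,

\mathcal{E}(p) = \{ z \in \mathbb{R}^{d} : z^{\top} M(p)^{-1} z \le d \},

p_i^{*} > 0 \;\Rightarrow\; z_i^{\top} M(p^{*})^{-1} z_i = d .
```

Here p^{*} denotes the weights that maximize \det M(p); the last line is the "on the surface" condition for points carrying positive weight.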
11 Optimization
§ is a design measure for all r;
§ the sequence is monotonic increasing, strictly unless is a fixed point of the recursion; the same is true of
§ in the limit we obtain the optimum design measure and the MCE
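The recursion itself is not legible on this slide. A minimal sketch of the multiplicative weight update usually attributed to Titterington is given below, assuming points are lifted to z_i = (1, x_i) in dimension d = k + 1 and each weight is rescaled by z_i^T M^{-1} z_i / d. The function name, the uniform starting design and the fixed iteration count are illustrative choices.

```python
import numpy as np

def titterington_mce(X, n_iter=500):
    """Multiplicative weight updates for the D-optimal design on lifted points.

    X: (N, k) data. Returns weights p, matrix M(p) and the log det history.
    """
    X = np.asarray(X, dtype=float)
    N, k = X.shape
    Z = np.hstack([np.ones((N, 1)), X])   # lift to d = k + 1 (assumed)
    d = k + 1
    p = np.full(N, 1.0 / N)               # uniform starting design
    history = []
    for _ in range(n_iter):
        M = (Z * p[:, None]).T @ Z
        history.append(np.linalg.slogdet(M)[1])
        # dist_i = z_i^T M^{-1} z_i for every point
        dist = np.einsum("ij,ij->i", Z @ np.linalg.inv(M), Z)
        p = p * dist / d                  # weighted mean of dist is d, so sum(p) stays 1
    M = (Z * p[:, None]).T @ Z
    return p, M, np.array(history)

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
p, M, history = titterington_mce(X)
```

The recorded `history` lets the monotonic increase of log det M be checked directly. Convergence of this update can be slow, so in practice a stopping rule based on max_i z_i^T M^{-1} z_i approaching d would replace the fixed iteration count.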
12 Titterington’s algorithm: Experiment (1)
13 Proposed approach
14 Lagrangian
15 Dual problem, experimental design
16 What is an outlier? Decision rules
17 Rousseeuw's MCD method
The objective of Rousseeuw's MCD method is similar to that of the v-MCD method: find m observations (out of N) whose covariance has the lowest determinant (Rousseeuw, 1984). The MCD estimate of location is then the average of these m points, whereas the MCD estimate of scatter is their covariance matrix. The best subset can be sought by exhaustive search or by Monte Carlo sampling.
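The Monte Carlo variant described above can be sketched as a random-subset search. This is an illustration under simple assumptions (uniform random subsets, a fixed number of trials, no refinement steps), not the estimator as implemented in any particular library; the function name and parameters are hypothetical.

```python
import numpy as np

def mcd_monte_carlo(X, m, n_trials=200, seed=0):
    """Random-subset search for the m points with smallest covariance determinant."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    best_det, best_idx = np.inf, None
    for _ in range(n_trials):
        idx = rng.choice(len(X), size=m, replace=False)
        det = np.linalg.det(np.cov(X[idx], rowvar=False))
        if det < best_det:
            best_det, best_idx = det, idx
    subset = X[best_idx]
    location = subset.mean(axis=0)            # MCD estimate of location
    scatter = np.cov(subset, rowvar=False)    # MCD estimate of scatter
    return location, scatter, best_det, np.sort(best_idx)

# 40 inliers around the origin plus 5 shifted points acting as outliers.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(40, 2)), rng.normal(loc=8.0, size=(5, 2))])
location, scatter, best_det, idx = mcd_monte_carlo(X, m=30)
```

Exhaustive search would replace the random draws with an enumeration of all subsets of size m, which is only feasible for small N.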
18 Experiment (2)
19 Experiment (3)
20 Kernel Principal Component Analysis (Schölkopf, Smola, Müller, 1996)
§ Given N data points in k dimensions let
§ where each column represents one data point
§ Choose an appropriate kernel and form the Gram matrix
§ Form the modified Gram matrix
§ Diagonalize it to get eigenvalues and eigenvectors
§ Use a feature selection method to choose a subset of the eigenvectors
§ Project the data points on the eigenvectors
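The steps above can be sketched as follows. The RBF kernel, the `gamma` parameter, the choice of the largest eigenvalues as the "feature selection" step, and the row-per-point data layout (the slide uses columns) are all assumptions made for illustration.

```python
import numpy as np

def kernel_pca(X, n_components=2, gamma=1.0):
    """Kernel PCA of the rows of X with an RBF kernel (kernel choice assumed)."""
    X = np.asarray(X, dtype=float)
    N = X.shape[0]
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))  # Gram matrix
    J = np.eye(N) - np.ones((N, N)) / N
    Kc = J @ K @ J                              # centred ("modified") Gram matrix
    eigvals, eigvecs = np.linalg.eigh(Kc)       # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # projections of the training points onto unit-norm feature-space directions
    scores = eigvecs * np.sqrt(np.maximum(eigvals, 0.0))
    return scores, eigvals

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
scores, eigvals = kernel_pca(X, n_components=2)
```

The rows of `scores` are finite-dimensional coordinates of the training points, so methods such as the covering ellipsoid or MCD can be applied to them directly.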
21 Properties of non-robust MCE using KPCA
The theory of optimal experimental design and kernel PCA suggest:
§ if k<N at most k+1 of the weights are non-zero;
§ at least of are non-zero or and
§ if exactly k+1 are non-zero, they are all equal;
§ the point has positive weight only if it lies on the surface of the ellipsoid, that is, only if
22 Experiment (4)
23 Experiment (5)
24 KPCA+Rousseeuw's MCD method
25 The minimum covering sphere problem and S-optimum experimental design
The minimum covering sphere problem is the S-optimum experimental design
26 Simple algebra
27 Illustrations of novelty detection methods
28 Relations to Tax and Schölkopf methods
• Tax method, Schölkopf method
• and multiply by (-1) or find min
29 Potential Applications
RADAR Management
§ Passive and active RADAR
§ Multi-site data fusion
§ Improved tracking and ID
§ Early warning systems
§ Ground and air based
Tactical Aircraft
§ Multi-sensor platform
§ Multiple mission objectives
§ Conflicting requirements
Require optimum tracking and ID combined with stealth.
ASW (Anti Submarine Warfare)
§ Optimal sonobuoy placement
§ Passive and active sonar
§ Adaptive array processing
This is a scenario with multiple constraints:
§ Sonobuoy cost
§ Sensor lifetime
§ Sensor localisation
§ Deployment constraints
Robotics & UAVs
§ Search and rescue robotics
§ Reconnaissance
§ Anti-Terrorist (e.g. Airport)
Robot perception covers many areas, and sensor management is likely to cover a broad spectrum of applications, both military and civilian. It aims to optimise any suite of sensor resources to improve system performance. Applicable to both ground and air based platforms.
30 Simulation and Testing
Robot Demonstrator
Modern mobile robot platform with dissimilar sensor suite to provide proof of concept and valuable simulation data. Additionally will provide data for project 8.5 ‘Intelligent Sensor’.
Pioneer 3 DX Mobile Robot
SICK Laser Mapping System
16 Sensor Sonar Array
PTZ Imaging with Active Infra-Red
Additional Stationary Sensor Resources
PTZ cameras with IR
Directed microphone arrays
Low cost fixed location sensors
Linux / C++ / MATLAB driven
Modular Decentralised Processing
Tracking experiments
REAL TIME CAPABILITY
www.activrobots.com
Simulation
MATLAB and C/C++ environments for algorithm development and testing. Simulation will form the foundation of the research and is complemented by the demonstrator.
31 Conclusions
§ A new algorithm for novelty detection based on the Information Matrix is proposed
§ We view novelty detection, or single-class classification, as an experimental design problem
§ Preliminary simulation experiments illustrate the application to the novelty detection problem
§ We demonstrate that Schölkopf’s and Tax’s algorithms could be a particular case of our approach when the objective is the trace of the information matrix
32 Future work
• More sophisticated algorithms for large scale optimization (e.g., based on a conditional gradient algorithm and active set strategy)
• Modified Titterington algorithm with upper bound on Lagrangian multipliers
• On-line novelty detection using Information Matrix
• Bounds on rate of convergence and generalizations
33 Acknowledgement
Many thanks to T. De Bie, J. S. Shawe-Taylor, S. Szedmak and D. M. Titterington for helpful suggestions and discussions. Many thanks to C. J. Harris, S. F. Page and N. M. White.
This research is partially supported by the Data Information Fusion Defence Technology Centre, United Kingdom, under DTC Project 8.1 “Active multi-sensor management”, and the PASCAL network of excellence.
34 Thank you!