Modeling Ultra-high Dimensional Feature Selection as a Slow Intelligence System
Wang Yingze
CS 2650 Project

Outline
  • Introduction
  • Iterative feature selection
  • Framework of Slow Intelligence System
  • Tasks for project
  • Midway results

Introduction
  • Ultrahigh-dimensional variable selection is a hot topic in statistics and machine learning.
  • Goal: model the relationship between one response and its associated features, based on a sample of size n.

Application
  • Association studies between phenotypes and SNPs.
  • Gene selection or disease classification in bioinformatics.
  [Figure: gene expression matrix for n patients, each patient's data having p genes (entries are gene expression levels); the response is each patient's degree of sickness; the important genes are selected.]

Challenges
  • Dimensionality grows rapidly with interactions of the features.
      - Portfolio selection and network modeling: 2,000 stocks involve over 2 million unknown parameters in the covariance matrix.
      - Protein-protein interaction: the sample size may be on the order of thousands, but the number of features can be on the order of millions.
  • Goal: construct an effective method for learning the relationship between features and response in high dimensions for scientific purposes.
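
The figure of over 2 million parameters follows from counting the distinct entries of a symmetric p × p covariance matrix, which is p(p + 1)/2. A quick check in Python (the function name is illustrative, not from the slides):

```python
def covariance_parameter_count(p: int) -> int:
    """Number of distinct entries in a symmetric p x p covariance matrix."""
    # p variances on the diagonal plus p * (p - 1) / 2 covariances above it.
    return p * (p + 1) // 2


print(covariance_parameter_count(2000))  # 2001000
```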

Existing methods
  1. LASSO: L1-regularized linear regression.
  2. Forward regression: add variables sequentially.
  3. Backward regression: start with all variables, then delete them on the basis of the smallest change in the fit criterion.
  4. Stepwise regression: at each step a variable can be entered (on the basis of the greatest improvement in the fit criterion), but one may also be removed if the change (reduction) in the fit criterion is not significant.
  5. Least-angle regression: estimated parameters are increased in a direction equiangular to each one's correlations with the residual.
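
As an illustration of the first method, here is a minimal sketch of LASSO fitted by cyclic coordinate descent. It assumes an objective of (1/2n)·||y − Xb||² + λ·||b||₁ with no intercept; the function names, the toy data, and the choice of coordinate descent as the solver are assumptions for illustration and do not come from the slides.

```python
from typing import List


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(
    X: List[List[float]], y: List[float], lam: float, sweeps: int = 100
) -> List[float]:
    """Minimize (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 without an intercept."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # residual = y - X @ beta, and beta starts at zero
    for _ in range(sweeps):
        for j in range(p):
            column = [X[i][j] for i in range(n)]
            scale = sum(v * v for v in column) / n
            if scale == 0.0:
                continue  # an all-zero column carries no information
            # Correlate column j with the residual that excludes feature j.
            partial = [residual[i] + column[i] * beta[j] for i in range(n)]
            rho = sum(column[i] * partial[i] for i in range(n)) / n
            updated = soft_threshold(rho, lam) / scale
            # Keep the residual in sync with the changed coefficient.
            for i in range(n):
                residual[i] -= column[i] * (updated - beta[j])
            beta[j] = updated
    return beta


# Toy data: the response depends only on the first of two orthogonal features.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
```

On this toy data the L1 penalty shrinks the first coefficient and sets the second exactly to zero, which is what makes LASSO usable as a feature selector.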

State-of-the-Art Approach
  • Iterative feature selection method proposed by Jianqing Fan at Princeton University: “Ultrahigh dimensional feature selection: beyond the linear model”
  • Contribution:
      - Ultrahigh-dimensional data
      - Accuracy
      - Slow

Step 1: Large-scale screening
  • Apply Pearson correlation and ranking to pick a set of feature indices.
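
A minimal sketch of this screening step, assuming features are ranked by the absolute value of their sample Pearson correlation with the response and a fixed number of top-ranked indices is kept. The function names and the `keep` parameter are hypothetical; the slides do not state how the set size is chosen.

```python
import math
from typing import List


def pearson(x: List[float], y: List[float]) -> float:
    """Sample Pearson correlation; returns 0.0 when either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def screen(columns: List[List[float]], y: List[float], keep: int) -> List[int]:
    """Return the indices of the `keep` features most correlated with y."""
    scores = [abs(pearson(col, y)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


# Toy data: one feature per inner list, four samples each.
columns = [
    [1.0, 2.0, 3.0, 4.0],    # perfectly correlated with y
    [1.0, -1.0, 1.0, -1.0],  # weakly correlated
    [4.0, 3.0, 2.0, 1.5],    # strongly negatively correlated
    [0.0, 0.0, 0.0, 0.0],    # constant, carries no signal
]
y = [1.0, 2.0, 3.0, 4.0]
kept = screen(columns, y, keep=2)
```

Taking the absolute value means a strongly negative correlation counts as much as a strongly positive one, so the third toy feature is kept alongside the first.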

Step 2: Moderate-scale selection
  • Employ an existing regression method to select a subset of these indices.

Step 3: Large-scale screening
  • Add each of the other features, one at a time, to the regression model on the selected subset. [The model formula is missing from the source.]

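Step 3 (cont'd)
  • Rank these features j by the fitted criterion and select the top-ranked features; add them to the selected subset to form the new feature set. [The ranking criterion and set symbols are missing from the source.]
  • Repeat Steps 2-3, selecting a new subset each time, until the stopping condition is met, then form the new feature set. [The stopping condition is missing from the source.]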
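
The control flow of Steps 1-3 can be sketched as a loop over pluggable functions. This is only a skeleton under stated assumptions: the stopping rule used here (stop when the selected set no longer changes, or after `max_rounds`) is assumed because the slide's own condition did not survive, and all function and parameter names are hypothetical. The toy callables at the bottom are stand-ins, not real screening or regression methods.

```python
from typing import Callable, Set


def iterative_selection(
    all_features: Set[int],
    initial_screen: Callable[[], Set[int]],
    moderate_select: Callable[[Set[int]], Set[int]],
    conditional_screen: Callable[[Set[int], Set[int]], Set[int]],
    max_rounds: int = 10,
) -> Set[int]:
    """Alternate screening and selection until the selected set stabilizes."""
    candidates = initial_screen()           # Step 1: large-scale screening
    selected = moderate_select(candidates)  # Step 2: moderate-scale selection
    for _ in range(max_rounds):
        remaining = all_features - selected
        # Step 3: screen the remaining features given the current selection.
        recruits = conditional_screen(selected, remaining)
        new_selected = moderate_select(selected | recruits)  # Step 2 again
        if new_selected == selected:  # assumed stopping rule
            break
        selected = new_selected
    return selected


# Toy stand-ins: features 0 and 3 are "truly relevant" in this made-up setup.
RELEVANT = {0, 3}
result = iterative_selection(
    all_features=set(range(6)),
    initial_screen=lambda: {0, 1},
    moderate_select=lambda subset: subset & RELEVANT,
    conditional_screen=lambda selected, remaining: set(
        sorted(remaining, key=lambda j: (abs(j - 3), j))[:2]
    ),
)
```

In the toy run, feature 3 is missed by the initial screen and only recruited in the first pass through Step 3, which is the situation the iteration is meant to handle.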

Slow Intelligence System
  • “A General Framework for Slow Intelligence Systems”, by S. K. Chang, International Journal of Software Engineering and Knowledge Engineering

Time Controller
  • Slow decision cycle(s) complement quick decision cycle(s): an SIS possesses at least two decision cycles. Therefore, Slow Intelligence Systems usually work correctly, but not always fast.
  • Time Controller Design
      - Panic button
      - Petri-net model
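
One way to picture the interplay of the two cycles and the panic button is the sketch below. It assumes the quick cycle produces an immediate answer, the slow cycle yields successive refinements, and the panic button cuts the slow cycle short so the best answer so far is returned. This reading, and every name in the code, is an assumption for illustration rather than the design in the cited framework.

```python
from typing import Callable, Iterator


def decide(
    quick_cycle: Callable[[], float],
    slow_cycle: Callable[[float], Iterator[float]],
    panic: Callable[[int], bool],
) -> float:
    """Start from the quick answer and refine it until done or panic."""
    answer = quick_cycle()
    for step, refined in enumerate(slow_cycle(answer)):
        if panic(step):  # panic button: stop refining, keep the current answer
            break
        answer = refined
    return answer


def quick() -> float:
    """Toy quick cycle: a rough first estimate."""
    return 10.0


def halve(start: float) -> Iterator[float]:
    """Toy slow cycle: keep halving the estimate while it is above 1."""
    value = start
    while value > 1:
        value /= 2
        yield value


unhurried = decide(quick, halve, panic=lambda step: False)
interrupted = decide(quick, halve, panic=lambda step: step >= 2)
```

Without a panic the slow cycle runs to completion; with the panic triggered at the third step, the controller returns the partially refined answer.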

Motivation
  • “Modeling Human Intelligence as A Slow Intelligence System” by Tiansi Dong, DMS 2010
  • SIS for object mapping between scenes
      - Two object tracing results arise from two different priorities:
          1. Priority on spatial changes (minimal spatial changes)
          2. Priority on object categories (objects are mapped within the same categories)

SIS 1 for object tracing (priority on spatial changes)
  • Enumerate all possible mappings.
  • Elimination and concentration: keep the mapping with the minimal spatial changes.
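
The enumerate-then-eliminate idea with a spatial priority can be sketched as a brute-force search over one-to-one mappings. The sketch assumes objects are 2D points, that "spatial change" means the total Euclidean distance moved, and that the second scene has at least as many objects as the first; the function name and the toy scenes are made up for illustration.

```python
import math
from itertools import permutations
from typing import Dict, Tuple

Point = Tuple[float, float]


def best_spatial_mapping(
    before: Dict[str, Point], after: Dict[str, Point]
) -> Dict[str, str]:
    """Map each object in `before` to a distinct object in `after`."""
    names_before = sorted(before)
    best_cost, best = math.inf, {}
    # Enumeration: every one-to-one assignment of `after` objects.
    for assignment in permutations(sorted(after), len(names_before)):
        cost = sum(
            math.dist(before[a], after[b])
            for a, b in zip(names_before, assignment)
        )
        # Elimination: a mapping survives only if it moves objects less.
        if cost < best_cost:
            best_cost, best = cost, dict(zip(names_before, assignment))
    # Concentration: only the minimal-change mapping is returned.
    return best


scene_1 = {"cup": (0.0, 0.0), "book": (5.0, 5.0)}
scene_2 = {"x": (0.0, 1.0), "y": (5.0, 4.0)}
mapping = best_spatial_mapping(scene_1, scene_2)
```

Enumerating all permutations is factorial in the number of objects, which is exactly the kind of exhaustive slow cycle the framework describes; it is only practical for small scenes.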

SIS 2 for object tracing (priority on object category)
  • Enumerate all possible mappings.
  • Elimination and concentration: keep the mapping that pairs objects of the same category.

Task one
  • Model ultra-high dimensional feature selection as a Slow Intelligence System.
      - Use SIS to model the iterative feature selection method in five phases: Enumeration, Elimination, Adaptation, Propagation, Concentration.
      - The whole SIS contains an additional sub-SIS.
  • Represent it in a mathematical formulation.

Task two
  • Design the time controller in terms of a Petri net and introduce a knowledge base.
      - The time controller controls when each phase of the SIS is invoked.
      - The knowledge base contains five different moderate-scale selection algorithms. The KB can be changed and updated in the slow cycle.
  • Represent the time controller as a Petri net using the ReNew editor.
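
To make the Petri-net idea concrete, here is a minimal place/transition net in plain Python in which a single token marks the active SIS phase. It is an illustrative toy, not the ReNew model from the project: the class, the transition names, and the assumption that the five phases form a simple ring in the order listed in Task one are all hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


class PetriNet:
    """Place/transition net with arc weight 1 and distinct input places."""

    def __init__(self, marking: Dict[str, int]) -> None:
        self.marking = dict(marking)
        self.transitions: Dict[str, Tuple[List[str], List[str]]] = {}

    def add_transition(
        self, name: str, inputs: Iterable[str], outputs: Iterable[str]
    ) -> None:
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name: str) -> bool:
        """A transition is enabled when every input place holds a token."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(place, 0) >= 1 for place in inputs)

    def fire(self, name: str) -> None:
        """Consume one token per input place and produce one per output."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for place in inputs:
            self.marking[place] -= 1
        for place in outputs:
            self.marking[place] = self.marking.get(place, 0) + 1


PHASES = ["enumeration", "elimination", "adaptation", "propagation", "concentration"]
net = PetriNet({"enumeration": 1})
for source, target in zip(PHASES, PHASES[1:] + PHASES[:1]):
    net.add_transition(f"start_{target}", [source], [target])

fired = []
for _ in range(3):
    # The time controller picks whichever transition is currently enabled.
    name = next(t for t in net.transitions if net.enabled(t))
    net.fire(name)
    fired.append(name)
```

Because only one place holds a token at a time, exactly one transition is enabled at each step, so the net enforces the phase order without any extra bookkeeping.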

Future work
  • Experiment: use real data (colon cancer data) and compare the results with existing feature selection methods such as LASSO, forward regression, and backward regression.
      - Weka: demo
  • I will use a visualization tool to visualize the results and the process of feature selection.

Diagram: Main SIS System

Diagram: Sub SIS system

Diagram: Petri-Net Model