Non-Negative Matrix Factorization NCSU ECCR Workshop Stan Young

Non-Negative Matrix Factorization
NCSU ECCR Workshop
Stan Young, NCSU ECCR, NISS
23, 24 Feb 2007

Outline
1. Introduction
2. NMF Chemistry Problem
3. Non-negative matrix factorization
4. Logistics

NMF Algorithm
Green are the “spectra”. Red are the “weights”.
A (Compounds x Features) = WH + E
Start with random elements in red and green.
Optimize so that $\sum_{i,j} (a_{ij} - (wh)_{ij})^2$ is minimized.
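The recipe on this slide (random non-negative starting values, then minimizing the squared residual of A = WH + E) can be sketched in a few lines of NumPy. This is a minimal illustration, not the workshop's implementation: the slide does not say which optimizer was used, so the sketch assumes multiplicative updates in the style of Lee and Seung, and the function name `nmf_multiplicative`, the toy data, and the parameters `rank`, `n_iter`, `seed`, and `eps` are all hypothetical.

```python
import numpy as np

def nmf_multiplicative(A, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate a non-negative matrix A by W @ H with W, H >= 0.

    Assumption: multiplicative updates are used here as one common
    way to reduce sum_ij (a_ij - (WH)_ij)^2; the slide names no optimizer.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = A.shape
    # Start with random non-negative elements in both factors.
    W = rng.random((n_rows, rank))
    H = rng.random((rank, n_cols))
    errors = []
    for _ in range(n_iter):
        # Each update multiplies by a non-negative ratio, so W and H stay >= 0.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
        # Track the squared residual that the slide asks to minimize.
        errors.append(float(np.sum((A - W @ H) ** 2)))
    return W, H, errors

# Hypothetical toy data: 6 "compounds" x 5 "features" built from 2 non-negative parts.
rng = np.random.default_rng(1)
A = rng.random((6, 2)) @ rng.random((2, 5))

W, H, errors = nmf_multiplicative(A, rank=2)
print("squared error after first update:", errors[0])
print("squared error after last update: ", errors[-1])
```

Because the result depends on the random start, rerunning with a different `seed` can give a different factorization of similar quality.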

NCSU ECCR – Chemical Informatics

PowerMV

Data File

NMF for Clustering Profile Likelihood

RH Vector

Cluster 1 Compounds

Cluster 1 Compounds (continued)

Contention: NMF finds “parts”
SVD RH EV elements come from a composite. (They come from regression.)
NMF commits one vector to each mechanism. (True??)
“For such databases there is a generative model in terms of ‘parts’ and NMF correctly identifies the ‘parts’.”
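The claim that SVD vectors are composites can be illustrated on a small constructed example. The sketch below is an assumption-laden toy, not data from the talk: the `parts` and `weights` values are made up, with two non-negative parts on disjoint features. It computes the right-hand singular vectors with `numpy.linalg.svd` and checks their signs. The second vector mixes both parts with opposite signs, so it cannot itself be read as a non-negative part.

```python
import numpy as np

# Two non-negative "parts" with disjoint support over 6 features
# (hypothetical values chosen only for illustration).
parts = np.array([
    [1.0, 2.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 3.0, 1.0],
])
# Strictly positive mixing weights for 5 "compounds" (hypothetical values).
weights = np.array([
    [1.0, 0.5],
    [2.0, 0.2],
    [0.3, 1.5],
    [0.1, 2.0],
    [1.0, 1.0],
])
A = weights @ parts

# Rows of Vt are the right-hand singular vectors of A.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
v1, v2 = Vt[0], Vt[1]

tol = 1e-10
# The leading vector is one-signed: a blend of both parts.
v1_one_signed = bool(np.all(v1 >= -tol) or np.all(v1 <= tol))
# The second vector has both positive and negative entries.
v2_mixed = bool(np.any(v2 > tol) and np.any(v2 < -tol))

print("first right singular vector: ", np.round(v1, 3))
print("second right singular vector:", np.round(v2, 3))
print("v1 one-signed:", v1_one_signed, "| v2 mixed-sign:", v2_mixed)
```

Whether NMF recovers the two parts exactly on such data is the open question the slide raises; this example only shows why the SVD vectors do not.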

Key Papers
1. Good (1969) Technometrics – SVD.
2. Liu et al. (2003) PNAS – rSVD.
3. Lee and Seung (1999) Nature – NMF.
4. Kim and Tidor (2003) Genome Research.
5. Brunet et al. (2004) PNAS – Microarray.
6. Fogel et al. (2007) Bioinformatics.

Summary
1. NMF is an attractive alternative to SVD.
2. Mechanisms appear to be captured in separate vectors.
3. SVD is central to many linear statistical methods. Substitute NMF for SVD!
4. Many statistical problems are open for research.

Logistics
1. Two Rooms and web.
2. Locals with cars identify yourselves.
3. See www.niss.org/irMF
4. CS, Applications, Statistics.

Friday Program
Time         Speaker          Topic
1:00-1:15    Stan Young       Introduction
1:15-3:00    Team Teaching*   Review of NMF and Comparison of popular NMF algorithms
3:00-3:45    Paul Fogel       Linking genetic profiles to biological outcome
3:45-4:00    Break
4:00-4:45    Michael Berry    Using Non-negative Matrix and Tensor Factorizations for Email Surveillance
4:45-5:30    Kevin Heinrich   Automated Gene Classification Using NMF within SGO
*Atina Brooks, Barbara Ball, Amy Langville

Saturday Program
Time          Speaker               Topic
8:00-8:45     Continental Breakfast
8:45-9:30     Inderjit S. Dhillon   Fast Newton-type Methods for Nonnegative Matrix Approximation
9:30-10:00    Haesun Park           Sparse NMF via Alternating Non-negativity Constrained Least Squares
10:00-10:30   Break
10:30-11:15   Doug Hawkins          Two-Block Analysis
11:15-12:00   Moody Chu             Low Dimensional Polytope Approximation and Its Applications to NMF
12:00-1:00    Lunch
1:00-1:30     Bob Plemmons          Nonnegative Tensor Factorization for Object Identification using Hyperspectral Data
1:30-2:00     Gary Howell           Computational Efficiency and Low Rank Factorization
2:00-2:15     (Break)
2:15-3:15     Panel Discussion      James Cox, SAS; Jackie Hughes-Oliver, NCSU; Mike Marshall, Fortune Interactive; Bob Plemmons, WFU