Matrix Factorization with Unknown Noise
Deyu Meng

References:
① Deyu Meng, Fernando De la Torre. Robust Matrix Factorization with Unknown Noise. International Conference on Computer Vision (ICCV), 2013.
② Qian Zhao, Deyu Meng, Zongben Xu, Wangmeng Zuo, Lei Zhang. Robust Principal Component Analysis with Complex Noise. International Conference on Machine Learning (ICML), 2014.

Ø Low-rank matrix factorization is widely used in computer vision:
  - Structure from Motion (e.g., Eriksson and Hengel, 2010)
  - Face Modeling (e.g., Candes et al., 2012)
  - Photometric Stereo (e.g., Zheng et al., 2012)
  - Background Subtraction (e.g., Candes et al., 2012)

Ø Complete, clean data (or data with Gaussian noise): SVD gives the global solution.
Ø In practice, there are always missing data.
Ø In practice, there is always heavy and complex noise.
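For intuition, a minimal NumPy sketch of the complete-data case (illustrative code, not from the slides): by the Eckart-Young theorem, the truncated SVD is the global minimizer of the L2 reconstruction error at any fixed rank.

    import numpy as np

    def svd_lrmf(X, r):
        """Rank-r factorization X ~ U @ V.T via truncated SVD (global L2 optimum)."""
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        scale = np.sqrt(s[:r])
        return U[:, :r] * scale, Vt[:r, :].T * scale  # split singular values evenly

    # A rank-3 matrix with mild Gaussian noise is recovered up to the noise level.
    rng = np.random.default_rng(0)
    L = rng.standard_normal((100, 3)) @ rng.standard_normal((3, 50))
    U, V = svd_lrmf(L + 0.1 * rng.standard_normal(L.shape), r=3)
    print(np.linalg.norm(L - U @ V.T) / np.linalg.norm(L))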

L2-norm model
Ø Young diagram (CVPR, 2008)
Ø L2 Wiberg (IJCV, 2007)
Ø LM_S/LM_M (IJCV, 2008)
Ø SALS (CVIU, 2010)
Ø LRSDP (NIPS, 2010)
Ø Damped Wiberg (ICCV, 2011)
Ø Weighted SVD (Technometrics, 1979)
Ø WLRA (ICML, 2003)
Ø Damped Newton (CVPR, 2005)
Ø CWM (AAAI, 2013)
Ø Reg-ALM-L1 (CVPR, 2013)
Pros: smooth model, fast algorithms, global optimum when no data are missing.
Cons: not robust to heavy outliers.

L1-norm model
Ø Torre & Black (ICCV, 2001)
Ø R1-PCA (ICML, 2006)
Ø PCA-L1 (PAMI, 2008)
Ø ALP/AQP (CVPR, 2005)
Ø L1 Wiberg (CVPR, 2010, best paper award)
Ø RegL1-ALM (CVPR, 2012)
Pros: robust to extreme outliers.
Cons: non-smooth model, slow algorithms, performs badly on Gaussian-noise data.

Ø The L2 model is optimal for Gaussian noise.
Ø The L1 model is optimal for Laplacian noise.
Ø But real noise is generally neither Gaussian nor Laplacian.
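These optimality claims are maximum-likelihood statements. A short sketch of the correspondence (a standard derivation, written in my own notation):

    % L2 fitting is MLE under i.i.d. Gaussian noise e_ij ~ N(0, sigma^2):
    \max_{U,V}\ \prod_{i,j}\mathcal{N}\big(x_{ij}\mid [UV^{\top}]_{ij},\,\sigma^{2}\big)
    \;\Longleftrightarrow\;
    \min_{U,V}\ \sum_{i,j}\big(x_{ij}-[UV^{\top}]_{ij}\big)^{2}
    % L1 fitting is MLE under i.i.d. Laplacian noise p(e) = (1/2b) exp(-|e|/b):
    \max_{U,V}\ \prod_{i,j}\tfrac{1}{2b}\,e^{-|x_{ij}-[UV^{\top}]_{ij}|/b}
    \;\Longleftrightarrow\;
    \min_{U,V}\ \sum_{i,j}\big|x_{ij}-[UV^{\top}]_{ij}\big|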

[Yale B face images: saturation and shadow noise; camera noise.]

We propose a Mixture of Gaussians (MoG) noise model.
Ø Universal approximation property of MoG: any continuous distribution can be approximated arbitrarily well by a MoG (Maz'ya and Schmidt, 1996).
Ø E.g., a Laplace distribution can be equivalently expressed as a scaled MoG (Andrews and Mallows, 1974).
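Written out, the Andrews-Mallows identity says a Laplace distribution is a continuous scale mixture of zero-mean Gaussians with an exponential mixing density on the variance (a standard result, notation mine):

    \frac{1}{2b}\,e^{-|x|/b}
    \;=\;
    \int_{0}^{\infty}\mathcal{N}(x\mid 0,\,v)\;
    \frac{1}{2b^{2}}\,e^{-v/(2b^{2})}\,dv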

MLE model: maximize the likelihood of the data under MoG noise.
Ø Use the EM algorithm to solve it!
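The slide's formula did not survive extraction; the following is a reconstruction of the MoG-LRMF model consistent with the cited ICCV 2013 paper (notation mine). Each observed entry is a low-rank term plus MoG noise, and the log-likelihood is maximized over the observed index set Omega:

    x_{ij} = \mathbf{u}_i^{\top}\mathbf{v}_j + e_{ij},\qquad
    e_{ij} \sim \sum_{k=1}^{K}\pi_k\,\mathcal{N}(0,\,\sigma_k^{2})
    \max_{U,\,V,\,\{\pi_k,\sigma_k^2\}}\;
    \sum_{(i,j)\in\Omega}\log\sum_{k=1}^{K}
    \pi_k\,\mathcal{N}\big(x_{ij}-\mathbf{u}_i^{\top}\mathbf{v}_j \mid 0,\,\sigma_k^{2}\big)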

Ø E step: compute the expected responsibilities of each Gaussian for each entry.
Ø M step: update the mixture parameters and the factors U, V.
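The update formulas were also lost in extraction; a hedged reconstruction of the standard EM updates for the model above (notation mine):

    % E step: responsibility of Gaussian k for the residual at entry (i,j)
    \gamma_{ijk} =
    \frac{\pi_k\,\mathcal{N}(x_{ij}-\mathbf{u}_i^{\top}\mathbf{v}_j \mid 0,\sigma_k^2)}
         {\sum_{l=1}^{K}\pi_l\,\mathcal{N}(x_{ij}-\mathbf{u}_i^{\top}\mathbf{v}_j \mid 0,\sigma_l^2)}
    % M step: closed-form mixture updates ...
    \pi_k = \frac{1}{|\Omega|}\sum_{(i,j)\in\Omega}\gamma_{ijk},\qquad
    \sigma_k^2 = \frac{\sum_{(i,j)\in\Omega}\gamma_{ijk}(x_{ij}-\mathbf{u}_i^{\top}\mathbf{v}_j)^2}
                      {\sum_{(i,j)\in\Omega}\gamma_{ijk}}
    % ... while U, V solve a weighted-L2 LRMF with w_ij = sum_k gamma_ijk / sigma_k^2:
    \min_{U,V}\ \sum_{(i,j)\in\Omega} w_{ij}\,\big(x_{ij}-\mathbf{u}_i^{\top}\mathbf{v}_j\big)^2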

Synthetic experiments
Ø Three noise cases:
  - Gaussian noise
  - sparse noise
  - mixture noise
Ø Six error measurements:
  - what the L2 and L1 methods optimize
  - good measures of how well the ground-truth subspace is estimated
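To make the pipeline concrete, here is a minimal Python sketch of the EM loop for fully observed data. The function name mog_lrmf, the initialization, and all parameter choices are my own illustrative assumptions, not the authors' released code; the weighted-L2 subproblem is handled with one alternating-least-squares sweep per EM iteration.

    import numpy as np

    def mog_lrmf(X, r, K=2, n_iters=50, seed=0):
        """EM for rank-r factorization X ~ U @ V.T with MoG noise (toy sketch)."""
        rng = np.random.default_rng(seed)
        m, n = X.shape
        U = rng.standard_normal((m, r))
        V = rng.standard_normal((n, r))
        pi = np.full(K, 1.0 / K)
        sig2 = X.var() * (4.0 ** np.arange(K)) / 10.0   # spread-out initial variances
        for _ in range(n_iters):
            E = X - U @ V.T                              # residual matrix
            # E step: responsibility gamma[i, j, k] of Gaussian k for entry (i, j)
            logp = (np.log(pi) - 0.5 * np.log(2 * np.pi * sig2)
                    - E[..., None] ** 2 / (2 * sig2))
            logp -= logp.max(axis=2, keepdims=True)      # numerical stability
            gamma = np.exp(logp)
            gamma /= gamma.sum(axis=2, keepdims=True)
            # M step: closed-form mixture updates
            Nk = gamma.sum(axis=(0, 1))
            pi = Nk / (m * n)
            sig2 = (gamma * E[..., None] ** 2).sum(axis=(0, 1)) / Nk
            # M step: weighted ALS sweep, weights w_ij = sum_k gamma_ijk / sigma_k^2
            W = (gamma / sig2).sum(axis=2)
            for i in range(m):                           # row-wise weighted least squares
                A = V * W[i][:, None]
                U[i] = np.linalg.solve(V.T @ A + 1e-8 * np.eye(r), A.T @ X[i])
            for j in range(n):                           # column-wise weighted least squares
                A = U * W[:, j][:, None]
                V[j] = np.linalg.solve(U.T @ A + 1e-8 * np.eye(r), A.T @ X[:, j])
        return U, V, pi, sig2

    # Toy mixture-noise run: dense Gaussian noise plus 10% large sparse outliers.
    rng = np.random.default_rng(1)
    L = rng.standard_normal((60, 4)) @ rng.standard_normal((4, 40))
    noise = 0.05 * rng.standard_normal(L.shape)
    mask = rng.random(L.shape) < 0.1
    noise[mask] += rng.uniform(-5, 5, size=int(mask.sum()))
    U, V, pi, sig2 = mog_lrmf(L + noise, r=4)
    print(np.linalg.norm(L - U @ V.T) / np.linalg.norm(L))  # relative recovery error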

[Result plots compare our method against the L2 methods and the L1 methods.]
Gaussian noise experiments
Ø MoG performs similarly to the L2 methods and better than the L1 methods.
Sparse noise experiments
Ø MoG performs as well as the best L1 method and better than the L2 methods.
Mixture noise experiments
Ø MoG performs better than all competing L2 and L1 methods.

Why is MoG robust to outliers?
Ø L1 methods perform well in outlier and heavy-noise cases because the Laplacian is a heavy-tailed distribution.
Ø By fitting the noise with two Gaussians, the obtained MoG distribution is also heavy-tailed.
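A quick numerical illustration of the heavy-tail point (the specific numbers are my own, not from the slides): mixing in a small-weight, large-variance Gaussian makes large residuals vastly more probable than under a single Gaussian, so outliers are no longer over-penalized.

    from math import erf, sqrt

    def tail(t, var):
        """P(|e| > t) for zero-mean Gaussian noise with the given variance."""
        return 1.0 - erf(t / sqrt(2.0 * var))

    t = 5.0
    single = tail(t, 1.0)                            # N(0, 1)
    mog = 0.9 * tail(t, 1.0) + 0.1 * tail(t, 25.0)   # 0.9*N(0,1) + 0.1*N(0,25)
    print(single, mog)   # ~5.7e-07 vs ~3.2e-02: the mixture's tail is far heavier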

Face modeling experiments

Explanation: the extracted noise components have physical interpretations, namely saturation and shadow noise, and camera noise.

Background Subtraction

Summary
Ø We propose an LRMF model with Mixture of Gaussians (MoG) noise.
Ø The new method handles outliers as well as the L1-norm methods do, but in a more efficient way.
Ø The extracted noise components have physical meaning.

Thanks!