Matrix Factorization Recovering latent factors in a matrix
- Slides: 50
Matrix Factorization
Recovering latent factors in a matrix
V is an n × m matrix (n users as rows, m movies as columns) with entries v11 … vij … vnm.
V[i, j] = user i’s rating of movie j
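Recovering latent factors in a matrix
The n × 2 matrix with rows (x1, y1), (x2, y2), …, (xn, yn), one row per user, times the 2 × m matrix with rows (a1, a2, …, am) and (b1, b2, …, bm), one column per movie, approximates the n × m matrix V: vij ≈ xi·aj + yi·bj.
V[i, j] = user i’s rating of movie j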
KDD 2011 talk pilfered from …
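Recovering latent factors in a matrix
V ≈ W H, where W is n × r (one row of r latent factors per user; rows (x1, y1), …, (xn, yn) in the two-factor picture) and H is r × m (one column per movie; rows (a1, …, am) and (b1, …, bm) in the two-factor picture).
V[i, j] = user i’s rating of movie j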
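To make the shapes in V ≈ W H concrete, here is a minimal sketch that recovers rank-r factors from a small, fully observed synthetic matrix. It uses a truncated SVD, which is a swapped-in technique chosen for brevity and not the SGD method these slides go on to discuss; the function name `low_rank_factors` and the synthetic data are illustrative assumptions.

```python
import numpy as np

def low_rank_factors(V, r):
    """Return W (n x r) and H (r x m) such that W @ H is the best rank-r fit to V.

    Assumption: V is fully observed, so a truncated SVD applies directly.
    """
    U, s, Vt = np.linalg.svd(V, full_matrices=False)
    W = U[:, :r] * s[:r]   # scale each kept left singular vector by its singular value
    H = Vt[:r, :]          # keep the first r right singular vectors as rows
    return W, H

rng = np.random.default_rng(0)
n, m, r = 6, 5, 2
# Synthetic "ratings" that are exactly rank 2 by construction.
V = rng.normal(size=(n, r)) @ rng.normal(size=(r, m))
W, H = low_rank_factors(V, r)
```

Each row of `W` plays the role of a user's latent factors and each column of `H` a movie's latent factors. Real rating matrices are mostly unobserved, which is one reason the slides turn to SGD instead.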
for image denoising
Matrix factorization as SGD [equation on slide; annotation: step size]
Matrix factorization as SGD - why does this work? [equation on slide; annotation: step size]
Matrix factorization as SGD - why does this work? Here’s the key claim:
Checking the claim
Think of SGD for logistic regression:
• LR loss = compare y and ŷ = dot(w, x)
• Matrix factorization is similar, but now both w (user weights) and x (movie weights) are updated
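The analogy with logistic regression can be sketched in code: for each observed rating, compare v with the prediction dot(w, x) and nudge both the user vector and the movie vector. This is a minimal sketch assuming squared loss, no regularization, and a fixed step size; the function name `sgd_mf`, its parameters, and the synthetic data are illustrative assumptions, not the setup of the talk.

```python
import numpy as np

def squared_loss(ratings, W, H):
    """Sum of squared errors over the observed (i, j, v) triples."""
    return sum((v - W[i] @ H[:, j]) ** 2 for i, j, v in ratings)

def sgd_mf(ratings, n, m, r=2, step=0.05, epochs=200, seed=0):
    """SGD for V ~ W H under squared loss (assumed loss; constant factor folded into step)."""
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((n, r))   # small random init, not at zero
    H = 0.1 * rng.standard_normal((r, m))
    ratings = list(ratings)
    history = []
    for _ in range(epochs):
        history.append(squared_loss(ratings, W, H))   # loss at the start of the epoch
        for idx in rng.permutation(len(ratings)):
            i, j, v = ratings[idx]
            err = v - W[i] @ H[:, j]         # compare v with the prediction dot(w, x)
            w_old = W[i].copy()              # use the old user vector for the movie update
            W[i] += step * err * H[:, j]     # update user weights
            H[:, j] += step * err * w_old    # update movie weights
    history.append(squared_loss(ratings, W, H))
    return W, H, history

rng = np.random.default_rng(1)
n, m, r = 5, 5, 2
V = rng.uniform(0.5, 1.5, (n, r)) @ rng.uniform(0.5, 1.5, (r, m))
ratings = [(i, j, V[i, j]) for i in range(n) for j in range(m)]
W, H, history = sgd_mf(ratings, n, m, r=r)
```

Note that one update touches only row i of W and column j of H, which is what makes it possible to process blocks with disjoint rows and columns independently.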
What loss functions are possible?
N1, N2: diagonal matrices, sort of like IDF factors for the users/movies
“generalized” KL-divergence
ALS = alternating least squares
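As a rough illustration of the alternating idea, the sketch below fixes H and solves a least-squares problem for W, then fixes W and solves for H, and repeats. It assumes a fully observed V and adds a small ridge term `lam` for numerical stability; both are assumptions made for brevity, and the name `als` is hypothetical.

```python
import numpy as np

def als(V, r, iters=20, lam=1e-3, seed=0):
    """Alternating least squares for V ~ W H (fully observed V, small ridge term: assumptions)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.standard_normal((n, r))
    H = rng.standard_normal((r, m))
    ridge = lam * np.eye(r)
    for _ in range(iters):
        # Fix H, solve for W:  W = V H^T (H H^T + lam I)^-1
        W = np.linalg.solve(H @ H.T + ridge, H @ V.T).T
        # Fix W, solve for H:  H = (W^T W + lam I)^-1 W^T V
        H = np.linalg.solve(W.T @ W + ridge, W.T @ V)
    return W, H

rng = np.random.default_rng(0)
V = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 5))   # synthetic rank-2 matrix
W, H = als(V, r=2)
rel_err = np.linalg.norm(V - W @ H) / np.linalg.norm(V)
```

Each half-step is an ordinary linear least-squares solve, so no step size is needed, in contrast to SGD.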
KDD 2011 talk pilfered from …
Similar to McDonnell et al. with perceptron learning
Slow convergence…
More detail…
• Randomly permute rows/cols of matrix
• Chop V, W, H into blocks of size d x d
  – m/d blocks in W, n/d blocks in H
• Group the data:
  – Pick a set of blocks with no overlapping rows or columns (a stratum)
  – Repeat until all blocks in V are covered
• Train the SGD
  – Process strata in series
  – Process blocks within a stratum in parallel
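The grouping step above can be sketched as follows. This assumes V is cut into a square grid of `num_blocks` × `num_blocks` blocks and builds strata by cyclic shifts, which is one simple way to get blocks with no overlapping rows or columns; the names `make_strata`, `block_of`, and `num_blocks` are hypothetical, and the random row/column permutation is omitted.

```python
def make_strata(num_blocks):
    """Return num_blocks strata, each a list of (row_block, col_block) pairs.

    Within a stratum no two blocks share a row block or a column block,
    so their SGD updates touch disjoint parts of W and H.
    """
    return [
        [(b, (b + shift) % num_blocks) for b in range(num_blocks)]
        for shift in range(num_blocks)
    ]

def block_of(i, j, n_rows, n_cols, num_blocks):
    """Map entry (i, j) of an n_rows x n_cols matrix to its (row_block, col_block)."""
    return (i * num_blocks // n_rows, j * num_blocks // n_cols)

strata = make_strata(3)
# strata[0] holds the diagonal blocks: [(0, 0), (1, 1), (2, 2)]
```

A driver would then loop over the strata in series and hand the blocks of each stratum to parallel workers, each running SGD on the ratings that `block_of` assigns to its block.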
More detail… (the matrix labeled Z on this slide is the V of the earlier slides)
More detail…
M = [matrix shown on the slide; not recovered]
• Initialize W, H randomly
  – not at zero
• Choose a random ordering (random sort) of the points in a stratum in each “sub-epoch”
• Pick strata sequence by permuting rows and columns of M, and using M’[k, i] as column index of row i in sub-epoch k
• Use “bold driver” to set step size:
  – increase step size when loss decreases (in an epoch)
  – decrease step size when loss increases
• Implemented in Hadoop and R/Snowfall
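The “bold driver” rule above can be sketched in a few lines. The growth and shrink factors (`grow=1.05`, `shrink=0.5`) are illustrative assumptions; the slides only say that the step size goes up when the loss decreases and down when it increases.

```python
def bold_driver(step, prev_loss, curr_loss, grow=1.05, shrink=0.5):
    """Return the step size for the next epoch.

    grow and shrink are assumed values, not taken from the slides.
    """
    if curr_loss < prev_loss:
        return step * grow     # loss went down: be a little bolder
    return step * shrink       # loss went up (or stalled): back off sharply

step = 0.1
step_after_improvement = bold_driver(step, prev_loss=10.0, curr_loss=9.0)
step_after_setback = bold_driver(step, prev_loss=9.0, curr_loss=10.0)
```

The rule is applied once per epoch, after the loss for that epoch has been measured.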
Wall Clock Time 8 nodes, 64 cores, R/snow
Number of Epochs
Varying rank 100 epochs for all
Hadoop scalability Hadoop process setup time starts to dominate
Hadoop scalability