Petuum: An Iterative-Convergent Distributed Machine Learning Framework. SU YUXIN, JAN 20, 2014

Outline: Introduction, Implementation, Questions, Demo

Introduction to Petuum

Bulk Synchronous Parallel
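
The slide gives only the name of the model, so the following is a minimal Python sketch of the general bulk synchronous parallel pattern, not of Petuum's code. Each worker updates shared state during a superstep and then waits at a barrier, so every worker sees the same value before the next superstep starts. The function name `run_bsp` and the shared counter are assumptions made for illustration.

```python
import threading


def run_bsp(num_workers, num_supersteps):
    """Run a toy BSP computation and return what each worker observed."""
    shared = {"value": 0}
    lock = threading.Lock()
    barrier = threading.Barrier(num_workers)
    observed = [[] for _ in range(num_workers)]

    def worker(worker_id):
        for _ in range(num_supersteps):
            # Compute/communicate phase: every worker contributes one update.
            with lock:
                shared["value"] += 1
            # Synchronization barrier: wait until all updates are in.
            barrier.wait()
            # No worker writes between the two barriers, so this read is consistent.
            observed[worker_id].append(shared["value"])
            # Second barrier: nobody starts the next superstep before all have read.
            barrier.wait()

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return observed


if __name__ == "__main__":
    print(run_bsp(num_workers=4, num_supersteps=3))
```

The cost of this pattern is that the slowest worker sets the pace of every superstep.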

Asynchronous: parameters can be read and updated at any time

Stale Synchronous Parallel
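
As a rough illustration of the bounded-staleness idea, the Python sketch below assumes that a worker which has completed `c` iterations may start its next one only if the slowest worker is at most `staleness` iterations behind it. The names `ssp_can_proceed` and `simulate`, and the tick-based notion of worker speed, are hypothetical and chosen only for this example. With `staleness = 0` the rule degenerates to lockstep execution.

```python
def ssp_can_proceed(worker_clock, all_clocks, staleness):
    """True if a worker at `worker_clock` may start its next iteration."""
    return worker_clock - min(all_clocks) <= staleness


def simulate(periods, staleness, ticks):
    """Toy simulation: worker w finishes an iteration every periods[w] ticks,
    unless the staleness rule blocks it. Returns (final clocks, largest gap seen)."""
    clocks = [0] * len(periods)
    max_gap = 0
    for t in range(1, ticks + 1):
        for w, period in enumerate(periods):
            if t % period == 0 and ssp_can_proceed(clocks[w], clocks, staleness):
                clocks[w] += 1
                max_gap = max(max_gap, max(clocks) - min(clocks))
    return clocks, max_gap


if __name__ == "__main__":
    # One fast worker (every tick) and one slow worker (every third tick).
    print(simulate(periods=[1, 3], staleness=2, ticks=12))
```

In this sketch the fast worker runs ahead freely until the gap reaches the bound, then waits for the slow one, so the gap in completed iterations never exceeds `staleness + 1`.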

Convergence

Programming: read(table, row, col), inc(table, row, col, value), iteration()
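
To make the three calls concrete, here is a single-process Python sketch of a table store exposing the same operations. Only the three method names and their parameters come from the slide; the class name `TableStore`, the in-memory dictionary, and the toy objective are assumptions, and nothing here is distributed.

```python
from collections import defaultdict


class TableStore:
    """Hypothetical in-memory stand-in for a shared parameter table."""

    def __init__(self):
        self._cells = defaultdict(float)  # (table, row, col) -> value
        self.clock = 0                    # number of completed iterations

    def read(self, table, row, col):
        return self._cells[(table, row, col)]

    def inc(self, table, row, col, value):
        # Additive update: increments commute, so their order does not matter.
        self._cells[(table, row, col)] += value

    def iteration(self):
        # Marks the end of one iteration of the learning loop.
        self.clock += 1
        return self.clock


def minimize_quadratic(store, steps, learning_rate=0.25):
    """Gradient descent on f(w) = (w - 3)^2, with w stored in table 'weights'."""
    for _ in range(steps):
        w = store.read("weights", 0, 0)
        gradient = 2.0 * (w - 3.0)
        store.inc("weights", 0, 0, -learning_rate * gradient)
        store.iteration()
    return store.read("weights", 0, 0)


if __name__ == "__main__":
    print(minimize_quadratic(TableStore(), steps=20))
```

The loop body has the read, increment, end-of-iteration shape the slide lists: the algorithm never assigns a parameter directly, it only reads and adds deltas.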

Implementation

Overview in Logic

Overview in the Real

Main Components

Table

ConsistencyController::DoGet()

ConsistencyController::iterate()

Server::GetRow()
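
The slides name `DoGet()`, `iterate()` and `GetRow()` without showing their bodies, so the following Python sketch is a guess at how such pieces could fit together under bounded staleness, not a description of Petuum's implementation. It assumes a single worker, a client-side row cache, and increments buffered until the end of an iteration; `ToyServer`, `ToyConsistencyController` and every field name are hypothetical.

```python
class ToyServer:
    """Hypothetical server holding the authoritative rows."""

    def __init__(self):
        self.rows = {}   # row_id -> {col: value}
        self.clock = 0   # last iteration whose updates were applied

    def get_row(self, row_id):
        # Return a copy of the row plus the clock it is valid for.
        return dict(self.rows.get(row_id, {})), self.clock

    def apply(self, updates, clock):
        for (row_id, col), delta in updates.items():
            row = self.rows.setdefault(row_id, {})
            row[col] = row.get(col, 0.0) + delta
        self.clock = clock


class ToyConsistencyController:
    """Hypothetical client-side controller with a bounded-staleness cache."""

    def __init__(self, server, staleness):
        self.server = server
        self.staleness = staleness
        self.clock = 0
        self.cache = {}    # row_id -> (row copy, server clock at fetch time)
        self.pending = {}  # (row_id, col) -> buffered increment
        self.fetches = 0

    def do_get(self, row_id, col):
        cached = self.cache.get(row_id)
        # Refetch on a miss, or when the cached copy is older than allowed.
        if cached is None or self.clock - cached[1] > self.staleness:
            self.cache[row_id] = self.server.get_row(row_id)
            self.fetches += 1
        return self.cache[row_id][0].get(col, 0.0)

    def inc(self, row_id, col, delta):
        # Simplification: buffered increments are not visible to do_get
        # until they have been flushed and the row has been refetched.
        key = (row_id, col)
        self.pending[key] = self.pending.get(key, 0.0) + delta

    def iterate(self):
        # End of iteration: advance the clock and flush buffered updates.
        self.clock += 1
        self.server.apply(self.pending, self.clock)
        self.pending = {}
```

The point of the cache check in `do_get` is that a read only costs a network round trip when the local copy has fallen outside the staleness window.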

Least-Recently-Used (LRU) Strategy
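
For readers unfamiliar with the policy, here is a minimal generic LRU cache in Python. It shows the eviction rule only; the class name `LRUCache` and the use of row identifiers as keys are assumptions, and the slides do not say how the actual cache is built.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used

    def __contains__(self, key):
        return key in self._items

    def __len__(self):
        return len(self._items)


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("row0", [0.1, 0.2])
    cache.put("row1", [0.3, 0.4])
    cache.get("row0")              # row0 becomes the most recently used
    cache.put("row2", [0.5, 0.6])  # evicts row1
    print("row1" in cache, "row0" in cache)
```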

Questions

Is Lock-Free Possible? Data exchange in real time? next …

Is Auto-Rescheduling Possible? Sub-centralized servers could reduce communication cost.

Is Auto-Partition Possible? Run ML algorithms as if they were in a single thread. A solution for all ML algorithms.

In-Memory or In-Storage? Data capacity is greater than memory size. Memory should be a cache for disk storage. Solutions for disk storage: Hadoop, Spark, …

New Schema to Reduce the Upper Bound?

STRADS Scheduler: Variable Correlations; Auto-Parallelization; Dynamic Prioritization; Monitor the contribution of variables to the objective function; Load-Balancing in Task
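
The slide lists the scheduler's concerns without giving an algorithm. As one possible reading of how prioritization and variable correlations might interact, the Python sketch below greedily picks the variables that recently contributed most to the objective and skips any variable that is strongly correlated with one already chosen, so the selected set can be updated in parallel. This is a simplified illustration, not the STRADS algorithm; the function name `schedule`, the `max_corr` threshold, and the input format are assumptions.

```python
def schedule(priorities, correlation, k, max_corr):
    """Pick up to k variable indices to update in the next round.

    priorities[j]     : recent contribution of variable j to the objective
    correlation[i][j] : correlation between variables i and j
    max_corr          : largest |correlation| tolerated within one round
    """
    # Visit variables from highest to lowest priority.
    order = sorted(range(len(priorities)), key=lambda j: priorities[j], reverse=True)
    chosen = []
    for j in order:
        if len(chosen) == k:
            break
        # Keep j only if it is weakly correlated with everything already chosen.
        if all(abs(correlation[j][i]) <= max_corr for i in chosen):
            chosen.append(j)
    return chosen


if __name__ == "__main__":
    priorities = [0.9, 0.8, 0.1, 0.5]
    correlation = [
        [1.0, 0.95, 0.0, 0.0],
        [0.95, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]
    # Variable 1 is skipped because it is strongly correlated with variable 0.
    print(schedule(priorities, correlation, k=2, max_corr=0.5))
```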

Demo: Switch to my laptop …
