Angel at a glance
Fitz Wang, Tencent

Agenda
• Overview of Angel
• Infrastructure
• License
• Users

Overview of Angel
• High-performance distributed machine learning platform based on the Parameter Server (PS) architecture
• A high-performance math library makes Angel suitable for high-dimensional sparse data
• Currently, Angel focuses on algorithms for recommendation systems, including newly proposed deep learning algorithms
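To make the Parameter Server idea concrete, here is a minimal single-process sketch of the pull/compute/push cycle. It is not Angel's API: the class names `ParameterServer` and `ShardedModel`, the modulo sharding rule, and the squared-loss SGD step are all assumptions chosen for illustration.

```python
class ParameterServer:
    """One shard of the model: a sparse map from parameter index to value."""

    def __init__(self):
        self.weights = {}

    def pull(self, keys):
        # Missing parameters are treated as 0.0 without being materialized.
        return {k: self.weights.get(k, 0.0) for k in keys}

    def push(self, updates):
        # Updates are additive deltas, so pushes from several workers compose.
        for k, delta in updates.items():
            self.weights[k] = self.weights.get(k, 0.0) + delta


class ShardedModel:
    """Routes each parameter index to one server (hypothetical modulo rule)."""

    def __init__(self, num_servers):
        self.servers = [ParameterServer() for _ in range(num_servers)]

    def _shard(self, key):
        return self.servers[key % len(self.servers)]

    def pull(self, keys):
        out = {}
        for k in keys:
            out.update(self._shard(k).pull([k]))
        return out

    def push(self, updates):
        for k, delta in updates.items():
            self._shard(k).push({k: delta})


def worker_step(model, example, label, lr=0.1):
    """One SGD step on squared loss for a sparse example {index: value}."""
    w = model.pull(example.keys())           # pull only the needed parameters
    pred = sum(w[k] * v for k, v in example.items())
    err = pred - label
    model.push({k: -lr * err * v for k, v in example.items()})  # push deltas
    return err
```

A worker only ever touches the parameters that appear in its own examples, which is why this layout fits high-dimensional sparse data: calling `worker_step(ShardedModel(3), {7: 2.0, 10**12: 1.0}, 1.0)` stores two entries in total, regardless of the nominal model dimension.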

Angel Architecture

Angel Math Library
• Problems of existing math libraries:
  • Generic types are inefficient
  • The model is too big for a Java array to hold, so we need long-key vectors
  • Sparse data is hard to deal with
• Our design:
  • Data type: Double, Float, Long, Integer
  • Storage: Dense, Sparse, Sorted, Component Vector
  • Calculation/Operation:
    • Reduction: sum, average, std, norm, max, min
    • Element-wise: +, -, *, /, %
    • Unary: exp, log1p, pow, sigmoid, softthreshold
    • Binary: dot, axpy
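The design above can be illustrated with a small sorted sparse vector. Java arrays are indexed by `int`, so a dense array cannot address more than about 2^31 entries, whereas a sparse layout that stores explicit indices can use keys of any size. The sketch below is an assumption-laden illustration, not Angel's math library: the class name `SortedSparseVector` and its method names are hypothetical, and only a few of the listed operations (dot, axpy, softthreshold, norm) are shown.

```python
import math


class SortedSparseVector:
    """Sparse vector stored as parallel, index-sorted lists of keys and values."""

    def __init__(self, dim, entries):
        # Drop explicit zeros and keep indices sorted so that binary
        # operations can be done with a linear merge.
        items = sorted((i, v) for i, v in entries.items() if v != 0.0)
        self.dim = dim
        self.indices = [i for i, _ in items]
        self.values = [v for _, v in items]

    def dot(self, other):
        """Binary op: inner product via a two-pointer merge over sorted indices."""
        i = j = 0
        total = 0.0
        while i < len(self.indices) and j < len(other.indices):
            a, b = self.indices[i], other.indices[j]
            if a == b:
                total += self.values[i] * other.values[j]
                i += 1
                j += 1
            elif a < b:
                i += 1
            else:
                j += 1
        return total

    def axpy(self, alpha, x):
        """Binary op: return alpha * x + self as a new vector."""
        merged = dict(zip(self.indices, self.values))
        for i, v in zip(x.indices, x.values):
            merged[i] = merged.get(i, 0.0) + alpha * v
        return SortedSparseVector(self.dim, merged)

    def soft_threshold(self, t):
        """Unary op: shrink each value toward zero by t, dropping new zeros."""
        shrunk = {
            i: math.copysign(max(abs(v) - t, 0.0), v)
            for i, v in zip(self.indices, self.values)
        }
        return SortedSparseVector(self.dim, shrunk)

    def norm(self):
        """Reduction: Euclidean norm over the stored values."""
        return math.sqrt(sum(v * v for v in self.values))
```

For example, with `x = SortedSparseVector(2**40, {3: 1.0, 2**35: 2.0})` and `y = SortedSparseVector(2**40, {2**35: 4.0, 2**36: -1.0})`, the only shared index is `2**35`, so `x.dot(y)` is `8.0` even though the nominal dimension is far beyond what a dense array could hold.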

Algorithms (1/2)
• Angel
  • Logistic Regression
  • Large Scale Piece-wise Linear Model
  • Matrix Factorization
  • SVM
  • KMeans
  • GBDT
  • LDA* (WarpLDA)
• Spark on Angel
  • Logistic Regression
  • Sparse LR with FTRL
  • GBDT
  • KMeans
• VLDB: LDA*: A Robust and Large-scale Topic Modeling System
• SIGMOD: Heterogeneity-aware Distributed Parameter Servers
• ICDE: TencentBoost: A Gradient Boosting Tree System with Parameter Server
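As an illustration of the "Sparse LR with FTRL" entry, the following is a minimal sketch of the standard per-coordinate FTRL-Proximal update for logistic regression. It is a generic textbook formulation, not the Spark on Angel implementation; the class name `FTRLProximal` and the default hyperparameters are assumptions.

```python
import math


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for logistic regression on sparse inputs."""

    def __init__(self, alpha=0.5, beta=1.0, l1=0.1, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = {}  # per-coordinate adjusted gradient sums
        self.n = {}  # per-coordinate sums of squared gradients

    def weight(self, i):
        # Weights are derived lazily from (z, n); the L1 term keeps a
        # coordinate at exactly zero until |z| exceeds l1.
        z = self.z.get(i, 0.0)
        if abs(z) <= self.l1:
            return 0.0
        n = self.n.get(i, 0.0)
        sign = 1.0 if z > 0 else -1.0
        return -(z - sign * self.l1) / (
            (self.beta + math.sqrt(n)) / self.alpha + self.l2
        )

    def predict(self, x):
        # x is a sparse example {feature index: value}.
        s = sum(self.weight(i) * v for i, v in x.items())
        s = max(min(s, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-s))

    def update(self, x, y):
        """Process one example with label y in {0, 1}; return the prediction."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of the log loss for coordinate i
            n = self.n.get(i, 0.0)
            sigma = (math.sqrt(n + g * g) - math.sqrt(n)) / self.alpha
            self.z[i] = self.z.get(i, 0.0) + g - sigma * self.weight(i)
            self.n[i] = n + g * g
        return p
```

Only the coordinates present in an example are read or written, and any feature whose accumulated `z` stays within `l1` keeps a weight of exactly zero, which is what makes the resulting model sparse.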

Algorithms (2/2)
[Figure: chart titled "Algorithms" with axis ticks from 0 to 900. Labels, in extracted order: LogisticRegression, RandomForest, KMeans, FastUnfolding, Word2Vec, SVM, CollaborativeFilteri..., LabelPropagation, GBDT, LDA, XGboost, DecisionTree, LinearRegression, NaiveBayes, PageRank, BRNN, Encoder, IsolationForest, PCA, DecisionTreeRegr..., HyperAnf, TF-IDF, KCore, FPGrowth, CNN classification, HashingTF, CommonFriends, HANP, RNN, LSTM, DBSCAN, SVD, LPA, CNN regression, MF, KMeans, RNN, Seq2seq, DBN classification, MobileNet, AlexNet, LabelPropagation, Inception, SSD, ResNet, VGG, LR_FTRL, ScatterLine]

Infrastructure
• Build tool: Maven
• CI tool: Travis CI
• Although Angel supports local mode, we recommend cluster mode to fully test Angel. Typically, Angel requires:
  • a client
  • a global master, which manages all the PSs and workers
  • three PSs
  • three workers
• Since Angel runs on YARN, the Hadoop suite is also required.

License
• BSD 3-Clause License
• https://github.com/Tencent/angel/blob/master/LICENSE.TXT

Users
