Quantum Boltzmann Machine
Mohammad Amin
D-Wave Systems Inc.
Copyright © 2016, D-Wave Systems Inc.

Not the only use of quantum annealing (QA)
Maybe not the best use of QA

Adiabatic Quantum Computation
$H(t) = (1 - s) H_D + s H_P$, with $s = t/t_f$
$t_f \sim (1/g_{\min})^2$
[Figure: energy levels versus $s$ from 0 to 1, running from the initial state to the solution, with the minimum gap $g_{\min}$ marked.]
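
The interpolation $H(t) = (1 - s) H_D + s H_P$ and the scaling $t_f \sim (1/g_{\min})^2$ can be illustrated numerically. The following is a minimal sketch, not taken from the slides: the 3-qubit driver and problem Hamiltonians, the coupling values, and the helper name `op_on` are all illustrative assumptions.

```python
import numpy as np

# Pauli matrices and the 2x2 identity
SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def op_on(single, site, n):
    """Embed a single-qubit operator at position `site` of an n-qubit register."""
    out = np.array([[1.0]])
    for k in range(n):
        out = np.kron(out, single if k == site else I2)
    return out

n = 3
# Assumed driver Hamiltonian H_D: a transverse field on every qubit
H_D = -sum(op_on(SX, a, n) for a in range(n))
# Assumed toy problem Hamiltonian H_P: ferromagnetic chain plus a bias on qubit 0
H_P = -0.5 * op_on(SZ, 0, n)
for a in range(n - 1):
    H_P = H_P - op_on(SZ, a, n) @ op_on(SZ, a + 1, n)

# Scan s from 0 to 1 and record the gap between the two lowest levels
s_grid = np.linspace(0.0, 1.0, 201)
gaps = []
for s in s_grid:
    levels = np.linalg.eigvalsh((1 - s) * H_D + s * H_P)
    gaps.append(levels[1] - levels[0])
gaps = np.array(gaps)

g_min = gaps.min()
s_star = s_grid[gaps.argmin()]
print(f"g_min = {g_min:.4f} at s = {s_star:.3f}; (1/g_min)^2 = {1 / g_min**2:.2f}")
```

For a toy problem this small the gap stays large; the point of the sketch is only how $g_{\min}$ is read off the spectrum along the path.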

Thermal Noise
[Figure: a system coupled to a bath through an interaction; energy levels with the thermal energy $k_B T$ marked; $P_0$ plotted against $s$ from 0 to 1.]
Dynamical freeze-out

Open quantum calculations of a 16-qubit random problem
[Figure: plot labelled "Classical energies".]

Equilibration Causes Correlation
with simulated annealing
Hen et al., PRA 92, 042325 (2015)

Equilibration Causes Correlation
with Quantum Monte Carlo
Boixo et al., Nature Phys. 10, 218 (2014)

Equilibration Causes Correlation
with spin vector Monte Carlo (SVMC)
Shin et al., arXiv:1401.7087

Equilibration Can Mask Quantum Speedup
Brooke et al., Science 284, 779 (1999)
Quantum advantage is expected to be dynamical

Equilibration Can Mask Quantum Speedup
Ronnow et al., Science 345, 420 (2014)
Hen et al., arXiv:1502.01663
King et al., arXiv:1502.02098
Equilibrated probability!!!
Computation time is independent of dynamics!

Residual Energy vs Annealing Time
50 random problems, 100 samples per problem per annealing time
[Figure: bimodal problems ($J = \pm 1$, $h = 0$); mean residual energy and lowest residual energy versus annealing time (ms).]

Residual Energy vs Annealing Time
50 random problems, 100 samples per problem per annealing time
[Figure: two panels, frustrated loops ($\alpha = 0.25$) and bimodal ($J = \pm 1$, $h = 0$), each plotted against annealing time (ms).]

Boltzmann sampling is #P: harder than NP
What can we do with a quantum Boltzmann distribution?

arXiv:1601.02036
Evgeny Andriyash, Jason Rolfe, Bohdan Kulchytskyy, Roger Melko

Machine Learning in our Daily Life

Introduction to Machine Learning
[Figure labels: Data, Model, Unseen data, 3.]

Probabilistic Models
Data: probability distribution
Model: variables, parameters $\theta$
Training: tune $\theta$ such that the model distribution approximates the data distribution

Boltzmann Machine
Data
Model: variables, parameters $\theta$
Boltzmann distribution ($\beta = 1$)

Boltzmann Machine
Ising model: spins, parameters
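
The equations on these two slides did not survive extraction. As a hedged reconstruction, the standard Ising-model form of a Boltzmann machine at $\beta = 1$ can be written as below. The symbol names $b_a$ (biases) and $w_{ab}$ (couplings) follow the notation I recall from arXiv:1601.02036 and should be read as assumptions.

```latex
E_{\mathbf z} = -\sum_a b_a z_a - \sum_{a<b} w_{ab}\, z_a z_b ,
\qquad z_a \in \{-1, +1\}

P_{\mathbf z} = \frac{e^{-E_{\mathbf z}}}{Z},
\qquad Z = \sum_{\mathbf z} e^{-E_{\mathbf z}}
```

Here the spins $z_a$ are the model variables and $\theta = \{b_a, w_{ab}\}$ are the parameters to be tuned.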

Adding Hidden Variables
$z_a = (z_\nu, z_i)$
Visible variables: $z_\nu$
Hidden variables: $z_i$

Training a BM
Tune $\theta$ such that the model distribution approximates the data distribution
Maximize the log-likelihood, or minimize the negative log-likelihood
Gradient descent technique, with a training rate
We need an efficient way to calculate the gradient
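
The training objective can be sketched in formulas. This is a reconstruction under assumptions: the symbols $P^{\text{data}}_{\mathbf v}$, $\mathcal{L}$, and $\eta$ (training rate) are my labels for the quantities the slide names in words, and the marginal over hidden variables uses the visible/hidden split $z_a = (z_\nu, z_i)$ from the previous slide.

```latex
P_{\mathbf v} = \sum_{\mathbf h} P_{(\mathbf v, \mathbf h)}
\qquad \text{(marginal over hidden variables)}

\mathcal{L} = -\sum_{\mathbf v} P^{\text{data}}_{\mathbf v} \ln P_{\mathbf v}
\qquad \text{(negative log-likelihood, to be minimized)}

\delta\theta = -\eta\, \partial_\theta \mathcal{L}
\qquad \text{(gradient descent step with training rate } \eta\text{)}
```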

Calculating the Gradient
Average with clamped visibles
Unclamped average

Training Ising Hamiltonian Parameters
Clamped average
Unclamped average
Gradients can be estimated using sampling!
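
The "clamped minus unclamped" structure of the classical gradient can be sketched for a small fully visible Boltzmann machine, where the clamped average is simply the data average. This is an illustrative sketch, not the presenter's code: it enumerates all states exactly instead of sampling, and the function names, the toy data distribution, and the training rate `eta` are assumptions.

```python
import itertools
import numpy as np

def all_states(n):
    """All 2^n spin configurations with entries in {-1, +1}."""
    return np.array(list(itertools.product([-1, 1], repeat=n)), dtype=float)

def model_probs(b, w, states):
    """Boltzmann probabilities for E_z = -sum_a b_a z_a - sum_{a<c} w_ac z_a z_c (beta = 1)."""
    energies = -states @ b - np.einsum("sa,ac,sc->s", states, np.triu(w, 1), states)
    weights = np.exp(-(energies - energies.min()))  # shift for numerical stability
    return weights / weights.sum()

def gradient_step(b, w, p_data, states, eta):
    """One step: parameter change = eta * (data average - model average)."""
    p_model = model_probs(b, w, states)
    mean_data = p_data @ states
    mean_model = p_model @ states
    corr_data = np.einsum("s,sa,sc->ac", p_data, states, states)
    corr_model = np.einsum("s,sa,sc->ac", p_model, states, states)
    b_new = b + eta * (mean_data - mean_model)
    w_new = w + eta * np.triu(corr_data - corr_model, 1)
    return b_new, w_new

n = 3
states = all_states(n)
# Toy data distribution: mostly all-down or all-up, small weight elsewhere
p_data = np.full(len(states), 0.1 / 6)
p_data[0] = 0.45    # (-1, -1, -1)
p_data[-1] = 0.45   # (+1, +1, +1)

def nll(b, w):
    """Negative log-likelihood of the data under the model."""
    return -p_data @ np.log(model_probs(b, w, states))

b, w = np.zeros(n), np.zeros((n, n))
nll_start = nll(b, w)
for _ in range(200):
    b, w = gradient_step(b, w, p_data, states, eta=0.1)
nll_end = nll(b, w)
print(f"negative log-likelihood: {nll_start:.4f} -> {nll_end:.4f}")
```

In practice the model averages would be estimated from samples rather than by enumeration, which is what makes a fast sampler attractive.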

Question: Is it possible to train a quantum Boltzmann machine?
Ising Hamiltonian → Transverse Ising Hamiltonian

Transverse Ising Hamiltonian
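
The slide's equation is missing from the transcript. A common form of the transverse Ising Hamiltonian, and the one I recall from arXiv:1601.02036, is $H = -\sum_a \Gamma_a \sigma^x_a - \sum_a b_a \sigma^z_a - \sum_{a<b} w_{ab} \sigma^z_a \sigma^z_b$. The sketch below builds it as a dense matrix; the function names and the example parameter values are assumptions for illustration only.

```python
import numpy as np

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def op_on(single, site, n):
    """Embed a single-qubit operator at position `site` of an n-qubit register."""
    out = np.array([[1.0]])
    for k in range(n):
        out = np.kron(out, single if k == site else I2)
    return out

def transverse_ising(gamma, b, w):
    """Dense matrix of H = -sum gamma_a sx_a - sum b_a sz_a - sum_{a<c} w_ac sz_a sz_c."""
    n = len(b)
    sx = [op_on(SX, a, n) for a in range(n)]
    sz = [op_on(SZ, a, n) for a in range(n)]
    H = np.zeros((2 ** n, 2 ** n))
    for a in range(n):
        H -= gamma[a] * sx[a] + b[a] * sz[a]
        for c in range(a + 1, n):
            H -= w[a, c] * (sz[a] @ sz[c])
    return H

# Example with assumed values: two qubits, uniform transverse field
H_example = transverse_ising(
    gamma=np.array([2.0, 2.0]),
    b=np.array([0.5, 0.0]),
    w=np.array([[0.0, 1.0], [0.0, 0.0]]),
)
print(np.round(np.linalg.eigvalsh(H_example), 4))
```

Setting every `gamma[a]` to zero makes the matrix diagonal and recovers the classical Ising energies on its diagonal.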

Quantum Boltzmann Distribution
Boltzmann probability distribution
Density matrix
Projection operator
Identity matrix
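
The formulas behind these labels are not in the transcript. As I recall from arXiv:1601.02036, the density matrix is $\rho = e^{-H}/\mathrm{Tr}[e^{-H}]$ and the probability of a visible state is $P_{\mathbf v} = \mathrm{Tr}[\Lambda_{\mathbf v} \rho]$, with the projection operator $\Lambda_{\mathbf v} = |\mathbf v\rangle\langle\mathbf v| \otimes I_h$ acting as the identity on hidden qubits. The sketch below computes $P_{\mathbf v}$ under that assumption; the function name and the two-qubit example are hypothetical, and visible qubits are assumed to come first in the tensor-product ordering.

```python
import numpy as np

def quantum_boltzmann_visible_probs(H, n_visible):
    """P_v = Tr[Lambda_v e^{-H}] / Tr[e^{-H}], with visible qubits as the leading factors."""
    dim = H.shape[0]
    energies, vecs = np.linalg.eigh(H)
    weights = np.exp(-(energies - energies.min()))  # shift for numerical stability
    rho = (vecs * weights) @ vecs.conj().T          # unnormalized e^{-H}
    rho = rho / np.trace(rho).real                  # density matrix
    diag = np.real(np.diag(rho))
    # Summing the diagonal over hidden indices applies |v><v| (x) I_h
    return diag.reshape(2 ** n_visible, dim // 2 ** n_visible).sum(axis=1)

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])
I2 = np.eye(2)

def example_hamiltonian(gamma):
    """Assumed toy model: qubit 0 visible (bias 0.5), qubit 1 hidden, coupling 1.0."""
    return (-gamma * (np.kron(SX, I2) + np.kron(I2, SX))
            - 0.5 * np.kron(SZ, I2)
            - 1.0 * np.kron(SZ, SZ))

p_classical = quantum_boltzmann_visible_probs(example_hamiltonian(0.0), n_visible=1)
p_quantum = quantum_boltzmann_visible_probs(example_hamiltonian(2.0), n_visible=1)
print("Gamma = 0:", np.round(p_classical, 4))
print("Gamma = 2:", np.round(p_quantum, 4))
```

With zero transverse field the result reduces to the classical marginal Boltzmann distribution, which is a convenient sanity check.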

Gradient Descent
Classically, the two gradient terms are equal (=) to the clamped average and the unclamped average

Calculating the Gradient
The gradient terms are not equal (≠) to the clamped average and the unclamped average, so the gradient cannot be estimated using sampling!

Two Useful Properties of Trace
Golden-Thompson inequality: for Hermitian matrices A and B
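
The Golden-Thompson inequality states that $\mathrm{Tr}\, e^{A+B} \le \mathrm{Tr}(e^{A} e^{B})$ for Hermitian $A$ and $B$. A quick numerical check on random matrices can make it concrete. This is only a sanity-check sketch; the matrix size, the seed, and the helper names are arbitrary choices.

```python
import numpy as np

def expm_hermitian(m):
    """Matrix exponential of a Hermitian matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(m)
    return (vecs * np.exp(vals)) @ vecs.conj().T

def random_hermitian(dim, rng):
    """Random complex Hermitian matrix."""
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return (m + m.conj().T) / 2

rng = np.random.default_rng(0)
results = []
for _ in range(100):
    a = random_hermitian(4, rng)
    b = random_hermitian(4, rng)
    lhs = np.trace(expm_hermitian(a + b)).real                   # Tr e^{A+B}
    rhs = np.trace(expm_hermitian(a) @ expm_hermitian(b)).real   # Tr(e^A e^B)
    results.append((lhs, rhs))

violations = sum(lhs > rhs * (1 + 1e-9) for lhs, rhs in results)
print(f"violations of Tr e^(A+B) <= Tr(e^A e^B): {violations} of {len(results)}")
```

Equality holds when $A$ and $B$ commute, which is why the classical (diagonal) case has no such gap.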

Finding lower bounds
Golden-Thompson inequality

Finding lower bounds
Golden-Thompson inequality
Lower bound for log-likelihood
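
The derivation on this slide is not in the transcript. The following is a hedged reconstruction of how the Golden-Thompson inequality yields the bound, using notation I recall from arXiv:1601.02036: $\Lambda_{\mathbf v}$ is the projector onto visible state $\mathbf v$, $H_{\mathbf v}$ is the clamped Hamiltonian, and $\mathcal{L}$ is the negative log-likelihood. All symbol names are assumptions.

```latex
P_{\mathbf v}
  = \frac{\mathrm{Tr}\!\left[\Lambda_{\mathbf v}\, e^{-H}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}
  = \frac{\mathrm{Tr}\!\left[e^{\ln \Lambda_{\mathbf v}}\, e^{-H}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}
  \;\ge\;
  \frac{\mathrm{Tr}\!\left[e^{-H + \ln \Lambda_{\mathbf v}}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}
  = \frac{\mathrm{Tr}\!\left[e^{-H_{\mathbf v}}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]},
\qquad H_{\mathbf v} \equiv H - \ln \Lambda_{\mathbf v}

\mathcal{L} = -\sum_{\mathbf v} P^{\text{data}}_{\mathbf v} \ln P_{\mathbf v}
\;\le\;
\tilde{\mathcal{L}} \equiv -\sum_{\mathbf v} P^{\text{data}}_{\mathbf v}
  \ln \frac{\mathrm{Tr}\!\left[e^{-H_{\mathbf v}}\right]}{\mathrm{Tr}\!\left[e^{-H}\right]}
```

A lower bound on the log-likelihood is therefore the same statement as an upper bound $\tilde{\mathcal{L}}$ on the negative log-likelihood. Since $\Lambda_{\mathbf v}$ is a projector, $\ln \Lambda_{\mathbf v}$ is $-\infty$ outside its support and should be read as a limit.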

Calculating the Gradients
Minimize the upper bound?
Unclamped average

Clamped Hamiltonian
Infinite energy penalty for states different from $\mathbf{v}$
Visible qubits are clamped to their classical values given by the data

Estimating the Steps
Clamped average
Unclamped average
We can now use sampling to estimate the steps
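
The step formulas themselves are missing from the transcript. Under the assumption that the bound being minimized is the Golden-Thompson upper bound on the negative log-likelihood with a transverse Ising Hamiltonian, the steps take the following form. The notation is what I recall from arXiv:1601.02036 and is an assumption here: $\eta$ is the training rate, $\langle \cdot \rangle_{\mathbf v}$ is a Boltzmann average under the clamped Hamiltonian $H_{\mathbf v}$, and the overline is the average over the data distribution.

```latex
\delta b_a = \eta \left( \overline{\langle \sigma^z_a \rangle_{\mathbf v}} - \langle \sigma^z_a \rangle \right)

\delta w_{ab} = \eta \left( \overline{\langle \sigma^z_a \sigma^z_b \rangle_{\mathbf v}}
  - \langle \sigma^z_a \sigma^z_b \rangle \right)

\langle O \rangle_{\mathbf v} = \frac{\mathrm{Tr}\!\left[e^{-H_{\mathbf v}}\, O\right]}{\mathrm{Tr}\!\left[e^{-H_{\mathbf v}}\right]},
\qquad
\langle O \rangle = \frac{\mathrm{Tr}\!\left[e^{-H}\, O\right]}{\mathrm{Tr}\!\left[e^{-H}\right]},
\qquad
\overline{\langle O \rangle_{\mathbf v}} = \sum_{\mathbf v} P^{\text{data}}_{\mathbf v} \langle O \rangle_{\mathbf v}
```

The first term in each step is the clamped average and the second is the unclamped average; both are expectation values of $\sigma^z$ operators in a Boltzmann state, so both can be estimated from samples.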

Training the Transverse Field ($\Gamma_a$)
Minimizing the upper bound
Two problems:
• a quantity in the step cannot be estimated from measurements
• a condition that holds for all visible qubits, thus $\Gamma_\nu$ cannot be trained using the bound

Example: 10-Qubit QBM
Graph: fully connected ($K_{10}$), fully visible

Example: 10-Qubit QBM
Training set: M-modal distribution
Single mode: random spin orientation
Multi-mode
Hamming distance; $p = 0.9$, $M = 8$
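
The distribution formulas are missing from the transcript. As I recall from arXiv:1601.02036, each mode is centred on a random spin orientation $\mathbf{s}^k$ and assigns a state probability $p^{N - d}(1 - p)^{d}$, where $d$ is the Hamming distance from the centre, and the multi-mode distribution averages $M$ such modes. The sketch below generates such a training distribution under that assumption; the function name and the seed are hypothetical.

```python
import itertools
import numpy as np

def multimodal_distribution(n, m, p, rng):
    """P_v = (1/M) * sum_k p^(N - d_k(v)) * (1 - p)^(d_k(v)), d_k = Hamming distance to centre k."""
    states = np.array(list(itertools.product([-1, 1], repeat=n)))
    centers = rng.choice([-1, 1], size=(m, n))  # random spin orientations
    # Hamming distance between every state and every mode centre, shape (2^n, m)
    dist = (states[:, None, :] != centers[None, :, :]).sum(axis=2)
    probs = (p ** (n - dist) * (1 - p) ** dist).mean(axis=1)
    return states, centers, probs

rng = np.random.default_rng(1)
states, centers, p_data = multimodal_distribution(n=10, m=8, p=0.9, rng=rng)
print("number of states:", len(states))
print("total probability:", p_data.sum())
```

Each single-mode term is a product of independent per-spin flip probabilities, so it is normalized on its own and the average over modes stays normalized.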

Exact Diagonalization Results
KL-divergence
[Figure: curves for classical BM; bound gradient with $\Delta = 2$; exact gradient with $\Delta$ trained, $\Delta_{\text{final}} = 2.5$.]
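
The KL-divergence used as the quality measure here is assumed to be the standard one between the data and model distributions, $\mathrm{KL} = \sum_{\mathbf v} P^{\text{data}}_{\mathbf v} \ln \left( P^{\text{data}}_{\mathbf v} / P_{\mathbf v} \right)$. A minimal sketch of that computation follows; the function name is hypothetical.

```python
import numpy as np

def kl_divergence(p_data, p_model):
    """KL(p_data || p_model) = sum_v p_data[v] * ln(p_data[v] / p_model[v])."""
    p_data = np.asarray(p_data, dtype=float)
    p_model = np.asarray(p_model, dtype=float)
    support = p_data > 0  # terms with p_data = 0 contribute nothing
    return float(np.sum(p_data[support] * np.log(p_data[support] / p_model[support])))

kl_same = kl_divergence([0.5, 0.5], [0.5, 0.5])
kl_diff = kl_divergence([0.5, 0.5], [0.9, 0.1])
print(f"identical distributions: {kl_same:.4f}")
print(f"different distributions: {kl_diff:.4f}")
```

The value is zero only when the model reproduces the data distribution exactly, so lower curves in the figure indicate better learning.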

Sampling from D-Wave
Dickson et al., Nat. Commun. 4, 1903 (2013)
Probabilities cross at the anticrossing

Conclusions:
• A quantum annealer can provide fast samples from a quantum Boltzmann distribution
• QBM can be trained by sampling
• QBM may learn some distributions better than classical BM
• See arXiv:1601.02036
