CS 252 Graduate Computer Architecture Lecture 28 Esoteric

  • Slides: 34
Download presentation
CS 252 Graduate Computer Architecture Lecture 28 Esoteric Computer Architecture DNA Computing & Quantum

CS 252 Graduate Computer Architecture Lecture 28 Esoteric Computer Architecture DNA Computing & Quantum Computing Prof John D. Kubiatowicz http: //www. cs. berkeley. edu/~kubitron/cs 252

DNA Computing • Can we use DNA to do massive computations? – Organisms do

DNA Computing • Can we use DNA to do massive computations? – Organisms do it – DNA has very high information density: » 4 different base pairs: • Adenine/Thymine • Guanine/Cytosine » Always paired on opposite strands Energetically favorable – Active operations: » Copy: Split strands of DNA apart in solution, gain 2 copies » Concatenate: eg: GTAATCCT will combine XXXXXCATT with AGGAYYYYY » Polymerase Chain Reaction (PCR): amplifies region of molecule between two marker molecules 5/11/2009 cs 252 -S 09, Lecture 28 © 1999 Access Excellence @ the National Health Museum 2

DNA Computing and Hamiltonian Path • Given a set of cities and costs between

DNA Computing and Hamiltonian Path • Given a set of cities and costs between them (possibly directed paths): – Find shortest path to all cities – Simpler: find single path that visits all cities • DNA Computing example is latter version: – Every city represented by unique 20 base-pair strand – Every path between cities represented by complementary pairs: 10 pairs from source city, 10 pairs from destination – Shorter example: AAGT for city 1, TTCG for city 2 Path 1 ->2: CAAA Will build: AAGTTTCG. . CAAA. . – Dump “city molecules” and “path molecules” into testtube. Select and amplify paths of right length. Analyze for result. – Been done for 6 cities! (Adleman, ~1998!) 5/11/2009 cs 252 -S 09, Lecture 28 3

Even more promising uses of DNA • Self-assembly of components DNA Active Region Bonding

Even more promising uses of DNA • Self-assembly of components DNA Active Region Bonding – DNA serves as substrate – Attach active elements in middle of components. – Final step – metal deposited over DNA Active • Other interesting structures could be built. Region 5/11/2009 cs 252 -S 09, Lecture 28 4

Use Quantum Mechanics to Compute? • Weird but useful properties of quantum mechanics: –

Use Quantum Mechanics to Compute? • Weird but useful properties of quantum mechanics: – Quantization: Only certain values or orbits are good » Remember orbitals from chemistry? ? ? – Superposition: Schizophrenic physical elements don’t quite know whether they are one thing or another • All existing digital abstractions try to eliminate QM – Transistors/Gates designed with classical behavior – Binary abstraction: a “ 1” is a “ 1” and a “ 0” is a “ 0” • Quantum Computing: Use of Quantization and Superposition to compute. • Interesting results: – Shor’s algorithm: factors in polynomial time! – Grover’s algorithm: Finds items in unsorted database in time proportional to square-root of n. – Materials simulation: exponential classically, linear-time QM 5/11/2009 cs 252 -S 09, Lecture 28 5

Quantization: Use of “Spin” North Representation: |0> or |1> Spin ½ particle: (Proton/Electron) South

Quantization: Use of “Spin” North Representation: |0> or |1> Spin ½ particle: (Proton/Electron) South • Particles like Protons have an intrinsic “Spin” when defined with respect to an external magnetic field • Quantum effect gives “ 1” and “ 0”: – Either spin is “UP” or “DOWN” nothing between 5/11/2009 cs 252 -S 09, Lecture 28 6

Kane Proposal II (First one didn’t quite work) Single Spin Control Gates Inter-bit Control

Kane Proposal II (First one didn’t quite work) Single Spin Control Gates Inter-bit Control Gates Phosphorus Impurity Atoms • Bits Represented by combination of proton/electron spin • Operations performed by manipulating control gates – Complex sequences of pulses perform NMR-like operations • Temperature < 1° Kelvin! 5/11/2009 cs 252 -S 09, Lecture 28 7

Now add Superposition! • The bit can be in a combination of “ 1”

Now add Superposition! • The bit can be in a combination of “ 1” and “ 0”: – Written as: = C 0|0> + C 1|1> – The C’s are complex numbers! – Important Constraint: |C 0|2 + |C 1|2 =1 • If measure bit to see what looks like, – With probability |C 0|2 we will find |0> (say “UP”) – With probability |C 1|2 we will find |1> (say “DOWN”) • Is this a real effect? Options: – This is just statistical – given a large number of protons, a fraction of them (|C 0|2 ) are “UP” and the rest are down. – This is a real effect, and the proton is really both things until you try to look at it • Reality: second choice! – There are experiments to prove it! 5/11/2009 cs 252 -S 09, Lecture 28 8

A register can have many values! • Implications of superposition: – An n-bit register

A register can have many values! • Implications of superposition: – An n-bit register can have 2 n values simultaneously! – 3 -bit example: = C 000|000>+ C 001|001>+ C 010|010>+ C 011|011>+ C 100|100>+ C 101|101>+ C 110|110>+ C 111|111> • Probabilities of measuring all bits are set by coefficients: – So, prob of getting |000> is |C 000|2, etc. – Suppose we measure only one bit (first): » We get “ 0” with probability: P 0=|C 000|2+ |C 001|2+ |C 010|2+ |C 011|2 Result: = (C 000|000>+ C 001|001>+ C 010|010>+ C 011|011>) » We get “ 1” with probability: P 1=|C 100|2+ |C 101|2+ |C 110|2+ |C 111|2 Result: = (C 100|100>+ C 101|101>+ C 110|110>+ C 111|111>) • Problem: Don’t want environment to measure before ready! – Solution: Quantum Error Correction Codes! 5/11/2009 cs 252 -S 09, Lecture 28 9

Spooky action at a distance • Consider the following simple 2 -bit state: =

Spooky action at a distance • Consider the following simple 2 -bit state: = C 00|00>+ C 11|11> – Called an “EPR” pair for “Einstein, Podolsky, Rosen” • Now, separate the two bits: Light-Years? • If we measure one of them, it instantaneously sets other one! – Einstein called this a “spooky action at a distance” – In particular, if we measure a |0> at one side, we get a |0> at the other (and vice versa) • Teleportation – Can “pre-transport” an EPR pair (say bits X and Y) – Later to transport bit A from one side to the other we: » Perform operation between A and X, yielding two classical bits » Send the two bits to the other side » Use the two bits to operate on Y » Poof! State of bit A appears in place of Y 5/11/2009 cs 252 -S 09, Lecture 28 10

Model: Operations on coefficients + measurements Input Complex State Unitary Transformations Measure Output Classical

Model: Operations on coefficients + measurements Input Complex State Unitary Transformations Measure Output Classical Answer • Basic Computing Paradigm: – Input is a register with superposition of many values » Possibly all 2 n inputs equally probable! – Unitary transformations compute on coefficients » Must maintain probability property (sum of squares = 1) » Looks like doing computation on all 2 n inputs simultaneously! – Output is one result attained by measurement • If do this poorly, just like probabilistic computation: – If 2 n inputs equally probable, may be 2 n outputs equally probable. – After measure, like picked random input to classical function! – All interesting results have some form of “fourier transform” computation being done in unitary transformation 5/11/2009 cs 252 -S 09, Lecture 28 11

 • Security of Factoring • Easy Hard Easy 5/11/2009 The Security of RSA

• Security of Factoring • Easy Hard Easy 5/11/2009 The Security of RSA Public-key cryptosystems depends on the difficult of factoring a number N=pq (product of two primes) – – Classical computer: sub-exponential time factoring Quantum computer: polynomial time factoring Shor’s Factoring Algorithm (for a quantum computer) 1) Choose random x : 2 x N-1. 2) If gcd(x, N) 1, Bingo! 3) Find smallest integer r : xr 1 (mod N) 4) If r is odd, GOTO 1 5) If r is even, a x r/2 (mod N) (a-1) (a+1) = k. N 6) If a = N-1 GOTO 1 7) ELSE gcd(a ± 1, N) is a non trivial factor of N. cs 252 -S 09, Lecture 28 12

Finding r with xr 1 (mod N) Quantum Fourier Transform • Finally: Perform measurement

Finding r with xr 1 (mod N) Quantum Fourier Transform • Finally: Perform measurement – Find out r with high probability – Get |y>|aw’> where y is of form k/r and w’ is related 5/11/2009 cs 252 -S 09, Lecture 28 13

ION Trap Quantum Computer: Promising technology Cross. Sectional View Top • IONS of Be+

ION Trap Quantum Computer: Promising technology Cross. Sectional View Top • IONS of Be+ trapped in oscillating quadrature field – Internal electronic modes of IONS used for quantum bits – MEMs technology – Target? 50, 000 ions – ROOM Temperature! • Ions moved to interaction regions – Ions interactions with one another moderated by lasers 5/11/2009 Top View Proposal: NIST Group cs 252 -S 09, Lecture 28 14

Ion Trap Quantum Computer Q 1 Two-Qubit Gate • Major Components - Data =

Ion Trap Quantum Computer Q 1 Two-Qubit Gate • Major Components - Data = an ion - Gate = a location • Ballistic Movement - Apply pulse sequences to electrodes - Electrostatic forces move ion - Intersections similar, but more complicated pulse sequences Q 2 5/11/2009 cs 252 -S 09, Lecture 28 15

Ballistic Movement Network One-Qubit Gate Q 2 One-Qubit Gate Two-Qubit Gate R R R

Ballistic Movement Network One-Qubit Gate Q 2 One-Qubit Gate Two-Qubit Gate R R R Q 3 R Two-Qubit Gate Q 4 Memory Cell Q 1 Interconnection Network Q 5 Memory Cell • Problem: Noise accumulation! 5/11/2009 cs 252 -S 09, Lecture 28 16

Qubit Error Noise Accumulation from Movement Distance Moved in Gates • Noise may increase

Qubit Error Noise Accumulation from Movement Distance Moved in Gates • Noise may increase error by factor of 100 5/11/2009 cs 252 -S 09, Lecture 28 17

Movement Option 2: Teleportation Source Location 2. Local Ops D 3. Transmit two classical

Movement Option 2: Teleportation Source Location 2. Local Ops D 3. Transmit two classical bits Entanglement E 1 E 2 Target Location 4. Local Ops D? D 1. Generate EPR pair • Goal: Transfer the state, not the data ion • Problem: EPR pairs become noisy • Teleportation Benefits - Error Correction of data (arbitrary state): ~100 ms Purification of EPR pair (known state): ~120 µs - Pre-distribution of EPR pairs 5/11/2009 cs 252 -S 09, Lecture 28 18

EPR Pair Distribution Network One-Qubit Gate Q 2 One-Qubit Gate Two-Qubit Gate R R

EPR Pair Distribution Network One-Qubit Gate Q 2 One-Qubit Gate Two-Qubit Gate R R EPR Pair R Generators R Q 3 Two-Qubit Gate Q 4 Memory Cell Q 1 5/11/2009 Interconnection Network cs 252 -S 09, Lecture 28 Q 5 Memory Cell 19

Setting Up a Teleportation Link • Purification = Amplification of EPR pair link -

Setting Up a Teleportation Link • Purification = Amplification of EPR pair link - Two EPR pairs One “purer” pair, one junk pair - Chance of failure • Need to send multiple pairs EPR Qubits Entanglement. EPR Qubits STRONGER P G Recycled Qubits 5/11/2009 P Recycled Qubits For Data Teleportation cs 252 -S 09, Lecture 28 20

Chained Teleportation T P G Teleportation T G T G T • Adjacent T

Chained Teleportation T P G Teleportation T G T G T • Adjacent T nodes linked for teleportation P • Positive Features - T node linking not on critical path - Pre-purification (Link Amplification): part of link setup 5/11/2009 cs 252 -S 09, Lecture 28 21

Quantum Network Architecture T G G P Gate G T G T P Gate

Quantum Network Architecture T G G P Gate G T G T P Gate • Grid of T nodes , linked by G nodes • Packet-switched network - Dimension-order routing • Each qubit has associated message 5/11/2009 cs 252 -S 09, Lecture 28 22

Classical Control • Quantum Datapath Layer - T Nodes and G Nodes (P Nodes

Classical Control • Quantum Datapath Layer - T Nodes and G Nodes (P Nodes and Gates not shown) • Classical Control Layers - Messages Associated with Qubits - Teleportation and Purification Bits 5/11/2009 cs 252 -S 09, Lecture 28 23

Running a Quantum Circuit • Simple gates (transversal) • More complex gates (non-transversal) –

Running a Quantum Circuit • Simple gates (transversal) • More complex gates (non-transversal) – Exist in any universal set • Quantum Error Correction (QEC) – 10 -8 to 10 -6 error rates from gates, movement and idleness – Data must be encoded and periodically error corrected • Ancilla (helper) qubits – Necessary for complex gates and for QEC – Computation with ancilla qubits > 90% of quantum program Q 0 Q 1 T time 5/11/2009 H QEC T QEC C X QEC QEC QEC T T QEC H QEC Serial Circuit Latency cs 252 -S 09, Lecture 28 24

Running a Quantum Circuit • Ancilla qubits are independent of data – Preparation may

Running a Quantum Circuit • Ancilla qubits are independent of data – Preparation may be pulled offline – Ancilla qubits should be ready just in time to avoid noise from idleness Q 0 H Q 1 T 5/11/2009 QEC C X QEC T QEC H QEC Parallel Circuit Latency cs 252 -S 09, Lecture 28 25

Running at The Speed of Data • Ideally, execution time determined solely by data

Running at The Speed of Data • Ideally, execution time determined solely by data Operations involving data qubits hardware time Ancilla encoding 5/11/2009 cs 252 -S 09, Lecture 28 26

Execution Time of a 32 -Bit QCLA (μs) Limited Ancilla Bandwidth Encoded Ancilla Bandwidth

Execution Time of a 32 -Bit QCLA (μs) Limited Ancilla Bandwidth Encoded Ancilla Bandwidth Available (Ancillae per ms) • 32 -bit Quantum Carry-Lookahead Adder – Varying rate at which encoded zero ancillae are provided for QEC – Conclusion: design architecture with “ancilla factories” 5/11/2009 cs 252 -S 09, Lecture 28 27

Ancilla Factory Design I • “In-place” ancilla preparation QEC 0 Prep Cat Prep Verify

Ancilla Factory Design I • “In-place” ancilla preparation QEC 0 Prep Cat Prep Verify ? Bit Correct ? Phase Correct ? Ancilla Generation Circuit Encoded Ancilla Verification Qubits • Ancilla factory consists of many of these – – 5/11/2009 Encoded ancilla prepared in many places, then moved to output port Movement is costly! cs 252 -S 09, Lecture 28 In-place Prep 28

Idealized Qalypso Architecture • Dense data region – Data qubits only – Local communication

Idealized Qalypso Architecture • Dense data region – Data qubits only – Local communication • Shared Ancilla Factories – – Distributed to data as needed Fully multiplexed to all data Output ports ( ): close to data Input ports ( ): may be far from data, since recycled qubits have irrelevant state • Goals – Design ancilla factories – Answer Question: How much hardware is needed for ancilla generation to run at the speed of data? 5/11/2009 cs 252 -S 09, Lecture 28 29

Discussion of ISCA 2009 paper • “A Fault Tolerant, Area Efficient Architecture for Shor’s

Discussion of ISCA 2009 paper • “A Fault Tolerant, Area Efficient Architecture for Shor’s Factoring Algorithm” – Mark Whitney, Nemanja Isailovic, Yatish Patel, and John Kubiatowicz – ISCA 2009 • How to compare layouts? (what is good)? – Probabilistic circuits need metric that includes probability of failure – ADCR = Probabilistic version of area-delay product – Lower is better • What to optimize? – Many different “datapath organizations” – Far too much error correction 5/11/2009 cs 252 -S 09, Lecture 28 30

Datapath Organizations • QLA: “Quantum Logic Array” – Every compute region has space for

Datapath Organizations • QLA: “Quantum Logic Array” – Every compute region has space for 2 bits and ancilla generation for 2 bits (to correct after every operation • CQLA: “Compressed Quantum Logic Array” – Same compute regions as QLA, but ability to have less ancilla generation/bit for memory (idle bits less prone to error) • Qalypso – Matching ancilla generation to needs 5/11/2009 cs 252 -S 09, Lecture 28 31

Error Correction Optimization • Selectively error correction placement – Standard techniques: correct after every

Error Correction Optimization • Selectively error correction placement – Standard techniques: correct after every error – Instead – only correct bits that are particularly “dirty” • Error correction modeled after retiming optimization – Only place correction when approximate “EDist” parameter reaches threshold – Then, perform full mapping – Choose EDist by optimizing ADCR • Very successful at reducing area/latency • Even improves probability of success in some cases! – Why? Because error correction involves operations which can introduce error 5/11/2009 cs 252 -S 09, Lecture 28 32

Shor’s Factoring Circuit • Most of time spent in modular exponentiation – QFT is

Shor’s Factoring Circuit • Most of time spent in modular exponentiation – QFT is much smaller fraction of time – Easiest way to build modular exponentiation: with adders » Build multiplier instead? Not studied yet – would be very large • Paper result: can factor in 7659 mm 2 – Previous result was 0. 9 m 2 5/11/2009 cs 252 -S 09, Lecture 28 33

Conclusion • Computing can be done in a variety of ways – Normal silicon

Conclusion • Computing can be done in a variety of ways – Normal silicon gates not required • DNA Computing – Limited use “demonstration of concept” – Form of massive parallelism – Interesting consequences: self assembly • Quantum Computing: – Computing using superposition and quantization – Ion Traps: a particularly promising technology • CAD Tools for Quantum Computing – Can actually optimize circuits – just like classical case – ADCR = probabilistic version of Area-Delay product 5/11/2009 cs 252 -S 09, Lecture 28 34