Discrete Optimization MA 2827 Fondements de l’optimisation discrète

Discrete Optimization MA 2827
Fondements de l’optimisation discrète
Dynamic programming (Part 2)
https://project.inria.fr/2015ma2827/
Material based on the lectures of Erik Demaine at MIT and Pascal Van Hentenryck at Coursera

Outline
• Dynamic programming
  – Guitar fingering
• Quiz: bracket sequences
• More dynamic programming
  – Tetris
  – Blackjack

Dynamic programming
• DP ≈ “careful brute force”
• DP ≈ recursion + memoization + guessing
• Divide the problem into subproblems that are connected to the original problem
• The graph of subproblems has to be acyclic (a DAG)
• Time = #subproblems · time/subproblem
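
As a minimal illustration of "recursion + memoization", the sketch below caches every subproblem so that it is solved only once. Fibonacci numbers are used purely as a stand-in example and are not part of these slides. There are n subproblems with O(1) work each, so the running time is O(n).

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # memoization: each fib(k) is computed only once
def fib(n):
    """Return the n-th Fibonacci number (fib(0) = 0, fib(1) = 1)."""
    if n < 2:
        return n
    # The subproblem graph is acyclic: fib(n) depends only on smaller n.
    return fib(n - 1) + fib(n - 2)


print(fib(50))
```

Without the cache the same recursion would revisit subproblems exponentially often.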

5 easy steps of DP
1. Define subproblems (analysis: count #subproblems)
2. Guess part of the solution (analysis: count #choices)
3. Relate subproblems with a recurrence (analysis: time/subproblem)
4. Recurse + memoize, OR build the DP table bottom-up; check that the subproblem graph is acyclic, i.e. has a topological order (analysis: total time)
5. Solve the original problem (analysis: extra time)

Guitar fingering
Task: find the best way to play a melody
• Input: sequence of notes to play with the right hand
• One note at a time!
• Which finger to use? 1, 2, …, F (F = 5 for humans)
• Measure d( p, f, q, g ) of the difficulty of going from note p with finger f to note q with finger g
Examples of rules:
• crossing fingers: 1 < f < g and p > q => uncomfortable
• stretching: p << q => uncomfortable
• legato (smooth): ∞ if f = g

Guitar fingering
Task: find the best way to play a melody
Goal: minimize overall difficulty
Subproblems: min. difficulty for suffix note[ i : ]
  #subproblems = O( n ), where n = #notes
Guesses: finger f for the first note, note[ i ]
  #choices = F
Recurrence: DP[ i ] = min{ DP[ i + 1 ] + d( note[ i ], f, note[ i + 1 ], next finger ) }
Not enough information! DP[ i + 1 ] does not record which finger plays note[ i + 1 ], so the transition cost cannot be evaluated.

Guitar fingering
Task: find the best way to play a melody
Goal: minimize overall difficulty
Subproblems: min. difficulty for suffix note[ i : ] when finger f is on note[ i ]
  #subproblems = O( n F )
Guesses: finger g for the next note, note[ i + 1 ]
  #choices = F
Recurrence: DP[ i, f ] = min{ DP[ i + 1, g ] + d( note[ i ], f, note[ i + 1 ], g ) | all g }
Base case: DP[ n, f ] = 0
  time/subproblem = O( F )
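
The recurrence can be sketched top-down with memoization as follows. The function name `min_difficulty` and the cost function `toy_d` are hypothetical and only serve the illustration; the slides do not define a concrete d. The sketch uses 0-indexed notes and takes the last note as the base case, which corresponds to DP[ n, f ] = 0 with no transition cost past the final note.

```python
from functools import lru_cache


def min_difficulty(notes, num_fingers, d):
    """Minimal total difficulty of playing `notes` with fingers 1..num_fingers.

    d(p, f, q, g) is the cost of going from note p with finger f
    to note q with finger g.
    """
    n = len(notes)
    if n == 0:
        return 0
    fingers = range(1, num_fingers + 1)

    @lru_cache(maxsize=None)
    def dp(i, f):
        # Base case: nothing follows the last note, so no further cost.
        if i == n - 1:
            return 0
        # Guess the finger g for the next note and recurse on the suffix.
        return min(dp(i + 1, g) + d(notes[i], f, notes[i + 1], g)
                   for g in fingers)

    # Final problem: guess the finger used on the first note.
    return min(dp(0, f) for f in fingers)


def toy_d(p, f, q, g):
    """Hypothetical cost: pitch step vs. finger step, reusing a finger is penalized."""
    return abs((q - p) - (g - f)) + (10 if f == g else 0)


print(min_difficulty([0, 1, 2], 3, toy_d))
```

For long melodies the recursion depth grows with the number of notes, so a bottom-up table may be preferable in practice.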

Guitar fingering
Task: find the best way to play a melody
Topological order:
  for i = n − 1, n − 2, …, 0: (notes)
    for f = 1, …, F: (fingers)
  total time = O( n F^2 )
Final problem: find the minimal DP[ 0, f ] over f = 1, …, F (guessing the first finger)
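
A bottom-up version of the same DP, filled in the topological order above, might look like the sketch below. It also stores the best guess per state so that an actual fingering can be read off. The names `best_fingering` and `toy_d` are hypothetical, and the cost function is a made-up stand-in for d.

```python
def best_fingering(notes, num_fingers, d):
    """Return (minimal difficulty, list of fingers) for playing `notes`."""
    n = len(notes)
    if n == 0:
        return 0, []
    fingers = range(1, num_fingers + 1)

    # dp[i][f]: min difficulty of notes[i:] when finger f plays notes[i].
    # The last row stays 0: nothing follows the last note.
    dp = [{f: 0 for f in fingers} for _ in range(n)]
    choice = [{} for _ in range(n)]  # best next finger for each state

    for i in range(n - 2, -1, -1):        # notes, from the end
        for f in fingers:                 # fingers
            def cost(g):
                return dp[i + 1][g] + d(notes[i], f, notes[i + 1], g)
            g_best = min(fingers, key=cost)
            dp[i][f] = cost(g_best)
            choice[i][f] = g_best

    # Final problem: guess the first finger, then follow the stored choices.
    f = min(fingers, key=lambda first: dp[0][first])
    total, plan = dp[0][f], [f]
    for i in range(n - 1):
        f = choice[i][f]
        plan.append(f)
    return total, plan


def toy_d(p, f, q, g):
    """Hypothetical cost: pitch step vs. finger step, reusing a finger is penalized."""
    return abs((q - p) - (g - f)) + (10 if f == g else 0)


print(best_fingering([0, 1, 2], 3, toy_d))
```

The two nested loops over (i, f) with an inner minimum over g give the O( n F^2 ) bound directly.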

Quiz: bracket sequences
Consider sequences of brackets: ( ) [ ] { }
A sequence of brackets is correct when
1. each opening bracket is matched with a closing bracket of the same type
2. the substring inside a matching pair is correct
Examples:
• [ ( ) ( ) { [ ] } ] is correct
• ) ( ) ( ) ( is incorrect
• [ ] [ ( ) } is incorrect
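
The definition can be checked mechanically with a stack. The sketch below is one common way to do it (the name `is_correct` is hypothetical); it assumes the input contains only the six bracket characters and treats anything else as a mismatch.

```python
PAIRS = {"(": ")", "[": "]", "{": "}"}


def is_correct(seq):
    """Return True if `seq` is a correct bracket sequence."""
    expected = []  # closing brackets we still owe, innermost last
    for ch in seq:
        if ch in PAIRS:
            expected.append(PAIRS[ch])
        elif not expected or expected.pop() != ch:
            # A closing bracket with no partner, or of the wrong type.
            return False
    return not expected  # every opening bracket must have been closed


for s in ["[()(){[]}]", ")()()(", "[][()}"]:
    print(s, is_correct(s))
```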

Quiz: bracket sequences
Consider sequences of brackets: ( ) [ ] { }
A sequence of brackets is correct when
1. each opening bracket is matched with a closing bracket of the same type
2. the substring inside a matching pair is correct
Task 1: How many correct sequences of length 2n exist?
Task 2: Given an incorrect sequence of length n, what is the minimum number of symbols you need to add to make the sequence correct?
Example: ( { ] ) => ( { } [ ] )
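
One possible way to attack both tasks with DP is sketched below; it is not presented in the slides as the intended solution, and the function names are hypothetical. For Task 1 the guess is how many pairs sit inside the first bracket's pair. For Task 2 the subproblems are substrings seq[ i : j ], and the guess is whether the first symbol gets a newly added partner or is matched with an existing closing bracket.

```python
from functools import lru_cache

PAIRS = {"(": ")", "[": "]", "{": "}"}


def count_correct(n_pairs, kinds=3):
    """Number of correct sequences of length 2 * n_pairs with `kinds` bracket types."""
    count = [1] + [0] * n_pairs  # count[0] = 1: the empty sequence
    for m in range(1, n_pairs + 1):
        # The first bracket has one of `kinds` types, encloses k pairs,
        # and is followed by a correct sequence of m - 1 - k pairs.
        count[m] = kinds * sum(count[k] * count[m - 1 - k] for k in range(m))
    return count[n_pairs]


def min_insertions(seq):
    """Minimum number of brackets to add so that `seq` becomes correct."""

    @lru_cache(maxsize=None)
    def dp(i, j):
        if i >= j:
            return 0
        # Guess 1: add a new partner right next to seq[i].
        best = 1 + dp(i + 1, j)
        # Guess 2: match an opening seq[i] with an existing closing seq[k].
        if seq[i] in PAIRS:
            for k in range(i + 1, j):
                if seq[k] == PAIRS[seq[i]]:
                    best = min(best, dp(i + 1, k) + dp(k + 1, j))
        return best

    return dp(0, len(seq))


print(count_correct(2), min_insertions("({])"))
```

The second function has O( n^2 ) subproblems with O( n ) choices each, so it runs in O( n^3 ) time.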

Tetris
Task: win in the game of Tetris!
• Input: a sequence of n Tetris pieces and an empty board of small width w
• Choose the orientation and position for each piece
• Each piece must drop until it hits something
• Full rows do not clear
• Goal: survive, i.e., stay within height h

Tetris
Task: stay within height h
Subproblem: can we survive the suffix of pieces [ i : ], given a particular column profile?
  #subproblems = O( n h^w )
Guesses: where to drop piece i?
  #choices = O( w )
Recurrence: DP[ i, p ] = max{ DP[ i + 1, q ] | q is a valid move from p } (max over true/false values, i.e. a logical OR)
Base case: DP[ n, p ] = true for all profiles p
  time/subproblem = O( w )

Tetris
Task: stay within height h
Topological order:
  for i = n − 1, n − 2, …, 0: (pieces)
    for p = 0, …, h^w − 1: (profiles)
  total time O( n w h^w )
Final problem: DP[ 0, empty ]
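
The Tetris DP can be sketched under a strong simplifying assumption: every piece is an a × b rectangle that may be rotated by 90 degrees, so a column profile (a tuple of column heights) fully describes the board. Real tetrominoes would need a richer `moves` function. The names `can_survive` and `moves` are hypothetical.

```python
from functools import lru_cache


def can_survive(pieces, width, height):
    """Return True if all pieces can be dropped without exceeding `height`.

    pieces: list of (a, b) rectangles (a simplification of real Tetris pieces).
    """
    n = len(pieces)

    def moves(profile, piece):
        """Yield every profile reachable by dropping `piece` onto `profile`."""
        a, b = piece
        for piece_w, piece_h in {(a, b), (b, a)}:       # orientations
            for x in range(width - piece_w + 1):        # positions
                # The piece drops until it hits the highest covered column.
                top = max(profile[x:x + piece_w]) + piece_h
                if top <= height:
                    yield profile[:x] + (top,) * piece_w + profile[x + piece_w:]

    @lru_cache(maxsize=None)
    def dp(i, profile):
        if i == n:
            return True  # base case: no pieces left, we survived
        # Logical OR over all valid moves (the "max" of the recurrence).
        return any(dp(i + 1, q) for q in moves(profile, pieces[i]))

    # Final problem: all pieces, starting from the empty board.
    return dp(0, (0,) * width)


print(can_survive([(2, 1), (2, 1)], width=2, height=2))
```

Memoizing on (i, profile) is what bounds the work by the number of distinct profiles rather than by the number of drop sequences.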

Blackjack
Task: beat the blackjack (twenty-one)!
Rules of Blackjack (simplified):
• The player and the dealer are initially given 2 cards each
• Each card gives points:
  – Cards 2–10 are valued at the face value of the card
  – Face cards (King, Queen, Jack) are valued at 10
  – The Ace can be valued at either 11 or 1
• The goal of the player is to get more points than the dealer without exceeding 21; with more than 21 points the player loses (busts)
• The player can take any number of cards (hits)
• After that the dealer hits deterministically: until ≥ 17 points
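
The scoring rules can be written down as a small helper. This is a sketch under the simplified rules above; the names `hand_value` and `dealer_play` and the string encoding of ranks ("A", "K", "10", ...) are assumptions made for the illustration.

```python
def hand_value(ranks):
    """Best total for a hand given as ranks such as 'A', 'K', '10', '2'."""
    total, aces = 0, 0
    for rank in ranks:
        if rank == "A":
            total += 11
            aces += 1
        elif rank in ("K", "Q", "J"):
            total += 10
        else:
            total += int(rank)
    # Re-value Aces from 11 to 1 while the hand would otherwise bust.
    while total > 21 and aces:
        total -= 10
        aces -= 1
    return total


def dealer_play(hand, deck):
    """Dealer hits until reaching at least 17; returns (total, cards drawn)."""
    hand = list(hand)
    drawn = 0
    while hand_value(hand) < 17 and drawn < len(deck):
        hand.append(deck[drawn])
        drawn += 1
    return hand_value(hand), drawn


print(hand_value(["A", "K"]), dealer_play(["10", "6"], ["5", "2"]))
```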

Perfect-information Blackjack
Task: beat the blackjack with a marked deck!
• Input: a deck of cards c_0, …, c_{n−1}
• Player vs. dealer, one-on-one
• Goal: maximize winnings for a fixed bet of $1
• Losing a round might pay off if it leaves a better deck for the following rounds

Perfect-information Blackjack
Task: beat the blackjack with a marked deck!
Subproblem: BJ[ i ] = best play of c_i, …, c_{n−1}
  #subproblems = O( n )
Guesses: how many times does the player hit?
  #choices ≤ n
Recurrence: BJ[ i ] = max{ outcome ∈ {−1, 0, 1} + BJ[ i + 4 + #hits + #dealer hits ] | for #hits = 0, …, n if valid play }

Perfect-information Blackjack
Detailed recursion:
def BJ(i):
    if n − i < 4: return 0                      (not enough cards)
    outcome = [ ]
    for p = 2, …, n − i − 2:                    (# cards taken by the player)
        player = c[i] + c[i+2] + c[i+4] + … + c[i+p+1]
        if player > 21:                         (bust)
            outcome.append( −1 + BJ(i + p + 2) )
            break
        for d = 2, …, n − i − p − 1:            (# cards taken by the dealer)
            dealer = c[i+1] + c[i+3] + c[i+p+2] + … + c[i+p+d−1]
            if dealer ≥ 17: break
        if dealer > 21: dealer = 0              (bust)
        outcome.append( cmp(player, dealer) + BJ(i + p + d) )
    return max( outcome )
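
A runnable variant of this recursion is sketched below. It makes several assumptions that the slides leave open: cards are given directly as fixed point values (the Ace's 1-or-11 choice is ignored, as in the pseudocode), a play is skipped if the deck runs out before the dealer reaches 17, and BJ returns 0 when no valid play exists. The name `best_winnings` is hypothetical.

```python
from functools import lru_cache


def best_winnings(cards):
    """Maximum total winnings with $1 bets, knowing the whole deck.

    cards: point values, top of the deck first. Each round deals
    cards[i], cards[i+2] to the player and cards[i+1], cards[i+3] to the dealer.
    """
    n = len(cards)

    def cmp(a, b):
        return (a > b) - (a < b)  # 1 = win, 0 = tie, -1 = loss

    @lru_cache(maxsize=None)
    def bj(i):
        if n - i < 4:
            return 0  # not enough cards for another round
        options = []
        player = cards[i] + cards[i + 2]
        dealer_start = cards[i + 1] + cards[i + 3]
        for hits in range(n - i - 3):          # guess: number of player hits
            if hits > 0:
                player += cards[i + 3 + hits]
            nxt = i + 4 + hits                 # first card not yet used
            if player > 21:                    # player busts
                options.append(-1 + bj(nxt))
                break
            dealer = dealer_start
            while dealer < 17 and nxt < n:     # dealer hits deterministically
                dealer += cards[nxt]
                nxt += 1
            if dealer < 17:
                continue                       # assumption: incomplete round is invalid
            if dealer > 21:
                dealer = 0                     # dealer busts
            options.append(cmp(player, dealer) + bj(nxt))
        return max(options) if options else 0

    return bj(0)


print(best_winnings([10, 10, 6, 10, 5]))
```

In the usage line the player holds 16 against the dealer's 20, and taking one hit (the 5) reaches 21 and wins the round.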

Perfect-information Blackjack
Task: beat the blackjack with a marked deck!
Subproblem: BJ[ i ] = best play of c_i, …, c_{n−1}
  #subproblems = O( n )
Guesses: how many times does the player hit?
  #choices ≤ n
Recurrence: BJ[ i ] = max{ outcome ∈ {−1, 0, 1} + BJ[ i + 4 + #hits + #dealer hits ] | for #hits = 0, …, n if valid play }
  time/subproblem = O( n^2 )
Topological order: for i = n − 1, …, 0:
  total time O( n^3 )
Final problem: BJ[ 0 ]