Celso C. Ribeiro
Universidade Federal Fluminense, Brazil

Memory approaches to improve multi-start constructive heuristics

Joint work with Eraldo Fernandes (M.Sc., PUC-Rio, Brazil)

WEA 2005: IV Workshop on Experimental and Efficient Algorithms
Santorini, May 2005

Summary
§ Application: DNA sequencing
§ Motivation: sequencing by hybridization
§ Multi-start randomized constructive heuristic
§ Adaptive memory strategy
§ Vocabulary building
§ Complete heuristic: MS+MEM+VB
§ Computational experiments
§ Numerical results and comparisons
§ Concluding remarks

DNA sequencing
§ DNA molecule: sequence formed by a combination of four different nucleotide bases: A, C, G, and T
§ Each DNA molecule may be represented as a word over the alphabet {A, C, G, T} of nucleotide bases
§ Example: ATAGGCAGGA
§ Sequencing: identification of the contents of a DNA molecule
  • Gel electrophoresis
  • Chemical method

Sequencing by hybridization
§ SBH: alternative approach to DNA sequencing
§ Two phases:
  • Biochemical: hybridization experiment involving a DNA array and the target molecule to be sequenced
  • Computational: reconstruction problem using the results of the hybridization experiment

Sequencing by hybridization
§ DNA array:
  • Bidimensional grid
  • Each cell contains a probe: small sequence of q nucleotides
  • Library C(q): set of all 4^q probes of size q in the array
§ Hybridization experiment:
  • Array is introduced into a solution containing many copies of the target sequence
  • A copy of the target sequence reacts with a probe if the latter is a subsequence (of the complement) of the former
  • Spectrum: set of all probes of size q that reacted with the target sequence, i.e., subsequences of size q that appear in the target

Sequencing by hybridization
§ Library C(4): all 4^4 = 256 probes of size 4 over {A, C, G, T}, from AAAA to GGGG
§ Target sequence: ATAGGCAGGA
§ Spectrum: {ATAG, TAGG, AGGC, GGCA, GCAG, CAGG, AGGA}
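The library and the ideal (error-free) spectrum of the example can be reproduced with a few lines of Python. This is only an illustrative sketch: the function names `library` and `spectrum` are not from the slides, and the hybridization experiment is idealized as plain substring extraction.

```python
from itertools import product

def library(q, alphabet="ACGT"):
    """All 4^q probes of size q (hypothetical helper mirroring C(q))."""
    return {"".join(p) for p in product(alphabet, repeat=q)}

def spectrum(target, q):
    """Ideal spectrum: every distinct substring of size q of the target."""
    return {target[i:i + q] for i in range(len(target) - q + 1)}

if __name__ == "__main__":
    print(len(library(4)))                    # size of the library C(4)
    print(sorted(spectrum("ATAGGCAGGA", 4)))  # probes of the example target
```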

Sequencing by hybridization
§ Reconstruction problem:
  • Second phase: reconstruction of the target sequence from the spectrum
  • Find a sequence of the probes in the spectrum such that consecutive probes have q-1 bases of superposition
§ Hamiltonian path problem on the spectrum:
  • One vertex for each probe u in the spectrum

ATAG
 TAGG
  AGGC
   GGCA
    GCAG
     CAGG
      AGGA
ATAGGCAGGA

Sequencing by hybridization
Spectrum: {ATAG, TAGG, AGGC, GGCA, GCAG, CAGG, AGGA}
[Figure: graph with one vertex per probe; the Hamiltonian path ATAG, TAGG, AGGC, GGCA, GCAG, CAGG, AGGA spells the target sequence ATAGGCAGGA]

Sequencing by hybridization
§ Hybridization errors:
  • Hybridization experiment is not perfect
  • False positives: probes that appear in the spectrum but not in the target sequence
  • False negatives: probes that occur in the target sequence but not in the spectrum

ATAG
 TAGG
  AGGC
   ----
    GCAG
     CAGG
      AGGA
ATAGGCAGGA

Sequencing by hybridization
§ Problem of sequencing by hybridization (PSBH): given the spectrum S = {s1, s2, ..., sm}, the size q of the probes, the length n, and the first probe s0 of the target sequence, find a sequence of length at most n that contains a maximum number of probes of the spectrum.
§ PSBH is NP-hard (Blazewicz et al., 1999)

Sequencing by hybridization
§ Directed graph G = (V, E)
  • V = S (probes in the spectrum)
  • E = {(u, v): u ∈ S and v ∈ S}
  • Superposition o(u, v) between two probes u, v ∈ S: size of the largest sequence that is both a suffix of u and a prefix of v
  • Weight w(u, v) of the arc (u, v): w(u, v) = q - o(u, v)
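A minimal sketch of the superposition and of the arc weight follows. It assumes w(u, v) = q - o(u, v), which agrees with the weights 1, 2 and 3 shown on the arcs of the q = 4 example, and it only considers proper overlaps (at most q-1 bases) because the probes of a spectrum are distinct. The function names are illustrative.

```python
def overlap(u, v):
    """Size of the longest proper suffix of u that is also a prefix of v."""
    for k in range(min(len(u), len(v)) - 1, 0, -1):
        if u[-k:] == v[:k]:
            return k
    return 0

def weight(u, v, q):
    """Assumed arc weight: number of new bases contributed by v after u."""
    return q - overlap(u, v)

if __name__ == "__main__":
    print(overlap("ATAG", "TAGG"), weight("ATAG", "TAGG", 4))
    print(overlap("AGGC", "GCAG"), weight("AGGC", "GCAG", 4))
```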

Sequencing by hybridization
Spectrum: {ATAG, TAGG, AGGC, GCAG, CAGG, AGGA, GGCG} (q = 4)
Target sequence: ATAGGCAGGA (n = 10)
GGCG: false positive
GGCA: false negative
[Figure: graph with one vertex per probe of the spectrum and arcs labelled with their weights w(u, v)]

Sequencing by hybridization
§ Feasible solutions: acyclic paths in G emanating from vertex s0 with weight less than or equal to n-q
§ A path in G is a sequence a = (a1, a2, ..., ak) of probes ai ∈ S, i ∈ {1, 2, ..., k}
§ An optimal solution visits a maximum number of vertices and respects the above constraints
§ Heuristics: ant colony, tabu search, genetic algorithm
§ This work: multi-start constructive heuristic with a memory-based strategy
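The feasibility conditions can be checked mechanically. The sketch below assumes that the weight of an arc is the probe size minus the superposition, so that the weight of a path equals the length of the spelled sequence minus q; `path_weight`, `spell` and `is_feasible` are hypothetical helper names.

```python
def overlap(u, v):
    """Size of the longest proper suffix of u that is also a prefix of v."""
    for k in range(min(len(u), len(v)) - 1, 0, -1):
        if u[-k:] == v[:k]:
            return k
    return 0

def path_weight(path):
    """Sum of the assumed arc weights q - o(u, v) along the path."""
    return sum(len(v) - overlap(u, v) for u, v in zip(path, path[1:]))

def spell(path):
    """Sequence obtained by merging consecutive probes on their overlaps."""
    seq = path[0]
    for u, v in zip(path, path[1:]):
        seq += v[overlap(u, v):]
    return seq

def is_feasible(path, s0, n):
    """Acyclic path starting at s0 whose weight is at most n - q."""
    q = len(s0)
    return (path[0] == s0
            and len(set(path)) == len(path)
            and path_weight(path) <= n - q)

if __name__ == "__main__":
    a = ["ATAG", "TAGG", "AGGC", "GCAG", "CAGG", "AGGA"]
    print(path_weight(a), spell(a), is_feasible(a, "ATAG", 10))
```

With the spectrum containing the false positive GGCG and missing GGCA, this path skips GGCG and still spells the target sequence.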

Sequencing by hybridization
Spectrum: {ATAG, TAGG, AGGC, GCAG, CAGG, AGGA, GGCG} (q = 4)
Target sequence: ATAGGCAGGA (n = 10)
GGCG: false positive
GGCA: false negative
[Figure: the path ATAG, TAGG, AGGC, GCAG, CAGG, AGGA highlighted in the weighted graph; it skips the false positive GGCG and still spells the target]

ATAG
 TAGG
  AGGC
   ----
    GCAG
     CAGG
      AGGA
ATAGGCAGGA

Multi-start constructive heuristic
§ Iteratively builds multiple solutions using a randomized constructive algorithm
§ Randomized constructive algorithm builds a different solution at each run
§ Returns the best solution found
§ Initial solution formed by a unique probe: a = (s0)
§ Current partial solution (path) is extended at each iteration by the insertion of a new probe at the end

Multi-start constructive heuristic
§ Probe to be inserted is probabilistically selected from a restricted candidate list (RCL)
§ S(a): probes in the current partial solution a
§ u: last probe in the current path
§ RCL = {v ∈ S\S(a): o(u, v) ≥ (1 - α) · max_{t ∈ S\S(a)} o(u, t) and w(a) + w(u, v) ≤ n-q}
§ Randomly select a probe v from the RCL with probability p(u, v) = (1/w(u, v)) / Σ_{t ∈ S\S(a)} (1/w(u, t)) (greediness)
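One construction step can be sketched as follows. Several points are assumptions made for illustration only: the RCL parameter is called `alpha`, the arc weight is taken as q minus the superposition, and the selection probabilities are normalized over the RCL itself (via `random.choices`) so that they sum to one.

```python
import random

def overlap(u, v):
    """Size of the longest proper suffix of u that is also a prefix of v."""
    for k in range(min(len(u), len(v)) - 1, 0, -1):
        if u[-k:] == v[:k]:
            return k
    return 0

def build_rcl(u, candidates, alpha, used_weight, n, q):
    """Candidates with near-maximal overlap that keep the path within n - q."""
    if not candidates:
        return []
    best = max(overlap(u, t) for t in candidates)
    return [v for v in candidates
            if overlap(u, v) >= (1 - alpha) * best
            and used_weight + (q - overlap(u, v)) <= n - q]

def pick_probe(u, rcl, q, rng=random):
    """Draw v with probability proportional to 1 / w(u, v)."""
    inverse_weights = [1.0 / (q - overlap(u, v)) for v in rcl]
    return rng.choices(rcl, weights=inverse_weights, k=1)[0]

if __name__ == "__main__":
    # Partial solution (ATAG, TAGG, AGGC) has weight 2; n = 10, q = 4.
    unused = ["GCAG", "CAGG", "AGGA", "GGCG"]
    rcl = build_rcl("AGGC", unused, alpha=0.5, used_weight=2, n=10, q=4)
    print(rcl, pick_probe("AGGC", rcl, q=4))
```

With `alpha = 0` only the probes of maximum superposition are kept, so the step is purely greedy; larger values admit more candidates.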

Adaptive memory strategy
§ Application to QAP: Fleurent and Glover, 1999
§ Pool Q of elite solutions (best solutions found): diversity
§ Intensification strategy for the constructive algorithm
§ Makes use of two kinds of information in the construction: superposition between the probes and frequency of the arcs in the elite solutions
§ A parameter balances the weights of the two terms: greediness (superposition) and frequency (memory)

Adaptive memory strategy
§ Greediness: higher when the superposition between probes u and v is larger
§ Frequency: higher for arcs (u, v) appearing more often in the solutions of the elite set
§ The probability p(u, v) of selecting a probe v from the RCL to extend the current partial solution whose last probe is u combines the greediness and frequency terms
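The exact expression of p(u, v) is not reproduced in this text, so the sketch below is only one plausible instance of the idea, not the authors' formula. It assumes a linear score `balance * greediness + frequency`, with greediness taken as (minimum weight out of u) / w(u, v) and frequency as the fraction of elite solutions that use the arc (u, v). All names are hypothetical.

```python
def selection_probabilities(u, rcl, weight, elite, balance):
    """Probability of each RCL probe under an assumed linear score.

    weight  -- function (u, v) -> arc weight
    elite   -- list of elite solutions, each given as a set of arcs (u, v)
    balance -- assumed multiplier of the greedy term
    """
    w_min = min(weight(u, v) for v in rcl)
    scores = {}
    for v in rcl:
        greediness = w_min / weight(u, v)
        if elite:
            frequency = sum((u, v) in arcs for arcs in elite) / len(elite)
        else:
            frequency = 0.0  # no memory yet: the choice is driven by greediness
        scores[v] = balance * greediness + frequency
    total = sum(scores.values())
    return {v: s / total for v, s in scores.items()}

if __name__ == "__main__":
    table = {("u", "a"): 1, ("u", "b"): 2}
    w = lambda x, y: table[(x, y)]
    print(selection_probabilities("u", ["a", "b"], w, [], balance=1.0))
    print(selection_probabilities("u", ["a", "b"], w, [{("u", "b")}], balance=1.0))
```

Lowering `balance` shifts the choice from the superposition toward the arcs that are frequent in the elite set.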

Adaptive memory strategy
§ Pool update:
  • Pool size: at most q solutions
  • Solution a is a candidate to be inserted into the pool Q if it is better than the worst solution currently in the pool, i.e., |a| > min_{a' ∈ Q} |a'|
  • Candidate solution a replaces the worst solution in the pool if it is better than the best solution in the pool (|a| > max_{a' ∈ Q} |a'|) or if it is sufficiently different from every other solution in the pool (min_{a' ∈ Q} dist(a, a') ≥ dmin)
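The update rule can be sketched as below. Two points are assumptions: the slides do not say what happens while the pool is not yet full (here the solution is simply inserted), and dist is not defined in this text, so `arc_distance` (number of arcs used by exactly one of the two paths) is a hypothetical choice.

```python
def arcs(path):
    """Set of arcs (u, v) used by a path."""
    return set(zip(path, path[1:]))

def arc_distance(a, b):
    """Hypothetical distance: arcs that appear in exactly one of the paths."""
    return len(arcs(a) ^ arcs(b))

def update_pool(pool, a, max_size, dist, dmin):
    """Try to insert solution a into the elite pool; return True on success."""
    if len(pool) < max_size:        # assumption: a non-full pool accepts a
        pool.append(a)
        return True
    worst = min(pool, key=len)
    if len(a) <= len(worst):        # not better than the worst: not a candidate
        return False
    best = max(pool, key=len)
    if len(a) > len(best) or min(dist(a, b) for b in pool) >= dmin:
        pool.remove(worst)
        pool.append(a)
        return True
    return False

if __name__ == "__main__":
    pool = [(1, 2, 3), (1, 2, 3, 4, 5)]
    print(update_pool(pool, (1, 3, 2, 4), 2, arc_distance, dmin=2), pool)
```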

Vocabulary building
§ Good solutions are very often formed by the same building blocks (paths)
§ Optimal solutions formed by components appearing in suboptimal solutions
§ Identify short paths with optimal superposition and combine them to build optimal solutions
§ Vocabulary building: Glover and Laguna, 1997
  • Find common paths appearing in good solutions (words)
  • Combine them into new good solutions (phrases)

Vocabulary building
§ Solutions encoded as adjacency vectors
  • Solution a = (a1, a2, ..., ak) represented as a vector x = (x1, x2, ..., x|S|)
  • If xu = s, then probe s follows immediately after probe u, i.e., the arc (u, s) is used in the path
§ Example: a = (1, 4, 2, 3, 5) on a graph with vertices 1 to 6

  u    1  2  3  4  5  6
  xu   4  3  5  2  -  -
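The encoding of the example can be written directly. In this sketch the undefined entries ("-") are represented by `None` and the vertices are numbered from 1; both are choices made for illustration.

```python
def path_to_adjacency(path, num_vertices):
    """Adjacency vector x with x[u] = successor of u in the path, else None."""
    x = {u: None for u in range(1, num_vertices + 1)}
    for u, s in zip(path, path[1:]):
        x[u] = s
    return x

if __name__ == "__main__":
    print(path_to_adjacency((1, 4, 2, 3, 5), 6))
```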

Vocabulary building
§ Some notation:
  • Set X of adjacency vectors
  • Size(x): number of arcs in the adjacency vector x
  • Inter(X): subset of arcs that appear in all vectors in X
  • Enclosure(y, X): set formed by all vectors in X that contain the arcs in the adjacency vector y

Vocabulary building
§ Example of Inter(x1, x2) on a graph with vertices 1 to 8

  u          1  2  3  4  5  6  7  8
  x1         2  3  6  7  -  4  8  -
  x2         2  3  6  7  4  5  8  -
  Inter      2  3  6  7  -  -  8  -

[Figure: the two paths and the arcs they have in common]
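The Inter and Size operators are easy to express on adjacency vectors. The sketch stores a vector as a tuple whose position u-1 holds the successor of vertex u, with `None` for undefined entries; this representation is an assumption of the example.

```python
def inter(x, y):
    """Keep an entry only where both adjacency vectors define the same arc."""
    return tuple(a if a is not None and a == b else None for a, b in zip(x, y))

def size(x):
    """Number of arcs (defined entries) in an adjacency vector."""
    return sum(v is not None for v in x)

if __name__ == "__main__":
    x1 = (2, 3, 6, 7, None, 4, 8, None)
    x2 = (2, 3, 6, 7, 4, 5, 8, None)
    common = inter(x1, x2)
    print(common, size(common))
```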

Vocabulary building
§ Some notation:
  • Set X of adjacency vectors
  • Size(x): number of arcs in the adjacency vector x
  • Inter(X): subset of arcs that appear in all vectors in X
  • Enclosure(y, X): set formed by all vectors in X that contain the arcs in the adjacency vector y
§ Find words: given an elite set X, find vectors y with |Enclosure(y, X)| as large as possible and Size(y) ≥ smin (non-elementary small words), where smin is a parameter

Vocabulary building
§ Algorithm FindWords(X, smin):
    Y ← ∅, X' ← X
    while X' ≠ ∅ do
      x ← rand(X'), Z ← {x}, X'' ← X - {x}
      while X'' ≠ ∅ do
        x ← rand(X'')
        if Size(Inter(Z ∪ {x})) ≥ smin then Z ← Z ∪ {x}
        X'' ← X'' - {x}
      end-while
      if |Z| > 1 then y ← Inter(Z); Y ← Y ∪ {y}
      X' ← X' - Z
    end-while
    return Y
§ Martins and Plastino, 2005: more effective algorithm based on data mining strategies
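A runnable sketch of FindWords follows. It assumes adjacency vectors stored as tuples with `None` for undefined entries, visits the inner set in random order instead of drawing and removing one element at a time (which is equivalent), and keeps Inter(Z) incrementally. The names are illustrative.

```python
import random

def inter(x, y):
    """Entries on which two adjacency vectors agree (None elsewhere)."""
    return tuple(a if a is not None and a == b else None for a, b in zip(x, y))

def size(x):
    """Number of arcs (defined entries) in an adjacency vector."""
    return sum(v is not None for v in x)

def find_words(X, smin, rng=random):
    """Return the words (common sub-vectors with at least smin arcs)."""
    words = []                                      # Y
    remaining = list(X)                             # X'
    while remaining:
        x = rng.choice(remaining)
        group = [x]                                 # Z
        common = x                                  # Inter(Z)
        candidates = [v for v in X if v is not x]   # X'' = X - {x}
        rng.shuffle(candidates)
        for v in candidates:
            merged = inter(common, v)
            if size(merged) >= smin:
                group.append(v)
                common = merged
        if len(group) > 1 and common not in words:
            words.append(common)
        remaining = [v for v in remaining if all(v is not z for z in group)]
    return words

if __name__ == "__main__":
    x1 = (2, 3, 6, 7, None, 4, 8, None)
    x2 = (2, 3, 6, 7, 4, 5, 8, None)
    print(find_words([x1, x2], smin=3))
```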

Vocabulary building
§ Additional notation:
  • x and y: adjacency vectors
  • ExtInter(x, y): undefined variables in one of the vectors are filled with the corresponding defined variables in the other

Vocabulary building
§ Example of ExtInter(x1, x2) on a graph with vertices 1 to 8

  u          1  2  3  4  5  6  7  8
  x1         2  3  -  -  -  -  8  -
  x2         -  3  4  5  6  -  -  -
  ExtInter   2  3  4  5  6  -  8  -

[Figure: the two partial paths and the path obtained by merging them]
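ExtInter can be sketched on the same tuple representation (`None` for undefined entries). The slides do not say what happens when both vectors define different successors for the same vertex; keeping the entry of the first vector is an assumption of this example.

```python
def ext_inter(x, y):
    """Fill the undefined entries of x with the entries defined in y."""
    return tuple(a if a is not None else b for a, b in zip(x, y))

if __name__ == "__main__":
    x1 = (2, 3, None, None, None, None, 8, None)
    x2 = (None, 3, 4, 5, 6, None, None, None)
    print(ext_inter(x1, x2))
```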

Vocabulary building
§ Additional notation:
  • x and y: adjacency vectors
  • ExtInter(x, y): undefined variables in one of the vectors are filled with the corresponding defined variables in the other
§ Combine words: given a set of words Y, combine them into phrases
  • Very similar to the algorithm that finds words, replacing the original operator Inter by the new operator ExtInter

Vocabulary building
§ Algorithm CombineWords(Y):
    Z ← ∅, Y' ← Y
    while Y' ≠ ∅ do
      y ← rand(Y'), W ← {y}, Y'' ← Y - {y}
      while Y'' ≠ ∅ do
        y ← rand(Y'')
        if MaxInDegree(ExtInter(W, y)) = 1 then W ← W ∪ {y}
        Y'' ← Y'' - {y}
      end-while
      if |W| > 1 then z ← ExtInter(W); Z ← Z ∪ {z}
      Y' ← Y' - W
    end-while
    return Z
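CombineWords can be sketched along the same lines. The assumptions are the tuple representation with `None` for undefined entries, an ExtInter that keeps the first vector's entry on a conflict, and a random visiting order in place of repeated random draws. MaxInDegree is taken as the largest number of arcs entering a single vertex.

```python
import random
from collections import Counter

def ext_inter(x, y):
    """Fill the undefined entries of x with the entries defined in y."""
    return tuple(a if a is not None else b for a, b in zip(x, y))

def max_in_degree(x):
    """Largest number of arcs of x that enter the same vertex."""
    counts = Counter(v for v in x if v is not None)
    return max(counts.values(), default=0)

def combine_words(Y, rng=random):
    """Merge compatible words into phrases."""
    phrases = []                                    # Z
    remaining = list(Y)                             # Y'
    while remaining:
        y = rng.choice(remaining)
        group = [y]                                 # W
        phrase = y                                  # ExtInter(W)
        candidates = [v for v in Y if v is not y]   # Y'' = Y - {y}
        rng.shuffle(candidates)
        for v in candidates:
            merged = ext_inter(phrase, v)
            if max_in_degree(merged) == 1:
                group.append(v)
                phrase = merged
        if len(group) > 1 and phrase not in phrases:
            phrases.append(phrase)
        remaining = [v for v in remaining if all(v is not w for w in group)]
    return phrases

if __name__ == "__main__":
    y1 = (2, 3, None, None, None, None, 8, None)
    y2 = (None, 3, 4, 5, 6, None, None, None)
    print(combine_words([y1, y2]))
```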

Vocabulary building
§ Combine words: given a set of words Y, combine them into phrases
  • Very similar to the algorithm that finds words, replacing the original operator Inter by the new operator ExtInter
§ Phrases may be incomplete or infeasible
§ Repair the infeasible phrases (solutions)
  • Insert probe s0 in the best place in case it does not appear in the phrase
  • Complete the solution by joining the subpaths of the phrase

Vocabulary building
§ Algorithm VocabularyBuilding(X, smin):
    Y ← FindWords(X, smin)
    Z ← CombineWords(Y)
    A ← ∅
    for each z ∈ Z do
      a ← MakeFeasible(z)
      A ← A ∪ {a}
    end-for
    return A

Complete heuristic: MS+MEM+VB
§ Algorithm MS+MEM+VB:
    Q: pool of elite solutions for adaptive memory
    X: pool of elite solutions for vocabulary building (|X| >> |Q|)
    Q ← ∅, X ← ∅; a* ← null
    for i = 1, ..., MAXITER
      a ← GreedyRandomizedMemory(Q, weight)
      if |a| > |a*| then a* ← a
      update the weight and use a to update pools Q and X
      if i mod nVB = 0 then
        A ← VocabularyBuilding(X, smin)
        for every a ∈ A do
          use a to update pools Q and X
          if |a| > |a*| then a* ← a
        end-for
    return a*

Computational experiments
§ Conditions:
  • Pentium 2.4 GHz with 512 MB of RAM
  • Linux 10.0 with kernel 2.6.3
  • Codes in ANSI C++ compiled with GNU compiler version 3.3.2
§ Instances:
  • Set A: instances generated from real human DNA sequences obtained from GenBank
  • Set R: randomly generated instances

Computational experiments
§ Instances A:
  • Origin: 40 GenBank sequences
  • Five smaller sequences are generated from each original sequence, corresponding to their prefixes of size n = 109, 209, 309, 409, 509
  • For each of them, we consider its ideal spectrum, with size respectively equal to 100, 200, 300, 400, 500, using an array with probes of size q = 10
  • Total: 200 instances
  • 20% of false negatives and 20% of false positives generated for each instance (probe s0 appears in all of them, no repetitions)

Computational experiments
§ Instances R:
  • Origin: 100 random sequences
  • Ten smaller sequences are generated from each original sequence, corresponding to their prefixes of size n = 100, 200, ..., 1000
  • For each of them, we consider its ideal spectrum, with size respectively equal to 92, 192, ..., 992, using an array with probes of size q = 7
  • Total: 1000 instances
  • 20% of false negatives and 20% of false positives generated for each instance (probe s0 appears in all of them, no repetitions)

Computational experiments
§ Solution quality evaluation:
  1. Number of probes in the solution: |a|
  2. Similarity with the target sequence:
     • Align the sequence spelled by solution a with the target sequence (matches: +1, mismatches: -1) and compute the alignment score, align, by dynamic programming
     • Compute similarity(a) = 100 · (align + nmax)/(2 · nmax), where nmax is the larger of the lengths of the two sequences
  3. Fraction:
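The similarity measure can be sketched with a standard global alignment computed by dynamic programming. The slides give only the match and mismatch scores; charging -1 for each gap is an assumption of this example, as are the function names.

```python
def align(s, t, match=1, mismatch=-1, gap=-1):
    """Global alignment score of s and t, keeping one row of the table."""
    prev = [j * gap for j in range(len(t) + 1)]
    for i in range(1, len(s) + 1):
        cur = [i * gap]
        for j in range(1, len(t) + 1):
            diag = prev[j - 1] + (match if s[i - 1] == t[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]

def similarity(solution_seq, target_seq):
    """100 * (align + nmax) / (2 * nmax), with nmax the longer length."""
    nmax = max(len(solution_seq), len(target_seq))
    return 100.0 * (align(solution_seq, target_seq) + nmax) / (2 * nmax)

if __name__ == "__main__":
    print(similarity("ATAGGCAGGA", "ATAGGCAGGA"))
    print(similarity("ATAG", "ATAGG"))
```

The score ranges from -nmax to nmax, so the similarity lies between 0 and 100, with 100 reached only when the two sequences coincide.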

Computational experiments
§ Random instances in set R used for parameter setting and tuning
  • The weight parameter decreases with the iteration counter
  • Small values of the RCL parameter α are used in the beginning, so that purely greedy solutions are generated when no frequency information is available
  • Initial parameter value decreases with the problem size
  • MAXITER = 10 · n (iterations)
  • Both parameters are updated after blocks of n/2 iterations

Numerical results
[Figure: average similarity with the target sequence over all R instances with the same size; curves include MS and MS+MEM+VB]
§ Each additional component (memory, VB) improves the multi-start heuristic MS

Numerical results
[Figure: average computation time over all R instances with the same size]

Numerical results
[Figure: average similarity with the target sequence observed with algorithm MS+Mem+VB over all R instances with the same size, for different rates of errors]

Numerical results
[Figure: average similarity with the target sequence observed with algorithm MS+Mem+VB over all R instances with the same size, for different probe sizes]

Numerical results
[Figure: best known solution for an instance in set R (n = 1000) vs. iteration counter]

Numerical results
[Figure: best known solution for an instance in set R (n = 1000) vs. processing time (10.4 seconds)]

Numerical results
[Figure: best known solution for another instance in set R (n = 1000) vs. iteration counter]

Numerical results
[Figure: best known solution for another instance in set R (n = 1000) vs. processing time (9.0 seconds)]
§ Additional memory computations speed up the multi-start heuristic (better solutions in the same computation time), in spite of the increase in the time per iteration
§ Memory helps!

Numerical results
[Figure annotations: as the RCL parameter increases, the greedy solutions deteriorate; as the weight parameter decreases, the memory acts to improve the solutions]

Numerical results
§ Instance in set R with n = 500
§ Empirical distributions of the time to target solution value
§ Set a target value (in this case, the optimal value)
§ Run each algorithm 100 times and record the running time when a solution at least as good as the target value is found
§ Plot the empirical distributions
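The points of such an empirical distribution can be computed as sketched below. Associating the i-th smallest time with the probability (i - 1/2)/N is one common convention for time-to-target plots and is an assumption here, as is the function name; the running times in the usage example are made up.

```python
def ttt_points(times):
    """Pairs (time, cumulative probability) for the sorted running times."""
    ordered = sorted(times)
    n = len(ordered)
    return [(t, (i - 0.5) / n) for i, t in enumerate(ordered, start=1)]

if __name__ == "__main__":
    # Hypothetical running times (seconds) of four independent runs.
    for t, p in ttt_points([3.2, 0.9, 1.7, 2.4]):
        print(f"{t:5.1f}  {p:.3f}")
```

Plotting these points for each algorithm gives one curve per algorithm; a curve further to the left reaches the target value sooner.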

Numerical results
[Figure: empirical distributions of the time to target value for an instance in set R with n = 500]
§ Algorithms with memory are more robust (times to target value are more stable)
§ Algorithms with memory find target values more quickly (curves further to the left are preferable)

Comparisons
§ Best algorithms in the literature:
  • Tabu search (TS): Blazewicz et al., 2000
  • Overlapping windows heuristic (OW): Blazewicz et al., 2002
  • SOPAS, genetic algorithm (GA): Endo, 2004

Comparisons
Average similarity with the target sequence observed with the four algorithms over all A instances with the same size

                 Sequence length (n)
  Algorithm      109    209    309    409    509
  TS            98.6   94.1   89.6   88.5   80.7
  OW            99.4   95.2   95.7   92.1   90.1
  GA            98.3   97.9   99.1   98.1   93.5

  MS+Mem+VB: 100.0, 99.2, 99.4, 99.5 (only four of the five values survive; their columns are not recoverable)

[Figure: the same comparison shown as a chart]

Comparisons
Number of target sequences found by each of the four algorithms over all A instances with the same size

                 Sequence length (n)
  Algorithm      109    209    309    409    509
  TS              28     23     17     10     10
  OW              28     20     21     13     14
  MS+Mem+VB       40     40     39     39     39

  GA: 37, 30, 28 (only three of the five values survive; their columns are not recoverable)

Comparisons
Average computation times in seconds observed for each of the four algorithms over all A instances with the same size

                 Sequence length (n)
  Algorithm      109    209    309    409    509
  TS            <1.0    5.0   14.0   28.0   51.0
  GA             0.1    0.3    0.9    1.5    2.1
  MS+Mem+VB      0.1    0.4    0.9    3.1    6.2

  OW: <1.0 (only one value survives)
  Slide annotation: Cray T3E-900

[Figure: the same comparison shown as a chart]

Comparisons
Number of target sequences found by MS+Mem+VB and the GA over all R instances with the same size

                 Sequence length (n)
  Algorithm      100   200   300   400   500   600   700   800   900  1000
  GA              70    61    55    37    23    11     9     3     1     2
  MS+Mem+VB       79    74    83    73    61    52    34    10    13     2

Comparisons
[Figure: average similarity with the target sequence over all R instances with the same size]

Comparisons
[Figure: average computation times in seconds observed for each algorithm over all R instances with the same size]

Concluding remarks
§ New multi-start heuristic for PSBH performs very well
§ Memory approaches (adaptive memory and vocabulary building) are able to improve multi-start solutions
§ Parameter tuning may be further improved
§ Approach can be applied to other optimization problems (e.g., the car sequencing problem)