On the Range Maximum-Sum Segment Query Problem
Kuan-Yu Chen and Kun-Mao Chao
Department of Computer Science and Information Engineering, National Taiwan University, Taiwan
2021/9/17

The Maximum-Sum Segment
- Also called the maximum-sum interval or the maximum-scoring region.
- Given a sequence of numbers, the maximum-sum segment is the contiguous subsequence having the greatest total sum.
- Example: <5, -5.1, 1, 3, -4, 2, 3, -4, 7> has a maximum-sum segment with total sum = 8.
- Zero prefix-/suffix-sums are possible: here <1, 3, -4> sums to zero, so both <1, 3, -4, 2, 3, -4, 7> and <2, 3, -4, 7> attain the sum 8.

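As a point of reference, the definition above can be checked with the classic linear-time scan usually attributed to Kadane. This is not the technique developed in these slides; it is only a minimal Python sketch, and the function name `max_sum_segment` and the 0-based indexing are choices made here for illustration.

```python
def max_sum_segment(a):
    """Return (best_sum, start, end) of a maximum-sum segment of a non-empty list.

    Indices are 0-based and inclusive. The running segment is restarted only
    when its sum is negative, so a zero-sum prefix is kept rather than dropped.
    """
    best_sum, best_start, best_end = a[0], 0, 0
    cur_sum, cur_start = 0, 0
    for j, x in enumerate(a):
        if cur_sum < 0:          # a negative running sum can never help
            cur_sum, cur_start = x, j
        else:
            cur_sum += x
        if cur_sum > best_sum:   # strict '>' keeps the earliest best segment
            best_sum, best_start, best_end = cur_sum, cur_start, j
    return best_sum, best_start, best_end


print(max_sum_segment([5, -5.1, 1, 3, -4, 2, 3, -4, 7]))
```

On the example sequence this reports the sum 8 for positions 2 to 8 (0-based), that is, the longer of the two tied segments.
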
A Relevant Problem - RMQ
- Range Minima (Maxima) Query Problem (also called Discrete Range Searching)
- Given a sequence of numbers, we preprocess it so that the minimum (maximum) value within any given querying interval can be retrieved efficiently.
- Example: <5, -5.1, 1, 3, -4, 2, 3, -4, 7> (the figure marks the maximum and the minimum of a querying interval).

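To make the RMQ interface concrete, here is a minimal sketch of a sparse table. Note the assumption: a sparse table needs O(n log n) preprocessing, so it is a simpler stand-in and not the <O(n), O(1)> structure that the later slides rely on. The class name `SparseTableMin` and its method `argmin` are hypothetical names chosen here.

```python
class SparseTableMin:
    """Range-minimum queries with O(n log n) preprocessing and O(1) lookups."""

    def __init__(self, values):
        self.values = list(values)
        n = len(self.values)
        # table[k][i] = index of the minimum of values[i : i + 2**k]
        self.table = [list(range(n))]
        k = 1
        while (1 << k) <= n:
            prev = self.table[-1]
            half = 1 << (k - 1)
            row = []
            for i in range(n - (1 << k) + 1):
                left, right = prev[i], prev[i + half]
                row.append(left if self.values[left] <= self.values[right] else right)
            self.table.append(row)
            k += 1

    def argmin(self, lo, hi):
        """Index of the minimum of values[lo..hi] (0-based, inclusive)."""
        k = (hi - lo + 1).bit_length() - 1      # largest power of two <= length
        left = self.table[k][lo]
        right = self.table[k][hi - (1 << k) + 1]
        return left if self.values[left] <= self.values[right] else right
```

Range maxima can be answered with the same structure by storing the negated values.
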
Range Maximum-Sum Segment Query Problem
- Definition: The input is a sequence <a_1, a_2, ..., a_n> of real numbers, which is to be preprocessed.
- A query consists of two intervals S and E.
- Our goal is to return the maximum-sum segment whose starting index lies in S and whose end index lies in E.

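The definition can be pinned down with a brute-force reference that tries every admissible pair of endpoints. This is only a specification-style sketch, far from the O(1) query time the slides aim for; the name `rmsq_bruteforce` and the use of 1-based inclusive `(lo, hi)` tuples for S and E are assumptions made here.

```python
def rmsq_bruteforce(a, S, E):
    """Best (sum, i, j) with i in S, j in E and i <= j; indices are 1-based.

    S and E are inclusive (lo, hi) pairs. Returns None when no pair with
    i <= j exists. Deliberately naive: every candidate segment is summed.
    """
    best = None
    for i in range(S[0], S[1] + 1):
        for j in range(max(i, E[0]), E[1] + 1):
            total = sum(a[i - 1:j])          # a_i + ... + a_j
            if best is None or total > best[0]:
                best = (total, i, j)
    return best
```

For instance, on the sequence 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3 with S = [2, 4] and E = [6, 9], the call returns the sum 8 for the segment from index 3 to index 9.
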
A Nonoverlapping Example
- Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
- (Figure: the starting region and the end region are disjoint; the answer has total sum = 6.)

An Overlapping Example
- Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
- (Figure: the starting region and the end region overlap; the answer has total sum = 8.)

Our Results
- We propose an algorithm that runs in O(n) preprocessing time and O(1) query time under the unit-cost RAM model.
- In fact, we show that RMSQ and RMQ are computationally linearly equivalent.
- We show that the RMSQ techniques yield alternative O(n)-time algorithms for the following problems:
  - The maximum-sum segment with length constraints
  - All maximal-sum segments

Strategy
- Reduce the RMSQ problem to the RMQ problem.
- (Figure: RMSQ reduces to RMQ with O(n) additional preprocessing time and O(1) additional query time.)
- Theorem. If there is a <f(n), g(n)>-time solution for the RMQ problem, then there is a <f(n)+O(n), g(n)+O(1)>-time solution for the RMSQ problem.

Computing sum(i, j) in O(1) time
- prefix-sum(i) = a_1 + a_2 + ... + a_i
- All n prefix sums are computable in O(n) time.
- sum(i, j) = prefix-sum(j) - prefix-sum(i-1)

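The two formulas translate directly into code. In this small sketch the helper names `prefix_sums` and `segment_sum` are made up for illustration, and the list gets an extra entry `P[0] = 0` so that the slide's 1-based indices can be used unchanged.

```python
from itertools import accumulate


def prefix_sums(a):
    """P[k] = a_1 + ... + a_k for k = 0..n, with P[0] = 0; takes O(n) time."""
    return [0] + list(accumulate(a))


def segment_sum(P, i, j):
    """sum(i, j) = prefix-sum(j) - prefix-sum(i-1), for 1 <= i <= j <= n."""
    return P[j] - P[i - 1]


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P = prefix_sums(a)
print(segment_sum(P, 3, 9))   # 4 - 2 + 4 - 5 + 4 - 3 + 6
```
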
Case 1: Nonoverlapping
- To maximize sum(i, j) = prefix-sum(j) - prefix-sum(i-1), maximize prefix-sum(j) and minimize prefix-sum(i-1).
- Input sequence (the figure plots its prefix sums): 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
- Range minima query: find the lowest point in the starting region. Range maxima query: find the highest point in the end region.

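A minimal sketch of this case follows. The function name `rmsq_disjoint` is hypothetical, and the built-in `min`/`max` scans merely stand in for the two O(1) RMQ lookups on the prefix sums; the sketch assumes the starting region ends no later than the end region begins.

```python
from itertools import accumulate


def rmsq_disjoint(a, S, E):
    """Answer a query whose regions S and E do not overlap.

    S = (s_lo, s_hi) and E = (e_lo, e_hi) are 1-based inclusive, s_hi <= e_lo.
    Returns (sum, i, j) for the best segment a_i..a_j.
    """
    P = [0] + list(accumulate(a))                              # P[k] = prefix-sum(k)
    i = min(range(S[0], S[1] + 1), key=lambda k: P[k - 1])     # lowest prefix-sum(i-1)
    j = max(range(E[0], E[1] + 1), key=lambda k: P[k])         # highest prefix-sum(j)
    return P[j] - P[i - 1], i, j


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(rmsq_disjoint(a, (2, 4), (6, 9)))
```

Because every start in S precedes every end in E, the two choices are independent, which is why two separate range queries suffice.
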
Case 2: Overlapping
- Some problems may occur.
- Input sequence (the figure plots its prefix sums): 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
- When the regions overlap, the lowest point found may lie to the right of the highest point found. The pair then does not form a valid segment (negative sum!).

Case 2: Overlapping
- Divide into 3 possible cases.
- Input sequence (the figure plots its prefix sums): 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
- With a range minima query structure (preprocessing time f(n), query time g(n)), two of the cases reduce to finding the lowest point in one region and the highest point in a later, disjoint region.
- In the remaining case both endpoints fall in the overlapping region. What should we do?

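One way to read the three cases, assuming S = [s1, s2] and E = [e1, e2] with s1 <= e1 <= s2 <= e2, is: start before the overlap, end after the overlap, or both endpoints inside the overlap. The sketch below follows that reading. The names are hypothetical, the `min`/`max` scans stand in for RMQ lookups, and the third case is handled by a plain linear scan rather than by the partner technique of the later slides.

```python
from itertools import accumulate


def rmsq_overlapping(a, S, E):
    """Best (sum, i, j) for overlapping regions, 1-based inclusive indices."""
    (s1, s2), (e1, e2) = S, E               # assumes s1 <= e1 <= s2 <= e2
    P = [0] + list(accumulate(a))

    def disjoint(i_lo, i_hi, j_lo, j_hi):
        # Start region strictly left of end region: two independent lookups.
        if i_lo > i_hi or j_lo > j_hi:
            return None
        i = min(range(i_lo, i_hi + 1), key=lambda k: P[k - 1])
        j = max(range(j_lo, j_hi + 1), key=lambda k: P[k])
        return P[j] - P[i - 1], i, j

    def inside(lo, hi):
        # Both endpoints in [lo, hi]: the "single range query" special case.
        best, low = None, lo                # low = start minimising P[start-1]
        for j in range(lo, hi + 1):
            if P[j - 1] < P[low - 1]:
                low = j
            cand = (P[j] - P[low - 1], low, j)
            if best is None or cand[0] > best[0]:
                best = cand
        return best

    cands = [
        disjoint(s1, e1 - 1, e1, e2),       # start before the overlap
        disjoint(e1, s2, s2 + 1, e2),       # end after the overlap
        inside(e1, s2),                     # both inside the overlap
    ]
    return max((c for c in cands if c is not None), key=lambda c: c[0])
```

On the running sequence, `rmsq_overlapping(a, (2, 8), (5, 12))` picks the first case and returns the sum 8 for the segment from index 3 to index 9.
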
Dealing with the Special Case: Single Range Query
- Input sequence: 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3
- (Figure: a single querying interval; the answer has total sum = 6.)
- Challenge: Can this special case be reduced to the RMQ problem?

Reduction Procedure
- Step 1. Find a partner for each index.
- Step 2. Record the sum of each pair in an array.
- Step 3. Retrieve the maximum-sum pair by applying the RMQ techniques.

Our First Attempt (1)
- Step 1: For each index i, we define the lowest point preceding i as its partner.
- (Figure: on the prefix-sum sequence, the partner of i is the lowest point in the region to the left of i.)

Our First Attempt (2)
- Step 2: Record sum(partner(i), i) in an array.

Our First Attempt (3)
- Step 3: Apply the RMQ techniques to the array of sum(partner(i), i) values.
- For a querying interval, the maximum-sum pair can be retrieved by a range maxima query on this array.

Bump into Difficulties
- What if some partners go beyond the querying interval? We might have to update every pair!
- (Figure: partner(i) lies to the left of the querying interval, so sum(partner(i), i) needs to be updated.)

A Better Partner
- How?
- On the prefix-sum sequence, let Left_bound(i) be the nearest point to the left of i that is at least as high as i.
- The new partner(i) is the lowest point between Left_bound(i) and i.

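The definition can be sketched directly in code. Several details below are assumptions made for illustration and are not fixed by the slide: points are prefix sums P[0..n] with P[0] = 0, the left-bound point itself is eligible as a partner, ties for the lowest point go to the rightmost one, and all names are hypothetical. The quadratic scans keep the sketch close to the definition; a monotonic stack can compute the same arrays in O(n) time.

```python
from itertools import accumulate


def better_partners(a):
    """Return (P, left_bound, partner), lists indexed by point 0..n.

    For each i >= 1:
      left_bound[i] = nearest l < i with P[l] >= P[i], or None if none exists;
      partner[i]    = lowest point in [left_bound[i], i-1] (from point 0 if None).
    The pair (partner[i], i) stands for the segment a[partner[i]+1 .. i].
    """
    P = [0] + list(accumulate(a))
    n = len(a)
    left_bound = [None] * (n + 1)
    partner = [None] * (n + 1)
    for i in range(1, n + 1):
        l = i - 1
        while l >= 0 and P[l] < P[i]:       # skip points lower than point i
            l -= 1
        left_bound[i] = l if l >= 0 else None
        lo = max(l, 0)
        partner[i] = min(range(lo, i), key=lambda k: (P[k], -k))
    return P, left_bound, partner


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P, left_bound, partner = better_partners(a)
pair_sums = [P[i] - P[partner[i]] for i in range(1, len(a) + 1)]
print(pair_sums)
```

For index 9, for example, the partner is point 2, so the recorded pair is the segment from index 3 to index 9 with sum 8.
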
Why Is It Better? (1)
- It remains the best choice.
- It saves lots of update steps.
  - It turns out that zero or one point needs to be updated.

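Why Is It Better? (2) -- Remains the Best
- Left_bound(i) is the nearest higher point to the left of i, and partner(i) is the lowest point between Left_bound(i) and i.
- The region to the left of Left_bound(i) is an impossible region for the start of a best segment ending at i: ending the same segment at Left_bound(i) instead would give a sum at least as large.
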
Why Is It Better? (3) -- Minimal-Maximal Property
- Height(partner(i)) < Height(j) < Height(i), for all partner(i) < j < i.
- partner(i) is the minimal point and i is the maximal point of the pair: no point in between is higher than i, and none is lower than partner(i).

Why Is It Better? (4) -- Save Some Updates
- On the prefix-sum sequence, no point between partner(i) and i is higher than i.
- Hence, when partner(i) lies outside the querying interval, the points of the interval to the left of i cannot be the right end of the maximum-sum segment.

Why Is It Better? (5) -- Nesting Property
- For two indices i < j, it cannot be the case that partner(i) < partner(j) ≤ i < j.
- (Figure: two pairs, each with a minimal point partner(.) and a maximal point, arranged in this forbidden partially overlapping way.)

Why Is It Better? (6) -- An example
- No overlapping is allowed.
- Nesting property.
- (Figure: the pairs (partner(i), i) marked on the input sequence 9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3; pairs never partially overlap, they are either disjoint or nested.)

When a Query Comes -- Case 1: No Exceeding
- The maximum pair (partner(i), i) lies in the querying interval.
- Retrieve the maximum pair. We are done. Output (partner(i), i).

When a Query Comes -- Case 2: Exceeding
- Retrieve the maximum pair (partner(i), i). It goes beyond the querying interval.
- Minimal-maximal property: the points of the interval to the left of i cannot be the right end of the maximum-sum segment.
- Update partner(i) to the lowest point available within the querying interval.
- Nesting property: retrieve the maximum pair (partner(j), j) among the indices j to the right of i in the querying interval.
- Compare (new partner(i), i) and (partner(j), j). The better one is the maximum pair.

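Both cases can be put together in a short sketch of the single range query. Everything named here is hypothetical (`preprocess`, `single_range_query`, the arrays `start` and `M`), the partner definition follows the same assumptions as before (left-bound point eligible, ties to the rightmost lowest point), and the `min`/`max` scans stand in for O(1) RMQ lookups on the arrays P and M.

```python
from itertools import accumulate


def preprocess(a):
    """Return (P, start, M): prefix sums, partner start indices, pair sums."""
    P = [0] + list(accumulate(a))
    n = len(a)
    start, M = [0] * (n + 1), [0] * (n + 1)        # entry 0 is unused
    for i in range(1, n + 1):
        # Nearest point to the left at least as high as point i (else point 0).
        lb = max((l for l in range(i) if P[l] >= P[i]), default=0)
        q = min(range(lb, i), key=lambda k: (P[k], -k))   # lowest point in [lb, i-1]
        start[i], M[i] = q + 1, P[i] - P[q]
    return P, start, M


def single_range_query(P, start, M, lo, hi):
    """Maximum-sum segment inside a[lo..hi] (1-based): (sum, first, last)."""
    i = max(range(lo, hi + 1), key=lambda k: M[k])         # RMQ on M
    if start[i] >= lo:                                     # Case 1: no exceeding
        return M[i], start[i], i
    # Case 2: the partner of i lies left of the interval, so update it ...
    q = min(range(lo - 1, i), key=lambda k: P[k])          # RMQ on P
    best = (P[i] - P[q], q + 1, i)
    if i < hi:
        # ... and compare with the maximum pair to the right of i.
        j = max(range(i + 1, hi + 1), key=lambda k: M[k])  # RMQ on M
        if M[j] > best[0]:
            best = (M[j], start[j], j)
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
P, start, M = preprocess(a)
print(single_range_query(P, start, M, 3, 9))    # Case 1
print(single_range_query(P, start, M, 4, 9))    # Case 2
```

With the interval [3, 9] the maximum pair already fits (sum 8); with [4, 9] its partner exceeds the interval and is updated, giving the segment from index 7 to index 9 with sum 7.
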
Time Complexity
- RMSQ can be reduced to the RMQ problem in O(n) time.
- Since there is a <O(n), O(1)>-time solution for the RMQ problem under the unit-cost RAM model, there is a <O(n), O(1)>-time solution for the RMSQ problem.
- On the other hand, RMQ can be reduced to the RMSQ problem in O(n) time, too. (Range Maxima Query: between each two adjacent elements, we insert a negative number whose absolute value is larger than both of them.)

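The reverse reduction can be illustrated as follows. This is a sketch under stated assumptions: the separator is taken as -X with X one larger than the largest absolute value in the input, the function name `range_max_via_rmsq` is hypothetical, and a Kadane-style scan stands in for the RMSQ structure that would answer the query in O(1) time.

```python
def range_max_via_rmsq(a, l, r):
    """Range-maxima query on a[l..r] (1-based), posed as a max-sum segment query.

    With a separator -X between adjacent elements, any segment containing two
    or more original elements sums to less than one of them, so the
    maximum-sum segment is a single element: the range maximum.
    """
    X = max(abs(x) for x in a) + 1
    b = []
    for x in a:
        b += [x, -X]
    b.pop()                                   # a_k now sits at b[2k-2] (0-based)
    lo, hi = 2 * l - 2, 2 * r - 2             # S and E are both b[lo..hi]
    best, best_end, cur = b[lo], lo, 0
    for j in range(lo, hi + 1):               # stand-in for the RMSQ query
        cur = max(cur, 0) + b[j]
        if cur > best:
            best, best_end = cur, j
    return best, best_end // 2 + 1            # (maximum value, 1-based position)


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(range_max_via_rmsq(a, 2, 10))
```
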
Use RMSQ Techniques to Solve Two Relevant Problems
- 1. Finding the maximum-sum segment with length constraints in O(n) time.
  - Y.-L. Lin, T. Jiang, K.-M. Chao, 2002
  - T.-H. Fan et al., 2003
- 2. Finding all maximal scoring subsequences in O(n) time.
  - W. L. Ruzzo & M. Tompa, 1999

Problem 1: The Maximum-Sum Segment with Length Constraints
- Lin, Jiang, and Chao [JCSS 2002] and Fan et al. [CIAA 2003] gave O(n)-time algorithms for this problem.
- Length at least L, and at most U.

Problem 1: Finding the Maximum-Sum Segment with Length Constraints
- Length at least L, at most U.
- For each index i, use an RMSQ query to find the maximum-sum segment whose starting point lies in [i-U+1, i-L+1] and whose end point is i.
- Runs in O(n) time since each query costs O(1) time.

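The per-index query can be sketched as below. Since the end region is the single index i, each query is a nonoverlapping one and needs only the lowest prefix sum in the start window. The name `max_sum_segment_with_length` is hypothetical, the start window is clipped at index 1, and the `min` scan stands in for the O(1) lookup, so this sketch itself runs in O(n(U-L+1)) time rather than O(n).

```python
from itertools import accumulate


def max_sum_segment_with_length(a, L, U):
    """Best (sum, start, end), 1-based, over segments of length in [L, U].

    Assumes 1 <= L <= U and L <= len(a).
    """
    P = [0] + list(accumulate(a))
    best = None
    for j in range(L, len(a) + 1):                 # j is the end point
        lo, hi = max(1, j - U + 1), j - L + 1      # window of allowed start points
        i = min(range(lo, hi + 1), key=lambda k: P[k - 1])
        cand = (P[j] - P[i - 1], i, j)
        if best is None or cand[0] > best[0]:
            best = cand
    return best


a = [9, -10, 4, -2, 4, -5, 4, -3, 6, -11, 8, -3, 4, -5, 3]
print(max_sum_segment_with_length(a, 2, 3))
```

On the running sequence with L = 2 and U = 3, the best segment is 8, -3, 4 (indices 11 to 13) with sum 9.
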
Problem 2: All Maximal-Sum Segments
- Ruzzo and Tompa [ISMB 1999] gave an O(n)-time algorithm for this problem.
- Recursive definition. (Figure: the maximum-sum segment S splits the sequence into a left part L(S) and a right part R(S).)

Problem 2: Finding All Maximal Scoring Subsequences
- Recursive calls: find S in the input sequence with an RMSQ query, then recurse on L(S) and R(S).
- Runs in O(n) time since each query costs O(1) time.

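The recursion can be sketched as follows. The assumptions are loud ones: only segments with positive sum are reported, ties are resolved in favour of the leftmost and shortest maximum-sum segment, the example input is made up for illustration, and a linear scan stands in for the O(1) RMSQ query, so this sketch is quadratic in the worst case. All names are hypothetical.

```python
def all_maximal_segments(a):
    """Return [(sum, start, end), ...] in left-to-right order, 1-based indices."""

    def best_in(lo, hi):
        # Stand-in for an RMSQ query with S = E = [lo, hi]. Restarting on a
        # non-positive running sum yields the leftmost, shortest best segment.
        best = None
        cur, cur_start = 0, lo
        for j in range(lo, hi + 1):
            if cur <= 0:
                cur, cur_start = a[j - 1], j
            else:
                cur += a[j - 1]
            if best is None or cur > best[0]:
                best = (cur, cur_start, j)
        return best

    def solve(lo, hi, out):
        if lo > hi:
            return
        total, s, e = best_in(lo, hi)
        if total <= 0:               # nothing worth reporting in this part
            return
        solve(lo, s - 1, out)        # L(S)
        out.append((total, s, e))    # S itself
        solve(e + 1, hi, out)        # R(S)

    out = []
    solve(1, len(a), out)
    return out


print(all_maximal_segments([4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5]))
```

For very long inputs the recursion depth can approach n, so an explicit stack would be the safer design choice.
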