Segment Tree
A segment tree is a data structure used to store information about array segments and answer segment queries efficiently:
- range(i, j): gives the sum of the array elements starting at index i and ending at index j.
- update(i, val): updates the value at index i to the val in the original array and updates the segment tree accordingly.
Structure of the Segment Tree
Consider an array arr
of size n
and a corresponding Segment Tree T
:
- The root of
T
will represent the whole arrayarr[0..n-1]
- Each leaf in the segment tree
T
represent a single element arr[i] such that0 <= i < n
- The internal nodes in the segment tree
T
represent the union of elementary intervals arr[i..j] where0 <= i < j < n
We can take a divide-and-conquer approach when it comes to array segments. We compute and store the sum of the elements of the whole array, i.e. the sum of the segment arr[0..n-1]
. We then split the array into two halves arr[0..n / 2 - 1]
and arr[n / 2..n - 1]
and compute the sum of each halve and store them. Each of these two halves in turn are split in half, and so on until all segments reach size 1
.
Construction
Before constructing the segment tree, we need to decide:
- The value that gets stored at each node of the segment tree. For example, in a sum segment tree, a node would store the sum of the elements in its range
[l..r]
.- The merge operation that merges two siblings in a segment tree. For example, in a sum segment tree, the two nodes corresponding to the rangesarr[l1..r1]
andarr[l2..r2]
would be merged into a node corresponding to the rangearr[l1..r2]
by adding the values of the two nodes.
Note that a vertex is a leaf vertex, if its corresponding segment covers only one value in the original array. It is present at the lowermost level of a segment tree. Its value would be equal to the (corresponding) element arr[i]
We can use an array to represent Segment Trees. For each node at index i
- The left child is at index (2 * i + 1)
- The right child at (2 * i + 2)
- The parent is at floor((i – 1) / 2).
Segment trees have some nice properties:
- If the underlying array has
n
elements, the segment tree has exactly(2n - 1)
nodes —n
leaves and(n - 1)
internal nodes - The height of the tree is
O(logn)
- Each segment can be split into
O(log n)
non-intersecting segments that correspond to the nodes of the segment tree
When n
is not a perfect power of two, not all levels are filled entirely — the last layer may be incomplete.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 |
|
- arr: the input array
- node: the index of the current vertex
- left and right: the boundaries of the current segment
In the main program this function will be called with the parameters node = 1
, left = 0
, right = n - 1
How to calculate segmentTree array size
If we have array of n
elements, then the segment tree will have a leaf node for each of these n
entries. Thus, we have n
leaf nodes, and also (n - 1)
internal nodes.
Total number of nodes = n + (n - 1) = 2n - 1
. Now, we know its a full binary tree and thus the height is: std::ceil(std::log2(n)) + 1
Total no. of nodes = 20 + 21 + ... + 2h
where h is the height of the segment tree
We have Sum of n Terms in GP:
Sn = a * r0 + a * r + a * r2 + ... + a * rn - 1 = a * (rn - 1) / (r - 1)
Apply the formula we have
Total no. of nodes = 2 * 2 std::ceil(std::log2(n)) - 1
1 2 |
|
Time complexity: O(n)
.
Update
Now we want to modify a specific element in the array, let's say we want to do the assignment arr[i] = x
And we have to rebuild the Segment Tree, such that it correspond to the new, modified array.
It is easy to see, that the update request can be implemented using a recursive function. The function gets passed the current tree vertex, and it recursively calls itself with one of the two child vertices (the one that contains arr[i]
in its segment), and after that recomputes its sum value, similar how it is done in the build method (that is as the sum of its two children).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 |
|
Time complexity: O(logn)
.
Query
For now we are going to answer sum queries. As an input we receive two integers l
and r
, and we have to compute the sum of the segment arr[l..r]
in O(logn)
time.
There are three possible cases.
- The easiest case is when the segment
arr[l..r]
is equal to the corresponding segment of the current vertex. Eg:arr[l..r] = segmentTree[tl..tr]
, then we are finished and can return the precomputed sum that is stored in the vertex. - The segment of the query can fall completely into the domain of either the left or the right child. In this case we can simply go to the child vertex, which corresponding segment covers the query segment.
- And then there is the last case, the query segment intersects with both children. In this case we have no other option as to make two recursive calls, one for each child. First we go to the left child, compute a partial answer for this vertex, then go to the right child, compute the partial answer using that vertex, and then combine the answers by adding them.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
Time complexity: O(logn)
.
Updating an interval (Lazy Propagation)
Sometimes problems will ask you to update an interval from l
to r
, instead of a single element. One solution is to update all the elements one by one. Complexity of this approach will be O(N)
per operation since where are N
elements in the array and updating a single element will take O(logN)
time.
To avoid multiple call to update function, we can modify the update function to work on an interval.
Let's be Lazy i.e., do work only when needed. How ? When we need to update an interval, we will update a node and mark its child that it needs to be updated and update it when needed. For this we need an array lazy[]
of the same size as that of segment tree.
Initially all the elements of the lazy[]
array will be 0
representing that there is no pending update. If there is non-zero element lazy[k]
then this element needs to update node k
in the segment tree before making any query operation.
To update an interval we will keep 3 things in mind.
- If current segment tree node has any pending update, then first add that pending update to current node.
- If the interval represented by current node lies completely in the interval to update, then update the current node and update the lazy[] array for children nodes.
- If the interval represented by current node overlaps with the interval to update, then update the nodes as the earlier update function
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 |
|
Query
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
|
Advanced versions of Segment Trees
Finding the maximum
Let us slightly change the condition of the problem described above: instead of querying the sum, we will now make maximum queries.
The tree will have exactly the same structure as the tree described above. We only need to change the way segmentTree[node]
is computed in the build
and update
functions. segmentTree[node]
will now store the maximum of the corresponding segment. And we also need to change the calculation of the returned value of the query
function (replacing the summation by the maximum).
Finding the maximum and the number of times it appears
This task is very similar to the previous one. In addition of finding the maximum, we also have to find the number of occurrences of the maximum.
To solve this problem, we store a pair of numbers at each vertex in the tree: In addition to the maximum we also store the number of occurrences of it in the corresponding segment.
Compute the greatest common divisor / least common multiple
In this problem we want to compute the GCD / LCM of all numbers of given ranges of the array.
This interesting variation of the Segment Tree can be solved in exactly the same way as the Segment Trees we derived for sum / minimum / maximum queries: it is enough to store the GCD / LCM of the corresponding vertex in each vertex of the tree. Combining two vertices can be done by computing the GCD / LCM of both vertices.
Counting the number of zeros, searching for the k-th zero
In this problem we want to find the number of zeros in a given range, and additionally find the index of the k-th
zero using a second function.
Again we have to change the store values of the tree a bit: This time we will store the number of zeros in each segment in segmentTree[]
. It is pretty clear, how to implement the build
, update
and countZero
functions, we can simply use the ideas from the sum query problem. Thus we solved the first part of the problem.
Now we learn how to solve the problem of finding the k-th
zero in the array arr
. To do this task, we will descend the Segment Tree, starting at the root vertex, and moving each time to either the left or the right child, depending on which segment contains the k-th
zero. In order to decide to which child we need to go, it is enough to look at the number of zeros appearing in the segment corresponding to the left vertex. If this precomputed count is greater or equal to k
, it is necessary to descend to the left child, and otherwise descent to the right child. Notice, if we chose the right child, we have to subtract the number of zeros of the left child from k
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
|
Searching for an array prefix with a given amount
The task is as follows: for a given value x
we have to quickly find smallest index i
such that the sum of the first i
elements of the array arr
is greater or equal to x
(assuming that the array arr
only contains non-negative values).
This task can be solved using binary search, computing the sum of the prefixes with the Segment Tree. However this will lead to a O(logn*logn)
solution
Instead we can use the same idea as in the previous section, and find the position by descending the tree: by moving each time to the left or the right, depending on the sum of the left child. Thus finding the answer in O(logn)
time
Searching for the first element greater than a given amount
The task is as follows: for a given value x
and a range [l..r]
find the smallest i
in the range arr[l..r]
, such that arr[i]
is greater than x
.
This task can be solved using binary search over max prefix queries with the Segment Tree. However, this will lead to a O(logn*logn)
solution.
Instead, we can use the same idea as in the previous sections, and find the position by descending the tree: by moving each time to the left or the right, depending on the maximum value of the left child. Thus finding the answer in O(logn)
time
Finding subsegments with the maximal sum
Here again we receive a range arr[l..r]
for each query, this time we have to find a subsegment arr[l'..r']
such that l <= l'
and r' <= r
and the sum of the elements of this segment is maximal
This time we will store four values for each vertex:
- the sum of the segment
- the maximum prefix sum
- the maximum suffix sum
- the sum of the maximal subsegment in it
You can find more here