patternMinor
Prefix Sums in Mutable 2D Array
Viewed 0 times
prefixmutablearraysums
Problem
Suppose I have a 2D array
$$
\sum_{i = 0}^{k-1} \sum_{j = 0}^{l-1} M[i][j]?
$$
Of course, all these values can be computed in $\mathcal O(n^2)$ time total, and after that queries take $\mathcal O(1)$. However, my array is mutable, and each time I change a value, the obvious solution requires a $\mathcal O(n^2)$ update.
We can create a quad tree over
My question is:
Can we improve significantly on the queries without sacrificing too much on the updates?
I am especially interested in getting both the update and query operations sub-linear, and in particular getting them both to $\mathcal O(n^\epsilon)$.
Edit: for some more information, although I think the problem is interesting even without this further restriction, I expect to do roughly $\mathcal O(n^3)$ queries, and about $\mathcal O(n^2)$ updates. The ideal goal is to get the full runtime down to about $\mathcal O(n^{3+\epsilon})$. Thus, a situation where an update takes $\mathcal O(n \log(n))$ while a query takes $\mathcal O(\log(n))$ would also be interesting to me.
M[n][n] of integers (in fact, binary is fine, but I doubt it matters). I am interested in repeated queries of the form: given a coordinate pair $k,l$, what is$$
\sum_{i = 0}^{k-1} \sum_{j = 0}^{l-1} M[i][j]?
$$
Of course, all these values can be computed in $\mathcal O(n^2)$ time total, and after that queries take $\mathcal O(1)$. However, my array is mutable, and each time I change a value, the obvious solution requires a $\mathcal O(n^2)$ update.
We can create a quad tree over
M; the preprocessing takes $\mathcal O(n^2\log(n))$, and this allows us to do queries in $\mathcal O(n\log(n))$, and updates in $\mathcal O(\log(n))$.My question is:
Can we improve significantly on the queries without sacrificing too much on the updates?
I am especially interested in getting both the update and query operations sub-linear, and in particular getting them both to $\mathcal O(n^\epsilon)$.
Edit: for some more information, although I think the problem is interesting even without this further restriction, I expect to do roughly $\mathcal O(n^3)$ queries, and about $\mathcal O(n^2)$ updates. The ideal goal is to get the full runtime down to about $\mathcal O(n^{3+\epsilon})$. Thus, a situation where an update takes $\mathcal O(n \log(n))$ while a query takes $\mathcal O(\log(n))$ would also be interesting to me.
Solution
There is a relatively straightforward solution where each query and each update can be done in $O(\log^2 n)$ time. The data structure uses $O(n^2)$ space.
We will have $\lg n$ "granularities" of data structure, one for each power of two $2^m$ such that $1 \le 2^m \le n$. The data structure for granularity $2^m$ stores the sums
$$\sum_{i=k_0 \cdot 2^m}^{(k_0+1) \cdot 2^m -1} \sum_{j=0}^{l-1} M[i,j]$$
for each $k_0,l$. This data structure for granularity $2^m$ can in turn be represented using $n/2^m$ balanced trees (one for each possible value of $k_0$) to store prefix sums.
Now to look up the prefix sum for $k,l$, we break up the interval $[0,k-1]$ into a union of intervals of power-of-two length; at most $\lg n$ intervals are needed. For each such interval of length $2^m$, we do a lookup into the data structure of granularity $2^m$. Thus queries can be answered by doing $O(\log n)$ lookups into a balanced tree, each of which takes $O(\log n)$ time, for a total time of $O(\log^2 n)$ per query.
Updates can also be done in $O(\log^2 n)$ time. To update $M[i,j]$, for each granularity $2^m$, you update the appropriate balanced tree in the data structure of granularity $2^m$. This is $O(\log n)$ updates oin $O(\log n)$ balanced trees; each such takes $O(\log n)$ time, so the total time is $O(\log^2 n)$ time.
Finally, the data structure of granularity $2^m$ contains $n/2^m$ trees, each taking up $O(n)$ space, so the total space usage is $O(n^2 \cdot (1 + 1/2 + 1/4 + \cdots)) = O(n^2)$.
We will have $\lg n$ "granularities" of data structure, one for each power of two $2^m$ such that $1 \le 2^m \le n$. The data structure for granularity $2^m$ stores the sums
$$\sum_{i=k_0 \cdot 2^m}^{(k_0+1) \cdot 2^m -1} \sum_{j=0}^{l-1} M[i,j]$$
for each $k_0,l$. This data structure for granularity $2^m$ can in turn be represented using $n/2^m$ balanced trees (one for each possible value of $k_0$) to store prefix sums.
Now to look up the prefix sum for $k,l$, we break up the interval $[0,k-1]$ into a union of intervals of power-of-two length; at most $\lg n$ intervals are needed. For each such interval of length $2^m$, we do a lookup into the data structure of granularity $2^m$. Thus queries can be answered by doing $O(\log n)$ lookups into a balanced tree, each of which takes $O(\log n)$ time, for a total time of $O(\log^2 n)$ per query.
Updates can also be done in $O(\log^2 n)$ time. To update $M[i,j]$, for each granularity $2^m$, you update the appropriate balanced tree in the data structure of granularity $2^m$. This is $O(\log n)$ updates oin $O(\log n)$ balanced trees; each such takes $O(\log n)$ time, so the total time is $O(\log^2 n)$ time.
Finally, the data structure of granularity $2^m$ contains $n/2^m$ trees, each taking up $O(n)$ space, so the total space usage is $O(n^2 \cdot (1 + 1/2 + 1/4 + \cdots)) = O(n^2)$.
Context
StackExchange Computer Science Q#88759, answer score: 4
Revisions (0)
No revisions yet.