patternMinor
Longest substrings of common length with the same parity
Viewed 0 times
substringsthesamewithlengthparitylongestcommon
Problem
Given two sequences $a$ and $b$, find largest $x$ such that in $a$ there is substring $A$ and substring $B$ in $b$ meeting those conditions:
Lengths of $a$ and $b$ are up to $5\times10^5$, so simple $O(n^2)$ solution won't do.
Example:
$a = [0, 1, 2, 3, 4, 5]$
$b = [3, 1, 3, 6]$
Answer: 3 (one of the possible solutions is $A = [2, 3, 4], B = [3, 1, 3]$).
I've thought about it for hours and can't find a solution. How to do this in linear or linearithmic time complexity?
I'm quite sure the problem can be simplified to for index $i$, storing only sum of elements up to $i$ modulo $2$. However, it doesn't help me much.
(The problem comes from a rather-old Israeli book תכנות תחרותי: סביב אתגרים ('Competetitive programming') by Mordechai Ben-Ari. The book isn't well known and I couldn't find any solution in Hebrew, so I translated the problem into English for a better chance of getting answer.)
Any help will be appreciated.
- length of both $A$ and $B$ is equal to $x$;
- sum of elements in $A$ has the same parity as sum of elements in $B$.
Lengths of $a$ and $b$ are up to $5\times10^5$, so simple $O(n^2)$ solution won't do.
Example:
$a = [0, 1, 2, 3, 4, 5]$
$b = [3, 1, 3, 6]$
Answer: 3 (one of the possible solutions is $A = [2, 3, 4], B = [3, 1, 3]$).
I've thought about it for hours and can't find a solution. How to do this in linear or linearithmic time complexity?
I'm quite sure the problem can be simplified to for index $i$, storing only sum of elements up to $i$ modulo $2$. However, it doesn't help me much.
(The problem comes from a rather-old Israeli book תכנות תחרותי: סביב אתגרים ('Competetitive programming') by Mordechai Ben-Ari. The book isn't well known and I couldn't find any solution in Hebrew, so I translated the problem into English for a better chance of getting answer.)
Any help will be appreciated.
Solution
Here is an $O(n\log^2 n)$ solution.
The first step is to reduce to an easier problem:
Given $x_1,\ldots,x_n \in \{0,1\}$, determine, for all $0 \leq i \leq n$ and $b \in \{0,1\}$, whether there is a contiguous subarray of length $i$ whose sum is equal to $b$ modulo 2.
Applying this to both $A$ and $B$, we can solve the original problem in $O(n)$.
The basic idea is to use divide and conquer. Given an array $x_1,\ldots,x_n$, divide it into two arrays $x_1,\ldots,x_m$ and $x_{m+1},\ldots,x_n$ of roughly equal size. Every contiguous subarray of $x_1,\ldots,x_n$ of length $i$ with parity $b$ has one of the following forms:
We can determine whether subarrays of the first two forms exist, for all $i$ and $b$, by running the procedure recursively on the two halves. The interesting part is the third form. We start by determining the parity of all arrays of the form $x_{m-i_1+1},\ldots,x_m$ and of the form $x_{m+1},\ldots,x_{m+i_2}$ for all $i_1,i_2$, which takes time $O(n)$. Let $L_{b_1}$ be the set of $i_1$ such that $x_{m-i_1+1},\ldots,x_m$ has parity $b_1$, and define $R_{b_2}$ analogously. Using FFT (exercise), for each $b_1,b_2$ we can determine in $O(n\log n)$ the set $X_{b_1b_2}$ of $i$ such that there exist $i_1 \in L_{b_1}$ and $i_2 \in R_{b_2}$ summing to $i$. Given the sets $X_{b_1b_2}$ for all $b_1,b_2$, we can determine whether subarrays of the third form exist, for all $i$ and $b$.
In total, denoting the running time of our algorithm by $T(n)$, we get the recurrence
$$ T(n) = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + O(n\log n), $$
whose solution is $T(n) = O(n\log^2 n)$.
The first step is to reduce to an easier problem:
Given $x_1,\ldots,x_n \in \{0,1\}$, determine, for all $0 \leq i \leq n$ and $b \in \{0,1\}$, whether there is a contiguous subarray of length $i$ whose sum is equal to $b$ modulo 2.
Applying this to both $A$ and $B$, we can solve the original problem in $O(n)$.
The basic idea is to use divide and conquer. Given an array $x_1,\ldots,x_n$, divide it into two arrays $x_1,\ldots,x_m$ and $x_{m+1},\ldots,x_n$ of roughly equal size. Every contiguous subarray of $x_1,\ldots,x_n$ of length $i$ with parity $b$ has one of the following forms:
- Contiguous subarray of $x_1,\ldots,x_m$ of length $i$ with parity $b$.
- Contiguous subarray of $x_{m+1},\ldots,x_n$ of length $i$ with parity $b$.
- The concatenation of a subarray $x_{m-i_1+1},\ldots,x_m$ of parity $b_1$ with a subarray $x_{m+1},\ldots,x_{m+i_2}$ of parity $b_2$, where $i_1+i_2 = i$ and $b_1+b_2 \equiv b \pmod{2}$.
We can determine whether subarrays of the first two forms exist, for all $i$ and $b$, by running the procedure recursively on the two halves. The interesting part is the third form. We start by determining the parity of all arrays of the form $x_{m-i_1+1},\ldots,x_m$ and of the form $x_{m+1},\ldots,x_{m+i_2}$ for all $i_1,i_2$, which takes time $O(n)$. Let $L_{b_1}$ be the set of $i_1$ such that $x_{m-i_1+1},\ldots,x_m$ has parity $b_1$, and define $R_{b_2}$ analogously. Using FFT (exercise), for each $b_1,b_2$ we can determine in $O(n\log n)$ the set $X_{b_1b_2}$ of $i$ such that there exist $i_1 \in L_{b_1}$ and $i_2 \in R_{b_2}$ summing to $i$. Given the sets $X_{b_1b_2}$ for all $b_1,b_2$, we can determine whether subarrays of the third form exist, for all $i$ and $b$.
In total, denoting the running time of our algorithm by $T(n)$, we get the recurrence
$$ T(n) = T(\lfloor n/2 \rfloor) + T(\lceil n/2 \rceil) + O(n\log n), $$
whose solution is $T(n) = O(n\log^2 n)$.
Context
StackExchange Computer Science Q#99778, answer score: 2
Revisions (0)
No revisions yet.