HiveBrain v1.2.0
Get Started
← Back to all entries
patternMajor

Algorithm for merging arrays while keeping their order

Submitted by: @import:stackexchange-cs··
0
Viewed 0 times
keepingarrayswhileordermergingalgorithmfortheir

Problem

My question is generalizable to arrays of any type but I'll use strings to keep it short.

We take a couple of strings as an input. Let's denote them ${S_1,...S_n}$. (Ex: "ABEF" and "CDE"). The output $U$ must fulfill the following conditions:

  • The output must contain the entire "vocabulary" of all of the inputs and not more: $x\in U \Leftrightarrow \exists i : x\in S_i$



  • Each element in the output appears only once in the output.



  • The elements in the output keep their relative ordering from their original strings. So if 'A' comes before 'B' in the string $S_i$, then 'A' must come before 'B' in the output $U$.



The problem can have multiple solutions or no solutions. I'm fine with finding just one of them. Some examples might be:

inputs
possible solutions

ABEF, CDE
ABCDEF, CDABEF, CABDEF...

ABEF, CDE, BC
ABCDEF (only solution)

ABEF, CDE, CB
CDABEF, CADBEF, ...

I found this question with a similar algorithm. However, their algorithm allows repeated elements in their output.

Is there a name for this algorithm? What's the time complexity? Does anyone know an implementation in a Python library? Thanks!

Clarifications:

  • the inputs do not have an ordering. The ordering is given by the inputs themselves. That's why "CDABEF" is a valid solution in the first example.



  • If the problem does not have a solution, throw an exception.

Solution

You can easily solve your problem by creating a directed graph $G=(V,E)$ in which $V$ is the set of all characters that appear in some input string and $E$ contains the edge $(u,v)$ if character $u$ is immediately followed by character $v$ in some string.

This graph has $O(n)$ edges where $n$ is the total length of the input (i.e., the sum of the strings' lengths) and can be built in $O( \min\{n+|\Sigma|, n \log n\} )$ worst-case time, where $\Sigma$ is your alphabet (if the alphabet has constant size then this is just $O(n)$) or $O(n)$ expected time.

Then your problem admits a solution if and only if $G$ is a directed acyclic graph (DAG). This can be tested in $O(n)$ time. If $G$ is indeed a DAG, any topological ordering of the vertices in $G$ is a valid solution to your problem. A topological ordering of $G$ can be found in $O(n)$ time.

Context

StackExchange Computer Science Q#152572, answer score: 23

Revisions (0)

No revisions yet.