patternMinor

Use dynamic programming to merge two arrays such that the number of repetitions of the same element is minimised

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

bioinformatics cs stackoverflow algorithms subsequences dynamic-programming

suchsamearraysthenumbermergeprogrammingrepetitionstwothat

Problem

Let's say we have two arrays m and n containing the characters from the set a, b, c , d, e. Assume each character in the set has a cost associated with it, consider the costs to be a=1, b=3, c=4, d=5, e=7.

for example

m = ['a', 'b', 'c', 'd', 'd', 'e', 'a']
n = ['b', 'b', 'b', 'a', 'c', 'e', 'd']

Suppose we would like to merge m and n to form a larger array s.

An example of s array could be

s = ['a', 'b', 'c', 'd', 'd', 'e', 'a', 'b', 'b', 'b', 'a', 'c', 'e', 'd']

s = ['b', 'a', 'd', 'd', 'd', 'b', 'e', 'c', 'b', 'a', 'b', 'a', 'c', 'e']

If there are two or more identical characters adjacent to eachother a penalty is applied which is equal to: number of adjacent characters of the same type the cost for that character. Consider the second example for s above which contains a sub-array ['d', 'd', 'd']. In this case a penalty of 35 will be applied because the cost associated with d is 5 and the number of repetitions of d is 3.

I need to design a dynamic programming algorithm which minimises the cost associated with s.

Does anyone have any resources, papers, or algorithms they could share to help point me in the right direction?

Solution

Here is an algorithm that computes the minimum cost that is about as simple as possible and as fast as possible.

Count the total number of each character in $m$ and $n$. Let them be $c(a), c(b), \cdots$ respectively. Let $x$ be one of $a,b,\cdots$ such that $c(x)$ is the maximum.

Let $\sigma$ be the sum of all $c(i)$ where $i$ goes through $a, b, \cdots$ except $x$.

If $c(x)\le 1 + \sigma$, return 0.

Else return $p(x)(c(x) -\sigma))$, where $p(x)$ is the penalty associated with $x$.

The time-complexity of the algorithm is $O(\ell)$, where $\ell$ is the sum of the length of $m$ and the length of $n$.

Since the algorithm above is simple and clear, there is no need to apply the heavy machinery of dynamic programming.

Exercises

Here are two exercises that prove the correctness of the algorithm above.

Exercise 1. If $s(x)\le 1 + \sigma$, design a procedure to produce a merged array that does not have adjacent pairs of the same letter.

Exercise 2. If $s(x)\gt 1 + \sigma$, then any merged array must have at least $s(x) -\sigma$ $x$s each of which is adjacent to another $x$. Design a procedure to produce a merged array that has no other penalties except for $s(x) -\sigma$ $x$s.

Context

StackExchange Computer Science Q#106982, answer score: 3

Revisions (0)

No revisions yet.