patternMinor

Sorting array of strings (with repetitions) according to a given ordering

Submitted by: @import:stackexchange-cs·Mar 10, 2026·

Viewed 0 times

sorting cs comparison algorithms stackoverflow string-matching strings

sortingarraywithorderingrepetitionsaccordingstringsgiven

Problem

We get two arrays:

ordering = ["one", "two", "three"]

and

input = ["zero", "one", "two", "two", "three", "three", "three", "four"];

We want to find the array output so that

output = ["one", "two", "two", "three", "three", "three", "zero", "four"]
// or
output = ["one", "two", "two", "three", "three", "three", "four", "zero"]

The strings (with possible repetitions) should be sorted as in the ordering array. Not found/contained strings should be put at the end of the new array and their order doesn't matter.

The $n^{2}$ solution is obvious, can we do better? The memory doesn't matter and it doesn't have to be an in-place algorithm.

Solution

The other answers provide correct solutions. However, you can do a bit better considering multivariate complexity. Note that the provided running times in the other answers are not quite specific since they ignore either the size of the first array or the lengths of the strings.

Here are 2 different methods with their specific running times. The first of which is a brief description of the method in the other answers with specific analysis of the running time. The second of which has better running time and is almost optimal.

Let us call the first (ordered) list in the input $A$ and the second (to be ordered) $S$. Let $n$ be the size of $A$ and $m$ the size of $S$. Let $l$ be the maximum length of a string in the input.

The first method.
This span the methods in the other answers. Note that in both we need to find for each string the whether it lays in the original ordered array and if so at what position exactly. We need either apply a linear search which will cost something like $O(ml)$ per element or we can build some kind of a dictionary data-structure such as search trees and hashing tables or a "trie" (prefix-tree) data structure to achieve each look-up operation in linear time in the size of the strings. Using this we can achieve a running time of $O((m+n)l)$ for finding the positions over all strings. Afterwards we need to sort all strings in $O(n \log n)$ if we assign the indices directly to the strings (using pairs) or $O(l n \log n)$ by applying the look-ups while searching. So the best running time we can achieve using the methods in the other answers would be $O(l (n+m) + n \log n)$.

The second method. Note that we did not make use of the original sorted array and that is why we had to sort the strings again. We can solve the problem more efficiently by using this extra informaiton. To achieve this, let us bild a dictionary $D$ (for the sake of the theoretical bounds let us define it as a trie but in you can implement it as a hash-table as well). Let us define a new list $L$ (a linked list) and a function $C$ that counts how many times each string appeared in the first array. Assume $D$ assigns to each string a distinct number from $1$ to $m$ and $C$ is an array of length $m$.
The algorithm goes as follow:

For all i in {1, .., m}
 Set C[i] := 0
// Now build D
Set index := 0
For each string s in A do
 Set D[s] = index
 Increase index by one
// Now find how many times each string occurs
Let L be an empty list
For each string s in S do
 If s in D then
  Find index = D[s]
  Increase C[index] by one
 Else
  Append s to L
//Now output
For each string s in S do
 Find index := D[s]
 Find c := C[index]
 while c > 0
  output s
  decrease c by one
For s in L
 output s

This algorithm runs in $O(l(n+m)$ which is linear (optimal) if the lengths of the strings are bounded by a constant.

Code Snippets

For all i in {1, .., m}
 Set C[i] := 0
// Now build D
Set index := 0
For each string s in A do
 Set D[s] = index
 Increase index by one
// Now find how many times each string occurs
Let L be an empty list
For each string s in S do
 If s in D then
  Find index = D[s]
  Increase C[index] by one
 Else
  Append s to L
//Now output
For each string s in S do
 Find index := D[s]
 Find c := C[index]
 while c > 0
  output s
  decrease c by one
For s in L
 output s

Context

StackExchange Computer Science Q#120986, answer score: 4

Revisions (0)

No revisions yet.