snippetpythonModerate

Order a list of tuples or a NumPy array in a specific format

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

formatordernumpyarrayspecificlisttuples

Problem

I have a list of tuples or a NumPy array (it can be either of them as the variable comes as a list and will end up being a NumPy array) and I want to order them in a specific way (that I am not able to phrase it).

Let's make an example:

boundary=[(1, 2), (1, 3), (3, 4), (2, 4)];

Desired output:

[(1, 2), (2, 4), (4, 3), (3, 1)]

(or a NumPy array of 2x4 will do also)

The logic behind it is to get the first row, read its last value and find that value in the array. Get that tuple, swap it to make the found value the first one (if needed) and repeat.

The idea is that ...N),(N... , N is the same. There will always be 2 and only 2 repetitions of the numbers.

I can figure out how to do it "C-like", iterating, searching, swapping etc, but I wonder if there is a pythonic way (those amazing one-liners) of achieving it. Ultimately I am just trying to increase my Python-fu.

boundary=np.array(boundary)
ordered=np.zeros([boundary.shape[0],boundary.shape[1]]);
idx=0
idy=0
for i in range(len(boundary)):
    ordered[i,0]=boundary[idx,idy]
    ordered[i,1]=boundary[idx,(idy+1)%2]
    ix,iy=np.where(boundary==boundary[idx,(idy+1)%2])
    for j in range(len(ix)):
        isused=np.where(np.all(ordered==boundary[ix[j],],axis=1))
        if isused[0].size==0:
            idx=ix[j]
            idy=iy[j]
            print idx, idy
            break

Solution

This is really a graph problem! Draw a graph with a vertex for every number in your list, and an edge between vertices \$a\$ and \$b\$ if there is a pair \$(a, b)\$ or \$(b, a)\$ in your list:

What you want to do is to find a path in this graph that visits all the edges. Such a path is known as an Eulerian path. Your condition that each number appears exactly twice in the list means that the path must return to the original vertex: that makes it an Eulerian cycle.

To find this cycle, we must first represent the graph using a suitable data structure. Here we can use the adjacency list representation (really, adjacency set):

from collections import defaultdict

def graph(edges):
    """Given a list of edges, return a map from vertex to a set of
    adjacent vertices.

    >>> graph([(1, 2), (1, 3), (3, 4), (2, 4)])
    defaultdict(, {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}})

    """
    g = defaultdict(set)
    for v, w in edges:
        g[v].add(w)
        g[w].add(v)
    return g

Then, to find the path, we can use Hierholzer's algorithm. This is really simple: we just pick a starting vertex and then repeatedly pick any unvisited edge from that vertex. Your condition that each number appears exactly twice in the list of edges means that the algorithm can't get stuck: there are exactly two edges meeting each vertex, so having gone in on one we can go out on the other:

def eulerian_path(edges, v):
    """Generate an Eulerian path over the given collection of edges,
    starting at the vertex v.

    >>> list(eulerian_path([(1, 2), (1, 3), (3, 4), (2, 4)], 1))
    [(1, 2), (2, 4), (4, 3), (3, 1)]

    """
    g = graph(edges)
    while True:
        try:
            w = g[v].pop()
        except KeyError:
            return
        g[w].remove(v)
        yield v, w
        v = w

What might happen, however, is that your data contains multiple disjoint cycles. For example if the input is:

[(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)]

then this corresponds to the graph:

The eulerian_path algorithm only finds one of the cycles. What should you do in ths case? Stop at the end of the first cycle? Keep going and find all the other cycles? Maybe this can't happen in your problem, so raise an error? You didn't explain what your problem really is, so I can't tell you what the right thing is to do in this case.

Update

If you're determined to stick to NumPy, then consider using the Compressed Sparse Graph Routines. The idea is to use scipy.sparse.csr_matrix to convert your array into a CSR matrix:

>>> edges = np.array([(1, 2), (1, 3), (3, 4), (2, 4)])
>>> m = scipy.sparse.csr_matrix((np.ones(len(edges)), (edges[:,0], edges[:,1])), shape=(5, 5))

and then call scipy.sparse.csgraph.depth_first_order to get the vertices on the Eulerian path:

>>> scipy.sparse.csgraph.depth_first_order(m, 1, False)[0]
array([1, 2, 4, 3], dtype=int32)

If you need to turn this array of vertices back into an array of edges (as in the OP), you could use numpy.vstack and numpy.roll, like this:

>>> np.vstack((vertices, np.roll(vertices, -1))).T
array([[1, 2],
       [2, 4],
       [4, 3],
       [3, 1]], dtype=int32)

Code Snippets

from collections import defaultdict

def graph(edges):
    """Given a list of edges, return a map from vertex to a set of
    adjacent vertices.

    >>> graph([(1, 2), (1, 3), (3, 4), (2, 4)])
    defaultdict(<class 'set'>, {1: {2, 3}, 2: {1, 4}, 3: {1, 4}, 4: {2, 3}})

    """
    g = defaultdict(set)
    for v, w in edges:
        g[v].add(w)
        g[w].add(v)
    return g

def eulerian_path(edges, v):
    """Generate an Eulerian path over the given collection of edges,
    starting at the vertex v.

    >>> list(eulerian_path([(1, 2), (1, 3), (3, 4), (2, 4)], 1))
    [(1, 2), (2, 4), (4, 3), (3, 1)]

    """
    g = graph(edges)
    while True:
        try:
            w = g[v].pop()
        except KeyError:
            return
        g[w].remove(v)
        yield v, w
        v = w

[(1, 2), (2, 3), (3, 1), (4, 5), (5, 6), (6, 4)]

>>> edges = np.array([(1, 2), (1, 3), (3, 4), (2, 4)])
>>> m = scipy.sparse.csr_matrix((np.ones(len(edges)), (edges[:,0], edges[:,1])), shape=(5, 5))

>>> scipy.sparse.csgraph.depth_first_order(m, 1, False)[0]
array([1, 2, 4, 3], dtype=int32)

Context

StackExchange Code Review Q#95598, answer score: 11

Revisions (0)

No revisions yet.