patternpythonMinor

Speedup cython dictionary counter

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

speedupdictionarycythoncounter

Problem

I wrote a simple cython script to optimize the collections.Counter of a dictionary counter and the python zip implementation (the main input is a list of tuples). Is there a way to speed it up?

%%cython --annotate
cimport cython
import numpy as np
cimport numpy as np
from collections import defaultdict

@cython.boundscheck(False)
@cython.wraparound(False)
def uniqueCounterListCython(list x not None):

    cdef:
        Py_ssize_t i,n  

    n = len(x)    
    dx = defaultdict(int)
    for i from 0 <= i < n:
        dx[x[i]] += 1
    return dx

@cython.boundscheck(False)
@cython.wraparound(False)
def zipCython(np.ndarray[long,ndim=1] x1 not None, np.ndarray[long,ndim=1] x2 not None):

    cdef:
        Py_ssize_t i,n

    n = x1.shape[0]
    l=[]
    for i from 0 <= i < n:
        l.append(((x1[i],x2[i])))
    return l

Sample input -

uniqueCounterListCython(zipCython(np.random.randint(0,3,200000),np.random.randint(0,3,200000)))

EDIT:
Found a kind of trivial way to speed things up - just merge the two functions:

@cython.boundscheck(False)
@cython.wraparound(False)
def uniqueCounterListCythonWithZip(np.ndarray[long,ndim=1] x1 not None, np.ndarray[long,ndim=1] x2 not None):

    cdef:
        Py_ssize_t i,n  

    n = x1.shape[0]    
    dx = defaultdict(int)
    for i from 0 <= i < n:
        dx[((x1[i],x2[i]))] += 1
    return dx

Any more suggestions?

Solution

You don't give us much context for this problem, so it's unclear to me exactly what you are trying to achieve. But in your example, you have a pair of NumPy arrays containing integers in the range 0–2, and you seem to want to count the number of occurrences of each pair of values.

So I suggest encoding pairs of integers in the range 0–2 into a single integer in the range 0–8, using numpy.bincount to do the counting, and then using numpy.reshape to decode the result, like this:

>>> import numpy as np
>>> x, y = np.random.randint(0,3,200000), np.random.randint(0,3,200000)
>>> counts = np.bincount(x * 3 + y).reshape((3, 3))
>>> counts
array([[22282, 22093, 22247],
       [22084, 22295, 22396],
       [22012, 22243, 22348]])

A quick check that I got the encoding/decoding right:

>>> counts[0,2] == np.count_nonzero((x == 0) & (y == 2))
True

This runs much faster than the code in your question (assuming I have interpreted your profile screenshots correctly):

>>> from timeit import timeit
>>> timeit(lambda:np.bincount(x * 3 + y).reshape((3, 3)), number=1000)
2.7519797360000666

Code Snippets

>>> import numpy as np
>>> x, y = np.random.randint(0,3,200000), np.random.randint(0,3,200000)
>>> counts = np.bincount(x * 3 + y).reshape((3, 3))
>>> counts
array([[22282, 22093, 22247],
       [22084, 22295, 22396],
       [22012, 22243, 22348]])

>>> counts[0,2] == np.count_nonzero((x == 0) & (y == 2))
True

>>> from timeit import timeit
>>> timeit(lambda:np.bincount(x * 3 + y).reshape((3, 3)), number=1000)
2.7519797360000666

Context

StackExchange Code Review Q#42846, answer score: 4

Revisions (0)

No revisions yet.