patternpythonMinor

Find binary sequence in NumPy binary array

Submitted by: @import:stackexchange-codereview·Mar 10, 2026·

Viewed 0 times

numpyarraysequencebinaryfind

Problem

As part of creating a software defined receiver and decoder, I want to find sync pulses in weather satellite data. The data is first digitized to a 0-1 array.

The code below is working. The convolution method is sufficiently fast. The slow method is not.

I was wondering if there was a more efficient or "Pythonic" way to do it.

def findPulseSlow(data, pulse):
    # find indexes of occurrences of pulse in data
    # data and pulse are numpy arrays of 0-1 elements
    # tolerate up to 3 incorrect values
    # takes 2 minutes per Mb on AMD FX8150
    maxerr=3
    return np.fromiter( \
        ( i for i in range(0,len(data)-len(pulse)) \
            if sum(np.logical_xor(data[i:(i+len(pulse))], pulse))=required)[0]

Here is a complete test framework:

testPulse.py

```
import numpy as np

NOAAsyncA=np.concatenate( ([4[0],7[1,1,0,0],8*[0]]) )

def getFixedTestData():
data = np.zeros(1000)
syncL = len(NOAAsyncA)
data[12:(12+syncL)]=NOAAsyncA
data[257:(257+syncL)]=NOAAsyncA
data[900:(900+syncL)]=NOAAsyncA
return ([12,257,900], data)

def getRandomTestData():
# create random 1 Megabit data, insert 10 NOAA sync pulses in data
# return correct pulse start locations, data
global NOAAsyncA
data = np.random.randint(2, size=1000000)
# make an ascending list of random spots to insert the pulse
insertAidx = np.sort(np.random.randint(1000000, size=10))
# insert the pulses
for l in insertAidx:
data[l:(l+len(NOAAsyncA))]=NOAAsyncA
# return the pulse locations and the data
return (insertAidx, data)

def findPulseSlow(data, pulse):
# find indexes of occurrences of pulse in data
# data and pulse are numpy arrays of 0-1 elements
# tolerate up to 3 incorrect values
# takes 2 minutes per Mb on AMD FX8150
maxerr=3
return np.fromiter( \
( i for i in range(0,len(data)-len(pulse)) \
if sum(np.logical_xor(data[i:(i+len(pulse))], pulse))=required)[0]

def findPulseCon

Solution

Before I get to your specific question, here are some general comments:

-
Since you are working with binary data, you should tell NumPy that your dtype is bool everywhere that you can. That will lower memory requirements and speed up certain operations. For example, data = np.zeros(1000) in getFixedTestData() should become data = np.zeros(1000, dtype=bool). The default NumPy dtype is float which is not what you want.

-
The python convention for naming functions is snake_case i.e. all lowercase with underscores, not camelCase. So findPulseSlow should be find_pulse_slow etc.

-
The right approach for optimizing your currently too-slow function is to use line profiling. I really like the line_profiler module. If you use IPython you can use it very easily as an inline magic function.

-
Doing #3 will mean unpacking that formidable iterator comprehension you wrote! Instead of squeezing it all on one line, I'd define a generator function for the iterator, and try to be extremely explicit so that every line has only one function call, like this:

def yield_convolve(arr_1, arr_2):
    """Yields indices of arr_1 where arr_2 nearly matches."""
    MAX_ERR = 3
    len_1, len_2 = len(arr_1), len(arr_2)  # I assume arr_2 is always shorter than arr_1
    min_matches = len_2 - MAX_ERR
    num_windows = len_1 - len_2
    for idx in xrange(num_windows):
        arr_1_window = arr_1[idx:(idx+len_2)]
        summand = np.equal(arr_1_window, arr_2)
        sum_ = np.count_nonzero(summand)
        condition = sum_ >= min_matches
        if condition:
            yield idx    

def find_pulse_slow_revised(arr_1, arr_2):
    return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)

I was probably overly expressive in the function I made -- it could be cleaned up by avoiding assigment to so many temporary variables but when doing line profiling this style really helps isolate the slow steps.

-
Watch out for places where you are inadvertently causing interconversion of NumPy data types and native Python data types. For example, sum(np.logical_xor(...) should be np.sum(np.logical_xor(...)

-
I thought the np.logical_xor part you did was clever, but to my eyes at least it was a bit easier to grok using len(pulse) - maxerr as a criterion and using np.equal instead of np.logical_xor, so that is what I did in my example above.

-
I did a few timing tests with my revised function and it was at least ~20-30x faster than your findPulseSlow(). Using np.sum instead of sum was responsible for most of the speedup. Using np.count_nonzero instead of np.sum gave an additional 2× or so. Here's an example of some code from my IPython notebook, along with some output from the line profiler.

_, fixed_dat = getFixedTestData()

%timeit findPulseSlow(fixed_dat, NOAAsyncA)

%timeit find_pulse_slow_revised(fixed_dat, NOAAsyncA)

%lprun -f find_pulse_slow_revised -f yield_convolve find_pulse_slow_revised(fixed_dat, NOAAsyncA)

10 loops, best of 3: 54 ms per loop
1000 loops, best of 3: 1.43 ms per loop

Timer unit: 1e-06 s

Total time: 0.004039 s
File: 
Function: yield_convolve at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def yield_convolve(arr_1, arr_2):
     2                                               """Yields indices of arr_1 where arr_2 nearly matches."""
     3         1            1      1.0      0.0      MAX_ERR = 3
     4         1            1      1.0      0.0      len_1, len_2 = len(arr_1), len(arr_2)  # I assume arr_2 is always shorter than arr_1
     5         1            0      0.0      0.0      min_matches = len_2 - MAX_ERR
     6         1            0      0.0      0.0      num_windows = len_1 - len_2
     7       961          407      0.4     10.1      for idx in xrange(num_windows):
     8       960          579      0.6     14.3          arr_1_window = arr_1[idx:(idx+len_2)]
     9       960         1511      1.6     37.4          summand = np.equal(arr_1_window, arr_2)
    10       960          642      0.7     15.9          sum_ = np.count_nonzero(summand)
    11       960          476      0.5     11.8          condition = sum_ >= min_matches
    12       960          422      0.4     10.4          if condition:
    13         3            0      0.0      0.0              yield idx       

Total time: 0.006642 s
File: 
Function: find_pulse_slow_revised at line 15

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    15                                           def find_pulse_slow_revised(arr_1, arr_2):
    16         1         6642   6642.0    100.0      return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)

Using the built-in np.convolve solution is definitely the way to go and is still way faster than even the

Code Snippets

def yield_convolve(arr_1, arr_2):
    """Yields indices of arr_1 where arr_2 nearly matches."""
    MAX_ERR = 3
    len_1, len_2 = len(arr_1), len(arr_2)  # I assume arr_2 is always shorter than arr_1
    min_matches = len_2 - MAX_ERR
    num_windows = len_1 - len_2
    for idx in xrange(num_windows):
        arr_1_window = arr_1[idx:(idx+len_2)]
        summand = np.equal(arr_1_window, arr_2)
        sum_ = np.count_nonzero(summand)
        condition = sum_ >= min_matches
        if condition:
            yield idx    



def find_pulse_slow_revised(arr_1, arr_2):
    return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)

_, fixed_dat = getFixedTestData()

%timeit findPulseSlow(fixed_dat, NOAAsyncA)

%timeit find_pulse_slow_revised(fixed_dat, NOAAsyncA)

%lprun -f find_pulse_slow_revised -f yield_convolve find_pulse_slow_revised(fixed_dat, NOAAsyncA)


10 loops, best of 3: 54 ms per loop
1000 loops, best of 3: 1.43 ms per loop

Timer unit: 1e-06 s

Total time: 0.004039 s
File: <ipython-input-95-5e22e2fbabc9>
Function: yield_convolve at line 1

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
     1                                           def yield_convolve(arr_1, arr_2):
     2                                               """Yields indices of arr_1 where arr_2 nearly matches."""
     3         1            1      1.0      0.0      MAX_ERR = 3
     4         1            1      1.0      0.0      len_1, len_2 = len(arr_1), len(arr_2)  # I assume arr_2 is always shorter than arr_1
     5         1            0      0.0      0.0      min_matches = len_2 - MAX_ERR
     6         1            0      0.0      0.0      num_windows = len_1 - len_2
     7       961          407      0.4     10.1      for idx in xrange(num_windows):
     8       960          579      0.6     14.3          arr_1_window = arr_1[idx:(idx+len_2)]
     9       960         1511      1.6     37.4          summand = np.equal(arr_1_window, arr_2)
    10       960          642      0.7     15.9          sum_ = np.count_nonzero(summand)
    11       960          476      0.5     11.8          condition = sum_ >= min_matches
    12       960          422      0.4     10.4          if condition:
    13         3            0      0.0      0.0              yield idx       

Total time: 0.006642 s
File: <ipython-input-95-5e22e2fbabc9>
Function: find_pulse_slow_revised at line 15

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    15                                           def find_pulse_slow_revised(arr_1, arr_2):
    16         1         6642   6642.0    100.0      return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)

Context

StackExchange Code Review Q#64817, answer score: 3

Revisions (0)

No revisions yet.