patternpythonMinor
Find binary sequence in NumPy binary array
Viewed 0 times
numpyarraysequencebinaryfind
Problem
As part of creating a software defined receiver and decoder, I want to find sync pulses in weather satellite data. The data is first digitized to a 0-1 array.
The code below is working. The convolution method is sufficiently fast. The slow method is not.
I was wondering if there was a more efficient or "Pythonic" way to do it.
Here is a complete test framework:
testPulse.py
```
import numpy as np
NOAAsyncA=np.concatenate( ([4[0],7[1,1,0,0],8*[0]]) )
def getFixedTestData():
data = np.zeros(1000)
syncL = len(NOAAsyncA)
data[12:(12+syncL)]=NOAAsyncA
data[257:(257+syncL)]=NOAAsyncA
data[900:(900+syncL)]=NOAAsyncA
return ([12,257,900], data)
def getRandomTestData():
# create random 1 Megabit data, insert 10 NOAA sync pulses in data
# return correct pulse start locations, data
global NOAAsyncA
data = np.random.randint(2, size=1000000)
# make an ascending list of random spots to insert the pulse
insertAidx = np.sort(np.random.randint(1000000, size=10))
# insert the pulses
for l in insertAidx:
data[l:(l+len(NOAAsyncA))]=NOAAsyncA
# return the pulse locations and the data
return (insertAidx, data)
def findPulseSlow(data, pulse):
# find indexes of occurrences of pulse in data
# data and pulse are numpy arrays of 0-1 elements
# tolerate up to 3 incorrect values
# takes 2 minutes per Mb on AMD FX8150
maxerr=3
return np.fromiter( \
( i for i in range(0,len(data)-len(pulse)) \
if sum(np.logical_xor(data[i:(i+len(pulse))], pulse))=required)[0]
def findPulseCon
The code below is working. The convolution method is sufficiently fast. The slow method is not.
I was wondering if there was a more efficient or "Pythonic" way to do it.
def findPulseSlow(data, pulse):
# find indexes of occurrences of pulse in data
# data and pulse are numpy arrays of 0-1 elements
# tolerate up to 3 incorrect values
# takes 2 minutes per Mb on AMD FX8150
maxerr=3
return np.fromiter( \
( i for i in range(0,len(data)-len(pulse)) \
if sum(np.logical_xor(data[i:(i+len(pulse))], pulse))=required)[0]Here is a complete test framework:
testPulse.py
```
import numpy as np
NOAAsyncA=np.concatenate( ([4[0],7[1,1,0,0],8*[0]]) )
def getFixedTestData():
data = np.zeros(1000)
syncL = len(NOAAsyncA)
data[12:(12+syncL)]=NOAAsyncA
data[257:(257+syncL)]=NOAAsyncA
data[900:(900+syncL)]=NOAAsyncA
return ([12,257,900], data)
def getRandomTestData():
# create random 1 Megabit data, insert 10 NOAA sync pulses in data
# return correct pulse start locations, data
global NOAAsyncA
data = np.random.randint(2, size=1000000)
# make an ascending list of random spots to insert the pulse
insertAidx = np.sort(np.random.randint(1000000, size=10))
# insert the pulses
for l in insertAidx:
data[l:(l+len(NOAAsyncA))]=NOAAsyncA
# return the pulse locations and the data
return (insertAidx, data)
def findPulseSlow(data, pulse):
# find indexes of occurrences of pulse in data
# data and pulse are numpy arrays of 0-1 elements
# tolerate up to 3 incorrect values
# takes 2 minutes per Mb on AMD FX8150
maxerr=3
return np.fromiter( \
( i for i in range(0,len(data)-len(pulse)) \
if sum(np.logical_xor(data[i:(i+len(pulse))], pulse))=required)[0]
def findPulseCon
Solution
Before I get to your specific question, here are some general comments:
-
Since you are working with binary data, you should tell NumPy that your
-
The python convention for naming functions is
-
The right approach for optimizing your currently too-slow function is to use line profiling. I really like the
-
Doing #3 will mean unpacking that formidable iterator comprehension you wrote! Instead of squeezing it all on one line, I'd define a generator function for the iterator, and try to be extremely explicit so that every line has only one function call, like this:
I was probably overly expressive in the function I made -- it could be cleaned up by avoiding assigment to so many temporary variables but when doing line profiling this style really helps isolate the slow steps.
-
Watch out for places where you are inadvertently causing interconversion of NumPy data types and native Python data types. For example,
-
I thought the
-
I did a few timing tests with my revised function and it was at least ~20-30x faster than your
-
Since you are working with binary data, you should tell NumPy that your
dtype is bool everywhere that you can. That will lower memory requirements and speed up certain operations. For example, data = np.zeros(1000) in getFixedTestData() should become data = np.zeros(1000, dtype=bool). The default NumPy dtype is float which is not what you want.-
The python convention for naming functions is
snake_case i.e. all lowercase with underscores, not camelCase. So findPulseSlow should be find_pulse_slow etc.-
The right approach for optimizing your currently too-slow function is to use line profiling. I really like the
line_profiler module. If you use IPython you can use it very easily as an inline magic function.-
Doing #3 will mean unpacking that formidable iterator comprehension you wrote! Instead of squeezing it all on one line, I'd define a generator function for the iterator, and try to be extremely explicit so that every line has only one function call, like this:
def yield_convolve(arr_1, arr_2):
"""Yields indices of arr_1 where arr_2 nearly matches."""
MAX_ERR = 3
len_1, len_2 = len(arr_1), len(arr_2) # I assume arr_2 is always shorter than arr_1
min_matches = len_2 - MAX_ERR
num_windows = len_1 - len_2
for idx in xrange(num_windows):
arr_1_window = arr_1[idx:(idx+len_2)]
summand = np.equal(arr_1_window, arr_2)
sum_ = np.count_nonzero(summand)
condition = sum_ >= min_matches
if condition:
yield idx
def find_pulse_slow_revised(arr_1, arr_2):
return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)I was probably overly expressive in the function I made -- it could be cleaned up by avoiding assigment to so many temporary variables but when doing line profiling this style really helps isolate the slow steps.
-
Watch out for places where you are inadvertently causing interconversion of NumPy data types and native Python data types. For example,
sum(np.logical_xor(...) should be np.sum(np.logical_xor(...)-
I thought the
np.logical_xor part you did was clever, but to my eyes at least it was a bit easier to grok using len(pulse) - maxerr as a criterion and using np.equal instead of np.logical_xor, so that is what I did in my example above.-
I did a few timing tests with my revised function and it was at least ~20-30x faster than your
findPulseSlow(). Using np.sum instead of sum was responsible for most of the speedup. Using np.count_nonzero instead of np.sum gave an additional 2× or so. Here's an example of some code from my IPython notebook, along with some output from the line profiler._, fixed_dat = getFixedTestData()
%timeit findPulseSlow(fixed_dat, NOAAsyncA)
%timeit find_pulse_slow_revised(fixed_dat, NOAAsyncA)
%lprun -f find_pulse_slow_revised -f yield_convolve find_pulse_slow_revised(fixed_dat, NOAAsyncA)
10 loops, best of 3: 54 ms per loop
1000 loops, best of 3: 1.43 ms per loopTimer unit: 1e-06 s
Total time: 0.004039 s
File:
Function: yield_convolve at line 1
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1 def yield_convolve(arr_1, arr_2):
2 """Yields indices of arr_1 where arr_2 nearly matches."""
3 1 1 1.0 0.0 MAX_ERR = 3
4 1 1 1.0 0.0 len_1, len_2 = len(arr_1), len(arr_2) # I assume arr_2 is always shorter than arr_1
5 1 0 0.0 0.0 min_matches = len_2 - MAX_ERR
6 1 0 0.0 0.0 num_windows = len_1 - len_2
7 961 407 0.4 10.1 for idx in xrange(num_windows):
8 960 579 0.6 14.3 arr_1_window = arr_1[idx:(idx+len_2)]
9 960 1511 1.6 37.4 summand = np.equal(arr_1_window, arr_2)
10 960 642 0.7 15.9 sum_ = np.count_nonzero(summand)
11 960 476 0.5 11.8 condition = sum_ >= min_matches
12 960 422 0.4 10.4 if condition:
13 3 0 0.0 0.0 yield idx
Total time: 0.006642 s
File:
Function: find_pulse_slow_revised at line 15
Line # Hits Time Per Hit % Time Line Contents
==============================================================
15 def find_pulse_slow_revised(arr_1, arr_2):
16 1 6642 6642.0 100.0 return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)- Using the built-in
np.convolvesolution is definitely the way to go and is still way faster than even the
Code Snippets
def yield_convolve(arr_1, arr_2):
"""Yields indices of arr_1 where arr_2 nearly matches."""
MAX_ERR = 3
len_1, len_2 = len(arr_1), len(arr_2) # I assume arr_2 is always shorter than arr_1
min_matches = len_2 - MAX_ERR
num_windows = len_1 - len_2
for idx in xrange(num_windows):
arr_1_window = arr_1[idx:(idx+len_2)]
summand = np.equal(arr_1_window, arr_2)
sum_ = np.count_nonzero(summand)
condition = sum_ >= min_matches
if condition:
yield idx
def find_pulse_slow_revised(arr_1, arr_2):
return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)_, fixed_dat = getFixedTestData()
%timeit findPulseSlow(fixed_dat, NOAAsyncA)
%timeit find_pulse_slow_revised(fixed_dat, NOAAsyncA)
%lprun -f find_pulse_slow_revised -f yield_convolve find_pulse_slow_revised(fixed_dat, NOAAsyncA)
10 loops, best of 3: 54 ms per loop
1000 loops, best of 3: 1.43 ms per loopTimer unit: 1e-06 s
Total time: 0.004039 s
File: <ipython-input-95-5e22e2fbabc9>
Function: yield_convolve at line 1
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1 def yield_convolve(arr_1, arr_2):
2 """Yields indices of arr_1 where arr_2 nearly matches."""
3 1 1 1.0 0.0 MAX_ERR = 3
4 1 1 1.0 0.0 len_1, len_2 = len(arr_1), len(arr_2) # I assume arr_2 is always shorter than arr_1
5 1 0 0.0 0.0 min_matches = len_2 - MAX_ERR
6 1 0 0.0 0.0 num_windows = len_1 - len_2
7 961 407 0.4 10.1 for idx in xrange(num_windows):
8 960 579 0.6 14.3 arr_1_window = arr_1[idx:(idx+len_2)]
9 960 1511 1.6 37.4 summand = np.equal(arr_1_window, arr_2)
10 960 642 0.7 15.9 sum_ = np.count_nonzero(summand)
11 960 476 0.5 11.8 condition = sum_ >= min_matches
12 960 422 0.4 10.4 if condition:
13 3 0 0.0 0.0 yield idx
Total time: 0.006642 s
File: <ipython-input-95-5e22e2fbabc9>
Function: find_pulse_slow_revised at line 15
Line # Hits Time Per Hit % Time Line Contents
==============================================================
15 def find_pulse_slow_revised(arr_1, arr_2):
16 1 6642 6642.0 100.0 return np.fromiter(yield_convolve(arr_1, arr_2), dtype=int)Context
StackExchange Code Review Q#64817, answer score: 3
Revisions (0)
No revisions yet.