Python Octree Implementation
implementation, octree, python
Problem
I'm working with 3D point clouds stored in NumPy arrays. I've successfully used SciPy's KDTree implementation for tasks like k-neighbors search and outlier filtering.
However, I wanted to try octrees as an alternative data structure for other tasks, such as downsampling.
I was looking for an octree implementation in Python but couldn't find what I needed, so I tried to implement a solution myself.
The solution I've written so far works, but it scales badly when it has to handle "large" point clouds (more than 3 million points) and as the number of subdivisions goes up.
The first thing I used the octree for was replacing the points contained in each node with their centroid, in order to reduce the density of the point cloud.
I thought the fastest way to achieve that was to assign each point an index indicating which node it belongs to.
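To make the goal concrete, here is a minimal sketch of the centroid step, assuming each point already carries a node index. The function name `centroids_by_node` and the sample `node_index` values are hypothetical and only serve the illustration; this is not the code from the question.

```python
import numpy as np

def centroids_by_node(xyz, node_index):
    """Average all points that share the same node index (illustrative helper)."""
    # inverse[i] is the row in `nodes` that point i belongs to
    nodes, inverse, counts = np.unique(
        node_index, return_inverse=True, return_counts=True
    )
    sums = np.zeros((len(nodes), xyz.shape[1]))
    # Unbuffered accumulation: repeated node rows are summed correctly
    np.add.at(sums, inverse, xyz)
    return nodes, sums / counts[:, None]

# Hypothetical data: two points in node 7, one point in node 3
xyz = np.array([[0.0, 0.0, 0.0],
                [0.2, 0.2, 0.2],
                [1.0, 1.0, 1.0]])
node_index = np.array([7, 7, 3])
nodes, centroids = centroids_by_node(xyz, node_index)
```

Each row of `centroids` replaces all the points of the corresponding entry in `nodes`, which is what reduces the density of the cloud.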
PART 1
Starting from a point cloud (I'll use a really small one for simplicity) stored in a NumPy array:
```
xyz
array([[ 0.   ,  0.   ,  0.   ],
       [ 0.125,  0.125,  0.125],
       [ 0.25 ,  0.25 ,  0.25 ],
       [ 0.375,  0.375,  0.375],
       [ 0.5  ,  0.5  ,  0.5  ],
       [ 0.625,  0.625,  0.625],
       [ 0.75 ,  0.75 ,  0.75 ],
       [ 0.875,  0.875,  0.875],
       [ 1.   ,  1.   ,  1.   ]])
```
where each row represents one point and each column the corresponding x, y and z coordinates.
I get the minimum bounding box of the point cloud as follows:
```
xyzmin = np.min(xyz, axis=0)
xyzmax = np.max(xyz, axis=0)
```
And use it to obtain the first 8 nodes of the octree, using the following function:
SPLIT FUNCTION
```
def split(xyzmin, xyzmax):
    """ Splits the node defined by xyzmin and xyzmax into 8 sub-nodes
    Parameters
    ----------
    xyzmin: (3,) ndarray
        The x,y,z minimum coordinates that delimit the node
    xyzmax: (3,) ndarray
        The x,y,z maximum coordinates that delimit the node
    Returns
    -------
```
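The docstring above describes splitting a node's bounding box into 8 sub-nodes. A minimal sketch of that idea follows, assuming the children are obtained by halving the box at its midpoint along each axis. The name `split_octants` and its return layout are assumptions made for this illustration, not the function from the question.

```python
import numpy as np

def split_octants(xyzmin, xyzmax):
    """Return two (8, 3) arrays: min and max corners of the 8 children."""
    xyzmin = np.asarray(xyzmin, dtype=float)
    xyzmax = np.asarray(xyzmax, dtype=float)
    mid = (xyzmin + xyzmax) / 2.0
    # Row i holds three bits: 0 selects the lower half of an axis, 1 the upper half
    bits = np.array([[(i >> 2) & 1, (i >> 1) & 1, i & 1] for i in range(8)])
    child_min = np.where(bits == 0, xyzmin, mid)
    child_max = np.where(bits == 0, mid, xyzmax)
    return child_min, child_max

child_min, child_max = split_octants([0, 0, 0], [1, 1, 1])
```

Applying the same split to each child again gives the next subdivision level, which is where the cost grows quickly as the number of levels goes up.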
Solution
I have reframed the problem a little and found a way to optimize it, at the cost of only getting the information for one level of subdivision.
By this I mean that the method below obtains the same result as the method posted in the question (it assigns one unique index to all the points inside a node), but instead of concatenating the indexes obtained along all the subdivisions down to the last one, the indexes are based on a single level of subdivision (what would be the last level in the previous method).
So, starting again from a point cloud, I get the minimum bounding box as before:
```
xyzmin = np.min(xyz, axis=0)
xyzmax = np.max(xyz, axis=0)
```
But now I also instantiate a variable n, which represents the subdivision level at which I want to locate the points. In the question above, this variable corresponds to the X in the first line of the while loop, `while n < X`:
```
n = 4
```
And I find the number of segment boundaries along each coordinate axis at that level (one more than the number of segments):
```
n = (2 ** n) + 1
```
After that I use that number to create 3 linspaces (one for each axis):
```
x = np.linspace(xyzmin[0], xyzmax[0], n)
y = np.linspace(xyzmin[1], xyzmax[1], n)
z = np.linspace(xyzmin[2], xyzmax[2], n)
```
I create an empty container for the future indexes:
```
idx = np.zeros_like(xyz, dtype=int)
```
And use this function to fill the container:
```
from numba import jit

@jit(nopython=True)
def fill_idx(xyz, x, y, z, empty_index):
    # x, y and z hold segment edges, so each axis has len(edges) - 1 segments
    for i in range(len(xyz)):
        for j in range(len(x) - 1):
            if xyz[i,0] >= x[j] and xyz[i,0] <= x[j+1]:
                for k in range(len(y) - 1):
                    if xyz[i,1] >= y[k] and xyz[i,1] <= y[k+1]:
                        for l in range(len(z) - 1):
                            if xyz[i,2] >= z[l] and xyz[i,2] <= z[l+1]:
                                # Store 1-based segment numbers for this point
                                empty_index[i,0] = j+1
                                empty_index[i,1] = k+1
                                empty_index[i,2] = l+1
                                break
                        break
                break
```
What the function does is pretty intuitive: it loops over all the points in the point cloud, looking for the segment of the x axis where the point lies; once that segment is found, it starts a loop over the y axis, and repeats after that with the z axis.
After that function we have an (N,3) array where each column indicates the segment of the corresponding axis where the point lies. By concatenating the columns with this function:
```
def merge_idx(idx):
    # Append the decimal digits of the y segment to the x segment
    a = idx[:,0] * (10 ** (np.log10(idx[:,1]).astype(int) + 1)) + idx[:,1]
    # Then append the decimal digits of the z segment
    b = a * (10 ** (np.log10(idx[:,2]).astype(int) + 1)) + idx[:,2]
    return b
```
We obtain an (N,) array with unique index values that allow us to group the points that lie in the same node.
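For comparison, the two steps above (finding the segment on each axis, then merging the three columns) can also be sketched with vectorized NumPy calls. This swaps in `np.searchsorted` for the explicit loops and `np.ravel_multi_index` for the decimal concatenation, so the resulting node ids differ in value from those of `merge_idx`; the function name `single_level_index` is hypothetical, and this is an alternative sketch rather than the method timed below.

```python
import numpy as np

def single_level_index(xyz, level):
    """Per-axis segment numbers (1-based) and one flat node id per point."""
    xyzmin = xyz.min(axis=0)
    xyzmax = xyz.max(axis=0)
    n_edges = 2 ** level + 1
    n_segments = n_edges - 1
    segments = np.empty(xyz.shape, dtype=np.int64)
    for axis in range(3):
        edges = np.linspace(xyzmin[axis], xyzmax[axis], n_edges)
        # side="left" assigns a point lying exactly on an edge to the lower segment
        pos = np.searchsorted(edges, xyz[:, axis], side="left")
        # The minimum coordinate maps to position 0, so clip it into segment 1
        segments[:, axis] = np.clip(pos, 1, n_segments)
    # Combine the three 0-based segment numbers into a single integer per point
    flat = np.ravel_multi_index(tuple((segments - 1).T), (n_segments,) * 3)
    return segments, flat

# Same diagonal point cloud as in the question
xyz = np.repeat(np.linspace(0.0, 1.0, 9)[:, None], 3, axis=1)
segments, flat = single_level_index(xyz, level=5)
```

One reason to consider a flat id of this kind is that it stays unambiguous for any segment count, whereas concatenated decimal digits can collide when segment numbers have different lengths (for example 1, 11, 1 and 11, 1, 1).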
TIMINGS
On a 3 million point cloud and a subdivision level of 4:
```
%%timeit
build(xyz)
1 loop, best of 3: 26.4 s per loop
%%timeit
build2(xyz, 4)
1 loop, best of 3: 477 ms per loop
```
The timing improvement is huge, and since my use of the indexes did not depend on the information from previous subdivisions, the new method serves just as well.
Here are the results for the same point cloud used in the question:
```
index
Out[6]: [11111, 11177, 11777, 17177, 17777, 71177, 71777, 77177, 77777]
index2
Out[7]:
array([   111,    444,    888, 121212, 161616, 202020, 242424, 282828, 323232])
```
I could even go further and say that the indexes of the new method are a little more intuitive about where each point lies.
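As an illustration of why single-level indexes are easy to interpret, the per-axis segment numbers can be mapped straight back to the box they describe. The sketch below assumes 1-based segment numbers as produced by `fill_idx`; the helper name `node_bounds` and the example values are hypothetical.

```python
import numpy as np

def node_bounds(segment, xyzmin, xyzmax, level):
    """Min and max corners of the node with 1-based segment numbers (j, k, l)."""
    segment = np.asarray(segment)
    xyzmin = np.asarray(xyzmin, dtype=float)
    xyzmax = np.asarray(xyzmax, dtype=float)
    # Every axis is cut into 2 ** level equal segments at this level
    size = (xyzmax - xyzmin) / (2 ** level)
    lower = xyzmin + (segment - 1) * size
    return lower, lower + size

# Hypothetical example: node (2, 3, 1) of a unit cube at subdivision level 2
lower, upper = node_bounds((2, 3, 1), [0, 0, 0], [1, 1, 1], level=2)
```

No walk through parent nodes is needed: the bounds follow from the segment numbers, the bounding box and the level alone.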
Code Snippets
```
xyzmin = np.min(xyz, axis=0)
xyzmax = np.max(xyz, axis=0)
```
```
n = (2 ** n) + 1
```
```
x = np.linspace(xyzmin[0], xyzmax[0], n)
y = np.linspace(xyzmin[1], xyzmax[1], n)
z = np.linspace(xyzmin[2], xyzmax[2], n)
```
```
idx = np.zeros_like(xyz, dtype=int)
```
```
from numba import jit

@jit(nopython=True)
def fill_idx(xyz, x, y, z, empty_index):
    # x, y and z hold segment edges, so each axis has len(edges) - 1 segments
    for i in range(len(xyz)):
        for j in range(len(x) - 1):
            if xyz[i,0] >= x[j] and xyz[i,0] <= x[j+1]:
                for k in range(len(y) - 1):
                    if xyz[i,1] >= y[k] and xyz[i,1] <= y[k+1]:
                        for l in range(len(z) - 1):
                            if xyz[i,2] >= z[l] and xyz[i,2] <= z[l+1]:
                                empty_index[i,0] = j+1
                                empty_index[i,1] = k+1
                                empty_index[i,2] = l+1
                                break
                        break
                break
```
Context
StackExchange Code Review Q#126521, answer score: 2