patternpythonModerate
Generating a 3D Point Cloud
Viewed 0 times
generatingcloudpoint
Problem
I'm trying to produce a 3D point cloud from a depth image and some camera intrinsics. The image is 640x480, and is a NumPy array of bytes. The output is a (rows * columns) x 3 array of points.
I've gotten the function to work perfectly, but it's way too slow! (takes like 2 seconds per image to process). I'm wondering if there are any optimizations I can make before giving up and writing a C module.
I've gotten the function to work perfectly, but it's way too slow! (takes like 2 seconds per image to process). I'm wondering if there are any optimizations I can make before giving up and writing a C module.
def create_point_cloud(self, depth_image):
shape = depth_image.shape;
rows = shape[0];
cols = shape[1];
points = np.zeros((rows * cols, 3), np.float32);
bytes_to_units = (1.0 / 256.0);
# Linear iterator for convenience
i = 0
# For each pixel in the image...
for r in xrange(0, rows):
for c in xrange(0, cols):
# Get the depth in bytes
depth = depth_image[r, c, 0];
# If the depth is 0x0 or 0xFF, its invalid.
# By convention it should be replaced by a NaN depth.
if(depth > 0 and depth < 255):
# The true depth of the pixel in units
z = depth * bytes_to_units;
# Get the x, y, z coordinates in units of the pixel
points[i, 0] = (c - self.cx) / self.fx * z;
points[i, 1] = (r - self.cy) / self.fy * z;
points[i, 2] = z
else:
# Invalid points have a NaN depth
points[i, 2] = np.nan;
i = i + 1
return pointsSolution
- Review
-
No docstring. What does this function do? What parameters does it take? What shape must the
depth_image parameter be? What does it return?-
In Python, there is no need for a semi-colon at the end of a statement (unless another statement follows on the same line) and it is best to omit it.
-
It seems that
depth_image is required to have three dimensions, but only the coordinate 0 is used on the third dimension. It would be simpler if the function took a two-dimensional depth image. A caller with a three-dimensional image could pass depth_image[...,0].-
The result is returned as a linear array rather than in the natural shape as a cols × rows × 3 array. This loses information about the shape of the image that might be needed by some callers. There's a comment that says this is "for convenience" but if so, the caller can easily call
numpy.reshape.- Vectorize
As always with Numpy performance problems, an important step is to scrutinize all the loops that run in the Python interpreter (the
for and while loops) to see if they can be vectorized. A loop running inside Numpy is usually hundreds of times faster than the same loop running in native Python.In this case it should be straightforward: instead of looping over pixels and doing some operations on each pixel, apply those operations to the whole image and let Numpy worry about looping over the pixels.
We only need the 0 plane on the third axis, as discussed in §1.3 above:
depth = depth_image[..., 0]Let's make a mask array of valid pixels:
valid = (depth > 0) & (depth < 255)The z-coordinates of the result can be computed by using
numpy.where to distinguish between valid and invalid pixels:z = np.where(valid, depth / 256.0, np.nan)For the other coordinates of the result, we need row and column coordinates for each pixel, which can be generated using
numpy.meshgrid:rows, cols = depth_image.shape
c, r = np.meshgrid(np.arange(cols), np.arange(rows), sparse=True)Then we have:
x = np.where(valid, z * (c - self.cx) / self.fx, 0)
y = np.where(valid, z * (r - self.cy) / self.fy, 0)The coordinate arrays can be stacked using
numpy.dstack to produce the result:return np.dstack((x, y, z))If the caller really needs a linear array, they can flatten it using
numpy.reshape.- Revised code
Putting this together:
def point_cloud(self, depth):
"""Transform a depth image into a point cloud with one point for each
pixel in the image, using the camera transform for a camera
centred at cx, cy with field of view fx, fy.
depth is a 2-D ndarray with shape (rows, cols) containing
depths from 1 to 254 inclusive. The result is a 3-D array with
shape (rows, cols, 3). Pixels with invalid depth in the input have
NaN for the z-coordinate in the result.
"""
rows, cols = depth.shape
c, r = np.meshgrid(np.arange(cols), np.arange(rows), sparse=True)
valid = (depth > 0) & (depth < 255)
z = np.where(valid, depth / 256.0, np.nan)
x = np.where(valid, z * (c - self.cx) / self.fx, 0)
y = np.where(valid, z * (r - self.cy) / self.fy, 0)
return np.dstack((x, y, z))I find that this is about fifty times as fast as the original code.
Code Snippets
depth = depth_image[..., 0]valid = (depth > 0) & (depth < 255)z = np.where(valid, depth / 256.0, np.nan)rows, cols = depth_image.shape
c, r = np.meshgrid(np.arange(cols), np.arange(rows), sparse=True)x = np.where(valid, z * (c - self.cx) / self.fx, 0)
y = np.where(valid, z * (r - self.cy) / self.fy, 0)Context
StackExchange Code Review Q#79032, answer score: 19
Revisions (0)
No revisions yet.