“The sculpture is already complete within the marble block, before I start my work. It is already there, I just have to chisel away the superfluous material.”

– Michelangelo


Computer vision is the field of making computers see, using nothing but algorithms and images from cameras to make sense of the world. One of the principal problems is that an image is a 2D representation of a 3D world. Pictures are flat. Our world has depth. However, there are a variety of techniques in computer vision to handle this fundamental constraint. One such technique is called “space carving”.

Space carving is the technique of reconstructing a 3D object from images, literally recovering an entire missing dimension by combining different perspectives.

The technique involves 2 basic steps. [All images below are taken from a project on using space carving to phenotype corn growth]

  1. Start with a virtual block composed of millions of little cubes. Each cube is called a “voxel”. Analogous to a 2D “pixel”, it’s the smallest unit of resolution for 3D objects. You can think of this block as being similar to the solid chunk of wood or marble that a sculptor uses to begin their craft.

    [Figure: A voxel grid in MATLAB.]

  2. Iteratively take images from different orientations, apply some image processing, and project the post-processed image onto the virtual block. Then you carve away all the voxels that are not part of the object.

    [Figure: Images of a growing corn stalk taken at 5 different orientations.]

    [Figure: Image processing to remove the background and create a bitmask in the shape of the corn stalk.]

    [Figure: Space carving of the corn stalk using multiple different perspectives applied to the same block of voxels. The final image contains the convex hull visualized on the 3D model.]
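
The two steps above can be sketched in a few lines of NumPy. This is only a minimal illustration, not the code from the corn project: it assumes an orthographic camera that rotates around the vertical axis, and it uses a synthetic bitmask (a vertical band standing in for a stalk) instead of real photos. The function names `voxel_centers`, `project`, and `carve` are made up for this example.

```python
import numpy as np

def voxel_centers(n):
    """Return an (n**3, 3) array of voxel-center coordinates in [-1, 1]^3."""
    axis = np.linspace(-1.0, 1.0, n)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return np.stack([x.ravel(), y.ravel(), z.ravel()], axis=1)

def project(points, angle_deg):
    """Orthographic projection (an assumption) for a camera rotated about z.

    Returns image-plane coordinates: u is horizontal, v is vertical.
    """
    t = np.radians(angle_deg)
    u = points[:, 0] * np.cos(t) + points[:, 1] * np.sin(t)
    v = points[:, 2]
    return u, v

def carve(occupied, points, mask, angle_deg):
    """Keep only voxels whose projection lands inside the bitmask."""
    h, w = mask.shape
    u, v = project(points, angle_deg)
    in_view = (np.abs(u) <= 1.0) & (np.abs(v) <= 1.0)
    # Map image-plane coordinates in [-1, 1] to pixel indices.
    col = np.clip(np.round((u + 1) / 2 * (w - 1)).astype(int), 0, w - 1)
    row = np.clip(np.round((1 - v) / 2 * (h - 1)).astype(int), 0, h - 1)
    return occupied & in_view & mask[row, col]

# Step 1: start with a solid block of voxels.
points = voxel_centers(32)
occupied = np.ones(len(points), dtype=bool)

# Synthetic bitmask: a vertical band, roughly what a thin stalk looks like.
size = 64
mask = np.tile(np.abs(np.linspace(-1, 1, size)) <= 0.3, (size, 1))

# Step 2: carve the same block from several orientations.
counts = []
for angle in (0, 45, 90, 135):
    occupied = carve(occupied, points, mask, angle)
    counts.append(int(occupied.sum()))

print(counts)  # number of voxels left after each view
```

Each additional view can only remove voxels, never add them, so the remaining count shrinks as perspectives accumulate. A real pipeline would replace `project` with a calibrated camera model and `mask` with the bitmask extracted from each photo.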


Notice that when there is only a single perspective, the object looks nothing like a plant. It’s a single perspective projected into a world of higher dimensionality. As we add more perspectives, the object begins to resolve itself into a plant.

Once we have recreated a model of our object, we can analyze it in 3D. The 3D properties of an object are usually more relevant to us than its 2D properties.
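
As a rough illustration of what analyzing the model in 3D can look like, the sketch below measures volume and bounding-box size directly from a voxel grid. It assumes the carved model is stored as a 3D boolean array indexed `[x, y, z]` and that the physical edge length of a voxel is known; the `measure` function and the toy block are hypothetical, not taken from the corn project.

```python
import numpy as np

def measure(occupied, voxel_size):
    """Return volume and bounding-box extent of a boolean voxel model.

    occupied:   3D boolean array indexed [x, y, z]
    voxel_size: edge length of one voxel, in whatever unit you work in
    """
    idx = np.argwhere(occupied)
    if idx.size == 0:
        return {"volume": 0.0, "extent": (0.0, 0.0, 0.0)}
    volume = len(idx) * voxel_size ** 3
    extent = (idx.max(axis=0) - idx.min(axis=0) + 1) * voxel_size
    return {"volume": float(volume), "extent": tuple(float(e) for e in extent)}

# Toy model: a 2 x 3 x 10 voxel block inside a 16^3 grid, 0.5-unit voxels.
grid = np.zeros((16, 16, 16), dtype=bool)
grid[4:6, 5:8, 2:12] = True

stats = measure(grid, voxel_size=0.5)
print(stats)  # {'volume': 7.5, 'extent': (1.0, 1.5, 5.0)}
```

The last component of `extent` is the height of the object, a measurement that a single 2D image can only approximate.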

Note that it’s important to use perspectives that don’t share the same line of sight. If you used 4 orientations at 0°, 90°, 180°, and 270°, you would effectively only have 2 perspectives, half the number. Opposite perspectives would mirror one another: the silhouette seen from 180° is the silhouette seen from 0°, flipped left to right. When projected onto a 3D block, the mirrored perspectives would not provide any additional information about depth.
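
The mirroring can be checked with a toy example. The sketch below assumes orthographic projection (with a real perspective camera the opposite view is only approximately a mirror image) and uses a random blob as a stand-in object; `silhouette` is a hypothetical helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
obj = rng.random((12, 12, 12)) > 0.9  # hypothetical object, indexed [x, y, z]

def silhouette(volume, angle_deg):
    """Orthographic silhouette for a camera at 0, 90, 180 or 270 degrees about z."""
    rotated = np.rot90(volume, k=angle_deg // 90, axes=(0, 1))
    # Look along the y axis: a pixel is set if any voxel on that ray is occupied.
    return rotated.any(axis=1)

front = silhouette(obj, 0)
back = silhouette(obj, 180)
side = silhouette(obj, 90)

mirrored = np.array_equal(back, front[::-1, :])   # back view is the flipped front view
side_is_new = not np.array_equal(side, front)     # side view carries different information
print(mirrored, side_is_new)
```

Because the back silhouette is just the front silhouette reversed, carving with it removes exactly the voxels the front view already removed. Spreading the same number of views over 180° gives each one something new to say.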

