Researchers at Stanford University, MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and the Autodesk AI Lab have created a deep learning framework that translates 2D assembly manuals into machine-executable plans for constructing 3D objects. The Manual-to-Executable-Plan Network (MEPNet) was tested on computer-generated Lego sets, with training data that included genuine Lego set instructions and Minecraft-style voxel building plans.
Existing methods of parsing assembly instructions are either computationally expensive or poor at generalizing to unseen shapes. Further problems arise when AI techniques try to interpret 2D instructions and turn them into 3D plans: visual instructions like Lego manuals consist entirely of images, and matching a 2D illustration to its 3D components is difficult because the pieces are usually depicted already assembled into larger structures.
The researchers said, “This increases the difficulty for machines to interpret Lego manuals: it requires inferring 3D poses of unseen objects composed of seen primitives.”
MEPNet combines the strengths of existing approaches with new 3D rendering techniques. Starting from a 3D model of the components, the current state of the Lego set, and a 2D manual image, it predicts a set of 2D keypoints and a mask for each component.
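A common way to turn a network's keypoint predictions into pixel coordinates is to read off the peak of a per-component heatmap. The sketch below is not MEPNet's actual code, just a minimal illustration of that extraction step, assuming the network outputs one heatmap per component:

```python
import numpy as np

def keypoints_from_heatmaps(heatmaps):
    """Extract (x, y) pixel coordinates from per-component keypoint heatmaps.

    heatmaps: array of shape (num_components, H, W), where the strongest
    response in each map marks that component's predicted keypoint.
    """
    coords = []
    for hm in heatmaps:
        # Index of the peak response, converted from flat index to (row, col).
        y, x = np.unravel_index(np.argmax(hm), hm.shape)
        coords.append((int(x), int(y)))
    return coords

# Toy example: two 4x4 heatmaps with hand-placed peaks.
hms = np.zeros((2, 4, 4))
hms[0, 1, 2] = 1.0   # peak at x=2, y=1
hms[1, 3, 0] = 1.0   # peak at x=0, y=3
print(keypoints_from_heatmaps(hms))  # [(2, 1), (0, 3)]
```

In practice a soft-argmax over the heatmap is often used instead of a hard argmax, since it gives sub-pixel accuracy and is differentiable.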
After masking, the 2D keypoints are “back-projected to 3D by finding possible connections between the base shape and the new components.” The team said the combination “maintains the efficiency of learning-based models.”
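Back-projection lifts a 2D image point back into 3D space. MEPNet resolves the ambiguity by matching keypoints against candidate connection points on the base shape; the generic sketch below instead assumes a standard pinhole camera with known intrinsics `K` and a known depth, purely to illustrate the geometry:

```python
import numpy as np

def backproject(u, v, depth, K):
    """Lift a 2D keypoint (u, v) to a 3D point in camera coordinates.

    K is the 3x3 pinhole intrinsics matrix; depth is the distance along the
    camera's z-axis. In MEPNet the depth is not given directly but inferred
    from possible connections to the base shape; here we assume it is known.
    """
    fx, fy = K[0, 0], K[1, 1]   # focal lengths in pixels
    cx, cy = K[0, 2], K[1, 2]   # principal point
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])
p = backproject(420.0, 240.0, 10.0, K)  # point at x=2, y=0, z=10
```

Restricting the possible depths to a discrete set of connection points is what lets the keypoint-and-mask pipeline stay fast while still producing valid 3D poses.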
All that remains is to interpret MEPNet’s 3D renderings, which will hopefully be easier than deciphering flat-pack furniture instructions. If you are familiar with PyTorch, you can test MEPNet here.