Multi-view Consistency as Supervisory Signal
for Learning Shape and Pose Prediction


Shubham Tulsiani
Alexei A. Efros
Jitendra Malik

University of California, Berkeley
In CVPR, 2018


We learn to predict the shape and pose of an object from a single input view. Our framework can leverage training data in the form of multi-view observations of objects, and learns shape and pose prediction despite the lack of any direct supervision for either.
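The idea of multi-view consistency as supervision can be sketched as follows: the predicted shape and pose from one view should, when transformed into a second view's camera frame, agree with that view's observed foreground mask. Below is a minimal, hypothetical numpy illustration (not the paper's implementation) using a point-cloud shape proxy, orthographic projection, and an L1 mask comparison; all function names and the rendering scheme are assumptions for exposition only.

```python
import numpy as np

def rot_z(theta):
    """Rotation about the z-axis (a stand-in for a predicted pose)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def render_mask(points, pose, size=32):
    """Orthographically project posed 3D points to a binary silhouette.
    A toy stand-in for a differentiable renderer."""
    p = points @ pose.T
    # Map x, y coordinates from [-1, 1] onto a size x size pixel grid.
    ij = np.clip(((p[:, :2] + 1.0) / 2.0 * (size - 1)).round().astype(int),
                 0, size - 1)
    mask = np.zeros((size, size))
    mask[ij[:, 1], ij[:, 0]] = 1.0
    return mask

def consistency_loss(pred_points, pred_pose, observed_mask, view_pose):
    """Penalize disagreement between the predicted shape, transformed
    into the observed view's frame, and that view's foreground mask."""
    rendered = render_mask(pred_points, view_pose @ pred_pose)
    return np.abs(rendered - observed_mask).mean()
```

A correct shape-and-pose prediction drives this loss to zero on every available view, while a wrong pose (or shape) leaves a mismatch; minimizing it over many objects and view pairs is what lets the predictors be trained without direct shape or pose labels.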



Paper

Tulsiani, Efros, Malik.

Multi-view Consistency as Supervisory Signal
for Learning Shape and Pose Prediction.

CVPR, 2018. (to appear)

[preprint]
[Bibtex]


Code


 [GitHub]


Results


Learning using online images. We can train our system using images downloaded from eBay, together with automatically obtained segmentations. We visualize the shape and pose predictions learned from this training data.

Acknowledgements

We thank David Fouhey for insightful discussions, and Saurabh Gupta and Tinghui Zhou for helpful comments. This work was supported in part by Intel/NSF VEC award IIS-1539099 and NSF Award IIS-1212798. We gratefully acknowledge NVIDIA Corporation for the donation of GPUs used for this research. This webpage template was borrowed from some colorful folks.