Shubham Tulsiani

I am an Assistant Professor at Carnegie Mellon University in the Robotics Institute, where I am a part of the Computer Vision group. I am interested in building perception systems that can infer the spatial and physical structure of the world they observe. Please see these recent talks for an overview.

Prior to joining CMU, I was a Research Scientist at FAIR Pittsburgh, working with Abhinav Gupta. I previously graduated from UC Berkeley, where I was advised by Jitendra Malik and also frequently collaborated with Alyosha Efros.

email: shubhtuls AT
Office: Smith Hall 230

Research Group

Our group is interested in inferring physically and spatially grounded representations from perceptual input, and leveraging these for advances in fundamental problems in computer vision and robot manipulation. We believe that to enable machines to understand the physical world, we should reduce the reliance on supervision by annotation, and instead develop learning mechanisms informed by the real, physical world we live in – by incorporating our knowledge about its structure and laws as a 'meta-supervisory' signal.

If you are interested in joining our group, please read this.

Dear Prospective Students,
Thanks for your interest in being a part of our group! Unfortunately, I am unable to reply to individual emails, but I hope you find the following helpful:

I am a CMU student. How do I join your group?
Send me an email and/or drop by my office - I'd be happy to chat! If you are an undergraduate, also consider reaching out to the PhD students in our group if their projects align with your interests.

I want to join CMU. What graduate programs should I apply to?
PhD Applicants: While I am primarily affiliated with RI, I can supervise students admitted in any SCS department (e.g. MLD, CSD), so apply to the department that best matches your interests and background. If you are interested in working with me, mention this in your application statement.
MS Applicants: RI offers MSR (research focused) and MSCV (industry focused) MS programs among others. Please apply to the program most aligned with your future goals.

Should I contact you before applying to CMU for admission?
Admissions across all PhD/MS programs are done by department-level committees and I am unable to help with individual applications. Please do feel free to reach out after you are admitted.

Are you accepting interns/visitors?
We do not have any short-term positions at this time.

PhD Students
Himangi Mittal
Homanga Bharadhwaj (co-advised with Abhinav Gupta)
Hanzhe Hu
Yehonathan Litman (co-advised with Fernando De la Torre)
Yufei (Judy) Ye (co-advised with Abhinav Gupta)

MS Students
Yanbo Xu (MSR)
Qitao Zhao (MSCV)

Undergraduate Students
Amy Lin
Lucas Wu

Alumni

Jason Zhang (co-advised with Deva Ramanan), Sparse-view 3D in the Wild, 2024. Google

Bharath Raj. PhD at Cornell
Zhizhuo (Z) Zhou. PhD at Stanford

MSCV: Poorvi Hebbar, Naveen Venkat, Mayank Agarwal, Yen-Chi Cheng, Paritosh Mittal


Teaching

16-822: Geometry-based Methods in Vision. Fall 2023, 2022
16-825: Learning for 3D Vision. Spring 2024, 2023, 2022

Publications


[New] Track2Act: Predicting Point Tracks from Internet Videos Enables Diverse Zero-shot Manipulation
Homanga Bharadhwaj, Roozbeh Mottaghi*, Abhinav Gupta*, Shubham Tulsiani*
ECCV, 2024
[New] UpFusion: Novel View Diffusion from Unposed Sparse View Observations
Bharath Raj Nagoor Kani, Hsin-Ying Lee, Sergey Tulyakov, Shubham Tulsiani
ECCV, 2024
[New] G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis
Yufei Ye, Abhinav Gupta, Kris Kitani, Shubham Tulsiani
CVPR, 2024
[New] MVD-Fusion: Single-view 3D via Depth-consistent Multi-view Generation
Hanzhe Hu*, Zhizhuo Zhou*, Varun Jampani, Shubham Tulsiani
CVPR, 2024
[New] Cameras as Rays: Pose Estimation via Ray Diffusion
Jason Y. Zhang*, Amy Lin*, Moneish Kumar, Tzu-Hsuan Yang, Deva Ramanan, Shubham Tulsiani
ICLR, 2024
Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans
Homanga Bharadhwaj, Abhinav Gupta*, Vikash Kumar*, Shubham Tulsiani*
ICRA, 2024 (Finalist for Best Paper Award in Robot Manipulation)
RoboAgent: Towards Sample Efficient Robot Manipulation with Semantic Augmentations and Action Chunking
Homanga Bharadhwaj*, Jay Vakil*, Mohit Sharma*, Abhinav Gupta, Shubham Tulsiani, Vikash Kumar
ICRA, 2024
RelPose++: Recovering 6D Poses from Sparse-view Observations
Amy Lin*, Jason Y. Zhang*, Deva Ramanan, Shubham Tulsiani
3DV, 2024
Diffusion-Guided Reconstruction of Everyday Hand-Object Interaction Clips
Yufei Ye, Poorvi Hebbar, Abhinav Gupta, Shubham Tulsiani
ICCV, 2023
Manipulate by Seeing: Creating Manipulation Controllers from Pre-Trained Representations
Jianren Wang*, Sudeep Dasari*, Mohan Kumar Srirama, Shubham Tulsiani, Abhinav Gupta
ICCV, 2023
Mesh2Tex: Generating Mesh Textures from Image Queries
Alexey Bokhovkin, Shubham Tulsiani, Angela Dai
ICCV, 2023
Visual Affordance Prediction for Guiding Robot Exploration
Homanga Bharadhwaj, Abhinav Gupta, Shubham Tulsiani
ICRA, 2023
Analogy-Forming Transformers for Few-Shot 3D Parsing
Nikolaos Gkanatsios*, Mayank Singh*, Zhaoyuan Fang, Shubham Tulsiani, Katerina Fragkiadaki
ICLR, 2023
SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction
Zhizhuo Zhou, Shubham Tulsiani
CVPR, 2023
Affordance Diffusion: Synthesizing Hand-Object Interactions
Yufei Ye, Xueting Li, Abhinav Gupta, Shalini De Mello, Stan Birchfield, Jiaming Song, Shubham Tulsiani, Sifei Liu
CVPR, 2023
Monocular Dynamic View Synthesis: A Reality Check
Hang Gao, Ruilong Li, Shubham Tulsiani, Bryan Russell, Angjoo Kanazawa
NeurIPS, 2022
RelPose: Predicting Probabilistic Relative Rotation for Single Objects in the Wild
Jason Y. Zhang, Deva Ramanan, Shubham Tulsiani
ECCV, 2022
Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction
Kalyan Vasudev Alwala, Abhinav Gupta, Shubham Tulsiani
CVPR, 2022
What's in your hands? 3D Reconstruction of Generic Objects in Hands
Yufei Ye, Abhinav Gupta, Shubham Tulsiani
CVPR, 2022
AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation
Paritosh Mittal*, Yen-Chi Cheng*, Maneesh Singh, Shubham Tulsiani
CVPR, 2022
NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild
Jason Y. Zhang, Gengshan Yang, Shubham Tulsiani*, Deva Ramanan*
NeurIPS, 2021
No RL, No Simulation: Learning to Navigate without Navigating
Meera Hahn, Devendra Chaplot, Shubham Tulsiani, Mustafa Mukadam, James M. Rehg, Abhinav Gupta
NeurIPS, 2021
A Differentiable Recipe for Learning Visual Non-Prehensile Planar Manipulation
Bernardo Aceituno, Alberto Rodriguez, Shubham Tulsiani, Abhinav Gupta, Mustafa Mukadam
CoRL, 2021
Where2Act: From Pixels to Actions for Articulated 3D Objects
Kaichun Mo, Leonidas J. Guibas, Mustafa Mukadam, Abhinav Gupta, Shubham Tulsiani
ICCV, 2021
PixelTransformer: Sample Conditioned Signal Generation
Shubham Tulsiani, Abhinav Gupta
ICML, 2021
Shelf-Supervised Mesh Prediction in the Wild
Yufei Ye, Shubham Tulsiani, Abhinav Gupta
CVPR, 2021
See, Hear, Explore: Curiosity via Audio-Visual Association
Victoria Dean, Shubham Tulsiani, Abhinav Gupta
NeurIPS, 2020
Visual Imitation Made Easy
Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter Abbeel, Lerrel Pinto
CoRL, 2020
Articulation-aware Canonical Surface Mapping
Nilesh Kulkarni, Abhinav Gupta, David Fouhey, Shubham Tulsiani
CVPR, 2020
Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects
Kiana Ehsani, Shubham Tulsiani, Saurabh Gupta, Ali Farhadi, Abhinav Gupta
CVPR, 2020
Intrinsic Motivation for Encouraging Synergistic Behavior
Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta
ICLR, 2020
Discovering Motor Programs by Recomposing Demonstrations
Tanmay Shankar, Shubham Tulsiani, Lerrel Pinto, Abhinav Gupta
ICLR, 2020
Efficient Bimanual Manipulation using Learned Task Schemas
Rohan Chitnis, Shubham Tulsiani, Saurabh Gupta, Abhinav Gupta
ICRA, 2020
Object-centric Forward Modeling for Model Predictive Control
Yufei Ye, Dhiraj Gandhi, Abhinav Gupta, Shubham Tulsiani
CoRL, 2019
Canonical Surface Mapping via Geometric Cycle Consistency
Nilesh Kulkarni, Abhinav Gupta*, Shubham Tulsiani*
ICCV, 2019
Compositional Video Prediction
Yufei Ye, Maneesh Singh, Abhinav Gupta*, Shubham Tulsiani*
ICCV, 2019
3D-RelNet: Joint Object and Relational Network for 3D Prediction
Nilesh Kulkarni, Ishan Misra, Shubham Tulsiani, Abhinav Gupta
ICCV, 2019
Order-Aware Generative Modeling Using the 3D-Craft Dataset
Zhuoyuan Chen*, Demi Guo*, Tong Xiao*, et al.
ICCV, 2019
Learning Unsupervised Multi-View Stereopsis via Robust Photometric Consistency
Tejas Khot*, Shubham Agrawal*, Shubham Tulsiani, Christoph Mertz, Simon Lucey, Martial Hebert
arXiv preprint, 2019
Layer-structured 3D Scene Inference via View Synthesis
Shubham Tulsiani, Richard Tucker, Noah Snavely
ECCV, 2018
Learning Category-Specific Mesh Reconstruction from Image Collections
Angjoo Kanazawa*, Shubham Tulsiani*, Alexei A. Efros, Jitendra Malik
ECCV, 2018
Multi-view Consistency as Supervisory Signal for Learning Shape and Pose Prediction
Shubham Tulsiani, Alexei A. Efros, Jitendra Malik
CVPR, 2018
Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene
Shubham Tulsiani, Saurabh Gupta, David Fouhey, Alexei A. Efros, Jitendra Malik
CVPR, 2018
Multi-view Supervision for Single-view Reconstruction via Differentiable Ray Consistency
Shubham Tulsiani, Tinghui Zhou, Alexei A. Efros, Jitendra Malik
CVPR, 2017
Learning Shape Abstractions by Assembling Volumetric Primitives
Shubham Tulsiani, Hao Su, Leonidas J. Guibas, Alexei A. Efros, Jitendra Malik
CVPR, 2017
Hierarchical Surface Prediction for 3D Object Reconstruction
Christian Häne, Shubham Tulsiani, Jitendra Malik
3DV, 2017
Learning Category-Specific Deformable 3D Models for Object Reconstruction
Shubham Tulsiani*, Abhishek Kar*, João Carreira, Jitendra Malik
TPAMI, 2016
View Synthesis by Appearance Flow
Tinghui Zhou, Shubham Tulsiani, Weilun Sun, Jitendra Malik, Alexei A. Efros
ECCV, 2016
Pose Induction for Novel Object Categories
Shubham Tulsiani, João Carreira, Jitendra Malik
ICCV, 2015
Amodal Completion and Size Constancy in Natural Scenes
Abhishek Kar, Shubham Tulsiani, João Carreira, Jitendra Malik
ICCV, 2015
Viewpoints and Keypoints
Shubham Tulsiani, Jitendra Malik
CVPR, 2015
Category-Specific Object Reconstruction from a Single Image
Abhishek Kar*, Shubham Tulsiani*, João Carreira, Jitendra Malik
CVPR, 2015 (Best Student Paper Award)
Virtual View Networks for Object Reconstruction
João Carreira, Abhishek Kar, Shubham Tulsiani, Jitendra Malik
CVPR, 2015
A colorful approach to text processing by example
Kuat Yessenov, Shubham Tulsiani, Aditya Menon, Robert C Miller, Sumit Gulwani, Butler Lampson, Adam Kalai
UIST, 2013
