Talking Papers Podcast

Instant3D - Jiahao Li

Itzik Ben-Shabat Season 1 Episode 32

Welcome to another exciting episode of the Talking Papers Podcast! In this episode, I had the pleasure of hosting Jiahao Li, a talented PhD student at the Toyota Technological Institute at Chicago (TTIC), who discussed his research paper, "Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model". This paper, published at ICLR 2024, introduces a novel method for fast text-to-3D generation.

Instant3D addresses the limitations of existing methods with a two-stage approach. First, a fine-tuned 2D text-to-image diffusion model generates a set of four structured and consistent views from the given text prompt. Then, a transformer-based sparse-view reconstructor directly regresses a NeRF from the generated images. The results are stunning: high-quality and diverse 3D assets are produced within a mere 20 seconds, making it a hundred times faster than previous optimization-based methods.
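For readers who think in code, here is a minimal sketch of the two-stage flow described above. It is not the authors' implementation (the model is not public): the names `text_to_3d`, `View`, `Asset3D`, and the two stand-in stage functions are hypothetical, and the stand-ins return placeholder data rather than running any real diffusion model or reconstructor. The only detail taken from the description is the count of four views and the fact that the second stage is a single feed-forward call rather than a per-prompt optimization loop.

```python
from dataclasses import dataclass
from typing import Callable, List

NUM_VIEWS = 4  # the description above mentions four generated views


@dataclass
class View:
    """Hypothetical container for one generated image."""
    index: int
    pixels: List[List[float]]  # placeholder for real image data


@dataclass
class Asset3D:
    """Hypothetical container for the reconstructed 3D representation."""
    num_source_views: int
    note: str


def text_to_3d(
    prompt: str,
    generate_views: Callable[[str], List[View]],
    reconstruct: Callable[[List[View]], Asset3D],
) -> Asset3D:
    """Run stage 1 (text -> sparse views), then stage 2 (views -> 3D asset)."""
    views = generate_views(prompt)
    if len(views) != NUM_VIEWS:
        raise ValueError(f"expected {NUM_VIEWS} views, got {len(views)}")
    # A single forward call: no per-prompt optimization loop here.
    return reconstruct(views)


# Stand-in stages (assumptions for illustration only, not real models).
def fake_generate_views(prompt: str) -> List[View]:
    return [View(index=i, pixels=[[0.0]]) for i in range(NUM_VIEWS)]


def fake_reconstruct(views: List[View]) -> Asset3D:
    return Asset3D(num_source_views=len(views), note="placeholder NeRF")


asset = text_to_3d("a ceramic teapot", fake_generate_views, fake_reconstruct)
```

Passing the two stages in as functions keeps the sketch honest about what is unknown: either stage could be swapped for a real model without changing the overall flow.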

As a 3D enthusiast myself, I found the outcomes of Instant3D truly captivating, especially considering the short amount of time it takes to generate them. While it's unusual for a 3D person like me to experience these creations through a 2D projection, the astonishing results make it impossible to ignore the potential of this approach. This paper underscores the importance of obtaining more and better 3D data, paving the way for exciting advancements in the field.

Let me share a little anecdote about our guest, Jiahao Li. We were initially introduced through Yicong Hong, another brilliant guest on our podcast. Yicong was a PhD student at ANU during my postdoc, and he and Jiahao interned together at Adobe while working on this very paper, on which Yicong is also a coauthor. It's incredible to see such brilliant minds coming together on groundbreaking research projects.

Now, unfortunately, the model developed in this paper is not publicly available. However, given the computational resources required to train these advanced models and obvious copyright issues, it's understandable that Adobe has chosen to keep it proprietary. Not all of us have a hundred GPUs lying around, right?

Remember to hit that subscribe button and join the conversation in the comments section. Let's delve into the exciting world of Instant3D with Jiahao Li on this episode of Talking Papers Podcast!

#TalkingPapersPodcast #ICLR2024 #Instant3D #TextTo3D #ResearchPapers #PhDStudents #AcademicResearch

All links and resources are available in the blogpost: https://www.itzikbs.com/instant3d

🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com

📧Subscribe to our mailing list: http://eepurl.com/hRznqb

🐦Follow us on Twitter: https://twitter.com/talking_papers

🎥YouTube Channel: https://bit.ly/3eQOgwP
