Talking Papers Podcast

Talking Papers Podcast: deep dives into research papers in computer vision, 3D, machine learning, and AI, with the authors who wrote them. Where research meets conversation. By researchers, for researchers.

Each episode is structured like the paper itself: a TL;DR / abstract to set the stage, then related work, approach, results, conclusions, and future work. We close with a bonus segment called "What did Reviewer 2 say?", where the authors share the candid peer-review story behind the publication.

Hosted by Itzik Ben-Shabat. Guests are PhD students, postdocs, and faculty from leading labs across academia and industry. Aimed at fellow researchers and graduate students who want the candid version of the work, not a polished press release.

NeRF-Det - Chenfeng Xu

September 06, 2023 • Itzik Ben-Shabat • Season 1 • Episode 26

Welcome to another exciting episode of the Talking Papers Podcast! In this installment, I had the pleasure of hosting Chenfeng Xu to discuss his paper "NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection", which was published at ICCV 2023.

NeRF has become widely used in recent years, while 3D detection still faces well-known challenges. The main contribution of this paper is that it tackles the detection task while jointly training a NeRF model that generalizes to previously unseen scenes. The computer vision community has long worked on tasks involving images and point clouds, so it is exciting to see the NeRF representation applied to this problem.

Chenfeng is currently a Ph.D. candidate at UC Berkeley, collaborating with Prof. Masayoshi Tomizuka and Prof. Kurt Keutzer. His affiliations include Berkeley DeepDrive (BDD) and Berkeley AI Research (BAIR), along with the MSC lab and PALLAS. His research focuses on improving computational and data efficiency in machine perception, with a primary focus on temporal-3D scenes and their downstream applications. He brings together traditionally separate approaches from geometric computing and deep learning to establish both theoretical frameworks and practical algorithms for temporal-3D representations. His work spans a wide range of applications, including autonomous driving, robotics, and AR/VR, and consistently demonstrates remarkable efficiency through extensive experimentation. I am eagerly looking forward to seeing his upcoming research papers.

PAPER
NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection

AUTHORS
Chenfeng Xu, Bichen Wu, Ji Hou, Sam Tsai, Ruilong Li, Jialiang Wang, Wei Zhan, Zijian He, Peter Vajda, Kurt Keutzer, Masayoshi Tomizuka

ABSTRACT
NeRF-Det is a novel method for 3D detection with posed RGB images as input. Our method makes novel use of NeRF in an end-to-end manner to explicitly estimate 3D geometry, thereby improving 3D detection performance. Specifically, to avoid the significant extra latency associated with per-scene optimization of NeRF, we introduce sufficient geometry priors to enhance the generalizability of NeRF-MLP. We subtly connect the detection and NeRF branches through a shared MLP, enabling an efficient adaptation of NeRF to detection and yielding geometry-aware volumetric representations for 3D detection. As a result of our joint-training design, NeRF-Det is able to generalize well to unseen scenes for object detection, view synthesis, and depth estimation tasks without per-scene optimization.

All links and resources are available on the blog post:
https://www.itzikbs.com/nerf-det

🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com

📧Subscribe to our mailing list: http://eepurl.com/hRznqb

🐦Follow us on Twitter: https://twitter.com/talking_papers

🎥YouTube Channel: https://bit.ly/3eQOgwP

Yizhak Ben-Shabat (Itzik)

Host