14 episodes

A podcast by researchers, for researchers. This podcast aims to be a new medium for disseminating research. In each episode I talk to the main author of an academic paper in the fields of computer vision, machine learning, artificial intelligence, graphics, and everything in between. Each episode is structured like a paper and includes a TL;DR (abstract), related work, approach, results, conclusions, and a future work section. It also includes the bonus "what did reviewer 2 say" section, where authors share their experience in the peer review process. Enjoy!

Talking Papers Podcast Itzik Ben-Shabat

    • Technology
    • 5.0 • 4 Ratings


    BACON - David Lindell


     In this episode of the Talking Papers Podcast, I hosted David B. Lindell to chat about his paper "BACON: Band-Limited Coordinate Networks for Multiscale Scene Representation”, published in CVPR 2022. 

    In this paper, they take on training coordinate networks by introducing a new type of neural network architecture with an analytical Fourier spectrum. This enables multi-scale signal representation and yields an interpretable architecture with an explicitly controllable bandwidth.
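    The band-limiting intuition can be illustrated numerically: multiplying two sinusoids produces only sum- and difference-frequency components, so the maximum output frequency of a multiplicative sine network is bounded by the sum of its layer frequencies. A minimal sketch of that intuition (my illustration, not the authors' architecture):

```python
import numpy as np

# Product of two sinusoids = sum- and difference-frequency cosines,
# so the spectrum of the product is exactly predictable (band-limited).
n = 256
x = np.arange(n) / n
sig = np.sin(2 * np.pi * 3 * x) * np.sin(2 * np.pi * 5 * x)
spec = np.abs(np.fft.rfft(sig))
peaks = np.nonzero(spec > 1e-6)[0]
print(peaks)  # energy only at |5 - 3| = 2 and 5 + 3 = 8
```

    Stacking such multiplicative layers keeps the output spectrum analytically known, which is what allows the bandwidth to be controlled explicitly.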

    David recently completed his Postdoc at Stanford and will join the University of Toronto as an Assistant Professor. During our chat, I got to know a stellar academic with a unique view of the field and where it is going. We even got to meet in person at CVPR. I am looking forward to seeing what he comes up with next. It was a pleasure having him on the podcast. 

    AUTHORS
    David B. Lindell, Dave Van Veen, Jeong Joon Park, Gordon Wetzstein

    ABSTRACT
    Coordinate-based networks have emerged as a powerful tool for 3D representation and scene reconstruction. These networks are trained to map continuous input coordinates to the value of a signal at each point. Still, current architectures are black boxes: their spectral characteristics cannot be easily analyzed, and their behavior at unsupervised points is difficult to predict. Moreover, these networks are typically trained to represent a signal at a single scale, so naive downsampling or upsampling results in artifacts. We introduce band-limited coordinate networks (BACON), a network architecture with an analytical Fourier spectrum. BACON has constrained behavior at unsupervised points, can be designed based on the spectral characteristics of the represented signal, and can represent signals at multiple scales without per-scale supervision. We demonstrate BACON for multiscale neural representation of images, radiance fields, and 3D scenes using signed distance functions and show that it outperforms conventional single-scale coordinate networks in terms of interpretability and quality.

    RELATED PAPERS
    📚SIREN
    📚Multiplicative Filter Networks (MFN)
    📚Mip-Nerf
    📚Followup work: Residual MFN

    LINKS AND RESOURCES
    💻Project website
    📚 Paper
    💻Code
    🎥Video

    To stay up to date with David's latest research, follow him on:
    👨🏻‍🎓Personal Page
    🐦Twitter
    👨🏻‍🎓Google Scholar
    👨🏻‍🎓LinkedIn

    Recorded on June 15th 2022.
    CONTACT
    If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com


    SUBSCRIBE AND FOLLOW
    🎧Subscribe on your favorite podcast app
    📧Subscribe to our mailing list
    🐦Follow us on Twitter
    🎥YouTube Channel

    • 41 min
    Lipschitz MLP - Hsueh-Ti Derek Liu


    In this episode of the Talking Papers Podcast, I hosted Hsueh-Ti Derek Liu to chat about his paper "Learning Smooth Neural Functions via Lipschitz Regularization”, published in SIGGRAPH 2022. 
    In this paper, they took on the unique task of enforcing smoothness on neural fields (modelled as a neural network). They do this by introducing a regularization term that encourages a small Lipschitz constant for the network. They show the performance of their method on shape interpolation, extrapolation, and partial shape reconstruction from 3D point clouds. I especially like that it is implemented in only four lines of code.
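    For intuition, the core idea can be sketched in a few lines: penalize an upper bound on the network's Lipschitz constant, here the product of per-layer spectral norms. This is a simplified stand-in for the paper's exact formulation, not the authors' implementation:

```python
import numpy as np

def lipschitz_bound(weights):
    """Upper bound on the Lipschitz constant of an MLP with
    1-Lipschitz activations: product of per-layer spectral norms."""
    bound = 1.0
    for W in weights:
        bound *= np.linalg.norm(W, ord=2)  # largest singular value
    return bound

# Adding this bound to the training loss encourages a smooth network.
W1, W2 = 2.0 * np.eye(3), 0.5 * np.eye(3)
print(lipschitz_bound([W1, W2]))  # 1.0
```

    Because the bound is differentiable in the weights, it can be added directly to the training loss, which is what keeps the method so compact.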


    Derek will soon complete his PhD at the University of Toronto and will start a research scientist position at Roblox Research. This work was done while he was interning at NVIDIA. During our chat, I had the pleasure of discovering that Derek is one of the few people on the planet who can take a complicated idea and explain it in a simple, easy-to-follow way. His strong background in geometry processing makes this paper, which is well within the learning domain, very unique in the current paper landscape. It was a pleasure recording this episode with him.

    AUTHORS
    Hsueh-Ti Derek Liu, Francis Williams, Alec Jacobson, Sanja Fidler, Or Litany

    ABSTRACT
    Neural implicit fields have recently emerged as a useful representation for 3D shapes. These fields are commonly represented as neural networks which map latent descriptors and 3D coordinates to implicit function values. The latent descriptor of a neural field acts as a deformation handle for the 3D shape it represents. Thus, smoothness with respect to this descriptor is paramount for performing shape-editing operations. In this work, we introduce a novel regularization designed to encourage smooth latent spaces in neural fields by penalizing the upper bound on the field's Lipschitz constant. Compared with prior Lipschitz regularized networks, ours is computationally fast, can be implemented in four lines of code, and requires minimal hyperparameter tuning for geometric applications. We demonstrate the effectiveness of our approach on shape interpolation and extrapolation as well as partial shape reconstruction from 3D point clouds, showing both qualitative and quantitative improvements over existing state-of-the-art and non-regularized baselines.

    RELATED PAPERS
    📚DeepSDF
    📚Neural Fields (collection of works)
    📚Sorting Out Lipschitz Function Approximation

    LINKS AND RESOURCES
    💻Project website
    📚 Paper
    💻Code

    To stay up to date with Derek's latest research, follow him on:
    👨🏻‍🎓Personal Page
    🐦Twitter
    👨🏻‍🎓Google Scholar

    Recorded on May 30th 2022.
    CONTACT
    If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com


    SUBSCRIBE AND FOLLOW
    🎧Subscribe on your favorite podcast app
    📧Subscribe to our mailing list
    🐦Follow us on Twitter
    🎥YouTube Channel

    • 35 min
    DiGS - Chamin Hewa Koneputugodage


    In this episode of the Talking Papers Podcast, I hosted Chamin Hewa Koneputugodage to chat about OUR paper "DiGS: Divergence guided shape implicit neural representation for unoriented point clouds”, published in CVPR 2022.
    In this paper, we took on the task of surface reconstruction using a novel divergence-guided approach. Unlike previous methods, we do not use normal vectors for supervision. To compensate, we add a divergence minimization loss as a regularizer to get a coarse shape, and then anneal it as training progresses to recover finer detail. Additionally, we propose two new geometric initializations for SIREN-based networks that enable learning shape spaces.

    PAPER TITLE
    "DiGS: Divergence guided shape implicit neural representation for unoriented point clouds" 

    AUTHORS
    Yizhak Ben-Shabat, Chamin Hewa Koneputugodage, Stephen Gould

    ABSTRACT
    Shape implicit neural representations (INRs) have recently been shown to be effective in shape analysis and reconstruction tasks. Existing INRs require point coordinates to learn the implicit level sets of the shape. When a normal vector is available for each point, a higher fidelity representation can be learned; however, normal vectors are often not provided as raw data. Furthermore, the method's initialization has been shown to play a crucial role for surface reconstruction. In this paper, we propose a divergence-guided shape representation learning approach that does not require normal vectors as input. We show that incorporating a soft constraint on the divergence of the distance function favours smooth solutions that reliably orient gradients to match the unknown normal at each point, in some cases even better than approaches that use ground truth normal vectors directly. Additionally, we introduce a novel geometric initialization method for sinusoidal INRs that further improves convergence to the desired solution. We evaluate the effectiveness of our approach on the task of surface reconstruction and shape space learning and show SOTA performance compared to other unoriented methods.

    RELATED PAPERS
    📚 DeepSDF 
    📚 SIREN

    LINKS AND RESOURCES
    💻 Project Page
    💻 Code 
    🎥 5 min video

    To stay up to date with Chamin's latest research, follow him on:
    🐦 Twitter
    👨🏻‍🎓LinkedIn

    Recorded on April 1st 2022.

    CONTACT

    If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com


    SUBSCRIBE AND FOLLOW

    🎧Subscribe on your favourite podcast app

    📧Subscribe to our mailing list

    🐦Follow us on Twitter

    🎥YouTube Channel


    #talkingpapers #CVPR2022 #DiGS #NeuralImplicitRepresentation #SurfaceReconstruction #ShapeSpace #3DVision #ComputerVision #AI #DeepLearning #MachineLearning  #deeplearning #AI #neuralnetworks #research  #artificialintelligence

    • 40 min
    Dejan Azinović - Neural RGBD Surface Reconstruction


     In this episode of the Talking Papers Podcast, I hosted Dejan Azinović to chat about his paper "Neural RGB-D Surface Reconstruction”, published in CVPR 2022.

    In this paper, they take on the task of RGBD surface reconstruction by using novel view synthesis.  They incorporate depth measurements into the radiance field formulation by learning a neural network that stores a truncated signed distance field. This formulation is particularly useful in regions where depth is missing and the color information can help fill in the gaps.
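    The truncated signed distance supervision they describe can be sketched as follows: given a depth measurement along a camera ray, each sample's distance to the observed surface is clamped to a truncation band. An illustrative sketch only; the function name and truncation value are my own:

```python
def truncated_sdf(t_sample, depth, trunc=0.05):
    """Signed distance of a point at ray distance t_sample from the
    surface observed at `depth`, truncated to [-trunc, trunc].
    Positive in front of the surface, negative behind it."""
    return max(-trunc, min(trunc, depth - t_sample))

print(truncated_sdf(0.96875, 1.0))  # 0.03125 (just in front of the surface)
print(truncated_sdf(2.0, 1.0))      # -0.05   (far behind, clamped)
```

    Supervising the network's predicted distances against such targets is what lets depth constrain geometry near observed surfaces, while color supervision fills in where depth is missing.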

    PAPER TITLE
    "Neural RGB-D Surface Reconstruction" 

    AUTHORS
    Dejan Azinović, Ricardo Martin-Brualla, Dan B Goldman, Matthias Nießner, Justus Thies

    ABSTRACT
    In this work, we explore how to leverage the success of implicit novel view synthesis methods for surface reconstruction. Methods which learn a neural radiance field have shown amazing image synthesis results, but the underlying geometry representation is only a coarse approximation of the real geometry. We demonstrate how depth measurements can be incorporated into the radiance field formulation to produce more detailed and complete reconstruction results than using methods based on either color or depth data alone. In contrast to a density field as the underlying geometry representation, we propose to learn a deep neural network which stores a truncated signed distance field. Using this representation, we show that one can still leverage differentiable volume rendering to estimate color values of the observed images during training to compute a reconstruction loss. This is beneficial for learning the signed distance field in regions with missing depth measurements. Furthermore, we correct for misalignment errors of the camera, improving the overall reconstruction quality. In several experiments, we showcase our method and compare to existing works on classical RGB-D fusion and learned representations.

    RELATED PAPERS
    📚 NeRF
    📚 BundleFusion

    LINKS AND RESOURCES
    💻 Project Page 
    💻 Code 

    To stay up to date with Dejan's latest research, follow him on:
    👨🏻‍🎓 Dejan's personal page
    🎓 Google Scholar
    🐦 Twitter
    👨🏻‍🎓LinkedIn

    Recorded on April 4th 2022.

    CONTACT

    If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com


    SUBSCRIBE AND FOLLOW

    🎧Subscribe on your favourite podcast app

    📧Subscribe to our mailing list

    🐦Follow us on Twitter

    🎥YouTube Channel


    #talkingpapers #CVPR2022 #NeuralRGBDSurfaceReconstruction #SurfaceReconstruction #NeRF  #3DVision #ComputerVision #AI #DeepLearning #MachineLearning  #deeplearning #AI #neuralnetworks #research  #artificialintelligence

    • 31 min
    Yuliang Xiu - ICON


    In this episode of the Talking Papers Podcast, I hosted Yuliang Xiu to chat about his paper "ICON: Implicit Clothed humans Obtained from Normals”, published in CVPR 2022. In this paper, they exploit the SMPL(-X) body model to infer clothed humans (conditioned on the body's normals). Additionally, they propose an inference-time feedback loop that alternates between refining the body's normals and the shape.

    PAPER TITLE 
    "ICON: Implicit Clothed humans Obtained from Normals"  https://bit.ly/3uXe6Yw

    AUTHORS
    Yuliang Xiu, Jinlong Yang, Dimitrios Tzionas, Michael J. Black

    ABSTRACT
    Current methods for learning realistic and animatable 3D clothed avatars need either posed 3D scans or 2D images with carefully controlled user poses. In contrast, our goal is to learn an avatar from only 2D images of people in unconstrained poses. Given a set of images, our method estimates a detailed 3D surface from each image and then combines these into an animatable avatar. Implicit functions are well suited to the first task, as they can capture details like hair and clothes. Current methods, however, are not robust to varied human poses and often produce 3D surfaces with broken or disembodied limbs, missing details, or non-human shapes. The problem is that these methods use global feature encoders that are sensitive to global pose. To address this, we propose ICON ("Implicit Clothed humans Obtained from Normals"), which, instead, uses local features. ICON has two main modules, both of which exploit the SMPL(-X) body model. First, ICON infers detailed clothed-human normals (front/back) conditioned on the SMPL(-X) normals. Second, a visibility-aware implicit surface regressor produces an iso-surface of a human occupancy field. Importantly, at inference time, a feedback loop alternates between refining the SMPL(-X) mesh using the inferred clothed normals and then refining the normals. Given multiple reconstructed frames of a subject in varied poses, we use SCANimate to produce an animatable avatar from them. Evaluation on the AGORA and CAPE datasets shows that ICON outperforms the state of the art in reconstruction, even with heavily limited training data. Additionally, it is much more robust to out-of-distribution samples, e.g., in-the-wild poses/images and out-of-frame cropping. ICON takes a step towards robust 3D clothed human reconstruction from in-the-wild images. This enables creating avatars directly from video with personalized and natural pose-dependent cloth deformation.

    RELATED PAPERS
    📚 Monocular Real-Time Volumetric Performance Capture https://bit.ly/3L2S4JF
    📚 PIFu https://bit.ly/3rBsrYN
    📚 PIFuHD https://bit.ly/3rymDiE

    LINKS AND RESOURCES
    💻 Project Page https://icon.is.tue.mpg.de/
    💻 Code  https://github.com/yuliangxiu/ICON

    To stay up to date with Yuliang's latest research, follow him on:
    👨🏻‍🎓 Yuliang's personal page:  https://bit.ly/3jQb16n
    🎓 Google Scholar:  https://bit.ly/3JW25ae
    🐦 Twitter:  https://twitter.com/yuliangxiu
    👨🏻‍🎓LinkedIn: https://www.linkedin.com/in/yuliangxiu/

    Recorded on March 11th 2022.

    CONTACT

    If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com


    SUBSCRIBE AND FOLLOW

    🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikb...

    📧Subscribe to our mailing list: http://eepurl.com/hRznqb

    🐦Follow us on Twitter: https://twitter.com/talking_papers

    🎥YouTube Channel: https://bit.ly/3eQOgwP


    #talkingpapers #CVPR2022 #ICON #ImplicitHumans  #3DVision #ComputerVision #AI #DeepLearning #MachineLearning  #deeplearning #AI #neuralnetworks #research  #artificialintelligence

    • 36 min
    Itai Lang - SampleNet


    In this episode of the Talking Papers Podcast, I hosted Itai Lang to chat about his paper "SampleNet: Differentiable Point Cloud Sampling”, published in CVPR 2020. In this paper, they propose a point soft-projection to allow differentiating through the sampling operation and enable learning task-specific point sampling. Combined with their regularization and task-specific losses, they can reduce the number of points to 3% of the original samples with very little impact on task performance. I met Itai for the first time at CVPR 2019. Being a point-cloud guy myself, I have been following his research work ever since. It is amazing how much progress he has made, and I can't wait to see what he comes up with next. It was a pleasure hosting him on the podcast.
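    The soft-projection idea can be sketched as replacing a hard nearest-neighbor pick with a temperature-controlled convex combination of neighbors, which is differentiable. A minimal NumPy sketch; the neighborhood size and temperature values here are illustrative, not the paper's:

```python
import numpy as np

def soft_project(q, cloud, temperature=0.1, k=3):
    """Differentiable stand-in for 'snapping' a query point q onto the
    cloud: a softmax-weighted average of its k nearest neighbors.
    As temperature -> 0, this approaches the hard nearest neighbor."""
    d2 = ((cloud - q) ** 2).sum(axis=1)   # squared distances to all points
    idx = np.argsort(d2)[:k]              # indices of k nearest neighbors
    w = np.exp(-d2[idx] / temperature)
    w /= w.sum()                          # convex combination weights
    return (w[:, None] * cloud[idx]).sum(axis=0)

cloud = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
out = soft_project(np.array([0.1, 0.0]), cloud, temperature=0.01)
print(out)  # close to the nearest neighbor [0, 0]
```

    Because the output is a smooth function of the distances, gradients can flow through the sampling step into the task network, which is what makes the sampling learnable.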

    PAPER TITLE 
    "SampleNet: Differentiable Point Cloud Sampling"  https://bit.ly/3wMFwll

    AUTHORS
    Itai Lang, Asaf Manor, Shai Avidan

    ABSTRACT
    We introduce a novel differentiable relaxation for point cloud sampling that approximates sampled points as a mixture of points in the primary input cloud. Our approximation scheme leads to consistently good results on classification and geometry reconstruction applications. We also show that the proposed sampling method can be used as a front to a point cloud registration network. This is a challenging task since sampling must be consistent across two different point clouds for a shared downstream task. In all cases, our approach outperforms existing non-learned and learned sampling alternatives. Our code is publicly available.

    RELATED PAPERS
    📚 Learning to Sample https://bit.ly/3vd1FZd
    📚 Farthest Point Sampling (FPS)  https://bit.ly/3Lkcyx9

    LINKS AND RESOURCES
    💻 Code  https://bit.ly/3NoS0pb

    To stay up to date with Itai's latest research, follow him on:

    🎓 Google Scholar: https://bit.ly/3wCMY2u
    🐦 Twitter: https://twitter.com/ItaiLang

    Recorded on February 15th 2022.

    CONTACT

    If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com


    SUBSCRIBE AND FOLLOW

    🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikb...

    📧Subscribe to our mailing list: http://eepurl.com/hRznqb

    🐦Follow us on Twitter: https://twitter.com/talking_papers

    🎥YouTube Channel: https://bit.ly/3eQOgwP


    #talkingpapers #SampleNet #LearnToSample #CVPR2020 #3DVision #ComputerVision #AI #DeepLearning #MachineLearning  #deeplearning #AI #neuralnetworks #research  #artificialintelligence

    • 37 min
