
Talking Papers Podcast - Itzik Ben-Shabat
A podcast by researchers for researchers. This podcast aims to be a new medium for disseminating research. In each episode I talk to the main author of an academic paper in the field of computer vision, machine learning, artificial intelligence, graphics and everything in between. Each episode is structured like a paper and includes a TL;DR (abstract), related work, approach, results, conclusions and a future work section. It also includes the bonus "what did reviewer 2 say" section where authors share their experience in the peer review process. Enjoy!
INR2Vec - Luca De Luigi
All links are available in the blog post: https://www.itzikbs.com/inr2vec/
In this episode of the Talking Papers Podcast, I hosted Luca De Luigi. We had a great chat about his paper “Deep Learning on Implicit Neural Representations of Shapes”, AKA INR2Vec, published in ICLR 2023.
In this paper, they take implicit neural representations to the next level and use them as input signals for neural networks to solve multiple downstream tasks. The core idea was captured by one of the authors in a very catchy and concise tweet: "Signals are networks so networks are data and so networks can process other networks to understand and generate signals".
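For intuition, here is a minimal PyTorch sketch of the "networks are data" idea: a small MLP INR representing one shape is flattened into its weight vector, and an encoder maps that vector to a compact embedding a downstream pipeline could consume. This is my own illustration, not the authors' implementation; the network sizes and the encoder architecture are placeholders.

```python
import torch
import torch.nn as nn

# A small INR: an MLP that maps 3D coordinates to signed-distance values.
class ShapeINR(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, xyz):
        return self.net(xyz)

def flatten_weights(inr: nn.Module) -> torch.Tensor:
    # Treat the network itself as the data: concatenate all of its parameters.
    return torch.cat([p.detach().flatten() for p in inr.parameters()])

# Hypothetical encoder in the spirit of inr2vec: weight vector -> compact latent code.
class INREncoder(nn.Module):
    def __init__(self, in_dim, embed_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(in_dim, 1024), nn.ReLU(),
            nn.Linear(1024, embed_dim),
        )

    def forward(self, w):
        return self.mlp(w)

inr = ShapeINR()                        # one INR = one 3D shape
w = flatten_weights(inr)                # the "signal" is the network's weights
encoder = INREncoder(in_dim=w.numel())
embedding = encoder(w)                  # compact embedding for downstream tasks
print(embedding.shape)                  # torch.Size([256])
```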
Luca recently received his PhD from the University of Bologna and is currently working at eyecan.ai, a startup based in Bologna. His research focuses on neural representations of signals, especially for 3D geometry. To be honest, I knew I wanted to get Luca on the podcast the second I saw the paper on arXiv, because I was working on a related topic but had to shelve it due to time management issues. This paper got me excited about that topic again. I didn't know Luca before recording the episode, and it was a delight to get to know him and his work.
AUTHORS
Luca De Luigi, Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano
ABSTRACT
When applied to 3D shapes, INRs allow to overcome the fragmentation and shortcomings of the popular discrete representations used so far. Yet, considering that INRs consist in neural networks, it is not clear whether and how it may be possible to feed them into deep learning pipelines aimed at solving a downstream task. In this paper, we put forward this research problem and propose inr2vec, a framework that can compute a compact latent representation for an input INR in a single inference pass. We verify that inr2vec can embed effectively the 3D shapes represented by the input INRs and show how the produced embeddings can be fed into deep learning pipelines to solve several tasks by processing exclusively INRs.
RELATED PAPERS
📚SIREN
📚DeepSDF
📚PointNet
LINKS AND RESOURCES
📚 Paper
💻Project page
SPONSOR
This episode was sponsored by YOOM. YOOM is an Israeli startup dedicated to volumetric video creation. They were voted as the 2022 best start-up to work for by Dun’s 100.
Join their team that works on geometric deep learning research, implicit representations of 3D humans, NeRFs, and 3D/4D generative models.
Visit https://www.yoom.com/
For job opportunities with YOOM visit https://www.yoom.com/careers/
CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com
This episode was recorded on March 22, 2023.
#talkingpapers #ICLR2023 #INR2Vec #ComputerVision #AI #DeepLearning #MachineLearning #INR #ImplicitNeuralRepresentation #research #artificialintelligence #podcasts
🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦Follow us on Twitter: https://twitter.com/talking_papers
🎥YouTube Channel: https://bit.ly/3eQOgwP
CLIPasso - Yael Vinker
In this episode of the Talking Papers Podcast, I hosted Yael Vinker. We had a great chat about her paper “CLIPasso: Semantically-Aware Object Sketching”, a SIGGRAPH 2022 Best Paper Award winner.
In this paper, they convert images into sketches with different levels of abstraction. They avoid the need for sketch datasets by using the well-known CLIP model to distil the semantic concepts from sketches and images. There is no network training here, just optimizing the control points of Bézier curves to model the sketch strokes (initialized from a saliency map). How is this differentiable? They use a differentiable rasterizer. The degree of abstraction is controlled by the number of strokes. Don't miss the amazing demo they created.
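Here is a rough sketch of that optimization loop in PyTorch. The rasterizer and the image encoder below are dummy stand-ins I made up for illustration; the actual method uses a differentiable vector-graphics rasterizer and a frozen CLIP model, and initializes the strokes from a saliency map. Only the loop structure mirrors the paper.

```python
import torch

N_STROKES = 16          # the abstraction level: fewer strokes = more abstract
CTRL_POINTS = 4         # cubic Bezier control points per stroke

def render_strokes(points):
    # Placeholder for a differentiable rasterizer: it only has to keep
    # gradients flowing from the rendered sketch back to the control points.
    return points.reshape(1, -1)

proj = torch.randn(N_STROKES * CTRL_POINTS * 2, 512)

def encode(image):
    # Placeholder for the frozen CLIP image encoder.
    return image @ proj

# CLIP features of the target photo (random here, from the real encoder in practice).
target_feat = torch.randn(1, 512)

# Control points of the strokes; the paper initializes these from a saliency map.
points = torch.rand(N_STROKES, CTRL_POINTS, 2, requires_grad=True)
opt = torch.optim.Adam([points], lr=0.1)

for step in range(200):
    sketch_feat = encode(render_strokes(points))
    # CLIP-based perceptual loss: make the sketch "mean" the same thing as the photo.
    loss = 1 - torch.cosine_similarity(sketch_feat, target_feat, dim=-1).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```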
Yael is currently a PhD student at Tel Aviv University. Her research focuses on computer vision, machine learning, and computer graphics, with a unique twist of combining art and technology. This work was done as part of her internship at EPFL.
AUTHORS
Yael Vinker, Ehsan Pajouheshgar, Jessica Y. Bo, Roman Bachmann, Amit Haim Bermano, Daniel Cohen-Or, Amir Zamir, Ariel Shamir
ABSTRACT
Abstraction is at the heart of sketching due to the simple and minimal nature of line drawings. Abstraction entails identifying the essential visual properties of an object or scene, which requires semantic understanding and prior knowledge of high-level concepts. Abstract depictions are therefore challenging for artists, and even more so for machines. We present an object sketching method that can achieve different levels of abstraction, guided by geometric and semantic simplifications. While sketch generation methods often rely on explicit sketch datasets for training, we utilize the remarkable ability of CLIP (Contrastive-Language-Image-Pretraining) to distil semantic concepts from sketches and images alike. We define a sketch as a set of Bézier curves and use a differentiable rasterizer to optimize the parameters of the curves directly with respect to a CLIP-based perceptual loss. The abstraction degree is controlled by varying the number of strokes. The generated sketches demonstrate multiple levels of abstraction while maintaining recognizability, underlying structure, and essential visual components of the subject drawn.
RELATED PAPERS
📚CLIP: Connecting Text and Images
📚Differentiable Vector Graphics Rasterization for Editing and Learning
LINKS AND RESOURCES
📚 Paper
💻Project page
SPONSOR
This episode was sponsored by YOOM. YOOM is an Israeli startup dedicated to volumetric video creation. They were voted as the 2022 best start-up to work for by Dun’s 100.
Join their team that works on geometric deep learning research, implicit representations of 3D humans, NeRFs, and 3D/4D generative models.
Visit YOOM.com.
CONTACT
If you would like to be a guest, sponsor or share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com
🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦Follow us on Twitter: https://twitter.com/talking_papers
🎥YouTube Channel: https://bit.ly/3eQOgwP
Random Walks for Adversarial Meshes - Amir Belder
All links are available in the blog post.
In this episode of the Talking Papers Podcast, we hosted Amir Belder. We had a great chat about his paper "Random Walks for Adversarial Meshes”, published in SIGGRAPH 2022.
In this paper, they take on the task of creating an adversarial attack for triangle meshes. This is a non-trivial task since meshes are irregular. To deal with the irregularity, they operate on random walks over the mesh instead of the raw mesh. On top of that, they train an imitating network that mimics the predictions of the attacked network and use its gradients to perturb the input vertices.
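Here is a hedged PyTorch sketch of those two ingredients. The imitator network, the victim's predictions, and the random-walk extraction are hypothetical placeholders passed in as arguments; the snippet only illustrates the distill-then-perturb structure, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def distill_step(imitator, optimizer, walks, black_box_probs):
    # Train the imitator to reproduce the victim network's predictions
    # on random walks sampled from the mesh surface.
    logits = imitator(walks)
    loss = F.kl_div(F.log_softmax(logits, dim=-1), black_box_probs,
                    reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

def perturb_vertices(imitator, vertices, walks_fn, true_label, eps=1e-3):
    # FGSM-style step on the vertex coordinates, guided by the imitator's
    # gradients (the black-box network never exposes its own gradients).
    vertices = vertices.detach().clone().requires_grad_(True)
    logits = imitator(walks_fn(vertices))
    loss = F.cross_entropy(logits, true_label)
    loss.backward()
    # Nudge vertices in the direction that increases the classification loss.
    return (vertices + eps * vertices.grad.sign()).detach()
```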
Amir is currently a PhD student at the Computer Graphics and Multimedia Lab at the Technion - Israel Institute of Technology. His research focuses on computer graphics, geometry processing, and machine learning. We spend a lot of time together at the lab and often chat about science, papers, and where the field is headed. Having this paper published was a great opportunity to share one of these conversations with you.
AUTHORS
Amir Belder, Gal Yefet, Ran Ben-Itzhak, Ayellet Tal
ABSTRACT
A polygonal mesh is the most-commonly used representation of surfaces in computer graphics. Therefore, it is not surprising that a number of mesh classification networks have recently been proposed. However, while adversarial attacks are widely researched in 2D, the field of adversarial meshes is under explored. This paper proposes a novel, unified, and general adversarial attack, which leads to misclassification of several state-of-the-art mesh classification neural networks. Our attack approach is black-box, i.e. it has access only to the network’s predictions, but not to the network’s full architecture or gradients. The key idea is to train a network to imitate a given classification network. This is done by utilizing random walks along the mesh surface, which gather geometric information. These walks provide insight onto the regions of the mesh that are important for the correct prediction of the given classification network. These mesh regions are then modified more than other regions in order to attack the network in a manner that is barely visible to the naked eye.
RELATED PAPERS
📚Explaining and Harnessing Adversarial Examples
📚Meshwalker: Deep mesh understanding by random walks
LINKS AND RESOURCES
📚 Paper
💻Code
To stay up to date with Amir's latest research, follow him on:
🐦Twitter
👨🏻🎓Google Scholar
👨🏻🎓LinkedIn
CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com
This episode was recorded on November 23rd 2022.
#talkingpapers #SIGGRAPH2022 #RandomWalks #MeshWalker #AdversarialAttacks #Mesh #ComputerVision #AI #DeepLearning #MachineLearning #ComputerGraphics #research #artificialintelligence #podcasts
🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦Follow us on Twitter: https://twitter.com/talking_papers
🎥YouTube Channel: https://bit.ly/3eQOgwP
SPSR - Silvia Sellán
In this episode of the Talking Papers Podcast, I hosted Silvia Sellán. We had a great chat about her paper "Stochastic Poisson Surface Reconstruction”, published in SIGGRAPH Asia 2022.
In this paper, they take on the task of surface reconstruction with a probabilistic twist. They take the well-known Poisson Surface Reconstruction algorithm and generalize it to give it a full statistical formalism. Essentially, their method quantifies the uncertainty of surface reconstruction from a point cloud. Instead of outputting an implicit function, they represent the shape as a modified Gaussian process. This unique perspective and interpretation enables statistical queries, for example: given a point, is it on the surface? Is it inside the shape?
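As a toy illustration of such a query (my own sketch, not the paper's code): if the Gaussian process gives a posterior mean and standard deviation for the implicit function at a query point, then "is this point inside?" and "is this point near the surface?" reduce to evaluating the Gaussian CDF.

```python
from scipy.stats import norm

def prob_inside(mu, sigma, iso=0.0):
    # Probability that the implicit value at the query point falls below the
    # iso-level, i.e. that the point lies inside the shape.
    return norm.cdf((iso - mu) / sigma)

def prob_near_surface(mu, sigma, iso=0.0, band=0.01):
    # Probability that the implicit value lies within a thin band around the
    # iso-level, a proxy for "is this point on the surface?".
    return norm.cdf((iso + band - mu) / sigma) - norm.cdf((iso - band - mu) / sigma)

# A point whose posterior mean is slightly negative (inside-ish) but uncertain.
print(prob_inside(mu=-0.02, sigma=0.05))        # ~0.66
print(prob_near_surface(mu=-0.02, sigma=0.05))  # ~0.15
```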
Silvia is currently a PhD student at the University of Toronto. Her research focuses on computer graphics and geometry processing. She is a Vanier Doctoral Scholar, an Adobe Research Fellow, and the winner of the 2021 UofT FAS Dean's Doctoral Excellence Scholarship. I have been following Silvia's work for a while, and since I have done some work on surface reconstruction myself, I knew I wanted to host her on the podcast the moment SPSR came out (and gladly, she agreed). Silvia is currently looking for postdoc and faculty positions to start in the fall of 2024. I am really looking forward to seeing which institute snatches her up.
In our conversation, I particularly liked her explanation of Gaussian Processes with the example "How long does it take my supervisor to answer an email, as a function of the time of day the email was sent?" You can't read that in any book. We also took an unexpected pause from the usual episode structure to discuss the question of "papers" as a medium for disseminating research. Don't miss it.
AUTHORS
Silvia Sellán, Alec Jacobson
ABSTRACT
We introduce a statistical extension of the classic Poisson Surface Reconstruction (PSR) algorithm for recovering shapes from 3D point clouds. Instead of outputting an implicit function, we represent the reconstructed shape as a modified Gaussian Process, which allows us to conduct statistical queries (e.g., the likelihood of a point in space being on the surface or inside a solid). We show that this perspective: improves PSR's integration into the online scanning process, broadens its application realm, and opens the door to other lines of research such as applying task-specific priors.
RELATED PAPERS
📚Poisson Surface Reconstruction
📚Geometric Priors for Gaussian Process Implicit Surfaces
📚Gaussian processes for machine learning
LINKS AND RESOURCES
📚 Paper
💻Project page
To stay up to date with Silvia's latest research, follow her on:
🐦Twitter
👨🏻🎓Google Scholar
🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦Follow us on Twitter: https://twitter.com/talking_papers
🎥YouTube Channel: https://bit.ly/3eQOgwP
Beyond Periodicity - Sameera Ramasinghe
In this episode of the Talking Papers Podcast, I hosted Sameera Ramasinghe. We had a great chat about his paper “Beyond Periodicity: Towards a Unifying Framework for Activations in Coordinate-MLPs”, published in ECCV 2022 as an oral presentation.
In this paper, they propose a new family of activation functions for coordinate MLPs and provide a theoretical analysis of their effectiveness. Their main proposition is that the stable rank is a good measure and design tool for such activation functions. They show that the proposed activations outperform the traditional ReLU and sine activations for image parametrization and novel view synthesis. They further show that while the proposed family of activations does not require positional encoding, it can still benefit from it, since positional encoding allows reducing the number of layers significantly.
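For a flavor of what this looks like in code, here is a small PyTorch sketch (an illustration, not the authors' code) of a coordinate MLP with a Gaussian activation, one member of this kind of non-periodic activation family, together with the stable rank of a weight matrix, computed as the squared Frobenius norm divided by the squared spectral norm. The layer sizes and the sigma value are arbitrary choices for the example.

```python
import torch
import torch.nn as nn

class GaussianActivation(nn.Module):
    # A non-periodic activation for coordinate MLPs: exp(-x^2 / (2 * sigma^2)).
    def __init__(self, sigma=0.1):
        super().__init__()
        self.sigma = sigma

    def forward(self, x):
        return torch.exp(-x.pow(2) / (2 * self.sigma ** 2))

# A coordinate MLP mapping 2D pixel coordinates to RGB, with no positional encoding.
coord_mlp = nn.Sequential(
    nn.Linear(2, 256), GaussianActivation(),
    nn.Linear(256, 256), GaussianActivation(),
    nn.Linear(256, 3),
)

def stable_rank(weight: torch.Tensor) -> torch.Tensor:
    # Stable rank = ||W||_F^2 / ||W||_2^2, the measure discussed in the episode.
    return weight.norm("fro") ** 2 / torch.linalg.matrix_norm(weight, ord=2) ** 2

rgb = coord_mlp(torch.rand(1024, 2))       # predicted colors for 1024 pixel coordinates
print(stable_rank(coord_mlp[2].weight))    # stable rank of a hidden layer's weights
```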
Sameera is currently an applied scientist at Amazon and the CTO and co-founder of ConscientAI. His research focus is theoretical machine learning and computer vision. This work was done when he was a postdoc at the Australian Institute of Machine Learning (AIML). He completed his PhD at the Australian National University (ANU). We first met back in 2019 when I was a research fellow at ANU and he was still doing his PhD. I immediately noticed we share research interests and after a short period of time, I flagged him as a rising star in the field. It was a pleasure to chat with Sameera and I am looking forward to reading his future papers.
AUTHORS
Sameera Ramasinghe, Simon Lucey
RELATED PAPERS
📚NeRF
📚SIREN
📚"Fourier Features Let Networks Learn High-Frequency Functions in Low Dimensional Domains"
📚On the Spectral Bias of Neural Networks
LINKS AND RESOURCES
📚 Paper
💻Code
To stay up to date with Sameera's latest research, follow him on:
🐦Twitter
👨🏻🎓Google Scholar
👨🏻🎓LinkedIn
This episode was recorded on November 14th, 2022.
🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦Follow us on Twitter: https://twitter.com/talking_papers
🎥YouTube Channel: https://bit.ly/3eQOgwP
KeypointNeRF - Marko Mihajlovic
In this episode of the Talking Papers Podcast, I hosted Marko Mihajlovic. We had a great chat about his paper “KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints”, published in ECCV 2022.
In this paper, they create a generalizable NeRF for virtual avatars. To get a high-fidelity reconstruction of humans (from sparse observations), they leverage an off-the-shelf keypoint detector in order to condition the NeRF.
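To give a feel for what "relative spatial encoding of keypoints" can mean in practice, here is a simplified sketch of my own (not the paper's exact formulation): each query point along a ray is described by its offsets to a sparse set of detected 3D keypoints, passed through a sinusoidal encoding, so the conditioning is relative to the person rather than to global scene coordinates.

```python
import torch

def relative_keypoint_encoding(query_xyz, keypoints_xyz, num_freqs=4):
    # query_xyz: (N, 3) points sampled along camera rays.
    # keypoints_xyz: (K, 3) detected 3D keypoints (e.g. body or face joints).
    offsets = query_xyz[:, None, :] - keypoints_xyz[None, :, :]   # (N, K, 3)
    freqs = 2.0 ** torch.arange(num_freqs) * torch.pi             # (F,)
    angles = offsets[..., None] * freqs                           # (N, K, 3, F)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return enc.reshape(query_xyz.shape[0], -1)                    # (N, K * 3 * 2F)

queries = torch.rand(8, 3)       # sample points along rays
keypoints = torch.rand(13, 3)    # keypoints from an off-the-shelf detector
features = relative_keypoint_encoding(queries, keypoints)
print(features.shape)            # torch.Size([8, 312])
```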
Marko is a second-year PhD student at ETH, supervised by Siyu Tang. His research focuses on photorealistic reconstruction of static and dynamic scenes, as well as on modeling parametric human bodies. This work was done mainly during his internship at Meta Reality Labs. Marko and I met at CVPR 2022.
AUTHORS
Marko Mihajlovic, Aayush Bansal, Michael Zollhoefer, Siyu Tang, Shunsuke Saito
ABSTRACT
Image-based volumetric humans using pixel-aligned features promise generalization to unseen poses and identities. Prior work leverages global spatial encodings and multi-view geometric consistency to reduce spatial ambiguity. However, global encodings often suffer from overfitting to the distribution of the training data, and it is difficult to learn multi-view consistent reconstruction from sparse views. In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views. One of the key ideas is to encode relative spatial 3D information via sparse 3D keypoints. This approach is robust to the sparsity of viewpoints and cross-dataset domain gap. Our approach outperforms state-of-the-art methods for head reconstruction. On human body reconstruction for unseen subjects, we also achieve performance comparable to prior work that uses a parametric human body model and temporal feature aggregation. Our experiments show that a majority of errors in prior work stem from an inappropriate choice of spatial encoding and thus we suggest a new direction for high-fidelity image-based human modeling.
RELATED PAPERS
📚NeRF
📚IBRNet
📚PIFu
LINKS AND RESOURCES
💻Project website
📚 Paper
💻Code
🎥Video
To stay up to date with Marko's latest research, follow him on:
👨🏻🎓Personal Page
🐦Twitter
👨🏻🎓Google Scholar
CONTACT
If you would like to be a guest, sponsor or just share your thoughts, feel free to reach out via email: talking.papers.podcast@gmail.com
🎧Subscribe on your favourite podcast app: https://talking.papers.podcast.itzikbs.com
📧Subscribe to our mailing list: http://eepurl.com/hRznqb
🐦Follow us on Twitter: https://twitter.com/talking_papers
🎥YouTube Channel: https://bit.ly/3eQOgwP