13 episodes

It's a podcast akin to grabbing coffee with Creative AI researchers. The target audience is people with technical competency. The focus will be on the Creative AI space.

Obsessed with consuming the latest Creative AI research, we found ourselves stuck between two types of output from this nascent industry: PhD-level research or BuzzFeed-level simplification. This is our attempt to deliver a hybrid that people can comprehend and, someday, contribute to.

Creative AI Podcast by Chad Wittman

    • Technology

    Weakly Supervised Generative Adversarial Networks for 3D

    Something that we often take for granted as humans is the ability to look at an object and have an approximately correct mental model of its 3D shape. For example, if you were to see a small image of a steering wheel, you'd probably have a fairly accurate statistical representation of other aspects of the car. You'd have a pretty good guess of where the driver might be, the engine, the four wheels, etc.

    However, this intuitive approach for humans is difficult for machines, so we need to develop techniques to pass this information to them. The faster and easier it is for us to do so, the more quickly we can leverage this technology and its widespread applications.

    These researchers are working on improving just that: their work takes 2D imagery and generates 3D representations of the objects in it. This is incredible work, and I'm super excited to see better, faster ways of getting 3D data into a world we're starting to explore through things like augmented and virtual reality.

    Check out their paper here: https://arxiv.org/pdf/1705.10904.pdf
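
    To make that concrete, here's a minimal sketch of one way 2D images can weakly supervise a 3D generator: project the predicted voxel occupancy grid down to a 2D silhouette and penalize disagreement with the observed mask. This is an illustration of the general idea, not the authors' exact pipeline; the tensors below are hypothetical placeholders.

    import torch
    import torch.nn.functional as F

    def project_silhouette(voxels):
        # Collapse the occupancy grid along the depth axis; the max over
        # depth approximates the 2D silhouette a camera would see.
        return voxels.max(dim=-1).values

    # Toy usage: a generator's voxel output scored against an observed 2D mask.
    pred_voxels = torch.rand(1, 32, 32, 32, requires_grad=True)  # occupancies in [0, 1]
    observed_mask = (torch.rand(1, 32, 32) > 0.5).float()        # silhouette "label"
    loss = F.binary_cross_entropy(project_silhouette(pred_voxels), observed_mask)
    loss.backward()  # gradients flow back into the 3D prediction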

    Are you a Machine Learning Engineer looking for a new job? Machine Learning Jobs by Amazon: http://jobs.ai-guild.com

    • 36 min
    Training Object Class Detectors with Click Supervision

    Having a machine identify an object within an image has made significant progress over the years; however, researchers continue to work on making it more accurate, faster, and ultimately cheaper. One way of doing that is improving the training data. In their research, we see how a baseline object classifier is applied to the image. In most cases it covers part of the intended target but misses key details. By adding a modest annotation input from humans, in this case asking someone to click on the center of the object, they've been able to reduce total annotation time by roughly 9x to 18x.

    http://calvin.inf.ed.ac.uk/datasets/center-click-annotations/

    Full Paper: https://arxiv.org/pdf/1704.06189.pdf

    Abstract: "Training object class detectors typically requires a large set of images with objects annotated by bounding boxes. However, manually drawing bounding boxes is very time consuming. In this paper we greatly reduce annotation time by proposing center-click annotations: we ask annotators to click on the center of an imaginary bounding box which tightly encloses the object instance. We then incorporate these clicks into existing Multiple Instance Learning techniques for weakly supervised object localization, to jointly localize object bounding boxes over all training images. Extensive experiments on PASCAL VOC 2007 and MS COCO show that: (1) our scheme delivers high-quality detectors, performing substantially better than those produced by weakly supervised techniques, with a modest extra annotation effort; (2) these detectors in fact perform in a range close to those trained from manually drawn bounding boxes; (3) as the center-click task is very fast, our scheme reduces total annotation time by 9× to 18×."
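
    As a rough illustration of how a single click can guide localization: score each candidate box by how close its center falls to the annotator's click, normalized by box size, and let that score re-weight the candidates that multiple instance learning would otherwise consider. The function below is a hypothetical stand-in, not the paper's exact formulation.

    import numpy as np

    def click_score(boxes, click, sigma=0.3):
        # boxes: (N, 4) proposals as [x1, y1, x2, y2]; click: (2,) center click.
        centers = (boxes[:, :2] + boxes[:, 2:]) / 2
        diagonals = np.linalg.norm(boxes[:, 2:] - boxes[:, :2], axis=1)
        # Distance from each box center to the click, normalized by the box
        # diagonal so boxes of different sizes compete fairly.
        dist = np.linalg.norm(centers - click, axis=1) / diagonals
        return np.exp(-dist ** 2 / (2 * sigma ** 2))

    boxes = np.array([[10, 10, 50, 50], [30, 30, 90, 90]], dtype=float)
    print(click_score(boxes, np.array([30.0, 30.0])))  # box centered on the click scores ~1.0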

    • 33 min
    Image-to-Image Translation w/ Conditional Adversarial Nets

    In this week's episode I chat with Philip, who worked on Image-to-Image Translation with Conditional Adversarial Networks. For this episode, I'd highly recommend you check out the YouTube version. You'll see a few different examples, such as taking machine vision output from something like an autonomous car and translating it into a more realistic image. Other examples of the same model include converting a day photo to a night photo, translating a satellite aerial photo into a Google Maps-esque design, and what Philip calls "Edges to Photos," which is sort of like making a simple drawing and having the model turn it into a somewhat realistic photo. The applications of this type of work are pretty widespread. Let's hop into the show!

    https://youtu.be/B1bMMF8miN8

    "We investigate conditional adversarial networks as a general-purpose solution to image-to-image translation problems. These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations. We demonstrate that this approach is effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. As a community, we no longer hand-engineer our mapping functions, and this work suggests we can achieve reasonable results without hand-engineering our loss functions either." — Abstract from the paper.

    Check out their paper here: https://arxiv.org/pdf/1611.07004.pdf
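
    For the technically curious, here's a minimal runnable sketch of the conditional-GAN objective the abstract describes: the discriminator judges (input, output) pairs, and the generator combines an adversarial term with an L1 term pulling its output toward the target image. The lam=100 weighting mirrors the paper; D and the x/y tensors are assumed placeholders for your own networks and data.

    import torch
    import torch.nn.functional as F

    def generator_loss(D, x, y_fake, y_real, lam=100.0):
        # Adversarial term: the generator wants D to label its output as real...
        pred_fake = D(torch.cat([x, y_fake], dim=1))
        adv = F.binary_cross_entropy_with_logits(pred_fake, torch.ones_like(pred_fake))
        # ...plus an L1 term keeping the output close to the ground truth.
        return adv + lam * F.l1_loss(y_fake, y_real)

    def discriminator_loss(D, x, y_fake, y_real):
        # D sees (input, output) pairs, which is what lets it act as a
        # learned, task-specific loss function.
        pred_real = D(torch.cat([x, y_real], dim=1))
        pred_fake = D(torch.cat([x, y_fake.detach()], dim=1))
        return (F.binary_cross_entropy_with_logits(pred_real, torch.ones_like(pred_real)) +
                F.binary_cross_entropy_with_logits(pred_fake, torch.zeros_like(pred_fake)))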

    Are you a Machine Learning Engineer looking for a new job? Machine Learning Jobs by Amazon: http://jobs.ai-guild.com

    • 21 min
    A Singing Synthesizer Based on PixelCNN E10

    In this episode we talk to Merlijn about his work on singing synthesizers. Machine-learning-based approaches to singing synthesis have thus far often resulted in subpar sound quality. Merlijn and the team set out to take advantage of advancements in deep generative models, building their synthesizer around a similar architecture (a Gated PixelCNN).

    Listen to their work here: http://www.dtic.upf.edu/~mblaauw/MdM_NIPS_seminar/
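
    The "gated" part of Gated PixelCNN refers to a specific activation: each layer's convolution output is split in two, with one half squashed by tanh and multiplied by a sigmoid gate computed from the other half. A minimal sketch of that unit (the 1D/audio framing and layer sizes are illustrative assumptions):

    import torch

    class GatedActivation(torch.nn.Module):
        # y = tanh(W_f * x) * sigmoid(W_g * x), the core unit of Gated PixelCNN.
        def __init__(self, channels, kernel_size=3):
            super().__init__()
            # A single convolution produces both the filter and gate halves.
            self.conv = torch.nn.Conv1d(channels, 2 * channels, kernel_size,
                                        padding=kernel_size - 1)

        def forward(self, x):
            h = self.conv(x)[..., :x.shape[-1]]  # trim so no step sees future input
            f, g = h.chunk(2, dim=1)
            return torch.tanh(f) * torch.sigmoid(g)

    out = GatedActivation(16)(torch.randn(1, 16, 100))  # (batch, channels, time)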

    Follow the podcast: http://ai-guild.com

    Want to see more Creative AI research? http://www.creativeai.net

    Are you a Machine Learning Engineer looking for a new job? Machine Learning Jobs by Amazon: http://jobs.ai-guild.com

    • 29 min
    Generating a 3D Human Pose & Shape from a Single Image E9

    When a human sees another human, they possess a wealth of information about how that body might exist within our physical world. When a machine sees a human in the same pose, it does not understand the pose and its implications. For example, the machine doesn't know that my elbow bends inwards instead of outwards. For future robots interacting with humans, especially any robots that would physically touch a human, it will be incredibly important for them to understand the ways in which I can and cannot bend.

    This episode we speak with Christoph Lassner who worked on extracting a 3D model of a human pose & shape based on a single photo. We’ll show a few examples of what this looks like and how it works in the show.

    Details: http://www.creativeai.net/posts/v8w2WYnQaahwcsbfs/smplify-3d-human-pose-and-shape-from-a-single-image
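
    Under the hood, the method is an optimization: project the body model's 3D joints through the camera and adjust the model's parameters until the projections line up with 2D keypoints detected in the photo, with priors keeping the pose anatomically plausible. Here's a toy version of that loop, with free 3D points standing in for the real body-model parameters and the priors omitted; the camera intrinsics and keypoints are made-up numbers.

    import torch

    focal, center = 500.0, torch.tensor([112.0, 112.0])        # assumed intrinsics
    joints_2d = torch.tensor([[100.0, 120.0], [130.0, 115.0]])  # "detected" keypoints
    joints_3d = torch.zeros(2, 3, requires_grad=True)
    with torch.no_grad():
        joints_3d[:, 2] = 2.0  # start the joints two meters in front of the camera

    opt = torch.optim.Adam([joints_3d], lr=1e-2)
    for _ in range(200):
        # Pinhole projection of the current 3D joint estimates.
        proj = focal * joints_3d[:, :2] / joints_3d[:, 2:3] + center
        loss = ((proj - joints_2d) ** 2).sum()  # the real method adds pose/shape priors
        opt.zero_grad()
        loss.backward()
        opt.step()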

    Follow the podcast: http://ai-guild.com

    Want to see more Creative AI research? http://www.creativeai.net

    Are you a Machine Learning Engineer looking for a new job? Machine Learning Jobs by Amazon: http://jobs.ai-guild.com

    • 25 min
    Generating Music with AI (ALYSIA) E8

    Maya Ackerman and David Loker are the creators of ALYSIA. ALYSIA stands for Automated LYrical SongwrIting Application, and it's a tool for generating music through automated songwriting.

    The applications for this type of tool are immense. During the Industrial Revolution, the people who embraced the machines and learned to work side by side with them were the ones who prospered and dramatically increased their efficiency. This was a big step for humanity. Tools like ALYSIA are laying the foundation for the same kind of Artificially Intelligent Revolution.

    Paper: https://arxiv.org/pdf/1612.01058v1.pdf

    A song Maya made with ALYSIA: https://youtu.be/whgudcj82_I

    A melody that music professor Josh Palkki made with ALYSIA: https://youtu.be/611-UBmcU58

    Follow the podcast: http://ai-guild.com

    Want to see more Creative AI research? http://www.creativeai.net

    Are you a Machine Learning Engineer looking for a new job? Machine Learning Jobs by Amazon: http://jobs.ai-guild.com

    • 25 min
