7 episodes

Deep Dive: AI is an online event from the Open Source Initiative. We’ll be exploring how Artificial Intelligence impacts Open Source software, from developers to businesses to the rest of us.

Deep Dive: AI

    How to secure AI systems

    With so many artificial systems claiming “intelligence” available to the public, making sure they do what they’re designed to do is of the utmost importance. Dr. Bruce Draper, Program Manager of the Information Innovation Office at DARPA, joins us on this bonus episode of Deep Dive: AI to unpack his work in the field and his current role. We have a fascinating chat with Draper about the risks and opportunities involved in this exciting field, and why growing bigger and more involved Open Source communities is better for everyone. Draper introduces us to the Guaranteeing AI Robustness Against Deception (GARD) Project, its main short-term goals, and how these aim to mitigate exposure to danger while we explore the possibilities that machine learning offers. We also spend time discussing the agency’s Open Source philosophy and foundation, the AI boom in recent years, why policy making is so critical, the split between academic and corporate contributions, and much more. For Draper, community involvement is critical to spot potential issues and threats. Tune in to hear it all from this exceptional guest! Read the full transcript.
    Key points from this episode:

    The objectives of the GARD project and DARPA’s broader mission.
    How the Open Source model plays into the research strategy at DARPA.
    Differences between machine learning and more traditional IT systems.
    Draper talks about his ideas for ideal communities and the role of stakeholders.
    Key factors behind the ‘extended summer of AI’ we have been experiencing.
    Getting involved in the GARD Project and how the community makes the systems more secure.
    The main impetus for the AI community to address these security concerns.
    Draper explains the complications of safety-critical AI systems.
    Deployment opportunities and concurrent development for optimum safety.
    Thoughts on the scope and role of policy makers in the AI security field.
    The need for a deeper theoretical understanding of possible and present threats.
    Draper talks about the broader goal of a self-sustaining Open Source community.
    Plotting the future role and involvement of DARPA in the community.
    The partners that DARPA works with: academic and corporate.
    The story of how Draper got involved with the GARD Project and adversarial AI.
    Looking at the near future for Draper and DARPA.
    Reflections on the last few years in AI and how much of this could have been predicted.

    Links mentioned in this episode:

    Dr. Bruce Draper
    DARPA
    Moderna
    ChatGPT
    DALL-E
    Adversarial Robustness Toolbox
    GARD Project
    Carnegie Mellon University
    Embedded Intelligence
    IBM
    Intel Federal LLC
    Johns Hopkins University
    MIT
    Toyota Technological Institute at Chicago
    Two Six Technologies
    University of Central Florida
    University of Maryland
    University of Wisconsin
    USC Information Sciences Institute
    Google Research
    MITRE

    Credits
    Special thanks to volunteer producer, Nicole Martinelli. Music by Jason Shaw, Audionautix.
    This podcast is sponsored by GitHub, DataStax and Google.
    No sponsor had any right or opportunity to approve or disapprove the content of this podcast.

    Why Debian won't distribute AI models any time soon

    Welcome to a brand new episode of Deep Dive: AI! For today’s conversation, we are joined by Mo Zhou, a PhD student at Johns Hopkins University and an official Debian developer since 2018. Tune in as Mo speaks to the evolving role of artificial intelligence driven by big data and hardware capacity and shares some key insights into what sets AlphaGo apart from previous algorithms, making applications integral, and the necessity of releasing training data along with any free software. You’ll also learn about validation data and the difference powerful hardware makes, as well as why Debian is so strict about their practice of offering free software. Finally, Mo shares his predictions for the free software community (and what he would like to see happen in an ideal world) before sharing his own plans for the future, which include a strong element of research.
    If you’re looking to learn about the uphill climb for open source artificial intelligence, plus so much more, you won’t want to miss this episode! Full transcript. 
    Key points from this episode:

    Background on today’s guest, Mo Zhou: PhD student and Debian developer.
    His recent Machine Learning Policy proposal at Debian.
    Defining artificial intelligence and its evolution, driven by big data and hardware capacity.
    Why the recent advancements in deep learning would be impossible without hardware. 
    Where AlphaGo differs from past algorithms.
    The role of data, training code, and inference code in making an application integral.
    Why you have to release training data with any free software.
    The financial and time expense of classifying images.
    What you need access to in order to modify an existing model.
    The validation data set collected by the research community.
    Predicting the process of retraining.
    What you can gain from powerful hardware.
    Why Debian is so strict in the practice of free software. 
    Problems that occur when big companies charge for their ecosystems.
    What Zhou is expecting from the future of the free software community.
    Which licensing schemes are most popular and why.
    An ideal future for Open Source AI.
    Zhou’s plans for the future and why they include research.

    Links mentioned in today’s episode:

    Mo Zhou on LinkedIn
    Mo Zhou on GitHub
    Mo Zhou
    Johns Hopkins University
    Debian
    Debian Deep Learning Team
    DeepMind
    Apache

    Credits
    Special thanks to volunteer producer, Nicole Martinelli. Music by Jason Shaw, Audionautix.
    This podcast is sponsored by GitHub, DataStax and Google.
    No sponsor had any right or opportunity to approve or disapprove the content of this podcast.

    Creative Restrictions to Curb AI Abuse

    Along with all the positive, revolutionary aspects of AI comes a more sinister side. Joining us today to discuss ethics in AI from the developer’s point of view is David Gray Widder. David is currently doing his Ph.D. at the School of Computer Science at Carnegie Mellon University and is investigating AI from an ethical perspective, homing in specifically on the ethics-related challenges faced by AI software engineers. His research has been conducted at Intel Labs, Microsoft, and NASA’s Jet Propulsion Lab. In this episode, we discuss the harmful uses of deep fakes and their ethical ramifications in proprietary versus open source contexts. Widder breaks down the notions of technological inevitability and technological neutrality, and explains the importance of challenging these ideas. Widder has identified a continuum between implementation-based harms and use-based harms and fills us in on how each is affected in the open source development space.
    Tune in to find out more about the importance of curbing AI abuse and the creativity required to do so, as well as the strengths and weaknesses of open source in terms of AI ethics. Full transcript.
    Key points from this episode:

    Introducing David Gray Widder, a Ph.D. student researching AI ethics.
    Why he chose to focus his research on ethics in AI, and how he drives his research.
    Widder explains deep fakes and gives examples of their uses.
    Sinister uses of deep fakes and the danger thereof.
    The ethical ramifications of deep fake tech in proprietary versus open source contexts.
    The kinds of harms that can be prevented in open source versus proprietary contexts.
    The licensing issues that result in developers relinquishing control (and responsibility) over the uses of their tech.
    Why Widder is critical of the notions of both technological inevitability and neutrality.
    Why it’s important to challenge the idea of technological neutrality.
    The potential to build restrictions, even within the dictates of open source.
    The continuum between implementation-based harms and use-based harms.
    How open source allows for increased scrutiny of implementation harms, but decreased accountability for use-based harms.
    The insight Widder gleaned from observing NASA’s use of AI, pertaining to the deep fake case.
    Widder voices his legal concerns around Copilot.
    The difference between laws and norms.
    How we’ve been unsuspectingly providing data by uploading photos online.
    Why it’s important to include open source and public sector organizations in the ethical AI conversation.
    Open source strengths and weaknesses in terms of the ethical use of AI.

    Links mentioned in today’s episode:

    David Gray Widder
    David Gray Widder on Twitter
    Limits and Possibilities of “Ethical AI” in Open Source: A Study of Deep Fakes
    What is Deepfake
    Copilot

    Credits
    Special thanks to volunteer producer, Nicole Martinelli. Music by Jason Shaw, Audionautix.
    This podcast is sponsored by GitHub, DataStax and Google.
    No sponsor had any right or opportunity to approve or disapprove the content of this podcast.

    When hackers take on AI: Sci-fi – or the future?

    Because we lack a fundamental understanding of the internal mechanisms of current AI models, today’s guest has a few theories about what these models might do when they encounter situations outside of their training data, with potentially catastrophic results. Tuning in, you’ll hear from Connor Leahy, who is one of the founders of EleutherAI, a grassroots collective of researchers working to open source AI research. He’s also Founder and CEO of Conjecture, a startup that is doing some fascinating research into the interpretability and safety of AI. We talk more about this in today’s episode, with Leahy elaborating on some of the technical problems that he and other researchers are running into and the creativity that will be required to solve them. We also take a look at some of the nefarious ways that he sees AI evolving in the future and how he believes computer security hackers could contribute to mitigating these risks without curbing technological progress. We close on an optimistic note, with Leahy encouraging young career researchers to focus on the ‘massive orchard’ of low-hanging fruit in interpretability and AI safety and sharing his vision for this extremely valuable field of research.
    To learn more, make sure not to miss this fascinating conversation with EleutherAI Founder, Connor Leahy! Full transcript. 
    Key Points From This Episode:

    The true story of how EleutherAI started as a hobby project during the pandemic.
    Why Leahy believes that it’s critical that we understand AI technology.
    The importance of making AI more accessible to those who can do valuable research.
    What goes into building a large model like this: data, engineering, and computing.
    Leahy offers some insight into the truly monumental volume of data required to train these models and where it is sourced from.
    A look at Leahy’s (very specific) perspective on making EleutherAI’s models public.
    Potential consequences of releasing these models; will they be used for good or evil?
    Some of the nefarious ways in which Leahy sees AI technology evolving in the future.
    Mitigating the risks that AI poses; how we can prevent these systems from spinning out of control without curbing progress.
    Focusing on solvable technical problems to build systems with embedded safeguards.
    Why Leahy wishes more computer security hackers would work on AI problems.
    Low-hanging fruit in interpretability and AI safety for young career researchers.
    Why Leahy is optimistic about understanding these problems better going forward.
    The creativity required to come up with new ways of thinking about these problems.
    In closing, Leahy encourages listeners to take a shot at linear algebra, interpretability, and understanding neural networks.

    Links Mentioned in Today’s Episode:

    Connor Leahy on LinkedIn
    Connor Leahy on Twitter
    Connor Leahy on GitHub
    EleutherAI
    Conjecture
    Microsoft DeepSpeed Library
    NVIDIA Megatron
    Facebook Fully Sharded Data Parallel (FSDP) Library
    Fairseq
    Common Crawl
    The Eye
    arXiv
    David Bau Lab
    ‘Locating and Editing Factual Associations in GPT’

    Credits
    Special thanks to volunteer producer, Nicole Martinelli. Music by Jason Shaw, Audionautix.
    This podcast is sponsored by GitHub, DataStax and Google.
    No sponsor had any right or opportunity to approve or disapprove the content of this podcast.

    Solving for AI’s Black Box Problem

    The mystery that surrounds the possibilities and probabilities of AI is multilayered, and depending on your perspective and involvement with new technology, your access to reliable information and a clear picture of current progress could be obscured in several ways. On the podcast today, we welcome Alek Tarkowski, the Strategy Director of Open Future Foundation, to talk about some of the ways we can tackle issues of security, safety, privacy, and basic human rights. Tarkowski is a sociologist, an activist, and a strategist, and his engagement and insight into the current landscape are extremely helpful in understanding these complex issues and murky waters. In our chat, we get to unpack some foundational updates about what is currently going on in the space, regulations that have been deployed recently, and how activists and the industry can find themselves at odds when debating policy. Tarkowski makes a clear plea for all parties to get involved in these debates and stay involved in this powerful avenue for the molding of our future.
    To hear it all from Tarkowski on this central aspect of the future of AI, be sure to join us! Full transcript.
    Key Points From This Episode:

    Real life and artificial intelligence; Tarkowski comments on the effects we are seeing right now. 
    Current conversations about the regulation of automated decision-making. 
    Looking at the recent AI Act published by the European Union and its goals. 
    The three categories for regulation: impact assessment, transparency, and human oversight.
    The two strongest forces in the regulation debate: the industry and activists. 
    Some examples of past regulations that have impacted emergent technologies.   
    New tools for the same goals, and the task of keeping the technology in the right hands. 
    Unpacking the side of regulation that is dealing with data.
    Tarkowski talks about the right to data mining and the rules that have been adopted so far.
    Comments about the role of the industry; engaging in policy debates. 
    What a healthy and open AI landscape would look like to Tarkowski.
    The world-building power of policies and the vital importance of debate in the process of their creation.

    Links mentioned in today’s episode:

    Alek Tarkowski
    Alek Tarkowski on Twitter
    Panopticon Foundation
    AlgorithmWatch
    Automating Society study
    The AI Act
    The Digital Markets Act
    The Digital Services Act
    The EU Copyright Directive

    Credits
    Special thanks to volunteer producer, Nicole Martinelli. Music by Jason Shaw, Audionautix.
    This podcast is sponsored by GitHub, DataStax and Google.
    No sponsor had any right or opportunity to approve or disapprove the content of this podcast.

    Copyright, selfie monkeys, the hand of God

    What are the copyright implications for AI? Can artwork created by a machine be registered for copyright? These are some of the questions we answer in this episode of Deep Dive: AI, a series from the Open Source Initiative that explores how Artificial Intelligence impacts the world around us. Here to help us unravel the complexities of today’s topic is Pamela Chestek, an Open Source lawyer, Chair of the OSI License Committee, and OSI Board member.
    She is an accomplished business attorney with vast experience in free and open source software, trademark law, and copyright law, as well as in advertising, marketing, licensing, and commercial contracting. Pamela is also the author of various scholarly articles and writes a blog focused on analyzing existing intellectual property case law. She is a respected authority on the subject and has given talks concerning Open Source software, copyright, and trademark matters.
    In today’s conversation, we learn the basics of copyright law and delve into its complexities regarding open source material. We also talk about the line between human and machine creations, whether machine learning software can be registered for copyright, how companies monetize Open Source software, the concern of copyright infringement for machine learning datasets, and why understanding copyright is essential for businesses. We also learn about some amazing AI technology that is causing a stir in the design world and hear some real-world examples of copyright law in the technology space.
    Tune in today to get insider knowledge with expert Pamela Chestek! Full transcript.
    Key Points From This Episode:

    Introduction and a brief background about today’s guest, Pamela Chestek.
    Complexities regarding copyright for materials created by machines.
    Interesting examples of copyright rejection for non-human created materials.
    An outline of the standards required to register material for copyright.
    Hear a statement made by the US Copyright Office in 1966 that is still used as a standard today.
    The fine line between what a human being is doing versus what the machine is doing.
    Learn about some remarkable technology creating beautiful artwork.
    Pamela explains the complexities of copyright for art created by software or machines.
    We find out if machine learning software like SpamAssassin can be registered for copyright.
    Why hard work, time, and resources alone do not meet copyright requirements.
    A discussion around the complexities of copyright concerning Open Source software.
    Pamela untangles the nuance of copyright when using datasets for machine learning.
    Common issues experienced by her clients who use machine learning.
    Whether AI will be a force to drive positive or negative change in the future.
    A rundown of some real-world applications of AI.
    Why understanding copyright law is essential to a company’s business model.
    How companies make money by creating Open Source software.
    The move by big social media companies to make their algorithms Open Source.
    A final takeaway message that Pamela has for listeners.

    Links Mentioned in Today’s Episode:

    Pamela Chestek on LinkedIn
    Pamela Chestek on Twitter
    Chestek Legal
    Pamela Chestek: Property intangible Blog
    Debian
    DALL·E
    Hacker News
    SpamAssassin
    European Pirate Party
    Open Source Definition
    Free Software Foundation
    Red Hat
    Jason Shaw, Audionautix

    Credits
    Special thanks to the volunteer producer, Nicole Martinelli.
