Latent Space: The AI Engineer Podcast

Debugging the Internet with AI agents – with Itamar Friedman of Codium AI and AutoGPT

We are hosting the AI World’s Fair in San Francisco on June 8th! You can RSVP here. Come meet fellow builders, see amazing AI tech showcases at different booths around the venue, all mixed with elements of traditional fairs: live music, drinks, games, and food! We are also at Amplitude’s AI x Product Hackathon and are hosting our first joint Latent Space + Practical AI Podcast Listener Meetup next month!

We are honored by the rave reviews for our last episode with MosaicML! They are also welcome on Apple Podcasts and Twitter/HN/LinkedIn/Mastodon etc!

We recently spent a wonderful week with Itamar Friedman, visiting all the way from Tel Aviv in Israel:

* We first recorded a podcast (releasing with this newsletter) covering Codium AI, the hot new VS Code/JetBrains IDE extension focused on test generation for Python and JS/TS, with plans for a Code Integrity Agent.

* Then we attended Agent Weekend, where the founders of multiple AI/agent projects got together, with a presentation from Toran Bruce Richards on Auto-GPT’s roadmap and then one from Itamar on Codium’s roadmap.

* Then some of us stayed to take part in the NextGen Hackathon and won first place with the new AI Maintainer project.

So… that makes it really hard to recap everything for you. But we’ll try!

Podcast: Codium: Code Integrity with Zero Bugs

When it launched in 2021, there was a lot of skepticism around GitHub Copilot.

Fast forward to 2023, and 40% of all code is checked in unmodified from Copilot.

Codium burst on the scene this year, emerging from stealth with an $11m seed, their own foundation model (TestGPT-1) and a vision to revolutionize coding by 2025.

You might have heard of "DRY” programming (Don’t Repeat Yourself), which aims to replace repetition with abstraction. Itamar came on the pod to discuss their “extreme DRY” vision: if you already spent time writing a spec, why repeat yourself by writing the code for it? If the spec is thorough enough, automated agents could write the whole thing for you.

Live Demo Video Section

This is referenced in the podcast about 6 minutes in.

Timestamps, show notes, and transcript are below the fold. We would really appreciate if you shared our pod with friends on Twitter, LinkedIn, Mastodon, Bluesky, or your social media poison of choice!

Auto-GPT: A Roadmap To The Future of Work

Making his first public appearance, Toran (perhaps better known as @SigGravitas on GitHub) presented at Agents Weekend:

Lightly edited notes for those who want a summary of the talk:

* What is AutoGPT?

AutoGPT is an AI agent that utilizes a Large Language Model to drive its actions and decisions. It can be best described as a user sitting at a computer, planning and interacting with the system based on its goals. Unlike traditional LLM applications, AutoGPT does not require repeated prompting by a human. Instead, it generates its own 'thoughts', criticizes its own strategy and decides what next actions to take.

* AutoGPT was released on GitHub in March 2023, and went viral on April 1 with a video showing automatic code generation. Two months later it has 132k+ stars, is the 29th highest ranked open-source project of all time, and has a thriving community of 37.5k+ Discord members and 1M+ downloads.

* What’s next for AutoGPT? The initial release required users to know how to build and run a codebase. They recently announced plans for a web/desktop UI and mobile app to enable nontechnical/everyday users to use AutoGPT. They are also working on an extensible plugin ecosystem called the Abilities Hub also targeted at nontechnical users.

* Improving Efficacy. AutoGPT has many well documented cases where it trips up: getting stuck in loops, using commands, and making obvious mistakes like execute_code("write a cookbook"). The plan is a new design called Challenge Driven Development. Challenges are goal-orientated tasks or problems that Auto-GPT has difficulty solving or has not yet been able to accomplish. These may include improving specific functionalities, enhancing the model's understanding of specific domains, or even developing new features that the current version of Auto-GPT lacks. (AI Maintainer was born out of one such challenge). Itamar compared this with Software 1.0 (Test Driven Development), and Software 2.0 (Dataset Driven Development).

* Self-Improvement. Auto-GPT will analyze its own codebase and contribute to its own improvement. AI Safety (aka not-kill-everyone-ists) people like Connor Leahy might freak out at this, but for what it’s worth we were pleasantly surprised to learn that Itamar and many other folks on the Auto-GPT team are equally concerned and mindful about x-risk as well.

The overwhelming theme of Auto-GPT’s roadmap was accessibility - making AI Agents usable by all instead of the few.
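
For readers who want to see the shape of the loop described in the notes above (the agent produces a thought, criticizes its own plan, then picks a command without a human re-prompting it), here is a minimal sketch. It is not Auto-GPT's actual code: `Step`, `scripted_planner`, `run_agent`, and the `COMMANDS` table are all hypothetical names made up for illustration, and the "planner" is a scripted stand-in where a real agent would call an LLM. The `max_steps` cap is one simple, assumed guard against the "stuck in loops" failure mode mentioned in the talk.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    """One planning turn: what the agent thinks, its self-critique, and its chosen command."""
    thought: str
    criticism: str
    command: str
    argument: str


# Hypothetical stand-in for an LLM call: maps (goal, history so far) to the next Step.
Planner = Callable[[str, list], Step]


def scripted_planner(goal: str, history: list[str]) -> Step:
    """Toy planner, not a real model: search first, write a file second, then finish."""
    if not history:
        return Step("I need information first.", "Avoid searching twice.", "search", goal)
    if len(history) == 1:
        return Step("I have results; save a summary.", "Keep it short.", "write_file", "summary.txt")
    return Step("The goal is met.", "Nothing left to do.", "finish", "")


def run_agent(goal: str, planner: Planner, commands: dict, max_steps: int = 10) -> list[str]:
    """Repeatedly ask the planner for a step and execute it, with no human in the loop."""
    history: list[str] = []
    for _ in range(max_steps):  # hard cap so a confused planner cannot loop forever
        step = planner(goal, history)
        if step.command == "finish":
            break
        handler = commands.get(step.command)
        if handler is None:
            # Feed the error back as an observation instead of crashing.
            history.append(f"error: unknown command {step.command!r}")
            continue
        history.append(handler(step.argument))
    return history


# Hypothetical command table; real agents would browse, run code, touch files, etc.
COMMANDS = {
    "search": lambda query: f"searched for {query!r}",
    "write_file": lambda name: f"wrote {name}",
}

if __name__ == "__main__":
    for observation in run_agent("write a cookbook", scripted_planner, COMMANDS):
        print(observation)
```

The design choice worth noticing is that every result, including errors, goes back into `history`, which is the only thing the planner sees on the next turn. That feedback path is what separates an agent loop from a single prompt and response.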

Podcast Timestamps

* [00:00:00] Introductions

* [00:01:30] Itamar’s background and previous startups

* [00:03:30] Vision for Codium AI: reaching “zero bugs”

* [00:06:00] Demo of Codium AI and how it works

* [00:15:30] Building on VS Code vs JetBrains

* [00:22:30] Future of software development and the role of developers

* [00:27:00] The vision of integrating natural language, testing, and code

* [00:30:00] Benchmarking AI models and choosing the right models for different tasks

* [00:39:00] Codium AI spec generation and editing

* [00:43:30] Reconciling differences in languages between specs, tests, and code

* [00:52:30] The Israeli tech scene and startup culture

* [01:03:00] Lightning Round

Show Notes

* Codium AI

* Visualead

* AutoGPT

* StarCoder

* TDD (Test-Driven Development)

* AST (Abstract Syntax Tree)

* LangChain

* ICON

* AI21

Transcript

Alessio: [00:00:00] Hey everyone. Welcome to the Latent Space podcast. This is Alessio, Partner and CTO-in-Residence at Decibel Partners. I'm joined by my co-host, Swyx, writer and editor of Latent Space.

Swyx: Today we have a special guest, Itamar Friedman, all the way from Tel Aviv, CEO and co-founder of Codium AI. Welcome.

Itamar: Hey, great being here. Thank you for inviting me.

Swyx: You like the studio? It's nice, right?

Itamar: Yeah, they're awesome.

Swyx: So I'm gonna introduce your background a little bit and then we'll learn a bit more about who you are. So you graduated from Technion, the Israel Institute of Technology, which is kind of like the MIT of Israel. You did a BS in CS, and then you also did a Master's in Computer Vision, which is kind of relevant.

You had other startups before this, but your sort of claim to fame is Visualead, which you started in 2011 and got acquired by Alibaba Group. You showed me your website, which is the sort of QR codes with different forms of visibility. And in China that's a huge, huge deal. It's starting to become a bigger deal in the West. My favorite anecdote that you told me was something about how much in sales you saved or something. I forget what the number was.

Itamar: Generally speaking, there's a lot of peer-to-peer transactions going on, like payments, in China with QR codes. So basically if, for example, 5% of the scanning does not work and with our scanner we [00:01:30] reduce it to 4%, that's a lot of money. Could be tens of millions of dollars a day.

Swyx: And at the scale of Alibaba, it serves all of China. It's crazy. You did that for seven years and you were at Alibaba until 2021, when you took some time off and then hooked up with Dedy, who you've known for 25 years, to start Codium AI and you just raised you