Interconnects

Nathan Lambert

Technology
Updated weekly

Audio essays about the latest developments in AI and interviews with leading scientists in the field. Breaking the hype, understanding what's under the hood, and telling stories. www.interconnects.ai

1 day ago

Farewell Ai2

I’m departing the Allen Institute for AI (Ai2), where I got the great privilege to work on the Olmo models, to grow, to learn, and to have broad lasting impacts. This post is an attempt to reflect on why what we did was influential, despite obviously being far from the frontier in performance (even when within size buckets), and how this reflects on various paths to impact in AI today. To start, I shared the following note with the company yesterday: Dear Ai2, As many of you know, today is my last day working at Ai2. I joined Ai2 largely as an accident. I met Luca at ICML 2023 in Hawaii and realized I could level up my open post-training work dramatically if I got the chance to join. When I got an offer it was an absolute no-brainer; it was such a welcoming and exciting environment. It has been a wonderful ride that has transformed my life, and I couldn’t be prouder of the work we did together. Ai2 has a wonderful scientific culture at its core and I’m excited to see this continue. I feel very lucky to have been here and that I personally have benefited massively from everyone who has worked so hard to cultivate that culture and environment. It is and has been a team effort. This includes all the people whose longest interactions with me were brief chats at the coffee machine. I drew so much energy and excitement from all the different ways people at Ai2 showed up for the mission. I’ve already thanked much of the OE team directly, but I wanted to thank everyone else that went into this. Legal, IT, Comms, and the Office team all do a great job enabling and leveling up our research work. It’s often work that is forgotten, outside of the limelight, or remembered at the last minute, but it all has been crucial to achieving our goals. I’m excited to keep visiting the wonderful Northlake space in the coming years. Even though I’m leaving, I’m more excited than ever about Ai2’s mission. 
Ai2 operates in such a rare niche between academia and industry, where we can explore and influence the most important technology of our lifetime. Doing this openly is the best way to ensure the technology diffuses safely to everyone who may benefit. Ai2 needs to stay as ambitious as possible, trying to influence the cutting edge of AI and the biggest issues of the field. Do not shy away from these challenges – AI needs independent voices as it only becomes more geopolitical, socially disruptive, and central to the economy. I will still be working in this space, working to make the open ecosystem better coordinated and more useful. So as I go off to try something new, don’t be strangers. I’ll always be reachable at nathan@natolambert.com and will still live in Seattle for most of the year. Nathan I have loved and will still love Ai2. Ai2 has a deep culture of caring about the research process, the outputs that get shared, and most importantly the people who do the work. This is why the institution creates countless wonderful people that go and spread the gospel throughout the research community. This core culture will remain through the rebuild, and there are plenty of resources to do impactful research across the spectrum of AI. In the last two years of my time at Ai2 I’ve done so much meaningful work. Of course Olmo is at the top and has been my priority, but making time for consistent practice here on Interconnects, weekend cram sessions for ATOM, and also the fun RLHF book make for a list that makes me wonder how I did it all. I was obviously obsessed with work, but not in a way that made me lose sleep or lose my overall wellness. It was the right long-term approach. This impressive list is one where I was ruthless in saying no to things that didn’t matter and got all my work out to see the light of day. I had no medium-sized projects that didn’t succeed in the last few years. It makes me wonder if I wasn’t taking enough risk. 
It shows you can truly do so much with your time, and it’s actually harder to find the right problems and environment to do it. Many people are in environments where their work never becomes public or they’re forced to change topics consistently.

From zero to hero

To start, I’d like to do a short recap on my path to Ai2 to show that Ai2 was just as much a growth story for me as an execution story. I studied electrical engineering in undergrad, focusing on linear systems math and microelectronics. I was admitted to the UC Berkeley EECS Ph.D. program to study microelectromechanical systems (MEMS). I showed up at Berkeley in August of 2017 and realized AI was obviously the thing I should be doing. I asked the likes of Sergey Levine or Pieter Abbeel if they could advise me – they said no. I threw all my energy into learning what I could about AI. I got a break to get advised by one of Sergey’s post-docs in 2018 or 2019. I went all in on that, I fought for funding, I fought to have an AI paper. This process worked out by the end of my Ph.D. in 2022: I had access to the Berkeley AI Research (BAIR) building and collaborations in the department. It was a bumpy road. I wanted to go to industry research, to get a nice paying job with intellectual freedom, something like FAIR or Google Brain at the time. HuggingFace was the only job that fit that bill, so it was easy to say yes to. I joined HuggingFace in May of 2022 and wasted my time at the company until ChatGPT was released. I used my RL background to write a blog post on RLHF, which went viral. HuggingFace decided it would be good for me to form a team around this success. In 2023 I learned NLP and about language models. I had a lot of fun and built an initial community. I got burned out by working remotely with a huge time difference. I met Luca Soldaini at ICML in Hawaii, where I was giving a tutorial on RLHF, and they told me Ai2 was hiring. 
I got the job at Ai2 largely because of my excitement and how I was saying I wanted to do a lot of stuff that sounded cool to them but no one was likely to do (RL related things). My interviews were far from a sure thing – this is a great job to land! I started at Ai2 in October of 2023. I worked remotely for a while. I was doing normal research, I made the first reward model evaluation, RewardBench. It was a solid success, but nothing like how the pretraining team was getting ready to release the first Olmo. I helped coach Ai2 on how to release models well, helping the Tülu 2 project land (the first model to do DPO well, publicly at the 70B scale). The first Olmo was released in early 2024, I squeaked onto the papers just by trying to be helpful and doing some basic post-training. I was already good at paying attention to which projects are actually important. That summer I started rounding everyone up to do a “big frontier post-training project.” This became Tülu 3, one of my favorite projects ever released, in fall of 2024. The goal was to beat Llama 3’s post-training with their own base model. The team morale was incredibly high and the execution was so timely, allowing us to coin the term Reinforcement Learning with Verifiable Rewards (RLVR) in the paper. The crazy lengths I went to get the Tülu 3 and Olmo 2 post-training done had me sending 40% more slack messages than anyone at the company and got me the award “The Cat Herder.” 2025 was a much simpler year. We were too slow to react to reasoning models, given we had been doing similar stuff with Tülu 3, but sometimes that happens. Originally we wanted to release Olmo 3 by June or July of 2025. That obviously didn’t happen, but we got the slim chance to train a bigger model, and it really landed. We threaded the needle. Since Olmo 3 was released, it was clear that some changes were coming and I personally never got a big post-training project off the ground after that. 
Many other people managed great work in the spring of 2026. This all leaves me here today showing you that only about half of my story at Ai2 is what I was known widely for, and the rest was building momentum. It often takes a year of building relationships and direction before really big successes can happen in a career. I was just about a nobody when I joined Ai2 and I got to join a team that was willing to learn from the skills I had brought from HuggingFace. With how media works, I often think I get more recognition than I deserve for Ai2’s success. The likes of Tülu 3, Olmo 2, and Olmo 3 felt like generational team efforts. The amount of personal successes and breakthroughs that happened for those projects is immense – and to sustain them over such a long time period is incredibly hard to replicate. The sum far exceeded the individual parts. I’ve heard many times in the last few months how people wouldn’t know about Ai2 if it wasn’t for my writing. Statements like this are overblown, but they are partially true and reiterate how crucial building relationships and getting the word out is today. When you write a plan that is feasible, the world bends towards that plan. When you convince people it’s going to happen it only becomes more likely. Vision and compelling explanations are one of the items in shortest supply in the tech industry. Often building the thing is easy and explaining it is hard. If no one knows about your work, the value is often close to 0. So much of building reputation is about building relationships with people who will receive your work. Reflecting on all of this, I’ve had a shockingly linear path through my career to incremental success. I would expect the first 10 years of most careers to be in search of finding one opportunity as good as Ai2, and you will not always be able to seize it. There are some ways to create more opportunities. 
I’ve discussed before how a large part of my rise is down to many more senior and more established scientists being drawn into the closed ecosystems at the same time as an immense swell in interest for AI. This created a power vacuum that I, and a few other prominent

16 min
2 days ago

Open and closed models are on different exponentials

The largest debate that’ll define the future balance of power between the open and closed AI model ecosystems is primarily economic: whether users of AI will continue to pay dramatically more, i.e. large margins, for the top closed models. Early 2026 is a seminal time for the AI industry, as the coding agents have shown the first area where a huge AI market will continue to pay a substantial premium for better intelligence. The other side of this dichotomy is the inevitable decay of API businesses at these same labs. These labs will realize they need to protect their best models, rolling them out later in APIs to protect token supply, avoid distillation, and stick to use-cases with higher margins. All of these effects will be clearly visible in 5-10 year timelines, as in the near term, markets, prices, margins, and demand will be dictated by a rapid buildout of compute (supply-limited in the near term) and mass subsidization of tokens (through continued investment in new AI companies). The core of this argument rests in the obvious habit changes that are setting in with coding agents past the Opus 4.5 and Codex 5.2 thresholds. People are not making this switch because they are lazy, but because their net output is obviously higher when using an agent as an implementation aid for complex knowledge work. People who rely on coding agents to work will always pay more for the best rather than settle for good enough. There are so many ways to make the product better: speed, intelligence, specialized models, etc. I would pay $2000/month for the tools today, especially knowing they’ll get much better. At the same time, it is likely that many companies are forcing agents and usage onto people that actually will get very little out of them in their current form, which helps the AI buildout (or bubble) continue. 
The best closed labs — right now this list is just Anthropic and OpenAI, but it’s reasonable to expect Google to catch up — will always make the most efficient models for intelligence at a given cost. Building models is a mass capital investment of talent, data, and compute. These systems, a combination of model weights, harnesses, tools, and serving infrastructure have massive returns on integration (where open models are designed to work across many, diverse serving situations). These integration benefits — the integration of hardware and new forms of software — can be expressed in any possible way of making models better. The models in the near future may saturate on benchmark scores, but if that intelligence ceiling really is a cap on utility then the labs will optimize utility per second or per watt, serving users in another way. Improving the models is possible in every direction — there have been no walls in progress. We’re early in the mass buildout of intelligence, which involves harnessing the physical world to build numerous datacenters, organizing many AI researchers so that a large team can contribute to one model, and of course solving many small, low-level puzzles that unlock performance. Every indication is that there is still meaningful performance to be unlocked and the closed labs are the best set up to extract it. The collective wisdom of the labs is that making the models smarter, in terms of the frontier of absolute intelligence, has the most value. This is the right call to me because it unlocks large new markets. Optimizing models at a fixed intelligence level locks in markets, expands accessibility over time, and increases return on investment for users (while potentially lowering margins for selling intelligence). Many people are making this bet that models will keep getting better and are learning to work well in these harnesses, even though some workflows are still a bit clunky. This is the right bet. 
These people all will continue to use the absolute best models available. It’s like buying an iPhone as a consumer. You could get an Android and suffer from a bunch of paper cuts to save money, but why would you? The returns to performance are even higher in the workplace, which drives pricing power. In this mental model, the frontier labs, as businesses, will look like new, reimagined forms of a mix of Apple and Microsoft. The Apple side is that they’re selling an integrated, extremely hard to replicate technology. The Microsoft side is selling high-leverage subscriptions across the economy. In 5-10 years I expect both OpenAI and Anthropic to be valued in the $2-10T range. The true frontier labs will be an oligopoly that looks like the cloud market today. Interconnects AI is a reader-supported publication. Consider becoming a subscriber. On the other side of this equation is the open model economy. This isn’t to say that the frontier labs will dominate all aspects of AI use. Yes, I expect OpenAI and Anthropic to be the most representative companies of the AI boom (new companies, alongside Nvidia of course), but the collective value capture around open models will be far bigger overall; it’s just that the revenue and margins will be shared across a wide stack of companies. Many businesses want to switch to open models but the models today are not good enough in out-of-distribution tasks. Eventually open model builders will stop chasing Claude and GPT on the Artificial Analysis index and fill this niche. This fork could be driven by economic factors, where they no longer have the revenue to support the growing R&D costs for continuing to scale models. It can also be driven by pure demand, where certain AI solutions can only exist at the low price points present in open models. Where closed labs are an oligopoly, open model builders and users will be far more diverse and numerous. 
The total market value will dramatically exceed the cumulative value of OpenAI and Anthropic. Open models are by their nature not integrated, so they will rely on multiple companies coordinating to serve them. Each of these layers will have alternatives, driving prices down to commodity pricing. These low, predictable prices will be where many enterprises enter to build in-house agents and tools for niche tasks. The predominant mode of deployment here is that enterprises find a model that hits a sufficient performance threshold on a task of interest and do not replace the model later (setup costs are high). As customizing models becomes easier, again in the open model finetuning stack we are seeing emerge (Tinker, Fireworks, Prime Intellect, etc.), this market becomes even bigger. What this will look like in the coming years is a steady rise in open model inference proportion across the entrenched hyper-scale clouds of Google, Amazon, Microsoft and new AI infrastructure companies of Together, Fireworks, OpenRouter, etc. when compared to OpenAI and Anthropic. The key is that the open and closed model economies are operating on different exponentials. I still believe that progress will continue at a fast pace across the entire ecosystem, but claims of recursive self improvement (RSI) giving the closed labs an unassailable advantage are overblown. New forms of products like background agents can support both these open and closed models. The closed models hit incredible product-market fit with the current agents, starting their integrated exponential by monetizing the top end of the knowledge work. The open model economy will take far longer, but it will also be far more satisfying to follow, as it tracks the broader diffusion of AI into the entire economy and world. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

7 min
26 May

Some ideas for what comes next, May 2026

As the years of AI progress have gone by, they’ve been accompanied by a slowly rising tide of consequence. Models are getting more capable, how we work is changing quickly, the economics of AI are becoming real, just as real-world risks come to the forefront. 2026 is the first year where I don’t think there’ll be any breaks from this. The hard part to prepare for is that there’s a good chance things just continue to ratchet up from here – more disruption, more surprises, more stakes. On my end, there’s been a growing list of topics that are very fateful to how I see the current state of AI, but I haven’t even gotten to write about them (at least not from all the angles I want to)! All of these are closely related to the implications of different models reaching new capability levels and how I use that to infer what may come next.

1. Open models haven’t had their true agent moment like Opus 4.5

The time gap between open and closed models is very often discussed, but the reality is that we have a nice time-gating that’s independent of debatable benchmarks – whether open-weight models become super useful in agentic harnesses. The Opus 4.5 in Claude Code moment of December 2025 was so loud and obvious that if open models hit this performance level for price points as low as $5/month, there will be an explosion in usage. Right now we are about 5-6 months in with no equivalent open model. I suspect the robustness of the best closed frontier models that I write about could make this moment take a good amount longer, say closer to 12+ months. In this time, Claude Code and Codex may seem like different categories of products. In the standard flurry of new, state-of-the-art open models from a variety of labs, benchmarks will definitely keep climbing, but the open-closed gap should become more interpretable as real-world use becomes the real litmus test.

2. Gemini still doesn’t have a meaningful competitor for Claude Code and Codex

The best exclamation point I can offer to reinforce my prediction that open models are further behind than the benchmarks claim is that even the mighty Google doesn’t have a clear competitor for Claude Code and Codex. I’m sure the Gemini team is pushing very hard on this. I still need to do a lot more testing on Gemini 3.5 Flash, but reading reviews makes it clear that it’s not a substitute for how I’m working today. It’s maybe not the Gemini team explicitly specializing for Google’s existing products (search, YouTube, etc.), but the model seems to suit them. If Google doesn’t have a powerful tool here soon, I don’t expect the open model labs to either. The open models are going to be used more for automated, enterprise agents and low-cost domains, rather than being the driving tool of modern knowledge work. This will feed directly into the economic engine of funding future models, where the agents like Claude Code and Codex are the current best path to massive AI revenue growth. I discussed, on AI Proem with Grace Shao, how the current environment is quietly driving labs in China to specialize, and this is central to my expectations of open models specializing over the next few years instead of competing with OpenAI, Anthropic, and Google.

3. I don’t expect an open-weights Mythos this year

While I don’t think Mythos is a general “god model” that will crush the competition in every domain, I do think it’s a remarkable technical achievement in software engineering and cybersecurity. Mythos is obviously a watershed moment for those fields. 
Having spoken to most of the Chinese labs – particularly those with the most prominent, large, open MoE models like Kimi, Z.ai, DeepSeek, and Qwen – I think they’re heavily resource limited and don’t have an immediate path to scaling up training processes like the big labs in the U.S. For the labs which are more corporate, which comes with more resources, such as Alibaba and Bytedance, they also have more conservative stances on safety and security. Mythos is a bellwether of the massive acceleration in training and research compute available to the largest American companies. Epoch AI recently had a nice piece on the compute available to various labs (~Google 25%, Meta 11%, OpenAI 11%, Anthropic 6%). All of these numbers are vastly higher than any Chinese lab.

4. American open models are slowly gaining steam

Nvidia with Nemotron, Google with Gemma, Arcee AI and others are slowly stabilizing the open model ecosystem in the U.S. There’s a lot that’s hard to measure here, especially in the rise of local agents like OpenClaw and Hermes, but there are adoption numbers of American models that we haven’t seen since Llama 3. Gemma 4’s models are all tying or outperforming the equivalently sized Qwen 3.5/3.6 models, where Qwen has for years now been the default open model at these sizes. These Qwen 3.5/3.6 models have been tricky to get working in a lot of post-training research, partially due to architecture/tooling and partially likely due to modeling (i.e. the model is not easy to finetune for some training decision). I’ve heard few complaints about Gemma, but it also could be because Gemma is not yet the researcher default. 
There's a simple reality that we've seen recently with models like GPT-OSS, Nemotron 3, and now Gemma 4, that if a model is in the right range of benchmarks and released by an American lab with a truly permissive license, it'll get a large amount of adoption (in this cycle, recall that Gemma 4 adopted the Apache 2.0 License, changing from one with use-case restrictions on earlier Gemmas). This early phase of American growth in open models is establishing key brands directly with developers. The consensus is that more neolabs like Reflection and Thinking Machines are likely to participate in this space, but being too patient will lose the time when new agentic workflows and enterprise relationships are built. 5. Anthropic and OpenAI are just getting up to speed in model iterations I expect the rest of this year to be a ruthless competition between these two flagship companies. I’m at an interesting balance where I think GPT 5.5 is a bit smarter of a model and I love the Codex App, so I’m structuring much of my work to be possible there. At the same time, for a lot of writing-related and broader surface area tasks I really still love Claude. These models are rapidly changing how we work, I run Codex from my phone while doing other things, am setting up automated open model analysis jobs on the back of agents, and expect to be able to scale the research side of Interconnects widely. AI is beginning to drive companies to the two extremes in the scaling era. The biggest companies will be way bigger than ever, using resources and mass talent to have sustained progress at the frontier of raw AI capabilities. On the other side, tiny businesses like Interconnects thrive by using agents to refine, present, and sell niche expertise. 
The mass social job displacement that’ll come is going to reduce employability for various knowledge workers that don’t fit into either of these extremes for the raw technical side (big or small companies), while sustaining and maybe even amplifying careers that interface directly with humans (e.g. doctors) or other power structures with means to sustain themselves (law/government). 6. More existing power structures will assert themselves on AI Just in the last few days while writing this, we had the Pope release an over 40,000 word document on where AI is going and China expand personnel movement restrictions on top AI researchers across industry. At the same time, the U.S. has designated Anthropic a supply chain risk and continues to use its models for national security. The list of news like this is only going to grow. Existing power structures are realizing there’s a finite time window for them to exert themselves in the AI dynamic — an intuition that could be mapped to influence going down as AI models get more powerful. This intuition is potentially dangerous, as it sets up meaningful conflict in who controls the technology (as I discussed with Dean Ball after the Anthropic-DoW spat). Next: Where technical becomes social These largely technical and power trends accelerating are going to put more pressure on the social and political anti-AI sentiments within the U.S. This is currently the most obvious barrier to continued AI development and beneficial diffusion. Reflecting on this, many people in the tech discourse get too focused on the details, where yes a lot of data-center-detractors are making genuinely wrong factual claims in defense of their position. The real position that a large swath of Americans has is that they have a voice in saying no to the current trend — by not granting permission to build data centers. 
This is a voice that they haven’t been granted by the tech industry that changed the face of the global economy and power structures in the last few decades. This is setting us up for a challenging year ahead for the industry. The labs are aggregating and concentrating talent to peak levels. There are few neutral messengers to communicate the reality of AI to the public. The frontier labs leadership is largely gearing up to IPO and stay ahead in the capabilities race. With the status quo, there are few actions to unwind this path toward social conflict. It takes individuals in the AI ecosystem to zag and go against the groupthink of needing to make your wealth today, of needing to be at a lab to do impactful work, and so on. I’m personally continuing to bet on this, by trying to make a vibrant and diverse open model ecosystem supported by clear, unbiased information. If you agree with this and have been watching from the sidelines, it’s a good time to get involved, before the situation spirals into something uncontrollable.

10 min
7 May

Notes from inside China's AI labs

Staring out the window on a new, high-speed train from Hangzhou to Shanghai, I’m gifted with views of dramatic ridgelines speckled with wind turbines that are silhouetted against the setting sun. The mountains cast a backdrop to a mix of spanning fields and clustered skyscrapers. I’m returning from China with great humility. It’s a very warming, human experience to go somewhere so foreign and be so welcomed. I had the honor of meeting so many people in the AI ecosystem who I knew from afar, and they greeted me with big smiles and cheer, reminding me how global my work and the AI ecosystem is.

The mentality of Chinese researchers

The Chinese companies building language models are set up as the perfect fast-followers for the technology, building on long-standing cultural traditions in education and work, along with subtly different approaches to building technology companies. When you look at the outputs (the latest, biggest models enabling agentic workflows) and the ingredients (excellent scientists, large-scale data, and accelerated computing), the Chinese and American labs look largely similar. The lasting differences emerge in how these are organized and conditioned. I’ve long thought that a reason that the Chinese labs are so good at catching up and keeping up with the frontier is that they’re culturally aligned for this task, but without talking to people directly I felt like it wasn’t my place to attribute substantial influence to this hunch. Speaking with many wonderful, humble, and open scientists at the leading Chinese labs has crystallized a lot of my beliefs. So much of building the best LLMs today comes down to meticulous work across the entire stack, from data to architecture details and RL algorithm implementations. 
All points of the model can give some improvements, and fitting them in together is a complex process where the work of some brilliant individuals needs to get shelved in favor of the overall model maximizing a multi-objective optimization. Where American researchers are obviously also brilliant at solving the individual components, there’s more of a culture of speaking up for yourself in the U.S. As a scientist, you’re more successful when you speak up for your work and modern culture is pushing the new path to fame of “leading AI scientists”. This results in direct conflict. The Llama organization is heavily rumored to have collapsed under the political weight of these interests embedding themselves in a hierarchical organization. I’ve heard of other labs saying that it can be needed to pay off a top researcher to get them to stop complaining about their idea not making it in the final model. Whether or not that’s exactly true, the idea is clear. Ego and desires for career advancement do get in the way of making the best models. A small, directional shift in this sort of culture between the U.S. and China can have a meaningful impact on the final outputs. Some of this has to do with who is building the models in China. There’s an immediate reality at all of the labs that a large proportion of the core contributors are active students. The labs are quite young, and it reminds me of our setup at Ai2, where students are seen as peers and directly integrated in the LLM team. This is incredibly different from the top labs in the US, where the likes of OpenAI, Anthropic, Cursor, etc. simply don’t offer internships. Other companies like Google nominally have internships related to Gemini, but there’s a lot of concern about whether your internship will be siloed and away from anything real. 
To summarize how the slight change in culture can improve the ability to build models:
* More willingness to do non-flashy work in order to improve the final model,
* People new to building AI can be free of prior phases of AI hype cycles, allowing them to adapt to the new modern techniques faster (in fact, one of the Chinese scientists I talked to really actively attached to this strength),
* Less ego enabling org charts to scale slightly, as there’s less gamifying the system, and
* Abundant talent well-suited to solving problems with a proof of concept elsewhere, etc.
This slight inclination towards skills that complement building today’s language models stands in contrast to a known stereotype that Chinese researchers tend to produce less creative, field-spawning, 0-to-1 academic style research. Among the more academic lab visits on our trip, many leaders talk about cultivating this more ambitious research culture. At the same time, some technical leaders we talked to were skeptical about whether such a rewiring in the approach to science is likely in the near term, because it’ll take a redesign of the education and incentive systems that is too big to happen within the current economic equilibrium. This culture seems to be training students and engineers that are excellent at the LLM building game. They also, of course, have an extremely abundant quantity. These students told me about a similar brain drain happening in China as in the U.S., where many who previously considered academic paths now intend to stay in industry. The funniest quote was from a researcher who was interested in being a professor to be close to the education system, but remarked that education is solved with LLMs – “why would a student talk to me!” The students have a benefit of coming at LLMs with fresh eyes. Over the last few years we’ve seen the key paradigm of LLMs shift from scaling MoEs, to scaling RL, to enabling agents. 
Doing any of these well involves absorbing an insane amount of context quickly, both from the broader literature and the technical stack at your company. Students are used to doing this and excited to humbly drop all presumptions about what should work. They dive in head first and dedicate their life to getting the chance to improve the models. These students are also so magically direct and free of some of the philosophical chatter that can distract scientists. When asked how they feel about the economics or long-term social risks of models, far fewer Chinese researchers have sophisticated opinions and a drive to influence this. Their role is to build the best model. This difference is subtle, and easy to deny, but it is best felt in long conversations with an elegant, brilliant researcher who clearly communicates well in English: basic questions on the more philosophical aspects of AI hang in the air with a simple confusion. It’s a category error to them. One researcher, when probed in these areas, even quoted the famous Dan Wang premise of China being run by engineers, relative to the lawyers of the U.S., to emphasize their desire to build. There’s no track in China that systematically enables the growth of star power for Chinese scientists, akin to mega mainstream podcasts like Dwarkesh or Lex. Trying to get Chinese scientists to comment on the coming economic uncertainty fueled by AI, questions beyond the capabilities of simple AGI, or moral debates on how models should behave all served to capture the upbringing and education of these scientists. They are extremely dedicated to their work, but have grown up in a system where debates and opinions on how society should be structured and changed are not encouraged. Zooming out, Beijing especially felt much like the Bay Area, where a competitive lab is a short walk or Uber away. I got off a flight and stopped by Alibaba’s Beijing campus on the way to the hotel.
Then, in 36 hours we went to all of Z.ai, Moonshot AI, Tsinghua University, Meituan, Xiaomi, and 01.ai. Travel by Didi is easy, and if you select an XL in China you’re often paired with electric mini vans that have massage chairs. We asked the researchers about the talent wars, and they said it’s very similar to what we’re experiencing in the U.S. It’s normal for researchers to bounce around, and much of where people choose to go is based on the best current vibes. In China, the LLM community feels far more like an ecosystem than battling tribes. Across many off the record conversations, it’s nothing but respect for peers. All of the Chinese labs fear ByteDance with their popular Doubao model, which is the only frontier closed lab in China. At the same time, all of the labs have massive respect for DeepSeek as the lab with the best research taste in execution. When you meet with lab members off the record in the States, sparks fly quickly. The most striking part of the humility of Chinese researchers is how they also often shrug on the business side, saying it’s not their problem, where everyone in the U.S. seems to be obsessed with various ecosystem-level industrial trends, from data sellers to compute or fundraising.

Where China’s AI industry differs from (and matches) the Western labs

The thing that makes building an AI model today so interesting is that it’s not just about getting a group of great researchers in one building together to produce an engineering marvel. It used to be this, but to sustain AI businesses, the LLMs are becoming a mix of building, deploying, funding, and getting adoption for this creation. The leading AI companies exist in complex ecosystems that supply money, compute, data and more in order to keep pushing the frontier.
The integration of these various inputs to creating and sustaining LLMs is fairly well conceptualized and mapped for the Western ecosystem, as typified by Anthropic and OpenAI, so finding big differences in how the Chinese labs think about it points at where the different companies can be making meaningfully different bets on the future. Of course, these futures can be heavily dictated by the constraints on funding and/or compute. I’ve documented the biggest “AI Industry” level take-aways from talking to these labs: * Early signs of domestic AI demand. There’s

17 min
4 May

The distillation panic

‘Distillation attacks’ is a horrible term for what is happening right now. Yes, some Chinese labs are hacking or jailbreaking APIs to attempt to extract more signal from model APIs — stopping this is important to maintain the U.S.’s lead in AI capabilities. Referring to this as distillation attack is going to irrevocably associate all distillation with this behavior, and distillation generally is a core technique needed to diffuse AI capabilities broadly through academic and economic activities. We went through this sort of language transition with the open source vs open weight debate. All the terms just reduced to open models – very few people in the large AI community know exactly how open-source differs from open-weights. And terminology matters, as the less informed people who still care about — and influence — the technology are bound by different terms they use. If we’re not careful with the discourse around distillation, many people could associate this broad technique used for research and development of new models as an act at the boundary of corporate manipulation and crime. I’ve recently written a more technical piece on estimating how impactful state-of-the-art distillation methods are on leading Chinese models, and this piece follows to push for caution in any hasty actions to target the methods with policy. To set the stage, recall Anthropic’s recent blog post where they detailed “distillation attacks” made by 3 Chinese labs. These labs used a technique called “distillation,” which involves training a less capable model on the outputs of a stronger one. Distillation is a widely used and legitimate training method. For example, frontier AI labs routinely distill their own models to create smaller, cheaper versions for their customers. 
But distillation can also be used for illicit purposes: competitors can use it to acquire powerful capabilities from other labs in a fraction of the time, and at a fraction of the cost, that it would take to develop them independently. This is a clever paragraph, where they normalize distillation generally and explain how a few people can use it illicitly, without detailing how illicit use often involves other more explicit behavior like jailbreaking, hacking, or identity spoofing of the API. Distillation itself is an industry standard. It’s used extensively, primarily in post-training, by smaller players to create specialized or smaller models. In my book coming this summer, I describe it as follows: The term distillation has been the most powerful form of discussion around the role of synthetic data in language models. Distillation as a term comes from a technical definition of teacher-student knowledge distillation from the deep learning literature. Distillation colloquially refers to using the outputs from a stronger model to train a smaller model. In post-training, this general notion of distillation takes two common forms: * As a data engine to use across wide swaths of the post-training process: Completions for instructions, preference data (or Constitutional AI), or verification for RL. * To transfer specific skills from a stronger model to a weaker model, which is often done for specific skills such as mathematical reasoning or coding. With this definition, it’s easy to see how distillation takes many forms. Of course, if you just take the outputs from GPT-5.5 and train a recent open-weight base model with them to host a competitive product, that’s one thing. But, a lot of the things that fall under the bucket of distillation are complex, multi-stage processes that muddle the exact impact of the model you distilled from. 
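For readers who want the "technical definition" pinned down, here is a minimal sketch of the teacher-student objective from the deep learning literature: the student is trained to match the teacher's temperature-softened output distribution. It is written in plain Python for illustration only. The function names and the temperature of 2.0 are assumptions made here, not taken from the text, and the colloquial form of distillation discussed above (training on text a stronger model generated) needs no access to teacher logits at all.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw logits to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions.

    Hypothetical helper for illustration: a real training setup would
    compute this per token over a vocabulary and usually mix it with a
    standard cross-entropy term (often with a temperature-squared scale).
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# The loss is zero when the student already matches the teacher,
# and positive when their predictions differ.
matched = distillation_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0])
mismatched = distillation_loss([2.0, 1.0, 0.0], [0.0, 1.0, 2.0])
```

The contrast matters for the argument: logit-matching of this kind is only possible with open weights or first-party access, while API-based distillation works purely from sampled outputs.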
A modern LLM pipeline could look like using a GPT API to build an initial batch of synthetic data, which is then used to train a specialized small data-processing model. A good example is a model like olmOCR (or many other models in this category) that is trained to convert PDFs to clean text. This specialized model would be used to create large amounts of data. Finally, you train another model (often from scratch) with the new data you created. Is this final model distilled from GPT? When done via a closed, API-based model, distillation sits in the grey area of the terms of service that you agree to when signing up to the Claude or GPT platform. They generally forbid the use of the API to create competing language model products, but this term has largely gone unenforced. The open-source community used to worry deeply about being cut off from these cutting-edge APIs for doing research or creating public datasets, but to date only one prominent case of corporate accounts being restricted exists (at least until the recent Chinese companies). This is all to say that distillation is an industry standard technique, and the use of closed APIs to perform distillation has always been a grey area. Nvidia’s latest Nemotron models, as one of the only models with open post-training datasets, are technically in large part distilled from Chinese, open-weight models. The Olmo models we’ve built at Ai2 are distilled from a mix of open and closed models. This grey area was brought to the forefront again when it turned out that xAI has been distilling from OpenAI. Quoting from the recent trial proceedings between Elon and OpenAI: OpenAI’s counsel asked Musk whether xAI has ever “distilled” technology from OpenAI. Musk: “Generally AI companies distill other AI companies.” “Is that a yes?” Savitt asked. Musk: “Partly.” xAI is likely the largest and most successful AI company willing to tread the grey area that is distillation from their competitors.
On the other side, the majority of startups and research groups with fewer resources than them have very likely engaged in distillation in some capacity from Claude, GPT, or Gemini models. Interconnects AI is a reader-supported publication. Consider becoming a subscriber. In the above Anthropic blog post, the problem with the distillation attacks by a few Chinese labs is less the distillation and more the means of attack. It is documented that Chinese labs are actively working to get around the intended use of the API, e.g. to extract additional reasoning data that is very useful for training. Of course no one should be able to access information from a model that a developer didn’t intend to reveal in their APIs (e.g., reasoning traces which would be helpful for training). Associating all of distillation, which is to date an industry standard for post-training from open and closed models alike, with these attacks will be a massive own goal. What these few labs are doing should be referred to as jailbreaking or abuse, rather than distillation. The discourse around these actions is creating a troubling discussion that’s marching towards a mix of regulatory capture and regulatory exuberance that’s likely to harm the U.S.’s ecosystem more than China’s. Even if we ban this type of API abuse, most likely through potential legal action and other penalties, the Chinese companies will likely still do it. We’ve seen this playbook with Chinese multimedia models taking a flexible view of copyrighted content that no U.S. player is willing to take the risk on. This distillation discussion has quickly snowballed, with a bill moving out of a committee in Congress, an executive order pushing for action, and congressional oversight targeting U.S. companies building on Chinese models (which are downstream of distillation). This multi-pronged regulatory environment could yield truly horrible outcomes – such as figuring out a way to effectively ban open-weight models in the U.S.
that are built in China by groups abusing closed LLM APIs. It is obvious that no bill will literally ban open models, but they can create grey area that exposes entities to unwanted risk or require certain provisions that are bureaucratically very challenging to fulfill, squashing small open source contributors. In that scenario, the groups who lose are Western academics and smaller companies building models for the long-tail of AI uses. The ecosystem here could be made permanently irrelevant with the removal of nearly all Chinese open-weight models. There is no immediate substitute and building new models with meaningful community adoption has a lead time measured in 6+ months. In the time it takes to build a new domestic open-source ecosystem, countless researchers would’ve moved onto closed training platforms or into new areas. Altogether, I’m hoping this flurry of discussion around distillation becomes a nothing-burger and not a hasty, multi-pronged policy push. We need to avoid two things: * A wholesale negative connotation of the word distillation, which is used extensively across the AI ecosystem. * A domestic ban of the open-weight models built by organizations engaged in some portion of distillation. In addition to this, I want the leading U.S. AI companies to be able to provide their APIs without having their IP leak. They should share more information on why it is hard for them to secure their APIs, but that’s an issue out of scope for my expertise. I’ll conclude with a proposal from my friend Kevin Xu at Interconnected Capital (and great Substack) on why this current distillation dynamic may actually be good for the leading labs. If all the Chinese companies are addicted to distillation as a way of getting close to the frontier, then they’ll never actually learn the techniques needed to take an outright lead. 
If we cut off the Chinese labs’ obvious crutch in model building, we’ll gain a short-term lead in AI, but in the long term that may be what they needed to get on a more competitive trajectory. This is the same debate we’re having with other technologies where the U.S. currently has a lead, e.g. with advanced semiconductor technologies. So I understand the trade-offs, but we should not crack down on all of distillation. This is a public episode. If you'd like

9 min
15 Apr

My bets on open models, mid-2026

We’re living through the period of time when we’ll learn if open models can keep up with closed labs. The obvious answer is that no, they won’t. This answer is a form of saying they won’t keep up in every area. This framing closes off a popular prediction where the open models completely catch up, as in all models saturate and open and closed models only become increasingly similar. In living through this, it’s evidently very unclear when the longer-term stable balance of capabilities will solidify. This is a very complex dynamic, where the core point we monitor is a capability gap between models. At the same time, this gap is intertwined with evolving dynamics in the funding of open models, who builds open models, how techniques like distillation that enable fast-following translate through new application domains, potential regulation hampering the open-source AI ecosystem, and of course who actually uses open models. The capabilities gap is one signal in a complex sea of forces, pushing supply and demand into different shapes. In many cases the demand — where obviously tons of individuals, organizations, and sovereigns want, or need, open models — is largely separated from supply. Supply is fully dictated by economics. The question of “which business strategies support releasing open models” is still at stake. Interconnects AI is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber. With this complexity, I wanted to distill my key beliefs down into a clear list. These are downstream of 10+ pieces I’ve written or recorded on open models this spring (which are linked throughout). * It’s surprising that the top closed models did not show a growing capability margin over open models, based on compute differences for training and research, especially in the second half of 2025 and through today. * Open model labs are technically very strong at keeping pace on well-established benchmarks. 
This will continue and reflects a balance of abundant talent and sufficient computing power. * Chinese open-weight labs focus slightly more on benchmark scores than comparable closed labs in the U.S. Distillation helps the Chinese LLM companies do so, but it’s not a panacea. Changes in the distillation dynamic (e.g. regulation) will not be a determining factor on the balance of capabilities. This increase in focus is a natural evolution of their incentives in keeping the narrative on keeping up with the frontier alive, which is crucial to fundraising and adoption. * To date, closed models tend to be more robust and generally useful than similarly scoring open models. Closed models have certain hard-to-measure qualities that are not well captured in current or past benchmarks. This will be key to enabling closed models to dominate in markets where an individual user constantly presents new challenges, i.e. supporting knowledge workers as a direct assistant. * The open vs. closed model race, as monitored through benchmarks, will largely be a game of economic staying power and fast-following, until the market structure constricts. I expect Chinese open-weight labs to face funding difficulties first, as soon as later this year. Funding difficulties will be seen in different capability trajectories 3-9 months later. * The RL dominated training era has increased the relevance of distribution to real-world use-cases as a key factor in continued capabilities improvements. These are tasks where users directly use tools like Claude Code or Codex to solve problems in their job with agents. This is the first clear technical area that closed labs can dominate open-weight models on capabilities, potentially leveraging online RL directly based on user feedback. * Open models will be increasingly adopted in repetitive automation tasks, as measured in the relative share of the API market, for repetitive tasks across the ecosystem. 
This takes the form of many new AI-native applications, business backend automation, etc. The success of this will drive more investment in domain-specific, efficient open models. This is a complex picture, where the long-term trajectory is more of an economics question rather than an ability one. Many other outlets can paint a far more simplistic narrative that “China will assuredly catch us in AI” and get more distribution because it is a simple story. The reality is complex. Only real AI revenue begets more investment, eventually that’ll be linked to the ability to keep improving models at a rapid rate. Economic realities have not yet impacted scaling open models, as a general category. This economic-focused angle relates to my positions on the open model ecosystem more broadly. * Recurring calls to ban certain types of open models will continue to come but are in practice impossible to implement. Training strong AI models (i.e. near but not at the frontier) is a relatively small cost compared to large-scale deployments. E.g. if the U.S. bans open models over a certain compute threshold, another sovereign entity will eventually train them and release them publicly, with the models entering the U.S. market with less oversight. * The second derivative of influence on open models has shifted, and the U.S. will slowly regain ground in adoption metrics of open models starting in early 2027 (it takes a long time for China’s velocity to slow, then flip). Examples include Google’s Gemma 4 (a wild success), Nvidia’s Nemotron, and Arcee AI. * As ever-stronger closed models are built, previewed, and released, there will be more safety-shocks saying that open-weight versions of the strongest AI models never can be allowed to exist, similar to reactions to Claude Mythos. These can spur burdensome regulation on open models. 
* With the above, there will also be increased long-term interest in open models, as sovereign entities and existing power structures realize the coming, super powerful AI tools cannot land in the hands of only one or a few companies. These entities will see open models as a different governance paradigm. * New funding structures for open models will emerge, as many stakeholders realize dependencies on single, for-profit companies for access to intelligence are unreliable. * Local agents, OpenClaw, and other personal agents represent a large, to date, mostly ignored market for open model usage. It is a sort of dark matter, with pervasive, massive potential for influence on the balance of open-to-closed models. A single word governs this post and is intentionally repeated — complex. This complex reality has been driving me to think more deeply about how to clearly describe the open model gap, and why I can hold it in my head that I expect American closed labs to clearly draw ahead, despite the fairly unequivocal evidence in support of the capabilities of recent open-weight models. More on the nuance in the open-closed gap in another piece coming soon, so please subscribe! Let me know any positions that I missed. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

7 min
11 Apr

The inevitable need for an open model consortium

Recently, I was talking with Percy Liang, Stanford professor and lead of the Marin project (another fully-open model lab), and it set in on me that there will eventually be a consortium of companies funding a foundational set of open models used across industry. It’s not clear when this’ll emerge, and Nemotron (Coalition) is Nvidia’s attempt to bankroll and bootstrap this approach within a single wealthy company, but a consortium is the only long-term stable path to well-funded, near-frontier open models. In recent months, we’ve seen a lot of turnover in open model labs, with high-profile departures at Qwen and Ai2 (my comment). This shouldn’t be super surprising to followers of the ecosystem — it’s happened before with Meta shifting its focus away from Llama, and it’ll only happen more as the cost of trying to keep pace at the frontier of AI only increases. The other leading labs with models available today include Chinese startups such as Moonshot AI, MiniMax, and Z.ai — all of which look precarious on their ability to fund continued growth in the cost of training or R&D. Releasing one’s strongest models openly today is in active tension with the option of spending focus and resources on AI products that can currently generate meaningful revenue (and profits). We’re going to see business models emerge around releasing some, or even many, models openly, but these will largely be smaller models that enable a long-tail of functionality, rather than models at the absolute frontier. This class of companies that’ll release many, strong fine-tunable models will include the likes of Arcee AI, Thinking Machines, OpenAI, Google with Gemma, and more in that class. The cost and relative advantage of keeping the best models closed in a business environment with many opportunities for revenue are too high. 
To summarize: there will be an ever increasing number of companies releasing models that are good for creating a lively niche of smaller, custom models, but an ever decreasing number of companies willing to release fully open, near-frontier models. This is the core thesis of why I’m pushing hard for more people to do more research on how these smaller models can complement the best closed agents, the science of finetunability, etc. I’ve written a separate post about creating a sustainable open model ecosystem, whether or not the frontier of open keeps pace with closed. It’ll take years for this equilibrium to become more obvious, seen through the lens of more open model families coming and going. This year, it seems likely we’ll see Nvidia’s Nemotron reach new heights, Reflection AI challenge some of the Chinese models with a strong, large MoE, maybe Meta releases a new open-weight model, and so on. True pressure to change strategy will only come when the capital environment punishes the less efficient spend on resources (e.g. giving away your competitive advantage, in having an in-house model). This pressure will likely hit Chinese startups training these models first. All of Moonshot AI, MiniMax, and Zhipu AI will show signs of financial challenge in the coming years if they retain their strategy, on top of their models falling further behind the best closed models in terms of generality. This is inevitable pressure to evolve open models to areas that are profitable and complementary to the frontier of AI. Nvidia, which is best positioned to support the open ecosystem in the near term to support its core GPU business, could face many pressures to pull back its open model efforts. It could: * Realize it’s too competitive with their biggest customers as they succeed too much with Nemotron, * Fall to competition on their core business and lose the free cash flow buffer needed to fund this (e.g.
it’s 2031 and OpenAI, Anthropic, Google, and the other frontier labs are worth so much they build their own chips). * Start succeeding beyond their initial goals and keep the chips to build ASI themselves, as a closed-weight model. The pressures for new funding mechanisms for open models are based on the assumptions of continued, substantive progress on the capabilities of frontier models. Mechanisms such as self-improvement and scaling all stages of the training pipeline are underway. This progress of capabilities will only increase the potential profit in selling models as and in products, not giving them away. The scale of investment required has already begun to push away non-profits from the game of making truly frontier-scale models. Capitalism is designed to make companies ruthless and chase down leads on profitability, not donate technology as charity. As the economic environment shifts companies away from releasing the strongest models openly, more companies that rely on these models will look for a way of securing model access into the future. This is going to be compounded by a growing group of companies who come to rely on open-weight models for their workflows. These points loop back into how model training is getting more expensive, so where desire to have the models will go up, ability to procure them will go down for many players. There are x-factors that could multiply the demand for institutions to ensure the existence of open models, such as the best frontier models not even being available via API (such as if Claude Mythos never goes general access). As training relevant models is shifting to cost billions of dollars, rather than millions, few companies will be able to afford it. Many companies will bite at paying 1/10th of the cost to train a frontier model or, if the consortium works, 1/50th. The upside for companies will be some mechanism to steer development (e.g.
model sizes) or getting early access to develop internal and open-source tooling for the model. It is in my nature to, by default, say this idea will fail, as training models is inherently a complex and high-focus endeavor, one that requires integration of every part of the stack and focusing specifically on your own vision and needs, rather than trying to serve every possible user. Eventually the need for open intelligence — and economic pressure to build it — will make a model consortium inevitable. This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit www.interconnects.ai/subscribe

6 min
9 Apr

Claude Mythos and misguided open-weight fearmongering

With the announcement of the Claude Mythos model this week and the admittedly very strong stated abilities, especially in cybersecurity, a new wave of anti open-weight AI model narratives surged. The TL;DR of the argument is that our digital infrastructure will not be ready in time for an open-weight version of this model, which will allow attacks to be conducted by numerous parties. The backlash against open models in the wake of the Mythos news conflates too many general unknowns into a simple, broad policy recommendation that could actually further weaken cybersecurity readiness. We’ve been here before – open-weight models were discussed as being extremely dangerous when OpenAI withheld GPT-2 weights in 2019, and when OpenAI released GPT-4 in 2023. Both of these waves came and went. The core mistake that is being made is the composition of two issues: 1) the acceptance of the open-closed model gap being static in time and 2) linking open-weight viability generally to specific issues. I’ve written at length recently on how I think that the best, frontier-level open weight models are going to fall behind the best closed models in overall capabilities in the near future. I’ve also written about how the open-weight ecosystem needs to adapt to accept this reality. This is one of the times for the AI industry where I will repeat that it’s a total blessing to have the 6-18 month delay from when a certain capability is available within a closed lab to it being reproduced in the open. It’s a good balance of safety and monitoring the frontier of AI systems while allowing a useful open-source ecosystem to exist and thrive. The core argument I’ve focused on in the open-closed model time gap has been in general capabilities – i.e. for general purpose, frontier models such as Claude Opus 4.X or GPT Thinking 5.X. The abilities of these closed models to robustly solve and work in diverse situations as agents remains out of scope of the best open-weight models. 
What the open-weight models have tended to be better at is quickly keeping pace on key benchmarks (which admittedly is helped to some extent, but not necessarily substantially by distillation). This discussion is entirely different, it has to do with if open weight models can keep pace on the specific skills related to cybersecurity, and when we could expect an open version of this model to be available to the world. The case of a Claude Mythos level open weight model is admittedly more nuanced to me than the previous few anti-open weight narratives the community has experienced. Where GPT-4 was about a more hypothetical risk, especially in areas like bio-risk, the clear and present reality of cyber infrastructure being prone to attack is far more tangible. Still, much of this nuance in the moment comes down to not knowing the full details of what the system can actually do (i.e. Mythos), and the state of the environment it would act in (i.e. our digital infrastructure). To properly assess this risk, we need to know what it takes to build and deploy a Claude Mythos scale model. This entails three pieces: 1) training and releasing the weights, 2) the harness that gives the model effective tools it knows how to use, and 3) the inference compute and software. (Below I make some model size & price estimates to show my thinking, these should not be taken as ground truth.) Current estimates put the size ranges of leading models like Claude Opus 4.6 or GPT 5.4 as being around 3-5T parameters. Currently, the largest open-source models, which have been coming from Chinese labs, are around 1T parameters. Claude Mythos’s preview pricing is 5X Opus, which could come from a simple multiplicative increase in active parameters (with the same serving system design), far higher inference-time scaling, more complex harnesses that make inference less efficient, lower utilization expectations, and so on. 
The simplest guess is that it’s a mix of all of the above, something like 2X bigger in parameters and much less efficient to serve. That’s a huge model, likely something similar to GPT 4.5, but actually post-trained well (GPT 4.5 was ahead of its time, infra-wise). With size comes the challenge of actually training the model, as bigger models always come with new technical problems that must be solved to unlock the capabilities. For the case of cybersecurity, my guess is that most of the capabilities can be learned by training a model to be superhuman on coding. Unlike some capabilities such as knowledge work, medicine, law, etc., coding can be studied and improved substantially with public data like GitHub. I’m far more optimistic about open-weight models staying fairly close to the frontier in narrow domains of code execution and processing, but I don’t understand the full scope of skills needed to be superhuman in cybersecurity understanding. How much expert knowledge and special sauce went into training Claude Mythos? That’s a substantial source of my error bars on the impact. Second, we know nothing about how the model works under the hood. Today, models are complex systems that entail far more than just weights. They require complex tools and infrastructure to run them, of which Claude Code is the one we are most used to. Mythos very likely has its own innovations here. My estimate for how many GPUs you’d need to serve an 8T parameter, modern MoE is something like O(100) H100 GPUs, which costs something like $10K a day (and this may be very slow in terms of tok/s). Heck, the official marketing copy of the Nvidia GB200 NVL72 system is “Unlocking Real-Time Trillion-Parameter Models” on the rack. Does Mythos fit on one rack? The point isn’t to rely on my specific estimate as a policy reference, but to repeat that running leading AI systems is very expensive and not something you can just do on a laptop or through self-service cloud portals.
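The O(100) GPU and roughly $10K-a-day figures can be sanity-checked with a memory-only back-of-the-envelope calculation. The sketch below is illustrative rather than a serving plan. It assumes 8-bit weights (1 byte per parameter), the 80 GB H100 variant, and a hypothetical rental price of $4 per GPU-hour. None of these figures come from the text, and the sketch ignores KV cache, activations, and utilization, all of which push the real number higher.

```python
import math

H100_MEMORY_GB = 80  # HBM capacity of the 80 GB H100 variant

def gpus_to_hold_weights(params_trillions, bytes_per_param=1.0):
    """Minimum GPU count needed just to fit the weights in memory.

    bytes_per_param=1.0 assumes 8-bit weights; use 2.0 for 16-bit.
    """
    weights_gb = params_trillions * 1e12 * bytes_per_param / 1e9
    return math.ceil(weights_gb / H100_MEMORY_GB)

def daily_cost_usd(num_gpus, usd_per_gpu_hour):
    """Rental cost of keeping num_gpus running for 24 hours."""
    return num_gpus * usd_per_gpu_hour * 24

gpus_8bit = gpus_to_hold_weights(8)                        # 100 GPUs
gpus_16bit = gpus_to_hold_weights(8, bytes_per_param=2.0)  # 200 GPUs
cost = daily_cost_usd(gpus_8bit, usd_per_gpu_hour=4.0)     # 9600.0 dollars per day
```

Under these assumptions the weights alone occupy about a hundred GPUs, which lands in the same range as the estimate above. Doubling the precision or reserving memory for long-context inference moves the requirement well past that.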
There are far fewer actors who can get their hands on these resources, relative to those who can download the model. Of course, there are still many, but it’s important to flesh out all the details of what it would take to proliferate the capabilities of a Mythos-like model. In summary, tools like Mythos will give the best attackers more powerful tools of the trade, but they won’t hand a nuke to every teenager connected to the internet.

Personally, I do acknowledge there’s a chance that cybersecurity abuse is a red line that makes releasing open-weight text models above a certain capability threshold morally grey. Many people thought this red line would come far earlier, somewhere in between GPT-2 and GPT-4, through the harm axis of mis/disinformation, but that had different bottlenecks. For image generation models, we’re well past the first red line, which is enabling non-consensual AI deepfakes with readily available open-weight models. We’re balancing the reality of these fears having come and gone before with a technology that’s becoming increasingly capable.

So, my second large source of error bars is “how bad is it actually” with respect to the state of cybersecurity. How much can humans clean up in the most important software with months of private access to a model like Claude Mythos? What will never get fixed? For example, if we get open-weight models that are close to the capabilities of Claude Mythos, could those be fine-tuned by organizations to harden the security of their tools?

Currently, it’s too soon to call this a general reason to stop progress in open models. When Claude Mythos is restricted to so few partners, in some ways having strong open models close to the threshold makes assessing the danger easier. Having to rely fully on a single private company to determine the security of essential, international infrastructure is not a tenable equilibrium.
So, in conclusion, I urge people to further study three things:

* How do we measure cybersecurity-related capabilities across open and closed models? With this, are open models truly keeping up at a 6-9 month lag, or are they only maintaining performance relevance in other areas of coding?
* How do we independently measure the true impact of Claude Mythos and Project Glasswing on existing cybersecurity concerns?
* If it is the case that the models are keeping up and the defensive capabilities of Claude Mythos are weak, how do we better monitor (and if needed, try to regulate) the targeted capabilities of open-weight models in narrow domains?

The goal is to encourage fears about open models to remain very specific. Any general ban on open models in a nation will immediately and likely irrevocably remove that entity’s ability to influence a crucial and amorphous technology. If we stop building the best open models in the U.S., then another country will do this and become the center of the technology. There’s no way to fully kill open models, only to influence, understand, and steer them.

9 min



Creator: Nathan Lambert
Years Active: 2023 - 2026
Episodes: 151
Rating: Clean
Show Website: Interconnects

