Arcee AI is the startup I’ve found taking the most realistic approach to monetizing their open models. With plenty of past experience (and revenue) post-training open models for specific customer domains, they realized they needed to both prove themselves and fill a niche by pretraining larger, higher-performance open models built in the U.S.A. They’re the group most eagerly answering my call to action for The ATOM Project, and I’ve quickly become friends with them. Today, they’re releasing their flagship model — Trinity Large — as the culmination of this pivot.

In anticipation of this release, I sat down with their CEO Mark McQuade, CTO Lucas Atkins, and pretraining lead Varun Singh for a wide-ranging conversation on:

* The state (and future) of open vs. closed models,
* The business of selling open models for on-prem deployments,
* The story of Arcee AI & going “all-in” on this training run,
* The ATOM Project,
* Building frontier model training teams in 6 months,
* and other great topics.

I really loved this one, and think you will too.

The blog post linked above and the technical report have many great details on training the model that I’m still digging into. One of the great things Arcee has been doing is releasing “true base models,” which don’t include any SFT data or learning rate annealing. The Trinity Large model, an MoE with 400B total and 13B active parameters trained on 17 trillion tokens, is the first publicly shared training run at this scale on Nvidia B300 Blackwell machines. As a preview, they shared scores for the in-progress reasoning model relative to the who’s-who of today’s open models. It’s a big step for open models built in the U.S. to scale up like this.

I won’t spoil all the details, so you’ll still listen to the podcast, but their section of the blog post on cost sets the tone well for the episode, which is a very frank discussion of how and why to build open models:

> When we started this run, we had never pretrained anything remotely like this before. There was no guarantee this would work. Not the modeling, not the data, not the training itself, not the operational part where you wake up, and a job that costs real money is in a bad state, and you have to decide whether to restart or try to rescue it. All in—compute, salaries, data, storage, ops—we pulled off this entire effort for $20 million. 4 Models got us here in 6 months. That number is big for us. It’s also small compared to what frontier labs spend just to keep the lights on. We don’t have infinite retries.

Once I post this, I’m going to dive right into trying the model, and I’m curious what you find too.

Listen on Apple Podcasts, Spotify, YouTube, and wherever you get your podcasts. For other Interconnects interviews, go here.

Guests

* Lucas Atkins — X, LinkedIn — CTO; leads pretraining/architecture, wrote the Trinity Manifesto.
* Mark McQuade — X, LinkedIn — Founder/CEO; previously at Hugging Face (monetization), Roboflow. Focused on shipping enterprise-grade open-weight models + tooling.
* Varun Singh — LinkedIn — pretraining lead.

Most of this interview is conducted with Lucas, but Mark and Varun make great additions at the right times.

Links

Core:

* Trinity Large (400B total, 13B active) collection, blog post. Instruct model today, reasoning models soon.
* Trinity Mini, 26B total 3B active (base, including a pre-anneal checkpoint release)
* Trinity Nano Preview, 6B total 1B active (base)
* Open Source Catalog: https://www.arcee.ai/open-source-catalog
* API Docs and Playground (demo)
* Socials: GitHub, Hugging Face, X, LinkedIn, YouTube

Trinity Models:

* Trinity models page: https://www.arcee.ai/trinity
* The Trinity Manifesto (I recommend you read it): https://www.arcee.ai/blog/the-trinity-manifesto
* Trinity HF collection (Trinity Mini & Trinity Nano Preview)

Older models:

* AFM-4.5B (and base model) — their first open model pretrained in-house (blog post).
* Five open-weights models (blog): three production models previously exclusive to their SaaS platform plus two research models, released as they shifted focus to AFM — Arcee-SuperNova-v1, Virtuoso-Large, Caller, GLM-4-32B-Base-32K, Homunculus

Open source tools:

* MergeKit — model merging toolkit (license returned to LGPL)
* DistillKit — knowledge distillation library
* EvolKit — synthetic data generation via evolutionary methods

Related:

* Datology case study w/ Arcee

Chapters

* 00:00:00 Intro: Arcee AI, Trinity Models & Trinity Large
* 00:08:26 Transitioning a Company to Pre-training
* 00:13:00 Technical Decisions: Muon and MoE
* 00:18:41 Scaling and MoE Training Pain
* 00:23:14 Post-training and RL Strategies
* 00:28:09 Team Structure and Data Scaling
* 00:31:31 The Trinity Manifesto: US Open Weights
* 00:42:31 Specialized Models and Distillation
* 00:47:12 Infrastructure and Hosting 400B
* 00:50:53 Open Source as a Business Moat
* 00:56:31 Predictions: Best Model in 2026
* 01:02:29 Lightning Round & Conclusions

Transcript

Transcript generated with ElevenLabs Scribe v2 and cleaned with Claude Code with Opus 4.5.

00:00:06 Nathan Lambert: I’m here with the Arcee AI team. I personally have become a bit of a fan of Arcee, ‘cause I think what they’re doing in trying to build a company around building open models is a valiant and very reasonable way to do this, ‘cause nobody really has a good business plan for open models, and you just gotta try to figure it out, and you gotta build better models over time. And like open-source software, building in public, I think, is the best way to do this. So this kind of gives you the wheels to get the, um... You get to hit the ground running on whatever you’re doing. And this week, they’re launching their biggest model to date, which I’m very excited to see more kind of large-scale MoE open models. I think we’ve seen, I don’t know, at least ten of these from different providers from China last year, and it’s obviously a thing that’s gonna be international, and a lot of people building models, and the US kind of, for whatever reason, has fewer people building, um, open models here. And I think that wherever people are building models, they can stand on the quality of the work. But whatever. I’ll stop rambling. I’ve got Lucas, Mark, um, Varun on the, on the phone here. I’ve known some of them, and I consider us friends. We’re gonna kind of talk through this model, talk through building open models in the US, so thanks for hopping on the pod.

00:01:16 Mark McQuade: Thanks for having us.

00:01:18 Lucas Atkins: Yeah, yeah. Thanks for having us. Excited.

00:01:20 Varun Singh: Nice to be here.

00:01:20 Nathan Lambert: What- what should people know about this Trinity Large? What’s the actual name of this model? Like, how stoked are you?

00:01:29 Lucas Atkins: So to- yeah.

00:01:29 Nathan Lambert: Like, are you, like, finally made it?
00:01:32 Lucas Atkins: Uh, you know, we’re recording this a little bit before release, so it’s still like, you know, getting everything buttoned up, and inference going at that size is always a challenge, but we’re-- This has been, like, a six-month sprint since we released our first dense model, which is 4.5B, uh, in, in July of last year, 2025. So, um, it’s always been in service of releasing Large. I- it’s a 400B, um, thirteen billion active sparse MoE, and, uh, yeah, we’re, we’re super excited. This has just been the entire thing the company’s focused on the last six months, so really nice to have kind of the fruits of that, uh, start to, start to be used by the people that you’re building it for.

00:02:16 Nathan Lambert: Yeah, I would say, like, the realistic question: do you think this is landing in the ballpark of the models in the last six months? Like, that has to be what you shoot for, is there’s a high bar- ... of open models out there and, like, on what you’re targeting. Do you feel like these hit those, for somebody that’s familiar? Like, MiniMax is, like, two thirty total, something less. I, I don’t know what it is. It’s like ten to twenty B active, probably. Um, you have DeepSeeks in the six hundred range, and then you have Kimi at the one trillion range. So this is still, like, actually on the smaller side of some of the big MoEs- ... that people know, which is, like, freaking crazy, especially you said 13B active. It’s, like- ... very high on the sparsity side. So I don’t actually know how you think about comparing it among those. I was realizing that MiniMax is smaller, doing some data analysis. So I think that it’s like, actually, the comparison might be a little bit too forced, where you just have to make something that is good and figure out if people use it.

00:03:06 Lucas Atkins: Yeah, I mean, if, if from raw compute, we’re, we’re roughly in the middle of MiniMax and then GLM 4.5, as far as, like, size. Right, GLM’s, like, three eighty, I believe, and, and thirty-four active. Um, so it-- you know, we go a little bit higher on the total, but we, we cut the, uh, the active in half. Um, it was definitely tricky when we decided we wanted to do this. Again, it was July when... It, it was July when we released, uh, the dense model, and then we immediately knew we wanted to kind of go, go for a really big one, and the, the tricky thing with that is knowing that it’s gonna take six months. You, you can’t really be tr-- you can’t be building the model to be competitive when you started designing it, because, you know, that, obviously, a lot happens in this industry in six months. So, um, throughout pre-training, a lot of our targets were the GLM 4.5 base model, um, because 4.6 and 4.7 have been, you know, post-training on top of that. Um, and, like, performance-wise, it’s well within where we want it to be. Um, it’s gonna be... Technically, we’re calling it Trinity Large Preview because we just have a