Semi Doped

Vikram Sekar and Austin Lyons

5.0 (17)
TECHNOLOGY

The business and technology of semiconductors. Alpha for engineers and investors alike.

4D AGO

Lithography Masterclass

Spend one hour here and you've caught up on the entire arc of semiconductor lithography. Austin and Vik run a masterclass on the technology that decides who gets to make leading-edge chips, and why so few companies can afford to. The thread is economics. An EUV machine runs about $400 million, a new fab needs roughly 15 of them, and the total bill clears $20-30 billion before a single wafer ships. Austin and Vik trace the whole story: Rock's Law and the cost of a fab, what it actually takes to build one, the evolution from 193nm DUV through multi-patterning to 13.5nm EUV, how ASML generates EUV light by exploding falling tin droplets, and the move to high NA and its mirrors. Along the way, the fun history — i-line, krypton fluoride, immersion lithography, and the engineer who started it all by flipping a microscope upside down. Then the part that matters most: where lithography goes next. Two startups, xLight and Substrate, are attacking the cost problem from first principles. xLight wants to decouple the light source from the scanner with a free-electron laser and sell photons as a service. Substrate wants to skip EUV entirely and revive X-ray lithography. If either works, the economics of who can build a fab change completely. 
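The cost and wavelength figures in this description can be turned into a quick back-of-envelope check. The sketch below uses the round numbers quoted above ($400 million per EUV machine, roughly 15 machines, a $20-30 billion fab). The numerical apertures (1.35, 0.33, 0.55) and the process factor k1 = 0.3 are assumptions added for illustration; they are not stated in the episode notes.

```python
# Back-of-envelope lithography numbers.
# Prices and counts are the round figures from the description;
# NA values and k1 are illustrative assumptions, not from the episode notes.

def scanner_spend(price_usd: float, count: int) -> float:
    """Total spend on EUV scanners for one fab."""
    return price_usd * count


def rayleigh_cd_nm(wavelength_nm: float, na: float, k1: float = 0.3) -> float:
    """Rayleigh criterion: smallest printable feature, CD = k1 * wavelength / NA."""
    return k1 * wavelength_nm / na


euv_total = scanner_spend(400e6, 15)   # 6.0e9 USD
share_low = euv_total / 30e9           # scanner share of a $30B fab
share_high = euv_total / 20e9          # scanner share of a $20B fab

tools = {
    "193nm immersion DUV (NA 1.35)": (193.0, 1.35),
    "13.5nm EUV (NA 0.33)": (13.5, 0.33),
    "13.5nm high-NA EUV (NA 0.55)": (13.5, 0.55),
}
cd = {name: rayleigh_cd_nm(wl, na) for name, (wl, na) in tools.items()}

print(f"EUV scanners: ${euv_total / 1e9:.1f}B, "
      f"{share_low:.0%} to {share_high:.0%} of the fab bill")
for name, value in cd.items():
    print(f"{name}: ~{value:.1f} nm single-exposure feature size")
```

Under these assumptions the scanners alone account for roughly a fifth to a third of the fab bill, and the single-exposure feature size shrinks with each step from DUV to EUV to high NA, which is the gap multi-patterning was used to bridge.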
Chapters:
0:00 The 13F panic, and today's topic
2:23 Why the real story is economics, not physics
6:18 Austin in the clean room: graphene and bunny suits
10:06 Rock's Law and the $20 billion fab
18:08 DUV, the Sharpie, and a history of light
24:58 Multi-patterning, explained with a football field
34:45 How EUV makes 13.5nm light from tin droplets
41:14 High NA, anamorphic optics, and the half-field tax
46:45 The startups rethinking lithography: xLight and Substrate

Relevant reading:
Chipstrat, The economics of lithography: https://www.chipstrat.com/p/lithography-economics
Chipstrat, xLight and photons as a service: https://www.chipstrat.com/p/photons-as-a-service
Chipstrat, Substrate and X-ray lithography: https://www.chipstrat.com/p/substrate
Vik's Newsletter, the viability of X-ray lithography: https://www.viksnewsletter.com/p/an-in-depth-look-at-the-viability
Fred Chen, LELE multipatterning and EUV stochastics (Substack): https://frederickchen.substack.com/p/can-lele-multipatterning-help-against
Chip War, Chris Miller
Focus, Marc Hijink (the ASML book): https://www.amazon.com/Focus-Inside-struggle-complex-machine-ebook/dp/B0CW1FLCD4

Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/chipstrat

Follow Vik:
Newsletter: https://www.viksnewsletter.com/
X: https://x.com/vikramskr

Follow Semi Doped:
Get more of Austin and Vik daily, free! Sign up: https://www.semidoped.com/

1h 3m
MAY 15

Cerebras IPO

Cerebras IPO is the only thing to talk about this week. 🔥 IPO prices at $185/share. Pops nearly 70% right after. The first wafer-scale chip company to make it public, after a 40-year curse killed every prior attempt.

A water-cooler-style convo on what Cerebras actually builds, why a 23 kW wafer is a power and cooling nightmare, why 44 GB of SRAM is both the magic and the wall for LLM inference, and the cursed Trilogy Systems saga that Gene Amdahl tried, and failed, to pull off in 1983.

Why does Cerebras leave the whole wafer intact instead of dicing it? How do they route around defects to harvest ~900K working cores out of ~1M? Why is power delivery vertical, and why does the wafer literally expand a tenth of a millimeter when it heats up? What does the OpenAI deal actually buy: wafers, or tokens? And why does that distinction matter?

Chapters:
0:00 Cold open: 23 kW per wafer
0:15 Cerebras IPO day at $185
2:39 What's a wafer-scale engine
10:30 Power, cooling, and thermal expansion
18:12 The 44 GB wall
26:35 The Trilogy Systems curse
32:11 Supercomputing → training → inference
39:36 The OpenAI deal and the Wild West

Relevant reading:
Vik's Substack post on the Cerebras IPO and OpenAI deal: https://www.viksnewsletter.com/

Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/austinsemis

Follow Vik:
Newsletter: https://www.viksnewsletter.com/
X: https://x.com/vikramskr

Follow Semi Doped:
Get more of Austin and Vik daily, free! Sign up: https://www.semidoped.com/
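To make the "44 GB wall" from this episode concrete, here is a minimal sketch of how many wafers it takes just to hold a model's weights in on-wafer SRAM. The 44 GB figure and the ~900K of ~1M cores come from the description. The model sizes, the bytes-per-parameter choices, decimal gigabytes, and the function name `wafers_for_weights` are hypothetical, and the estimate ignores KV cache and activations.

```python
import math

SRAM_GB = 44  # on-wafer SRAM quoted in the episode description


def wafers_for_weights(params_billion: float, bytes_per_param: float,
                       sram_gb: float = SRAM_GB) -> int:
    """Minimum wafers needed to hold the weights alone.

    Assumes decimal GB and ignores KV cache, activations, and any
    replication, so real deployments would need more.
    """
    weight_gb = params_billion * bytes_per_param  # 1e9 params * bytes = GB
    return math.ceil(weight_gb / sram_gb)


# Largest model (in billions of parameters) that fits on one wafer at 16-bit weights.
max_params_16bit_billion = SRAM_GB / 2

# Fraction of cores harvested after routing around defects (~900K of ~1M).
core_yield = 900_000 / 1_000_000

for size_b, nbytes in [(8, 2), (120, 2), (120, 1)]:  # hypothetical model sizes
    print(f"{size_b}B params at {nbytes} byte(s) each: "
          f"{wafers_for_weights(size_b, nbytes)} wafer(s)")
```

Under these assumptions a single wafer tops out around 22 billion 16-bit parameters, so larger models have to be spread across several wafers, which is one way to read "both the magic and the wall".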

51 min
MAY 12

Gimlet's Cross-Vendor Inference Cloud

Gimlet Labs runs an inference cloud built on heterogeneous silicon. Their software traces a PyTorch workload, segments it into its component parts, and schedules each piece onto the best-suited hardware — connecting chips from different vendors on a single high-speed fabric. In this interview, Gimlet co-founder Natalie Serrino and former Intel executive Beltir walk through the architecture (graph trace, optimal split points, lowering each segment to TensorRT on NVIDIA and equivalents elsewhere), the three customer segments they sell into (frontier labs, sovereign clouds, AI natives), and a concrete demo: on GPT-OSS 120B at 8K input / 1K output, running the speculative decoder on a d-Matrix Corsair card while NVIDIA B200s handle the verifier shifts the throughput-vs-interactivity Pareto frontier roughly 4× over GPU-only speculative decode. The most surprising takeaway: most Neoclouds gave significant equity to a single silicon vendor in exchange for capacity. Hardware amortization is around 70% of their annual costs, and the equity terms prevent them from diversifying their silicon. So the only software innovation they can ship is disaggregation on top of one vendor's stack — never across vendors. Gimlet's two-track model (deploying orchestration software inside customer data centers, plus running their own Neocloud built on mixed silicon) is the answer to that constraint. Read the full transcript on Chipstrat. 
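The demo described here pairs a small draft model (on the d-Matrix Corsair card) with a large verifier (on NVIDIA B200s). As a minimal sketch of the underlying technique, the code below implements greedy speculative decoding with toy stand-in models. It is not Gimlet's implementation: the names `speculative_decode`, `toy_draft`, and `toy_target` are hypothetical, and no hardware placement or scheduling is modeled.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]  # greedy next-token function: context -> token


def speculative_decode(draft: NextToken, target: NextToken,
                       prompt: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy speculative decoding; output equals target-only greedy decoding."""
    out = list(prompt)
    goal = len(prompt) + n_new
    while len(out) < goal:
        # 1) The cheap draft model proposes k tokens autoregressively.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(out + proposal))
        # 2) The target model checks each position (one batched pass on real hardware).
        accepted: List[int] = []
        for tok in proposal:
            expected = target(out + accepted)
            if tok != expected:
                accepted.append(expected)  # first mismatch: keep the target's token, stop
                break
            accepted.append(tok)
        else:
            accepted.append(target(out + accepted))  # all matched: one bonus token
        out.extend(accepted)
    return out[:goal]


def greedy_decode(model: NextToken, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: plain one-token-at-a-time greedy decoding."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(model(out))
    return out


# Toy models over integer tokens, purely illustrative.
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] * 3 + 1) % 17


def toy_draft(ctx: List[int]) -> int:
    # Agrees with the target except when the last token is a multiple of 5.
    guess = toy_target(ctx)
    return guess if ctx[-1] % 5 else (guess + 1) % 17


print(speculative_decode(toy_draft, toy_target, [2], 12))
print(greedy_decode(toy_target, [2], 12))
```

Both calls print the same sequence: the verifier decides every token, and the draft only changes how many tokens are settled per verifier pass. That is why the draft and verifier can sit on different chips, each suited to its own resource profile.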
Chapters:
0:00 Intro and the chips no one's connected before
0:33 Inference cloud for agents
1:02 From Intel to Gimlet
2:14 The case for heterogeneous inference
4:03 Disaggregating inference by resource profile
6:24 Tracing PyTorch into a schedulable graph
8:08 Connecting chips never connected before
10:52 CPUs as the agentic workhorse
12:01 Tool calls in the same data center as the LLM
13:21 Latency vs throughput on a shared fabric
14:57 Three customer buckets
15:54 Sovereigns: make an API call, not a porting project
19:37 "Cracked software is the platform"
22:24 Why merchant silicon vendors need partners
25:18 Hyperscalers outsourcing CapEx, not just kernels
28:49 AI natives: latency budgets, not just price
32:06 The d-Matrix partnership
33:31 The Pareto frontier chart
35:56 Speculative decode on Corsair: 4× shift
37:27 4× faster, or 3× more customers?
41:22 Why most Neoclouds can't follow this model
42:34 Gimlet's two-track business model
44:30 CoreWeave vs Together vs Gimlet
45:15 Series A and hiring

Relevant reading:
The Information on Gimlet helping OpenAI optimize for Cerebras: https://www.theinformation.com/newsletters/ai-agenda/startup-helping-openai-optimize-ai-cerebras-chips
Sachin Katti and Zain Asgar coauthored research at Stanford: https://arxiv.org/abs/2507.19635

Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/chipstrat

49 min
MAY 8

Power as the Next Physics Wall for AI

What's common to optics and power that ruins everything in the era of AI? Resistance. The same physics that drove interconnects to optics is now driving low-voltage power delivery up to 800V. Austin Lyons (Chipstrat) and Vik Sekar (Vik's Newsletter) unpack it using the Kyber rack as an example. At 600kW and 48V, you're pushing 12,500 amps through a single rack. Power loss scales with I². The math doesn't work. The fix is 800V, and the parts come straight from the EV traction inverter ecosystem (SiC, GaN, IGBTs).

We cover the full grid-to-GPU power conversion chain (substation, utility room, PSU, intermediate bus converter, VRM), why vertical power delivery is the CPO equivalent for power, and why the power industry is a much wider open problem than optics or HBM. Plus the new topology fight: 800V → 48V (reuse the existing 48V infrastructure) vs 800V → 6V (skip 48V entirely, like TI and Navitas are pushing). We also touch Coherent's six-inch indium phosphide ramp at Järfälla, Sweden, and why margins are the real read-through next quarter.

Relevant reading:
Vik's Substack post on power: https://www.viksnewsletter.com/p/power-delivery-as-the-next-physics-wall
Google TPU 8i / 8t blog (Boardfly deep dive): https://cloud.google.com/blog/products/compute/tpu-8t-and-tpu-8i-technical-deep-dive

Get more of Austin and Vik daily, free! Sign up here: https://www.semidoped.com/

Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/austinsemis

Follow Vik:
Newsletter: https://www.viksnewsletter.com/
X: https://x.com/vikramskr

Chapters
(00:00) Intro
(01:41) Memory tax: inflation, not innovation
(03:46) Boardfly: 16 hops to 7
(05:12) Coherent's six-inch indium phosphide ramp
(12:15) Power is the next physics wall
(15:08) Why 48V breaks at 600kW: 12,500 amps
(23:05) 800V and vertical power delivery: CPO for power
(30:34) Grid to GPU: every stage is a different supply chain
(39:20) 800V → 48V or skip straight to 6V?
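The 12,500 amp figure in this episode follows directly from I = P / V, and the I² scaling shows why raising the bus voltage helps so much. A minimal sketch, assuming the same delivered power and the same conductor resistance at both voltages (a simplification, since real 48V and 800V busbars would be sized differently); the function names are hypothetical.

```python
def rack_current_amps(power_w: float, bus_v: float) -> float:
    """Current needed to deliver power_w at bus voltage bus_v (I = P / V)."""
    return power_w / bus_v


def relative_conduction_loss(v_old: float, v_new: float) -> float:
    """Ratio of I^2 * R loss at v_new versus v_old.

    Assumes identical power and identical resistance R, so the ratio
    reduces to (I_new / I_old)^2 = (v_old / v_new)^2.
    """
    return (v_old / v_new) ** 2


RACK_POWER_W = 600_000  # 600 kW rack from the episode description

i_48 = rack_current_amps(RACK_POWER_W, 48)     # 12,500 A
i_800 = rack_current_amps(RACK_POWER_W, 800)   # 750 A
loss_ratio = relative_conduction_loss(48, 800)

print(f"48V bus:  {i_48:,.0f} A")
print(f"800V bus: {i_800:,.0f} A")
print(f"Conduction loss at 800V is {loss_ratio:.2%} of the 48V loss "
      f"(about {1 / loss_ratio:.0f}x lower)")
```

Under these assumptions, moving from 48V to 800V cuts the current by a factor of about 16.7 and the resistive loss by that factor squared.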

42 min
MAY 4

CapEx is just Memory Tax Now, DeepSeek V4 NAND impact

The hyperscaler memory tax quarter. More CapEx? Pssh. We knew flops needed scaling. But $25B at Microsoft alone just to pay higher component prices? A memory tax. That's the news. NAND? Sold out. HBM? Sold out.

What we cover:
- SanDisk revenue +97% sequential. 78% gross margin. Guidance above 80% next quarter.
- Samsung HBM4 first to ship. Demand outstripping supply.
- DeepSeek v4 goes SSD-centric. KV cache offloads to flash.
- Microsoft: $25B of 2026 CapEx is just memory pricing.
- Jassy: memory shortage pushes on-prem to AWS.
- Qualcomm: mystery custom ASIC. Ships December.

New Semi Doped with @vikramskr and @austinsemis. Check out our Substacks:
- https://www.viksnewsletter.com/
- https://www.chipstrat.com/

Chapters:
0:00 Intro and Vik goes full-time
5:15 Earnings week: the memory tax
7:26 Samsung HBM4 and the Gbps race
14:42 Is the memory tax worth it?
17:37 SanDisk and the SunDisk origin
23:22 78% gross margins and 5-year supply lock-ins
29:29 DeepSeek v4 and SSD-centric inference
38:49 Hyperscaler CapEx and the cloud pull
42:49 AI accelerators: TPU, Trainium, MTIA

46 min
APR 24

Masterclass on Google's TPU v8 Networking

Google's Cloud Next 2026 keynote? Fire. 🔥 The TPU is now two chips instead of one (8t for training, 8i for inference), but more interestingly, it's two scale-up networking topologies too. Austin Lyons (Chipstrat) and Vik Sekar (Vik's Newsletter) walk through what actually changed, one day after the announcement. OCS? Yes. AECs? Yep. Copper? Yep. Optics? Yep.

We cover Virgo (Google's 47 petabit/second scale-out fabric, built entirely on OCS), Boardfly (the new scale-up topology for MoE inference that cuts hop count from 16 to 7), and the 3D torus Google still uses for training. Why is optical circuit switching the substrate of Google's data center? Why do active electrical cables still carry scale-up traffic inside racks? Why did Google split the CPU layer too, with custom ARM Axion head nodes to keep the TPUs fed?

Along the way we trace the Dragonfly topology lineage to a 2008 paper by John Kim, Bill Dally, Steve Scott, and Dennis Abts. Abts went on to build Groq's rack-scale interconnect before landing at Nvidia.

Chapters:
0:00 Intro
0:21 Two TPUs for two workloads
2:31 HBM, SRAM, and Axion CPUs
7:22 Why networking is the new bottleneck
17:14 Virgo: rebuilding scale-out on optics
25:24 3D torus Rubik's Cube scale-up for training
34:50 Boardfly: scale-up for MoE inference
42:07 Workload-specific everything

Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/austinsemis

Follow Vik:
Newsletter: https://www.viksnewsletter.com/
X: https://x.com/vikramskr
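Hop count is the quantity the Boardfly discussion turns on, so it may help to see how worst-case hops grow in a 3D torus. The sketch below computes the diameter of a torus with wraparound links and cross-checks the closed form with a breadth-first search. The torus shapes are hypothetical examples: the episode notes do not give Google's pod dimensions, so this does not reproduce the 16 and 7 hop figures.

```python
from collections import deque
from itertools import product
from typing import Dict, Tuple

Node = Tuple[int, ...]


def torus_diameter(dims: Tuple[int, ...]) -> int:
    """Worst-case hop count in a torus: each ring of size d contributes d // 2."""
    return sum(d // 2 for d in dims)


def bfs_diameter(dims: Tuple[int, ...]) -> int:
    """Brute-force check: longest shortest path from the origin.

    A torus looks the same from every node, so the origin's
    eccentricity equals the diameter. Only practical for small shapes.
    """
    origin: Node = tuple(0 for _ in dims)
    dist: Dict[Node, int] = {origin: 0}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for axis, step in product(range(len(dims)), (-1, 1)):
            nxt = list(node)
            nxt[axis] = (nxt[axis] + step) % dims[axis]  # wraparound link
            key = tuple(nxt)
            if key not in dist:
                dist[key] = dist[node] + 1
                queue.append(key)
    return max(dist.values())


for shape in [(4, 4, 4), (8, 8, 8), (16, 16, 16)]:  # hypothetical shapes
    print(shape, "chips:", shape[0] * shape[1] * shape[2],
          "worst-case hops:", torus_diameter(shape))
```

The takeaway is that torus hop count grows with the side length of the cube, which suits the nearest-neighbor traffic of training better than the all-to-all traffic of MoE inference, where a lower-diameter topology pays off.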

47 min
APR 20

Meta VP Matt Steiner on Ads Infra, GPUs, MTIA, and LLM-Written Kernels

Matt Steiner, VP of Monetization Infrastructure, Ranking & AI Foundations at Meta, walks through how Meta's ad system actually works, and why the infrastructure behind it differs from what you'd build for LLMs.

We cover Andromeda (retrieval on a custom NVIDIA Grace Hopper SKU Meta co-designed), Lattice (consolidating N ranking models into one), GEM (Meta's Generative Ads Recommendation foundation model), and the adaptive ranking model, a roughly one-trillion-parameter recommender served at sub-second latency. We get into why recommender workloads aren't embarrassingly parallel like LLMs (the "personalization blob"), what that means for Meta's MTIA custom silicon roadmap, and how LLM-written kernels (KernelEvolve) flipped the economics of running a heterogeneous hardware fleet. Demand for software engineering has actually gone up as the price has come down. Meta now wants ~100x more optimized kernels per chip.

Read the full transcript at https://www.chipstrat.com/p/an-interview-with-meta-vp-matt-steiner

Chapters:
0:00 Intro and scale
0:39 How Meta's ad system works
2:00 Meta Andromeda and the custom NVIDIA SKU
3:30 Lattice: consolidating ranking models
5:00 GEM, Meta's ads foundation model
6:30 Adaptive ranking for power users
8:17 The scale: 3B DAUs at sub-second latency
9:40 Why longer interaction histories matter
10:45 The anniversary gift analogy
12:57 A decade of compute evolution
15:21 Meta's infra as a CP-SAT problem
16:07 Co-designing Grace Hopper with NVIDIA
17:47 Matching compute shape to workload
18:26 Influencing hardware and software roadmaps
20:23 MTIA: why ads aren't LLMs
22:07 The personalization blob and I/O ratios
26:38 One trillion parameters at sub-second latency
28:26 Heterogeneous hardware trade-offs
29:30 KernelEvolve: LLMs writing custom kernels
33:30 GenAI and recommender systems cross-pollination
35:21 The 2-year infrastructure outlook
37:00 Why demand for software engineering is rising
38:53 How Matt stays on top of it all

Relevant reading:
KernelEvolve (Meta Engineering): https://engineering.fb.com/2026/04/02/developer-tools/kernelevolve-how-metas-ranking-engineer-agent-optimizes-ai-infrastructure/

Follow Chipstrat:
Newsletter: https://www.chipstrat.com
X: https://x.com/chipstrat

40 min
APR 17

Credo + Dust Photonics, XPO, Nuvacore

Austin and Vik discuss Credo's acquisition of Dust Photonics, XPO as the new standard for scale-out (maybe instead of CPO?) and some thoughts about Nuvacore entering the CPU scene for agentic AI.

Gavin Baker's tweet: https://x.com/GavinSBaker/status/2044410644301046031?s=20
Vik's Substack: https://www.viksnewsletter.com
Austin's Substack: https://www.chipstrat.com

Chapters
00:00 Introduction to the Semiconductor Landscape
02:49 The Rise of Nuvacore and CPU Innovations
05:27 The Demand for CPUs in the AI Era
07:59 Photonics: The Next Frontier in Semiconductors
10:26 Credo's Acquisition of Dust Photonics
13:12 Vertical Integration in Semiconductor Companies
15:15 The Future of Copper and Optical Technologies
20:28 The Evolution of AI Training Models
25:28 Innovations in Optical Interconnects
31:10 The Future of Data Center Connectivity
36:56 Strategic Implications in the Optical Ecosystem

38 min


5.0 out of 5 (17 Ratings)

Amazing Amazing! 🩶👾

Feb 21

Overclocked Espresso

Semi Doped has quickly become my absolute favorite due to its insightful and well-researched content, including detailed interviews with industry experts and comprehensive analysis of current trends. I always learn something new from every episode, thanks to the clear explanations that make even the most complex topics accessible and understandable. Thanks so much, Vik and Austin, for all the hard work! Your effort truly means a lot. 🚀🚀


Creator: Vikram Sekar and Austin Lyons
Years Active: 2K
Episodes: 30
Rating: Clean
Show Website: Semi Doped
