Summary

Microsoft has a habit of renaming things in ways that make people scratch their heads — “Fabric,” “OneLake,” “Lakehouse,” “Warehouse,” etc. In this episode, I set out to cut through the naming noise and show what actually matters under the hood: how data storage, governance, and compute interact in Fabric, without assuming you’re an engineer. We dig into how OneLake works as the foundation, what distinguishes a Warehouse from a Lakehouse, why Microsoft chose Delta + Parquet as the storage engine, and how shortcuts, governance, and workspace structure help (or hurt) your implementation. This isn’t marketing fluff — it’s the real architecture that determines whether your organization’s data projects succeed or collapse into chaos. By the end, you’ll be thinking less “What is Fabric?” and more “How can we use Fabric smartly?” — with a sharper view of trade-offs, pitfalls, and strategies.

What You’ll Learn

* The difference between Warehouse and Lakehouse in Microsoft Fabric
* How OneLake acts as the underlying storage fabric for all data workloads
* Why Delta + Parquet matter — not just as buzzwords, but as core guarantees (ACID, versioning, schema)
* How shortcuts let you reuse data without duplication — and the governance risks involved
* Best practices for workspace design, permissions, and governance layers
* What to watch out for in real deployments (e.g. role mismatches, inconsistent access paths)

Full Transcript

Here’s a fun corporate trick: Microsoft managed to confuse half the industry by slapping the word “house” on anything with a data label. But here’s what you’ll actually get out of the next few minutes: we’ll nail down what OneLake really is, when to use a Warehouse versus a Lakehouse, and why Delta and Parquet keep your data from turning into a swamp of CSVs. That’s three concrete takeaways in plain English. Want the one‑page cheat sheet? Subscribe to the M365.Show newsletter.
Now, with the promise clear, let’s talk about Microsoft’s favorite game: naming roulette.

Lakehouse vs Warehouse: Microsoft’s Naming Roulette

When people first hear “Lakehouse” and “Warehouse,” it sounds like two flavors of the same thing. Same word ending, both live inside Fabric, so surely they’re interchangeable—except they’re not. The names are what trip teams up, because they hide the fact that these are different experiences built on the same storage foundation.

Here’s the plain breakdown. A Warehouse is SQL-first. It expects structured tables, defined schemas, and clean data. It’s what you point dashboards at, what your BI team lives in, and what delivers fast query responses without surprises. A Lakehouse, meanwhile, is the more flexible workbench. You can dump in JSON logs, broken CSVs, or Parquet files from another pipeline and not break the system. It’s designed for engineers and data scientists who run Spark notebooks, machine learning jobs, or messy transformations.

If you want a visual, skip the sitcom-length analogy: think of the Warehouse as a labeled pantry and the Lakehouse as a garage with the freezer tucked next to power tools. One is organized and efficient for everyday meals. The other has room for experiments, projects, and overflow. Both store food, but the vibe and workflow couldn’t be more different.

Now, here’s the important part Microsoft’s marketing can blur: neither exists in its own silo. Both Lakehouses and Warehouses in Fabric store their tables in the open Delta Parquet format, both sit on top of OneLake, and both give you consistent access to the underlying files. What’s different is the experience you interact with. Think of Fabric not as separate buildings, but as two different rooms built on the same concrete slab, each furnished for a specific kind of work.

From a user perspective, the divide is real. Analysts love Warehouses because they behave predictably with SQL and BI tools.
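The breakdown above — SQL-first Warehouse, Spark-friendly Lakehouse — boils down to a rule of thumb you can write in a few lines. This is purely an illustrative sketch of that decision pattern; the function name and workload flags are invented for this episode, not any Microsoft API:

```python
# Illustrative sketch of the Warehouse-vs-Lakehouse rule of thumb.
# The flags ("structured", "needs_spark", "exploratory") are this
# episode's shorthand for describing a workload, nothing official.

def pick_experience(workload: dict) -> str:
    """Return 'Warehouse' or 'Lakehouse' for a described workload."""
    structured = workload.get("structured", False)
    needs_spark = workload.get("needs_spark", False)
    exploratory = workload.get("exploratory", False)

    # Semi-structured data, Spark jobs, or open-ended exploration
    # belong in the flexible workbench.
    if needs_spark or exploratory or not structured:
        return "Lakehouse"
    # Curated, schema-defined, SQL-driven reporting belongs here.
    return "Warehouse"

print(pick_experience({"structured": True}))                      # Warehouse
print(pick_experience({"structured": False, "needs_spark": True}))  # Lakehouse
```

The point of writing it out: the choice hinges on the workload's shape, not on which product name sounds fancier.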
They don’t want to crawl through raw web logs at 2 a.m.—they want structured tables with clean joins. Data engineers and scientists lean toward Lakehouses because they don’t want to spend weeks normalizing heaps of JSON just to answer “what’s trending in the logs.” They want Spark, Python, and flexibility.

So the decision pattern boils down to this: use a Warehouse when you need SQL-driven, curated reporting; use a Lakehouse when you’re working with semi-structured data, Spark, and exploration-heavy workloads. That single sentence separates successful projects from the ones where teams shout across Slack because no one knows why the “dashboard” keeps choking on raw log files.

And here’s the kicker—mixing up the two doesn’t just waste time, it creates political messes. If management assumes they’re interchangeable, analysts get saddled with raw exports they can’t process, while engineers waste hours building shadow tables that should’ve been Lakehouse assets from day one. The tools are designed to coexist, not to substitute for each other.

So the bottom line: Warehouses serve reporting. Lakehouses serve engineering and exploration. Same OneLake underneath, same Delta Parquet files, different optimizations. Get that distinction wrong, and your project drags. Get it right, and both sides of the data team stop fighting long enough to deliver something useful to the business.

And since this all hangs on the same shared layer, it raises the obvious question—what exactly is this OneLake that sits under everything?

OneLake: The Data Lake You Already Own

Picture this: you move into a new house, and surprise—there’s a giant underground pool already filled and ready to use. That’s what OneLake is in Fabric. You don’t install it, you don’t beg IT for storage accounts, and you definitely don’t file a ticket for provisioning. It’s automatically there. OneLake is created once per Fabric tenant, and every workspace, every Lakehouse, every Warehouse plugs into it by default.
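One tenant, one lake shows up concretely in how you address data. OneLake exposes an ADLS-compatible endpoint where workspaces act like top-level folders and items like lakehouses sit inside them; the sketch below builds a URI of that documented shape, though the workspace and item names are invented for illustration:

```python
# Minimal sketch of OneLake's tenant-wide namespace. The abfss URI shape
# follows OneLake's ADLS-compatible addressing; "Finance", "Sales", and
# the Tables path are made-up examples, not real resources.

def onelake_uri(workspace: str, item: str, item_type: str, path: str) -> str:
    """Build an ADLS-style OneLake URI: one endpoint for the whole tenant,
    with workspaces and items nested under it like folders."""
    return (f"abfss://{workspace}@onelake.dfs.fabric.microsoft.com/"
            f"{item}.{item_type}/{path}")

# Every workspace hangs off the same single 'onelake' endpoint:
print(onelake_uri("Finance", "Sales", "Lakehouse", "Tables/orders"))
# abfss://Finance@onelake.dfs.fabric.microsoft.com/Sales.Lakehouse/Tables/orders
```

That single-endpoint shape is the whole trick: there is no per-team storage account to provision, only a workspace name to swap.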
Under the hood, it actually runs on Azure Data Lake Storage Gen2, so it’s not some mystical new storage type—it’s Microsoft putting a SaaS layer on top of storage you probably already know.

Before OneLake, each department built its own “lake” because why not—storage accounts were cheap, and everyone believed their copy was the single source of truth. Marketing had one. Finance had one. Data science spun one up in another region “for performance.” The result was a swamp of duplicate files, rogue pipelines, and zero coordination. It was SharePoint sprawl, except this time the mistakes showed up in your Azure bill. Teams burned budget maintaining five lakes that didn’t talk to each other, and analysts wasted nights reconciling “final_v2” tables that never matched.

OneLake kills that off by default. Think of it as the single pool everyone has to share instead of each team digging muddy holes in their own backyards. Every object in Fabric—Lakehouses, Warehouses, Power BI datasets—lands in the same logical lake. That means no more excuses about Finance having its “own version” of the data.

To make sharing easier, OneLake exposes a single file-system namespace that stretches across your entire tenant. Workspaces sit inside that namespace like folders, giving different groups their place to work without breaking discoverability. It even spans regions seamlessly, which is why shortcuts let you point at other sources without endless duplication. The small print: compute capacity is still regional and billed by assignment, so while your OneLake is global and logical, the engines you run on top of it are tied to regions and budgets.

At its core, OneLake standardizes storage around Delta Parquet files. Translation: instead of ten competing formats where every engine has to spin its own copy, Fabric speaks one language. SQL queries, Spark notebooks, machine learning jobs, Power BI dashboards—they all hit the same tabular store.
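The reason every engine can safely share one copy is Delta's transaction log: a numbered sequence of commits recording which Parquet files were added or removed, so readers can reconstruct any version of the table. Here is a toy sketch of that idea in plain Python, not the real delta-lake implementation, with file names invented for illustration:

```python
# Toy sketch of the idea behind Delta's transaction log (the real one is
# the _delta_log folder of ordered JSON commit files). Each commit lists
# Parquet files added or removed; replaying the log up to commit N yields
# the table as of version N, which is what enables versioning and safe,
# atomic rewrites over immutable Parquet files.

log = []  # ordered list of commits; stands in for _delta_log

def commit(adds, removes=()):
    """Append one atomic commit: files added and files logically removed."""
    log.append({"version": len(log), "add": list(adds), "remove": list(removes)})

def files_at(version):
    """Replay the log through `version` to get the live Parquet file set."""
    live = set()
    for entry in log[: version + 1]:
        live |= set(entry["add"])
        live -= set(entry["remove"])
    return live

commit(["part-000.parquet"])                        # version 0: initial load
commit(["part-001.parquet"], ["part-000.parquet"])  # version 1: atomic rewrite
print(sorted(files_at(0)))  # ['part-000.parquet'] (time travel to v0)
print(sorted(files_at(1)))  # ['part-001.parquet']
```

Because a rewrite is just one new log entry, a reader mid-query never sees a half-replaced table: it either replays through the old commit or the new one.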
Columnar layout makes queries faster, transactional support makes updates safe, and that reduces the nightmare of CSV scripts crisscrossing like spaghetti.

The structure is simple enough to explain to your boss in one diagram. At the very top you have your tenant—that’s the concrete slab the whole thing sits on. Inside the tenant are workspaces, like containers for departments, teams, or projects. Inside those workspaces live the actual data items: warehouses, lakehouses, datasets. It’s organized, predictable, and far less painful than juggling dozens of storage accounts and RBAC assignments across three regions.

On top of this, Microsoft folds in governance as a default: Purview cataloging and sensitivity labeling are already wired in. That way, OneLake isn’t just raw storage, it also enforces discoverability, compliance, and policy from day one without you building it from scratch.

If you’ve lived the old way, the benefits are obvious. You stop paying to store the same table six different times. You stop debugging brittle pipelines that exist purely to sync finance copies with marketing copies. You stop getting those 3 a.m. calls where someone insists version FINAL_v3.xlsx is “the right one,” only to learn HR already published FINAL_v4. OneLake consolidates that pain into a single source of truth. No heroic intern consolidating files. No pipeline graveyard clogging budgets. Just one layer, one copy, and all the engines wired to it.

It’s not magic, though—it’s just pooled storage. And like any pool, if you don’t manage it, it can turn swampy real fast. OneLake gives you the centralized foundation, but it relies on the Delta format layer to keep data clean, consistent, and usable across different engines. That’s the real filter that turns OneLake into a lake worth swimming in. And that brings us to the next piece of the puzzle—the unglamorous technology that keeps that water clear in the first place.

Delta and