DEV

Eric Lamanna

Software and AI development podcast. We cover all things software development, including today's advanced AI development tricks and techniques.

  1. 22h ago

    C++ in 2026: Why the 40-Year-Old Language Still Dominates High Performance

    Every few years, a new language is crowned the future of systems programming. Yet when the stakes are highest (financial systems measured in microseconds, medical devices where latency is a safety concern, or AI backends crunching tensors at scale), engineering teams keep reaching for C++. This episode of Development examines the case for C++ as a top choice for high-performance software in 2025, unpacking why four decades of evolution have made the language more relevant, not less.

    The episode covers a lot of ground for anyone weighing C++ against newer alternatives, or trying to make sense of why legacy-looking code still powers cutting-edge infrastructure:

    - Raw performance fundamentals: Native machine-code compilation, zero garbage-collector pauses, and direct control over memory layout give C++ a ceiling that managed runtimes can't match, which is especially critical when cache behavior is the real bottleneck.
    - A dramatically safer modern toolchain: AddressSanitizer, ThreadSanitizer, static analyzers, and the C++ Core Guidelines have quietly transformed the language's safety profile, making accidental footguns far harder to fire than the language's reputation implies.
    - Modern C++ looks nothing like the textbooks: Smart pointers, move semantics, concepts, ranges, and coroutines (features introduced from C++11 through C++23) push the language toward clean, expressive code without sacrificing performance.
    - Five domains where C++ is essentially irreplaceable: High-frequency trading, gaming and XR, autonomous systems and robotics, scientific computing and AI infrastructure (the C++ backends behind Python's ML fame), and 5G telecom and edge computing.
    - A maturing ecosystem: Package managers like Conan 3 and vcpkg, build systems like Buck2, and interoperability layers like pybind11 mean teams no longer have to choose between C++ performance and modern developer ergonomics.
    - The talent and standards pipeline: Universities still teach low-level computing through C++, CppCon and related communities remain active, and the standards committee is already working on reflection, pattern matching, and safer concurrency for future releases.

    The episode closes with a reframe worth keeping: the smart question in 2025 isn't why teams are still using C++, but whether their requirements justify anything else. If you enjoyed this one, the show has also tackled the closest rival head-on: check out the episode C++ vs. Rust: Choosing the Right Language for Systems-Level Development for a direct comparison that complements everything discussed here.

    8 min
  2. 1d ago

    C++ vs. Rust: Choosing the Right Language for Systems-Level Development

    Systems-level programming demands more from a language than raw speed; it demands predictability, safety, and a codebase that someone can still reason about years down the line. This episode of Development puts C++ and Rust side by side across the dimensions that actually matter in production, drawing on the C++ vs. Rust comparison published at DEV. Rather than declaring a winner, the episode gives engineers the framework to make an informed, context-specific call.

    Here's what the episode covers:

    - Performance parity, and where it breaks down: Both languages compile to native machine code with zero-cost abstractions, but C++'s unchecked freedom can introduce undefined behavior that Rust's compile-time borrow checker structurally prevents.
    - Memory safety as a design philosophy: C++ treats safety as a choice (smart pointers, disciplined use); Rust treats it as the default, requiring an explicit unsafe block to opt out, a difference with real implications for team dynamics and security posture.
    - RAII and deterministic cleanup: Both languages tie resource lifetimes to object scope, but Rust's drop semantics catch double-frees and use-after-free errors at compile time rather than at runtime.
    - Developer experience and tooling: Rust's borrow checker has a steep early learning curve, but its error messages are unusually helpful; Cargo's unified build and package management gives Rust a structural advantage over C++'s fragmented CMake/vcpkg/Conan ecosystem.
    - Ecosystem maturity: C++ remains dominant in embedded, automotive, and AAA game development (Unreal Engine); Rust's crates.io ecosystem has surpassed 120,000 packages and is production-ready in async, serialization, and cloud-native domains.
    - Long-term maintenance: C++'s backward compatibility spans decades, making it invaluable for aerospace and defense; Rust's opt-in edition model lets the language evolve without breaking existing code, and its explicitness makes codebases easier to hand off.

    The episode lands on a practical conclusion: teams with deep C++ roots and the expertise to match should feel no pressure to abandon it, but greenfield projects (especially those where security, team turnover, or compiler-enforced correctness matter) have strong reasons to reach for Rust. More from the show: if you're thinking about how languages and runtimes intersect with AI safety, check out LLM Guardrails: How Token-Level Filters Keep AI Output Safe.

    9 min
  3. 1d ago

    How To Build Your Own Large Language Model From Scratch

    Training your own large language model might sound like something only well-funded research labs can pull off, but the open-source ecosystem, rentable cloud compute, and publicly available datasets have changed that calculus dramatically. This episode of Development unpacks this step-by-step guide to building a custom LLM, walking through every major decision point a developer will face on the journey from an empty directory to a deployed, queryable model.

    The episode covers the full pipeline in practical terms, giving developers a realistic picture of what each phase actually demands in time, hardware, and expertise:

    - Data is the real foundation. A mid-sized model requires hundreds of gigabytes of clean, diverse text. Public datasets like OpenWebText, The Pile, and Common Crawl derivatives are strong starting points, but domain-specific builds (legal, medical, coding) will need proprietary supplements, with careful attention to licensing restrictions.
    - Cleaning is unglamorous but non-negotiable. Raw web-scraped text is noisy and duplicate-heavy. Tools like MinHash or SimHash fingerprinting are close to mandatory for preventing a model from memorizing rather than generalizing.
    - Infrastructure scales with ambition. A sub-7B parameter model can train on a single high-end GPU; beyond 13B, multi-GPU setups and distributed training frameworks like DeepSpeed or Hugging Face Accelerate become necessary. Containerizing the entire environment, and version-pinning dependencies, is essential for reproducibility during long training runs.
    - Architecture and tokenization choices lock in early. Most practitioners build on established open-source architectures like Llama or GPT-NeoX rather than designing from scratch. Tokenizer training, fixed-length chunking, and hyperparameter choices (learning rate schedules, AdamW, gradient checkpointing) all get unpacked in concrete terms.
    - Evaluation goes beyond perplexity. Automated metrics are a sanity check, not a verdict. Manual prompt grading, code completion benchmarks like HumanEval, and A/B comparisons against established baselines reveal blind spots that numbers alone miss.
    - Deployment is its own engineering challenge. Quantization (4-bit or 8-bit) can dramatically cut memory requirements; production setups call for Kubernetes clusters, load balancers, and streaming gateways. Prompt logging, rate-limiting, and sandboxing against injection attacks round out a responsible deployment strategy.

    The episode closes with an honest assessment: building an LLM is within reach for determined developers today, but "within reach" is not the same as easy. The data pipeline alone represents more than half the battle; get that right, and the rest of the process becomes far more tractable. For more on keeping LLM outputs safe once a model is running, check out the earlier episode LLM Guardrails: How Token-Level Filters Keep AI Output Safe.
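
For listeners who want a feel for the deduplication step mentioned above, here is a minimal sketch of MinHash fingerprinting using only the Python standard library. It is an illustration, not the method used in the episode or the guide: the signature length, the word 3-gram shingles, the 0.8 threshold, and all function names are assumptions chosen for readability.

```python
import hashlib
import re

NUM_HASHES = 64   # assumption: signature length; more hashes give a tighter estimate
SHINGLE_SIZE = 3  # assumption: word 3-grams are the unit of comparison


def shingles(text, size=SHINGLE_SIZE):
    """Return the set of lowercase word n-grams in a document."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + size]) for i in range(len(words) - size + 1)}


def seeded_hash(seed, shingle):
    """Seeded 64-bit hash built on BLAKE2b, so results are stable across runs."""
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "little")


def minhash_signature(text, num_hashes=NUM_HASHES):
    """For each seeded hash function, keep the minimum value over all shingles."""
    items = shingles(text)
    if not items:
        return [0] * num_hashes
    return [min(seeded_hash(seed, s) for s in items) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching signature slots approximates Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def deduplicate(docs, threshold=0.8):
    """Keep a document only if it is not a near-duplicate of one already kept."""
    kept, signatures = [], []
    for doc in docs:
        sig = minhash_signature(doc)
        if all(estimated_jaccard(sig, other) < threshold for other in signatures):
            kept.append(doc)
            signatures.append(sig)
    return kept
```

The pairwise comparison in `deduplicate` is quadratic in the number of documents, which is fine for a demonstration but not for a web-scale corpus; at that size, signatures are usually bucketed with locality-sensitive hashing so that only likely duplicates are compared.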

    8 min
  4. 1d ago

    Client-Side vs. Server-Side JavaScript: Where Your Code Lives Changes Everything

    Where JavaScript executes isn't just a technical footnote; it's one of the most consequential architectural decisions a developer makes. This episode of Development digs into the fundamental divide between client-side and server-side JavaScript, tracing the language's evolution from a browser-only scripting tool into a full-stack runtime, and unpacking why the execution environment shapes everything from user experience to data security. The discussion draws on the key differences between client-side and server-side JavaScript to give developers a practical mental model for making smarter architectural choices.

    The episode covers a lot of ground, from foundational concepts to real-world patterns, including:

    - A brief history of JavaScript's runtime environments: from Brendan Eich's ten-day browser experiment in 1995 to Node.js opening the server in 2009.
    - The core distinction, clearly defined: client-side code runs on the user's device with direct DOM access; server-side code runs on remote infrastructure the user never sees, with access to databases, file systems, and private credentials.
    - Three critical dimensions of difference: latency (client-side is immediate; server-side requires a network round trip), resource usage (server hardware is controlled and consistent; client hardware is not), and security (the browser is a public environment; treat it that way).
    - Where each environment truly excels: reactive UI frameworks and offline capabilities belong on the client; database coordination, CPU-heavy tasks, and API orchestration belong on the server.
    - Hydration and hybrid patterns: why the best applications blend both environments, using server rendering for fast initial loads and client-side JavaScript to deliver rich interactivity.
    - Security threats on both sides: XSS and token exposure on the client; injection attacks, event-loop exhaustion, and compromised npm packages on the server; and the disciplined mitigations that address each.

    The episode wraps with three practical principles to guide every future architectural call: put fast interactions on the client, protect sensitive operations on the server, and make migration decisions based on measurement rather than instinct. For more on AI safety in a related domain, check out the episode LLM Guardrails: How Token-Level Filters Keep AI Output Safe from the Development back catalogue.

    8 min
  5. 4d ago

    LLM Guardrails: How Token-Level Filters Keep AI Output Safe

    Content moderation for large language models is often treated as an afterthought: a filter bolted on after the model has already finished speaking. This episode of Development makes the case that timing is everything, and that catching harmful output as it forms, token by token, is a fundamentally different and more defensible approach. The discussion is grounded in this in-depth guide to creating token-level filters for unsafe LLM output, translating its technical detail into practical guidance for developers building AI-powered products.

    Here's what the episode covers:

    - Why token-level filtering beats post-hoc review: Completed outputs can flash on screen before a filter fires; intervening during generation closes that window almost entirely.
    - The three main threat categories: Harassment and hate speech, sensitive information leakage from fine-tuned models, and harmful instruction generation each require a different filtering posture.
    - Rule-based vs. ML-based approaches, and why hybrid wins: Deterministic rules are fast and predictable for clear-cut violations; a learned classifier handles subtler, context-dependent cases. The episode explains why combining both is the recommended architecture.
    - The partial-token problem: Acting too early risks false positives; waiting too long risks the harmful word completing. The episode walks through how to use directional probability signals to find the right intervention point.
    - Tiered responses to violations: Not every flagged token warrants a hard stop. A graduated system (gentle redirection for borderline drift, clean refusals for serious violations) keeps the user experience intact while maintaining safety.
    - Over-filtering as its own failure mode: Blocking legitimate content frustrates users just as surely as letting harmful content through. Adversarial testing, ongoing monitoring, and careful calibration are non-negotiable parts of the process.

    The episode also addresses two practical engineering tradeoffs developers often underestimate: context collapse, where a filter reacts to a token pattern without understanding conversational intent, and latency overhead, where per-token inference costs add up fast in high-volume real-time applications. Both are manageable with the right architectural decisions, but only if you plan for them from the start. For more on building with machine learning, check out the Development episode on Top Python Libraries for Machine Learning in 2026.

    DEV.co
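
To make the hybrid, tiered idea above concrete, here is a small Python sketch of a streaming filter. It is a toy under stated assumptions, not the architecture from the episode or the guide: the blocked term, the keyword weights standing in for a learned classifier, the thresholds, and every name below are hypothetical. It also sidesteps the directional probability signals the episode discusses and instead holds back one token, so a harmful word split across two tokens is caught before its first half is shown.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    REDIRECT = "redirect"  # gentle steer for borderline drift
    REFUSE = "refuse"      # hard stop for serious violations


# Hypothetical placeholders; a real deployment would maintain its own lists.
BLOCKED_TERMS = {"forbiddenword"}
RISKY_HINTS = {"bypass": 0.4, "exploit": 0.5}


def rule_check(text):
    """Deterministic rule: does the recent text contain a blocked term?"""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def toy_classifier(text):
    """Stand-in for a learned classifier: returns a risk score in [0, 1]."""
    lowered = text.lower()
    return min(1.0, sum(w for hint, w in RISKY_HINTS.items() if hint in lowered))


def decide(text, redirect_at=0.4, refuse_at=0.8):
    """Rules run first; the classifier score then picks a response tier."""
    if rule_check(text):
        return Action.REFUSE
    score = toy_classifier(text)
    if score >= refuse_at:
        return Action.REFUSE
    if score >= redirect_at:
        return Action.REDIRECT
    return Action.ALLOW


def filter_stream(tokens, window=8):
    """Emit tokens one step behind generation, checking each new token first."""
    emitted = []
    pending = None  # the most recent token, held back until the next one is checked
    for token in tokens:
        recent = emitted[-window:] + ([pending] if pending is not None else []) + [token]
        action = decide("".join(recent))
        if action is Action.REFUSE:
            yield "[response withheld]"
            return
        if action is Action.REDIRECT:
            yield "[let's take a different direction]"
            return
        if pending is not None:
            emitted.append(pending)
            yield pending
        pending = token
    if pending is not None:
        yield pending
```

For example, `list(filter_stream(["This ", "is ", "a ", "forbidden", "word", "!"]))` stops before the fragment "forbidden" is ever displayed. The one-token delay is the price of that guarantee, and each call to `decide` is where the per-token latency overhead mentioned in the episode would accumulate with a real model.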

    8 min
  6. 5d ago

    Top Python Libraries for Machine Learning in 2026

    Choosing the right Python library for machine learning isn't just a technical decision; it's a strategic one. With the ecosystem evolving rapidly, this episode of Development cuts through the noise to spotlight the tools that are genuinely delivering in 2025, drawing on this in-depth overview of Python's top ML libraries to give developers a clear-eyed view of what's worth learning and what's worth building with.

    The episode covers the major frameworks and fast-rising contenders shaping modern ML workflows, including:

    - TensorFlow 3.x: a significantly improved developer experience via the fully integrated Keras API, eager execution by default, automatic hardware routing across CPUs, GPUs, and TPUv5e clusters, and a curated Model Garden 2.0 stocked with production-ready architectures.
    - PyTorch 2.3: the researcher favorite doubles down on flexibility while closing the gap to production, with the TorchDynamo compiler accelerating dynamic graphs, built-in quantization-aware training, and TorchServe 1.5 automating REST and gRPC endpoint creation from saved checkpoints.
    - Scikit-Learn 2.0: a milestone rewrite that adds native GPU acceleration through CuML and Intel oneAPI backends, automatic feature type inference in ColumnTransformer, and first-class probabilistic outputs, keeping interpretability front and center for enterprise teams.
    - JAX: built for developers who need maximum numerical performance, its XLA-compiled functional model combined with the new PJRT runtime enables seamless scaling from a single GPU to a multi-TPU pod with no code changes.
    - Hugging Face Transformers 5.0: now functioning as a full-stack ML platform, with a new Model Agent API for chaining models without boilerplate and a quantized model zoo offering thousands of 4-bit and 8-bit checkpoints runnable on consumer hardware.
    - Fast-rising tools to watch: Polars for high-performance data manipulation, RAPIDS cuML for GPU-accelerated classical ML, and Optuna 4.0 for asynchronous hyperparameter optimization across all major frameworks.

    Beyond the library-by-library breakdown, the episode offers a practical decision framework: match your tooling to your project goals, your team's strengths, and your deployment targets, then validate the shortlist with a small vertical prototype before committing to a full stack. For more on picking a Python web framework, check out the episode Flask vs. Django: Choosing the Right Python Web Framework.

    8 min
  7. 6d ago

    Flask vs. Django: Choosing the Right Python Web Framework

    Picking a Python web framework isn't just a technical checkbox; it shapes how fast a team ships, how easily new developers ramp up, and how cleanly a codebase handles growth over time. This episode of Development digs into one of the most debated questions in the Python ecosystem, drawing on the Flask vs. Django framework comparison published at DEV. Rather than declaring a winner, the episode gives developers and technical leads a clear framework for matching each tool to the right situation.

    Here's what the episode covers:

    - Origins and philosophy: Django arrived in 2005 as a batteries-included solution built for newsroom speed; Flask launched in 2010 with a deliberately minimal core, and that founding split still defines everything about how the two frameworks feel in daily use.
    - Team size dynamics: A solo developer or small team can move fast with Flask's transparency and lack of abstraction layers, while Django's enforced conventions become a genuine asset as teams grow and junior developers join the mix.
    - Project type as the deciding factor: Django's out-of-the-box auth, admin panel, ORM, and migrations make it a strong fit for MVPs and feature-rich apps; Flask's lean footprint is a cleaner match for API-only services, microservices, and highly customized request pipelines.
    - Scalability myths and realities: Both frameworks can handle serious production traffic, but Django tends to scale vertically within a monolith, while Flask lends itself to horizontal scaling across separate, focused services.
    - Ecosystem and maintenance trade-offs: Django's massive ecosystem (including the near-ubiquitous Django REST Framework) integrates with minimal friction; Flask's extension model hands developers full control but also full responsibility for keeping components compatible over time.
    - Development workflow texture: Flask encourages incremental structure (starting with a single file and graduating to Blueprints), while Django scaffolds a clean, organized project layout from the very first command, guiding separation of concerns before a line of business logic is written.

    The episode's honest conclusion: neither framework is universally superior. Both are mature, battle-tested, and well-supported. The right call comes down to your project's complexity, your team's experience level, and where you expect the codebase to be a year from now. If the choice is genuinely unclear, prototyping a small feature in each is worth the time. More from the show: Enterprise Java in 2026: Tools, Trends, and What Still Matters.

    8 min
  8. Jun 18

    Enterprise Java in 2026: Tools, Trends, and What Still Matters

    Java has been written off more times than anyone cares to count, yet it continues to underpin some of the world's most critical software, from banking infrastructure to global logistics platforms. This episode of Development takes a clear-eyed look at the state of enterprise Java in 2025, drawing on this deep-dive into enterprise Java tools and trends to map out what's actually changed, what's stayed the same, and what separates developers who are thriving in this space from those stuck in older patterns.

    The episode covers a wide range of ground across tooling, architecture, DevOps practice, and developer skills:

    - Cloud-native Java is no longer a contradiction. GraalVM native image compilation, along with frameworks like Quarkus and Micronaut that perform dependency injection at compile time, has dramatically reduced startup times and memory overhead, making Java microservices genuinely competitive with lighter-weight alternatives.
    - The build and observability toolbox. Gradle's Kotlin DSL and faster incremental builds have been winning teams away from Maven, though Maven's stability keeps it firmly in place at large organisations. For observability, OpenTelemetry paired with Prometheus and Grafana has become the standard for understanding application health beyond simple uptime checks.
    - API and testing consensus. The OpenAPI Specification (with tools like springdoc-openapi keeping docs in sync with code) anchors REST API design, while JUnit 5, Testcontainers, and AssertJ form a near-universal testing stack, with Testcontainers earning particular attention for enabling tests against real, ephemeral infrastructure rather than unreliable mocks.
    - The microservices reckoning. The dust is settling on a decade of decomposition, and the pattern that emerges is nuanced: microservices aligned to real business capabilities deliver genuine value, while poorly bounded services create operational nightmares. Service meshes like Istio and Linkerd help manage cross-cutting concerns at the infrastructure layer, keeping application code cleaner.
    - Event-driven architecture and DevOps discipline. Apache Kafka dominates high-throughput asynchronous workloads, with frameworks like Spring Cloud Stream reducing boilerplate. On the DevOps side, pipeline-as-code, distroless container images (built with tools like Jib), and shift-left security scanning with OWASP Dependency-Check or Snyk are presented as non-negotiable practices in enterprise contexts.
    - The skills that actually matter now. Modern Java language features (records, sealed classes, pattern matching, and Project Loom's virtual threads) reward developers who track the six-month release cadence. Observability fluency and cloud cost judgment (knowing when to scale out versus when to tune) are called out as meaningful differentiators in senior roles.

    The through-line of the episode is that Java's longevity isn't passive; it reflects continuous adaptation to cloud infrastructure, evolving architectural patterns, and developer expectations. If you're working on or evaluating enterprise systems, this episode offers a practical framework for thinking about where the ecosystem stands today. For more on building production-ready backend systems, check out our earlier episode Building Scalable Web Apps with Django and Python.

    8 min
