EDGE AI POD

EDGE AI FOUNDATION

Discover the cutting-edge world of energy-efficient machine learning, edge AI, hardware accelerators, software algorithms, and real-world use cases with this podcast feed from the world's largest edge AI community. It features shows like EDGE AI Talks and EDGE AI Blueprints, as well as EDGE AI FOUNDATION event talks on a range of research, product, and business topics. Join us to stay informed and inspired!

  1. 1D AGO

    AI-Driven Brain-Computer Interface (BCI): Unlocking the Mind's Potential

    Imagine steering a game or selecting a letter with nothing but a blink or a glance. We set out to make that feel normal, not magical, by building a non-invasive brain–computer interface that runs entirely on a low-power microcontroller and fits into everyday wearables like glasses. No surgery, no cloud dependency—just smart sensing, tight signal processing, and a tiny neural net that turns eye movements into reliable commands.

    We start with the “why”: millions live with motor impairments yet can still move their eyes, leaving a powerful window for communication and control. From there, we map the BCI landscape—high-precision invasive implants like Neuralink, BrainGate, and Synchron on one side; accessible non-invasive tools like Emotiv, Muse, and OpenBCI on the other—and unpack the trade-offs across accuracy, latency, cost, and ethics.

    Our approach uses electrostatic charge sensing to read subtle changes around the eyes, with electrodes positioned for comfort and signal quality. A lean pipeline cleans the data with high-pass, notch, and low-pass filters; a Z-score event detector wakes the model only when something meaningful happens. The model is a compact 1D CNN that classifies four classes—discard involuntary blinks, trigger with a voluntary blink, and detect left or right glances—achieving about 90% accuracy on a small multi-participant dataset. Running on an STM32H7, it uses roughly 18 KB flash and 6 KB RAM, with sub-millisecond inference; the overall response is driven by the short data window at 240 Hz, delivering real-time control for basic tasks.

    We demo blink-to-jump and look-to-steer gameplay to prove responsiveness and highlight how the same system could power communication aids and smart-home control. Looking ahead, we focus on integrating the electrodes into comfortable glasses, adding quick calibration for personal variability, and expanding the command set without sacrificing simplicity.
If this mix of accessibility, edge AI, and practical human–machine interaction resonates with you, follow the show, share it with a friend, and leave a review so we can reach more builders and caregivers working on assistive tech. What would you control first with a glance? Send us Fan Mail. Support the show. Learn more about the EDGE AI FOUNDATION: edgeaifoundation.org
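The cleaning pipeline and wake-on-event logic described above can be sketched in a few lines of Python. The filter cutoffs and the Z-score threshold below are illustrative guesses, not values from the talk; only the 240 Hz sampling rate comes from the episode.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

FS = 240  # sampling rate mentioned in the episode (Hz)

def clean(signal, fs=FS):
    """High-pass -> notch -> low-pass chain; cutoffs are illustrative."""
    b, a = butter(2, 0.5, btype="highpass", fs=fs)   # remove baseline drift
    signal = filtfilt(b, a, signal)
    b, a = iirnotch(50.0, 30.0, fs=fs)               # suppress mains hum
    signal = filtfilt(b, a, signal)
    b, a = butter(4, 30.0, btype="lowpass", fs=fs)   # keep eye-movement band
    return filtfilt(b, a, signal)

def zscore_events(signal, threshold=4.0):
    """Return sample indices where |z| exceeds the threshold.

    The model only wakes to classify windows around these events."""
    z = (signal - signal.mean()) / (signal.std() + 1e-9)
    return np.flatnonzero(np.abs(z) > threshold)
```

In a real deployment the same logic would run streaming on-device with fixed-point filters; this offline form just shows the shape of the pipeline.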

    15 min
  2. APR 30

    An Embedded Transformer-Based Face Recognition System on the STM32N6

    What if transformer-level face recognition could run on a microcontroller without giving up speed or accuracy? We set out to make that real on the STM32N6 by pairing its neural processing unit with a hybrid model that blends convolutional efficiency and attention-like global context. Along the way, we rewired core assumptions about attention, reworked unsupported operators, and delivered a full on-device pipeline that actually feels instant.

    We start with the hardware edge: an Arm Cortex-M55, 4 MB of contiguous RAM, and an NPU pushing up to 600 GOPS at remarkable power efficiency. That lets us chain models—RetinaFace-style detection with landmarks, alignment for a stable canonical view, MobileNetV2 anti-spoofing to block print and replay attacks, and a final recognizer that outputs a 512‑dimensional embedding. The recognizer is built on EdgeFace, itself based on EdgeNeXt, chosen for its sweet spot between parameter count and accuracy. It behaves like a transformer where it matters—capturing long-range relationships—yet fits into the tight compute envelope of a microcontroller.

    The turning point is attention without the dot product. Because the ST toolchain doesn’t support batch matmul, we replaced it with a convolutional self-attention mechanism. Depthwise and pointwise convolutions encode relationships across pixels and channels, a sigmoid stands in for softmax, and element-wise products reconstruct attention’s weighting behavior. This maps cleanly to the NPU, avoids quadratic costs, and preserves the ability to stabilize identities across pose, lighting, and occlusion.

    Benchmarks show roughly 40 ms per frame end to end—about 25 FPS—plus substantial speedups over the STM32H7 and higher accuracy than MobileFaceNet across validation sets. That opens doors for privacy-first access control, frictionless enrollment on-device, and personalized experiences where latency matters and data should never leave the edge.
If you’re exploring embedded AI, this walkthrough shows how to align model design with silicon capabilities and deliver results that feel both fast and trustworthy. Enjoy the deep dive? Subscribe, share this episode with a fellow edge AI builder, and leave a quick review to help others find the show.
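The "attention without the dot product" idea can be illustrated with a toy NumPy sketch: a pointwise (1×1) convolution mixes channels, a depthwise convolution relates nearby pixels, a sigmoid replaces softmax, and an element-wise product applies the resulting gate. The kernel shapes and wiring here are assumptions about the general pattern, not ST's actual operator graph.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pointwise_conv(x, w):
    """1x1 conv: mix channels at every pixel; w has shape (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", w, x)

def depthwise_conv(x, k):
    """One small spatial filter per channel, 'same' padding.

    x: (C, H, W), k: (C, kh, kw). Naive loops for clarity only."""
    C, H, W = x.shape
    kh, kw = k.shape[1:]
    xp = np.pad(x, ((0, 0), (kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(x)
    for c in range(C):
        for i in range(H):
            for j in range(W):
                out[c, i, j] = np.sum(xp[c, i:i + kh, j:j + kw] * k[c])
    return out

def conv_attention(x, w_mix, k_spatial):
    """Gate features with sigmoid(depthwise(pointwise(x))) instead of
    softmax(Q K^T); the element-wise product plays the weighting role."""
    gate = sigmoid(depthwise_conv(pointwise_conv(x, w_mix), k_spatial))
    return x * gate
```

Because every step is a convolution or an element-wise op, this maps onto NPUs that lack batch matmul and sidesteps the quadratic cost of token-to-token attention.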

    12 min
  3. APR 23

    Verification, Validation & Certification of AI in Safety-Critical Applications

    A cyclist disappears to the model, not to your eyes—and that mismatch is the heart of safety-critical AI. We open with the “vanishing cyclist” to show how tiny, imperceptible perturbations can flip life-or-death decisions, then walk through a practical path to trust that spans data, verification, and deployment. Along the way, we share real stories from BMW, Airbus, and Madrid Metro to ground the engineering in results, not hype.

    We break down how to build a resilient pipeline: domain-specific data labeling, realistic synthetic generation for rare and risky scenarios, and tight interoperability across MATLAB, Python, PyTorch, TensorFlow, and ONNX. We dig into explainability beyond classification with D-RISE for object detectors and semantic segmentation, helping you see what the network actually uses to decide. Then we raise the bar with formal verification for robustness—mathematical guarantees within defined perturbation sets—so you aren’t mistaking the absence of found attacks for true safety.

    Finally, we get practical about the edge. Model compression and projection recover accuracy with fewer parameters, enabling fast, power-efficient deployment to CPUs, GPUs, and FPGAs, backed by code generation for the entire application. We also cover runtime safeguards like out-of-distribution detection to catch smog-on-the-runway moments and escalate safely. Throughout, we connect the work to evolving standards, the EU AI Act, and updated workflows that adapt the V-model for learning systems, so your process and artifacts are ready for audits and certification.

    If you care about trustworthy AI for cars, planes, rail, and medical devices—and want tools and habits that survive contact with reality—this one’s for you. Listen, subscribe, and leave a review with your biggest trust gap or the safeguard you’d ship first.
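The formal-verification idea, a guarantee that holds for every input inside a defined perturbation set, can be illustrated with interval bound propagation through a single linear layer. This is a textbook sketch in NumPy, not the MATLAB tooling discussed in the episode; the weights and epsilon values are hypothetical.

```python
import numpy as np

def interval_linear(lo, hi, W, b):
    """Propagate the box [lo, hi] through y = W x + b exactly.

    Positive weights carry lower bounds to lower bounds; negative
    weights swap them (standard interval arithmetic)."""
    Wp, Wn = np.maximum(W, 0.0), np.minimum(W, 0.0)
    return Wp @ lo + Wn @ hi + b, Wp @ hi + Wn @ lo + b

def certified(x, eps, W, b, target, rival):
    """True only if the target logit beats the rival for EVERY x'
    with ||x' - x||_inf <= eps -- a guarantee, not an attack search."""
    lo, hi = interval_linear(x - eps, x + eps, W, b)
    return bool(lo[target] > hi[rival])
```

The contrast with adversarial testing is the point: a failed attack search proves nothing, whereas `certified(...) == True` rules out the entire perturbation set at once (at the cost of looseness as bounds propagate through deeper networks).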

    19 min
  4. APR 16

    Aptos: Creating ML models that fit your edge device like a glove

    Shipping edge AI shouldn’t feel like a marathon through model zoos, missing ops, and latency ceilings. We lay out a practical path to get from your data and constraints to a hardware-ready model—measured on real boards—without the endless back-and-forth between data science and firmware teams. If you’ve wrestled with quantization loss, unsupported kernels, or picking the “right” NPU, this walkthrough will feel like oxygen.

    We start by naming the pain: quick demos that collapse under real device limits, foundation models that fail after export, and feedback loops that burn months. From there, we unpack Aptos, our automation engine that turns edge AI into a data in, model out process. The system explores parameterized architecture recipes and neural architecture search, trains promising candidates, and deploys them to a hardware farm packed with evaluation kits. Every candidate returns hard numbers—latency, per-layer timing, memory, on-device accuracy, and power—so tradeoffs are grounded in measurements, not wishful thinking.

    What makes it fast is the learning layer. As Aptos accumulates results, meta models predict runtime, memory fit, and stable hyperparameter ranges before committing compute. That means less time wasted on dead ends and more time converging on models that satisfy your KPIs, whether you care about sub-5 ms inference on an i.MX 8M Plus, battery life in the field, or non-square inputs that match your camera feed. We also fold in research-backed techniques—pruning, quantization, distillation—so you benefit from the latest without chasing papers.

    If your team is eyeing a chip migration or evaluating new NPUs, a dropdown swap in Aptos triggers a fresh search tuned to the new hardware, minimizing lock-in and keeping options open. The result is timeline compression: where projects used to take 12–18 months with large teams, we aim to surface strong, deployable candidates in one to two weeks.
Subscribe for more deep dives into edge AI deployment, share this episode with your team, and leave a review telling us which device you want to target next.
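Once every candidate comes back with hard numbers, the tradeoff step reduces to a Pareto filter: keep only models that no other candidate beats on both latency and accuracy. This is a generic sketch of that selection logic, the candidate tuples are hypothetical, and it is not Aptos itself.

```python
def pareto_front(candidates):
    """Return names of candidates not dominated on (latency, accuracy).

    candidates: list of (name, latency_ms, accuracy) tuples measured
    on real hardware. A candidate is dominated if some other entry is
    at least as fast AND at least as accurate (and not identical)."""
    front = []
    for name, lat, acc in candidates:
        dominated = any(
            l2 <= lat and a2 >= acc and (l2, a2) != (lat, acc)
            for _, l2, a2 in candidates
        )
        if not dominated:
            front.append(name)
    return front
```

Everything on the front is a defensible pick; which point you choose then depends on the KPI that matters most (latency ceiling, accuracy floor, or power budget).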

    21 min
  5. APR 9

    Neural-ART: ST’s New NPU Architecture at the Edge

    What if the fastest path to efficient edge AI isn’t a bigger CPU, but a smarter stream of data? We pull back the curtain on Neural-ART—the flexible, stream‑based accelerator inside the STM32N6—and show how a decade of prototypes led us to rethink how tensors move, how layers are scheduled, and how much work a compiler can save when memory is the real bottleneck. Instead of shuttling activations back and forth, our architecture routes data through specialized units in tightly orchestrated “epochs,” keeping compute hot and bandwidth cool.

    From there, we tackle the hard limits of standard‑cell designs on practical MCU nodes. Power efficiency stuck around 1–5 TOPS/W and density near 0.1–2 TOPS/mm² pushed us to explore in‑memory computing. We break down digital versus analog IMC—determinism and integration on one side, approximate but highly efficient compute on the other—and share prototype results that hit roughly 40 TOPS/W and about 10 TOPS/mm² at 1 GHz. Along the way, we dig into why half of system power can vanish into data movement and how weight‑stationary strategies change the game.

    We also get candid about trade‑offs. Embedded phase change memory (PCM) brings remarkable density and multi‑level storage, but demands strict weight‑stationary mapping and drift compensation. No single technology wins every metric, so we lay out a heterogeneous 2D mesh that blends digital IMC, analog IMC, and classical stream units. Our compiler assigns each subgraph to the node that fits its accuracy, throughput, and energy needs, and our NeoSoC research effort moves this vision toward silicon with an upcoming 80‑nm tapeout.

    If you care about edge inference, memory bandwidth, quantization, and real‑world efficiency beyond spec‑sheet peaks, this conversation is for you. Subscribe, share with a teammate who’s wrestling with on‑device AI, and leave a review with the biggest bottleneck you want us to tackle next.
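The TOPS/W figures translate directly into per-inference energy, which is where the spec-sheet numbers become tangible. A back-of-envelope calculator, using a hypothetical 5-GOP workload and the roughly 1 TOPS/W standard-cell versus ~40 TOPS/W IMC figures from the talk:

```python
def energy_per_inference_mj(workload_gops, tops_per_watt):
    """Compute-only energy (millijoules) for one inference.

    1 TOPS/W is 1e12 ops per joule, so E = ops / (ops per joule)."""
    ops = workload_gops * 1e9
    joules = ops / (tops_per_watt * 1e12)
    return joules * 1e3  # J -> mJ

# Hypothetical 5-GOP model, compute energy only (ignores data movement,
# which the talk notes can be half of system power):
std_cell = energy_per_inference_mj(5, 1)    # -> 5.0 mJ
imc = energy_per_inference_mj(5, 40)        # -> 0.125 mJ
```

The ~40x gap is exactly the headroom in-memory computing targets, and the omitted data-movement term is why weight-stationary mapping matters as much as raw compute efficiency.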

    15 min
  6. APR 2

    A Unified Neuromorphic Platform for Sparse, Low-Power Computation

    Sensors are flooding the edge with data while CPUs juggle denoising, formatting, and inference. We built ADA to flip that script: a Turing-complete neuromorphic processor that computes with time-encoded spikes, slashing power, latency, and memory movement by keeping work inside an event-driven pipeline.

    We start by unpacking why conventional embedded architectures stall under modern workloads, from pre-processing bottlenecks to compromised security on battery-powered devices. Then we break down neuromorphic fundamentals—how spikes encode information and why sparsity matters—and compare general-purpose frameworks, highlighting the trade-offs that often inflate activity or force manual design.

    From there, we explain why we chose interval coding and how we solved its biggest flaw. By predicting future spike times, ADA avoids per-tick updates, reducing complexity from linear to logarithmic with precision and mapping neatly to simple add, multiply, and shift hardware.

    You’ll hear how the architecture comes together: a tiny neuron core that fits in modest FPGAs, standard interfaces like UART and AER for DVS cameras, and our Axon SDK that compiles Python, NumPy, or C algorithms into deployable binaries—no neuron micromanagement required. We demo a three-tap FIR filter built from modular primitives and show ADA acting as a programmable pre-processing element for event vision. On the DVS128 gesture dataset, ADA’s spatial-temporal denoising cut downstream compute by over 50%, keeping the pipeline sparse and fast.

    Security gets equal attention. We extended the primitive set with modulus arithmetic to support polynomial math central to post-quantum cryptography such as Kyber. The result: 5x better power efficiency and a 2.5x improvement in energy-latency product over MCU baselines, with clear paths to reduce latency further. It points to neuromorphic cryptography that protects implants and IoT sensors without sacrificing battery life. Ready to try it?
The Axon SDK is publicly available. Give ADA a spin, share your toughest edge workload, and subscribe for more deep dives into neuromorphic computing. If this sparked ideas, leave a review and pass it to a friend building at the edge.
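The three-tap FIR demo mentioned above is, in conventional form, just a weighted sum over a two-sample delay line. A reference implementation in plain Python, with arbitrary tap values, shows what ADA's spike-based primitives reproduce; this is the classical version, not Axon SDK code.

```python
def fir3(samples, taps=(0.25, 0.5, 0.25)):
    """Three-tap FIR filter: y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2].

    The two-sample delay line starts at zero; tap values are arbitrary
    (this choice is a simple smoothing kernel)."""
    b0, b1, b2 = taps
    out, x1, x2 = [], 0.0, 0.0
    for x0 in samples:
        out.append(b0 * x0 + b1 * x1 + b2 * x2)
        x1, x2 = x0, x1  # shift the delay line
    return out
```

On ADA the same structure is assembled from modular spiking primitives, so only samples that actually arrive as events cost energy, which is where the sparsity win over a polled MCU loop comes from.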

    20 min
  7. MAR 26

    From Fragments to Foundation: The Sound of Progress in Edge Audio AI

    What if your printer didn’t just spit out pages, but actually understood them? We walk through a hands-on look at multimodal AI on the edge—how visual-language models read layouts, extract tables, translate content, and reformat documents right where data lives, without shipping sensitive files to the cloud. It’s a practical tour from passive peripherals to active intelligence, with real workflows and measurable speedups.

    We share the architecture behind on-device document intelligence: pre-processing that stabilizes inputs, VLMs that localize and reason over text and images, and post-processing that converts outputs into CSVs, charts, and accessibility-friendly layouts. You’ll hear how Qwen 2.5-VL handles complex visual inputs while maintaining strong language performance, and how a Flux-based diffusion setup enables creative generation and targeted edits—from updating dates in greeting cards to changing borders and colors by prompt. Along the way, we unpack quantization with GGUF to run 7B-class models in tight memory, diffusion sampler and scheduler tuning for latency, and NVIDIA-optimized libraries to squeeze more from modest GPUs.

    Beyond demos, we dig into business and engineering realities: fine-tuning with enterprise data to reduce hallucinations, building guardrails and fallback paths for reliability, and segmenting large documents to manage VRAM. We also discuss why a companion device—AI PC or smartphone—can orchestrate heavy lifting until printer SoCs catch up, keeping data private and workflows responsive. If you care about document AI, privacy by design, or accessibility features like dynamic type and contrast, this conversation makes the path concrete and actionable.

    Enjoy the deep dive? Subscribe, share with a colleague who lives in PDFs, and leave a review with the one edge use case you want us to test next.
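Segmenting large documents to manage VRAM can be sketched as greedy packing of pages against a token budget. The word-count token estimate below is a crude stand-in for a real tokenizer, and the function is a generic illustration, not code from the talk.

```python
def segment(pages, budget_tokens, est_tokens=lambda p: len(p.split())):
    """Greedily pack pages into chunks that fit a token budget.

    Each chunk is processed by the VLM in one pass, so the budget acts
    as a proxy for the VRAM the context will consume. A single page
    that exceeds the budget still gets its own chunk."""
    chunks, current, used = [], [], 0
    for page in pages:
        t = est_tokens(page)
        if current and used + t > budget_tokens:
            chunks.append(current)   # flush the full chunk
            current, used = [], 0
        current.append(page)
        used += t
    if current:
        chunks.append(current)
    return chunks
```

Real pipelines would also overlap chunk boundaries slightly so tables and paragraphs split across pages are not cut mid-structure.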

    29 min
  8. MAR 19

    Empowering at the Edge: the "Arduino way" to AI

    What if AI felt like a door you could open, not a wall you had to climb? We dig into how Arduino’s approach—accessibility first, power when you need it—turns the edge AI buzz into a concrete path you can follow, whether you’re a student with a starter kit or an engineer shipping to a fleet.

    We walk through a practical four-step journey: try AI through no-code experiments, understand it with pre-trained models, train by fine-tuning or starting from scratch with your data, and build something real that lives beyond a demo. Along the way, we unpack a core principle we call “abstraction without obfuscation”—removing friction while keeping the logic transparent—so you can inspect, modify, and truly own the systems you create.

    That design philosophy shapes everything from our open hardware portfolio (TinyML-friendly MCUs up to Linux-capable MPUs) to our integrations with popular AI frameworks and community-driven libraries. You’ll also hear how cloud-native developer tools streamline the messy middle: browser-based workflows, single-device to fleet deployments, secure OTA updates, data collection for predictive insights, and closed-loop model improvement. Plus, we introduce our AI assistant as a coach that explains code, diagnoses bugs, and helps optimize for memory and speed—turning dead ends into learning moments.

    Real-world validation from a 35-million-strong community and enterprise teams, including automotive innovators, shows how openness and cohesion accelerate the leap from idea to production. If you care about AI that empowers rather than intimidates, this conversation lays out the playbook. Subscribe, share with a teammate who loves to build, and leave a review telling us the project you’re dreaming about—we might feature it next.

    20 min

Ratings & Reviews

4 out of 5 (2 Ratings)
