Data Science With Sam

Soumava Dey

This is an educational podcast focused on bringing academia and industry experts together in a common forum and initiating discussions geared towards data science, artificial intelligence, actuarial science and scientific research. DISCLAIMER: The views and opinions expressed in this podcast are solely those of the host(s) or guest(s) and do not necessarily reflect the policy or position of any organization. The podcast is intended for general educational and entertainment purposes only.

  1. EP 38: The Local AI Stack Nobody Talks About (But Should)

    2 DAYS AGO

    You want to run AI locally. You have questions: What hardware do I actually need? Which framework should I use? How much will this cost? What's the realistic performance? In this episode, Sam brings back Trent Rossiter, founder of Logic Data Solutions, for a practical walkthrough of building a production-grade local AI lab. Trent has built real systems for enterprise clients, tested frameworks on multiple hardware stacks, and made the hardware choices that matter. This is not theory. This is what actually works.

    WHAT WE COVER:
    ▪ Hardware & Framework Choices: VRAM is the critical metric (not all VRAM is equal — memory throughput matters as much as capacity).
    ▪ Model Architecture & Capability: Mixture of Experts (MoE) lets you fit more power into less VRAM by using fewer active parameters.
    ▪ Real Enterprise Applications: computer vision for quality assurance on assembly lines; proprietary data handling without cloud exposure.
    ▪ Your Starter Stack (All Free): Langflow (agentic workflow builder), Goose (MCP-enabled chat), AnythingLLM (with vector stores for RAG), MCP servers (Model Context Protocol — standardised tool integration).
    ▪ Agentic AI & Security: OpenClaw is powerful but controversial — it manages email, Telegram, and calendars, and creates sub-agents. Trent runs it in Docker on an isolated machine for safety. NVIDIA's NemoClaw is the enterprise version (security-first, nothing allowed by default, explicit permissions).

    HARDWARE TRENT MENTIONS:
    ▪ NVIDIA DGX Spark — 128GB unified memory, CUDA stack
    ▪ Apple MacBook Pro/Mac mini — up to 512GB unified memory, market leader for personal AI
    ▪ AMD integrated AI PCs — emerging competitor
    ▪ NVIDIA RTX gaming cards (30/40/50/60 series) — high VRAM, high power consumption, complex

    FIND TRENT ROSSITER:
    LinkedIn: https://www.linkedin.com/in/benjamin-trent-rossiter-mba-0157945/
    Logic Data Solutions: https://logicdatasolutions.com/
    Contact: BenjaminRossiter@LogicDataSolutions.com
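    The MoE point above can be made concrete with some back-of-envelope arithmetic. This is a sketch with made-up sizes (none of these numbers describe any specific model): a top-k routed MoE runs only k experts per token, so per-token compute and memory traffic scale with the active parameter count, even though all expert weights must still fit in memory.

```python
# Back-of-envelope MoE sizing. All numbers are illustrative assumptions,
# not the specs of any real model.

def moe_param_counts(shared, n_experts, per_expert, top_k):
    """Return (total, active) parameter counts for a top-k routed MoE."""
    total = shared + n_experts * per_expert   # everything that must be loaded
    active = shared + top_k * per_expert      # what each token actually uses
    return total, active

def weight_vram_gb(params, bytes_per_param=2):  # bf16/fp16 weights
    return params * bytes_per_param / 1e9

total, active = moe_param_counts(shared=2e9, n_experts=64,
                                 per_expert=0.5e9, top_k=2)
print(f"total {total/1e9:.0f}B params ({weight_vram_gb(total):.0f} GB), "
      f"active {active/1e9:.0f}B per token")
# total 34B params (68 GB), active 3B per token
```

    Per-token bandwidth tracking active rather than total parameters is what makes MoE models feel faster on the same hardware — and it is also why the episode's point about memory throughput, not just capacity, matters.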

    41 min
  2. EP 37: Neurons: Future of AI Processing

    5 DAYS AGO

    What if the next generation of computers wasn't made of silicon — but of living human neurons? Not simulated neurons, not artificial neural networks inspired by biology, but actual brain cells grown in a lab, connected to electrodes, and used to process information. That's not science fiction anymore. It's happening right now at FinalSpark, a Swiss startup building the world's first remotely accessible biocomputing platform. In this episode, Sam talks with Dr. Ewelina Kurtys, a neuroscientist with a PhD in brain imaging and a postdoctoral researcher at King's College London, about how living neurons could revolutionise computing — and why they use one million times less energy than silicon-based AI hardware.

    ▸ WHAT YOU'LL LEARN
    ▪ How FinalSpark was founded in 2014 by Fred Jordan and Martin Kutter — and why they pivoted from digital AI to biological computing when they realised the energy and cost problem was unsolvable with silicon
    ▪ Why 20 watts powers the human brain while silicon-based AI requires megawatts — and what that means for AI's sustainability crisis
    ▪ The difference between neurons as processors (not power sources) — a crucial distinction most people get wrong
    ▪ Why biological neural networks learn continuously while digital systems require full model updates — and what that means for energy efficiency
    ▪ The honest challenge: nobody yet knows exactly how neurons encode information — the biggest scientific hurdle in biocomputing right now
    ▪ How the I/O interface works: electrodes measuring neural spikes, analog-to-digital converters, researchers writing Python code to control neurons remotely
    ▪ The remote access breakthrough: researchers in Tokyo or Bristol can log in and control living neurons in Switzerland in real time via browser
    ▪ Why neurons won't outperform GPUs on speed: biocomputing specialises in efficiency and adaptability, not clock cycles
    ▪ FinalSpark's current stage: they've stored 1 bit of information and are collaborating with 9 universities on fundamental research
    ▪ The cost argument: even at 10× lower price than NVIDIA, biocomputers would still generate billions in profit due to energy and infrastructure savings
    ▪ Bioethics, consent, and regulation: how FinalSpark is working with philosophers now to establish ethical frameworks before biocomputing scales
    ▪ Why human-machine integration is not new: prosthetics, pacemakers, and smartphones are already blending biology and technology
    ▪ The hybrid computing future: silicon, quantum, and biocomputing will coexist, each doing what they do best
    ▪ The real game-changer: cheap, accessible AI for everyone — Ewelina's vision for what biocomputing means for society in 10–20 years.

    ▸ LINKS MENTIONED IN THIS EPISODE
    → Dr. Ewelina Kurtys on LinkedIn
    → Ewelina's Personal Blog & Articles
    → FinalSpark (official website)
    → FinalSpark Neuroplatform (with live neuron view)
    → FinalSpark Team
    → Psync (Ewelina's mental wellness startup)
    → FinalSpark Contact Form
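    The I/O loop described in the episode (electrodes → analog-to-digital converter → Python over the network) can be sketched in client form. Everything here is invented for illustration — the class, method names, and spike numbers are hypothetical stand-ins, not FinalSpark's actual Neuroplatform API.

```python
# Hypothetical remote biocomputing client, shaped like the workflow in the
# episode: stimulate an electrode, read digitised spike counts back.
# Fully simulated in-process; a real platform would make network calls.

class FakeNeuroplatform:
    """Stand-in for a remote electrode array (all behaviour is simulated)."""

    def __init__(self, n_electrodes=8):
        self._stim = [0.0] * n_electrodes  # last stimulation per electrode

    def stimulate(self, electrode, amplitude_uA):
        # A real system would drive a stimulation pulse through a DAC.
        self._stim[electrode] = amplitude_uA

    def read_spikes(self, electrode, window_ms=100):
        # A real system would return ADC-digitised spike counts; here we
        # fake a simple monotone response to the stimulation amplitude.
        return int(self._stim[electrode] * 2)

lab = FakeNeuroplatform()
lab.stimulate(electrode=3, amplitude_uA=5.0)
print(lab.read_spikes(electrode=3))  # 10
```

    The point of the sketch is the control loop, not the biology: from the researcher's side, "programming neurons" looks like ordinary request/response code against digitised measurements.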

    30 min
  3. EP 35: Who Actually Controls AI? The Governance Gap Explained

    23 MAR

    There's no international treaty governing AI, no agreed definition of "safe AI," and nobody with actual authority over frontier model deployment. A handful of CEOs make decisions with civilizational implications while governance structures lag years behind. This episode examines who's responsible for AI governance.

    The current state? Fragmented and lagging. The US has no comprehensive federal AI legislation—Biden's executive order was rolled back under Trump. The EU AI Act is the most comprehensive, but its heavy provisions don't kick in for years. China's regulation focuses on censorship over safety. The UK AI Safety Institute does serious work but has no enforcement authority.

    What's working? AI safety institutes are building evaluation capacity. Open-source releases like DeepSeek enable external research. The academic safety community is advancing interpretability work. Market pressure matters—Anthropic gained users by taking public safety stands.

    Three urgent needs: mandatory disclosure requirements for high-capability systems, international coordination with shared evaluation standards (AI safety summits need teeth), and public deliberation beyond experts and officials.

    This concludes the AI Governance and Regulation series. People who understand AI deeply - technically, commercially, ethically, politically - will shape governance's future. Stay curious, stay critical, and never outsource your thinking to any single company or voice.

    7 min
  4. EP 33: Agents Everywhere: What Agentic AI Actually Means for Your Job

    18 MAR

    Everyone's talking about agentic AI, but there's a gap between the hype ("AI will do your job for you") and the reality, which is more nuanced and frankly more interesting. The word "agentic" has officially crossed from technical jargon into buzzword territory—simultaneously everywhere and nowhere. Everyone's using it, few can define it precisely. This episode cuts through the noise to explain what agentic AI systems actually are, what they can and cannot do today, and the realistic implications for people working in data, tech, and knowledge work.

    What is an agent? Traditional AI interaction: you send a prompt, the model produces a response, done. An AI agent is different: it takes a goal, breaks it into steps, takes actions in the world (browsing the web, writing and running code, calling APIs, managing files), observes results, and iterates until the goal is achieved or it gets stuck. The key agentic feature: it operates across multiple steps autonomously without you manually directing each one. Consumer-facing examples include Anthropic's Claude; in enterprise settings, agents are being deployed for automated customer support escalation, multi-step data pipeline management, code review and testing workflows, and research synthesis across large document sets.

    What can agents do today in early 2026? Agents are reliable for well-defined, bounded tasks with clear success criteria—taking support tickets, classifying them, drafting responses, flagging uncertain ones for human review. But for autonomously managing complex, open-ended strategic projects? Still unreliable. Failure modes include hallucinations, tool use errors, context window limitations in long tasks, and difficulty recovering gracefully when something unexpected happens mid-task. These are real limitations the best researchers are actively working on.

    The realistic workforce impact right now is task displacement rather than job displacement. Specific tasks within jobs are being automated: first drafts of documents, initial data analysis, standard code patterns, customer FAQ responses. Higher-order judgment, stakeholder navigation, creative problem framing, and ethical calls remain under human control. For data scientists specifically, repetitive engineering work is most likely to be automated: data cleaning pipelines, standard visualizations, model deployment scripts. But statistical thinking, algorithmic design, understanding model outputs, and evaluating trustworthiness remain human responsibilities. The work becoming more valuable: knowing what questions to ask, evaluating whether AI output is trustworthy, and designing systems that fail safely.

    The advice: become a power user of agentic tools before your role requires it. Not because you'll be replaced by an agent, but because practitioners who understand these tools deeply will be disproportionately effective. Learn how to prompt agents for complex multi-step tasks, evaluate outputs critically, and understand failure modes so you can deploy humans strategically. Agentic AI is real, useful today for specific tasks, and improving rapidly. The hype is ahead of the reality, but not by as much as you might think.
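    The goal → plan → act → observe → iterate loop the episode describes can be sketched in a few lines. This is a minimal illustration under assumed interfaces (the `model` and `tools` callables are stand-ins), not any vendor's agent framework.

```python
# Minimal agent loop: the model picks an action, a tool executes it, the
# observation is fed back, until the model declares the goal finished or
# the step budget runs out (the "stuck" case the episode mentions).

def run_agent(goal, model, tools, max_steps=10):
    history = []
    for _ in range(max_steps):
        action = model(goal, history)                 # decide the next step
        if action["name"] == "finish":
            return action["result"]
        observation = tools[action["name"]](**action["args"])  # act
        history.append((action, observation))         # observe, then iterate
    return None                                       # stuck: budget exhausted

# Toy usage: one calculator tool and a hard-coded two-step "model".
def toy_model(goal, history):
    if not history:
        return {"name": "add", "args": {"a": 2, "b": 3}}
    return {"name": "finish", "result": history[-1][1]}

print(run_agent("add 2 and 3", toy_model, {"add": lambda a, b: a + b}))  # 5
```

    Real systems replace `toy_model` with an LLM call and `tools` with web browsing, code execution, and API clients, but the control flow — and the failure mode when the loop never reaches `finish` — is the same shape.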

    8 min
  5. EP 32: AI Discovers Drugs: The 2026 Clinical Trial Moment for AI in Biotech

    16 MAR

    For years, AI in drug discovery has been a promise—billions invested, hundreds of papers published, dozens of startups founded, but actual drugs coming out the other end? Not yet. This is changing in 2026. Several AI-discovered drug candidates are now entering mid-to-late stage clinical trials. This is the year the receipts arrive for AI in drug discovery.

    The biotech industry is calling 2026 a landmark year. For a sector that's been hyped as much as it's been scrutinized, the fact that we're finally getting real clinical data on AI-designed drug candidates is a big deal. Multiple candidates discovered and optimized using AI systems are now in Phase 2 and Phase 3 clinical trials, primarily focused on oncology and rare diseases—areas where existing options are limited and financial incentives for innovation are high. Companies furthest along include Insilico Medicine, Recursion Pharmaceuticals, and Exscientia. Their drug candidates were identified by AI systems analyzing massive biological datasets and predicting molecular structures likely to interact with disease targets in useful ways. What used to take teams of medicinal chemists years to accomplish, these systems can explore in weeks—a massive boost for clinical trial phases by reducing R&D time.

    Why this matters: Traditional drug discovery takes 10-15 years and over $1 billion per approved drug. Most candidates fail—the attrition rate in clinical trials is brutal. AI's promise is dramatically improving the hit rate by better predicting which candidates will actually work before spending money on trials. Even a modest improvement in clinical trial success rates would have enormous downstream impact on human health.

    But 2026 is a stress test. Clinical trials expose whether AI-predicted drug behavior holds up in actual human biology, which is extraordinarily complex. AI models are trained on known data; when candidates reach trials, you're testing the model's ability to generalize to real biological complexity that wasn't in training. Early signals have been mixed—some candidates performing well, others hitting unexpected toxicity issues. The honest answer: we don't know yet how much AI improves success rates at the clinical stage.

    For data scientists interested in this space, the most interesting current work is in molecular property prediction, protein structure modeling building on AlphaFold, and multi-objective optimization across efficacy, safety, and synthesizability simultaneously. Recursion's operating system approach treats drug discovery as a data problem end-to-end—one of the most ambitious attempts to apply ML infrastructure thinking to biology at scale. AI in drug discovery is no longer just a story about potential—it's now a story about evidence. The next two years of clinical data will either validate or seriously challenge what's been claimed.
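    The multi-objective optimization idea mentioned in the episode — trading off efficacy, safety, and synthesizability at once — can be illustrated with a toy Pareto filter. The candidate names and scores below are made up; real pipelines score candidates with learned property predictors rather than hand-set numbers.

```python
# Keep only candidates that are not dominated on (efficacy, safety,
# synthesizability): a candidate is dropped if some other candidate is at
# least as good on every objective and strictly better on at least one.

def dominates(a, b):
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    # candidates: {name: (efficacy, safety, synthesizability)}, higher = better
    return {n: s for n, s in candidates.items()
            if not any(dominates(t, s)
                       for m, t in candidates.items() if m != n)}

scores = {"mol_A": (0.9, 0.5, 0.7),
          "mol_B": (0.8, 0.4, 0.6),   # dominated by mol_A on all three axes
          "mol_C": (0.6, 0.9, 0.5)}   # survives: best on safety
print(sorted(pareto_front(scores)))  # ['mol_A', 'mol_C']
```

    The design point this captures: with multiple objectives there is usually no single "best" molecule, only a frontier of trade-offs, which is why these pipelines optimize the objectives simultaneously rather than one at a time.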

    8 min

Ratings and Reviews

5 out of 5 (14 ratings)
