26 episodes

With the Ai X Podcast, the Open Data Science Conference (ODSC) brings its vast experience in building community and its knowledge of the data science and AI fields to the podcast platform. The interests and challenges of the data science community are wide-ranging. To reflect this, the Ai X Podcast offers a similarly wide range of content, from one-on-one interviews with leading experts, to career talks, to educational interviews, to profiles of AI startup founders. Join us every two weeks to discover what’s going on in the data science community.

Find more ODSC lightning interviews, webinars, live trainings, certifications, and bootcamps here: https://aiplus.training/
Don't miss out on this exciting opportunity to expand your knowledge and stay ahead of the curve.


    AI and Data Science in Financial Markets with Iro Tasitsiomi

    In this episode, our guest is Argyro (Iro) Tasitsiomi, Head of Investments Data Science at T. Rowe Price. Iro has extensive experience in AI and data science within the financial markets. Early in her career, Iro held significant roles focusing on quantitative research & risk management, where she developed advanced trading strategies and econometric forecasting tools.

    Later, as a Data Scientist in Investment Banking, she led initiatives in business intelligence, market intelligence, and machine learning infrastructure development. Her work has spanned various domains, including asset portfolio optimization, enterprise risk management, and mergers and acquisitions.

    She joins us on the podcast to discuss her professional journey at leading financial institutions such as Goldman Sachs, BlackRock, Prudential, and T. Rowe Price, and explore how AI is transforming various aspects of financial markets. We'll discuss advancements in financial modeling, the opportunities and risks of integrating AI into financial analysis, and the impact of fake data on market stability. Additionally, we'll cover the importance of quality data, the enduring value of human expertise, and emerging skills in the era of AI.

    Topics:
    Career Journey: Iro’s professional journey working at top financial institutions such as Goldman Sachs, BlackRock, Prudential, and T. Rowe Price
    Earlier Modeling Techniques: time series, trading strategies, and risk modeling
    AI Applications in Financial Markets: How is AI being applied across different areas of the financial markets today?
    Fundamental Analysts in the Age of AI: How can fundamental analysts leverage their skills alongside AI tools to create a more comprehensive approach to financial analysis?
    Advancements in Financial Modeling: With the rise of AI, have modeling techniques advanced beyond traditional methods like Monte Carlo simulation and the Black-Scholes model? (A minimal pricing sketch in Python follows this list.)
    Opportunities and Risks of AI: What are the opportunities and risks of integrating AI into fundamental financial analysis?
    Impact of Fake Data and Fake News: Exploring the influence of fake data and fake news on social media and its impact on financial markets.
    Importance of Quality Data: The significance of quality data for AI in financial markets and how the evolving data landscape is shaping robust quantitative models.
    Crowding: In traditional finance, popular trading strategies often become less effective as more market participants adopt them (crowding). Looking ahead, with the rise of AI and everyone potentially having access to similar LLM tools, do we foresee a similar phenomenon of "crowding" occurring?
    Synthetic Data Usage: Discussing the use of synthetic data in financial market modeling and analysis.
    Economic forecasting: The potential benefits and challenges associated with using AI for this purpose
    Astrophysics: Iro’s research focused on large-scale structure formation in the universe using cosmological simulations.
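
    For listeners who want to brush up before the episode, here is a minimal Python sketch of the two classical pricing approaches named above: it prices a European call option with the closed-form Black-Scholes formula and with a Monte Carlo simulation of geometric Brownian motion. The parameters are illustrative, not from the episode, and the two estimates should agree closely.

    ```python
    import math
    import numpy as np

    # Illustrative parameters: spot, strike, risk-free rate, volatility, maturity
    S0, K, r, sigma, T = 100.0, 105.0, 0.03, 0.20, 1.0

    def norm_cdf(x: float) -> float:
        """Standard normal CDF via the error function (no SciPy needed)."""
        return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    # Closed-form Black-Scholes price of a European call
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    bs_price = S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

    # Monte Carlo estimate: simulate terminal prices under geometric Brownian motion
    rng = np.random.default_rng(seed=0)
    z = rng.standard_normal(1_000_000)
    ST = S0 * np.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
    mc_price = math.exp(-r * T) * np.mean(np.maximum(ST - K, 0.0))

    print(f"Black-Scholes: {bs_price:.4f}  Monte Carlo: {mc_price:.4f}")
    ```
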
    Show Notes:

    Learn more about Iro Tasitsiomi:
    https://www.linkedin.com/in/argyrotasitsiomi/
    Bits and Brainwaves Newsletter: https://www.linkedin.com/newsletters/bits-brainwaves-7151965921455046657/

    Argyro (Iro) Tasitsiomi's Google Scholar astrophysics citations: https://scholar.google.com/citations?user=-dtAtwIAAAAJ&hl=en

    Chronos: Learning the Language of Time Series:
    https://arxiv.org/abs/2403.07815

    On cosmology, investment strategies and ergodicity:
    https://www.linkedin.com/pulse/cosmology-investment-strategies-ergodicity-tasitsiomi-phd-aec0f

    This episode was sponsored by:
    Ai+ Training https://aiplus.training/
    Home to hundreds of hours of on-demand, self-paced AI training, ODSC interviews, free webinars, and certifications in in-demand skills like LLMs and Prompt Engineering

    And created in partnership with ODSC https://odsc.com/
    The Leading AI Training Conference, featuring expert-led, hands-on workshops, training sessions, and talks on cutting-edge AI topics.

    Never miss an episode, subscribe now!

    How AI Will Impact the News with Noah Giansiracusa

    In this episode, Professor Noah Giansiracusa, a tenured associate professor of mathematics and data science at Bentley University, joins us to discuss AI’s impact on the news.

    Professor Giansiracusa's research interests range from algebraic geometry to machine learning to empirical legal studies. He is the author of How Algorithms Create and Prevent Fake News (2021) and has written op-eds for Barron's, the Boston Globe, Wired, Slate, and Fast Company.

    During this interview, Professor Giansiracusa discusses the benefits and drawbacks of AI as it relates to the creation and dissemination of news. He explores the economic pressures on newsrooms and how AI might provide cost-effective solutions, while also examining the challenges of ensuring factual accuracy in an AI-powered news landscape.


    Topics:
    - Overview of the guest's background
    - How AI will impact the news
    - The new skills journalists will need to develop and how newsrooms can adapt their workforce strategies
    - AI's potential (or lack thereof) to provide cost-effective solutions for smaller, local news organizations struggling to compete and stay relevant
    - Ways local and small news organizations can make use of AI
    - Automation and real-time AI-driven journalism
    - Using AI to fact-check the news
    - The long-term impacts of AI on how people consume and understand the news, and how to avoid filter bubbles and echo chambers
    - How AI can help improve the fairness, accuracy, and trustworthiness of news, and the challenges involved
    - Public awareness and perception of AI in the news
    - The role AI might play in the creation and dissemination of misinformation
    - Upcoming US and European elections: misinformation generation and misinformation detection
    - Watermarking, audio fakes, and deepfake detection
    - Wrap-up and how to follow our guest

    Show Notes:


    Learn more about Noah Giansiracusa, PhD here:
    https://www.noahgian.com/

    Get his book here:
    https://link.springer.com/book/10.1007/978-1-4842-7155-1

    Learn more about Watermarking AI here: https://www.brookings.edu/articles/detecting-ai-fingerprints-a-guide-to-watermarking-and-beyond/

    Learn more about The Coalition for Content Provenance and Authenticity here:
    https://c2pa.org/

    This episode was sponsored by:
    Ai+ Training https://aiplus.training/
    Home to hundreds of hours of on-demand, self-paced AI training, ODSC interviews, free webinars, and certifications in in-demand skills like LLMs and Prompt Engineering

    And created in partnership with ODSC https://odsc.com/
    The Leading AI Training Conference, featuring expert-led, hands-on workshops, training sessions, and talks on cutting-edge AI topics.

    Never miss an episode, subscribe now!

    HPCC: An Open-Source Platform for High-Performance Computing on Large-Scale Data, with Bob Foreman

    In this episode of the Ai X Podcast, Bob Foreman, Lead Software Engineer at LexisNexis Risk Solutions, joins us to discuss the High-Performance Computing Cluster (HPCC) project, an open-source, massively parallel computing platform for data processing and analytics.

    Bob has spent more than a decade working with HPCC Systems and with Enterprise Control Language (ECL) as a course developer and trainer. Not only is he a highly experienced developer, but he is also the designer of the HPCC Systems online training courses and the senior instructor for all classroom and remote-based training.

    Join him for a deep dive into the HPCC project to discover how it simplifies complex data analysis at scale and why it's an ideal tool for students, startups, or companies exploring or running a proof of concept for large-scale, data-intensive computing.

    Topics:
    - Guest Background and Professional Development
    - Overview of LexisNexis as a company
    - The guest's current role
    - What high-speed data engineering involves and why it's important in big data solutions
    - Main components of the HPCC Systems platform
    - Where the HPCC Systems platform fits in the data landscape
    - The role of Thor, Roxie, and ECL in the platform
    - Example of how ECL can be used to manipulate and analyze large datasets
    - How Roxie enhances the performance of real-time data querying and analysis
    - How to get started with HPCC's open-source massively parallel system
    - Working with small data sets before getting to large data sets
    - The Machine Learning Library native to HPCC
    - How the HPCC platform fits into the latest trends in Machine Learning and Generative AI
    - HPCC-related events in 2024
    - ODSC workshops and HPCC community initiatives
    - How to follow HPCC updates

    Show Notes:
    Learn more about Bob Foreman: https://www.linkedin.com/in/bobforeman/
    Learn more about the HPCC Platform: https://github.com/hpcc-systems/HPCC-Platform and
    https://hpccsystems.com/about/#Platform
    HPCC ECL bundles: https://github.com/hpcc-systems/ecl-bundles
    HPCC Systems Machine Learning Library: https://hpccsystems.com/download/free-modules/hpcc-systems-machine-learning-library/

    Bob’s Educational Resources
    Bob’s online course: https://hpccsystems.com/training/free-online-learning-with-hpcc-systems/

    HPCC community initiatives: https://www.missingkids.org/ourwork/ncmecdata

    This episode was sponsored by:
    Ai+ Training https://aiplus.training/
    Home to hundreds of hours of on-demand, self-paced AI training, ODSC interviews, free webinars, and certifications in in-demand skills like LLMs and Prompt Engineering

    And created in partnership with ODSC https://odsc.com/
    The Leading AI Training Conference, featuring expert-led, hands-on workshops, training sessions, and talks on cutting-edge AI topics.

    Never miss an episode, subscribe now!

    Training and Deploying Open-Source LLMs with Dr. Jon Krohn

    In this episode, we speak with Dr. Jon Krohn about the life cycle of open-source LLMs. Jon is a co-founder and chief data scientist at the machine learning company Nebula. He is the author of the book Deep Learning Illustrated, which was an instant #1 bestseller and has been translated into seven languages. He is also the host of the fabulous SuperDataScience podcast, the data science industry's most listened-to podcast. An incredible instructor and speaker, Jon gives workshops at ODSC conferences and other events that are always among our most popular.

    Topics:
    1. Guest Introduction
    2. Definition of an open-source LLM and what it means to be truly open source
    3. The importance of LLM weights and neural network architecture for training
    4. Transformer architecture
    5. Apple expanding their AI team
    6. What you need to train or fine-tune an LLM
    7. Key libraries for fine-tuning an LLM
    8. The LoRA (Low-Rank Adaptation) technique for efficiently fine-tuning large language models (a minimal configuration sketch in Python follows this list)
    9. Testing and evaluating LLMs prior to deploying them in production
    10. Retrieval Augmented Generation (RAG)
    11. Deploying LLMs to production
    12. How to keep inference costs down
    13. How to follow Jon's content (see show notes)
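
    As a companion to item 8, here is a minimal sketch of a LoRA fine-tuning setup, assuming the Hugging Face transformers and peft libraries. The base model and hyperparameters are illustrative choices, not Jon's recommendations.

    ```python
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, TaskType, get_peft_model

    # Illustrative base model; any causal LM from the Hub works similarly
    model_name = "gpt2"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # LoRA freezes the base weights and trains small low-rank update matrices
    lora_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        r=8,                        # rank of the low-rank update matrices
        lora_alpha=16,              # scaling factor applied to the updates
        lora_dropout=0.05,
        target_modules=["c_attn"],  # GPT-2's fused attention projection
    )
    model = get_peft_model(model, lora_config)

    # Only a tiny fraction of the parameters is now trainable
    model.print_trainable_parameters()
    ```

    From here, the wrapped model trains with any standard loop or the transformers Trainer, and only the small adapter weights need to be saved and shipped, which is what makes LoRA so much cheaper than full fine-tuning.
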
    Show Notes:
    More about Jon:
    LinkedIn - https://www.linkedin.com/in/jonkrohn/
    Jon’s YouTube Channel - https://www.youtube.com/c/jonkrohnlearns
    Jon’s Monthly Newsletter - https://www.jonkrohn.com/

    Tools and Resources:
    Michael Nielsen's eBook on Neural Networks and Deep Learning - http://neuralnetworksanddeeplearning.com/
    PyTorch Lightning, a deep learning framework - https://lightning.ai/docs/pytorch/stable/

    Hugging Face Transformers Library - https://huggingface.co/docs/transformers/v4.17.0/en/index

    Vicuna: An Open-Source Chatbot - https://lmsys.org/blog/2023-03-30-vicuna/

    LoRA: Low-Rank Adaptation of Large Language Models - https://arxiv.org/abs/2106.09685

    SDS 674: Parameter-Efficient Fine-Tuning of LLMs using LoRA (Low-Rank Adaptation) - https://www.superdatascience.com/podcast/parameter-efficient-fine-tuning-of-llms-using-lora-low-rank-adaptation

    Unsloth for finetuning Llama 3, Mistral & Gemma - https://github.com/unslothai/unsloth

    Phoenix: an open-source AI Observability & Evaluation tool - https://github.com/Arize-ai/phoenix

    ODSC Podcast with Amber Roberts on Phoenix and troubleshooting LLMs - https://www.odsc.com/podcast/#e33

    Weights & Biases - https://wandb.ai/site
    This episode was sponsored by:
    Ai+ Training https://aiplus.training/
    Home to hundreds of hours of on-demand, self-paced AI training, ODSC interviews, free webinars, and certifications in in-demand skills like LLMs and Prompt Engineering

    And created in partnership with ODSC https://odsc.com/
    The Leading AI Training Conference, featuring expert-led, hands-on workshops, training sessions, and talks on cutting-edge AI topics.

    Never miss an episode, subscribe now!

    DBRX and Open Source Mixture of Experts LLMs with Hagay Lupesko

    Today on our podcast, we're thrilled to have Hagay Lupesko, Senior Director of Engineering in the Mosaic AI team at Databricks and one of the key architects behind Databricks' groundbreaking large language model, DBRX.

    Previously, Hagay was VP of Engineering at MosaicML, which was acquired by Databricks in 2023. He has also held AI engineering leadership roles at Meta, AWS, and GE Healthcare.

    Our topic today is the open-source DBRX large language model, which stands out in the LLM landscape for its innovative use of the Mixture-of-Experts (MoE) architecture. This architecture allows DBRX to scale efficiently by routing each token to a small subset of specialized expert subnetworks: DBRX uses 16 experts, of which 4 are selected for any given input, so only a fraction of the model's parameters are active at each step. This results in faster processing and potentially better performance compared to traditional dense LLM architectures.


    We'll be exploring the inspiration behind DBRX, the advantages of Mixture of Experts, and how it positions DBRX within the larger LLM landscape.
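
    To make the routing idea concrete, here is a minimal, illustrative PyTorch sketch of a top-k MoE layer. It is not DBRX's actual implementation (which adds load balancing and many other refinements); it just shows the core mechanism of a learned gate selecting a few experts per token.

    ```python
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TopKMoE(nn.Module):
        """Minimal Mixture-of-Experts layer: route each token to its top-k experts."""

        def __init__(self, d_model: int, n_experts: int = 16, k: int = 4):
            super().__init__()
            # Each expert is a small feed-forward network
            self.experts = nn.ModuleList([
                nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                              nn.Linear(4 * d_model, d_model))
                for _ in range(n_experts)
            ])
            self.gate = nn.Linear(d_model, n_experts)  # learned router
            self.k = k

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (tokens, d_model). Score every expert, keep only the top-k per token.
            scores = self.gate(x)                          # (tokens, n_experts)
            topk_scores, topk_idx = scores.topk(self.k, dim=-1)
            topk_weights = F.softmax(topk_scores, dim=-1)  # mixture weights per token
            out = torch.zeros_like(x)
            for e, expert in enumerate(self.experts):
                for slot in range(self.k):
                    mask = topk_idx[:, slot] == e          # tokens routed to expert e
                    if mask.any():
                        w = topk_weights[mask, slot].unsqueeze(-1)
                        out[mask] += w * expert(x[mask])
            return out

    layer = TopKMoE(d_model=64)
    print(layer(torch.randn(8, 64)).shape)  # torch.Size([8, 64])
    ```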

    Podcast Topics:

    - DBRX backstory and Databricks' Mosaic AI Research team
    - Inspiration for the open-source DBRX LLM and the gap it fills
    - Core features of DBRX that distinguish it from other large language models (LLMs)
    - The Mixture-of-Experts (MoE) architecture and how it enhances LLMs
    - Comparison to other MoE models like Mixtral-8x7B and Grok-1
    - Advanced DBRX architecture features (links in the show notes; a minimal RoPE sketch in Python follows this list): Rotary Position Encodings (RoPE), Gated Linear Units (GLU), Grouped Query Attention (GQA), and the GPT-4 tokenizer (tiktoken)
    - Types of tasks and applications Mixture-of-Experts models are particularly well-suited for
    - RAG (Retrieval Augmented Generation) and LLMs for enterprise applications
    - How open-source MoE LLMs like DBRX are being used by Databricks customers
    - What's next in 2024 for DBRX, mixture-of-experts models, and LLMs in general
    - How to keep up with the evolving AI field
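
    As a taste of one of those advanced features, here is a minimal NumPy sketch of Rotary Position Encodings (RoPE), assuming the interleaved dimension pairing from the RoPE paper: each pair of embedding dimensions is rotated by a position-dependent angle, so relative position information falls out of dot products between rotated queries and keys.

    ```python
    import numpy as np

    def rope(x: np.ndarray, base: float = 10000.0) -> np.ndarray:
        """Apply rotary position encoding to x of shape (seq_len, dim), dim even."""
        seq_len, dim = x.shape
        # One rotation frequency per dimension pair, as in the RoPE paper
        freqs = base ** (-np.arange(0, dim, 2) / dim)          # (dim/2,)
        angles = np.arange(seq_len)[:, None] * freqs[None, :]  # (seq_len, dim/2)
        cos, sin = np.cos(angles), np.sin(angles)
        x1, x2 = x[:, 0::2], x[:, 1::2]                        # split into pairs
        out = np.empty_like(x)
        out[:, 0::2] = x1 * cos - x2 * sin                     # 2-D rotation per pair
        out[:, 1::2] = x1 * sin + x2 * cos
        return out

    q = rope(np.random.randn(16, 8))  # rotated queries; keys are rotated the same way
    print(q.shape)                    # (16, 8)
    ```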

    Show Notes:

    Learn more about and connect with Hagay Lupesko: https://www.linkedin.com/in/hagaylupesko/

    Learn more about DBRX, its use of Mixture of Experts, and scaling laws:

    - Introducing DBRX: A New State-of-the-Art Open LLM
    https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm
    - Mixture of Experts
    https://opendatascience.com/what-is-mixture-of-experts-and-how-can-they-boost-llms/#google_vignette
    - Lost in the Middle: How Language Models Use Long Contexts
    https://cs.stanford.edu/~nfliu/papers/lost-in-the-middle.arxiv2023.pdf
    - HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
    https://hotpotqa.github.io
    - Scaling Laws
    https://en.wikipedia.org/wiki/Neural_scaling_law
    - Training Compute-Optimal Large Language Models
    https://arxiv.org/pdf/2203.15556
    - Advanced architecture features and techniques in DBRX:
    - Rotary Position Encodings (RoPE): https://paperswithcode.com/method/rope
    - Gated Linear Units (GLU): https://paperswithcode.com/method/glu
    - Grouped Query Attention (GQA): https://towardsdatascience.com/demystifying-gqa-grouped-query-attention-3fb97b678e4a
    - GPT-4 Tokenizer (tiktoken): https://github.com/openai/tiktoken

    This episode was sponsored by:
    Ai+ Training https://aiplus.training/
    Home to hundreds of hours of on-demand, self-paced AI training, ODSC interviews, free webinars, and certifications in in-demand skills like LLMs and Prompt Engineering

    And created in partnership with ODSC https://odsc.com/
    The Leading AI Training Conference, featuring expert-led, hands-on workshops, training sessions, and talks on cutting-edge AI topics.

    Never miss an episode, subscribe now!

    Embedding Trustworthy Practices Across the AI Lifecycle with Vrushali Sawant

    Vrushali Sawant is a data scientist with SAS's Data Ethics Practice (DEP), steering the practical implementation of fairness and trustworthiness principles into the SAS platform. She regularly writes and speaks about practical strategies for implementing trustworthy AI systems on sas.com and external platforms. With a background in technology consulting, data management, and data visualization, she has been helping customers make data-driven decisions for a decade.


    In this episode of ODSC’s Ai X Podcast, we speak with SAS’s Vrushali Sawant about trustworthy AI, including an overview of what it is, what a career path in trustworthy AI looks like, how data scientists can implement trustworthy AI, and what’s already being done to promote ethical AI.


    Questions:


    1. What is the career path of a data scientist specializing in AI ethics?
    2. What exactly is trustworthy AI, and why is it crucial?
    3. How can we build AI systems that reflect societal values and avoid unintended bias?
    4. What practical steps can data scientists take to ensure data privacy, security, and fairness throughout the AI lifecycle?
    5. How can we develop models that are not only accurate but also interpretable and explainable?
    6. What are some real-world examples of unintended harms caused by AI at massive scale?
    7. What is the AI Incident Database?
    8. What is the significance of data management in building trustworthy AI systems?
    9. What's an example of bias, and what is a practical way to identify and mitigate potential biases in datasets? (A minimal bias-check sketch in Python follows this list.)
    10. How can training data imbalances affect the fairness of AI models?
    11. What are the three stages of bias mitigation?
    12. What is the importance of asking the right questions at the beginning of the AI lifecycle?
    13. Beyond fairness metrics, what methods can developers use to ensure fairness, model transparency, and explainability?
    14. Why is accountability such a critical aspect of building trust in AI systems?
    15. What are accountability and feedback mechanisms?
    16. What are some ethical concerns associated with using synthetic data in various industries?
    17. What are key ethical guidelines that developers and organizations should follow when implementing AI solutions?
    18. What strategies can organizations implement to build a culture of responsible AI?
    19. Let's review your ODSC talk, "From Code to Trust."
    20. What are the biggest hurdles in achieving trustworthy AI in practice?
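
    As a concrete illustration of questions 9 and 10, here is a minimal bias-check sketch in Python with pandas. The toy data and column names are made up for illustration; it computes a simple demographic-parity check, i.e., the rate of positive model outcomes per group.

    ```python
    import pandas as pd

    # Toy predictions table; the groups and labels are made up for illustration
    df = pd.DataFrame({
        "group":     ["A", "A", "A", "B", "B", "B", "B"],
        "predicted": [1,   1,   0,   1,   0,   0,   0],
    })

    # Demographic parity: compare the positive-outcome rate across groups
    rates = df.groupby("group")["predicted"].mean()
    print(rates)

    # A large gap between groups is a red flag worth investigating further
    print("parity gap:", rates.max() - rates.min())
    ```
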


    Learn more about Vrushali and her work:
    Her blog and articles: https://blogs.sas.com/content/author/vrushalisawant/
    Follow her posts on LinkedIn: https://www.linkedin.com/in/vrushalipsawant/
    SAS blogs: https://blogs.sas.com/content/
    SAS Viya: https://www.sas.com/en_us/software/viya.html
    Learn more about the harms and near-harms caused by AI in the AI Incident Database, and how we can use the lessons learned to prevent future bad outcomes: https://incidentdatabase.ai/


    This episode was sponsored by:
    Ai+ Training https://aiplus.training/
    Home to hundreds of hours of on-demand, self-paced AI training, ODSC interviews, free webinars, and certifications in in-demand skills like LLMs and Prompt Engineering

    And created in partnership with ODSC https://odsc.com/
    The Leading AI Training Conference, featuring expert-led, hands-on workshops, training sessions, and talks on cutting-edge AI topics.

    Never miss an episode, subscribe now!
