100 episodes

Technology, machine learning and algorithms

Data Science at Home Francesco Gadaleta

    • Technology
    • 4.2, 57 Ratings


    Why you care about homomorphic encryption (Ep. 116)


    After deep learning, a new entry is about ready to go on stage. The usual journalists are warming up their keyboards for blogs, news feeds and tweets: in one word, hype. This time it's all about privacy and data confidentiality. The new buzzword is homomorphic encryption.
     
    Join and chat with us on the official Discord channel.
     
    Sponsors
    This episode is supported by Amethix Technologies.
    Amethix works to create and maximize the impact of the world’s leading corporations, startups, and nonprofits, so they can create a better future for everyone they serve. They are a consulting firm focused on data science, machine learning, and artificial intelligence.
     
    References
    Towards a Homomorphic Machine Learning Big Data Pipeline for the Financial Services Sector
    IBM Fully Homomorphic Encryption Toolkit for Linux

    • 18 min
    Test-First machine learning (Ep. 115)


    In this episode I speak about a testing methodology for machine learning models that are meant to be integrated into production environments.
    Don't forget to come chat with us in our Discord channel.
     
    Enjoy the show!
     
    --
    This episode is supported by Amethix Technologies.
     

    • 19 min
    GPT-3 cannot code (and never will) (Ep. 114)


    The hype around GPT-3 is alarming and gives us an awful picture of people misunderstanding artificial intelligence. In response to some comments claiming that GPT-3 will take developers' jobs, in this episode I express some personal opinions about the state of AI in generating source code (and GPT-3 in particular).
     
    If you have comments about this episode or just want to chat, come join us on the official Discord channel.
     
     
    This episode is supported by Amethix Technologies.




    • 19 min
    Make Stochastic Gradient Descent Fast Again (Ep. 113)


    There is definitely room for improvement in the stochastic gradient descent family of algorithms. In this episode I explain a relatively simple method that has been shown to improve on the Adam optimizer. But watch out! This approach does not generalize well.
    Join our Discord channel and chat with us.
     
    References
    More descent, less gradient
    Taylor Series
     

    • 20 min
    What data transformation library should I use? Pandas vs Dask vs Ray vs Modin vs Rapids (Ep. 112)


    In this episode I speak about data transformation frameworks available to data scientists who write Python code. The usual suspect is clearly Pandas, the most widely used library and de facto standard. However, when data volumes increase and distributed algorithms are in place (according to a map-reduce paradigm of computation), Pandas no longer performs as expected. Other frameworks play a role in such contexts.
    I explain which frameworks are the best equivalents to Pandas in big data contexts.
    Don't forget to join our Discord channel and comment on previous episodes or propose new ones.
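    To make the map-reduce remark in this description more concrete, here is a minimal sketch in plain Python. It is not taken from the episode: the function names (map_chunk, reduce_partials, grouped_mean) and the sample rows are hypothetical, chosen only for illustration. It computes a per-key mean by aggregating each chunk independently (map) and then merging the partial results (reduce).

```python
from collections import defaultdict


def map_chunk(rows):
    """Map step: partial [sum, count] per key for one chunk of (key, value) rows."""
    partial = defaultdict(lambda: [0.0, 0])
    for key, value in rows:
        partial[key][0] += value
        partial[key][1] += 1
    return dict(partial)


def reduce_partials(partials):
    """Reduce step: merge the partial aggregates and finish the mean per key."""
    total = defaultdict(lambda: [0.0, 0])
    for partial in partials:
        for key, (chunk_sum, chunk_count) in partial.items():
            total[key][0] += chunk_sum
            total[key][1] += chunk_count
    return {key: s / n for key, (s, n) in total.items()}


def grouped_mean(rows, chunk_size):
    """Split rows into chunks, map each chunk, then reduce the partial results."""
    chunks = [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]
    return reduce_partials(map_chunk(chunk) for chunk in chunks)


# Hypothetical sample data: (key, value) pairs.
rows = [("a", 1.0), ("b", 4.0), ("a", 3.0), ("b", 6.0), ("a", 5.0)]
result = grouped_mean(rows, chunk_size=2)
print(result)
```

    Each chunk is processed without looking at the others, so in principle the map step could run on separate cores or machines. Distributed dataframe frameworks build on this split-then-combine idea, although their actual implementations are far more involved than this sketch.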
     
    This episode is supported by Amethix Technologies.



     



    References

    Pandas: a fast, powerful, flexible and easy to use open source data analysis and manipulation tool - https://pandas.pydata.org/
    Modin: scale your pandas workflows by changing one line of code - https://github.com/modin-project/modin
    Dask: advanced parallelism for analytics - https://dask.org/
    Ray: a fast and simple framework for building and running distributed applications - https://github.com/ray-project/ray
    RAPIDS: GPU data science - https://rapids.ai/

    • 21 min
    [RB] It’s cold outside. Let’s speak about AI winter (Ep. 111)


    In this episode I speak with Filip Piekniewski about some of the most noteworthy findings in AI and machine learning in 2019. As a matter of fact, the entire field of AI has been inflated by hype and claims that are hard to believe. Many of the promises made a few years ago have proved quite hard to achieve, if not impossible. Let's stay grounded and realistic about the potential of this amazing field of research, so as not to cause disillusionment in the near future.
    Join us on our Discord channel to discuss your favorite episode and propose new ones.
     
    This episode is brought to you by Protonmail
    Click on the link in the description or go to protonmail.com/datascience and get 20% off their annual subscription.

    • 36 min

Customer Reviews

4.2 out of 5
57 Ratings


zerolagtime

Still Need A Degree

I have a BS in Computer Science. I've taken my share of math, including classes on statistics, graph theory, and optimization. I can barely keep up because my day job doesn't put me near this material. If I was working on my Masters in CS, this is great stuff. If I was trying to adapt existing projects to Big Data, this really helps me avoid some traps and pitfalls. It also helps me decode lingo from software vendors to find the liars. What it won't do is teach you Data Science.
I'd get lost for hours if Francesco uploaded his script to his blog with links to projects, papers, and products. It is a very dense lecture though that pulls valuable experiences into one place. Between that and the great improvements to the production process, I'm going to keep listening and gleaning.
In late 2019, Francesco turned the mic over to a friend with a soap box that rightly attacked recommender algorithms, but it stepped way far away from a technical, improvisational interview. Francesco has finally started to realize that ML is easily abused and I look forward to his leadership in guiding the ML industry in better methods to avoid traps imposed by data owners on their analysts.

okschu

Awesome

Good greatest

mrkchase

Good coverage of the topic

I am just beginning to learn this topic. Francesco covers it in a way I can understand most of the time. There was one episode that was completely over my head, but that's OK. His audience is made up of more than just beginners, so he needs to present a wide range of topics so the show can appeal to everyone. And who knows, at some point, I'll be able to go back to that episode and understand it.
