100 episodes

Technology, machine learning and algorithms

Data Science at Home, by Francesco Gadaleta

    • Technology

    Compressing deep learning models: rewinding (Ep.105)

    As a continuation of the previous episode, in this one I cover compressing deep learning models and explain another simple yet effective approach that can lead to much smaller models that still perform as well as the original one.
    Don't forget to join our Slack channel and discuss previous episodes or propose new ones.
    This episode is supported by Pryml.io. Pryml is an enterprise-scale platform to synthesise data and deploy applications built on that data back to a production environment.
     
    References
    Comparing Rewinding and Fine-tuning in Neural Network Pruning https://arxiv.org/abs/2003.02389
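The rewinding recipe can be sketched in a few lines: train the network, prune the smallest-magnitude weights, then reset the surviving weights to a snapshot taken early in training (rather than keeping their final values, as fine-tuning would) and retrain. A minimal NumPy sketch, where `w_final`, `w_early` and the sparsity level are hypothetical placeholders, not values from the episode:

```python
import numpy as np

def magnitude_prune_mask(weights, sparsity):
    """Return a 0/1 mask that zeroes out the smallest-magnitude
    fraction `sparsity` of the weights."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    threshold = np.sort(flat)[k] if k > 0 else 0.0
    return (np.abs(weights) >= threshold).astype(weights.dtype)

# Hypothetical weight tensors: final trained weights and a snapshot
# saved early in training (e.g. after the first few epochs).
rng = np.random.default_rng(0)
w_final = rng.normal(size=(4, 4))
w_early = rng.normal(size=(4, 4))

mask = magnitude_prune_mask(w_final, sparsity=0.5)

# Rewinding: surviving weights restart from the early snapshot,
# not from their final trained values (that would be fine-tuning).
w_rewound = mask * w_early
```

In the paper's setup the pruned, rewound network is then retrained, roughly replaying the original learning-rate schedule from the rewind point.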
     

    • 15 min
    Compressing deep learning models: distillation (Ep.104)

    Using large deep learning models on limited hardware or edge devices is definitely prohibitive. There are methods to compress large models by orders of magnitude while maintaining similar accuracy during inference.
    In this episode I explain one of the first such methods: knowledge distillation.
    Come join us on Slack.
    References
    Distilling the Knowledge in a Neural Network https://arxiv.org/abs/1503.02531
    Knowledge Distillation and Student-Teacher Learning for Visual Intelligence: A Review and New Outlooks https://arxiv.org/abs/2004.05937
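The core of knowledge distillation is a loss that mixes the usual cross-entropy on hard labels with a cross-entropy against the teacher's temperature-softened outputs. A minimal NumPy sketch for a single example; the function names and the hyper-parameters `T` and `alpha` are illustrative choices, not values from the episode:

```python
import numpy as np

def softmax(z, T=1.0):
    """Softmax with temperature T; higher T gives softer targets."""
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label, T=2.0, alpha=0.5):
    # Soft targets: cross-entropy between the teacher's and the
    # student's temperature-softened distributions, scaled by T^2
    # as suggested in Hinton et al.
    p_teacher = softmax(teacher_logits, T)
    p_student = softmax(student_logits, T)
    soft = -np.sum(p_teacher * np.log(p_student)) * T * T
    # Hard targets: ordinary cross-entropy with the true label.
    hard = -np.log(softmax(student_logits)[label])
    return alpha * hard + (1 - alpha) * soft
```

A student whose logits match the teacher's, on a correctly labelled example, incurs a lower loss than one that disagrees with both.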

    • 22 min
    Pandemics and the risks of collecting data (Ep. 103)

    Covid-19 is an emergency. True. Let's just not set ourselves up for another emergency, this time about privacy violation, when this one is over.
     
    Join our new Slack channel
     
    This episode is supported by Proton. You can check them out at protonmail.com or protonvpn.com.

    • 20 min
    Why average can get your predictions very wrong (ep. 102)

    Whenever people reason about the probability of events, they tend to consider the average value between two extremes. In this episode I explain, with a numerical example, why this way of approximating is wrong and dangerous.
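A minimal numerical illustration of the pitfall (not necessarily the episode's own example): averaging an uncertain rate before applying a non-linear formula gives a very different answer from averaging the outcomes, because E[f(p)] is not f(E[p]):

```python
# Probability that a component survives 10 independent trials when
# the per-trial failure rate is uncertain: 1% or 50%, each with
# probability one half (hypothetical numbers).
n = 10
p_low, p_high = 0.01, 0.50

# Wrong: average the two rates first, then compute survival.
p_avg = (p_low + p_high) / 2          # 0.255
wrong = (1 - p_avg) ** n              # about 0.053

# Right: compute survival under each scenario, then average.
right = 0.5 * (1 - p_low) ** n + 0.5 * (1 - p_high) ** n  # about 0.45
```

Averaging the rate first understates survival by almost an order of magnitude here; Jensen's inequality makes such gaps systematic whenever f is convex or concave.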
    We are moving our community to Slack. See you there!

    • 14 min
    Activate deep learning neurons faster with Dynamic RELU (ep. 101)

    In this episode I briefly explain the concept behind activation functions in deep learning. One of the most widely used activation functions is the rectified linear unit (ReLU). While there are several flavors of ReLU in the literature, in this episode I speak about a very interesting approach that keeps computational complexity low while improving performance quite consistently.
    This episode is supported by pryml.io. At Pryml we let companies share confidential data. Visit our website.
    Don't forget to join our Discord channel to propose new episodes or discuss previous ones.
    References
    Dynamic ReLU https://arxiv.org/abs/2003.10027
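The key idea in Dynamic ReLU is that the activation is a max over K linear pieces whose slopes and intercepts are produced by a small hyper-network conditioned on the input, instead of being fixed. A NumPy sketch of the spatially-shared variant, with a trivial hand-written coefficient function standing in for the learned hyper-network:

```python
import numpy as np

def dynamic_relu(x, coef_fn):
    """Max over K input-dependent linear pieces:
    y_i = max_k ( a_k(x) * x_i + b_k(x) )."""
    a, b = coef_fn(x)                        # each of shape (K,)
    pieces = a[:, None] * x[None, :] + b[:, None]  # shape (K, n)
    return pieces.max(axis=0)

# Hypothetical coefficient function: with slopes (1, 0) and
# intercepts (0, 0) regardless of the input, Dynamic ReLU
# degenerates to the ordinary static ReLU.
static_coefs = lambda x: (np.array([1.0, 0.0]), np.array([0.0, 0.0]))

x = np.array([-2.0, -0.5, 0.0, 1.5])
y = dynamic_relu(x, static_coefs)
```

In the paper the coefficients come from a squeeze-and-excitation-style branch (global average pooling followed by a tiny MLP), so the extra compute stays negligible compared with the main network.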

    • 22 min
    WARNING!! Neural networks can memorize secrets (ep. 100)

    One of the best features of neural networks and machine learning models is their ability to learn patterns from training data and apply them to unseen observations. That's where the magic is. However, there are scenarios in which the same models learn patterns so well that they can disclose some of the data they were trained on. This phenomenon goes under the name of unintended memorization, and it is extremely dangerous.
    Think about a language generator that discloses the passwords, the credit card numbers and the social security numbers of the records it has been trained on. Or more generally, think about a synthetic data generator that can disclose the training data it is trying to protect. 
    In this episode I explain why unintended memorization is a real problem in machine learning. Apart from differentially private training, there is no other way to mitigate such a problem in realistic conditions. At Pryml we are very aware of this, which is why we have been developing a synthetic data generation technology that is not affected by such an issue.
     
    This episode is supported by Harmonizely. Harmonizely lets you build your own unique scheduling page based on your availability, so you can start scheduling meetings in just a couple of minutes. Get started by connecting your online calendar and configuring your meeting preferences. Then, start sharing your scheduling page with your invitees!
     
    References

    The Secret Sharer: Evaluating and Testing Unintended Memorization in Neural Networks https://www.usenix.org/conference/usenixsecurity19/presentation/carlini
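Carlini et al. quantify unintended memorization with an "exposure" metric: insert a random canary (say, a fake credit card number) into the training data, then compare the model's score for it, e.g. log-perplexity, against many random candidates of the same format. A tiny sketch of the rank-based computation; the scoring model itself is omitted and the numbers are hypothetical:

```python
import math

def exposure(canary_score, candidate_scores):
    """Rank-based exposure sketch. Lower score means the model finds
    the sequence more likely. A memorized canary outranks every
    candidate (rank 1), giving exposure log2(N + 1); a canary the
    model never memorized lands at a middling rank, giving a low
    exposure regardless of N."""
    rank = 1 + sum(s < canary_score for s in candidate_scores)
    return math.log2(len(candidate_scores) + 1) - math.log2(rank)
```

For example, a canary scoring 0.1 against candidates scoring 0.5, 0.7 and 0.9 has rank 1 out of 4, so its exposure is 2 bits.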

    • 24 min
