Data Science Deployed @dsdeployed
-
- Technology
The podcast where we tackle the big questions around deploying data science projects like:
How do I go from my laptop to production?
What happens when my dataset doesn’t fit on my laptop?
How do we (as infrastructure people) support going from research -> production workflows?
Reproducibility - how to achieve it?
Data Integrity - where was the data touched/cleaned/munged, how do we track lifecycles of data, versioning, etc.
-
Episode #9 - Linked Data with Donny Winston
This week we talk about linked data, how to get started with it, and how it is currently being used.
Donny's Linked Data GitHub Repo and Course - https://github.com/polyneme/intro-linkeddata-mongo-python
YouTube Playlist - https://youtube.com/playlist?list=PL9QvE4W_ly6NzUSUIpsGJOtFM-aot5H2q
----------------------------------------
Follow the podcast on Twitter: @dsdeployed
https://twitter.com/dsdeployed
----------------------------------------
Donny Winston
I help researchers do data-intensive science together.
Twitter: https://twitter.com/donnywinston @donnywinston
Email: donny@polyneme.xyz
Website: https://polyneme.xyz/
LinkedIn: https://www.linkedin.com/in/donnywinston/
Ben Cook
I help data science teams deploy their algorithms because a machine learning model is only as good as the system that delivers it.
Twitter: @jbencook https://twitter.com/jbencook
LinkedIn: https://www.linkedin.com/in/jbencook/
Email: ben@sparrow.dev
Website: https://sparrow.dev/
Jillian Rowe
I help biotech startups deploy scalable high performance compute infrastructure on AWS.
Email: jillian@dabbleofdevops.com
Website: https://www.dabbleofdevops.com
Twitter: www.twitter.com/jillianerowe
LinkedIn: https://www.linkedin.com/in/jillian-rowe-9410437a/
-
Episode #8 - Computer Vision Annotation with Label Studio
This week we talk about computer vision annotation platforms in general and Label Studio in particular!
Label Studio: https://labelstud.io/
Webinar Replay on Instance Segmentation - https://youtu.be/ULeWxgVH4SY
-
Episode #7 - Python Package Management with Poetry
This week we're talking about Python package management with Poetry, along with general data versioning and software stack management tips, tricks, and complaints.
Ben started us off with an article he wrote introducing Poetry - https://sparrow.dev/python-poetry-machine-learning/
-
Data Versioning for Data Science
Today we talk about data versioning: why you should do it, what to do about humans in the loop, and how to minimize mistakes.
Tools mentioned:
DVC - https://dvc.org/
Quilt Data Versioning - https://quiltdata.com/
Apache Airflow - https://airflow.apache.org/
Apache Superset - https://superset.apache.org/
OpenProject - https://www.openproject.org/
-
Episode #5 - Building Data Science Stacks like Pangeo
This week we discuss Pangeo, how they structured their project from infrastructure to data science, and how that can inform other projects.
Read more about Pangeo: https://medium.com/pangeo/pangeo-2-0-2bedf099582d
And see the showcase: https://pangeo.io/pangeo-showcase.html#pangeo-showcase
-
Episode #4 with Greg Wilson - Building Better Data Science Communities
We're talking with Greg this week about the people side of data science. He brings up tons of excellent points.
You can find out more about Greg on Twitter: https://twitter.com/gvwilson
https://github.com/gvwilson/12-design
https://merely-useful.tech/py-rse/
https://www.amazon.com/Fearless-Change-Patterns-Introducing-paperback/dp/0134395255
https://www.amazon.com/dp/B0051HSJBE/ref=dp-kindle-redirect?_encoding=UTF8&btkr=1
https://www.amazon.com/Discussion-Book-Great-People-Talking/dp/1119049717
https://producingoss.com/
https://codebender.org/