Machine Learning Engineered

Charlie You
32 episodes

  • Diving Deep into Synthetic Data with Alex Watson of Gretel.ai

    20-04-2021 | 1 hr 19 min
    Alex Watson is the co-founder and CEO of Gretel.ai, a startup that offers APIs for creating anonymized and synthetic datasets. Previously he was the founder of Harvest.ai, whose product Macie, an analytics platform protecting against data breaches, was acquired by AWS.
    Learn more about Alex and Gretel AI:
    http://gretel.ai
    Every Thursday I send out the most useful things I’ve learned, curated specifically for the busy machine learning engineer. Sign up here: https://www.cyou.ai/newsletter

    Follow Charlie on Twitter: https://twitter.com/CharlieYouAI
    Subscribe to ML Engineered: https://mlengineered.com/listen
    Comments? Questions? Submit them here: http://bit.ly/mle-survey
    Take the Giving What We Can Pledge: https://www.givingwhatwecan.org/

    Timestamps:
    02:15 Introducing Alex Watson
    03:45 How Alex was first exposed to programming
    05:00 Alex's experience starting Harvest AI, getting acquired by AWS, and integrating their product at massive scale
    21:20 How Alex first saw the opportunity for Gretel.ai
    24:20 The most exciting use-cases for synthetic data
    28:55 Theoretical guarantees of anonymized data with differential privacy
    36:40 Combining pre-training with synthetic data
    38:40 When to anonymize data and when to synthesize it
    41:25 How Gretel's synthetic data engine works
    44:50 Requirements of a dataset to create a synthetic version
    49:25 Augmenting datasets with synthetic examples to address representation bias
    52:45 How Alex recommends teams get started with Gretel.ai
    59:00 Expected accuracy loss from training models on synthetic data
    01:03:15 Biggest surprises from building Gretel.ai
    01:05:25 Organizational patterns for protecting sensitive data
    01:07:40 Alex's vision for Gretel's data catalog
    01:11:15 Rapid fire questions

    Links:
    Gretel.ai Blog
    NetFlix Cancels Recommendation Contest After Privacy Lawsuit
    Greylock - The Github of Data
    Improving massively imbalanced datasets in machine learning with synthetic data
    Deep dive on generating synthetic data for Healthcare
    Gretel’s New Synthetic Performance Report
    The...
  • A Practical Approach to Learning Machine Learning with Radek Osmulski (Earth Species Project)

    30-03-2021 | 1 hr 38 min
    Radek Osmulski is a fully self-taught machine learning engineer. After getting tired of his corporate job, he taught himself programming and started a new career as a Ruby on Rails developer. He then set out to learn machine learning. Since then, he's been a Fast AI International Fellow, become a Kaggle Master, and is now an AI Data Engineer on the Earth Species Project.
    Learn more about Radek:
    https://www.radekosmulski.com
    https://twitter.com/radekosmulski

    Timestamps:
    02:15 How Radek got interested in programming and computer science
    09:00 How Radek taught himself machine learning
    26:40 The skills Radek learned from Fast AI
    39:20 Radek's recommendations for people learning ML now
    51:30 Why Radek is writing a book
    01:01:20 Radek's work at the Earth Species Project
    01:10:15 How the ESP collects animal language data
    01:21:05 Rapid fire questions

    Links:
    Radek's Book "Meta-Learning"
    Andrew Ng ML Coursera
    Fast AI
    Universal Language Model Fine-tuning for Text Classification
    How to do Machine Learning Efficiently
    NPR - Two Heartbeats a Minute
    Earth Species Project
    A Guide to the Good Life
    The Origin of Wealth
    Make Time
    You Are Here
  • From Data Science Leader to ML Researcher with Rodrigo Rivera (Skoltech ADASE, Samsung NEXT)

    23-03-2021 | 1 hr 23 min
    Rodrigo Rivera is a machine learning researcher in the Advanced Data Analytics in Science and Engineering Group at Skoltech and technical director of Samsung NEXT. He has previously held data science and research leadership roles at companies around the world, including Rocket Internet and Philip Morris.
    Learn more about Rodrigo:
    https://rodrigo-rivera.com/
    https://twitter.com/rodrigorivr

    Timestamps:
    03:00 How Rodrigo got started in computer science and started his first company
    10:40 Rodrigo's experiences leading data science teams at Rocket Internet and PMI
    26:15 Leaving industry to get a PhD in machine learning
    28:55 Data science collaboration between business and academia
    32:45 Rodrigo's research interest in time series data
    39:25 Topological data analysis
    45:35 Framing effective research as a startup
    48:15 Neural Prophet
    01:04:10 The potential future of Julia for numerical computing
    01:08:20 Most exciting opportunities for ML in industry
    01:15:05 Rodrigo's advice for listeners
    01:17:00 Rapid fire questions

    Links:
    Rodrigo's Google Scholar
    Advanced Data Analytics in Science and Engineering Group
    Neural Prophet
    M-Competitions
    Machine Learning Refined
    Foundations of Machine Learning
    A First Course in Machine Learning
  • The Future of ML and AI Infrastructure and Ethics with Dan Jeffries (Pachyderm, AI Infrastructure Alliance)

    16-03-2021 | 1 hr 36 min
    Dan Jeffries is the chief technical evangelist at Pachyderm, a leading data science platform. He's a prominent writer and speaker on all things related to the future. He's been in software for over two decades, many of them at Red Hat, and is the founder of the AI Infrastructure Alliance and Practical AI Ethics.
    Learn more about Dan:
    https://twitter.com/Dan_Jeffries1
    https://medium.com/@dan.jeffries

    Timestamps:
    02:15 How Dan got started in computer science
    06:50 What Dan is most excited about in AI
    14:45 Where we are in the adoption curve of ML
    20:40 The "Canonical Stack" of ML
    32:00 Dan's goal for the AI Infrastructure Alliance
    40:55 "Problems that ML startups don't know they're going to have"
    49:00 Closed vs open source tools in the Canonical Stack
    01:00:05 Building out the "boring" part of the infrastructure to enable exciting applications
    01:08:40 Dan's practical approach to AI Ethics
    01:23:50 Rapid fire questions

    Links:
    Pachyderm
    AI Infrastructure Alliance
    Practical AI Ethics Alliance
    Rise of the Canonical Stack in Machine Learning
    Rise of AI - The Age of AI in 2030
    Google Magenta
    AlphaGo Documentary
    Thinking in Bets
    A History of the World in 6 Glasses
    Super-Thinking
  • Developing Feast, the Leading Open Source Feature Store, with Willem Pienaar (Gojek, Tecton)

    09-03-2021 | 1 hr 11 min
    Willem Pienaar is the co-creator of Feast, the leading open source feature store, whose development he leads as a tech lead at Tecton. Previously, he led the ML platform team at Gojek, a super-app in Southeast Asia.
    Learn more:
    https://twitter.com/willpienaar
    https://feast.dev/

    Timestamps:
    02:15 How Willem got started in computer science
    03:40 Paying for college by starting an ISP
    05:25 Willem's experience creating Gojek's ML platform
    21:45 Issues faced that led to the creation of Feast
    26:45 Lessons learned building Feast
    33:45 Integrating Feast with data quality monitoring tools
    40:10 What it looks like for a team to adopt Feast
    44:20 Feast's current integrations and future roadmap
    46:05 How a data scientist would use Feast when creating a model
    49:40 How the feature store pattern handles DAGs of models
    52:00 Priorities for a startup's data infrastructure
    55:00 Integrating with Amundsen, Lyft's data catalog
    57:15 The evolution of data and MLOps tool standards for interoperability
    01:01:35 Other tools in the modern data stack
    01:04:30 The interplay between open and closed source offerings

    Links:
    Feast's GitHub
    Gojek Data Science Blog
    Data Build Tool (DBT)
    TensorFlow Data Validation (TFDV)
    A State of Feast
    Google BigQuery
    Lyft Amundsen
    Cortex
    Kubeflow
    MLflow

About Machine Learning Engineered

This podcast helps Machine Learning Engineers become the best at what they do. Join host Charlie You every week as he talks to the brightest minds in data science, artificial intelligence, and software engineering to discover how they bring cutting edge research out of the lab and into products that people love. You'll learn the skills, tools, and best practices you can use to build better ML systems and accelerate your career in this flourishing new field.