Leland McInnes: UMAP, HDBSCAN & the Geometry of Data | Learning from Machine Learning #10
In this episode of Learning from Machine Learning, we explore the intersection of pure mathematics and modern data science with Leland McInnes, the mind behind an ecosystem of tools for unsupervised learning including UMAP, HDBSCAN, PyNNDescent and DataMapPlot. As a researcher at the Tutte Institute for Mathematics and Computing, Leland has fundamentally shaped how we approach and understand complex data.

Leland views data through a unique geometric lens, drawing from his background in algebraic topology to uncover hidden patterns and relationships within complex datasets. This perspective led to the creation of UMAP, a breakthrough in dimensionality reduction that preserves both local and global data structure to allow for incredible visualizations and clustering. Similarly, his clustering algorithm HDBSCAN tackles the messy reality of real-world data, handling varying densities and noise with remarkable effectiveness.

But perhaps what's most striking about Leland isn't just his technical achievements; it's his philosophy toward algorithm development. He champions the concept of "decomposing black box algorithms," advocating for transparency and understanding over blind implementation. By breaking down complex algorithms into their fundamental components, Leland argues, we gain the power to adapt and innovate rather than simply consume.

For those entering the field, Leland offers poignant advice: resist the urge to chase the hype. Instead, find your unique angle, even if it seems unconventional. His own journey, applying concepts from algebraic topology and fuzzy simplicial sets to data science, demonstrates how breakthrough innovations often emerge from unexpected connections.

Throughout our conversation, Leland's passion for knowledge and commitment to understanding shine through. His approach reminds us that the most powerful advances in data science often come not from following the crowd, but from diving deep into fundamentals and drawing connections across disciplines.

There's immense value in understanding the tools you use, questioning established approaches, and bringing your unique perspective to the field. As Leland shows us, sometimes the most significant breakthroughs come from seeing familiar problems through a new lens.

Resources for Leland McInnes
Leland's GitHub
UMAP
HDBSCAN
PyNNDescent
DataMapPlot
EVoC

References
Maarten Grootendorst
Learning from Machine Learning Episode 1
Vincent Warmerdam - Calmcode
Learning from Machine Learning Episode 2
Matt Rocklin
Emily Riehl - Category Theory in Context
Lorena Barba
David Spivak - Fuzzy Simplicial Sets
Improving Mapper's Robustness by Varying Resolution According to Lens-Space Density

Learning from Machine Learning
YouTube
https://mindfulmachines.substack.com/
--------
55:27
Chris Van Pelt: Machine Learning Tooling, Weights and Biases, Entrepreneurship | Learning from Machine Learning #9
In this episode, we are joined by Chris Van Pelt, co-founder of Weights & Biases and Figure Eight/CrowdFlower. Chris has played a pivotal role in the development of MLOps platforms and has dedicated the last two decades to refining ML workflows and making machine learning more accessible.

Throughout the conversation, Chris provides valuable insights into the current state of the industry. He emphasizes the significance of Weights & Biases as a powerful developer tool, empowering ML engineers to navigate the complexities of experimentation, data visualization, and model improvement. His candid reflections on the challenges of evaluating ML models and closing the gap between AI hype and reality offer a deeper understanding of the field's intricacies.

Drawing from his entrepreneurial experience co-founding two machine learning companies, Chris leaves us with lessons in resilience, innovation, and a deep appreciation for the human dimension within the tech landscape. Having used Weights & Biases for five years and watched both the tool and the company grow, I found it a genuine honor to host Chris on the show.

References and Resources
https://wandb.ai/
https://www.youtube.com/c/WeightsBiases
https://x.com/weights_biases
https://www.linkedin.com/company/wandb/
https://twitter.com/vanpelt

Resources to learn more about Learning from Machine Learning
https://www.youtube.com/@learningfrommachinelearning
https://www.linkedin.com/company/learning-from-machine-learning
https://mindfulmachines.substack.com/
https://www.linkedin.com/in/sethplevine/
https://medium.com/@levine.seth.p
--------
1:05:06
Michelle Gill: AI-Assisted Drug Discovery, NVIDIA, Biofoundation Models, Creating Applied Research Teams | Learning from Machine Learning #8
This episode features Dr. Michelle Gill, Tech Lead and Applied Research Manager at NVIDIA, working on transformative projects like BioNeMo to accelerate drug discovery through AI. Her team explores Biofoundation models to enable researchers to better perform tasks like protein folding and small molecule binding.

Michelle shares her incredible journey from wet lab biochemist to driving cutting-edge AI at NVIDIA. She discusses the overlap and differences between NLP and AI in biology, and outlines the critical need for better machine learning representations that capture the intricate dynamics of biology.

Michelle provides advice for beginners and early career professionals in the field of machine learning, emphasizing the importance of continuous learning and staying up to date with the latest tools and techniques. She also shares insights on building successful multidisciplinary teams.

After hearing her fascinating PyData NYC keynote, it was such an honor to have her on the show to discuss innovations at the intersection of biochemistry and AI.

References and Resources
https://michellelynngill.com/
Michelle Gill - Keynote - PyData NYC - https://www.youtube.com/watch?v=ATo2SzA1Pp4
AlexNet
AlphaFold - https://www.nature.com/articles/s41586-021-03819-2
OpenFold - https://www.biorxiv.org/content/10.1101/2022.11.20.517210v1
BioNeMo - https://www.nvidia.com/en-us/clara/bionemo/
NeurIPS - https://nips.cc/
Art Palmer - https://www.biochem.cuimc.columbia.edu/profile/arthur-g-palmer-iii-phd
Patrick Loria - https://chem.yale.edu/faculty/j-patrick-loria
Scott Strobel - https://chem.yale.edu/faculty/scott-strobel
Alexander Rives - https://www.forbes.com/sites/kenrickcai/2023/08/25/evolutionaryscale-ai-biotech-startup-meta-researchers-funding/?sh=648f1a1140cf
Debora Marks - https://sysbio.med.harvard.edu/debora-marks

Resources to learn more about Learning from Machine Learning
https://www.linkedin.com/company/learning-from-machine-learning
https://mindfulmachines.substack.com/
https://www.linkedin.com/in/sethplevine/
https://medium.com/@levine.seth.p
This episode features co-founder and CEO of Explosion, Ines Montani. Listen in as we discuss the evolution of the web and machine learning, the development of spaCy, Natural Language Processing vs. Natural Language Understanding, the misconceptions of starting a software company, and so much more!

Ines is a software developer working on Artificial Intelligence and Natural Language Processing technologies. She's the co-founder and CEO of Explosion, the company behind spaCy, one of the leading open-source libraries for NLP in Python, and Prodigy, an annotation tool to help create training data for machine learning models. Ines has an academic background in Communication Science, Media Studies and Linguistics and has been coding and designing websites since she was 11. She's been the keynote speaker at Python and Data Science conferences around the world.

Learning from Machine Learning, a podcast that explores more than just algorithms and data: Life lessons from the experts.

Listen on YouTube: https://youtu.be/XNFqFT-DZwo?si=Aj75TmsCyBQTyWqq
Listen on your favorite podcast platform: https://rss.com/podcasts/learning-from-machine-learning/1190862/

References in the Episode
https://explosion.ai/
https://spacy.io/
https://ines.io/
Applied NLP Thinking
Ines Montani - How to Ignore Most Startup Advice and Build a Decent Software Business
Ines Montani: Incorporating LLMs into practical NLP workflows
Ines Montani (spaCy) - Large Language Models from Prototype to Production [PyData Südwest]
Confection - https://github.com/explosion/confection

Resources to learn more about Learning from Machine Learning
https://www.linkedin.com/company/learning-from-machine-learning
https://mindfulmachines.substack.com/
https://www.linkedin.com/in/sethplevine/
https://medium.com/@levine.seth.p
--------
1:23:11
Lewis Tunstall: Hugging Face, SetFit and Reinforcement Learning | Learning from Machine Learning #6
This episode features Lewis Tunstall, machine learning engineer at Hugging Face and author of the best-selling book Natural Language Processing with Transformers. He currently focuses on one of the hottest topics in NLP right now: reinforcement learning from human feedback (RLHF). Lewis holds a PhD in quantum physics, and his research has taken him around the world and into some of the most impactful projects, including the Large Hadron Collider, the world's largest and most powerful particle accelerator. Lewis shares his unique story from Quantum Physicist to Data Scientist to Machine Learning Engineer.

Resources to learn more about Lewis Tunstall
https://www.linkedin.com/in/lewis-tunstall/
https://github.com/lewtun

References from the Episode
https://www.fast.ai/
https://jeremy.fast.ai/
SetFit - https://arxiv.org/abs/2209.11055
Proximal Policy Optimization
InstructGPT
RAFT Benchmark
Bidirectional Language Models are Also Few-Shot Learners
Nils Reimers - Sentence Transformers
Jay Alammar - Illustrated Transformer
Annotated Transformer
Moshe Wasserblat, Intel, NLP, Research Manager
Leandro von Werra, Co-Author of NLP with Transformers, Hugging Face Researcher
LMSYS - https://lmsys.org/
LoRA - Low-Rank Adaptation of Large Language Models

Resources to learn more about Learning from Machine Learning
https://www.linkedin.com/company/learning-from-machine-learning
https://mindfulmachines.substack.com/
https://www.linkedin.com/in/sethplevine/
https://medium.com/@levine.seth.p
A machine learning podcast that explores more than just algorithms and data: Life lessons from the experts. Welcome to "Learning from Machine Learning," a podcast about the insights gained from a career in the field of Machine Learning and Data Science. In each episode, industry experts, entrepreneurs and practitioners share their experiences and advice on what it takes to succeed in this rapidly evolving field. But this podcast is not just about the technical aspects of ML. It also delves into the ways machine learning is changing the world around us. From the implications of artificial intelligence to the ways machine learning is being applied in various sectors, it covers a wide range of topics relevant to anyone interested in the intersection of technology and society.

All interviews available on YouTube: https://www.youtube.com/@learningfrommachinelearning