DataTalks.Club podcast | Listen online for free

217 episodes

Applied AI 2026 Berlin Conference Interview
19-06-2026 | 54 Min.
The conference highlighted a critical shift in the technology and engineering ecosystem, moving away from passive implementations toward autonomous AI systems, collaborative communities, and robust engineering guardrails. Discussions centered on the practical architecture required to scale AI safely, the evolution of modern developer tools, and the importance of cross-border technical collaboration. Ultimately, the insights underscored that the future of technology relies on blending rigorous infrastructure with human-centric ecosystem growth.

Florian Hönicke, an expert in engineering infrastructure, explored the operational shift in cloud services and the challenges of provisioning secure temporary access. He detailed strategies for managing transient credentials for large groups and autonomous agents using automated serverless functions without exposing long-lived access keys. His central thesis is that true engineering rigor requires deterministic, self-expiring security layers at the container level.

Stella Buhalis, a technical community and developer relations leader, addressed the human dynamics fueling open-source ecosystems and community-driven adoption. She emphasized that long-term project viability stems from structured developer onboarding and lower cognitive barriers rather than pure marketing outreach. Her key insight is that building trusted technical communities acts as the ultimate feedback loop for improving developer experience and software reliability.

Błażej Nowakowski, a backend systems architect, focused on database migration paradigms and the optimization of high-dimensional vector search at the network edge. He analyzed real-world infrastructure friction points, specifically isolating SQLite database lock conflicts and remote data sync latencies on serverless architectures. He noted that decoupling persistent remote backends from the core runtime is crucial for maintaining low-latency, multi-cloud application performance.

Alena Astrakhantseva, a talent strategy and engineering education specialist, outlined the rapid evolution of technical training as the industry shifts from traditional development to autonomous AI flows. She analyzed how continuous testing, real-time monitoring, and structured evaluation frameworks must become core competencies for new developers. Her notable perspective highlights that the next wave of technical talent must be hired for systemic engineering rigor over simple syntax mastery.

Zhen Ming Ng (Babypro), an open-source library maintainer and developer, demonstrated automation workflows for package deployment and baseline library compliance. He focused on minimizing framework overhead by substituting heavy, resource-intensive dependencies with lightweight tokenizers and compact client drivers. His core perspective is that library design must prioritize minimalism to remain functional across edge-native runtime environments.
Connect with speakers:
Florian Hönicke, Cloud Infrastructure & DevOps Engineer Specialist - https://www.linkedin.com/in/florian-h%C3%B6nicke-b902b6aa

Stella Buhalis, Developer Relations & Technical Community Lead - https://www.linkedin.com/in/stella-buhalis

Błażej Nowakowski, Backend Systems Architect & Database Engineer - https://www.linkedin.com/in/b%C5%82a%C5%BCej-nowakowski-096716168/

Alena Astrakhantseva, Technical Talent Strategist & Engineering Educator - https://www.linkedin.com/in/alenaastra/

Zhen Ming Ng (Babypro), Open Source Software Maintainer & Core Developer - https://www.linkedin.com/in/ming91/
From GenAI Pilots to Production - Nikita Kozodoi
05-06-2026 | 1 hr 3 min
In this talk, Nikita, Senior Applied Data Scientist at the AWS Generative AI Innovation Center, shares his expertise in bringing enterprise artificial intelligence out of the sandbox, from his early days optimizing traditional machine learning models like gradient boosting to deploying advanced production-grade GenAI pipelines. We explore what it really takes to move generative AI systems from pilot prototypes to production environments.

Links:
- AWS Generative AI Innovation Center: https://aws.amazon.com/ai/generative-ai/innovation-center/

You’ll learn about:
- Deploying multi-layered defenses independent of backend LLMs.
- Evaluating parameter-efficient methods like LoRA and QLoRA for small models.
- Balancing long-term domain expertise with real-time documentation retrieval.
- Utilizing multi-agent orchestration for search and anomaly explanation.
- Setting up robust LLM-as-a-judge frameworks verified by human metrics.
- Leveraging Amazon Bedrock components for memory and runtime scalability.

TIMECODES:
05:52 Shifting from traditional ML to generative AI
07:49 Hybrid pipelines blending classical ML and LLMs
11:25 Production guardrails and multi-layered system defense
16:15 Prompt bypasses, input attacks, and AI red teaming
20:49 Newsletter localization and translation with Zalando
27:24 Evaluation frameworks and human-in-the-loop metrics
33:07 Aligning LLM-as-a-judge with few-shot prompts
34:49 Fine-tuning small language models versus prompting
41:18 Complementary mechanics of RAG and fine-tuning
43:00 Agentic web search tools for anomaly explanation
47:01 Automated text generation from real-time sports sensors
49:58 AWS project scoping and proof of concept timelines
54:58 Interview requirements and career skills for AWS roles
57:59 Enterprise architecture patterns and system observability
01:00:42 Reusable infrastructure blocks on Amazon Bedrock

This session is designed for machine learning engineers, data scientists, and technical product managers looking to architect reliable, production-ready GenAI workflows. It is highly valuable for teams aiming to bridge the gap between experimental AI prototypes and secure enterprise software.

Connect with DataTalks.Club:
- Join the community - https://datatalks.club/slack.html
- Subscribe to our Google calendar to have all our events in your calendar - https://calendar.google.com/calendar/r?cid=ZjhxaWRqbnEwamhzY3A4ODA5azFlZ2hzNjBAZ3JvdXAuY2FsZW5kYXIuZ29vZ2xlLmNvbQ
- Check other upcoming events - https://lu.ma/dtc-events
- GitHub: https://github.com/DataTalksClub
- LinkedIn - https://www.linkedin.com/company/datatalks-club/
- Twitter - https://twitter.com/DataTalksClub
- Website - https://datatalks.club/

Connect with Nikita:
- Linkedin - https://www.linkedin.com/in/kozodoi/
- Github - https://github.com/kozodoi
- Website and blog - https://www.kozodoi.me/
From Notebook to Production: Building End-to-End AI Systems - Mariano Semelman
29-05-2026 | 1 hr 7 min
In this talk, Mariano, Lead Data Scientist and ML Engineer at OLX, shares his journey building high-impact AI media solutions. We explore the transition from traditional e-commerce models to Generative AI and Agentic tools, focusing on how to take AI products from a notebook to full-scale production.

You’ll learn about:
How to master the full product cycle from requirement gathering to deployment.
Using video-to-ad technology to automate car listings and seller experiences.
Essential modern tools like FastAPI, Arize, and why UV is a game-changer.
When to use LLMs versus specialized vision models like CLIP and YOLO.
Why production pipelines are moving from Jupyter notebooks to CLI tools.
How agentic coding and AI assistants are 10x-ing development speed.

TIMECODES:
0:00 Community Introduction and Slack Engagement
4:16 Career Journey: From Argentina to Barcelona
7:16 Product-Driven AI vs. Traditional Reporting
9:41 AI Media Solutions for E-Commerce Sellers
10:55 Video-to-Ad: The Future of Marketplaces
13:45 Automated Content Creation for Sellers
17:10 Defining End-to-End Ownership in Data Science
21:12 The Longevity of the CRISP-DM Framework
25:33 Impact of Agentic Coding and GitHub Copilot
31:42 Why LLMs Aren't Always the Best Solution
37:39 Translating Business Needs to ML Requirements
41:18 Managing Explicit and Implicit Feedback Loops
48:26 Architecture Deep Dive: Image Description Logic
55:28 The Declining Role of Notebooks in Production
1:02:53 The Modern Tech Stack: FastAPI, UV, and Arize

Connect with Mariano:
Linkedin - https://www.linkedin.com/in/msemelman/
Data Makers Fest 2026 Conference Interviews
22-05-2026 | 1 hr 6 min
At Data Makers Fest, a recurring theme was the tension between GenAI hype and production reality. Speakers stressed that classical ML, MLOps, evaluation, data quality, and governance remain essential—especially in regulated sectors like fintech and healthcare. Another strong theme was inclusivity: building AI that serves smaller languages, diverse communities, and practitioners beyond the English-centric ecosystem.

Ryan Chaves. Head of ML at a Dutch fintech, Ryan focused on the gap between AI demos and production systems. He argued that classical ML remains critical for fraud detection and risk scoring, while GenAI works best as an accelerator on top of existing systems. He also emphasized storytelling, stakeholder communication, and mentorship as core engineering skills.
Alp Öktem. Computational linguist and researcher Alp explored the imbalance between AI progress in English and low-resource languages. Through Mozilla Data Collective, he highlighted how open datasets, speech corpora, and synthetic data can expand AI access to underrepresented communities. His broader warning: fluent AI can still fail culturally, linguistically, and ethically.
Agnieszka Kamińska. Working in pharmaceutical ML engineering, Agnieszka discussed extracting scientific knowledge from research documents into knowledge graphs. Her focus was reliability: LLMs help with entity extraction and relationship discovery, but trustworthy systems still require ontologies, validation layers, and production-minded engineering. She advocated a pragmatic middle ground between AI hype and skepticism.
Nemanja Radojković. An MLOps engineer in finance, Nemanja reflected on how GenAI is changing software engineering itself. He argued that coding assistants improve productivity but risk weakening engineers’ understanding if overused. His central point: governance, reproducibility, and platform engineering will become even more important as organizations deploy AI agents at scale.
Filipa Castro. Leading AI initiatives at Euronext, Filipa described how GenAI is integrated into regulated financial workflows. Her team uses LLMs to automate document-heavy operational processes while preserving human validation. Her broader message: successful enterprise AI depends less on flashy models and more on infrastructure foundations like CI/CD, monitoring, governance, and operational rigor.
Beatriz Silva. As a student volunteer pursuing a master’s in data science, Beatriz represented the conference’s educational and community dimension. For her, the event was about access—networking with companies, exploring thesis opportunities, and connecting academic learning with industry practice. Her perspective highlighted how conferences like Data Makers Fest help shape the next generation of AI practitioners.

Connect with speakers:
Ryan Chaves. Head of Machine Learning at a Dutch fintech focused on fraud detection, risk systems, and production ML. LinkedIn
Alp Öktem. Computational linguist and researcher focused on low-resource languages, inclusive AI, and open language datasets. LinkedIn
Agnieszka Kamińska. Machine Learning Engineer working on scientific knowledge extraction, knowledge graphs, and AI systems in pharma. LinkedIn
Nemanja Radojković. Senior MLOps Engineer specializing in regulated financial systems, AI governance, and platform engineering. LinkedIn
Filipa Castro. AI Lead at Euronext focused on enterprise GenAI systems, operational AI strategy, and financial services automation. LinkedIn
Beatriz Silva. Data science master’s student and conference volunteer exploring opportunities in ML and computer vision. LinkedIn
Competitions: Beyond the Kaggle Leaderboard - Tatiana Habruseva
01-05-2026 | 1 hr 5 min
In this talk, Tatiana, Staff Software Engineer at LinkedIn, shares her journey from academic physics to becoming a Kaggle Master and winning the Sound Demixing Challenge. We explore how to use machine learning competitions as a strategic tool to build a high-impact career and bridge the gap between theory and production.

You’ll learn about:
Turning competition code into professional GitHub repos.
Converting results into papers for NIPS and CVPR.
How LLMs are changing the benchmark for AI competitions.
Why hands-on implementation beats passive learning.
Using Topcoder and AI Crowd for research-driven goals.
Practical steps for your very first model submission.

Links:
Rise: 3 Practical Steps for Advancing Your Career, Standing Out as a Leader, and Liking Your Life. By Patty Azzarello https://www.porchlightbooks.com/pages/author/Patty_Azzarello-16156396 - awesome book about why doing good is not enough, and what else you need to do to promote your career (same applies to competitions)
AICrowd - https://www.aicrowd.com/challenges
Grand challenges - https://grand-challenge.org/challenges/
Kaggle competitions - https://www.kaggle.com/competitions
TopCoder challenge SpaceNet 9 - https://www.topcoder.com/challenges/9620f66a-767e-40ac-81d5-5cc61274b186 (no competitions are active at the moment, but new ones appear from time to time)
Medium blog post with instruction - https://medium.com/data-science/writing-papers-tech-reports-after-kaggle-competitions-ee504fc0c4c1
Kaggle Solution Write-Up Documentation - https://www.kaggle.com/solution-write-up-documentation
Evaluating Machine Learning Agents on Machine Learning Engineering - https://arxiv.org/abs/2410.07095
Machine Learning Engineering Agent via Search and Targeted Refinement - https://arxiv.org/html/2506.15692v2
AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench - https://arxiv.org/pdf/2507.02554

TIMECODES:
00:00 Tatiana’s journey from academia to staff software engineer
06:01 Machine learning applications in physics and signal processing
09:13 Skill development and domain diversification on Kaggle
13:35 Agentic AI benchmarks and automated competition entries
17:43 Deep technical mastery versus leaderboard gamification
23:04 Hands-on implementation and the illusion of learning
26:01 Specialized platforms and fair competition environments
31:35 Academic publications and research from silver medals
35:24 GitHub repositories and engineering portfolio building
39:02 Technical marketing via blog posts and LinkedIn
43:25 Innovative approaches for academic conference submissions
47:21 Research challenges at NIPS and CVPR workshops
52:51 Medical imaging platforms and specialized recommendations
57:46 First submission strategies for beginners
01:00:56 Asynchronous collaboration and competition team dynamics

Perfect for data scientists and engineers looking to transition from academia or build a formal portfolio using Kaggle as a career-advancement tool.

Connect with Tatiana:
Linkedin - https://www.linkedin.com/in/tatigabru/