Daily Paper Cast

A podcast by Jingwen Liang and Gengyu Wang

Available episodes

5 of 508
  • Analyze Feature Flow to Enhance Interpretation and Steering in Language Models
    🤗 Upvotes: 41 | Categories: cs.LG, cs.CL
    Authors: Daniil Laptev, Nikita Balagansky, Yaroslav Aksenov, Daniil Gavrilov
    arXiv: http://arxiv.org/abs/2502.03032v2
    Abstract: We introduce a new approach to systematically map features discovered by sparse autoencoders across consecutive layers of large language models, extending earlier work that examined inter-layer feature links. Using a data-free cosine similarity technique, we trace how specific features persist, transform, or first appear at each stage. This method yields granular flow graphs of feature evolution, enabling fine-grained interpretability and mechanistic insights into model computations. Crucially, we demonstrate how these cross-layer feature maps facilitate direct steering of model behavior by amplifying or suppressing chosen features, achieving targeted thematic control in text generation. Together, our findings highlight the utility of a causal, cross-layer interpretability framework that not only clarifies how features develop through forward passes but also provides new means for transparent manipulation of large language models.
    (A hedged code sketch of the cosine-matching step appears after the episode list.)
    --------  
    23:32
  • UltraIF: Advancing Instruction Following from the Wild
    🤗 Upvotes: 15 | Categories: cs.CL, cs.AI
    Authors: Kaikai An, Li Sheng, Ganqu Cui, Shuzheng Si, Ning Ding, Yu Cheng, Baobao Chang
    arXiv: http://arxiv.org/abs/2502.04153v1
    Abstract: Instruction following is what makes modern large language models (LLMs) helpful assistants. However, the key to taming LLMs on complex instructions remains mysterious, and there are large gaps between models trained by the open-source community and those trained by leading companies. To bridge the gap, we propose UltraIF, a simple and scalable approach for building LLMs that can follow complex instructions using open-source data. UltraIF first decomposes real-world user prompts into simpler queries, constraints, and corresponding evaluation questions for the constraints. We then train an UltraComposer to compose constraint-associated prompts with evaluation questions. This prompt composer lets us synthesize complicated instructions and filter responses with the evaluation questions. In our experiments, for the first time, we successfully align LLaMA-3.1-8B-Base to catch up with its instruct version on 5 instruction-following benchmarks without any benchmark information, using only an 8B model as the response generator and evaluator. The aligned model also achieves competitive scores on other benchmarks. Moreover, we show that UltraIF can further improve LLaMA-3.1-8B-Instruct through self-alignment, motivating broader use cases for the method. Our code will be available at https://github.com/kkk-an/UltraIF.
    (A hedged sketch of the decompose-compose-filter loop appears after the episode list.)
    --------  
    19:51
  • Great Models Think Alike and this Undermines AI Oversight
    🤗 Upvotes: 14 | Categories: cs.LG, cs.AI, cs.CL
    Authors: Shashwat Goel, Joschka Struber, Ilze Amanda Auzina, Karuna K Chandra, Ponnurangam Kumaraguru, Douwe Kiela, Ameya Prabhu, Matthias Bethge, Jonas Geiping
    arXiv: http://arxiv.org/abs/2502.04313v1
    Abstract: As language model (LM) capabilities advance, evaluating and supervising them at scale is getting harder for humans. There is hope that other language models can automate both of these tasks, which we refer to as "AI Oversight". We study how model similarity affects both aspects of AI oversight by proposing a probabilistic metric for LM similarity based on overlap in model mistakes. Using this metric, we first show that LLM-as-a-judge scores favor models similar to the judge, generalizing recent self-preference results. We then study training on LM annotations and find that complementary knowledge between the weak supervisor and the strong student model plays a crucial role in gains from "weak-to-strong generalization". As model capabilities increase, it becomes harder to find their mistakes, and we might defer more to AI oversight. However, we observe a concerning trend: model mistakes are becoming more similar with increasing capabilities, pointing to risks from correlated failures. Our work underscores the importance of reporting and correcting for model similarity, especially in the emerging paradigm of AI oversight.
    (A hedged sketch of an error-overlap similarity measure appears after the episode list.)
    --------  
    30:24
  • Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
    🤗 Upvotes: 14 | Categories: cs.AI, cs.LG
    Authors: Yuri Chervonyi, Trieu H. Trinh, Miroslav Olšák, Xiaomeng Yang, Hoang Nguyen, Marcelo Menegali, Junehyuk Jung, Vikas Verma, Quoc V. Le, Thang Luong
    arXiv: http://arxiv.org/abs/2502.03544v1
    Abstract: We present AlphaGeometry2, a significantly improved version of the AlphaGeometry system introduced by Trinh et al. (2024), which now surpasses an average gold medalist in solving Olympiad geometry problems. To achieve this, we first extend the original AlphaGeometry language to tackle harder problems involving movements of objects, and problems containing linear equations of angles, ratios, and distances. This, together with other additions, has markedly improved the coverage rate of the AlphaGeometry language on International Mathematical Olympiad (IMO) 2000-2024 geometry problems, from 66% to 88%. The search process of AlphaGeometry2 has also been greatly improved through the use of the Gemini architecture for better language modeling and a novel knowledge-sharing mechanism that combines multiple search trees. Together with further enhancements to the symbolic engine and synthetic data generation, we have significantly boosted the overall solving rate of AlphaGeometry2 to 84% for all geometry problems over the last 25 years, up from 54% previously. AlphaGeometry2 was also part of the system that achieved silver-medal standard at IMO 2024 (https://dpmd.ai/imo-silver). Finally, we report progress toward using AlphaGeometry2 as part of a fully automated system that reliably solves geometry problems directly from natural-language input.
    --------  
    22:33
  • Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment
    🤗 Upvotes: 14 | Categories: cs.CV, cs.CL, cs.MM, cs.SD, eess.AS, eess.IV
    Authors: Zuyan Liu, Yuhao Dong, Jiahui Wang, Ziwei Liu, Winston Hu, Jiwen Lu, Yongming Rao
    arXiv: http://arxiv.org/abs/2502.04328v1
    Abstract: Recent advances in large language models, particularly following GPT-4o, have sparked growing interest in omni-modal models capable of understanding more modalities. While some open-source alternatives have emerged, they still lag notably behind specialized single-modality models in performance. In this paper, we present Ola, an omni-modal language model that achieves competitive performance across image, video, and audio understanding compared to specialized counterparts. The core of Ola's design is its progressive modality alignment strategy, which extends the language model's supported modalities step by step. Our training pipeline begins with the most distinct modalities, image and text, then gradually expands the model's skill set with speech data, which connects language and audio knowledge, and video data, which connects all modalities. This progressive pipeline also keeps the cross-modal alignment dataset relatively small, making it easier and cheaper to develop omni-modal models from existing vision-language models. Moreover, to unlock an advanced interactive experience like GPT-4o's, we design a sentence-wise decoding solution for streaming speech generation. Extensive experiments demonstrate that Ola surpasses existing open omni-modal LLMs across all modalities while achieving highly competitive performance compared to state-of-the-art specialized models of similar size. We aim to make Ola a fully open omni-modal understanding solution to advance research in this emerging field. Model weights, code, and data are open-sourced at https://github.com/Ola-Omni/Ola.
    (A hedged sketch of the staged training schedule appears after the episode list.)
    --------  
    20:43
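
To make the first episode's method concrete: the paper matches sparse-autoencoder (SAE) features across consecutive layers by comparing decoder directions with cosine similarity, with no activation data required. Below is a minimal sketch of that matching step, assuming each layer's SAE exposes a decoder matrix of shape (n_features, d_model); the shapes and the 0.7 threshold are illustrative assumptions, not values from the paper.

```python
import numpy as np

def match_features(dec_a: np.ndarray, dec_b: np.ndarray, threshold: float = 0.7):
    """Match each SAE feature in layer L (rows of dec_a) to its most similar
    feature in layer L+1 (rows of dec_b) via cosine similarity of decoder
    directions ("data-free" because no activations are needed)."""
    a = dec_a / np.linalg.norm(dec_a, axis=1, keepdims=True)
    b = dec_b / np.linalg.norm(dec_b, axis=1, keepdims=True)
    sims = a @ b.T                      # (n_a, n_b) cosine similarities
    best = sims.argmax(axis=1)          # nearest next-layer feature per row
    best_sim = sims.max(axis=1)
    best[best_sim < threshold] = -1     # -1: no counterpart, feature ends here
    return best, best_sim

# Toy usage: two random "decoder" matrices, 16 features in a 64-dim model.
rng = np.random.default_rng(0)
links, strengths = match_features(rng.normal(size=(16, 64)),
                                  rng.normal(size=(16, 64)))
```

Chaining these matches layer by layer yields the flow graphs the abstract describes: a feature "persists" while its chain continues, and "first appears" where no earlier-layer feature linked to it.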
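The UltraIF episode outlines a decompose-compose-filter loop. The sketch below mimics one round of it with a generic `ask(prompt) -> str` LLM wrapper; the prompt templates, the `ask` helper, and the final substring check are illustrative stand-ins, not the trained UltraComposer from the paper.

```python
from typing import Callable, Optional, Tuple

def ultraif_round(prompt: str, ask: Callable[[str], str]) -> Optional[Tuple[str, str]]:
    # 1. Decompose: extract constraints and evaluation questions from the prompt.
    constraints = ask(f"List the constraints implied by this request:\n{prompt}")
    checks = ask(f"Write yes/no questions that verify these constraints:\n{constraints}")
    # 2. Compose: synthesize a harder, constraint-laden instruction.
    hard_prompt = ask("Rewrite this request so it enforces the constraints.\n"
                      f"Request: {prompt}\nConstraints: {constraints}")
    # 3. Filter: keep the (instruction, response) pair only if every check passes.
    response = ask(hard_prompt)
    verdict = ask("Answer each question with yes or no.\n"
                  f"Questions:\n{checks}\nResponse:\n{response}")
    # Crude pass/fail test, for illustration only.
    return (hard_prompt, response) if "no" not in verdict.lower() else None
```

Pairs that survive the filter would then serve as supervised training data, which is how the abstract's "synthesize complicated instructions as well as filter responses" step fits together.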
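The oversight episode's key quantity is model similarity measured by overlap in mistakes. The paper defines its own probabilistic metric; as a simpler stand-in that conveys the idea, the sketch below computes chance-adjusted agreement (Cohen's kappa) over per-item correctness, assuming only correct/incorrect labels per benchmark item are available.

```python
import numpy as np

def error_overlap_kappa(correct_a: np.ndarray, correct_b: np.ndarray) -> float:
    """Chance-adjusted agreement between two models' per-item correctness
    (boolean arrays). Near 1.0 means near-identical mistakes; near 0.0 means
    errors are roughly independent."""
    observed = np.mean(correct_a == correct_b)       # raw agreement rate
    p_a, p_b = correct_a.mean(), correct_b.mean()    # each model's accuracy
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)     # agreement expected by chance
    return float((observed - expected) / (1 - expected + 1e-12))

# Toy usage: two models that fail on mostly overlapping items score high kappa.
a = np.array([1, 1, 1, 0, 0, 1, 1, 0], dtype=bool)
b = np.array([1, 1, 1, 0, 0, 1, 0, 0], dtype=bool)
print(error_overlap_kappa(a, b))   # well above 0: correlated failures
```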
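Ola's progressive modality alignment amounts to a staged curriculum. The sketch below encodes the stage order stated in the abstract (image and text first, then speech, then video); the stage dictionaries and the `train_one_stage` callable are hypothetical placeholders, not Ola's actual training code.

```python
# Stage order follows the abstract; everything else here is a placeholder.
STAGES = [
    {"name": "vision-language", "modalities": ["image", "text"]},
    {"name": "add-speech",      "modalities": ["image", "text", "audio"]},
    {"name": "add-video",       "modalities": ["image", "text", "audio", "video"]},
]

def train_progressively(model, datasets, train_one_stage):
    """datasets: dict mapping modality name -> dataset. train_one_stage:
    callable(model, list_of_datasets) -> model. Each stage starts from the
    previous stage's weights, which is why per-stage alignment data can stay
    relatively small."""
    for stage in STAGES:
        stage_data = [datasets[m] for m in stage["modalities"]]
        model = train_one_stage(model, stage_data)
    return model
```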

About Daily Paper Cast

We publish 10 episodes every day, discussing 10 AI research papers. Both the podcast scripts and the audio are generated by AI. The 10 papers are selected from the highest-voted submissions on Hugging Face Daily Papers (https://huggingface.co/papers). Feedback and suggestions are welcome! Email us: [email protected]

Creators:
  • Jingwen Liang, 3D ML, https://www.linkedin.com/in/jingwen-liang/
  • Gengyu Wang, NLP, http://wanggengyu.com

Listen on:
  • Spotify: https://open.spotify.com/show/21nrhmdaA8qoBiH8q03NXL
  • Apple Podcasts: https://podcasts.apple.com/us/podcast/daily-paper-cast/id1777620236

Cover image by Kawen Kuang, https://kawen.art
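
For readers curious how the daily selection might be automated, here is a hedged sketch that pulls the day's papers and keeps the ten most upvoted. The `api/daily_papers` endpoint and the `paper.upvotes` field are assumptions inferred from the public page rather than a documented API, so verify both before relying on this.

```python
import requests  # third-party; pip install requests

def top_daily_papers(n: int = 10) -> list:
    # ASSUMPTION: this endpoint and its JSON shape are inferred, not documented.
    resp = requests.get("https://huggingface.co/api/daily_papers", timeout=30)
    resp.raise_for_status()
    papers = resp.json()
    # Sort by community upvotes, highest first, and keep the top n.
    papers.sort(key=lambda p: p.get("paper", {}).get("upvotes", 0), reverse=True)
    return papers[:n]
```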