Influence-Preserving Proxies for Gradient-Based Data Selection in LLM Fine-tuning

arXiv:2602.17835v1 | Announce Type: new

Abstract: Supervised fine-tuning (SFT) relies critically on selecting training data that most benefits a model's downstream performance. Gradient-based data selection methods such as TracIn and Influence Functions leverage influence to identify useful samples, but their computational cost scales poorly, making them impractical for multi-billion-parameter large language models (LLMs). A common alternative is to use off-the-shelf smaller models as proxies, but they remain suboptimal since their learning dynamics are unclear, their sizes cannot be flexibly adjusted, and they cannot be further aligned with the target model in terms of gradient-based influence estimation. To address these challenges, we introduce Iprox, a two-stage framework that derives influence-preserving proxies directly from the target model. It first applies a low-rank compression stage to preserve influence information of the target model, and then an aligning stage to align both model gradients and logits, thereby constructing proxies that flexibly control computational cost while retaining the target model's influence. Experimental results across diverse LLM families and evaluation tasks show that Iprox consistently outperforms off-the-shelf proxies and baseline methods. On Qwen3-4B, a 1.5B proxy constructed with Iprox achieves stronger performance than the larger 1.7B off-the-shelf proxy. Notably, on Llama3.2, Iprox achieves better performance than baselines while reducing computational cost by more than half relative to the full 3B model. These results show that Iprox provides effective influence-preserving proxies, making gradient-based data selection more scalable for LLMs.
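To make the abstract's notion of "influence" concrete, the following is a minimal sketch of TracIn-style influence scoring, the kind of gradient-based selection that Iprox aims to make affordable. The logistic-regression setup, function names, and data are illustrative assumptions, not the paper's implementation; in practice the gradients come from an LLM (or its proxy).

```python
import numpy as np

def lr_grad(w, x, y):
    """Gradient of the logistic loss for one example (label y in {0, 1})."""
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return (p - y) * x

def tracin_influence(w, x_train, y_train, x_val, y_val):
    """First-order influence of a training example on a validation example:
    the dot product of their loss gradients at the current weights w."""
    return float(lr_grad(w, x_train, y_train) @ lr_grad(w, x_val, y_val))

rng = np.random.default_rng(0)
w = rng.normal(size=4)                      # current model weights
x_val, y_val = rng.normal(size=4), 1        # a downstream example we care about

# Score candidate training examples; a higher score means training on the
# example is expected to reduce the validation loss more.
candidates = [(rng.normal(size=4), int(rng.integers(0, 2))) for _ in range(8)]
scores = [tracin_influence(w, x, y, x_val, y_val) for x, y in candidates]
selected = int(np.argmax(scores))
```

The cost problem the abstract describes is visible even here: each score requires a per-example gradient, which for a multi-billion-parameter model is prohibitively expensive, motivating a smaller proxy whose gradients approximate the target's.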

Executive Summary

This article presents Iprox, a two-stage framework that derives influence-preserving proxies for gradient-based data selection in large language model (LLM) fine-tuning. Rather than relying on off-the-shelf smaller models, Iprox builds proxies directly from the target model: a low-rank compression stage preserves the target's influence information, and an aligning stage matches the proxy's gradients and logits to the target's. Across diverse LLM families and evaluation tasks, Iprox outperforms off-the-shelf proxies and baseline methods; on Qwen3-4B, a 1.5B Iprox proxy beats a larger 1.7B off-the-shelf proxy, and on Llama3.2 it surpasses baselines while cutting computational cost by more than half relative to the full 3B model. These results make gradient-based data selection markedly more scalable for LLMs.

Key Points

  • Iprox generates influence-preserving proxies directly from the target model
  • The framework consists of a low-rank compression stage and an aligning stage
  • Iprox outperforms off-the-shelf proxies and baselines across diverse LLM families and evaluation tasks
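The first stage listed above is low-rank compression. As a hedged sketch of what such a stage might look like, the snippet below truncates a weight matrix with SVD, keeping the top-r singular directions; the rank choice and two-factor reconstruction are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def low_rank_compress(W, r):
    """Return factors (A, B) with W ~= A @ B and rank at most r."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * s[:r]   # shape (m, r): left directions scaled by singular values
    B = Vt[:r, :]          # shape (r, n): right directions
    return A, B

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 32))          # stand-in for a target-model weight matrix
A, B = low_rank_compress(W, r=8)

# Parameter count drops from m*n to r*(m+n) -- here 2048 -> 768 --
# which is what lets the proxy's size be "flexibly adjusted".
compressed = A.size + B.size
```

SVD truncation is the classical optimal rank-r approximation in the Frobenius norm, so raising r trades compute for fidelity, mirroring the flexible cost control the abstract claims.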

Merits

Scalability and Flexibility

Iprox derives proxies whose size can be flexibly adjusted, so gradient-based selection can run on a model sized to the available compute budget while still retaining the target model's influence information.

Efficiency

The cost savings come from running gradient-based selection on a compact proxy rather than the full model, while the aligning stage keeps the proxy's gradients and logits faithful to the target's. On Llama3.2, this cuts computational cost by more than half relative to the full 3B model while still outperforming baselines.

Demerits

Complexity

Iprox requires a two-stage proxy-construction step (compression, then alignment) before data selection can begin, which adds complexity to the pipeline relative to simply reusing an off-the-shelf smaller model.

Expert Commentary

The article makes a significant contribution to LLM fine-tuning, tackling the main obstacle to gradient-based data selection at scale: the cost of computing influence on multi-billion-parameter models. The added complexity of constructing a proxy is a fair concern, but the reported efficiency and accuracy gains make Iprox a compelling option for LLM researchers and practitioners. The proxy-construction idea may also generalize beyond language models, with potential applications in other areas of AI such as computer vision.

Recommendations

  • Further investigation into the application of Iprox in more diverse LLM families and evaluation tasks
  • Exploration of Iprox's potential use in other AI domains, such as computer vision and speech recognition
