Preference Packing: Efficient Preference Optimization for Large Language Models
arXiv:2602.24082v1 Announce Type: new Abstract: Resource-efficient training optimization techniques are becoming increasingly important as the size of large language models (LLMs) continues to grow. In particular, batch packing is commonly used in pre-training and supervised fine-tuning to achieve resource-efficient training. We propose preference packing, a method to enhance resource efficiency in training techniques that use data with different responses for the same input prompt, such as reward models or Direct Preference Optimization (DPO). Preference packing improves resource efficiency by reducing the attention operations for duplicate input prompts and decreasing KV cache memory usage. We conducted experiments on text-only datasets and image-included datasets and achieved at least 37% reduction in training time. Notably, this method can be applied alongside existing optimization techniques such as batch sorting, resulting in a 3.22x speedup.
Executive Summary
The article proposes 'preference packing', a method for improving resource efficiency in training setups where multiple responses share the same input prompt, such as reward modeling and Direct Preference Optimization (DPO). By computing attention over the duplicated prompt only once and decreasing KV cache memory usage, preference packing reduces training time by at least 37% on both text-only and image-included datasets, and reaches a 3.22x speedup when combined with existing techniques such as batch sorting. This has substantial implications for the development and deployment of large language models, particularly where computational resources are limited.
Key Points
- ▸ Preference packing reduces attention operations for duplicate input prompts
- ▸ Decreases KV cache memory usage, enhancing resource efficiency
- ▸ Achieves at least 37% reduction in training time, with potential for further speedup
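The core idea, as described in the abstract, is that a preference pair shares one prompt, so the prompt's attention can be computed once while the chosen and rejected responses are kept isolated from each other. Below is a minimal, hypothetical sketch of the kind of block attention mask this implies; the function name and layout are illustrative, not taken from the paper.

```python
def packed_attention_mask(prompt_len, chosen_len, rejected_len):
    """Boolean mask for a packed sequence [prompt | chosen | rejected]:
    mask[i][j] is True iff token i may attend to token j."""
    total = prompt_len + chosen_len + rejected_len
    mask = [[False] * total for _ in range(total)]
    # Causal attention within the shared prompt (encoded once, not duplicated).
    for i in range(prompt_len):
        for j in range(i + 1):
            mask[i][j] = True
    # Each response attends to the shared prompt plus its own causal prefix,
    # but never to the other response.
    spans = [(prompt_len, prompt_len + chosen_len),
             (prompt_len + chosen_len, total)]
    for start, end in spans:
        for i in range(start, end):
            for j in range(prompt_len):
                mask[i][j] = True   # shared prompt is visible
            for j in range(start, i + 1):
                mask[i][j] = True   # own causal prefix

    return mask

mask = packed_attention_mask(prompt_len=3, chosen_len=2, rejected_len=2)
assert mask[3][0] and mask[5][0]    # both responses see the prompt
assert not mask[5][3]               # rejected cannot attend to chosen
assert not mask[3][5]               # chosen cannot attend to rejected
```

Compared with the naive layout, which stores and attends over two full copies of the prompt (one per response), a mask like this stores the prompt's KV entries once, which is consistent with the paper's claimed reductions in attention operations and KV cache memory.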
Merits
Improved Resource Efficiency
Preference packing optimizes computational resources, enabling faster and more efficient training of large language models
Demerits
Limited Generalizability
The method's effectiveness may be limited to specific applications or datasets, requiring further research to fully explore its potential
Expert Commentary
The proposed preference packing method represents a significant advancement in optimizing resource efficiency for large language models. By addressing the computational bottlenecks associated with duplicate input prompts, this innovation has the potential to accelerate progress in natural language processing and related fields. Furthermore, the compatibility of preference packing with existing optimization techniques underscores its versatility and potential for widespread adoption. As the field continues to evolve, it is crucial to prioritize research into efficient and sustainable AI training methods, and preference packing is an important step in this direction.
Recommendations
- ✓ Further research into the applicability of preference packing across diverse datasets and applications
- ✓ Exploration of potential synergies between preference packing and other optimization techniques to maximize efficiency gains