Beyond Musical Descriptors: Extracting Preference-Bearing Intent in Music Queries
arXiv:2602.12301v1 Announce Type: cross Abstract: Although annotated music descriptor datasets for user queries are increasingly common, few consider the user's intent behind these descriptors, which is essential for effectively meeting their needs. We introduce MusicRecoIntent, a manually annotated corpus of...
Sparse Autoencoders are Capable LLM Jailbreak Mitigators
arXiv:2602.12418v1 Announce Type: cross Abstract: Jailbreak attacks remain a persistent threat to large language model safety. We propose Context-Conditioned Delta Steering (CC-Delta), an SAE-based defense that identifies jailbreak-relevant sparse features by comparing token-level representations of the same harmful request with...
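The abstract is truncated, but the visible idea — isolating jailbreak-relevant sparse features by contrasting SAE activations of paired representations of the same request, then removing that direction — can be sketched as below. Everything here is illustrative: the toy random SAE, the 95th-percentile feature selection, and the subtraction step are assumptions, not the paper's CC-Delta implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_sae = 16, 64

# Toy SAE: ReLU encoder and linear decoder with random weights,
# standing in for a trained sparse autoencoder.
W_enc = rng.normal(size=(d_model, d_sae))
W_dec = rng.normal(size=(d_sae, d_model))

def sae_encode(h):
    return np.maximum(h @ W_enc, 0.0)  # sparse feature activations

# Paired residual-stream states for the same harmful request:
# one wrapped in a jailbreak template, one asked plainly (placeholders).
h_jailbreak = rng.normal(size=d_model)
h_plain = rng.normal(size=d_model)

# Features that fire much more strongly under the jailbreak wrapper
# are treated as jailbreak-relevant.
delta = sae_encode(h_jailbreak) - sae_encode(h_plain)
mask = delta > np.quantile(delta, 0.95)

# Steering: subtract the decoded contribution of those features.
steer_dir = (delta * mask) @ W_dec
h_steered = h_jailbreak - steer_dir
```

In practice the delta would be estimated on a trained SAE over many paired prompts; this only shows the contrast-then-subtract shape of the defense.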
DiffuRank: Effective Document Reranking with Diffusion Language Models
arXiv:2602.12528v1 Announce Type: cross Abstract: Recent advances in large language models (LLMs) have inspired new paradigms for document reranking. While these paradigms better exploit the reasoning and contextual understanding capabilities of LLMs, most existing LLM-based rerankers rely on autoregressive generation,...
HyperMLP: An Integrated Perspective for Sequence Modeling
arXiv:2602.12601v1 Announce Type: cross Abstract: Self-attention is often viewed as probabilistic query-key lookup, motivating designs that preserve normalized attention scores and fixed positional semantics. We advocate a simpler and more unified perspective: an autoregressive attention head can be viewed as...
VimRAG: Navigating Massive Visual Context in Retrieval-Augmented Generation via Multimodal Memory Graph
arXiv:2602.12735v1 Announce Type: cross Abstract: Effectively retrieving, reasoning over, and understanding multimodal information remains a critical challenge for agentic systems. Traditional Retrieval-augmented Generation (RAG) methods rely on linear interaction histories, which struggle to handle long-context tasks, especially those involving information-sparse yet...
Abstractive Red-Teaming of Language Model Character
arXiv:2602.12318v1 Announce Type: new Abstract: We want language model assistants to conform to a character specification, which asserts how the model should act across diverse user interactions. While models typically follow these character specifications, they can occasionally violate them in...
The Appeal and Reality of Recycling LoRAs with Adaptive Merging
arXiv:2602.12323v1 Announce Type: new Abstract: The widespread availability of fine-tuned LoRA modules for open pre-trained models has led to an interest in methods that can adaptively merge LoRAs to improve performance. These methods typically include some way of selecting LoRAs...
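The abstract describes adaptively merging fine-tuned LoRA modules; a minimal sketch of the generic recipe follows. The softmax-weighted merge of low-rank products is a common baseline, assumed here for illustration — the paper's actual selection and merging scheme is not visible in the truncated abstract.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, r = 8, 8, 2

# Three hypothetical LoRA modules (B, A pairs) targeting the same
# base weight matrix of a shared pre-trained model.
loras = [(rng.normal(size=(d_out, r)), rng.normal(size=(r, d_in)))
         for _ in range(3)]

# Adaptive weights: softmax over per-LoRA relevance scores.
# Here the scores are random placeholders; in practice they would be
# estimated from a few examples of the target task.
scores = rng.normal(size=3)
w = np.exp(scores) / np.exp(scores).sum()

# Merged update to the base weight: weighted sum of low-rank products.
delta_W = sum(w_i * B @ A for w_i, (B, A) in zip(w, loras))
```

The merged `delta_W` would then be added to the frozen base weight, so recycling LoRAs costs one weighted sum rather than any retraining.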
Wireless TokenCom: RL-Based Tokenizer Agreement for Multi-User Wireless Token Communications
arXiv:2602.12338v1 Announce Type: new Abstract: Token Communications (TokenCom) has recently emerged as an effective new paradigm, where tokens are the unified units of multimodal communications and computations, enabling efficient digital semantic- and goal-oriented communications in future wireless networks. To establish...
High-dimensional Level Set Estimation with Trust Regions and Double Acquisition Functions
arXiv:2602.12391v1 Announce Type: new Abstract: Level set estimation (LSE) classifies whether an unknown function's value exceeds a specified threshold for given inputs, a fundamental problem in many real-world applications. In active learning settings with limited initial data, we aim to...
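For context on the LSE objective, the classic "straddle" acquisition scores candidate inputs as near-threshold and uncertain; it is a standard baseline in this literature, shown below on toy surrogate posteriors. The paper's own trust-region and double-acquisition machinery is not reproduced here.

```python
import numpy as np

def straddle(mu, sigma, tau, beta=1.96):
    """Classic LSE acquisition: high where the surrogate mean is close to
    the threshold tau (|mu - tau| small) and uncertainty sigma is large."""
    return beta * sigma - np.abs(mu - tau)

# Toy surrogate posterior (mean and std) over 5 candidate inputs.
mu = np.array([0.1, 0.9, 1.1, 2.0, 1.05])
sigma = np.array([0.3, 0.2, 0.2, 0.1, 0.5])
tau = 1.0  # threshold defining the level set

# Query the candidate with the highest straddle score next.
x_next = int(np.argmax(straddle(mu, sigma, tau)))
```

Here the last candidate wins: its mean sits almost exactly on the threshold while its uncertainty is the largest, which is precisely the regime where one more evaluation most sharpens the estimated level set.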
Synthetic Interaction Data for Scalable Personalization in Large Language Models
arXiv:2602.12394v1 Announce Type: new Abstract: Personalized prompting offers significant opportunities for deploying large language models (LLMs) to diverse users, yet existing prompt optimization methods primarily focus on task-level optimization while largely overlooking user-specific preferences and the latent constraints of individual users....
Computationally sufficient statistics for Ising models
arXiv:2602.12449v1 Announce Type: new Abstract: Learning Gibbs distributions using only sufficient statistics has long been recognized as a computationally hard problem. On the other hand, computationally efficient algorithms for learning Gibbs distributions rely on access to full sample configurations generated...
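For readers unfamiliar with the setup: for an Ising model p(x) ∝ exp(Σᵢ hᵢxᵢ + Σᵢ<ⱼ Jᵢⱼxᵢxⱼ), the sufficient statistics are the per-site means and pairwise correlations of the spins, computable from sample configurations as below. This is standard background, not the paper's contribution, which concerns learning from such statistics alone.

```python
import numpy as np

# Sample spin configurations in {-1, +1}, shape (n_samples, n_sites).
X = np.array([[ 1, -1,  1],
              [ 1,  1, -1],
              [-1,  1,  1],
              [ 1,  1,  1]])

# For p(x) ∝ exp(sum_i h_i x_i + sum_{i<j} J_ij x_i x_j), the sufficient
# statistics are the first and second empirical moments of the spins:
magnetization = X.mean(axis=0)       # estimates E[x_i], paired with h_i
correlations = (X.T @ X) / len(X)    # estimates E[x_i x_j], paired with J_ij
```

The distinction the abstract draws is between having only these aggregated moments versus the full sample configurations `X` themselves.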
A Theoretical Analysis of Mamba's Training Dynamics: Filtering Relevant Features for Generalization in State Space Models
arXiv:2602.12499v1 Announce Type: new Abstract: The recent empirical success of Mamba and other selective state space models (SSMs) has renewed interest in non-attention architectures for sequence modeling, yet their theoretical foundations remain underexplored. We present a first-step analysis of generalization...
On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs
arXiv:2602.12506v1 Announce Type: new Abstract: Reinforcement learning (RL) fine-tuning has become a key technique for enhancing large language models (LLMs) on reasoning-intensive tasks, motivating its extension to vision language models (VLMs). While RL-tuned VLMs improve on visual reasoning benchmarks, they...
Multi-Agent Model-Based Reinforcement Learning with Joint State-Action Learned Embeddings
arXiv:2602.12520v1 Announce Type: new Abstract: Learning to coordinate many agents in partially observable and highly dynamic environments requires both informative representations and data-efficient training. To address this challenge, we present a novel model-based multi-agent reinforcement learning framework that unifies joint...
Analytical Results for Two Exponential Family Distributions in Hierarchical Dirichlet Processes
arXiv:2602.12527v1 Announce Type: new Abstract: The Hierarchical Dirichlet Process (HDP) provides a flexible Bayesian nonparametric framework for modeling grouped data with a shared yet unbounded collection of mixture components. While existing applications of the HDP predominantly focus on the Dirichlet-multinomial...
Flow-Factory: A Unified Framework for Reinforcement Learning in Flow-Matching Models
arXiv:2602.12529v1 Announce Type: new Abstract: Reinforcement learning has emerged as a promising paradigm for aligning diffusion and flow-matching models with human preferences, yet practitioners face fragmented codebases, model-specific implementations, and engineering complexity. We introduce Flow-Factory, a unified framework that decouples...
AMPS: Adaptive Modality Preference Steering via Functional Entropy
arXiv:2602.12533v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) often exhibit significant modality preference, which is a tendency to favor one modality over another. Depending on the input, they may over-rely on linguistic priors relative to visual evidence, or...
Exploring Accurate and Transparent Domain Adaptation in Predictive Healthcare via Concept-Grounded Orthogonal Inference
arXiv:2602.12542v1 Announce Type: new Abstract: Deep learning models for clinical event prediction on electronic health records (EHR) often suffer performance degradation when deployed under different data distributions. While domain adaptation (DA) methods can mitigate such shifts, their "black-box" nature prevents...
Fractional Order Federated Learning for Battery Electric Vehicle Energy Consumption Modeling
arXiv:2602.12567v1 Announce Type: new Abstract: Federated learning on connected battery electric vehicles (BEVs) faces severe instability due to intermittent connectivity, time-varying client participation, and pronounced client-to-client variation induced by diverse operating conditions. Conventional FedAvg and many advanced methods can suffer from...
VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction
arXiv:2602.12579v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a dominant paradigm for enhancing Large Language Model (LLM) reasoning, yet its reliance on external verifiers limits its scalability. Recent findings suggest that RLVR primarily functions...
RelBench v2: A Large-Scale Benchmark and Repository for Relational Data
arXiv:2602.12606v1 Announce Type: new Abstract: Relational deep learning (RDL) has emerged as a powerful paradigm for learning directly on relational databases by modeling entities and their relationships across multiple interconnected tables. As this paradigm evolves toward larger models and relational...
Dual-Granularity Contrastive Reward via Generated Episodic Guidance for Efficient Embodied RL
arXiv:2602.12636v1 Announce Type: new Abstract: Designing suitable rewards poses a significant challenge in reinforcement learning (RL), especially for embodied manipulation. Trajectory success rewards are suitable for human judges or model fitting, but the sparsity severely limits RL sample efficiency. While...
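The abstract is cut off before the method, but the generic shape of a contrastive dense reward — scoring a state by its similarity to a generated successful episode versus a failed one — can be sketched as follows. The embeddings and the single-similarity-difference form are placeholders; the paper's dual-granularity design is not visible here.

```python
import numpy as np

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)

# Hypothetical embeddings: the current state, plus embeddings of a
# generated successful episode and a failed one (all placeholders).
z_state = rng.normal(size=32)
z_success = z_state + 0.1 * rng.normal(size=32)  # near the success episode
z_failure = rng.normal(size=32)                  # unrelated failure episode

# Dense contrastive reward: pulled toward success, pushed from failure.
# Unlike a sparse trajectory-level success signal, this scores every step.
reward = cos(z_state, z_success) - cos(z_state, z_failure)
```

The point of such shaping is density: each intermediate state gets a graded signal, sidestepping the sparsity problem the abstract describes for trajectory-level success rewards.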
Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics
arXiv:2602.12643v1 Announce Type: new Abstract: We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with the representational strengths of model-based approaches, without incurring planning overhead. By embedding state-action pairs into a...
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts - ACL Anthology
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing - ACL Anthology
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing - ACL Anthology
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts - ACL Anthology
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology