On Robustness and Chain-of-Thought Consistency of RL-Finetuned VLMs
arXiv:2602.12506v1 Announce Type: new Abstract: Reinforcement learning (RL) fine-tuning has become a key technique for enhancing large language models (LLMs) on reasoning-intensive tasks, motivating its extension to vision language models (VLMs). While RL-tuned VLMs improve on visual reasoning benchmarks, they...
Bench-MFG: A Benchmark Suite for Learning in Stationary Mean Field Games
arXiv:2602.12517v1 Announce Type: new Abstract: The intersection of Mean Field Games (MFGs) and Reinforcement Learning (RL) has fostered a growing family of algorithms designed to solve large-scale multi-agent systems. However, the field currently lacks a standardized evaluation protocol, forcing researchers...
Multi-Agent Model-Based Reinforcement Learning with Joint State-Action Learned Embeddings
arXiv:2602.12520v1 Announce Type: new Abstract: Learning to coordinate many agents in partially observable and highly dynamic environments requires both informative representations and data-efficient training. To address this challenge, we present a novel model-based multi-agent reinforcement learning framework that unifies joint...
AMPS: Adaptive Modality Preference Steering via Functional Entropy
arXiv:2602.12533v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) often exhibit significant modality preference, which is a tendency to favor one modality over another. Depending on the input, they may over-rely on linguistic priors relative to visual evidence, or...
Exploring Accurate and Transparent Domain Adaptation in Predictive Healthcare via Concept-Grounded Orthogonal Inference
arXiv:2602.12542v1 Announce Type: new Abstract: Deep learning models for clinical event prediction on electronic health records (EHR) often suffer performance degradation when deployed under different data distributions. While domain adaptation (DA) methods can mitigate such shifts, their "black-box" nature prevents...
Fractional Order Federated Learning for Battery Electric Vehicle Energy Consumption Modeling
arXiv:2602.12567v1 Announce Type: new Abstract: Federated learning on connected battery electric vehicles (BEVs) faces severe instability due to intermittent connectivity, time-varying client participation, and pronounced client-to-client variation induced by diverse operating conditions. Conventional FedAvg and many advanced methods can suffer from...
VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction
arXiv:2602.12579v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a dominant paradigm for enhancing Large Language Model (LLM) reasoning, yet its reliance on external verifiers limits its scalability. Recent findings suggest that RLVR primarily functions...
Vehicle behaviour estimation for abnormal event detection using distributed fiber optic sensing
arXiv:2602.12591v1 Announce Type: new Abstract: The distributed fiber-optic sensing (DFOS) system is a cost-effective wide-area traffic monitoring technology that utilizes existing fiber infrastructure to effectively detect traffic congestion. However, detecting single-lane abnormalities that lead to congestion is still a challenge....
Block-Sample MAC-Bayes Generalization Bounds
arXiv:2602.12605v1 Announce Type: new Abstract: We present a family of novel block-sample MAC-Bayes bounds (mean approximately correct). While PAC-Bayes bounds (probably approximately correct) typically give bounds for the generalization error that hold with high probability, MAC-Bayes bounds have a similar...
Coden: Efficient Temporal Graph Neural Networks for Continuous Prediction
arXiv:2602.12613v1 Announce Type: new Abstract: Temporal Graph Neural Networks (TGNNs) are pivotal in processing dynamic graphs. However, existing TGNNs primarily target one-time predictions for a given temporal span, whereas many practical applications require continuous predictions, that is, predictions are issued frequently...
Efficient Personalized Federated PCA with Manifold Optimization for IoT Anomaly Detection
arXiv:2602.12622v1 Announce Type: new Abstract: Internet of things (IoT) networks face increasing security threats due to their distributed nature and resource constraints. Although federated learning (FL) has gained prominence as a privacy-preserving framework for distributed IoT environments, current federated principal...
Formalizing the Sampling Design Space of Diffusion-Based Generative Models via Adaptive Solvers and Wasserstein-Bounded Timesteps
arXiv:2602.12624v1 Announce Type: new Abstract: Diffusion-based generative models have achieved remarkable performance across various domains, yet their practical deployment is often limited by high sampling costs. While prior work focuses on training objectives or individual solvers, the holistic design of...
Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics
arXiv:2602.12643v1 Announce Type: new Abstract: We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with the representational strengths of model-based approaches, without incurring planning overhead. By embedding state-action pairs into a...
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts - ACL Anthology
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing - ACL Anthology
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing - ACL Anthology
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing - ACL Anthology
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts - ACL Anthology
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) - ACL Anthology
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing: Industry Track - ACL Anthology
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing - ACL Anthology
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: Industry Track - ACL Anthology
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track - ACL Anthology
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology