Unleashing Low-Bit Inference on Ascend NPUs: A Comprehensive Evaluation of HiFloat Formats
arXiv:2602.12635v1 Announce Type: new Abstract: As LLMs scale, low-bit floating-point formats like MXFP and NVFP4 offer new opportunities for precision and efficiency. In this work, we evaluate HiFloat (HiF8 and HiF4), a family of formats tailored for Ascend NPUs. Through...
Aspect-Based Sentiment Analysis for Future Tourism Experiences: A BERT-MoE Framework for Persian User Reviews
arXiv:2602.12778v1 Announce Type: new Abstract: This study advances aspect-based sentiment analysis (ABSA) for Persian-language user reviews in the tourism domain, addressing challenges of low-resource languages. We propose a hybrid BERT-based model with Top-K routing and auxiliary losses to mitigate routing...
RAT-Bench: A Comprehensive Benchmark for Text Anonymization
arXiv:2602.12806v1 Announce Type: new Abstract: Data containing personal information is increasingly used to train, fine-tune, or query Large Language Models (LLMs). Text is typically scrubbed of identifying information prior to use, often with tools such as Microsoft's Presidio or Anthropic's...
AIWizards at MULTIPRIDE: A Hierarchical Approach to Slur Reclamation Detection
arXiv:2602.12818v1 Announce Type: new Abstract: Detecting reclaimed slurs represents a fundamental challenge for hate speech detection systems, as the same lexical items can function either as abusive expressions or as in-group affirmations depending on social identity and context. In this...
TraceBack: Multi-Agent Decomposition for Fine-Grained Table Attribution
arXiv:2602.13059v1 Announce Type: new Abstract: Question answering (QA) over structured tables requires not only accurate answers but also transparency about which cells support them. Existing table QA systems rarely provide fine-grained attribution, so even correct answers often lack verifiable grounding,...
SCOPE: Selective Conformal Optimized Pairwise LLM Judging
arXiv:2602.13110v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used as judges to replace costly human preference labels in pairwise evaluation. Despite their practicality, LLM judges remain prone to miscalibration and systematic biases. This paper proposes SCOPE (Selective...
The Appeal and Reality of Recycling LoRAs with Adaptive Merging
arXiv:2602.12323v1 Announce Type: new Abstract: The widespread availability of fine-tuned LoRA modules for open pre-trained models has led to an interest in methods that can adaptively merge LoRAs to improve performance. These methods typically include some way of selecting LoRAs...
Deep Doubly Debiased Longitudinal Effect Estimation with ICE G-Computation
arXiv:2602.12379v1 Announce Type: new Abstract: Estimating longitudinal treatment effects is essential for sequential decision-making but is challenging due to treatment-confounder feedback. While Iterative Conditional Expectation (ICE) G-computation offers a principled approach, its recursive structure suffers from error propagation, corrupting the...
Synthetic Interaction Data for Scalable Personalization in Large Language Models
arXiv:2602.12394v1 Announce Type: new Abstract: Personalized prompting offers large opportunities for deploying large language models (LLMs) to diverse users, yet existing prompt optimization methods primarily focus on task-level optimization while largely overlooking user-specific preferences and latent constraints of individual users....
Regularized Meta-Learning for Improved Generalization
arXiv:2602.12469v1 Announce Type: new Abstract: Deep ensemble methods often improve predictive performance, yet they suffer from three practical limitations: redundancy among base models that inflates computational cost and degrades conditioning, unstable weighting under multicollinearity, and overfitting in meta-learning pipelines. We...
Geometric separation and constructive universal approximation with two hidden layers
arXiv:2602.12482v1 Announce Type: new Abstract: We give a geometric construction of neural networks that separate disjoint compact subsets of $\Bbb R^n$, and use it to obtain a constructive universal approximation theorem. Specifically, we show that networks with two hidden layers...
Analytical Results for Two Exponential Family Distributions in Hierarchical Dirichlet Processes
arXiv:2602.12527v1 Announce Type: new Abstract: The Hierarchical Dirichlet Process (HDP) provides a flexible Bayesian nonparametric framework for modeling grouped data with a shared yet unbounded collection of mixture components. While existing applications of the HDP predominantly focus on the Dirichlet-multinomial...
AMPS: Adaptive Modality Preference Steering via Functional Entropy
arXiv:2602.12533v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) often exhibit significant modality preference, which is a tendency to favor one modality over another. Depending on the input, they may over-rely on linguistic priors relative to visual evidence, or...
Fractional Order Federated Learning for Battery Electric Vehicle Energy Consumption Modeling
arXiv:2602.12567v1 Announce Type: new Abstract: Federated learning on connected electric vehicles (BEVs) faces severe instability due to intermittent connectivity, time-varying client participation, and pronounced client-to-client variation induced by diverse operating conditions. Conventional FedAvg and many advanced methods can suffer from...
VI-CuRL: Stabilizing Verifier-Independent RL Reasoning via Confidence-Guided Variance Reduction
arXiv:2602.12579v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a dominant paradigm for enhancing Large Language Models (LLMs) reasoning, yet its reliance on external verifiers limits its scalability. Recent findings suggest that RLVR primarily functions...
Power Interpretable Causal ODE Networks: A Unified Model for Explainable Anomaly Detection and Root Cause Analysis in Power Systems
arXiv:2602.12592v1 Announce Type: new Abstract: Anomaly detection and root cause analysis (RCA) are critical for ensuring the safety and resilience of cyber-physical systems such as power grids. However, existing machine learning models for time series anomaly detection often operate as...
Block-Sample MAC-Bayes Generalization Bounds
arXiv:2602.12605v1 Announce Type: new Abstract: We present a family of novel block-sample MAC-Bayes bounds (mean approximately correct). While PAC-Bayes bounds (probably approximately correct) typically give bounds for the generalization error that hold with high probability, MAC-Bayes bounds have a similar...
acl-org/acl-anthology
Data and software for building the ACL Anthology. Contribute to acl-org/acl-anthology development by creating an account on GitHub.
Deed - Attribution-NonCommercial-ShareAlike 3.0 Unported - Creative Commons
Metaphors we judge (AI) by: a rhetorical analysis of artificial copyright disputes
Abstract: This article is a ‘metaphorical’ guide to today’s most pressing artificial intelligence (AI) copyright questions, focusing in particular on the EU and the USA. Is unauthorized training on copyright-protected works permitted? Can AI models copy? And is AI-generated output...
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing - ACL Anthology
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations - ACL Anthology