A Modular LLM Framework for Explainable Price Outlier Detection
arXiv:2603.20636v1 Announce Type: new Abstract: Detecting product price outliers is important for retail and e-commerce stores as erroneous or unexpectedly high prices adversely affect competitiveness, revenue, and consumer trust. Classical techniques offer simple thresholds while ignoring the rich semantic relationships...
Hear Both Sides: Efficient Multi-Agent Debate via Diversity-Aware Message Retention
arXiv:2603.20640v1 Announce Type: new Abstract: Multi-Agent Debate has emerged as a promising framework for improving the reasoning quality of large language models through iterative inter-agent communication. However, broadcasting all agent messages at every round introduces noise and redundancy that can...
Can I guess where you are from? Modeling dialectal morphosyntactic similarities in Brazilian Portuguese
arXiv:2603.20695v1 Announce Type: new Abstract: This paper investigates morphosyntactic covariation in Brazilian Portuguese (BP) to assess whether dialectal origin can be inferred from the combined behavior of linguistic variables. Focusing on four grammatical phenomena related to pronouns, correlation and clustering...
Reasoning Topology Matters: Network-of-Thought for Complex Reasoning Tasks
arXiv:2603.20730v1 Announce Type: new Abstract: Existing prompting paradigms structure LLM reasoning in limited topologies: Chain-of-Thought (CoT) produces linear traces, while Tree-of-Thought (ToT) performs branching search. Yet complex reasoning often requires merging intermediate results, revisiting hypotheses, and integrating evidence from multiple...
MzansiText and MzansiLM: An Open Corpus and Decoder-Only Language Model for South African Languages
arXiv:2603.20732v1 Announce Type: new Abstract: Decoder-only language models can be adapted to diverse tasks through instruction finetuning, but the extent to which this generalizes at small scale for low-resource languages remains unclear. We focus on the languages of South Africa,...
SozKZ: Training Efficient Small Language Models for Kazakh from Scratch
arXiv:2603.20854v1 Announce Type: new Abstract: Kazakh, a Turkic language spoken by over 22 million people, remains underserved by existing multilingual language models, which allocate minimal capacity to low-resource languages and employ tokenizers ill-suited to agglutinative morphology. We present SozKZ, a...
NoveltyAgent: Autonomous Novelty Reporting Agent with Point-wise Novelty Analysis and Self-Validation
arXiv:2603.20884v1 Announce Type: new Abstract: The exponential growth of academic publications has led to a surge in papers of varying quality, increasing the cost of paper screening. Current approaches either use novelty assessment within general AI Reviewers or repurpose DeepResearch,...
LLM Router: Prefill is All You Need
arXiv:2603.20895v1 Announce Type: new Abstract: LLMs often share comparable benchmark accuracies, but their complementary performance across task subsets suggests that an Oracle router--a theoretical selector with perfect foresight--can significantly surpass standalone model accuracy by navigating model-specific strengths. While current routers...
User Preference Modeling for Conversational LLM Agents: Weak Rewards from Retrieval-Augmented Interaction
arXiv:2603.20939v1 Announce Type: new Abstract: Large language models are increasingly used as personal assistants, yet most lack a persistent user model, forcing users to repeatedly restate preferences across sessions. We propose Vector-Adapted Retrieval Scoring (VARS), a pipeline-agnostic, frozen-backbone framework that...
DiscoUQ: Structured Disagreement Analysis for Uncertainty Quantification in LLM Agent Ensembles
arXiv:2603.20975v1 Announce Type: new Abstract: Multi-agent LLM systems, where multiple prompted instances of a language model independently answer questions, are increasingly used for complex reasoning tasks. However, existing methods for quantifying the uncertainty of their collective outputs rely on shallow...
Mitigating Selection Bias in Large Language Models via Permutation-Aware GRPO
arXiv:2603.21016v1 Announce Type: new Abstract: Large language models (LLMs) used for multiple-choice and pairwise evaluation tasks often exhibit selection bias due to non-semantic factors like option positions and label symbols. Existing inference-time debiasing is costly and may harm reasoning, while...
Reading Between the Lines: How Electronic Nonverbal Cues shape Emotion Decoding
arXiv:2603.21038v1 Announce Type: new Abstract: As text-based computer-mediated communication (CMC) increasingly structures everyday interaction, a central question re-emerges with new urgency: How do users reconstruct nonverbal expression in environments where embodied cues are absent? This paper provides a systematic, theory-driven...
Assessing the Ability of Neural TTS Systems to Model Consonant-Induced F0 Perturbation
arXiv:2603.21078v1 Announce Type: new Abstract: This study proposes a segmental-level prosodic probing framework to evaluate neural TTS models' ability to reproduce consonant-induced f0 perturbation, a fine-grained segmental-prosodic effect that reflects local articulatory mechanisms. We compare synthetic and natural speech realizations...
JointFM-0.1: A Foundation Model for Multi-Target Joint Distributional Prediction
arXiv:2603.20266v1 Announce Type: new Abstract: Despite the rapid advancements in Artificial Intelligence (AI), Stochastic Differential Equations (SDEs) remain the gold-standard formalism for modeling systems under uncertainty. However, applying SDEs in practice is fraught with challenges: modeling risk is high, calibration...
MARLIN: Multi-Agent Reinforcement Learning for Incremental DAG Discovery
arXiv:2603.20295v1 Announce Type: new Abstract: Uncovering causal structures from observational data is crucial for understanding complex systems and making informed decisions. While reinforcement learning (RL) has shown promise in identifying these structures in the form of a directed acyclic graph...
Probing the Latent World: Emergent Discrete Symbols and Physical Structure in Latent Representations
arXiv:2603.20327v1 Announce Type: new Abstract: Video world models trained with Joint Embedding Predictive Architectures (JEPA) acquire rich spatiotemporal representations by predicting masked regions in latent space rather than reconstructing pixels. This removes the visual verification pathway of generative models, creating...
Bounded Coupled AI Learning Dynamics in Tri-Hierarchical Drone Swarms
arXiv:2603.20333v1 Announce Type: new Abstract: Modern autonomous multi-agent systems combine heterogeneous learning mechanisms operating at different timescales. An open question remains: can one formally guarantee that coupled dynamics of such mechanisms stay within the admissible operational regime? This paper studies...
Interpretable Multiple Myeloma Prognosis with Observational Medical Outcomes Partnership Data
arXiv:2603.20341v1 Announce Type: new Abstract: Machine learning (ML) promises better clinical decision-making, yet opaque model behavior limits the adoption in healthcare. We propose two novel regularization techniques for ensuring the interpretability of ML models trained on real-world data. In particular,...
CAMA: Exploring Collusive Adversarial Attacks in c-MARL
arXiv:2603.20390v1 Announce Type: new Abstract: Cooperative multi-agent reinforcement learning (c-MARL) has been widely deployed in real-world applications, such as social robots, embodied intelligence, UAV swarms, etc. Nevertheless, many adversarial attacks still exist to threaten various c-MARL systems. At present, the...
Putnam 2025 Problems in Rocq using Opus 4.6 and Rocq-MCP
arXiv:2603.20405v1 Announce Type: new Abstract: We report on an experiment in which Claude Opus~4.6, equipped with a suite of Model Context Protocol (MCP) tools for the Rocq proof assistant, autonomously proved 10 of 12 problems from the 2025 Putnam Mathematical...
Thinking in Different Spaces: Domain-Specific Latent Geometry Survives Cross-Architecture Translation
arXiv:2603.20406v1 Announce Type: new Abstract: We investigate whether independently trained language models converge to geometrically compatible latent representations, and whether this compatibility can be exploited to correct model behavior at inference time without any weight updates. We learn a linear...
SLE-FNO: Single-Layer Extensions for Task-Agnostic Continual Learning in Fourier Neural Operators
arXiv:2603.20410v1 Announce Type: new Abstract: Scientific machine learning is increasingly used to build surrogate models, yet most models are trained under a restrictive assumption in which future data follow the same distribution as the training set. In practice, new experimental...
Data-driven discovery of roughness descriptors for surface characterization and intimate contact modeling of unidirectional composite tapes
arXiv:2603.20418v1 Announce Type: new Abstract: Unidirectional tapes surface roughness determines the evolution of the degree of intimate contact required for ensuring the thermoplastic molecular diffusion and the associated inter-tapes consolidation during manufacturing of composite structures. However, usual characterization of rough...
Detecting Neurovascular Instability from Multimodal Physiological Signals Using Wearable-Compatible Edge AI: A Responsible Computational Framework
arXiv:2603.20442v1 Announce Type: new Abstract: We propose Melaguard, a multimodal ML framework (Transformer-lite, 1.2M parameters, 4-head self-attention) for detecting neurovascular instability (NVI) from wearable-compatible physiological signals prior to structural stroke pathology. The model fuses heart rate variability (HRV), peripheral perfusion...
Reinforcement Learning from Multi-Source Imperfect Preferences: Best-of-Both-Regimes Regret
arXiv:2603.20453v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) replaces hard-to-specify rewards with pairwise trajectory preferences, yet regret-oriented theory often assumes that preference labels are generated consistently from a single ground-truth objective. In practical RLHF systems, however, feedback...
Spatio-Temporal Grid Intelligence: A Hybrid Graph Neural Network and LSTM Framework for Robust Electricity Theft Detection
arXiv:2603.20488v1 Announce Type: new Abstract: Electricity theft, or non-technical loss (NTL), presents a persistent threat to global power systems, driving significant financial deficits and compromising grid stability. Conventional detection methodologies, predominantly reactive and meter-centric, often fail to capture the complex...
Delightful Distributed Policy Gradient
arXiv:2603.20521v1 Announce Type: new Abstract: Distributed reinforcement learning trains on data from stale, buggy, or mismatched actors, producing actions with high surprisal (negative log-probability) under the learner's policy. The core difficulty is not surprising data per se, but \emph{negative learning...
Does This Gradient Spark Joy?
arXiv:2603.20526v1 Announce Type: new Abstract: Policy gradient computes a backward pass for every sample, even though the backward pass is expensive and most samples carry little learning value. The Delightful Policy Gradient (DG) provides a forward-pass signal of learning value:...
Understanding Behavior Cloning with Action Quantization
arXiv:2603.20538v1 Announce Type: new Abstract: Behavior cloning is a fundamental paradigm in machine learning, enabling policy learning from expert demonstrations across robotics, autonomous driving, and generative models. Autoregressive models like transformer have proven remarkably effective, from large language models (LLMs)...
LJ-Bench: Ontology-Based Benchmark for U.S. Crime
arXiv:2603.20572v1 Announce Type: new Abstract: The potential of Large Language Models (LLMs) to provide harmful information remains a significant concern due to the vast breadth of illegal queries they may encounter. Unfortunately, existing benchmarks only focus on a handful types...