MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild
arXiv:2603.17187v1 Announce Type: new Abstract: Large language model (LLM) agents are increasingly used for complex tasks, yet deployed agents often remain static, failing to adapt as user needs evolve. This creates a tension between the need for continuous service and...
Self-Conditioned Denoising for Atomistic Representation Learning
arXiv:2603.17196v1 Announce Type: new Abstract: The success of large-scale pretraining in NLP and computer vision has catalyzed growing efforts to develop analogous foundation models for the physical sciences. However, pretraining strategies using atomistic data remain underexplored. To date, large-scale supervised...
Binary Latent Protein Fitness Landscapes for Quantum Annealing Optimization
arXiv:2603.17247v1 Announce Type: new Abstract: We propose Q-BIOLAT, a framework for modeling and optimizing protein fitness landscapes in binary latent spaces. Starting from protein sequences, we leverage pretrained protein language models to obtain continuous embeddings, which are then transformed into...
Pathology-Aware Multi-View Contrastive Learning for Patient-Independent ECG Reconstruction
arXiv:2603.17248v1 Announce Type: new Abstract: Reconstructing a 12-lead electrocardiogram (ECG) from a reduced lead set is an ill-posed inverse problem due to anatomical variability. Standard deep learning methods often ignore underlying cardiac pathology losing vital morphology in precordial leads. We...
WINFlowNets: Warm-up Integrated Networks Training of Generative Flow Networks for Robotics and Machine Fault Adaptation
arXiv:2603.17301v1 Announce Type: new Abstract: Generative Flow Networks for continuous scenarios (CFlowNets) have shown promise in solving sequential decision-making tasks by learning stochastic policies using a flow and a retrieval network. Despite their demonstrated efficiency compared to state-of-the-art Reinforcement Learning...
Variational Kernel Design for Internal Noise: Gaussian Chaos Noise, Representation Compatibility, and Reliable Deep Learning
arXiv:2603.17365v1 Announce Type: new Abstract: Internal noise in deep networks is usually inherited from heuristics such as dropout, hard masking, or additive perturbation. We ask two questions: what correlation geometry should internal noise have, and is the implemented perturbation compatible...
TimeAPN: Adaptive Amplitude-Phase Non-Stationarity Normalization for Time Series Forecasting
arXiv:2603.17436v1 Announce Type: new Abstract: Non-stationarity is a fundamental challenge in multivariate long-term time series forecasting, often manifested as rapid changes in amplitude and phase. These variations lead to severe distribution shifts and consequently degrade predictive performance. Existing normalization-based methods...
Auto-Unrolled Proximal Gradient Descent: An AutoML Approach to Interpretable Waveform Optimization
arXiv:2603.17478v1 Announce Type: new Abstract: This study explores the combination of automated machine learning (AutoML) with model-based deep unfolding (DU) for optimizing wireless beamforming and waveforms. We convert the iterative proximal gradient descent (PGD) algorithm into a deep neural network,...
QuantFL: Sustainable Federated Learning for Edge IoT via Pre-Trained Model Quantisation
arXiv:2603.17507v1 Announce Type: new Abstract: Federated Learning (FL) enables privacy-preserving intelligence on Internet of Things (IoT) devices but incurs a significant carbon footprint due to the high energy cost of frequent uplink transmission. While pre-trained models are increasingly available on...
The leaderboard “you can’t game,” funded by the companies it ranks
Artificial intelligence models are multiplying fast, and competition is stiff. With so many players crowding the space, which one will be the best — and who decides that? Arena, formerly LM Arena, has emerged as the de facto public leaderboard...
The PhD students who became the judges of the AI industry
Artificial intelligence models are multiplying fast, and competition is stiff. With so many players crowding the space, which one will be the best — and who decides that? Arena, formerly LM Arena, has emerged as the de facto public leaderboard...
Argumentative Human-AI Decision-Making: Toward AI Agents That Reason With Us, Not For Us
arXiv:2603.15946v1 Announce Type: new Abstract: Computational argumentation offers formal frameworks for transparent, verifiable reasoning but has traditionally been limited by its reliance on domain-specific information and extensive feature engineering. In contrast, LLMs excel at processing unstructured text, yet their opaque...
Prose2Policy (P2P): A Practical LLM Pipeline for Translating Natural-Language Access Policies into Executable Rego
arXiv:2603.15799v1 Announce Type: new Abstract: Prose2Policy (P2P) is a LLM-based practical tool that translates natural-language access control policies (NLACPs) into executable Rego code (the policy language of Open Policy Agent, OPA). It provides a modular, end-to-end pipeline that performs policy...
An Agentic Evaluation Framework for AI-Generated Scientific Code in PETSc
arXiv:2603.15976v1 Announce Type: new Abstract: While large language models have significantly accelerated scientific code generation, comprehensively evaluating the generated code remains a major challenge. Traditional benchmarks reduce evaluation to test-case matching, an approach insufficient for library code in HPC where...
SQL-ASTRA: Alleviating Sparse Feedback in Agentic SQL via Column-Set Matching and Trajectory Aggregation
arXiv:2603.16161v1 Announce Type: new Abstract: Agentic Reinforcement Learning (RL) shows promise for complex tasks, but Text-to-SQL remains mostly restricted to single-turn paradigms. A primary bottleneck is the credit assignment problem. In traditional paradigms, rewards are determined solely by the final-turn...
COGNAC at SemEval-2026 Task 5: LLM Ensembles for Human-Level Word Sense Plausibility Rating in Challenging Narratives
arXiv:2603.15897v1 Announce Type: new Abstract: We describe our system for SemEval-2026 Task 5, which requires rating the plausibility of given word senses of homonyms in short stories on a 5-point Likert scale. Systems are evaluated by the unweighted average of...
ARISE: Agent Reasoning with Intrinsic Skill Evolution in Hierarchical Reinforcement Learning
arXiv:2603.16060v1 Announce Type: new Abstract: The dominant paradigm for improving mathematical reasoning in language models relies on Reinforcement Learning with verifiable rewards. Yet existing methods treat each problem instance in isolation without leveraging the reusable strategies that emerge and accumulate...
NLP Occupational Emergence Analysis: How Occupations Form and Evolve in Real Time -- A Zero-Assumption Method Demonstrated on AI in the US Technology Workforce, 2022-2026
arXiv:2603.15998v1 Announce Type: new Abstract: Occupations form and evolve faster than classification systems can track. We propose that a genuine occupation is a self-reinforcing structure (a bipartite co-attractor) in which a shared professional vocabulary makes practitioners cohesive as a group,...
Quantum-Secure-By-Construction (QSC): A Paradigm Shift For Post-Quantum Agentic Intelligence
arXiv:2603.15668v1 Announce Type: new Abstract: As agentic artificial intelligence systems scale across globally distributed and long lived infrastructures, secure and policy compliant communication becomes a fundamental systems challenge. This challenge grows more serious in the quantum era, where the cryptographic...
A Dynamic Survey of Fuzzy, Intuitionistic Fuzzy, Neutrosophic, Plithogenic, and Extensional Sets
arXiv:2603.15667v1 Announce Type: new Abstract: Real-world phenomena often exhibit vagueness, partial truth, and incomplete information. To model such uncertainty in a mathematically rigorous way, many generalized set-theoretic frameworks have been introduced, including Fuzzy Sets [1], Intuitionistic Fuzzy Sets [2], Neutrosophic...
Adaptive Theory of Mind for LLM-based Multi-Agent Coordination
arXiv:2603.16264v1 Announce Type: new Abstract: Theory of Mind (ToM) refers to the ability to reason about others' mental states, and higher-order ToM involves considering that others also possess their own ToM. Equipping large language model (LLM)-driven agents with ToM has...
MAC: Multi-Agent Constitution Learning
arXiv:2603.15968v1 Announce Type: new Abstract: Constitutional AI is a method to oversee and control LLMs based on a set of rules written in natural language. These rules are typically written by human experts, but could in principle be learned automatically...
Proactive Rejection and Grounded Execution: A Dual-Stage Intent Analysis Paradigm for Safe and Efficient AIoT Smart Homes
arXiv:2603.16207v1 Announce Type: new Abstract: As Large Language Models (LLMs) transition from information providers to embodied agents in the Internet of Things (IoT), they face significant challenges regarding reliability and interaction efficiency. Direct execution of LLM-generated commands often leads to...
AsgardBench - Evaluating Visually Grounded Interactive Planning Under Minimal Feedback
arXiv:2603.15888v1 Announce Type: new Abstract: With AsgardBench we aim to evaluate visually grounded, high-level action sequence generation and interactive planning, focusing specifically on plan adaptation during execution based on visual observations rather than navigation or low-level manipulation. In the landscape...
Are Large Language Models Truly Smarter Than Humans?
arXiv:2603.16197v1 Announce Type: new Abstract: Public leaderboards increasingly suggest that large language models (LLMs) surpass human experts on benchmarks spanning academic knowledge, law, and programming. Yet most benchmarks are fully public, their questions widely mirrored across the internet, creating systematic...
Interpretable Context Methodology: Folder Structure as Agentic Architecture
arXiv:2603.16021v1 Announce Type: new Abstract: Current approaches to AI agent orchestration typically involve building multi-agent frameworks that manage context passing, memory, error handling, and step coordination through code. These frameworks work well for complex, concurrent systems. But for sequential workflows...
Agent-based imitation dynamics can yield efficiently compressed population-level vocabularies
arXiv:2603.15903v1 Announce Type: new Abstract: Natural languages have been argued to evolve under pressure to efficiently compress meanings into words by optimizing the Information Bottleneck (IB) complexity-accuracy tradeoff. However, the underlying social dynamics that could drive the optimization of a...
Selective Memory for Artificial Intelligence: Write-Time Gating with Hierarchical Archiving
arXiv:2603.15994v1 Announce Type: new Abstract: Retrieval-augmented generation stores all content indiscriminately, degrading accuracy as noise accumulates. Parametric approaches compress knowledge into weights, precluding selective updates. Neither mirrors biological memory, which gates encoding based on salience and archives rather than deletes...
DynaTrust: Defending Multi-Agent Systems Against Sleeper Agents via Dynamic Trust Graphs
arXiv:2603.15661v1 Announce Type: new Abstract: Large Language Model-based Multi-Agent Systems (MAS) have demonstrated remarkable collaborative reasoning capabilities but introduce new attack surfaces, such as the sleeper agent, which behave benignly during routine operation and gradually accumulate trust, only revealing malicious...