RadioGen3D: 3D Radio Map Generation via Adversarial Learning on Large-Scale Synthetic Data
arXiv:2602.18744v1 Announce Type: new Abstract: Radio maps are essential for efficient radio resource management in future 6G and low-altitude networks. While deep learning (DL) techniques have emerged as an efficient alternative to conventional ray-tracing for radio map estimation (RME), most...
Boosting for Vector-Valued Prediction and Conditional Density Estimation
arXiv:2602.18866v1 Announce Type: new Abstract: Despite the widespread use of boosting in structured prediction, a general theoretical understanding of aggregation beyond scalar losses remains incomplete. We study vector-valued and conditional density prediction under general divergences and identify stability conditions under...
Anthropic launches new push for enterprise agents with plug-ins for finance, engineering, and design
It's a major opportunity to grow Anthropic’s enterprise client base — and a significant threat to SaaS products currently performing those functions.
VIRAASAT: Traversing Novel Paths for Indian Cultural Reasoning
arXiv:2602.18429v1 Announce Type: new Abstract: Large Language Models (LLMs) have made significant progress in reasoning tasks across various domains such as mathematics and coding. However, their performance deteriorates in tasks requiring rich socio-cultural knowledge and diverse local contexts, particularly those...
On the Semantic and Syntactic Information Encoded in Proto-Tokens for One-Step Text Reconstruction
arXiv:2602.18301v1 Announce Type: cross Abstract: Autoregressive large language models (LLMs) generate text token-by-token, requiring n forward passes to produce a sequence of length n. Recent work, Exploring the Latent Capacity of LLMs for One-Step Text Reconstruction (Mezentsev and Oseledets), shows...
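For context on the proto-token item above, a minimal sketch of the standard token-by-token loop the abstract refers to, where generating n tokens costs n forward passes. The `dummy_model` stand-in is purely illustrative, not the paper's setup:

```python
# Token-by-token autoregressive decoding: producing a sequence of
# length n requires n forward passes, which is the cost the one-step
# reconstruction line of work tries to avoid.

def dummy_model(prefix):
    """Toy next-token predictor: returns the prefix length as the 'token'."""
    return len(prefix)

def generate(model, prompt_ids, n):
    ids = list(prompt_ids)
    for _ in range(n):            # one forward pass per generated token
        next_id = model(ids)      # forward pass over the whole prefix
        ids.append(next_id)
    return ids

print(generate(dummy_model, [101, 102], n=4))  # [101, 102, 2, 3, 4, 5]
```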
BioBridge: Bridging Proteins and Language for Enhanced Biological Reasoning with LLMs
arXiv:2602.17680v1 Announce Type: new Abstract: Existing Protein Language Models (PLMs) often suffer from limited adaptability to multiple tasks and exhibit poor generalization across diverse biological contexts. In contrast, general-purpose Large Language Models (LLMs) lack the capability to interpret protein sequences...
AnCoder: Anchored Code Generation via Discrete Diffusion Models
arXiv:2602.17688v1 Announce Type: new Abstract: Diffusion language models offer a compelling alternative to autoregressive code generation, enabling global planning and iterative refinement of complex program logic. However, existing approaches fail to respect the rigid structure of programming languages and, as...
Grassmannian Mixture-of-Experts: Concentration-Controlled Routing on Subspace Manifolds
arXiv:2602.17798v1 Announce Type: new Abstract: Mixture-of-Experts models rely on learned routers to assign tokens to experts, yet standard softmax gating provides no principled mechanism to control the tradeoff between sparsity and utilization. We propose Grassmannian MoE (GrMoE), a routing framework...
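For context on what the GrMoE abstract above critiques, a minimal numpy sketch of standard top-k softmax gating; this is generic textbook MoE routing, not the paper's Grassmannian construction:

```python
import numpy as np

# Standard softmax top-k MoE gating: a learned linear router scores each
# token against every expert, keeps the top-k scores, and renormalizes.
# Nothing here controls the sparsity/utilization tradeoff in a principled
# way, which is the gap the GrMoE abstract points at.

rng = np.random.default_rng(0)
d_model, n_experts, k = 8, 4, 2
W_router = rng.normal(size=(d_model, n_experts))  # learned router weights

def top_k_gate(x, k=k):
    logits = x @ W_router                   # (n_experts,) routing scores
    top = np.argsort(logits)[-k:]           # indices of the k best experts
    gates = np.zeros(n_experts)
    gates[top] = np.exp(logits[top] - logits[top].max())
    return gates / gates.sum(), top         # renormalized top-k weights

token = rng.normal(size=d_model)
weights, experts = top_k_gate(token)
print(experts, weights[experts])            # chosen experts and their gates
```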
Calibrated Adaptation: Bayesian Stiefel Manifold Priors for Reliable Parameter-Efficient Fine-Tuning
arXiv:2602.17809v1 Announce Type: new Abstract: Parameter-efficient fine-tuning methods such as LoRA enable practical adaptation of large language models but provide no principled uncertainty estimates, leading to poorly calibrated predictions and unreliable behavior under domain shift. We introduce Stiefel-Bayes Adapters (SBA),...
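For reference on the SBA item above, a minimal numpy sketch of the plain LoRA update the abstract starts from: the adapted weight is W + (alpha/r) B A with a single point estimate for B and A, hence no uncertainty over the adaptation. The Bayesian Stiefel-manifold prior of SBA is not reproduced here:

```python
import numpy as np

# Plain LoRA: freeze W and learn a rank-r update (alpha / r) * B @ A.
# The adapter is a point estimate, so the fine-tuned model carries no
# uncertainty over its own adaptation; that is the gap SBA targets.

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 16, 16, 4, 8
W = rng.normal(size=(d_out, d_in))          # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01       # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection (init 0)

def lora_forward(x):
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
print(np.allclose(lora_forward(x), W @ x))  # True: B=0 => identity at init
```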
Causality by Abstraction: Symbolic Rule Learning in Multivariate Timeseries with Large Language Models
arXiv:2602.17829v1 Announce Type: new Abstract: Inferring causal relations in timeseries data with delayed effects is a fundamental challenge, especially when the underlying system exhibits complex dynamics that cannot be captured by simple functional mappings. Traditional approaches often fail to produce...
Understanding the Generalization of Bilevel Programming in Hyperparameter Optimization: A Tale of Bias-Variance Decomposition
arXiv:2602.17947v1 Announce Type: new Abstract: Gradient-based hyperparameter optimization (HPO) has emerged recently, leveraging bilevel programming techniques to optimize hyperparameters by estimating the hypergradient w.r.t. the validation loss. Nevertheless, previous theoretical works mainly focus on reducing the gap between the estimation and ground-truth...
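For readers new to the setup in the bilevel HPO item above, the standard formulation (not specific to this paper) is:

```latex
% Standard bilevel HPO: outer problem over hyperparameters \lambda,
% inner problem over model parameters \theta.
\[
\min_{\lambda}\; \mathcal{L}_{\mathrm{val}}\bigl(\theta^{*}(\lambda)\bigr)
\quad\text{s.t.}\quad
\theta^{*}(\lambda) \in \arg\min_{\theta}\; \mathcal{L}_{\mathrm{train}}(\theta,\lambda)
\]
% The hypergradient differentiates the validation loss through the
% inner optimum:
\[
\nabla_{\lambda}\,\mathcal{L}_{\mathrm{val}}\bigl(\theta^{*}(\lambda)\bigr)
= \Bigl(\tfrac{\partial \theta^{*}(\lambda)}{\partial \lambda}\Bigr)^{\!\top}
\nabla_{\theta}\,\mathcal{L}_{\mathrm{val}}\bigl(\theta^{*}(\lambda)\bigr)
\]
```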
When Remembering and Planning are Worth it: Navigating under Change
arXiv:2602.15274v1 Announce Type: new Abstract: We explore how different types and uses of memory can aid spatial navigation in changing uncertain environments. In the simple foraging task we study, every day, our agent has to find its way from its...
World-Model-Augmented Web Agents with Action Correction
arXiv:2602.15384v1 Announce Type: new Abstract: Web agents based on large language models have demonstrated promising capability in automating web tasks. However, current web agents struggle to reason out sensible actions because of their limited ability to predict environment changes, and might...
Common Belief Revisited
arXiv:2602.15403v1 Announce Type: new Abstract: Contrary to common belief, common belief is not KD4. If individual belief is KD45, common belief does indeed lose the 5 property and keep the D and 4 properties -- and it has none of...
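For readers outside modal logic, the axiom names in the common-belief abstract above are the standard doxastic ones (B is the belief modality); which of them survive the passage from individual to common belief is exactly what the paper examines:

```latex
% Standard doxastic axioms (B = belief); KD45 = K + D + 4 + 5.
\[
\textbf{K:}\;\; B(\varphi \to \psi) \to (B\varphi \to B\psi) \qquad
\textbf{D:}\;\; B\varphi \to \lnot B\lnot\varphi
\]
\[
\textbf{4:}\;\; B\varphi \to B B\varphi \qquad
\textbf{5:}\;\; \lnot B\varphi \to B\lnot B\varphi
\]
```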
Enhancing Building Semantics Preservation in AI Model Training with Large Language Model Encodings
arXiv:2602.15791v1 Announce Type: new Abstract: Accurate representation of building semantics, encompassing both generic object types and specific subtypes, is essential for effective AI model training in the architecture, engineering, construction, and operation (AECO) industry. Conventional encoding methods (e.g., one-hot) often...
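As a concrete illustration of the limitation the building-semantics abstract above mentions: one-hot encodings make every pair of element types equally distant, while LLM-derived embeddings can place related subtypes near each other. The label set below is an illustrative assumption, not the paper's pipeline:

```python
import numpy as np

# One-hot encoding: "door", "window", and "fire_door" are mutually
# orthogonal, so the encoding carries no notion that a fire door is a
# kind of door; this is the semantics-preservation gap the abstract targets.

labels = ["door", "window", "fire_door"]           # illustrative label set
one_hot = {lab: np.eye(len(labels))[i] for i, lab in enumerate(labels)}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(one_hot["door"], one_hot["fire_door"]))  # 0.0: no similarity
print(cosine(one_hot["door"], one_hot["window"]))     # 0.0: same distance
# An LLM-derived embedding would instead score door/fire_door well above
# door/window, preserving the subtype relation during model training.
```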
Orchestration-Free Customer Service Automation: A Privacy-Preserving and Flowchart-Guided Framework
arXiv:2602.15377v1 Announce Type: new Abstract: Customer service automation has seen growing demand amid digital transformation. Existing approaches either rely on modular system designs with extensive agent orchestration or employ over-simplified instruction schemas, providing limited guidance and poor generalizability. This paper...
TAROT: Test-driven and Capability-adaptive Curriculum Reinforcement Fine-tuning for Code Generation with Large Language Models
arXiv:2602.15449v1 Announce Type: new Abstract: Large Language Models (LLMs) are reshaping the coding paradigm, a shift popularly known as vibe coding, yet synthesizing algorithmically sophisticated and robust code remains a critical challenge. Incentivizing the deep reasoning capabilities of LLMs is essential to...
Improving Interactive In-Context Learning from Natural Language Feedback
arXiv:2602.16066v1 Announce Type: new Abstract: Adapting one's thought process based on corrective feedback is an essential ability in human learning, particularly in collaborative settings. In contrast, the current large language model training paradigm relies heavily on modeling vast, static corpora....
Framework of Thoughts: A Foundation Framework for Dynamic and Optimized Reasoning based on Chains, Trees, and Graphs
arXiv:2602.16512v1 Announce Type: new Abstract: Prompting schemes such as Chain of Thought, Tree of Thoughts, and Graph of Thoughts can significantly enhance the reasoning capabilities of large language models. However, most existing schemes require users to define static, problem-specific reasoning...
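One way to see why a single foundation framework for the Framework of Thoughts item above is plausible: chains, trees, and graphs of thoughts are all directed graphs over thought nodes. A minimal sketch of that shared abstraction follows; the class and field names are hypothetical, not the paper's API:

```python
from dataclasses import dataclass, field

# A thought structure general enough to express CoT (a path), ToT (a tree),
# and GoT (a DAG, via shared child references). Names are illustrative only.

@dataclass
class Thought:
    text: str
    children: list["Thought"] = field(default_factory=list)

def chain(steps):
    """Chain of Thought: each thought has exactly one successor."""
    root = Thought(steps[0])
    node = root
    for s in steps[1:]:
        nxt = Thought(s)
        node.children.append(nxt)
        node = nxt
    return root

cot = chain(["read problem", "plan", "compute", "answer"])
print(cot.text, "->", cot.children[0].text)  # read problem -> plan
```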
Creating a digital poet
arXiv:2602.16578v1 Announce Type: new Abstract: Can a machine write good poetry? Any positive answer raises fundamental questions about the nature and value of art. We report a seven-month poetry workshop in which a large language model was shaped into a...
Redefining boundaries in innovation and knowledge domains: Investigating the impact of generative artificial intelligence on copyright and intellectual property rights
Preference Optimization for Review Question Generation Improves Writing Quality
arXiv:2602.15849v1 Announce Type: cross Abstract: Peer review relies on substantive, evidence-based questions, yet existing LLM-based approaches often generate surface-level queries, drawing over 50% of their question tokens from a paper's first page. To bridge this gap, we develop IntelliReward, a...
A Lightweight Explainable Guardrail for Prompt Safety
arXiv:2602.15853v1 Announce Type: cross Abstract: We propose a lightweight explainable guardrail (LEG) method for the classification of unsafe prompts. LEG uses a multi-task learning architecture to jointly learn a prompt classifier and an explanation classifier, where the latter labels prompt...
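The multi-task design the LEG abstract above describes, one shared encoder feeding a prompt-safety head and an explanation head, can be sketched as follows; all dimensions, layer choices, and names are assumptions for illustration, not the LEG architecture:

```python
import torch
import torch.nn as nn

# Sketch of a multi-task guardrail: a shared encoder feeds two heads,
# one classifying the prompt as safe/unsafe and one predicting an
# explanation label. Sizes and names are illustrative, not LEG's.

class TwoHeadGuardrail(nn.Module):
    def __init__(self, vocab=1000, dim=64, n_expl=8):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab, dim)   # cheap mean-pool encoder
        self.safety_head = nn.Linear(dim, 2)       # safe vs unsafe
        self.expl_head = nn.Linear(dim, n_expl)    # explanation class

    def forward(self, token_ids):
        h = self.embed(token_ids)                  # shared representation
        return self.safety_head(h), self.expl_head(h)

model = TwoHeadGuardrail()
ids = torch.randint(0, 1000, (4, 12))              # batch of 4 toy prompts
safety_logits, expl_logits = model(ids)
# Joint objective: cross-entropy on both heads, summed (dummy targets here).
loss = nn.functional.cross_entropy(safety_logits, torch.zeros(4, dtype=torch.long)) \
     + nn.functional.cross_entropy(expl_logits, torch.zeros(4, dtype=torch.long))
print(safety_logits.shape, expl_logits.shape, float(loss))
```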
Playing With AI: How Do State-Of-The-Art Large Language Models Perform in the 1977 Text-Based Adventure Game Zork?
arXiv:2602.15867v1 Announce Type: cross Abstract: In this position paper, we evaluate the problem-solving and reasoning capabilities of contemporary Large Language Models (LLMs) through their performance in Zork, the seminal text-based adventure game first released in 1977. The game's dialogue-based structure...
Evidence for Daily and Weekly Periodic Variability in GPT-4o Performance
arXiv:2602.15889v1 Announce Type: cross Abstract: Large language models (LLMs) are increasingly used in research both as tools and as objects of investigation. Much of this work implicitly assumes that LLM performance under fixed conditions (identical model snapshot, hyperparameters, and prompt)...
Conv-FinRe: A Conversational and Longitudinal Benchmark for Utility-Grounded Financial Recommendation
arXiv:2602.16990v1 Announce Type: new Abstract: Most recommendation benchmarks evaluate how well a model imitates user behavior. In financial advisory, however, observed actions can be noisy or short-sighted under market volatility and may conflict with a user's long-term goals. Treating what...
Predictive Batch Scheduling: Accelerating Language Model Training Through Loss-Aware Sample Prioritization
arXiv:2602.17066v1 Announce Type: new Abstract: We introduce Predictive Batch Scheduling (PBS), a novel training optimization technique that accelerates language model convergence by dynamically prioritizing high-loss samples during batch construction. Unlike curriculum learning approaches that require predefined difficulty metrics or hard...
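A minimal sketch of the general idea behind the PBS item above, loss-aware batch construction: keep a running loss estimate per sample and draw the next batch with probability proportional to it. This is generic prioritized sampling for illustration; the visible abstract does not spell out PBS's actual predictive scheduler:

```python
import numpy as np

# Generic loss-aware batch construction: samples with higher running
# loss estimates are drawn more often, so training effort concentrates
# on what the model currently gets wrong.

rng = np.random.default_rng(0)
n_samples, batch_size = 1000, 32
loss_est = np.ones(n_samples)                 # optimistic init: all "hard"

def next_batch():
    probs = loss_est / loss_est.sum()         # prioritize high-loss samples
    return rng.choice(n_samples, size=batch_size, replace=False, p=probs)

def update(batch, observed_losses, ema=0.9):
    """Smooth per-sample loss estimates with an exponential moving average."""
    loss_est[batch] = ema * loss_est[batch] + (1 - ema) * observed_losses

batch = next_batch()
update(batch, rng.uniform(0.0, 2.0, size=batch_size))  # fake training step
print(batch[:5], loss_est[batch[:5]])
```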
Owen-based Semantics and Hierarchy-Aware Explanation (O-Shap)
arXiv:2602.17107v1 Announce Type: new Abstract: Shapley value-based methods have become foundational in explainable artificial intelligence (XAI), offering theoretically grounded feature attributions through cooperative game theory. However, in practice, particularly in vision tasks, the assumption of feature independence breaks down, as...
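For reference, the Shapley value the O-Shap abstract above builds on assigns feature i its marginal contribution averaged over all coalitions; Owen values refine this by averaging only over orderings consistent with a coalition structure. The definition below is the standard one, not the paper's notation:

```latex
% Shapley value of feature i: its marginal contribution v(S u {i}) - v(S),
% averaged over all coalitions S with the usual combinatorial weights.
\[
\phi_i(v) \;=\; \sum_{S \subseteq N \setminus \{i\}}
\frac{|S|!\,\bigl(|N|-|S|-1\bigr)!}{|N|!}\,
\Bigl(v\bigl(S \cup \{i\}\bigr) - v(S)\Bigr)
\]
```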
Web Verbs: Typed Abstractions for Reliable Task Composition on the Agentic Web
arXiv:2602.17245v1 Announce Type: new Abstract: The Web is evolving from a medium that humans browse to an environment where software agents act on behalf of users. Advances in large language models (LLMs) make natural language a practical interface for goal-directed...
Evaluating Monolingual and Multilingual Large Language Models for Greek Question Answering: The DemosQA Benchmark
arXiv:2602.16811v1 Announce Type: new Abstract: Recent advances in Natural Language Processing and Deep Learning have enabled the development of Large Language Models (LLMs), which have significantly improved the state of the art across a wide range of tasks, including Question Answering (QA). Despite...