Large-Scale 3D Ground-Motion Synthesis with Physics-Inspired Latent Operator Flow Matching
arXiv:2603.17403v1 Announce Type: new Abstract: Earthquake hazard analysis and design of spatially distributed infrastructure, such as power grids and energy pipeline networks, require scenario-specific ground-motion time histories with realistic frequency content and spatiotemporal coherence. However, producing the large ensembles needed...
Causal Representation Learning on High-Dimensional Data: Benchmarks, Reproducibility, and Evaluation Metrics
arXiv:2603.17405v1 Announce Type: new Abstract: Causal representation learning (CRL) models aim to transform high-dimensional data into a latent space, enabling interventions to generate counterfactual samples or modify existing data based on the causal relationships among latent variables. To facilitate the...
Efficient Soft Actor-Critic with LLM-Based Action-Level Guidance for Continuous Control
arXiv:2603.17468v1 Announce Type: new Abstract: We present GuidedSAC, a novel reinforcement learning (RL) algorithm that facilitates efficient exploration in vast state-action spaces. GuidedSAC leverages large language models (LLMs) as intelligent supervisors that provide action-level guidance for the Soft Actor-Critic (SAC)...
QuantFL: Sustainable Federated Learning for Edge IoT via Pre-Trained Model Quantisation
arXiv:2603.17507v1 Announce Type: new Abstract: Federated Learning (FL) enables privacy-preserving intelligence on Internet of Things (IoT) devices but incurs a significant carbon footprint due to the high energy cost of frequent uplink transmission. While pre-trained models are increasingly available on...
Nvidia is quietly building a multibillion-dollar behemoth to rival its chips business
Nvidia's networking business raked in $11 billion last quarter despite getting significantly less fanfare than chips and gaming.
Patreon CEO calls AI companies’ fair use argument ‘bogus,’ says creators should be paid
Patreon CEO Jack Conte says AI companies should pay creators for training data, arguing their fair use defense falls apart when they license content from major publishers.
Rebel Audio is a new AI podcasting tool aimed at first-time creators
Rebel Audio is a new all-in-one podcasting tool that allows creators to record podcasts, edit, clip content for social, and publish episodes, all without ever leaving the platform.
The leaderboard “you can’t game,” funded by the companies it ranks
Artificial intelligence models are multiplying fast, and competition is stiff. With so many players crowding the space, which one will be the best — and who decides that? Arena, formerly LM Arena, has emerged as the de facto public leaderboard...
The PhD students who became the judges of the AI industry
Persona-Conditioned Risk Behavior in Large Language Models: A Simulated Gambling Study with GPT-4.1
arXiv:2603.15831v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly deployed as autonomous agents in uncertain, sequential decision-making contexts. Yet it remains poorly understood whether the behaviors they exhibit in such environments reflect principled cognitive patterns or simply surface-level...
NextMem: Towards Latent Factual Memory for LLM-based Agents
arXiv:2603.15634v1 Announce Type: new Abstract: Memory is critical for LLM-based agents to preserve past observations for future decision-making, where factual memory serves as its foundational part. However, existing approaches to constructing factual memory face several limitations. Textual methods impose heavy...
BANGLASOCIALBENCH: A Benchmark for Evaluating Sociopragmatic and Cultural Alignment of LLMs in Bangladeshi Social Interaction
arXiv:2603.15949v1 Announce Type: new Abstract: Large Language Models have demonstrated strong multilingual fluency, yet fluency alone does not guarantee socially appropriate language use. In high-context languages, communicative competence requires sensitivity to social hierarchy, relational roles, and interactional norms that are...
Regularized Latent Dynamics Prediction is a Strong Baseline For Behavioral Foundation Models
arXiv:2603.15857v1 Announce Type: new Abstract: Behavioral Foundation Models (BFMs) produce agents with the capability to adapt to any unknown reward or task. These methods, however, are only able to produce near-optimal policies for the reward functions that are in the...
Optimizing Hospital Capacity During Pandemics: A Dual-Component Framework for Strategic Patient Relocation
arXiv:2603.15960v1 Announce Type: new Abstract: The COVID-19 pandemic has placed immense strain on hospital systems worldwide, leading to critical capacity challenges. This research proposes a two-part framework to optimize hospital capacity through patient relocation strategies. The first component involves developing...
NLP Occupational Emergence Analysis: How Occupations Form and Evolve in Real Time -- A Zero-Assumption Method Demonstrated on AI in the US Technology Workforce, 2022-2026
arXiv:2603.15998v1 Announce Type: new Abstract: Occupations form and evolve faster than classification systems can track. We propose that a genuine occupation is a self-reinforcing structure (a bipartite co-attractor) in which a shared professional vocabulary makes practitioners cohesive as a group,...
Prose2Policy (P2P): A Practical LLM Pipeline for Translating Natural-Language Access Policies into Executable Rego
arXiv:2603.15799v1 Announce Type: new Abstract: Prose2Policy (P2P) is a practical LLM-based tool that translates natural-language access control policies (NLACPs) into executable Rego code (the policy language of Open Policy Agent, OPA). It provides a modular, end-to-end pipeline that performs policy...
A Context Alignment Pre-processor for Enhancing the Coherence of Human-LLM Dialog
arXiv:2603.16052v1 Announce Type: new Abstract: Large language models (LLMs) have made remarkable progress in generating fluent text, but they still face a critical challenge of contextual misalignment in long-term and dynamic dialogue. When human users omit premises, simplify references, or...
I Know What I Don't Know: Latent Posterior Factor Models for Multi-Evidence Probabilistic Reasoning
arXiv:2603.15670v1 Announce Type: new Abstract: Real-world decision-making, from tax compliance assessment to medical diagnosis, requires aggregating multiple noisy and potentially contradictory evidence sources. Existing approaches either lack explicit uncertainty quantification (neural aggregation methods) or rely on manually engineered discrete predicates...
MAC: Multi-Agent Constitution Learning
arXiv:2603.15968v1 Announce Type: new Abstract: Constitutional AI is a method to oversee and control LLMs based on a set of rules written in natural language. These rules are typically written by human experts, but could in principle be learned automatically...
Are Large Language Models Truly Smarter Than Humans?
arXiv:2603.16197v1 Announce Type: new Abstract: Public leaderboards increasingly suggest that large language models (LLMs) surpass human experts on benchmarks spanning academic knowledge, law, and programming. Yet most benchmarks are fully public, their questions widely mirrored across the internet, creating systematic...
Protein Design with Agent Rosetta: A Case Study for Specialized Scientific Agents
arXiv:2603.15952v1 Announce Type: new Abstract: Large language models (LLMs) are capable of emulating reasoning and using tools, creating opportunities for autonomous agents that execute complex scientific tasks. Protein design provides a natural testbed: although machine learning (ML) methods achieve strong...
GSI Agent: Domain Knowledge Enhancement for Large Language Models in Green Stormwater Infrastructure
arXiv:2603.15643v1 Announce Type: new Abstract: Green Stormwater Infrastructure (GSI) systems, such as permeable pavement, rain gardens, and bioretention facilities, require continuous inspection and maintenance to ensure long-term performance. However, domain knowledge about GSI is often scattered across municipal manuals, regulatory...
Adaptive Theory of Mind for LLM-based Multi-Agent Coordination
arXiv:2603.16264v1 Announce Type: new Abstract: Theory of Mind (ToM) refers to the ability to reason about others' mental states, and higher-order ToM involves considering that others also possess their own ToM. Equipping large language model (LLM)-driven agents with ToM has...
MoLoRA: Composable Specialization via Per-Token Adapter Routing
arXiv:2603.15965v1 Announce Type: new Abstract: Multi-adapter serving systems route entire sequences to a single adapter, forcing a choice when requests span multiple domains. This assumption fails in two important settings: (1) multimodal generation, where text and image tokens require different...
Argumentative Human-AI Decision-Making: Toward AI Agents That Reason With Us, Not For Us
arXiv:2603.15946v1 Announce Type: new Abstract: Computational argumentation offers formal frameworks for transparent, verifiable reasoning but has traditionally been limited by its reliance on domain-specific information and extensive feature engineering. In contrast, LLMs excel at processing unstructured text, yet their opaque...
Semi-Autonomous Formalization of the Vlasov-Maxwell-Landau Equilibrium
arXiv:2603.15929v1 Announce Type: new Abstract: We present a complete Lean 4 formalization of the equilibrium characterization in the Vlasov-Maxwell-Landau (VML) system, which describes the motion of charged plasma. The project demonstrates the full AI-assisted mathematical research loop: an AI reasoning...
AsgardBench - Evaluating Visually Grounded Interactive Planning Under Minimal Feedback
arXiv:2603.15888v1 Announce Type: new Abstract: With AsgardBench we aim to evaluate visually grounded, high-level action sequence generation and interactive planning, focusing specifically on plan adaptation during execution based on visual observations rather than navigation or low-level manipulation. In the landscape...
POLAR: A Per-User Association Test in Embedding Space
arXiv:2603.15950v1 Announce Type: new Abstract: Most intrinsic association probes operate at the word, sentence, or corpus level, obscuring author-level variation. We present POLAR (Per-user On-axis Lexical Association Report), a per-user lexical association test that runs in the embedding space of...
SQL-ASTRA: Alleviating Sparse Feedback in Agentic SQL via Column-Set Matching and Trajectory Aggregation
arXiv:2603.16161v1 Announce Type: new Abstract: Agentic Reinforcement Learning (RL) shows promise for complex tasks, but Text-to-SQL remains mostly restricted to single-turn paradigms. A primary bottleneck is the credit assignment problem. In traditional paradigms, rewards are determined solely by the final-turn...