COMBA: Cross Batch Aggregation for Learning Large Graphs with Context Gating State Space Models
arXiv:2602.17893v1 Announce Type: new Abstract: State space models (SSMs) have recently emerged for modeling long-range dependency in sequence data, with much lower computational costs than modern alternatives, such as transformers. Advancing SSMs to graph-structured data, especially for large graphs,...
Breaking the Correlation Plateau: On the Optimization and Capacity Limits of Attention-Based Regressors
arXiv:2602.17898v1 Announce Type: new Abstract: Attention-based regression models are often trained by jointly optimizing Mean Squared Error (MSE) loss and Pearson correlation coefficient (PCC) loss, emphasizing the magnitude of errors and the order or shape of targets, respectively. A common...
Distribution-Free Sequential Prediction with Abstentions
arXiv:2602.17918v1 Announce Type: new Abstract: We study a sequential prediction problem in which an adversary is allowed to inject arbitrarily many adversarial instances in a stream of i.i.d. instances, but at each round, the learner may also abstain from making...
Memory-Based Advantage Shaping for LLM-Guided Reinforcement Learning
arXiv:2602.17931v1 Announce Type: new Abstract: In environments with sparse or delayed rewards, reinforcement learning (RL) incurs high sample complexity due to the large number of interactions needed for learning. This limitation has motivated the use of large language models (LLMs)...
Causal Neighbourhood Learning for Invariant Graph Representations
arXiv:2602.17934v1 Announce Type: new Abstract: Graph data often contain noisy and spurious correlations that mask the true causal relationships, which are essential for enabling graph models to make predictions based on the underlying causal structure of the data. Dependence on...
Optimizing Graph Causal Classification Models: Estimating Causal Effects and Addressing Confounders
arXiv:2602.17941v1 Announce Type: new Abstract: Graph data is becoming increasingly prevalent due to the growing demand for relational insights in AI across various domains. Organizations regularly use graph data to solve complex problems involving relationships and connections. Causal learning is...
Understanding the Generalization of Bilevel Programming in Hyperparameter Optimization: A Tale of Bias-Variance Decomposition
arXiv:2602.17947v1 Announce Type: new Abstract: Gradient-based hyperparameter optimization (HPO) methods have emerged recently, leveraging bilevel programming techniques to optimize hyperparameters by estimating the hypergradient w.r.t. validation loss. Nevertheless, previous theoretical works mainly focus on reducing the gap between the estimation and ground-truth...
Court grapples with disputes over efforts to recover losses from Cuban confiscations
In a pair of oral arguments on Monday, the Supreme Court wrestled with disputes over whether U.S. companies can recover under U.S. law for losses resulting from the confiscation of […]
Birthright citizenship: under the flag
Brothers in Law is a recurring series by brothers Akhil and Vikram Amar, with special emphasis on measuring what the Supreme Court says against what the Constitution itself says. For more content from […]
Supreme Court agrees to hear case on Colorado dispute over climate change
Returning from its winter recess, the Supreme Court on Monday added just one new case to its oral argument docket. In a list of orders from the justices’ private conference […]
Data center builders thought farmers would willingly sell land, learn otherwise
Even in a fragile farm economy, million-dollar offers can't sway dedicated farmers.
AIs can generate near-verbatim copies of novels from training data
LLMs memorize more training data than previously thought.
A Meta AI security researcher said an OpenClaw agent ran amok on her inbox
The viral X post from an AI security researcher reads like satire. But it's really a word of warning about what can go wrong when handing tasks to an AI agent.
Anthropic accuses Chinese AI labs of mining Claude as US debates AI chip exports
Anthropic accuses DeepSeek, Moonshot, and MiniMax of using 24,000 fake accounts to distill Claude’s AI capabilities, as U.S. officials debate export controls aimed at slowing China’s AI progress.
5 days left to lock in the lowest TechCrunch Disrupt 2026 ticket rates
Five days to save up to $680 on your TechCrunch Disrupt 2026 ticket. These lowest rates of the year disappear on February 27 at 11:59 p.m. PT.
How AI agents could destroy the economy
Citrini Research imagines a report from two years in the future, in which unemployment has doubled and the total value of the stock market has fallen by more than a third.
Defense Secretary summons Anthropic’s Amodei over military use of Claude
Defense Secretary Pete Hegseth has summoned Anthropic CEO Dario Amodei to the Pentagon for a tense discussion over the military's use of Claude. Hegseth has threatened to designate Anthropic a "supply chain risk."
Connecting the dots in trustworthy Artificial Intelligence: From AI principles, ethics, and key requirements to responsible AI systems and regulation
Trustworthy Artificial Intelligence (AI) is based on seven technical requirements sustained over three main pillars that should be met throughout the system’s entire life cycle: it should be (1) lawful, (2) ethical, and (3) robust, both from a technical and...
Enhancing Diversity and Feasibility: Joint Population Synthesis from Multi-source Data Using Generative Models
arXiv:2602.15270v1 Announce Type: new Abstract: Generating realistic synthetic populations is essential for agent-based models (ABM) in transportation and urban planning. Current methods face two major limitations. First, many rely on a single dataset or follow a sequential data fusion and...
When Remembering and Planning are Worth it: Navigating under Change
arXiv:2602.15274v1 Announce Type: new Abstract: We explore how different types and uses of memory can aid spatial navigation in changing uncertain environments. In the simple foraging task we study, every day, our agent has to find its way from its...
EAA: Automating materials characterization with vision language model agents
arXiv:2602.15294v1 Announce Type: new Abstract: We present Experiment Automation Agents (EAA), a vision-language-model-driven agentic system designed to automate complex experimental microscopy workflows. EAA integrates multimodal reasoning, tool-augmented action, and optional long-term memory to support both autonomous procedures and interactive user-guided...
Improving LLM Reliability through Hybrid Abstention and Adaptive Detection
arXiv:2602.15391v1 Announce Type: new Abstract: Large Language Models (LLMs) deployed in production environments face a fundamental safety-utility trade-off: either strict filtering mechanisms prevent harmful outputs but often block benign queries, or relaxed controls risk unsafe content generation. Conventional...
GenAI-LA: Generative AI and Learning Analytics Workshop (LAK 2026), April 27--May 1, 2026, Bergen, Norway
arXiv:2602.15531v1 Announce Type: new Abstract: This work introduces EduEVAL-DB, a dataset based on teacher roles designed to support the evaluation and training of automatic pedagogical evaluators and AI tutors for instructional explanations. The dataset comprises 854 explanations corresponding to 139...
How Vision Becomes Language: A Layer-wise Information-Theoretic Analysis of Multimodal Reasoning
arXiv:2602.15580v1 Announce Type: new Abstract: When a multimodal Transformer answers a visual question, is the prediction driven by visual evidence, linguistic reasoning, or genuinely fused cross-modal computation -- and how does this structure evolve across layers? We address this question...
On inferring cumulative constraints
arXiv:2602.15635v1 Announce Type: new Abstract: Cumulative constraints are central in scheduling with constraint programming, yet propagation is typically performed per constraint, missing multi-resource interactions and causing severe slowdowns on some benchmarks. I present a preprocessing method for inferring additional cumulative...
CARE Drive: A Framework for Evaluating Reason-Responsiveness of Vision Language Models in Automated Driving
arXiv:2602.15645v1 Announce Type: new Abstract: Foundation models, including vision language models, are increasingly used in automated driving to interpret scenes, recommend actions, and generate natural language explanations. However, existing evaluation methods primarily assess outcome-based performance, such as safety and...
PERSONA: Dynamic and Compositional Inference-Time Personality Control via Activation Vector Algebra
arXiv:2602.15669v1 Announce Type: new Abstract: Current methods for personality control in Large Language Models rely on static prompting or expensive fine-tuning, failing to capture the dynamic and compositional nature of human traits. We introduce PERSONA, a training-free framework that achieves...
Recursive Concept Evolution for Compositional Reasoning in Large Language Models
arXiv:2602.15725v1 Announce Type: new Abstract: Large language models achieve strong performance on many complex reasoning tasks, yet their accuracy degrades sharply on benchmarks that require compositional reasoning, including ARC-AGI-2, GPQA, MATH, BBH, and HLE. Existing methods improve reasoning by expanding...
This human study did not involve human subjects: Validating LLM simulations as behavioral evidence
arXiv:2602.15785v1 Announce Type: new Abstract: A growing literature uses large language models (LLMs) as synthetic participants to generate cost-effective and nearly instantaneous responses in social science experiments. However, there is limited guidance on when such simulations support valid inference about...
LemonadeBench: Evaluating the Economic Intuition of Large Language Models in Simple Markets
arXiv:2602.13209v1 Announce Type: cross Abstract: We introduce LemonadeBench v0.5, a minimal benchmark for evaluating economic intuition, long-term planning, and decision-making under uncertainty in large language models (LLMs) through a simulated lemonade stand business. Models must manage inventory with expiring goods,...