United States v. Hemani: an animated explainer
SCOTUSblog is thrilled to introduce the first in a series of animated videos, done in partnership with Briefly, on some of the most important upcoming cases of the 2025-26 term. Today's […]
SCOTUStoday for Friday, February 27
We're thrilled to introduce the first in a series of animated videos, done in partnership with Briefly, on some of the most important upcoming cases of the current term. This first […]
Anthropic vs. the Pentagon: What’s actually at stake?
Anthropic and the Pentagon are clashing over AI use in autonomous weapons and surveillance, raising high-stakes questions about national security, corporate control, and who sets the rules for military AI.
Who’s really running AI? Inside the billion-dollar battle over regulation with Alex Bores
The Pentagon is playing chicken with Anthropic over who gets to control how the military uses AI, while communities across the country are blocking data center construction. As the AI debate has been flattened to "doomers versus boomers," one state...
Last 24 hours to get TechCrunch Disrupt 2026 tickets at the lowest rates of the year
The lowest rates of the year for TechCrunch Disrupt 2026 end after today. Prices go up at 11:59 p.m. PT. Don't miss the chance to connect with 10,000 founders, investors, and operators, and to hear key takeaways from 250+ industry leaders. Register now to save...
Precision Medicine and Data Privacy: Balancing Innovation with Patient Rights
The rapid advancement of precision medicine creates unprecedented opportunities for personalized treatment while raising complex data privacy and consent challenges.
Overconfident Errors Need Stronger Correction: Asymmetric Confidence Penalties for Reinforcement Learning
arXiv:2602.21420v1 Announce Type: cross Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has become the leading paradigm for enhancing reasoning in Large Language Models (LLMs). However, standard RLVR algorithms suffer from a well-documented pathology: while they improve Pass@1 accuracy through sharpened...
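The snippet cuts off before the method, but the title suggests penalizing confidently wrong outputs more heavily than tentative ones. A minimal sketch of that idea in PyTorch, with all names (asymmetric_pg_loss, beta) illustrative rather than from the paper:

# Minimal sketch of an asymmetric confidence penalty for RLVR-style training.
# The paper's exact formulation isn't in the snippet; here we simply scale the
# penalty on incorrect samples by the model's own confidence (sequence
# probability), so confidently wrong answers receive a stronger correction.
import torch

def asymmetric_pg_loss(logprobs, rewards, beta=2.0):
    """logprobs: (B,) summed token log-probs of each sampled answer.
    rewards: (B,) verifiable rewards in {0, 1}.
    beta: extra penalty multiplier applied to confident errors."""
    confidence = logprobs.detach().exp()      # p(answer) under the policy
    advantage = rewards - rewards.mean()      # simple mean baseline
    # Errors (reward 0) are up-weighted in proportion to confidence.
    weight = torch.where(rewards > 0,
                         torch.ones_like(confidence),
                         1.0 + beta * confidence)
    return -(weight * advantage * logprobs).mean()

# Toy usage: three sampled answers, one correct.
lp = torch.tensor([-0.2, -0.3, -5.0], requires_grad=True)
r = torch.tensor([0.0, 1.0, 0.0])
asymmetric_pg_loss(lp, r).backward()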
ECHOSAT: Estimating Canopy Height Over Space And Time
arXiv:2602.21421v1 Announce Type: cross Abstract: Forest monitoring is critical for climate change mitigation. However, existing global tree height maps provide only static snapshots and do not capture temporal forest dynamics, which are essential for accurate carbon accounting. We introduce ECHOSAT,...
Disaster Question Answering with LoRA Efficiency and Accurate End Position
arXiv:2602.21212v1 Announce Type: new Abstract: Natural disasters such as earthquakes, torrential rainfall, floods, and volcanic eruptions occur with extremely low frequency and affect limited geographic areas. When individuals face disaster situations, they often experience confusion and lack the domain-specific knowledge...
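The snippet does not give the paper's fine-tuning configuration, but LoRA itself is standard: freeze the pretrained weight and learn a low-rank update. A minimal from-scratch sketch, with rank and dimensions chosen only for illustration:

# Minimal LoRA layer, as one might use to adapt a QA model cheaply; the
# paper's actual setup (rank, target modules) is not in the snippet.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r=8, alpha=16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # frozen pretrained weight
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r                 # standard LoRA scaling

    def forward(self, x):
        # y = Wx + (alpha/r) * B A x ; only A and B are trained.
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(2, 768))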
TRACE: Trajectory-Aware Comprehensive Evaluation for Deep Research Agents
arXiv:2602.21230v1 Announce Type: new Abstract: The evaluation of Deep Research Agents is a critical challenge, as conventional outcome-based metrics fail to capture the nuances of their complex reasoning. Current evaluation faces two primary challenges: 1) a reliance on singular metrics...
ToolMATH: A Math Tool Benchmark for Realistic Long-Horizon Multi-Tool Reasoning
arXiv:2602.21265v1 Announce Type: new Abstract: We introduce ToolMATH, a math-grounded benchmark that evaluates tool-augmented language models in realistic multi-tool environments where the output depends on calling schema-specified tools and sustaining multi-step execution. It turns math problems into a controlled, correctness-checkable...
VecGlypher: Unified Vector Glyph Generation with Language Models
arXiv:2602.21461v1 Announce Type: new Abstract: Vector glyphs are the atomic units of digital typography, yet most learning-based pipelines still depend on carefully curated exemplar sheets and raster-to-vector postprocessing, which limits accessibility and editability. We introduce VecGlypher, a single multimodal language...
Enhancing Multilingual Embeddings via Multi-Way Parallel Text Alignment
arXiv:2602.21543v1 Announce Type: new Abstract: Multilingual pretraining typically lacks explicit alignment signals, leading to suboptimal cross-lingual alignment in the representation space. In this work, we show that training standard pretrained models for cross-lingual alignment with a multi-way parallel corpus in...
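The paper's exact objective is not in the snippet; one common way to impose such alignment is a symmetric InfoNCE loss over embeddings of translation pairs drawn from the multi-way parallel corpus. A hedged sketch:

# Contrastive alignment sketch: pull embeddings of mutual translations
# together and push other in-batch items apart. The paper's actual loss
# may differ; tau and dimensions here are illustrative.
import torch
import torch.nn.functional as F

def alignment_loss(emb_a, emb_b, tau=0.05):
    """emb_a, emb_b: (B, d) embeddings where row i of emb_a translates
    row i of emb_b, e.g. two languages from a multi-way parallel tuple."""
    a = F.normalize(emb_a, dim=-1)
    b = F.normalize(emb_b, dim=-1)
    logits = a @ b.T / tau                    # (B, B) cosine similarities
    targets = torch.arange(a.size(0))
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.T, targets)) / 2

loss = alignment_loss(torch.randn(4, 256), torch.randn(4, 256))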
When More Is Less: A Systematic Analysis of Spatial and Commonsense Information for Visual Spatial Reasoning
arXiv:2602.21619v1 Announce Type: new Abstract: Visual spatial reasoning (VSR) remains challenging for modern vision-language models (VLMs), despite advances in multimodal architectures. A common strategy is to inject additional information at inference time, such as explicit spatial cues, external commonsense knowledge,...
RuCL: Stratified Rubric-Based Curriculum Learning for Multimodal Large Language Model Reasoning
arXiv:2602.21628v1 Announce Type: new Abstract: Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a prevailing paradigm for enhancing reasoning in Multimodal Large Language Models (MLLMs). However, relying solely on outcome supervision risks reward hacking, where models learn spurious reasoning...
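The stratification and reward details are not in the snippet; as a rough illustration, a rubric-based curriculum can be as simple as bucketing examples by a rubric-derived difficulty score and serving easier strata first:

# Illustrative stratified curriculum sampler; the paper's actual
# stratification and reward design are assumptions here, not sourced.
import random

def curriculum_batches(examples, scores, n_strata=3, batch_size=4):
    """examples: training items; scores: rubric difficulty in [0, 1]."""
    ranked = sorted(range(len(examples)), key=lambda i: scores[i])
    size = len(ranked) // n_strata
    for s in range(n_strata):                        # easiest stratum first
        hi = (s + 1) * size if s < n_strata - 1 else len(ranked)
        stratum = [examples[i] for i in ranked[s * size:hi]]
        random.shuffle(stratum)                      # shuffle within stratum
        for j in range(0, len(stratum), batch_size):
            yield stratum[j:j + batch_size]

# Toy usage: ten prompts with rubric scores.
batches = list(curriculum_batches([f"q{i}" for i in range(10)],
                                  [i / 10 for i in range(10)]))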
Multi-dimensional Assessment and Explainable Feedback for Counselor Responses to Client Resistance in Text-based Counseling with LLMs
arXiv:2602.21638v1 Announce Type: new Abstract: Effectively addressing client resistance is a sophisticated clinical skill in psychological counseling, yet practitioners often lack timely and scalable supervisory feedback to refine their approaches. Although current NLP research has examined overall counseling quality and...
Scalable Multilingual Multimodal Machine Translation with Speech-Text Fusion
arXiv:2602.21646v1 Announce Type: new Abstract: Multimodal Large Language Models (MLLMs) have achieved notable success in enhancing translation performance by integrating multimodal information. However, existing research primarily focuses on image-guided methods, whose applicability is constrained by the scarcity of multilingual image-text...
DWA-KD: Dual-Space Weighting and Time-Warped Alignment for Cross-Tokenizer Knowledge Distillation
arXiv:2602.21669v1 Announce Type: new Abstract: Knowledge Distillation (KD) has emerged as a crucial technique for compressing Large Language Models (LLMs). Although existing cross-tokenizer KD methods have made notable progress, their effectiveness remains constrained by suboptimal alignment across sequence and vocabulary...
Evaluating the relationship between regularity and learnability in recursive numeral systems using Reinforcement Learning
arXiv:2602.21720v1 Announce Type: new Abstract: Human recursive numeral systems (i.e., counting systems such as English base-10 numerals), like many other grammatical systems, are highly regular. Following prior work that relates cross-linguistic tendencies to biases in learning, we ask whether regular...
Explore-on-Graph: Incentivizing Autonomous Exploration of Large Language Models on Knowledge Graphs with Path-refined Reward Modeling
arXiv:2602.21728v1 Announce Type: new Abstract: The reasoning process of Large Language Models (LLMs) is often plagued by hallucinations and missing facts in question-answering tasks. A promising solution is to ground LLMs' answers in verifiable knowledge sources, such as Knowledge Graphs...
D-COT: Disciplined Chain-of-Thought Learning for Efficient Reasoning in Small Language Models
arXiv:2602.21786v1 Announce Type: new Abstract: Chain-of-Thought (CoT) distillation from Large Language Models (LLMs) often induces "overthinking" in Small Language Models (SLMs), leading to performance degradation and excessive token consumption. In this study, we propose Disciplined Chain-of-Thought (D-CoT), a novel framework...
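How D-CoT disciplines reasoning is not described in the snippet; one generic way to curb overthinking is to dock the verifiable reward for tokens spent beyond a budget. The function and constants below are illustrative assumptions, not the paper's method:

# Length-disciplined reward sketch: reward correctness, penalize
# chains of thought that overshoot a token budget.
def disciplined_reward(correct: bool, n_tokens: int,
                       budget: int = 256, lam: float = 0.5) -> float:
    """Reward correct answers, docking reasoning beyond a token budget."""
    overshoot = max(0, n_tokens - budget) / budget
    return float(correct) - lam * overshoot

print(disciplined_reward(True, 200))   # 1.0  (within budget)
print(disciplined_reward(True, 512))   # 0.5  (correct but verbose)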
FewMMBench: A Benchmark for Multimodal Few-Shot Learning
arXiv:2602.21854v1 Announce Type: new Abstract: As multimodal large language models (MLLMs) advance in handling interleaved image-text data, assessing their few-shot learning capabilities remains an open challenge. In this paper, we introduce FewMMBench, a comprehensive benchmark designed to evaluate MLLMs under...
Personalized Graph-Empowered Large Language Model for Proactive Information Access
arXiv:2602.21862v1 Announce Type: new Abstract: Since individuals may struggle to recall all the details of their lives and often confuse events, a system that assists users in recalling forgotten experiences is essential. While numerous studies have proposed memory recall systems, these primarily...
ExpLang: Improved Exploration and Exploitation in LLM Reasoning with On-Policy Thinking Language Selection
arXiv:2602.21887v1 Announce Type: new Abstract: Current large reasoning models (LRMs) have shown strong ability on challenging tasks after reinforcement learning (RL) based post-training. However, previous work mainly focuses on English reasoning in expectation of the strongest performance, despite the demonstrated...
Large Language Models are Algorithmically Blind
arXiv:2602.21947v1 Announce Type: new Abstract: Large language models (LLMs) demonstrate remarkable breadth of knowledge, yet their ability to reason about computational processes remains poorly understood. Closing this gap matters for practitioners who rely on LLMs to guide algorithm selection and...
RADAR: Reasoning as Discrimination with Aligned Representations for LLM-based Knowledge Graph Reasoning
arXiv:2602.21951v1 Announce Type: new Abstract: Knowledge graph reasoning (KGR) infers missing facts, with recent advances increasingly harnessing the semantic priors and reasoning abilities of Large Language Models (LLMs). However, prevailing generative paradigms are prone to memorizing surface-level co-occurrences rather than...
CxMP: A Linguistic Minimal-Pair Benchmark for Evaluating Constructional Understanding in Language Models
arXiv:2602.21978v1 Announce Type: new Abstract: Recent work has examined language models from a linguistic perspective to better understand how they acquire language. Most existing benchmarks focus on judging grammatical acceptability, whereas the ability to interpret meanings conveyed by grammatical forms...
A Diversity Diet for a Healthier Model: A Case Study of French ModernBERT
arXiv:2602.22014v1 Announce Type: new Abstract: Diversity has been gaining interest in the NLP community in recent years. At the same time, state-of-the-art transformer models such as ModernBERT use very large pre-training datasets, which are driven by size rather than by...
DLT-Corpus: A Large-Scale Text Collection for the Distributed Ledger Technology Domain
arXiv:2602.22045v1 Announce Type: new Abstract: We introduce DLT-Corpus, the largest domain-specific text collection for Distributed Ledger Technology (DLT) research to date: 2.98 billion tokens from 22.12 million documents spanning scientific literature (37,440 publications), United States Patent and Trademark Office (USPTO)...
SymTorch: A Framework for Symbolic Distillation of Deep Neural Networks
arXiv:2602.21307v1 Announce Type: new Abstract: Symbolic distillation replaces neural networks, or components thereof, with interpretable, closed-form mathematical expressions. This approach has shown promise in discovering physical laws and mathematical relationships directly from trained deep learning models, yet adoption remains limited...
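SymTorch's API is not shown in the snippet, but the underlying idea of symbolic distillation can be illustrated generically: query a trained model and fit a closed-form expression to its input-output behavior, here by least squares over a small basis of candidate terms (a known function stands in for a trained network):

# Toy symbolic-distillation sketch; basis terms and the stand-in target
# are illustrative, not SymTorch's actual interface.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=(500, 1))
y_net = np.sin(x) + 0.5 * x**2          # stand-in for a trained net's output

# Candidate symbolic basis: [1, x, x^2, sin x, cos x]
basis = np.hstack([np.ones_like(x), x, x**2, np.sin(x), np.cos(x)])
coef, *_ = np.linalg.lstsq(basis, y_net, rcond=None)
terms = ["1", "x", "x^2", "sin(x)", "cos(x)"]
expr = " + ".join(f"{c[0]:.2f}*{t}"
                  for c, t in zip(coef, terms) if abs(c[0]) > 1e-3)
print("distilled expression:", expr)    # ~ 0.50*x^2 + 1.00*sin(x)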