Thousands of consumer routers hacked by Russia's military
End-of-life routers in homes and small offices hacked in 120 countries.
Supreme Court summarily closes the courthouse doors again
Civil Rights and Wrongs is a recurring series by Daniel Harawa covering criminal justice and civil rights cases before the court. I have written before about the Supreme Court’s troubling […]
State election dispute on political speech comes to Supreme Court on interim docket
Lawyers for Ohio Secretary of State Frank LaRose, as well as county election officials, urged the Supreme Court on Wednesday to let them go ahead with a ballot that does […]
AE-ViT: Stable Long-Horizon Parametric Partial Differential Equations Modeling
arXiv:2604.06475v1 Announce Type: new Abstract: Deep Learning Reduced Order Models (ROMs) are becoming increasingly popular as surrogate models for parametric partial differential equations (PDEs) due to their ability to handle high-dimensional data, approximate highly nonlinear mappings, and utilize GPUs. Existing...
MICA: Multivariate Infini Compressive Attention for Time Series Forecasting
arXiv:2604.06473v1 Announce Type: new Abstract: Multivariate forecasting with Transformers faces a core scalability challenge: modeling cross-channel dependencies via attention compounds attention's quadratic sequence complexity with quadratic channel scaling, making full cross-channel attention impractical for high-dimensional time series. We propose Multivariate...
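The scaling problem this abstract describes can be made concrete with a quick back-of-envelope sketch (illustrative only; the numbers below are hypothetical and not from the paper): full cross-channel attention treats every (channel, timestep) pair as a token, so the score matrix grows with the square of channels times the square of sequence length.

```python
# Back-of-envelope cost of full cross-channel attention:
# every (channel, timestep) query attends to every (channel, timestep) key,
# so the score matrix has (C * L)^2 entries.
def full_cross_channel_scores(num_channels: int, seq_len: int) -> int:
    tokens = num_channels * seq_len
    return tokens * tokens

# Per-channel attention, by contrast, scores only within each channel:
def per_channel_scores(num_channels: int, seq_len: int) -> int:
    return num_channels * seq_len * seq_len

c, l = 100, 512  # hypothetical high-dimensional multivariate series
full = full_cross_channel_scores(c, l)   # (100 * 512)^2 = 2,621,440,000
local = per_channel_scores(c, l)         # 100 * 512^2   =    26,214,400
print(full // local)                     # full attention costs C times more
```

The gap between the two grows linearly in the channel count, which is why full cross-channel attention becomes impractical at high channel dimension.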
From Load Tests to Live Streams: Graph Embedding-Based Anomaly Detection in Microservice Architectures
arXiv:2604.06448v1 Announce Type: new Abstract: Prime Video regularly conducts load tests to simulate the viewer traffic spikes seen during live events such as Thursday Night Football as well as video-on-demand (VOD) events such as Rings of Power. While these stress...
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning
arXiv:2604.06427v1 Announce Type: new Abstract: The viability of chain-of-thought (CoT) monitoring hinges on models being unable to reason effectively in their latent representations. Yet little is known about the limits of such latent reasoning in LLMs. We test these limits...
Toward a universal foundation model for graph-structured data
arXiv:2604.06391v1 Announce Type: new Abstract: Graphs are a central representation in biomedical research, capturing molecular interaction networks, gene regulatory circuits, cell--cell communication maps, and knowledge graphs. Despite their importance, there is currently no broadly reusable foundation model available for...
The Master Key Hypothesis: Unlocking Cross-Model Capability Transfer via Linear Subspace Alignment
arXiv:2604.06377v1 Announce Type: new Abstract: We investigate whether post-trained capabilities can be transferred across models without retraining, with a focus on transfer across different model scales. We propose the Master Key Hypothesis, which states that model capabilities correspond to directions...
BiScale-GTR: Fragment-Aware Graph Transformers for Multi-Scale Molecular Representation Learning
arXiv:2604.06336v1 Announce Type: new Abstract: Graph Transformers have recently attracted attention for molecular property prediction by combining the inductive biases of graph neural networks (GNNs) with the global receptive field of Transformers. However, many existing hybrid architectures remain GNN-dominated, causing...
AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
arXiv:2604.06296v1 Announce Type: new Abstract: AI agents are increasingly deployed in real-world applications, including systems such as Manus, OpenClaw, and coding agents. Existing research has primarily focused on \emph{server-side} efficiency, proposing methods such as caching, speculative execution, traffic scheduling, and...
Spectral Edge Dynamics Reveal Functional Modes of Learning
arXiv:2604.06256v1 Announce Type: new Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head attribution, activation...
Probabilistic Language Tries: A Unified Framework for Compression, Decision Policies, and Execution Reuse
arXiv:2604.06228v1 Announce Type: new Abstract: We introduce probabilistic language tries (PLTs), a unified representation that makes explicit the prefix structure implicitly defined by any generative model over sequences. By assigning to each outgoing edge the conditional probability of the corresponding...
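The core idea of the abstract — a trie whose edges carry the generative model's conditional next-token probabilities — can be sketched minimally as follows (a toy illustration under assumed semantics; the paper's actual PLT construction may differ, and the probabilities here are made up):

```python
class ProbTrie:
    """Toy prefix trie whose outgoing edges carry conditional
    next-token probabilities from some generative model."""

    def __init__(self):
        self.children = {}  # token -> ProbTrie
        self.probs = {}     # token -> P(token | prefix so far)

    def insert(self, sequence, cond_probs):
        # cond_probs[i] is assumed to be P(sequence[i] | sequence[:i]).
        node = self
        for tok, p in zip(sequence, cond_probs):
            if tok not in node.children:
                node.children[tok] = ProbTrie()
                node.probs[tok] = p
            node = node.children[tok]

    def sequence_prob(self, sequence):
        # Chain rule: multiply conditional probabilities along the path.
        node, total = self, 1.0
        for tok in sequence:
            if tok not in node.children:
                return 0.0
            total *= node.probs[tok]
            node = node.children[tok]
        return total

t = ProbTrie()
t.insert(["the", "cat"], [0.2, 0.5])
t.insert(["the", "dog"], [0.2, 0.3])
print(t.sequence_prob(["the", "cat"]))  # 0.2 * 0.5 = 0.1
```

The prefix sharing is what makes the structure useful: "the" is stored once, and both continuations reuse its conditional probability.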
A Parameter-Efficient Transfer Learning Approach through Multitask Prompt Distillation and Decomposition for Clinical NLP
arXiv:2604.06650v1 Announce Type: new Abstract: Existing prompt-based fine-tuning methods typically learn task-specific prompts independently, imposing significant computing and storage overhead at scale when deploying multiple clinical natural language processing (NLP) systems. We present a multitask prompt distillation and decomposition framework...
DiffuMask: Diffusion Language Model for Token-level Prompt Pruning
arXiv:2604.06627v1 Announce Type: new Abstract: In-Context Learning and Chain-of-Thought prompting improve reasoning in large language models (LLMs). These typically come at the cost of longer, more expensive prompts that may contain redundant information. Prompt compression based on pruning offers a...
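Prompt compression by pruning, as mentioned in the abstract, can be sketched with a toy example (illustrative only, not the paper's DiffuMask method; the importance scores below are invented): rank tokens by some importance signal, keep the top fraction, and preserve original order.

```python
def prune_prompt(tokens, importance, keep_ratio=0.6):
    """Toy token-level prompt pruning: keep the highest-importance
    tokens (by the given scores), preserving their original order."""
    k = max(1, int(len(tokens) * keep_ratio))
    top = sorted(range(len(tokens)), key=lambda i: importance[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top)]

tokens = ["Please", "note", "that", "the", "answer", "is", "42"]
scores = [0.10, 0.20, 0.05, 0.05, 0.90, 0.30, 0.95]
print(prune_prompt(tokens, scores))  # ['note', 'answer', 'is', '42']
```

In a real system the importance signal would come from a model (here, presumably the diffusion LM's token-level mask predictions); the toy scores stand in for that.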
LLM-based Schema-Guided Extraction and Validation of Missing-Person Intelligence from Heterogeneous Data Sources
arXiv:2604.06571v1 Announce Type: new Abstract: Missing-person and child-safety investigations rely on heterogeneous case documents, including structured forms, bulletin-style posters, and narrative web profiles. Variations in layout, terminology, and data quality impede rapid triage, large-scale analysis, and search-planning workflows. This paper...
To Lie or Not to Lie? Investigating The Biased Spread of Global Lies by LLMs
arXiv:2604.06552v1 Announce Type: new Abstract: Misinformation is on the rise, and the strong writing capabilities of LLMs lower the barrier for malicious actors to produce and disseminate false information. We study how LLMs behave when prompted to spread misinformation across...
ValueGround: Evaluating Culture-Conditioned Visual Value Grounding in MLLMs
arXiv:2604.06484v1 Announce Type: new Abstract: Cultural values are expressed not only through language but also through visual scenes and everyday social practices. Yet existing evaluations of cultural values in language models are almost entirely text-only, making it unclear whether models...
DataSTORM: Deep Research on Large-Scale Databases using Exploratory Data Analysis and Data Storytelling
arXiv:2604.06474v1 Announce Type: new Abstract: Deep research with Large Language Model (LLM) agents is emerging as a powerful paradigm for multi-step information discovery, synthesis, and analysis. However, existing approaches primarily focus on unstructured web data, while the challenges of conducting...
Attention Flows: Tracing LLM Conceptual Engagement via Story Summaries
arXiv:2604.06416v1 Announce Type: new Abstract: Although LLM context lengths have grown, there is evidence that their ability to integrate information across long-form texts has not kept pace. We evaluate one such understanding task: generating summaries of novels. When human authors...
Application-Driven Pedagogical Knowledge Optimization of Open-Source LLMs via Reinforcement Learning and Supervised Fine-Tuning
arXiv:2604.06385v1 Announce Type: new Abstract: We present an innovative multi-stage optimization strategy combining reinforcement learning (RL) and supervised fine-tuning (SFT) to enhance the pedagogical knowledge of large language models (LLMs), as illustrated by EduQwen 32B-RL1, EduQwen 32B-SFT, and an optional...
A Severity-Based Curriculum Learning Strategy for Arabic Medical Text Generation
arXiv:2604.06365v1 Announce Type: new Abstract: Arabic medical text generation is increasingly needed to help users interpret symptoms and access general health guidance in their native language. Nevertheless, many existing methods assume uniform importance across training samples, overlooking differences in clinical...
In-Context Learning in Speech Language Models: Analyzing the Role of Acoustic Features, Linguistic Structure, and Induction Heads
arXiv:2604.06356v1 Announce Type: new Abstract: In-Context Learning (ICL) has been extensively studied in text-only Language Models, but remains largely unexplored in the speech domain. Here, we investigate how linguistic and acoustic features affect ICL in Speech Language Models. We focus...
Extracting Breast Cancer Phenotypes from Clinical Notes: Comparing LLMs with Classical Ontology Methods
arXiv:2604.06208v1 Announce Type: new Abstract: A significant amount of data held in Oncology Electronic Medical Records (EMRs) is contained in unstructured provider notes -- including but not limited to the chemotherapy (or cancer treatment) outcome, different biomarkers, the tumor's location,...
Tool-MCoT: Tool Augmented Multimodal Chain-of-Thought for Content Safety Moderation
arXiv:2604.06205v1 Announce Type: new Abstract: The growth of online platforms and user content requires strong content moderation systems that can handle complex inputs from various media types. While large language models (LLMs) are effective, their high computational cost and latency...
Beyond Facts: Benchmarking Distributional Reading Comprehension in Large Language Models
arXiv:2604.06201v1 Announce Type: new Abstract: While most reading comprehension benchmarks for LLMs focus on factual information that can be answered by localizing specific textual evidence, many real-world tasks require understanding distributional information, such as population-level trends and preferences expressed across...
The Stepwise Informativeness Assumption: Why are Entropy Dynamics and Reasoning Correlated in LLMs?
arXiv:2604.06192v1 Announce Type: new Abstract: Recent work uses entropy-based signals at multiple representation levels to study reasoning in large language models, but the field remains largely empirical. A central unresolved puzzle is why internal entropy dynamics, defined under the predictive...
$S^3$: Stratified Scaling Search for Test-Time in Diffusion Language Models
arXiv:2604.06260v1 Announce Type: new Abstract: Test-time scaling investigates whether a fixed diffusion language model (DLM) can generate better outputs when given more inference compute, without additional training. However, naive best-of-$K$ sampling is fundamentally limited because it repeatedly draws from the...
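The naive best-of-$K$ baseline the abstract criticizes is simple to state: draw $K$ independent samples from the same model distribution and keep the highest-scoring one. A minimal sketch, with `dlm_sample` and `verifier_score` as hypothetical stand-ins for a diffusion-LM sampler and a scoring model:

```python
import random

def best_of_k(candidates, score):
    """Naive best-of-K: keep the highest-scoring of K i.i.d. samples.
    Because draws are independent from one distribution, compute is
    often wasted on near-duplicate samples."""
    return max(candidates, key=score)

# Hypothetical stand-ins (not from the paper):
def dlm_sample(rng):
    return rng.choice(["draft A", "a longer draft B", "the longest draft C"])

verifier_score = len  # toy scorer: prefer longer text

rng = random.Random(0)
samples = [dlm_sample(rng) for _ in range(8)]  # K = 8
best = best_of_k(samples, verifier_score)
print(best)
```

The stratified search the title suggests would presumably spread this budget across the sampling space rather than drawing all $K$ candidates independently, though the abstract is truncated before the method is described.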
The Rhetoric of Machine Learning
arXiv:2604.06754v1 Announce Type: new Abstract: I examine the technology of machine learning from the perspective of rhetoric, which is simply the art of persuasion. Rather than being a neutral and "objective" way to build "world models" from data, machine learning...
A Comparative Study of Demonstration Selection for Practical Large Language Models-based Next POI Prediction
arXiv:2604.06207v1 Announce Type: new Abstract: This paper investigates demonstration selection strategies for predicting a user's next point-of-interest (POI) using large language models (LLMs), aiming to accurately forecast a user's subsequent location based on historical check-in data. While in-context learning (ICL)...