GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization
arXiv:2602.13921v1 Announce Type: new Abstract: Repository-level bug localization, the task of identifying where code must be modified to fix a bug, is a critical software engineering challenge. Standard Large Language Models (LLMs) are often unsuitable for this task due to context window...
The GREPO article introduces a notable innovation for software engineering by establishing the first GNN benchmark for repository-level bug localization, addressing a persistent limitation of LLMs in large-scale code analysis. Its legal relevance is indirect: the shift from traditional retrieval methods toward specialized algorithmic tools such as GNNs may eventually inform frameworks on intellectual property, software licensing, or algorithmic accountability. While not immigration-related, the research reflects a broader trend toward treating specialized technical solutions as authoritative resources, which could in time shape regulatory approaches to AI governance or tech-workforce policy.
**Jurisdictional Comparison and Analytical Commentary on the Impact on Immigration Law Practice** The article "GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization" has no direct implications for immigration law practice. A comparative look at US, Korean, and international approaches to innovation and technology adoption can nonetheless be instructive. In the US, the H-1B visa program admits foreign workers with specialized skills, including software engineers, and the government has explored using artificial intelligence (AI) and machine learning (ML) to streamline H-1B processing for efficiency and accuracy; benchmarks like GREPO, though aimed at automated code analysis rather than case processing, illustrate the kind of specialized ML tooling such modernization efforts draw on. In Korea, the government has pursued various initiatives to promote innovation, and the Korean Immigration Service has introduced an online visa application system that uses AI and ML to streamline processing; benchmark-driven advances of this sort could support the accuracy and efficiency of such systems. Internationally, the United Nations High Commissioner for Refugees (UNHCR) has explored AI and ML to improve the refugee resettlement process, and such techniques could potentially be applied to improve the UNHCR...
The article introduces GREPO as a pivotal benchmark for GNNs in repository-level bug localization, addressing a critical gap in software engineering research. By providing a scalable dataset (86 Python repositories, 47,294 bug-fixing tasks) tailored for GNN processing, GREPO enables direct application of graph-based models, potentially shifting the paradigm from traditional retrieval methods (e.g., keyword matching, text similarity) to more sophisticated GNN-driven solutions. Practitioners in software engineering and AI/ML should note this as a foundational resource; its impact aligns with regulatory trends promoting innovation in AI-driven software maintenance (e.g., USPTO's focus on AI applications in engineering). Case law relevance may emerge if GREPO's methodology influences patent eligibility for AI-assisted bug detection under 35 U.S.C. § 101.
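As a rough illustration of why graph structure helps here, the sketch below scores repository files by propagating bug-report relevance over a file-dependency graph, the basic operation a GNN learns to refine. The toy graph, scores, and function names are illustrative assumptions, not GREPO's data or pipeline:

```python
# Minimal sketch: one round of message passing over a repository
# dependency graph to rank files for bug relevance. Everything here
# (graph, scores, names) is a toy illustration, not GREPO itself.

def propagate(graph, scores, alpha=0.5):
    """One message-passing step: each file blends its own score with
    the mean score of its neighbors (files it imports / is imported by)."""
    new_scores = {}
    for node, neighbors in graph.items():
        if neighbors:
            neighbor_mean = sum(scores[n] for n in neighbors) / len(neighbors)
        else:
            neighbor_mean = 0.0
        new_scores[node] = (1 - alpha) * scores[node] + alpha * neighbor_mean
    return new_scores

# Toy repo: edges are import relations; initial scores might come from
# textual similarity between the bug report and each file.
repo = {
    "app.py":   ["utils.py", "db.py"],
    "utils.py": ["app.py"],
    "db.py":    ["app.py"],
}
initial = {"app.py": 0.9, "utils.py": 0.1, "db.py": 0.0}

scores = propagate(repo, initial)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Unlike plain text retrieval, a file with little textual overlap with the bug report (here `utils.py`) can still rank highly because it is tightly coupled to a suspicious file.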
SCOTUStoday: Sotomayor criticizes Kavanaugh
Curious about how Supreme Court justices spend their spare time? Justice Sonia Sotomayor revealed on Tuesday that she likes reading … recent books from her colleagues. She “said she just […] The post SCOTUStoday: Sotomayor criticizes Kavanaugh appeared first on SCOTUSblog.
Beyond Facts: Benchmarking Distributional Reading Comprehension in Large Language Models
arXiv:2604.06201v1 Announce Type: new Abstract: While most reading comprehension benchmarks for LLMs focus on factual information that can be answered by localizing specific textual evidence, many real-world tasks require understanding distributional information, such as population-level trends and preferences expressed across...
SHAPE: Stage-aware Hierarchical Advantage via Potential Estimation for LLM Reasoning
arXiv:2604.06636v1 Announce Type: new Abstract: Process supervision has emerged as a promising approach for enhancing LLM reasoning, yet existing methods fail to distinguish meaningful progress from mere verbosity, leading to limited reasoning capabilities and unresolved token inefficiency. To address this,...
A Parameter-Efficient Transfer Learning Approach through Multitask Prompt Distillation and Decomposition for Clinical NLP
arXiv:2604.06650v1 Announce Type: new Abstract: Existing prompt-based fine-tuning methods typically learn task-specific prompts independently, imposing significant computing and storage overhead at scale when deploying multiple clinical natural language processing (NLP) systems. We present a multitask prompt distillation and decomposition framework...
Application-Driven Pedagogical Knowledge Optimization of Open-Source LLMs via Reinforcement Learning and Supervised Fine-Tuning
arXiv:2604.06385v1 Announce Type: new Abstract: We present an innovative multi-stage optimization strategy combining reinforcement learning (RL) and supervised fine-tuning (SFT) to enhance the pedagogical knowledge of large language models (LLMs), as illustrated by EduQwen 32B-RL1, EduQwen 32B-SFT, and an optional...
Blending Human and LLM Expertise to Detect Hallucinations and Omissions in Mental Health Chatbot Responses
arXiv:2604.06216v1 Announce Type: new Abstract: As LLM-powered chatbots are increasingly deployed in mental health services, detecting hallucinations and omissions has become critical for user safety. However, state-of-the-art LLM-as-a-judge methods often fail in high-risk healthcare contexts, where subtle errors can have...
Distributed Interpretability and Control for Large Language Models
arXiv:2604.06483v1 Announce Type: new Abstract: Large language models that require multiple GPU cards to host are usually the most capable models. It is necessary to understand and steer these models, but the current technologies do not support the interpretability and...
Optimal Rates for Pure ε-Differentially Private Stochastic Convex Optimization with Heavy Tails
arXiv:2604.06492v1 Announce Type: new Abstract: We study stochastic convex optimization (SCO) with heavy-tailed gradients under pure epsilon-differential privacy (DP). Instead of assuming a bound on the worst-case Lipschitz parameter of the loss, we assume only a bounded k-th moment. This...
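For context on the setting, a single pure ε-DP gradient step can be sketched with the standard clip-then-Laplace construction on a scalar parameter. This is a generic textbook mechanism under an assumed clipping bound, not the paper's algorithm or rates:

```python
import math
import random

def laplace(scale, rng):
    """Sample Laplace(0, scale) via the inverse-CDF method."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def dp_sgd_step(theta, grads, clip, eps, lr, rng):
    """One pure eps-DP gradient step on a scalar parameter.

    Each per-example gradient is clipped to [-clip, clip], so the
    averaged gradient has sensitivity 2*clip/len(grads) under
    replace-one neighboring datasets; Laplace noise at that scale
    divided by eps makes this single release eps-DP.
    """
    n = len(grads)
    clipped = [max(-clip, min(clip, g)) for g in grads]
    avg = sum(clipped) / n
    noisy = avg + laplace(2.0 * clip / (n * eps), rng)
    return theta - lr * noisy
```

The paper's point is that a worst-case Lipschitz bound (which would justify `clip` a priori) is replaced by a bounded k-th moment assumption on the gradients; the sketch above keeps the mechanism but not that analysis.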
Asymptotic-Preserving Neural Networks for Viscoelastic Parameter Identification in Multiscale Blood Flow Modeling
arXiv:2604.06287v1 Announce Type: new Abstract: Mathematical models and numerical simulations offer a non-invasive way to explore cardiovascular phenomena, providing access to quantities that cannot be measured directly. In this study, we start with a one-dimensional multiscale blood flow model that...
A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset
arXiv:2604.06227v1 Announce Type: new Abstract: Accurate short-term forecasting of agricultural commodity prices is critical for food security planning and smallholder income stabilisation in developing economies, yet machine-learning-ready datasets for this purpose remain scarce in South Asia. This paper makes two...
MedConclusion: A Benchmark for Biomedical Conclusion Generation from Structured Abstracts
arXiv:2604.06505v1 Announce Type: new Abstract: Large language models (LLMs) are widely explored for reasoning-intensive research tasks, yet resources for testing whether they can infer scientific conclusions from structured biomedical evidence remain limited. We introduce MedConclusion, a large-scale dataset of 5.7M...
ART: Attention Replacement Technique to Improve Factuality in LLMs
arXiv:2604.06393v1 Announce Type: new Abstract: Hallucination in large language models (LLMs) continues to be a significant issue, particularly in tasks like question answering, where models often generate plausible yet incorrect or irrelevant information. Although various methods have been proposed to...
In-Context Learning in Speech Language Models: Analyzing the Role of Acoustic Features, Linguistic Structure, and Induction Heads
arXiv:2604.06356v1 Announce Type: new Abstract: In-Context Learning (ICL) has been extensively studied in text-only Language Models, but remains largely unexplored in the speech domain. Here, we investigate how linguistic and acoustic features affect ICL in Speech Language Models. We focus...
STDec: Spatio-Temporal Stability Guided Decoding for dLLMs
arXiv:2604.06330v1 Announce Type: new Abstract: Diffusion Large Language Models (dLLMs) have achieved rapid progress, viewed as a promising alternative to the autoregressive paradigm. However, most dLLM decoders still adopt a global confidence threshold, and do not explicitly model local context...
DiffuMask: Diffusion Language Model for Token-level Prompt Pruning
arXiv:2604.06627v1 Announce Type: new Abstract: In-Context Learning and Chain-of-Thought prompting improve reasoning in large language models (LLMs). These typically come at the cost of longer, more expensive prompts that may contain redundant information. Prompt compression based on pruning offers a...
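The pruning idea can be sketched as keeping the highest-importance tokens while preserving their original order. The importance scores below are a hand-set stand-in; DiffuMask derives them from a diffusion language model rather than any heuristic shown here:

```python
# Illustrative sketch of token-level prompt pruning: drop the
# lowest-scoring tokens, keep the survivors in original order.
# Scores are toy values, not DiffuMask's learned importances.

def prune_prompt(tokens, scores, keep_ratio=0.5):
    """Keep the top keep_ratio fraction of tokens by score, in order."""
    k = max(1, int(len(tokens) * keep_ratio))
    # Indices of the k highest-scoring tokens.
    top = sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:k]
    return [tokens[i] for i in sorted(top)]

tokens = ["Please", "kindly", "compute", "the", "sum", "of", "2", "and", "3"]
scores = [0.2, 0.1, 0.9, 0.3, 0.8, 0.3, 0.95, 0.4, 0.95]
pruned = prune_prompt(tokens, scores, keep_ratio=0.5)
```

Filler tokens ("Please", "kindly", function words) are removed while the task-bearing content survives, shortening the prompt without changing what it asks for.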
State election dispute on political speech comes to Supreme Court on interim docket
Lawyers for Ohio Secretary of State Frank LaRose, as well as county election officials, urged the Supreme Court on Wednesday to let them go ahead with a ballot that does […] The post State election dispute on political speech comes to Supreme...
Efficient Quantization of Mixture-of-Experts with Theoretical Generalization Guarantees
arXiv:2604.06515v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (MoE) allows scaling of language and vision models efficiently by activating only a small subset of experts per input. While this reduces computation, the large number of parameters still incurs substantial memory overhead...
VLMShield: Efficient and Robust Defense of Vision-Language Models against Malicious Prompts
arXiv:2604.06502v1 Announce Type: new Abstract: Vision-Language Models (VLMs) face significant safety vulnerabilities from malicious prompt attacks due to weakened alignment during visual integration. Existing defenses suffer from efficiency and robustness. To address these challenges, we first propose the Multimodal Aggregated...
ODE-free Neural Flow Matching for One-Step Generative Modeling
arXiv:2604.06413v1 Announce Type: new Abstract: Diffusion and flow matching models generate samples by learning time-dependent vector fields whose integration transports noise to data, requiring tens to hundreds of network evaluations at inference. We instead learn the transport map directly. We...
AgentOpt v0.1 Technical Report: Client-Side Optimization for LLM-Based Agent
arXiv:2604.06296v1 Announce Type: new Abstract: AI agents are increasingly deployed in real-world applications, including systems such as Manus, OpenClaw, and coding agents. Existing research has primarily focused on server-side efficiency, proposing methods such as caching, speculative execution, traffic scheduling, and...
MO-RiskVAE: A Multi-Omics Variational Autoencoder for Survival Risk Modeling in Multiple Myeloma
arXiv:2604.06267v1 Announce Type: new Abstract: Multimodal variational autoencoders (VAEs) have emerged as a powerful framework for survival risk modeling in multiple myeloma by integrating heterogeneous omics and clinical data. However, when trained under survival supervision, standard latent regularization strategies often...
Spectral Edge Dynamics Reveal Functional Modes of Learning
arXiv:2604.06256v1 Announce Type: new Abstract: Training dynamics during grokking concentrate along a small number of dominant update directions -- the spectral edge -- which reliably distinguishes grokking from non-grokking regimes. We show that standard mechanistic interpretability tools (head attribution, activation...
The Illusion of Stochasticity in LLMs
arXiv:2604.06543v1 Announce Type: new Abstract: In this work, we demonstrate that reliable stochastic sampling is a fundamental yet unfulfilled requirement for Large Language Models (LLMs) operating as agents. Agentic systems are frequently required to sample from distributions, often inferred from...
Does a Global Perspective Help Prune Sparse MoEs Elegantly?
arXiv:2604.06542v1 Announce Type: new Abstract: Empirical scaling laws for language models have encouraged the development of ever-larger LLMs, despite their growing computational and memory costs. Sparse Mixture-of-Experts (MoEs) offer a promising alternative by activating only a subset of experts per...
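The "global perspective" on pruning can be sketched as measuring each expert's utilization across an entire batch (rather than within one layer or token) and dropping the globally least-used experts. The gating scores below are toy values standing in for a learned router:

```python
# Sketch of global expert pruning for a sparse MoE: route a batch of
# tokens with top-k gating, tally expert utilization across the whole
# batch, and drop the least-used experts. Toy scores, not a real router.

from collections import Counter

def topk_route(gate_scores, k=2):
    """Indices of the k highest-scoring experts for one token."""
    return sorted(range(len(gate_scores)),
                  key=lambda e: gate_scores[e], reverse=True)[:k]

def prune_experts(batch_gate_scores, k=2, n_prune=1):
    """Globally prune the n_prune least-utilized experts."""
    usage = Counter()
    for gate_scores in batch_gate_scores:
        usage.update(topk_route(gate_scores, k))
    n_experts = len(batch_gate_scores[0])
    # Counter returns 0 for experts never routed to.
    ranked = sorted(range(n_experts), key=lambda e: usage[e])
    return set(ranked[:n_prune])

batch = [
    [0.7, 0.1, 0.9, 0.2],  # token 1 -> experts 2, 0
    [0.8, 0.3, 0.6, 0.1],  # token 2 -> experts 0, 2
    [0.2, 0.9, 0.7, 0.1],  # token 3 -> experts 1, 2
]
pruned = prune_experts(batch, k=2, n_prune=1)
```

A per-layer or per-token view might spare an expert that looks locally useful; the global tally reveals that expert 3 is never selected and can be removed with minimal impact.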
Multi-objective Evolutionary Merging Enables Efficient Reasoning Models
arXiv:2604.06465v1 Announce Type: new Abstract: Reasoning models have demonstrated remarkable capabilities in solving complex problems by leveraging long chains of thought. However, this more deliberate reasoning comes with substantial computational overhead at inference time. The Long-to-Short (L2S) reasoning problem seeks...
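The core operation an evolutionary merge searches over can be sketched as per-layer linear interpolation between two checkpoints, with the mixing weights as the candidate genome the search tunes against multiple objectives (e.g., accuracy versus chain-of-thought length). The checkpoints and weights below are toy assumptions:

```python
# Sketch of merging two model checkpoints by per-layer linear
# interpolation, the building block a multi-objective evolutionary
# search would optimize. Checkpoints are toy dicts of parameter lists.

def merge(ckpt_a, ckpt_b, weights):
    """Per-layer linear interpolation: w * a + (1 - w) * b."""
    merged = {}
    for layer, params_a in ckpt_a.items():
        w = weights[layer]
        params_b = ckpt_b[layer]
        merged[layer] = [w * a + (1 - w) * b
                         for a, b in zip(params_a, params_b)]
    return merged

long_reasoner = {"layer0": [1.0, 2.0], "layer1": [0.0, 4.0]}
short_reasoner = {"layer0": [3.0, 0.0], "layer1": [2.0, 0.0]}
# One candidate in the evolutionary population: per-layer mixing weights.
candidate = {"layer0": 0.5, "layer1": 0.25}
merged = merge(long_reasoner, short_reasoner, candidate)
```

An evolutionary loop would evaluate many such candidates on the competing objectives and keep the Pareto-efficient merges, trading reasoning quality against inference cost without any retraining.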
State-of-the-Art Arabic Language Modeling with Sparse MoE Fine-Tuning and Chain-of-Thought Distillation
arXiv:2604.06421v1 Announce Type: new Abstract: This paper introduces Arabic-DeepSeek-R1, an application-driven open-source Arabic LLM that leverages a sparse MoE backbone to address the digital equity gap for under-represented languages, and establishes a new SOTA across the entire Open Arabic LLM...
The Illusion of Superposition? A Principled Analysis of Latent Thinking in Language Models
arXiv:2604.06374v1 Announce Type: new Abstract: Latent reasoning via continuous chain-of-thoughts (Latent CoT) has emerged as a promising alternative to discrete CoT reasoning. Operating in continuous space increases expressivity and has been hypothesized to enable superposition: the ability to maintain multiple...
A Severity-Based Curriculum Learning Strategy for Arabic Medical Text Generation
arXiv:2604.06365v1 Announce Type: new Abstract: Arabic medical text generation is increasingly needed to help users interpret symptoms and access general health guidance in their native language. Nevertheless, many existing methods assume uniform importance across training samples, overlooking differences in clinical...
Emergent decentralized regulation in a purely synthetic society
arXiv:2604.06199v1 Announce Type: new Abstract: As autonomous AI agents increasingly inhabit online environments and extensively interact, a key question is whether synthetic collectives exhibit self-regulated social dynamics with neither human intervention nor centralized design. We study OpenClaw agents on Moltbook,...