Immigration Law

LOW Academic United States

Probing the Limits of the Lie Detector Approach to LLM Deception

arXiv:2603.10003v1 Announce Type: new Abstract: Mechanistic approaches to deception in large language models (LLMs) often rely on "lie detectors", that is, truth probes trained to identify internal representations of model outputs as false. The lie detector approach to LLM deception...

1 min 1 month, 1 week ago
ead
LOW Academic International

GATech at AbjadGenEval Shared Task: Multilingual Embeddings for Arabic Machine-Generated Text Classification

arXiv:2603.10007v1 Announce Type: new Abstract: We present our approach to the AbjadGenEval shared task on detecting AI-generated Arabic text. We fine-tuned the multilingual E5-large encoder for binary classification, and we explored several pooling strategies to pool token representations, including weighted...

1 min 1 month, 1 week ago
ead
LOW Academic International

Evaluating Progress in Graph Foundation Models: A Comprehensive Benchmark and New Insights

arXiv:2603.10033v1 Announce Type: new Abstract: Graph foundation models (GFM) aim to acquire transferable knowledge by pre-training on diverse graphs, which can be adapted to various downstream tasks. However, domain shift in graphs is inherently two-dimensional: graphs differ not only in...

1 min 1 month, 1 week ago
tps
LOW Academic European Union

The Prediction-Measurement Gap: Toward Meaning Representations as Scientific Instruments

arXiv:2603.10130v1 Announce Type: new Abstract: Text embeddings have become central to computational social science and psychology, enabling scalable measurement of meaning and mixed-method inference. Yet most representation learning is optimized and evaluated for prediction and retrieval, yielding a prediction-measurement gap:...

1 min 1 month, 1 week ago
ead
LOW Academic International

The Generation-Recognition Asymmetry: Six Dimensions of a Fundamental Divide in Formal Language Theory

arXiv:2603.10139v1 Announce Type: new Abstract: Every formal grammar defines a language and can in principle be used in three ways: to generate strings (production), to recognize them (parsing), or -- given only examples -- to infer the grammar itself (grammar...

1 min 1 month, 1 week ago
ead
LOW Academic European Union

Lost in Backpropagation: The LM Head is a Gradient Bottleneck

arXiv:2603.10145v1 Announce Type: new Abstract: The last layer of neural language models (LMs) projects output features of dimension $D$ to logits in dimension $V$, the size of the vocabulary, where usually $D \ll V$. This mismatch is known to raise...

1 min 1 month, 1 week ago
ead
LOW Academic International

GR-SAP: Generative Replay for Safety Alignment Preservation during Fine-Tuning

arXiv:2603.10243v1 Announce Type: new Abstract: Recent studies show that the safety alignment of large language models (LLMs) can be easily compromised even by seemingly non-adversarial fine-tuning. To preserve safety alignment during fine-tuning, a widely used strategy is to jointly optimize...

1 min 1 month, 1 week ago
tps
LOW Academic United States

Revisiting Sharpness-Aware Minimization: A More Faithful and Effective Implementation

arXiv:2603.10048v1 Announce Type: new Abstract: Sharpness-Aware Minimization (SAM) enhances generalization by minimizing the maximum training loss within a predefined neighborhood around the parameters. However, its practical implementation approximates this as gradient ascent(s) followed by applying the gradient at the ascent...

1 min 1 month, 1 week ago
ead
LOW Academic International

InFusionLayer: a CFA-based ensemble tool to generate new classifiers for learning and modeling

arXiv:2603.10049v1 Announce Type: new Abstract: Ensemble learning is a well established body of methods for machine learning to enhance predictive performance by combining multiple algorithms/models. Combinatorial Fusion Analysis (CFA) has provided method and practice for combining multiple scoring systems, using...

1 min 1 month, 1 week ago
tps
LOW Academic International

HTMuon: Improving Muon via Heavy-Tailed Spectral Correction

arXiv:2603.10067v1 Announce Type: new Abstract: Muon has recently shown promising results in LLM training. In this work, we study how to further improve Muon. We argue that Muon's orthogonalized update rule suppresses the emergence of heavy-tailed weight spectra and over-emphasizes...

1 min 1 month, 1 week ago
tps
LOW Academic International

Improving Search Agent with One Line of Code

arXiv:2603.10069v1 Announce Type: new Abstract: Tool-based Agentic Reinforcement Learning (TARL) has emerged as a promising paradigm for training search agents to interact with external tools for a multi-turn information-seeking process autonomously. However, we identify a critical training instability that leads...

1 min 1 month, 1 week ago
ead
LOW Academic United States

Marginals Before Conditionals

arXiv:2603.10074v1 Announce Type: new Abstract: We construct a minimal task that isolates conditional learning in neural networks: a surjective map with K-fold ambiguity, resolved by a selector token z, so H(A | B) = log K while H(A | B,...

1 min 1 month, 1 week ago
ead
LOW Academic United States

ES-dLLM: Efficient Inference for Diffusion Large Language Models by Early-Skipping

arXiv:2603.10088v1 Announce Type: new Abstract: Diffusion large language models (dLLMs) are emerging as a promising alternative to autoregressive models (ARMs) due to their ability to capture bidirectional context and the potential for parallel generation. Despite the advantages, dLLM inference remains...

1 min 1 month, 1 week ago
tps
LOW Academic European Union

A Survey of Weight Space Learning: Understanding, Representation, and Generation

arXiv:2603.10090v1 Announce Type: new Abstract: Neural network weights are typically viewed as the end product of training, while most deep learning research focuses on data, features, and architectures. However, recent advances show that the set of all possible weight values...

1 min 1 month, 1 week ago
tps
LOW Academic United States

Equivariant Asynchronous Diffusion: An Adaptive Denoising Schedule for Accelerated Molecular Conformation Generation

arXiv:2603.10093v1 Announce Type: new Abstract: Recent 3D molecular generation methods primarily use asynchronous auto-regressive or synchronous diffusion models. While auto-regressive models build molecules sequentially, they're limited by a short horizon and a discrepancy between training and inference. Conversely, synchronous diffusion...

1 min 1 month, 1 week ago
ead
LOW Academic United States

Rethinking Adam for Time Series Forecasting: A Simple Heuristic to Improve Optimization under Distribution Shifts

arXiv:2603.10095v1 Announce Type: new Abstract: Time-series forecasting often faces challenges from non-stationarity, particularly distributional drift, where the data distribution evolves over time. This dynamic behavior can undermine the effectiveness of adaptive optimizers, such as Adam, which are typically designed for...

1 min 1 month, 1 week ago
tps
LOW Academic International

Lost in the Middle at Birth: An Exact Theory of Transformer Position Bias

arXiv:2603.10123v1 Announce Type: new Abstract: The "Lost in the Middle" phenomenon -- a U-shaped performance curve where LLMs retrieve well from the beginning and end of a context but fail in the middle -- is widely attributed to learned Softmax...

1 min 1 month, 1 week ago
ead
LOW Academic European Union

Mashup Learning: Faster Finetuning by Remixing Past Checkpoints

arXiv:2603.10156v1 Announce Type: new Abstract: Finetuning on domain-specific data is a well-established method for enhancing LLM performance on downstream tasks. Training on each dataset produces a new set of model weights, resulting in a multitude of checkpoints saved in-house or...

1 min 1 month, 1 week ago
ead
LOW Academic International

DT-BEHRT: Disease Trajectory-aware Transformer for Interpretable Patient Representation Learning

arXiv:2603.10180v1 Announce Type: new Abstract: The growing adoption of electronic health record (EHR) systems has provided unprecedented opportunities for predictive modeling to guide clinical decision making. Structured EHRs contain longitudinal observations of patients across hospital visits, where each visit is...

1 min 1 month, 1 week ago
tps
LOW Academic United States

Discovery of a Hematopoietic Manifold in scGPT Yields a Method for Extracting Performant Algorithms from Biological Foundation Model Internals

arXiv:2603.10261v1 Announce Type: new Abstract: We report the discovery and extraction of a compact hematopoietic algorithm from the single-cell foundation model scGPT, to our knowledge the first biologically useful, competitive algorithm extracted from a foundation model via mechanistic interpretability. We...

1 min 1 month, 1 week ago
ead
LOW Academic United States

Taming Score-Based Denoisers in ADMM: A Convergent Plug-and-Play Framework

arXiv:2603.10281v1 Announce Type: new Abstract: While score-based generative models have emerged as powerful priors for solving inverse problems, directly integrating them into optimization algorithms such as ADMM remains nontrivial. Two central challenges arise: i) the mismatch between the noisy data...

1 min 1 month, 1 week ago
ead
LOW Academic United States

How to make the most of your masked language model for protein engineering

arXiv:2603.10302v1 Announce Type: new Abstract: A plethora of protein language models have been released in recent years. Yet comparatively little work has addressed how to best sample from them to optimize desired biological properties. We fill this gap by proposing...

1 min 1 month, 1 week ago
ead
LOW Academic European Union

Optimal Expert-Attention Allocation in Mixture-of-Experts: A Scalable Law for Dynamic Model Design

arXiv:2603.10379v1 Announce Type: new Abstract: This paper presents a novel extension of neural scaling laws to Mixture-of-Experts (MoE) models, focusing on the optimal allocation of compute between expert and attention sub-layers. As MoE architectures have emerged as an efficient method...

1 min 1 month, 1 week ago
ead
LOW Academic International

Variance-Aware Adaptive Weighting for Diffusion Model Training

arXiv:2603.10391v1 Announce Type: new Abstract: Diffusion models have recently achieved remarkable success in generative modeling, yet their training dynamics across different noise levels remain highly imbalanced, which can lead to inefficient optimization and unstable learning behavior. In this work, we...

1 min 1 month, 1 week ago
ead
LOW Academic International

On the Learning Dynamics of Two-layer Linear Networks with Label Noise SGD

arXiv:2603.10397v1 Announce Type: new Abstract: One crucial factor behind the success of deep learning lies in the implicit bias induced by noise inherent in gradient-based training algorithms. Motivated by empirical observations that training with noisy labels improves model generalization, we...

1 min 1 month, 1 week ago
tps
LOW News United States

The 14th Amendment’s citizenship clause does not codify English principles of subjectship

Critics and supporters of President Donald Trump’s executive order on birthright citizenship often focus on the order’s barring of automatic citizenship to children born to individuals unlawfully present in the […]

1 min 1 month, 1 week ago
citizenship
LOW News International

Zendesk acquires agentic customer service startup Forethought

Forethought was years ahead of its time and the 2018 winner of TechCrunch Battlefield.

1 min 1 month, 1 week ago
ead
LOW Law Review International

What is a Tort?

What is a tort, and what is tort law for? On one leading scholarly account, torts are legal liability rules that seek to promote the welfare of society at large by disincentivizing socially suboptimal behavior and distributing the costs of...

1 min 1 month, 1 week ago
ead
LOW Academic International

Context Engineering: From Prompts to Corporate Multi-Agent Architecture

arXiv:2603.09619v1 Announce Type: new Abstract: As artificial intelligence (AI) systems evolve from stateless chatbots to autonomous multi-step agents, prompt engineering (PE), the discipline of crafting individual queries, proves necessary but insufficient. This paper introduces context engineering (CE) as a standalone...

1 min 1 month, 1 week ago
ead
LOW Academic International

Let's Verify Math Questions Step by Step

arXiv:2505.13903v1 Announce Type: cross Abstract: Large Language Models (LLMs) have recently achieved remarkable progress in mathematical reasoning. To enable such capabilities, many existing works distill strong reasoning models into long chains of thought or design algorithms to construct high-quality math...

1 min 1 month, 1 week ago
tps

Impact Distribution

Critical 0
High 0
Medium 7
Low 2110