Tax Law

LOW Academic International

A Coin Flip for Safety: LLM Judges Fail to Reliably Measure Adversarial Robustness

arXiv:2603.06594v1 Announce Type: new Abstract: Automated "LLM-as-a-Judge" frameworks have become the de facto standard for scalable evaluation across natural language processing. For instance, in safety evaluation, these judges are relied upon to evaluate harmfulness in order to benchmark the robustness...

1 min 1 month, 1 week ago

audit

LOW Academic International

Can Safety Emerge from Weak Supervision? A Systematic Analysis of Small Language Models

arXiv:2603.07017v1 Announce Type: new Abstract: Safety alignment is critical for deploying large language models (LLMs) in real-world applications, yet most existing approaches rely on large human-annotated datasets and static red-teaming benchmarks that are costly, difficult to scale, and slow to...

1 min 1 month, 1 week ago

vat

LOW Academic International

AutoChecklist: Composable Pipelines for Checklist Generation and Scoring with LLM-as-a-Judge

arXiv:2603.07019v1 Announce Type: new Abstract: Checklists have emerged as a popular approach for interpretable and fine-grained evaluation, particularly with LLM-as-a-Judge. Beyond evaluation, these structured criteria can serve as signals for model alignment, reinforcement learning, and self-correction. To support these use...

1 min 1 month, 1 week ago

tax

LOW Academic International

Few Tokens, Big Leverage: Preserving Safety Alignment by Constraining Safety Tokens during Fine-tuning

arXiv:2603.07445v1 Announce Type: new Abstract: Large language models (LLMs) often require fine-tuning (FT) to perform well on downstream tasks, but FT can induce safety-alignment drift even when the training dataset contains only benign data. Prior work shows that introducing a...

1 min 1 month, 1 week ago

vat

LOW Academic International

Cross-Modal Taxonomic Generalization in (Vision-) Language Models

arXiv:2603.07474v1 Announce Type: new Abstract: What is the interplay between semantic representations learned by language models (LM) from surface form alone to those learned from more grounded evidence? We study this question for a scenario where part of the input...

1 min 1 month, 1 week ago

tax

LOW Academic International

Benchmarking Large Language Models for Quebec Insurance: From Closed-Book to Retrieval-Augmented Generation

arXiv:2603.07825v1 Announce Type: new Abstract: The digitization of insurance distribution in the Canadian province of Quebec, accelerated by legislative changes such as Bill 141, has created a significant "advice gap", leaving consumers to interpret complex financial contracts without professional guidance....

1 min 1 month, 1 week ago

vat

LOW Academic International

vLLM Hook v0: A Plug-in for Programming Model Internals on vLLM

arXiv:2603.06588v1 Announce Type: new Abstract: Modern artificial intelligence (AI) models are deployed on inference engines to optimize runtime efficiency and resource allocation, particularly for transformer-based large language models (LLMs). The vLLM project is a major open-source library to support model...

1 min 1 month, 1 week ago

vat

LOW Academic International

Not all tokens are needed (NAT): token efficient reinforcement learning

arXiv:2603.06619v1 Announce Type: new Abstract: Reinforcement learning (RL) has become a key driver of progress in large language models, but scaling RL to long chain-of-thought (CoT) trajectories is increasingly constrained by backpropagation over every generated token. Even with optimized rollout...

1 min 1 month, 1 week ago

tax

LOW Academic International

Enhancing Instruction Following of LLMs via Activation Steering with Dynamic Rejection

arXiv:2603.06745v1 Announce Type: new Abstract: Large Language Models (LLMs), despite advances in instruction tuning, often fail to follow complex user instructions. Activation steering techniques aim to mitigate this by manipulating model internals, but have a potential risk of oversteering, where...

1 min 1 month, 1 week ago

vat

LOW Academic International

Latent Autoencoder Ensemble Kalman Filter for Data assimilation

arXiv:2603.06752v1 Announce Type: new Abstract: The ensemble Kalman filter (EnKF) is widely used for data assimilation in high-dimensional systems, but its performance often deteriorates for strongly nonlinear dynamics due to the structural mismatch between the Kalman update and the underlying...

1 min 1 month, 1 week ago

vat

LOW Academic International

DeepFact: Co-Evolving Benchmarks and Agents for Deep Research Factuality

arXiv:2603.05912v1 Announce Type: new Abstract: Search-augmented LLM agents can produce deep research reports (DRRs), but verifying claim-level factuality remains challenging. Existing fact-checkers are primarily designed for general-domain, factoid-style atomic claims, and there is no benchmark to test whether such verifiers...

1 min 1 month, 1 week ago

audit

LOW Academic International

Reasoning Models Struggle to Control their Chains of Thought

arXiv:2603.05706v1 Announce Type: new Abstract: Chain-of-thought (CoT) monitoring is a promising tool for detecting misbehaviors and understanding the motivations of modern reasoning models. However, if models can control what they verbalize in their CoT, it could undermine CoT monitorability. To...

1 min 1 month, 1 week ago

vat

LOW Academic International

SAHOO: Safeguarded Alignment for High-Order Optimization Objectives in Recursive Self-Improvement

arXiv:2603.06333v1 Announce Type: new Abstract: Recursive self-improvement is moving from theory to practice: modern systems can critique, revise, and evaluate their own outputs, yet iterative self-modification risks subtle alignment drift. We introduce SAHOO, a practical framework to monitor and control...

1 min 1 month, 1 week ago

vat

LOW Academic International

The DSA's Blind Spot: Algorithmic Audit of Advertising and Minor Profiling on TikTok

arXiv:2603.05653v1 Announce Type: cross Abstract: Adolescents spend an increasing amount of their time in digital environments where their still-developing cognitive capacities leave them unable to recognize or resist commercial persuasion. Article 28(2) of the Digital Services Act (DSA) responds to...

1 min 1 month, 1 week ago

audit

LOW Academic International

CodeScout: Contextual Problem Statement Enhancement for Software Agents

arXiv:2603.05744v1 Announce Type: new Abstract: Current AI-powered code assistance tools often struggle with poorly-defined problem statements that lack sufficient task context and requirements specification. Recent analysis of software engineering agents reveals that failures on such underspecified requests are highly correlated...

1 min 1 month, 1 week ago

vat

LOW Academic International

PVminerLLM: Structured Extraction of Patient Voice from Patient-Generated Text using Large Language Models

arXiv:2603.05776v1 Announce Type: new Abstract: Motivation: Patient-generated text contains critical information about patients' lived experiences, social circumstances, and engagement in care, including factors that strongly influence adherence, care coordination, and health equity. However, these patient voice signals are rarely available...

1 min 1 month, 1 week ago

vat

LOW Academic International

RouteGoT: Node-Adaptive Routing for Cost-Efficient Graph of Thoughts Reasoning

arXiv:2603.05818v1 Announce Type: new Abstract: Large Language Models (LLMs) excel at multi-step reasoning, yet increasing the structural complexity of inference does not consistently improve system-level returns. Methods such as Tree of Thoughts (ToT), Graph of Thoughts (GoT), and Adaptive Graph...

1 min 1 month, 1 week ago

vat

LOW Academic International

Lost in Stories: Consistency Bugs in Long Story Generation by LLMs

arXiv:2603.05890v1 Announce Type: new Abstract: What happens when a storyteller forgets its own story? Large Language Models (LLMs) can now generate narratives spanning tens of thousands of words, but they often fail to maintain consistency throughout. When generating long-form narratives,...

1 min 1 month, 1 week ago

tax

LOW Academic International

Learning Next Action Predictors from Human-Computer Interaction

arXiv:2603.05923v1 Announce Type: new Abstract: Truly proactive AI systems must anticipate what we will do next. This foresight demands far richer information than the sparse signals we type into our prompts -- it demands reasoning over the entire context of...

1 min 1 month, 1 week ago

vat

LOW Academic International

SPOT: Span-level Pause-of-Thought for Efficient and Interpretable Latent Reasoning in Large Language Models

arXiv:2603.06222v1 Announce Type: new Abstract: Explicit Chain-of-Thought improves the reasoning performance of large language models but often incurs high inference cost due to verbose token-level traces. While recent approaches reduce this overhead via concise prompting or step pruning, they largely...

1 min 1 month, 1 week ago

audit

LOW Academic International

Mind the Gap: Pitfalls of LLM Alignment with Asian Public Opinion

arXiv:2603.06264v1 Announce Type: new Abstract: Large Language Models (LLMs) are increasingly being deployed in multilingual, multicultural settings, yet their reliance on predominantly English-centric training data risks misalignment with the diverse cultural values of different societies. In this paper, we present...

1 min 1 month, 1 week ago

audit

LOW Academic International

Abductive Reasoning with Syllogistic Forms in Large Language Models

arXiv:2603.06428v1 Announce Type: new Abstract: Research in AI using Large-Language Models (LLMs) is rapidly evolving, and the comparison of their performance with human reasoning has become a key concern. Prior studies have indicated that LLMs and humans share similar biases,...

1 min 1 month, 1 week ago

deduction

LOW Academic International

Beyond Rows to Reasoning: Agentic Retrieval for Multimodal Spreadsheet Understanding and Editing

arXiv:2603.06503v1 Announce Type: new Abstract: Recent advances in multimodal Retrieval-Augmented Generation (RAG) enable Large Language Models (LLMs) to analyze enterprise spreadsheet workbooks containing millions of cells, cross-sheet dependencies, and embedded visual artifacts. However, state-of-the-art approaches exclude critical context through single-pass...

1 min 1 month, 1 week ago

audit

LOW Academic International

Score-Guided Proximal Projection: A Unified Geometric Framework for Rectified Flow Editing

arXiv:2603.05761v1 Announce Type: new Abstract: Rectified Flow (RF) models achieve state-of-the-art generation quality, yet controlling them for precise tasks -- such as semantic editing or blind image recovery -- remains a challenge. Current approaches bifurcate into inversion-based guidance, which suffers...

1 min 1 month, 1 week ago

vat

LOW Academic International

Sparse Crosscoders for diffing MoEs and Dense models

arXiv:2603.05805v1 Announce Type: new Abstract: Mixture of Experts (MoE) achieve parameter-efficient scaling through sparse expert routing, yet their internal representations remain poorly understood compared to dense models. We present a systematic comparison of MoE and dense model internals using crosscoders,...

1 min 1 month, 1 week ago

vat

LOW Academic International

MoE Lens -- An Expert Is All You Need

arXiv:2603.05806v1 Announce Type: new Abstract: Mixture of Experts (MoE) models enable parameter-efficient scaling through sparse expert activations, yet optimizing their inference and memory costs remains challenging due to limited understanding of their specialization behavior. We present a systematic analysis of...

1 min 1 month, 1 week ago

vat

LOW Academic International

Self-Auditing Parameter-Efficient Fine-Tuning for Few-Shot 3D Medical Image Segmentation

arXiv:2603.05822v1 Announce Type: new Abstract: Adapting foundation models to new clinical sites remains challenging in practice. Domain shift and scarce annotations must be handled by experts, yet many clinical groups do not have ready access to skilled AI engineers to...

1 min 1 month, 1 week ago

audit

LOW Academic International

Dynamic Momentum Recalibration in Online Gradient Learning

arXiv:2603.06120v1 Announce Type: new Abstract: Stochastic Gradient Descent (SGD) and its momentum variants form the backbone of deep learning optimization, yet the underlying dynamics of their gradient behavior remain insufficiently understood. In this work, we reinterpret gradient updates through the...

1 min 1 month, 1 week ago

vat

LOW Academic International

Gradient Flow Polarizes Softmax Outputs towards Low-Entropy Solutions

arXiv:2603.06248v1 Announce Type: new Abstract: Understanding the intricate non-convex training dynamics of softmax-based models is crucial for explaining the empirical success of transformers. In this article, we analyze the gradient flow dynamics of the value-softmax model, defined as ${L}(\mathbf{V} \sigma(\mathbf{a}))$,...

1 min 1 month, 1 week ago

vat

LOW Academic International

Computation of fluxes of conservation laws

1 min 1 month, 1 week ago

vat
