International Law

LOW Academic International

Evaluating Austrian A-Level German Essays with Large Language Models for Automated Essay Scoring

arXiv:2603.06066v1 Announce Type: new Abstract: Automated Essay Scoring (AES) has been explored for decades with the goal to support teachers by reducing grading workload and mitigating subjective biases. While early systems relied on handcrafted features and statistical models, recent advances...

1 min 1 month, 1 week ago

LOW Academic International

A Causal Graph Approach to Oppositional Narrative Analysis

arXiv:2603.06135v1 Announce Type: new Abstract: Current methods for textual analysis rely on data annotated within predefined ontologies, often embedding human bias within black-box models. Despite achieving near-perfect performance, these approaches exploit unstructured, linear pattern recognition rather than modeling the structured...

1 min 1 month, 1 week ago

LOW Academic International

MAPO: Mixed Advantage Policy Optimization for Long-Horizon Multi-Turn Dialogue

arXiv:2603.06194v1 Announce Type: new Abstract: Subjective multi-turn dialogue tasks, such as emotional support, require conversational policies that adapt to evolving user states and optimize long-horizon interaction quality. However, reinforcement learning (RL) for such settings remains challenging due to the absence...

1 min 1 month, 1 week ago

LOW Academic International

Wisdom of the AI Crowd (AI-CROWD) for Ground Truth Approximation in Content Analysis: A Research Protocol & Validation Using Eleven Large Language Models

arXiv:2603.06197v1 Announce Type: new Abstract: Large-scale content analysis is increasingly limited by the absence of observable ground truth or gold-standard labels, as creating such benchmarks through extensive human coding becomes impractical for massive datasets due to high time, cost, and...

1 min 1 month, 1 week ago

LOW Academic International

FlashPrefill: Instantaneous Pattern Discovery and Thresholding for Ultra-Fast Long-Context Prefilling

arXiv:2603.06199v1 Announce Type: new Abstract: Long-context modeling is a pivotal capability for Large Language Models, yet the quadratic complexity of attention remains a critical bottleneck, particularly during the compute-intensive prefilling phase. While various sparse attention mechanisms have been explored, they...

1 min 1 month, 1 week ago

LOW Academic International

Transparent AI for Mathematics: Transformer-Based Large Language Models for Mathematical Entity Relationship Extraction with XAI

arXiv:2603.06348v1 Announce Type: new Abstract: Mathematical text understanding is a challenging task due to the presence of specialized entities and complex relationships between them. This study formulates mathematical problem interpretation as a Mathematical Entity Relation Extraction (MERE) task, where operands...

1 min 1 month, 1 week ago

LOW Academic International

Abductive Reasoning with Syllogistic Forms in Large Language Models

arXiv:2603.06428v1 Announce Type: new Abstract: Research in AI using Large-Language Models (LLMs) is rapidly evolving, and the comparison of their performance with human reasoning has become a key concern. Prior studies have indicated that LLMs and humans share similar biases,...

1 min 1 month, 1 week ago

LOW Academic International

PONTE: Personalized Orchestration for Natural Language Trustworthy Explanations

arXiv:2603.06485v1 Announce Type: new Abstract: Explainable Artificial Intelligence (XAI) seeks to enhance the transparency and accountability of machine learning systems, yet most methods follow a one-size-fits-all paradigm that neglects user differences in expertise, goals, and cognitive needs. Although Large Language...

1 min 1 month, 1 week ago

LOW Academic International

Speak in Context: Multilingual ASR with Speech Context Alignment via Contrastive Learning

arXiv:2603.06505v1 Announce Type: new Abstract: Automatic speech recognition (ASR) has benefited from advances in pretrained speech and language models, yet most systems remain constrained to monolingual settings and short, isolated utterances. While recent efforts in context-aware ASR show promise, two...

1 min 1 month, 1 week ago

LOW Academic International

FuseDiff: Symmetry-Preserving Joint Diffusion for Dual-Target Structure-Based Drug Design

arXiv:2603.05567v1 Announce Type: new Abstract: Dual-target structure-based drug design aims to generate a single ligand together with two pocket-specific binding poses, each compatible with a corresponding target pocket, enabling polypharmacological therapies with improved efficacy and reduced resistance. Existing approaches typically...

1 min 1 month, 1 week ago

LOW Academic International

The Value of Graph-based Encoding in NBA Salary Prediction

arXiv:2603.05671v1 Announce Type: new Abstract: Market valuation for professional athletes is a difficult problem, given the amount of variability in performance and location from year to year. In the National Basketball Association (NBA), a straightforward way to address this problem...

1 min 1 month, 1 week ago

LOW Academic International

Reinforcement Learning for Power-Flow Network Analysis

arXiv:2603.05673v1 Announce Type: new Abstract: The power flow equations are non-linear multivariate equations that describe the relationship between power injections and bus voltages of electric power networks. Given a network topology, we are interested in finding network parameters with many...

1 min 1 month, 1 week ago

LOW Academic International

Improved Scaling Laws via Weak-to-Strong Generalization in Random Feature Ridge Regression

arXiv:2603.05691v1 Announce Type: new Abstract: It is increasingly common in machine learning to use learned models to label data and then employ such data to train more capable models. The phenomenon of weak-to-strong generalization exemplifies the advantage of this two-stage...

1 min 1 month, 1 week ago

LOW Academic International

Revisiting the (Sub)Optimality of Best-of-N for Inference-Time Alignment

arXiv:2603.05739v1 Announce Type: new Abstract: Best-of-N (BoN) sampling is a widely used inference-time alignment method for language models, whereby N candidate responses are sampled from a reference model and the one with the highest predicted reward according to a learned...

1 min 1 month, 1 week ago

LOW Academic International

MIRACL: A Diverse Meta-Reinforcement Learning for Multi-Objective Multi-Echelon Combinatorial Supply Chain Optimisation

arXiv:2603.05760v1 Announce Type: new Abstract: Multi-objective reinforcement learning (MORL) is effective for multi-echelon combinatorial supply chain optimisation, where tasks involve high dimensionality, uncertainty, and competing objectives. However, its deployment in dynamic environments is hindered by the need for task-specific retraining...

1 min 1 month, 1 week ago

LOW Academic International

Bridging Domains through Subspace-Aware Model Merging

arXiv:2603.05768v1 Announce Type: new Abstract: Model merging integrates multiple task-specific models into a single consolidated one. Recent research has made progress in improving merging performance for in-distribution or multi-task scenarios, but domain generalization in model merging remains underexplored. We investigate...

1 min 1 month, 1 week ago

LOW Academic International

First-Order Softmax Weighted Switching Gradient Method for Distributed Stochastic Minimax Optimization with Stochastic Constraints

arXiv:2603.05774v1 Announce Type: new Abstract: This paper addresses the distributed stochastic minimax optimization problem subject to stochastic constraints. We propose a novel first-order Softmax-Weighted Switching Gradient method tailored for federated learning. Under full client participation, our algorithm achieves the standard...

1 min 1 month, 1 week ago

LOW Academic International

Sparse Crosscoders for diffing MoEs and Dense models

arXiv:2603.05805v1 Announce Type: new Abstract: Mixture of Experts (MoE) models achieve parameter-efficient scaling through sparse expert routing, yet their internal representations remain poorly understood compared to dense models. We present a systematic comparison of MoE and dense model internals using crosscoders,...

1 min 1 month, 1 week ago

LOW Academic International

MoE Lens -- An Expert Is All You Need

arXiv:2603.05806v1 Announce Type: new Abstract: Mixture of Experts (MoE) models enable parameter-efficient scaling through sparse expert activations, yet optimizing their inference and memory costs remains challenging due to limited understanding of their specialization behavior. We present a systematic analysis of...

1 min 1 month, 1 week ago

LOW Academic International

Self-Auditing Parameter-Efficient Fine-Tuning for Few-Shot 3D Medical Image Segmentation

arXiv:2603.05822v1 Announce Type: new Abstract: Adapting foundation models to new clinical sites remains challenging in practice. Domain shift and scarce annotations must be handled by experts, yet many clinical groups do not have ready access to skilled AI engineers to...

1 min 1 month, 1 week ago

LOW Academic International

Test-Time Adaptation via Many-Shot Prompting: Benefits, Limits, and Pitfalls

arXiv:2603.05829v1 Announce Type: new Abstract: Test-time adaptation enables large language models (LLMs) to modify their behavior at inference without updating model parameters. A common approach is many-shot prompting, where large numbers of in-context learning (ICL) examples are injected as an...

1 min 1 month, 1 week ago

LOW Academic International

Reference-guided Policy Optimization for Molecular Optimization via LLM Reasoning

arXiv:2603.05900v1 Announce Type: new Abstract: Large language models (LLMs) benefit substantially from supervised fine-tuning (SFT) and reinforcement learning with verifiable rewards (RLVR) in reasoning tasks. However, these recipes perform poorly in instruction-based molecular optimization, where each data point typically provides...

1 min 1 month, 1 week ago

LOW Academic International

Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence

arXiv:2603.05960v1 Announce Type: new Abstract: Memory-efficient optimization methods have recently gained increasing attention for scaling full-parameter training of large language models under the GPU-memory bottleneck. Existing approaches either lack clear convergence guarantees, or only achieve the standard ${\mathcal{O}}(\epsilon^{-4})$ iteration complexity...

1 min 1 month, 1 week ago

LOW Academic International

EvoESAP: Non-Uniform Expert Pruning for Sparse MoE

arXiv:2603.06003v1 Announce Type: new Abstract: Sparse Mixture-of-Experts (SMoE) language models achieve strong capability at low per-token compute, yet deployment remains memory- and throughput-bound because the full expert pool must be stored and served. Post-training expert pruning reduces this cost, but...

1 min 1 month, 1 week ago

LOW Academic International

Preventing Learning Stagnation in PPO by Scaling to 1 Million Parallel Environments

arXiv:2603.06009v1 Announce Type: new Abstract: Plateaus, where an agent's performance stagnates at a suboptimal level, are a common problem in deep on-policy RL. Focusing on PPO due to its widespread adoption, we show that plateaus in certain regimes arise not...

1 min 1 month, 1 week ago

LOW Academic International

Agnostic learning in (almost) optimal time via Gaussian surface area

arXiv:2603.06027v1 Announce Type: new Abstract: The complexity of learning a concept class under Gaussian marginals in the difficult agnostic model is closely related to its $L_1$-approximability by low-degree polynomials. For any concept class with Gaussian surface area at most $\Gamma$,...

1 min 1 month, 1 week ago

LOW Academic International

Dynamic Momentum Recalibration in Online Gradient Learning

arXiv:2603.06120v1 Announce Type: new Abstract: Stochastic Gradient Descent (SGD) and its momentum variants form the backbone of deep learning optimization, yet the underlying dynamics of their gradient behavior remain insufficiently understood. In this work, we reinterpret gradient updates through the...

1 min 1 month, 1 week ago

LOW Academic International

DQE: A Semantic-Aware Evaluation Metric for Time Series Anomaly Detection

arXiv:2603.06131v1 Announce Type: new Abstract: Time series anomaly detection has achieved remarkable progress in recent years. However, evaluation practices have received comparatively less attention, despite their critical importance. Existing metrics exhibit several limitations: (1) bias toward point-level coverage, (2) insensitivity...

1 min 1 month, 1 week ago

LOW Academic International

Partial Policy Gradients for RL in LLMs

arXiv:2603.06138v1 Announce Type: new Abstract: Reinforcement learning is a framework for learning to act sequentially in an unknown environment. We propose a natural approach for modeling policy structure in policy gradients. The key idea is to optimize for a subset...

1 min 1 month, 1 week ago

LOW Academic International

Topological descriptors of foot clearance gait dynamics improve differential diagnosis of Parkinsonism

arXiv:2603.06212v1 Announce Type: new Abstract: Differential diagnosis among parkinsonian syndromes remains a clinical challenge due to overlapping motor symptoms and subtle gait abnormalities. Accurate differentiation is crucial for treatment planning and prognosis. While gait analysis is a well established approach...

1 min 1 month, 1 week ago
