Academic

Academic · 1 min

Reasoning Models Struggle to Control their Chains of Thought

arXiv:2603.05706v1. Abstract: Chain-of-thought (CoT) monitoring is a promising tool for detecting misbehaviors and understanding the motivations of modern reasoning models. However, if …

Chen Yueh-Han, Robert McCarthy, Bruce W. Lee, He He, Ian Kivlichan, Bowen Baker, Micah Carroll, Tomek Korbak
Academic · 1 min

Boosting deep Reinforcement Learning using pretraining with Logical Options

arXiv:2603.06565v1. Abstract: Deep reinforcement learning agents are often misaligned, as they over-exploit early reward signals. Recently, several symbolic approaches have addressed these …

Zihan Ye, Phil Chau, Raban Emunds, Jannis Blüml, Cedric Derstroff, Quentin Delfosse, Oleg Arenz, Kristian Kersting
Academic · 1 min

Offline Materials Optimization with CliqueFlowmer

arXiv:2603.06082v1. Abstract: Recent advances in deep learning inspired neural network-based approaches to computational materials discovery (CMD). A plethora of problems in this …

Jakub Grudzien Kuba, Benjamin Kurt Miller, Sergey Levine, Pieter Abbeel
Academic · 1 min

Talk Freely, Execute Strictly: Schema-Gated Agentic AI for Flexible and Reproducible Scientific Workflows

arXiv:2603.06394v1. Abstract: Large language models (LLMs) can now translate a researcher's plain-language goal into executable computation, yet scientific workflows demand determinism, provenance, …

Joel Strickland, Arjun Vijeta, Chris Moores, Oliwia Bodek, Bogdan Nenchev, Thomas Whitehead, Charles Phillips, Karl Tassenberg, Gareth Conduit, Ben Pellegrini