Evolving Medical Imaging Agents via Experience-driven Self-skill Discovery

arXiv:2603.05860v1 (announce type: new)

Abstract: Clinical image interpretation is inherently multi-step and tool-centric: clinicians iteratively combine visual evidence with patient context, quantify findings, and refine their decisions through a sequence of specialized procedures. While LLM-based agents promise to orchestrate such heterogeneous medical tools, existing systems treat tool sets and invocation strategies as static after deployment. This design is brittle under real-world domain shifts, across tasks, and evolving diagnostic requirements, where predefined tool chains frequently degrade and demand costly manual re-design. We propose MACRO, a self-evolving, experience-augmented medical agent that shifts from static tool composition to experience-driven tool discovery. From verified execution trajectories, the agent autonomously identifies recurring effective multi-step tool sequences, synthesizes them into reusable composite tools, and registers these as new high-level primitives that continuously expand its behavioral repertoire. A lightweight image-feature memory grounds tool selection in a visual-clinical context, while a GRPO-like training loop reinforces reliable invocation of discovered composites, enabling closed-loop self-improvement with minimal supervision. Extensive experiments across diverse medical imaging datasets and tasks demonstrate that autonomous composite tool discovery consistently improves multi-step orchestration accuracy and cross-domain generalization over strong baselines and recent state-of-the-art agentic methods, bridging the gap between brittle static tool use and adaptive, context-aware clinical AI assistance. Code will be available upon acceptance.

Executive Summary

The article introduces MACRO, a self-evolving medical imaging agent designed to address the rigidity of static tool chains in clinical image interpretation. MACRO mines verified execution trajectories for recurring, effective multi-step tool sequences, synthesizes them into reusable composite tools, and registers those composites as new high-level primitives, expanding its repertoire without manual redesign. A lightweight image-feature memory grounds tool selection in visual-clinical context, and a GRPO-like training loop reinforces reliable invocation of the discovered composites, so the agent can keep improving with minimal supervision. The authors report that, across diverse medical imaging datasets and tasks, this improves multi-step orchestration accuracy and cross-domain generalization over strong baselines and recent agentic methods. Code is not yet released; the abstract states it will be available upon acceptance.
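The abstract does not say how the image-feature memory is built or queried. As an illustration only, the sketch below assumes the memory stores (feature vector, tool name) pairs from earlier successful runs and retrieves tools by cosine similarity to the current image's features. The class `FeatureMemory`, its methods, the three-dimensional vectors, and the tool names are all hypothetical and are not taken from the paper.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class FeatureMemory:
    """Hypothetical store of (image feature, tool name) pairs from past runs."""

    def __init__(self) -> None:
        self.entries: List[Tuple[List[float], str]] = []

    def add(self, feature: Sequence[float], tool: str) -> None:
        self.entries.append((list(feature), tool))

    def suggest(self, query: Sequence[float], k: int = 3) -> List[str]:
        """Return tools attached to the k most similar stored features, best first."""
        ranked = sorted(self.entries, key=lambda e: cosine(query, e[0]), reverse=True)
        suggestions: List[str] = []
        for _, tool in ranked[:k]:
            if tool not in suggestions:  # drop duplicates, keep rank order
                suggestions.append(tool)
        return suggestions


# Toy features stand in for real image embeddings (an assumption).
memory = FeatureMemory()
memory.add([1.0, 0.0, 0.0], "segment_then_measure")
memory.add([0.0, 1.0, 0.0], "classify")
memory.add([0.9, 0.1, 0.0], "measure")
top = memory.suggest([1.0, 0.0, 0.1], k=2)
print(top)
```

In this toy setup the query lies closest to the first and third stored features, so their tools are suggested ahead of `classify`. A real system would presumably use learned image embeddings and an approximate nearest-neighbor index rather than a linear scan.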

Key Points

  • Introduces MACRO, a self-evolving, experience-augmented medical imaging agent
  • Mines verified execution trajectories for recurring, effective multi-step tool sequences
  • Synthesizes those sequences into composite tools and registers them as reusable high-level primitives
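The discovery step in the key points can be sketched as frequent-subsequence mining over tool-call logs. This is a minimal illustration under stated assumptions, not the authors' method: it counts contiguous tool sequences across verified trajectories, keeps those that meet a support threshold, and wraps one of them as a single callable. The function names, the thresholds, and the tool names (`denoise`, `segment`, `measure`, `classify`) are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence, Tuple

Trajectory = Sequence[str]  # ordered tool names from one verified run


def mine_composites(
    trajectories: List[Trajectory],
    min_len: int = 2,
    max_len: int = 4,
    min_support: int = 2,
) -> List[Tuple[Tuple[str, ...], int]]:
    """Return contiguous tool sequences seen in at least `min_support` trajectories."""
    support: Counter = Counter()
    for traj in trajectories:
        seen = set()
        for n in range(min_len, max_len + 1):
            for i in range(len(traj) - n + 1):
                seen.add(tuple(traj[i : i + n]))
        support.update(seen)  # count each sequence once per trajectory
    frequent = [(seq, c) for seq, c in support.items() if c >= min_support]
    # Highest support first, then longer sequences; last key makes order deterministic.
    frequent.sort(key=lambda item: (-item[1], -len(item[0]), item[0]))
    return frequent


def register_composite(registry: Dict[str, Callable], steps: Tuple[str, ...]) -> str:
    """Wrap a tool sequence as one callable and add it to the registry."""
    name = "composite__" + "__".join(steps)
    tools = [registry[s] for s in steps]  # bind the primitive tools now

    def composite(x):
        for tool in tools:
            x = tool(x)
        return x

    registry[name] = composite
    return name


# Stand-in tools that just record their own name (placeholders for real tools).
registry: Dict[str, Callable] = {
    t: (lambda x, t=t: x + [t]) for t in ["denoise", "segment", "measure", "classify"]
}
verified = [
    ["denoise", "segment", "measure"],
    ["denoise", "segment", "classify"],
    ["segment", "measure"],
]
found = mine_composites(verified)
name = register_composite(registry, found[0][0])
print(found, name, registry[name]([]))
```

Counting support once per trajectory keeps a single long run that repeats a step from dominating the statistics. The paper may well weight sequences by outcome quality or allow gaps between steps; neither is modeled here.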

Merits

Innovation

MACRO introduces a dynamic, experience-driven discovery mechanism that adapts to domain shifts and evolving diagnostic requirements, a significant departure from static tool chains.

Practical Impact

The ability to autonomously discover and register composites enhances scalability, reduces manual redesign costs, and improves clinical AI assistance in real-world settings.

Demerits

Generalizability Concern

The abstract reports strong results across diverse datasets and tasks, but it does not describe evaluation within specific clinical workflows or institutional settings, so applicability there would need further validation.

Complexity Tradeoff

The added layer of autonomous discovery may introduce computational overhead or operational complexity that could affect deployment in resource-constrained environments.

Expert Commentary

MACRO represents a notable shift in the design of medical AI agents: the tool set is no longer fixed at deployment but grows from the agent's own experience. Discovering composites from verified trajectories aligns with principles of adaptive learning in complex systems, and the lightweight image-feature memory grounds tool selection in the visual-clinical context of each case. The GRPO-like training loop adds a closed-loop reinforcement component that, according to the abstract, improves the reliability of composite invocation with minimal supervision, which is a key enabler for scalable autonomy. The approach could serve as a template for other domains where static tool configurations degrade as conditions change. How well the system guards against reinforcing spurious composites cannot be judged from the abstract alone and will depend on how trajectories are verified. If the reported results hold up, the work is a meaningful advance for agentic systems in clinical imaging.
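The abstract calls the training loop "GRPO-like" without giving details. For orientation, the core idea of GRPO (Group Relative Policy Optimization) is to score each sampled rollout relative to the other rollouts drawn for the same input, rather than against a learned value baseline. The sketch below shows only that normalization step. The binary reward scheme and the use of population standard deviation are assumptions for illustration, not details from the paper.

```python
from statistics import mean, pstdev
from typing import List


def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize each reward against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; implementations differ on this choice
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical rewards for four sampled trajectories on the same case:
# 1.0 = verified-correct outcome, 0.0 = incorrect.
rewards = [1.0, 0.0, 1.0, 0.0]
advantages = group_relative_advantages(rewards)
print(advantages)
```

Rollouts that beat their group's average receive positive advantages and are reinforced; the rest are pushed down. When every rollout in a group earns the same reward, all advantages are zero and that group contributes no learning signal.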

Recommendations

  • Researchers should extend MACRO’s framework to include human-in-the-loop validation mechanisms to ensure safety in high-stakes clinical settings.
  • Clinicians and AI developers should pilot MACRO-style architectures in controlled diagnostic environments to assess real-time adaptability and efficacy before full-scale deployment.