RbtAct: Rebuttal as Supervision for Actionable Review Feedback Generation
arXiv:2603.09723v1 Announce Type: new Abstract: Large language models (LLMs) are increasingly used across the scientific workflow, including to draft peer-review reports. However, many AI-generated reviews are superficial and insufficiently actionable, leaving authors without concrete, implementable guidance and motivating the gap this work addresses. We propose RbtAct, which targets actionable review feedback generation and places existing peer review rebuttal at the center of learning. Rebuttals show which reviewer comments led to concrete revisions or specific plans, and which were only defended. Building on this insight, we leverage rebuttal as implicit supervision to directly optimize a feedback generator for actionability. To support this objective, we propose a new task called perspective-conditioned segment-level review feedback generation, in which the model is required to produce a single focused comment based on the complete paper and a specified perspective such as experiments and writing. We also build a large dataset named RMR-75K that maps review segments to the rebuttal segments that address them, with perspective labels and impact categories that order author uptake. We then train the Llama-3.1-8B-Instruct model with supervised fine-tuning on review segments followed by preference optimization using rebuttal derived pairs. Experiments with human experts and LLM-as-a-judge show consistent gains in actionability and specificity over strong baselines while maintaining grounding and relevance.
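The abstract describes mapping review segments to the rebuttal segments that address them, with impact categories that order author uptake, and deriving preference pairs from these. A minimal sketch of how such pairs might be constructed; the impact label names, ranking, and pairing rule here are illustrative assumptions, not the paper's actual scheme:

```python
from dataclasses import dataclass

# Hypothetical impact labels ordering author uptake: a comment that triggered
# a concrete revision ranks above one that only drew a defence. The label
# names and the ranking are assumptions for illustration.
IMPACT_RANK = {"revision": 2, "planned_change": 1, "defended_only": 0}

@dataclass
class ReviewSegment:
    text: str
    perspective: str  # e.g. "experiments", "writing"
    impact: str       # one of the keys in IMPACT_RANK

def build_preference_pairs(segments):
    """Pair segments that share a perspective so that the one with higher
    author uptake becomes the 'chosen' example for preference tuning."""
    pairs = []
    for i, a in enumerate(segments):
        for b in segments[i + 1:]:
            if a.perspective != b.perspective:
                continue
            ra, rb = IMPACT_RANK[a.impact], IMPACT_RANK[b.impact]
            if ra == rb:
                continue  # equal uptake gives no preference signal
            chosen, rejected = (a, b) if ra > rb else (b, a)
            pairs.append((chosen.text, rejected.text))
    return pairs
```

For example, a comment that led to a revision would be preferred over a same-perspective comment that was only defended, while segments from different perspectives are never paired.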
Executive Summary
The paper proposes RbtAct, an approach to generating actionable review feedback in peer review that uses author rebuttals as implicit supervision. A Llama-3.1-8B-Instruct model is trained on RMR-75K, a large dataset mapping review segments to the rebuttal segments that address them, via supervised fine-tuning followed by preference optimization on rebuttal-derived pairs. Experiments with human experts and LLM-as-a-judge evaluation show consistent gains in actionability and specificity over strong baselines. RbtAct's key innovation is that rebuttals reveal which reviewer comments led to concrete revisions, and this signal is used to optimize the feedback generator directly for actionability, yielding more concrete and implementable guidance for authors. The article also notes that the generalizability and robustness of the approach warrant further study.
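The "preference optimization" stage of the pipeline above is commonly instantiated with a DPO-style objective; the summary does not name the exact algorithm, so the following is an illustrative sketch under that assumption, taking sequence log-probabilities under the policy and a frozen reference model:

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO-style loss for one preference pair (an assumed stand-in for the
    paper's unspecified preference-optimization objective).

    The loss is -log sigmoid(beta * margin), where the margin compares how
    much the policy upweights the chosen vs. rejected completion relative
    to the reference model.
    """
    margin = beta * ((logp_chosen - ref_chosen) - (logp_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At zero margin the loss is log 2; widening the policy's relative preference for the chosen (e.g. more actionable) segment drives the loss down, which is the mechanism that pushes the generator toward comments authors actually act on.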
Key Points
- RbtAct leverages rebuttals as implicit supervision for actionable review feedback generation
- The model is trained on RMR-75K, a large dataset mapping review segments to the rebuttal segments that address them
- Experiments demonstrate consistent gains in actionability and specificity over strong baselines
Merits
Strength in leveraging rebuttals
The use of rebuttals as implicit supervision provides a novel and effective way to inform the feedback generator, enabling more concrete and implementable guidance for authors.
Scalability and generalizability
The RbtAct approach has the potential to be scaled up to larger datasets and more diverse domains, making it a promising solution for the peer review process.
Demerits
Limitation in dataset size and diversity
The RMR-75K dataset, while large, may not be representative of all domains and research areas, which could limit the generalizability of the RbtAct approach.
Robustness to adversarial attacks
The article does not address the potential vulnerability of the RbtAct approach to adversarial attacks, which could compromise the integrity of the peer review process.
Expert Commentary
RbtAct represents a meaningful advance in AI-generated feedback for peer review: by treating rebuttals as implicit supervision, the model learns to produce guidance that authors can actually act on. That said, the approach's limitations and vulnerabilities are not yet fully characterized, and its scalability and generalizability across venues and disciplines will need careful evaluation in future studies. Overall, RbtAct could improve the efficiency and effectiveness of the peer review process, and it is a useful contribution to the ongoing discussion about the role of AI in academic publishing.
Recommendations
- Further research is needed to evaluate the generalizability and robustness of the RbtAct approach
- The use of rebuttals as implicit supervision should be explored in other domains, such as text classification and sentiment analysis