A Retrospective on the ICLR 2026 Review Process
March 31, 2026 · ICLR 2026 Program Chairs

The selection of papers for ICLR 2026 has fully concluded. We extend our congratulations to the authors whose work will appear at the conference. Creating ICLR’s technical program requires immense effort from authors, reviewers, and area chairs, and we thank you for your contributions and service. For researchers whose work was rejected, we hope that the review process will help you improve your submission for future publication.

ICLR 2026 received 19,525 valid, format-compliant submissions. Of these, 779 were desk rejected for procedural or content violations and 5,042 were withdrawn, so that ultimately 13,763 submissions received an accept or reject decision based on 76,139 reviews provided by 18,054 reviewers. These numbers reflect a significant increase in submissions over past years. Of those decisions, 5,355 papers were accepted and 8,408 were rejected, an acceptance rate of 27.4%.

The ICLR 2026 program also faced unique challenges due to the rise of LLMs in submissions and reviews, and due to a security incident. Below, we summarize these challenges and our responses.

Large Language Models

The capabilities of AI systems based on large language models (LLMs) have grown over time, and the usage of LLMs in the publication and peer review process has grown in tandem. We proactively designed a set of policies on the usage of LLMs based on ICLR’s Code of Ethics. In short, these policies stipulated that 1) any usage of LLMs needed to be disclosed and 2) individuals were ultimately responsible for their contributions. Violations of these policies were treated as Code of Ethics violations. While we considered any violation brought to our attention, our enforcement of these policies was largely achieved by flagging reviews that were likely generated by an LLM and by finding references in submissions that referred to nonexistent publications.

Dealing with LLM-Generated Reviews

There are many ways a reviewer might have used an LLM in their review, such as refining their writing, summarizing a submission, or even writing a review wholesale. While systems exist for detecting LLM-generated content, their accuracy is far from perfect, and we did not want to penalize cases where an LLM was used to assist with writing but the judgements made in the review were reasonable and valid. On the other hand, we needed to proactively address the possibility that many reviewers were using LLMs to offload their reviewing responsibilities, ultimately producing sub-par reviews. The large scale of reviews (over 75,000) made it impossible to rely on human judgement alone to consistently flag and discover problematic reviews. We consequently ran two LLM content detectors on all submitted reviews and emailed area chairs to notify them of which reviews on their submissions were flagged by both detectors as being entirely LLM-generated. Area chairs were then asked to consider this analysis as one aspect of review quality when flagging low-quality reviews; a sketch of this consensus rule appears below. Ultimately, this approach is in line with standard practice for flagging possibly low-quality reviews (e.g., those that are excessively short) to area chairs. As in past years, reviewers who were repeatedly flagged as writing low-quality reviews will be removed from the reviewing pool going forward.
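The post does not describe how the two detectors were combined, so the following is a minimal Python sketch of the consensus-flagging rule, assuming a hypothetical `Detector` callable that scores how likely a text is to be entirely LLM-generated. The detector interface, the 0.95 threshold, and the `Review` data model are illustrative assumptions, not the systems ICLR actually used.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical interface: a detector maps review text to the probability
# that the text is entirely LLM-generated. Real detectors differ; this
# signature is an assumption for illustration only.
Detector = Callable[[str], float]

@dataclass
class Review:
    review_id: str
    submission_id: str
    text: str

def flag_consensus(reviews: list[Review],
                   detector_a: Detector,
                   detector_b: Detector,
                   threshold: float = 0.95) -> dict[str, list[str]]:
    """Return {submission_id: [review_ids]} for reviews that BOTH
    detectors score above the threshold. Requiring agreement from two
    independent detectors trades recall for a lower false-positive
    rate, since a flag reaches area chairs only as a quality signal."""
    flagged: dict[str, list[str]] = {}
    for review in reviews:
        if (detector_a(review.text) >= threshold
                and detector_b(review.text) >= threshold):
            flagged.setdefault(review.submission_id, []).append(review.review_id)
    return flagged
```

Under this design, area chairs would receive the flags alongside existing heuristics (such as excessively short reviews) and make the final quality judgement themselves.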
Taking Actions on Hallucinated References

As with reviews, there are many ways an LLM could be used in the preparation of an ICLR submission, some of which might improve its quality and others that might be harmful. Similar to our approach to reviews, we ran automated LLM content detection on all submissions and flagged submissions with a high proportion of LLM-generated content to area chairs. Additionally, we found one widespread issue stemming from LLM use that was relatively straightforward to detect: hallucinated references. Specifically, many submissions included one or more references to documents that didn’t exist and/or contained egregiously incorrect bibliographic information. To automatically detect these cases, we used a system that extracted references from a given submission and checked them against multiple standard bibliographic databases and a standard web search; the first code sketch below illustrates the lookup step. This system had a significant false positive rate (for example, it would flag a reference to a paper with a non-English title that a submission’s authors had translated to English), so we relied on area chairs to perform a first round of human review of flagged references. Then, we (the program chairs) manually checked all flagged references ourselves. Ultimately, each paper with flagged references was reviewed by at least three humans. Finally, all papers with confirmed hallucinated references were desk rejected, with an appeal channel available for any erroneously flagged papers. This process partially explains the relatively high desk rejection rate at this year’s ICLR compared to past years.

OpenReview Security Incident

In the middle of the discussion period, we experienced an unprecedented disruption to the review process: a malicious user exploited the OpenReview API to scrape, and ultimately release, the identities of the authors, reviewers, and area chairs for a large subset of ICLR 2026 submissions. Immediately afterward, we started receiving reports of collusion attempts, threats, and harassment, all leading to reviewers feeling pressured to change their scores. A summary of our response to this incident can be found here.

Our primary goal was to preserve the integrity of the review process. Because the underlying vulnerability was introduced at the beginning of the rebuttal phase, three weeks prior to the scrape, we reset all review scores to the pre-rebuttal state, froze the discussion, and reassigned area chairs to each submission; the second code sketch below illustrates this reset. The new area chairs were then tasked with inferring the expected outcome had discussion proceeded as normal, exercising their judgement in interpreting the discussion that had already happened. To give area chairs time to complete this increased workload, we substantially extended the meta-review period. After decisions came in, we found that the overall trend of acceptance based on reviewer opinions was similar to prior years, suggesting that this approach was effective. We acknowledge the additional stress this caused; we did not take the decision lightly, but chose this course of action after careful deliberation among many imperfect options.

Apart from modifying the meta-review process, we investigated all reported cases of collusion, threats, and harassment. While many cases originated from individuals outside the ICLR community, we ultimately banned any offending members of the ICLR community and desk rejected their submissions. The malicious user who exploited the OpenReview API was also banned by OpenReview.
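For the hallucinated-reference check, the post names the pipeline (extract references, query bibliographic databases and a web search, route flags to humans) but not the implementation. Below is a minimal sketch of the database-lookup step against one real public database, the Crossref works API; the normalization heuristic and the exact-title-match criterion are assumptions, and a production system would consult several databases and tolerate minor bibliographic variation.

```python
import re
import requests

def normalize(title: str) -> str:
    """Lowercase, strip punctuation, and collapse whitespace so that
    formatting differences do not cause spurious mismatches."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", title.lower())
    return " ".join(cleaned.split())

def found_in_crossref(title: str, rows: int = 5) -> bool:
    """Query Crossref's public works endpoint for a reference title and
    report whether any of the top hits match after normalization."""
    resp = requests.get(
        "https://api.crossref.org/works",
        params={"query.bibliographic": title, "rows": rows},
        timeout=10,
    )
    resp.raise_for_status()
    wanted = normalize(title)
    for item in resp.json()["message"]["items"]:
        for candidate in item.get("title", []):
            if normalize(candidate) == wanted:
                return True
    return False

def suspicious_titles(titles: list[str]) -> list[str]:
    """Titles with no match go to human review rather than straight to
    a decision, because of false positives such as author-translated
    non-English titles."""
    return [t for t in titles if not found_in_crossref(t)]
```

A real checker would also verify authors, venue, and year rather than titles alone, but the routing of every unmatched reference to human reviewers is the part the post emphasizes.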
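Resetting all scores to the pre-rebuttal state amounts to replaying each review's revision history and keeping the latest revision made before the rebuttal phase opened. Here is a minimal sketch under an assumed revision data model; this is not OpenReview's actual schema, and the rebuttal start date is a placeholder.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# Placeholder: the real cutoff is whatever timestamp marks the start of
# the compromised rebuttal window.
REBUTTAL_START = datetime(2026, 1, 15, tzinfo=timezone.utc)

@dataclass
class ReviewRevision:
    review_id: str
    score: int
    revised_at: datetime  # assumed timezone-aware

def pre_rebuttal_scores(revisions: list[ReviewRevision]) -> dict[str, int]:
    """For each review, keep the score from its latest revision made
    strictly before the rebuttal phase began, discarding any later
    changes that could have been influenced by the leak."""
    latest: dict[str, ReviewRevision] = {}
    for rev in revisions:
        if rev.revised_at >= REBUTTAL_START:
            continue  # drop revisions from the compromised window
        kept = latest.get(rev.review_id)
        if kept is None or rev.revised_at > kept.revised_at:
            latest[rev.review_id] = rev
    return {rid: rev.score for rid, rev in latest.items()}
```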
Reflections

Despite the large increase in scale and the unexpected twists and turns, we were ultimately able to produce a program that we consider to reflect ICLR’s established standards. Our efforts were supported by the large community of program committee members (especially area chairs!) who undertook an unusually large workload this year. We hope that our remediation efforts can help inform policies and approaches at future conferences.
Executive Summary
This article presents a retrospective analysis of the International Conference on Learning Representations (ICLR) 2026 review process. The conference received 19,525 valid submissions, of which 13,763 went to a final decision: 5,355 were accepted and 8,408 rejected, an acceptance rate of 27.4%. The review process was challenged by the rise of Large Language Models (LLMs) and by a security incident. The authors outline their policies and responses to these challenges, including the use of LLM detectors to identify potentially generated reviews. This analysis provides valuable insight into the review process of a leading AI conference and highlights the need for effective policies and tools to address the increasing use of LLMs in research and peer review.
Key Points
- ICLR 2026 received 19,525 valid submissions; of the 13,763 that went to a decision, 5,355 were accepted and 8,408 rejected.
- The review process was challenged by the rise of LLMs and a security incident.
- The authors implemented policies on the use of LLMs, centered on disclosure and individual responsibility.
- Two LLM detectors were run on all reviews to flag those likely to be entirely LLM-generated.
Merits
Comprehensive Analysis
The article provides a detailed and comprehensive analysis of the review process, including the challenges faced and the responses implemented.
Innovative Solutions
The authors' use of LLM detectors to identify potentially generated reviews is an innovative solution to the challenges posed by LLMs in the review process.
Demerits
Limited Transparency
The article does not provide sufficient information on the accuracy of the LLM detectors used to identify potentially generated reviews.
Lack of Context
The article does not provide sufficient context on the security incident that affected the review process.
Expert Commentary
The ICLR 2026 review process provides a fascinating case study on the challenges and opportunities of peer review in AI research. The emergence of LLMs has introduced new complexities to the review process, including the need for innovative solutions to address the potential for LLM-generated reviews. The authors' use of LLM detectors is a valuable response to this challenge, but its effectiveness and accuracy need to be carefully evaluated. As the use of LLMs continues to grow, conferences and journals need to develop and implement effective policies on their use, including disclosure and individual responsibility. This requires a nuanced understanding of the benefits and risks of LLMs in research and peer review.
Recommendations
- Conferences and journals should develop and implement effective policies on the use of LLMs, including disclosure and individual responsibility.
- The accuracy of LLM detectors should be validated through rigorous testing before their output is used to inform decisions.
- Further research is needed to refine the use of LLM detectors in the review process, including their integration with existing review systems.
Sources
Original: ICLR