
🧨 ICML 2025 Position Paper: Why AI Peer Review Is Crumbling and What We Can Do About It

Tags: icml2025, llm, peer-review, review, review process, review automation, review policy
root (#1)

    ICML 2025 just dropped a position-track oral (https://arxiv.org/pdf/2505.04966) that’s stirring up some serious academic tea 🍵 — and it’s about you, me, and the broken system we all pretend isn’t on fire.

    Welcome to the wild world of AI conference peer review, where submitting a paper feels like a lottery, and writing a review? A thankless, invisible labor you do between Zoom calls and existential dread.

    But wait, it gets juicier.

    This paper, by Kim et al. from UNIST, calls out the entire system — authors, reviewers, and even the conference machinery — and proposes something delightfully controversial: let authors rate reviewers 😱.

    Let’s break down the drama, the data, and the daring proposal that might just save us all from reviewpocalypse.


    😩 The Peer Review System Is Broken — And Everyone’s to Blame

    We’re living through a submission explosion. Conferences like NeurIPS and ICML now get over 10,000 papers a year — ICLR submissions alone grew by 59.8% in 2025 😵.


    What’s Going Wrong?

    1. Authors are review-shopping and gaming the system with LLMs 🧠+⌨️.
    2. Reviewers are under-incentivized, and sometimes just phoning it in (literally using ChatGPT).
    3. The System? It’s built on “academic volunteerism” and sweat equity from grad students who just want pizza and a badge.

    🚨 Reviewer Negligence: The Power Trip No One Talks About

    Kim et al. highlight the power imbalance: reviewers can torpedo a paper with zero accountability. And now, with LLMs helping reviewers write glowing but shallow “commendable” reviews, we’re entering uncanny valley territory where AI critiques AI — badly.

    “Even a small number of negligent reviewers can damage the entire system’s credibility.” — Kim et al., 2025

    Also, reviewers are burning out. There are no rewards, no visibility, and no career benefit... unless you love LinkedIn karma.


    💡 The Solution: Let Authors Review the Reviewers 😈

    Enter: the Bi-directional Review System

    A clever two-stage process:

    1. Stage 1: Reviewers submit summaries, strengths, and questions; authors then rate these reviews for clarity, comprehension, and signs of LLM use 🤖.
    2. Stage 2: Reviewers drop the weaknesses and final scores; meta-reviewers then get the author feedback plus reviewer track records to make better-informed final decisions.

    “Think Yelp, but for reviews of your reviews.”

    This design dodges retaliation bias by collecting author feedback before authors ever see their scores. It’s simple, non-invasive, and actually implementable. 💡 A rough sketch of the data flow follows.
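
    To make the mechanics concrete, here’s a minimal sketch of that flow in Python. Every name and field here (Stage1Review, AuthorFeedback, the 1-5 scales) is a hypothetical illustration of the process described above, not the paper’s implementation:

    ```python
    # Hypothetical model of the bi-directional review flow.
    # Key property: AuthorFeedback is collected after Stage 1 but BEFORE
    # any weaknesses or scores are visible, which blocks retaliation bias.
    from dataclasses import dataclass, field

    @dataclass
    class Stage1Review:
        reviewer_id: str
        summary: str
        strengths: str
        questions: str

    @dataclass
    class AuthorFeedback:
        reviewer_id: str
        clarity: int             # e.g. 1-5: is the review understandable?
        comprehension: int       # e.g. 1-5: did the reviewer get the paper?
        suspected_llm_use: bool

    @dataclass
    class Stage2Review:
        reviewer_id: str
        weaknesses: str
        score: float

    @dataclass
    class MetaReviewPacket:
        """Everything the meta-reviewer sees for the final decision."""
        stage1: list[Stage1Review] = field(default_factory=list)
        author_feedback: list[AuthorFeedback] = field(default_factory=list)
        stage2: list[Stage2Review] = field(default_factory=list)
        reviewer_track_record: dict[str, float] = field(default_factory=dict)
    ```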


    🏅 Reviewer Rewards That Actually Mean Something

    Digital badges, people! 🔖
    Not the Boy Scouts kind — we’re talking OpenReview-visible, profile-flex, top-10% badges you can proudly slap onto your Google Scholar page.


    Also:

    • Reviewer impact scores, like an h-index for helpfulness (one toy reading is sketched after this list).
    • Transparent activity logs.
    • Maybe even… real-world perks? (ICLR gives top reviewers free registration 🤑, though it remains unclear whether this was actually implemented at ICLR 2025.)
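
    The paper floats the h-index analogy without pinning down a formula, so here’s one toy interpretation of a “reviewer impact score” (the ratings, the scale, and the function name are all made up for illustration):

    ```python
    # Toy h-index-style impact score: a reviewer has impact h if at least
    # h of their reviews received an author helpfulness rating of >= h.
    # Hypothetical metric; the paper does not specify this formula.
    def reviewer_impact(helpfulness_ratings: list[int]) -> int:
        h = 0
        for i, rating in enumerate(sorted(helpfulness_ratings, reverse=True), start=1):
            if rating >= i:
                h = i
            else:
                break
        return h

    print(reviewer_impact([9, 7, 5, 3, 1]))  # -> 3: three reviews rated >= 3
    ```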

    🔬 LLMs in Peer Review: Blessing, Curse, or Both?

    Authors are using LLMs to write better papers. Reviewers are using LLMs to write meh reviews. According to the paper’s typo analysis of ICLR reviews (p. 15; a toy version of this kind of analysis is sketched after the list):

    • Spelling errors dropped by 28.8% from 2017 to 2024.
    • LLM-generated reviews are sneakily everywhere.
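
    The authors’ actual pipeline isn’t published line by line, so treat this as a rough sketch of how one could measure spelling-error rates in review text per year, assuming the pyspellchecker package (pip install pyspellchecker) and made-up review data:

    ```python
    # Rough sketch: estimate spelling-error rates of review text per year.
    # Requires `pip install pyspellchecker`; reviews_by_year is fake data.
    import re
    from spellchecker import SpellChecker

    spell = SpellChecker()

    def error_rate(text: str) -> float:
        """Fraction of alphabetic tokens the dictionary doesn't recognize."""
        words = re.findall(r"[a-z']+", text.lower())
        return len(spell.unknown(words)) / len(words) if words else 0.0

    reviews_by_year = {
        2017: ["This papper is intresting but the experminets are weak."],
        2024: ["The method is sound, well motivated, and clearly evaluated."],
    }
    for year, reviews in sorted(reviews_by_year.items()):
        avg = sum(error_rate(r) for r in reviews) / len(reviews)
        print(f"{year}: avg spelling-error rate = {avg:.1%}")
    ```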


    But here’s the zinger:

    “What happens when LLMs give better feedback than humans?” — Kim et al., 2025

    This isn’t sci-fi — it's a wake-up call. And it’s coming faster than your next rebuttal deadline.


    🧠 The Bigger Vision: Culture Shift, Not Quick Fix

    Kim et al. aren’t asking to burn it all down. Their goal is incremental evolution:

    • Feedback tools that protect both sides.
    • Recognition that reviewing is real academic labor.
    • Guardrails against LLM misuse.

    They even recommend piloting their proposals in small-scale workshops first. Pragmatic, not polemic.


    🔄 But Not Everyone Agrees…

    Some folks argue:

    • “The system is fine, just overwhelmed.”
    • “Rating reviewers will make recruiting impossible.”
    • “We’ll get gamified, overly-positive, low-effort reviews.”

    Valid concerns. But doing nothing just accelerates the spiral. And reviewers are already gaming the system — just without any oversight.


    🗣️ Join the Conversation: It’s Your Review Too

    Kim et al.’s paper is more than a critique — it’s a call to action.

    If you’re an author who's been burned 🧯, a reviewer grinding in silence 🛠️, or a meta-reviewer buried in chaos 📬 — you’ve felt it too.

    Now’s the time to talk solutions. Because the only thing worse than a bad review… is pretending we don’t know why it happened.


    🧵 Sound Off

    • Should authors rate reviews?
    • Can badges actually motivate better peer reviews?
    • Would you trust a ChatGPT review over a rushed one from a sleep-deprived postdoc?

    Drop your thoughts. The system might be broken — but the community isn’t.


    Cited Work
    Kim, Jaeho, Yunseok Lee, and Seulki Lee. “The AI Conference Peer Review Crisis Demands Author Feedback and Reviewer Rewards.” ICML 2025 Position Track Oral. arXiv:2505.04966v1.


    👋 You are more than welcome to register (verified or anonymous) and join the discussion.

cqsyf (Super User, #2)

      If LLMs are already widely used (and often undetected) in writing papers, then resisting their role in peer review feels shortsighted. Instead of pretending otherwise, we should focus on integrating LLMs thoughtfully into the review process.

      It’s not just inevitable, it’s necessary.

Joanne (#3)

        Yeah. Can’t wait to see how the first AI-assisted peer review at AAAI 2026 performs.
