    SylviaS
ICML 2025 Review – Most Outstanding Issues

Sources are labeled where applicable.

1. 🧾 Incomplete / Low-Quality Reviews
  • Several submissions received no reviews at all (Zhihu).
  • Single-review papers despite the multi-review policy.
  • Some reviewers appeared to skim or misunderstand the paper.
  • Accusations that reviews were LLM-generated: generic, hallucinated, overly verbose (Reddit).

2. Unjustified Low Scores
  • Reviews lacked substantive critique but gave scores of 1 or 2 without explanation.
  • Cases where positive commentary was followed by a low score (e.g., "Good paper" + score 2).
  • Reviewers pushing personal biases (e.g., "you didn't cite my 5 papers").

3. 🧠 Domain Mismatch
  • Theoretical reviewers were assigned empirical papers and vice versa (Zhihu).
  • Reviewers struggled with areas outside their expertise, leading to incorrect comments.

4. Rebuttal System Frustrations
  • The 5000-character rebuttal limit per reviewer was too short to address all concerns.
  • Markdown formatting restrictions (e.g., no multiple boxes, limited links).
  • Reviewers acknowledged the rebuttal but did not adjust their scores.
  • Authors felt the rebuttal phase was performative rather than impactful.

5. 🪵 Bureaucratic Review Process
  • Reviewers were forced to fill out many structured fields: "claims & evidence", "broader impact", etc.
  • Complaint: "Too much form-filling, not enough science" (Zhihu).

6. Noisy and Arbitrary Scoring
  • Extreme score variance within a single paper (e.g., 1/3/5).
  • Scores didn't align with review contents or comparable submissions.
  • Unclear thresholds and a lack of transparency in AC decision-making.

7. Suspected LLM Reviews (Reddit-specific)
  • Reviewers were suspected of using LLMs to generate long, vague reviews.
  • Multiple users ran reviews through tools like GPTZero / DeepSeek and got LLM flags.

8. Burnout and Overload
  • Reviewers were overloaded with 5 papers, many outside their comfort zone.
  • No option to reduce load, leading to surface-level reviews.
  • Authors and reviewers alike expressed mental exhaustion.

9. Review Mismatch with Paper Goals
  • Reviewers asked for experiments outside the paper's scope or compute budget (e.g., running LLM baselines).
  • Demands for comparisons against outdated or irrelevant benchmarks.

10. Lack of Accountability / Transparency
  • Authors wished for reviewer identity disclosure post-discussion to encourage accountability.
  • Inconsistent handling of rebuttal responses across different ACs and tracks.