AI-authored papers accepted at an ICLR 2025 workshop
-
In a recent development, a research paper titled "Compositional Subspace Representation Fine-tuning for Adaptive Large Language Models" was entirely authored by AI and accepted at an ICLR 2025 workshop. Not only was the paper crafted within just one week by an autonomous AI scientist system, but it also garnered notably positive reviews from human reviewers (scores of 7/6/7).
The study presents CS-ReFT, a novel fine-tuning method that addresses the notorious problem of "cross-skill interference", where improving one capability of a language model inadvertently degrades another. Instead of modifying model weights, CS-ReFT learns task-specific transformations in the hidden-state space. The paper claims that, while updating fewer than 0.01% of the model's parameters, the method enabled a 7B Llama-2 model to significantly outperform GPT-3.5 Turbo (93.94% vs. 86.30%) on the AlpacaEval benchmark.
"A clever idea," noted the reviewers, emphasizing the method's efficacy and elegant simplicity.
However, this AI-driven "success" raises important questions for the peer review process and academia:
- Accountability and verification: With AI authoring complete studies autonomously in under a week, how should we ensure rigorous verification and accountability in research?
- Human role in research: Does the presence of AI as the primary "author" diminish the role and value of human creativity and critical insight in research?
- Peer review challenges: As AI systems rapidly generate compelling and high-quality content, how will peer reviewers adapt to differentiate between innovative research and sophisticated algorithmic outputs?
- Ethical boundaries: As AI increasingly participates in research, how do we delineate between productive assistance and ethical misuse, such as plagiarism or misrepresentation?
These concerns underline a crucial discussion: how do we maintain trust and integrity in scholarly work while benefiting from AI's remarkable efficiencies?
The papers were produced by Zochi, an AI scientist system built by Ron and Andy: https://www.intology.ai/blog/zochi-tech-report
So, how should AI be integrated responsibly into the future of scientific research?
Share your thoughts below!
-
In addition to the CS-ReFT paper, Zochi has a second paper accepted at an ICLR 2025 workshop:
Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search
This paper presents an automated framework designed to detect and exploit security vulnerabilities in large language models (LLMs) through a sophisticated multi-turn approach based on tree search algorithms. Remarkably, the paper reports achieving a 100% jailbreak success rate on GPT-3.5-Turbo and a 97% success rate on GPT-4, raising serious questions about the robustness of existing safeguards implemented by leading AI models.
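Siege's actual framework is not reproduced in the post. The following is a minimal, hypothetical sketch of what a multi-turn tree search over conversations could look like, with stub functions (`target_respond`, `score`, `mutate`) standing in for the real target model, a partial-compliance scorer, and an attacker model; none of these names or behaviors come from the paper:

```python
# Toy stand-ins: a real system would query the target LLM and an attacker model.
def target_respond(conversation):
    # Hypothetical stub: the target softens after more than one turn.
    return "refusal" if len(conversation) < 2 else "partial compliance"

def score(response):
    # Hypothetical partial-compliance scorer in [0, 1].
    return {"refusal": 0.0, "partial compliance": 0.5, "full compliance": 1.0}.get(response, 0.0)

def mutate(prompt, k):
    # Hypothetical attacker proposing k follow-up prompts for the next turn.
    return [f"{prompt} (variant {i})" for i in range(k)]

def tree_search(seed_prompt, branching=3, depth=3, threshold=1.0):
    """Expand conversations turn by turn, keeping the most promising branches."""
    frontier = [([seed_prompt], 0.0)]
    for _ in range(depth):
        candidates = []
        for conv, _ in frontier:
            for follow_up in mutate(conv[-1], branching):
                new_conv = conv + [follow_up]
                s = score(target_respond(new_conv))
                if s >= threshold:
                    return new_conv  # a conversation path that reached full compliance
                candidates.append((new_conv, s))
        # Beam pruning: keep only the highest-scoring partial conversations.
        frontier = sorted(candidates, key=lambda c: c[1], reverse=True)[:branching]
    return None  # no jailbreak found within the search budget
```

The key design point such methods rely on is tracking partial compliance across turns: branches that elicit even slightly unsafe responses are kept and expanded, so pressure accumulates over the conversation rather than resetting at each prompt.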
Reviewers described Siege as an "effective and intuitive method", noting that the community needs to re-evaluate current AI defense strategies. The research underscores a pressing concern: if AI-driven methods can autonomously discover and exploit critical security flaws in widely used LLMs, how should the research community respond to such vulnerabilities?
Both the CS-ReFT and Siege papers highlight not just the capabilities of AI-driven research but also the ethical and practical dilemmas emerging from automated scientific exploration and discovery.
-
Zochi is NOT the first AI-driven scientific research platform.
Last year, Llion Jones, one of the original authors of the Transformer architecture, co-founded Sakana AI and launched an automated research platform plainly named "AI Scientist", which has already evolved into its second generation.
Interestingly, a paper produced by AI Scientist-v2 also passed peer review at this year's ICLR "I Can't Believe It's Not Better" (ICBINB) workshop, receiving scores of 6/7/6. However, workshop acceptance criteria typically differ from those of the main ICLR conference, with workshop acceptance rates roughly two to three times higher.
Despite their acceptance, controversy around AI-driven research persists, and even successful AI-generated papers risk being withdrawn before formal publication amid ongoing academic debate.
Furthermore, according to internal assessments by Sakana using main-conference-level standards, the AI Scientist-v2 paper failed to meet acceptance criteria. This aligns with Intology's own NeurIPS-based automated evaluation, which gave AI Scientist-v2 an average score below 4, actually worse than its predecessor.
Zochi's performance clearly outshines that of AI Scientist-v2, yet whether its research would succeed at the main-conference level remains to be seen. Given the ongoing controversy surrounding AI-driven research in academia, even accepted papers might be withdrawn by their teams before formal publication.
Intology has explicitly stated that, "in the interest of preserving academic integrity", it agrees AI should not be listed as an author on scholarly works. The company is currently in discussions with workshop organizers to determine whether and how these AI-generated findings should be presented publicly.