Credit: Pixabay/CC0 Public Domain

Large language models (LLMs) such as ChatGPT have grown so advanced that they can even pass the US Medical Licensing Exam. But how good are peer reviewers at detecting AI-written work, and how does the use of AI affect their perceptions of that work?

A team led by Lee Schwamm, MD, associate dean for digital strategy and transformation at Yale School of Medicine, attempted to answer these questions by hosting an essay contest for the journal Stroke that included both AI and human submissions. The researchers found that reviewers struggled to accurately distinguish human from AI essays when authorship was blinded. However, when reviewers attributed an essay to AI, they were significantly less likely to rate it as the best on a given topic.

Schwamm hopes the findings highlight the need for developing policies on the appropriate use of AI in scientific manuscripts. His team published its findings in Stroke on September 3.

“This study is a wakeup call to editorial boards, and educators as well, that we can’t sit around waiting for someone else to figure this out,” Schwamm says. “We need to start thinking about what the right guardrails are within these spheres for where we should encourage the use, where should we be neutral, and where we should ban it.”

Reviewers struggle with AI detection

Schwamm’s team invited readers of Stroke to submit persuasive essays on one of three controversial topics in the stroke field (for example, do statins increase the risk of hemorrhagic stroke?). Essays were to be up to 1,000 words and contain no more than six references. In total, the researchers received 22 human submissions.

Then, the researchers used four different LLMs (ChatGPT 3.5, ChatGPT 4, Bard, and LLaMA-2) to each write one essay per topic. While they didn’t edit the AI essays themselves, they reviewed and made corrections to literature citations.
“References are one of those places where AI is known to make a lot of errors,” Schwamm explains. “And we didn’t want that to give the AI away; we wanted the reviewers to really focus on the quality of the […]
Original article: “Is it AI? Peer reviewers struggle to distinguish LLMs from human writing”