Dave Rahardja /
npub13js…5d7q
2024-11-29 19:58:02
in reply to nevent1q…06h6


nprofile1qy2hwumn8ghj7un9d3shjtnddaehgu3wwp6kyqpql3dx6x0q9ydnycm4k3hnv8dq5tjz8czjmvdyzryax8487759z0wqxxkfeg So from reading the Methodology, here’s what I gathered about how they performed this test:

1. Pick 200 random abstracts (not the whole paper) from the Journal of Neuroscience.
2. Have ChatGPT modify a random subset of said abstracts to create contradictory conclusions.
3. Present the abstracts to humans and LLMs, and ask them to determine whether each abstract was altered and how confident they are in their answer.
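For concreteness, the protocol above could be sketched roughly like this. Everything here is a stand-in: the abstracts, the alteration step (ChatGPT in the actual study), and the judge are all mocked up for illustration, not taken from the paper.

```python
import random

def build_test_set(abstracts, alter_fraction=0.5, seed=0):
    """Step 2 (sketch): randomly alter a subset of abstracts.

    In the study, ChatGPT rewrote abstracts to reverse their conclusions;
    here a marker string stands in for that edit.
    Returns (text, was_altered) pairs.
    """
    rng = random.Random(seed)
    items = []
    for text in abstracts:
        if rng.random() < alter_fraction:
            items.append((text + " [conclusion reversed]", True))
        else:
            items.append((text, False))
    return items

def evaluate(judge, items):
    """Step 3 (sketch): present each abstract to a judge (human or LLM)
    and score the fraction of altered/unaltered calls it gets right."""
    correct = sum(1 for text, altered in items if judge(text) == altered)
    return correct / len(items)

# Step 1 (sketch): 200 placeholder abstracts instead of real ones
# from the Journal of Neuroscience.
abstracts = [f"Abstract {i}: neurons do X." for i in range(200)]
items = build_test_set(abstracts)

# Toy judge that just looks for the marker; a real judge obviously
# would not see it, so this one is trivially perfect.
toy_judge = lambda text: "[conclusion reversed]" in text
print(evaluate(toy_judge, items))  # 1.0
```

The point of the sketch is that the task being scored is binary altered/unaltered classification, which is what makes the headline framing feel overstated.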

Given this methodology, I think “LLMs surpassing human experts in predicting neuroscience results” is…a grandiose claim. More precisely, the claim should be: “LLMs are better than human experts at detecting altered neuroscience abstracts.”
Author Public Key
npub13jszgr40d0pnyum0t845scy8uggn676enygvaf4ajzm2y9rqzd8sy75d7q