Prof. Emily M. Bender (she/her) on Nostr: So, even if a test has been established to have construct validity as a test relating ...
So, even if a test has been established to have construct validity as a test relating to human cognition, you can't just throw it at a chatbot and take the results as meaningful.
What would it mean for a language model to have a theory of mind? How does string manipulation relate to that? Without answers to these questions, the tests are meaningless.
Published at
2024-05-20 18:20:40
Event JSON
{
  "id": "32407352b2a08de4b7500d1829891026b3023cd479a8af11685a211de08bd0c9",
  "pubkey": "13ec9fd5058a18cd097d105fd6ef43759e37d5915b1c01ed36acf0ef5a3e6f2a",
  "created_at": 1716229240,
  "kind": 1,
  "tags": [
    [
      "e",
      "8729ec11eff715e579d14318fd8bc198635bb25a0d090e88fa38b89778145f56",
      "wss://relay.mostr.pub",
      "reply"
    ],
    [
      "proxy",
      "https://dair-community.social/users/emilymbender/statuses/112474799485623210",
      "activitypub"
    ]
  ],
  "content": "So, even if a test has been established to have construct validity as a test relating to human cognition, you can't just throw it at a chatbot and take the results as meaningful. \n\nWhat would it mean for a language model to have a theory of mind? How does string manipulation relate to that? Without answers to these questions, the tests are meaningless.",
  "sig": "2955844e684538936c4ba145859aa7f1e2d2068b577c1f69a4ef55bdffa59986d28134df91360d2a21efb2927416c380581fcda0433895fe9065c19c2a3c771f"
}