Baldur Bjarnason on Nostr: The only studies you should take absolutely seriously are larger studies done with a ...
The only studies you should take absolutely seriously are larger studies done with a meaningfully large sample size, where the study’s design takes care to account for randomness, construct validity (where applicable), and context
I.e. not studies that claim 200% productivity improvements without noting that the tasks are entirely synthetic (they are, effectively, just testing that text extruders extrude text)
The problem is that LLMs generally don’t function well in larger impartial studies
Published at
2024-04-09 10:32:13
Event JSON
{
"id": "af703847959e168e4ee5acb9b505a91d34b7c31bc223b5e8f37655f8fb6a788b",
"pubkey": "b8fcf3fa16a90df5527b31715505e245a962248ce8e86cdbf914151bcb9998fc",
"created_at": 1712658733,
"kind": 1,
"tags": [
[
"p",
"b8fcf3fa16a90df5527b31715505e245a962248ce8e86cdbf914151bcb9998fc"
],
[
"e",
"e4f2053cd891fc468b614c0a78407636e36a73f55911e77ba4c75bedb8624b46",
"",
"root"
],
[
"proxy",
"https://toot.cafe/users/baldur/statuses/112240802744319858",
"activitypub"
],
[
"L",
"pink.momostr"
],
[
"l",
"pink.momostr.activitypub:https://toot.cafe/users/baldur/statuses/112240802744319858",
"pink.momostr"
]
],
"content": "The only studies you should take absolutely seriously are larger studies done with a meaningfully large sample size, where the study’s design takes care to account for randomness, construct validity (where applicable), and context\n\nI.e. not studies that claim 200% productivity improvements without noting that the tasks are entirely synthetic (they are, effectively, just testing that text extruders extrude text)\n\nThe problem is that LLMs generally don’t function well in larger impartial studies",
"sig": "d9a14401bc4282df659a9657b388cf0b31122c4a3e86b576c6d5c6e3b082ea85d00d475f81e4b6a6ce5467c08ab99430344e84c83a7043b22117a68838d5fa33"
}