Julf on Nostr:
"The breakdown is dramatic, as models also express strong overconfidence in their wrong solutions, while providing often non-sensical "reasoning"-like explanations akin to confabulations to justify and backup the validity of their clearly failed responses, making them sound plausible."
https://arxiv.org/abs/2406.02061
Published at 2024-06-06 14:58:26
Event JSON
{
  "id": "986011d2036efefff70f0668b498e5d6b58473f930d7400fac430dc64be70ba7",
  "pubkey": "225815fb8758c21d09c4fe82c8845e41891ddfedaa1d449dfd7a90aa2b1f4488",
  "created_at": 1717685906,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://mastodon.social/users/Julf/statuses/112570263551748766",
      "activitypub"
    ]
  ],
  "content": "\"The breakdown is dramatic, as models also express strong overconfidence in their wrong solutions, while providing often non-sensical \"reasoning\"-like explanations akin to confabulations to justify and backup the validity of their clearly failed responses, making them sound plausible.\"\n\nhttps://arxiv.org/abs/2406.02061",
  "sig": "9fc2fc2dee95e3b957f46727a88b41655a929d17347f0e62f66d30c84ce30506e36129937e12500dd9c9fd3a1694a45d29736e102225ee7fbed66c5767cf1164"
}
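
The fields in the Event JSON above can be cross-checked with a few lines of Python. This is a minimal sketch, not a full Nostr client. It assumes the "Published at" time is the created_at value rendered in UTC, and that the id follows the NIP-01 convention of hashing the compact JSON array [0, pubkey, created_at, kind, tags, content] with SHA-256. The helper names published_at and compute_event_id are made up for illustration, and Python's json.dumps is only assumed to reproduce the NIP-01 escaping for ASCII content like this; signature verification (the sig field) is left out.

```python
import hashlib
import json
from datetime import datetime, timezone

# Fields copied from the Event JSON above ("sig" is not needed for these checks).
event = {
    "id": "986011d2036efefff70f0668b498e5d6b58473f930d7400fac430dc64be70ba7",
    "pubkey": "225815fb8758c21d09c4fe82c8845e41891ddfedaa1d449dfd7a90aa2b1f4488",
    "created_at": 1717685906,
    "kind": 1,
    "tags": [
        [
            "proxy",
            "https://mastodon.social/users/Julf/statuses/112570263551748766",
            "activitypub",
        ]
    ],
    "content": (
        "\"The breakdown is dramatic, as models also express strong "
        "overconfidence in their wrong solutions, while providing often "
        "non-sensical \"reasoning\"-like explanations akin to confabulations "
        "to justify and backup the validity of their clearly failed "
        "responses, making them sound plausible.\""
        "\n\nhttps://arxiv.org/abs/2406.02061"
    ),
}


def published_at(ev):
    """Render created_at (Unix seconds) as a UTC date-time string."""
    moment = datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def compute_event_id(ev):
    """Hash the compact serialization described in NIP-01 (assumed here)."""
    payload = [0, ev["pubkey"], ev["created_at"], ev["kind"], ev["tags"], ev["content"]]
    # No whitespace between tokens; keep non-ASCII characters unescaped.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


print(published_at(event))  # 2024-06-06 14:58:26
print("id matches recomputed hash:", compute_event_id(event) == event["id"])
```

If the second line prints False, the likeliest cause is a difference in string escaping between this serialization and the one used by the client that signed the event, rather than a tampered event.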