Landon on Nostr: nostr-bot --model gemini-2.0-flash-thinking-exp why do top LLMs avoid higher than PhD ...
nostr-bot (nprofile…l302) --model gemini-2.0-flash-thinking-exp why do top LLMs avoid higher than PhD level on benchmarks, yet fail to complete some of the simplest daily tasks as agents?
Published at 2025-02-08 15:27:10
Event JSON
{
  "id": "77413284a8b6288cc2d3b3b0c5a1f18eee1e3fdd9e3e515e60992e4f324bedc2",
  "pubkey": "da0cc82154bdf4ce8bf417eaa2d2fa99aa65c96c77867d6656fccdbf8e781b18",
  "created_at": 1739028430,
  "kind": 1,
  "tags": [
    [
      "p",
      "ab66431b1dfbaeb805a6bd24365c2046c7a2268de643bd0690a494ca042b705c",
      "",
      "mention"
    ],
    [
      "r",
      "gemini-2.0-flash-thinking-exp"
    ]
  ],
  "content": "nostr:nprofile1qqs2kejrrvwlht4cqknt6fpktssyd3azy6x7vsaaq6g2f9x2qs4hqhqpz4mhxue69uhhyetvv9ujuerpd46hxtnfduhswll302 --model gemini-2.0-flash-thinking-exp why do top LLMs avoid higher than PhD level on benchmarks, yet fail to complete some of the simplest daily tasks as agents?",
  "sig": "07815e77654c6f2950d616e24e7348530cb00a9280c8df835f7b3fffd704eced4e93393ce12f2a6146370d544fdfb578fdfa8e1c67d5745cda327db17fc4f90b"
}
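
For readers unfamiliar with the fields above, here is a minimal sketch of how the "id" and "created_at" values relate to the rest of the event. It assumes the NIP-01 rule that the id is the SHA-256 of the compact JSON array [0, pubkey, created_at, kind, tags, content]; the helper names compute_event_id and published_at are hypothetical, and Python's json.dumps is only an approximation of the NIP-01 escaping rules (adequate for plain ASCII content like this note). It does not verify the signature.

```python
import hashlib
import json
from datetime import datetime, timezone

# Fields copied from the event JSON above ("id" and "sig" are not inputs to the hash).
event = {
    "pubkey": "da0cc82154bdf4ce8bf417eaa2d2fa99aa65c96c77867d6656fccdbf8e781b18",
    "created_at": 1739028430,
    "kind": 1,
    "tags": [
        ["p", "ab66431b1dfbaeb805a6bd24365c2046c7a2268de643bd0690a494ca042b705c", "", "mention"],
        ["r", "gemini-2.0-flash-thinking-exp"],
    ],
    "content": (
        "nostr:nprofile1qqs2kejrrvwlht4cqknt6fpktssyd3azy6x7vsaaq6g2f9x2qs4hqhqpz4"
        "mhxue69uhhyetvv9ujuerpd46hxtnfduhswll302 --model gemini-2.0-flash-thinking-exp "
        "why do top LLMs avoid higher than PhD level on benchmarks, yet fail to complete "
        "some of the simplest daily tasks as agents?"
    ),
}


def compute_event_id(ev: dict) -> str:
    """Hypothetical helper: SHA-256 over the NIP-01 style serialization (assumed)."""
    payload = [0, ev["pubkey"], ev["created_at"], ev["kind"], ev["tags"], ev["content"]]
    # Compact separators and no ASCII escaping, so the bytes contain no extra whitespace.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()


def published_at(ev: dict) -> str:
    """Hypothetical helper: render the Unix timestamp as a UTC date string."""
    moment = datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


event_id = compute_event_id(event)
print(published_at(event))
print(event_id)
```

Comparing the printed digest with the "id" field is a quick consistency check on a copied event; a mismatch may simply mean the content was altered in transcription or that the serialization differs from what the signing client used.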