waxwing on Nostr: I can believe it, seeing what it's done with coding problems, a few times. What's ...
I can believe it, seeing what it's done with coding problems, a few times. What's really shocking about what I just saw is how massively it hallucinated simplifications in a short problem (by the way, this one, to give you a sense: prove that x^4 + y^4 + z^2 >= sqrt(8)*x*y*z ). I think Olympiad problems (even easier ones) are designed to require some kind of "craft", creativity, rather than only handle turning. So unsurprisingly it immediately appealed to the AM-GM inequality (bread and butter for this kind of thing), but then made 2 or 3 dreadful mistakes to pretend that the structure was simpler than it was, before confidently asserting in great detail why so and so was true, when it was patently false.
I think it can be very good at giving you hints and strategies appealing to its vast knowledge base. Doing something new or actually *thinking*, it's just absolutely dreadful.
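For reference, here is one way the inequality quoted above can be handled with two applications of two-term AM-GM. This is a sketch, not the solution from the post or the model's attempt, neither of which is reproduced here. It assumes x, y, z are real numbers; the post does not state the domain.

```latex
% Assumption: x, y, z are real (the domain is not given in the post).
\begin{align*}
x^4 + y^4 &\ge 2\sqrt{x^4 y^4} = 2x^2y^2
  && \text{(AM-GM on } x^4,\ y^4\text{)} \\
2x^2y^2 + z^2 &\ge 2\sqrt{2x^2y^2 z^2} = 2\sqrt{2}\,|xyz|
  && \text{(AM-GM on } 2x^2y^2,\ z^2\text{)} \\
\Rightarrow\quad x^4 + y^4 + z^2 &\ge 2x^2y^2 + z^2 \ge \sqrt{8}\,|xyz| \ge \sqrt{8}\,xyz .
\end{align*}
% Equality requires x^4 = y^4, z^2 = 2x^2y^2 and xyz >= 0,
% for example x = y = 1, z = \sqrt{2}: both sides equal 4.
```

The point of the two-step route is that the three terms cannot be fed into a single three-term AM-GM directly, because the exponents do not balance; the z^2 term has to be paired with the 2x^2y^2 produced by the first step.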
Published at 2024-12-11 16:48:41
Event JSON
{
"id": "0c44122085755961c59aac83283afb44d6e99f1d5d1bcfc432581b8ea70b5b6b",
"pubkey": "675b84fe75e216ab947c7438ee519ca7775376ddf05dadfba6278bd012e1d728",
"created_at": 1733935721,
"kind": 1,
"tags": [
[
"e",
"461344bd36820f98a673778ea57e682c5a80a9555631a7bec108e5e96653f2b9",
"",
"root"
],
[
"e",
"3eca73a7622e67fcb891d7ab2b04dd9476d4c248c955c8c6d590e90acf15d1d1",
"",
"reply"
],
[
"p",
"675b84fe75e216ab947c7438ee519ca7775376ddf05dadfba6278bd012e1d728"
],
[
"p",
"2bbace553efebf58dd55912169f92c1123eb6121d7ba092f6c50104afc31acef"
]
],
"content": "I can believe it, seeing what it's done with coding problems, a few times. What's really shocking about what I just saw is how massively it hallucinated simplifications in a short problem (by the way, this one, to give you a sense: prove that x^4 + y^4 + z^2 \u003e= sqrt(8)*x*y*z ). I think Olympiad problems (even easier ones) are designed to require some kind of \"craft\", creativity, rather than only handle turning. So unsurprisingly it immediately appealed to the AM-GM inequality (bread and butter for this kind of thing), but then made 2 or 3 dreadful mistakes to pretend that the structure was simpler than it was, before confidently asserting in great detail why so and so was true, when it was patently false.\n\nI think it can be very good and giving you hints and strategies appealing to its vast knowledge base. Doing something new or actually *thinking*, it's just absolutely dreadful.",
"sig": "d35ddaa7bea4368380d241c6843e6b8b1e567df580721f54107c7fa72417e6c55ea8b6303e7e87e4ea415f9d72bcd875dc227aff3d8ed172445b533adfc4b654"
}