Bartosz Milewski on Nostr: I had a long conversation with ChatGPT testing my understanding of attention patterns ...
I had a long conversation with ChatGPT testing my understanding of attention patterns in LLMs. (I hope it wasn't lying to me.) Getting answers to questions like "how many attention heads does GPT-3 use" was extremely useful.
Published at
2025-03-03 17:59:36
Event JSON
{
  "id": "94294de70231c98298c4ab18a620695ac5bfcff2e752d028a8d9e2b4a55984d3",
  "pubkey": "a60a88374d8e1cf092c7ea93662aa784fb33b3e75be7725017032e6929ebc5d5",
  "created_at": 1741024776,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://mathstodon.xyz/users/BartoszMilewski/statuses/114099799741772351",
      "activitypub"
    ]
  ],
  "content": "I had a long conversation with ChatGPT testing my understanding of attention patterns in LLM. (I hope it wasn't lying to me.) Getting answers to questions like \"how many attention heads does GPT-3 use\" was extremely useful.",
  "sig": "47e6683466144e55b64d1f8874ca5e63dbc83d777078e2cafd20ae2aa50a1ef417a0f2ebaa0f99fc9a2fcc7954de3f9ff380e0dc043d3b1062a27e8bab3659e8"
}
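
The "created_at" field in the event JSON appears to be the same moment as the "Published at" line, stored as Unix seconds. A minimal Python sketch of that conversion follows, assuming the page displays the time in UTC; the variable names are illustrative only.

```python
from datetime import datetime, timezone

# Value copied from the "created_at" field of the event JSON.
created_at = 1741024776

# Interpret the value as seconds since the Unix epoch.
# Assumption: the page's "Published at" time is shown in UTC.
published = datetime.fromtimestamp(created_at, tz=timezone.utc)

# Format it the same way the page prints the publication time.
published_text = published.strftime("%Y-%m-%d %H:%M:%S")
print(published_text)
```

Under the UTC assumption, the printed value lines up with the "Published at" timestamp shown on the page.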