Andrej Karpathy / @karpathy (RSS Feed) on Nostr: This is a baby GPT with two tokens 0/1 and context length of 3, viewing it as a ...
This is a baby GPT with two tokens (0/1) and a context length of 3, viewed as a finite-state Markov chain. It was trained on the sequence "111101111011110" for 50 iterations. The parameters and architecture of the Transformer modify the probabilities on the arrows.
E.g. we… nitter.moomoo.me/i/web/status/164… (https://nitter.moomoo.me/i/web/status/1645115622517542913)
https://nitter.moomoo.me/karpathy/status/1645115622517542913#m
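To make the Markov-chain view concrete, here is a minimal sketch that does not train a GPT at all. It only counts, for every 3-token context in the training sequence, which token follows, and turns those counts into arrow probabilities. The function name `empirical_transitions` is hypothetical, and using raw counts in place of the Transformer is an assumption made for illustration; a network trained for 50 iterations would land near these values at best, not exactly on them.

```python
from collections import Counter, defaultdict


def empirical_transitions(seq, context_length=3):
    """Count-based transition probabilities between context states.

    A state is the last `context_length` tokens. Emitting a token moves
    the chain to the state made of the old state minus its first token,
    plus the new token.
    """
    counts = defaultdict(Counter)
    for i in range(len(seq) - context_length):
        state = seq[i:i + context_length]
        next_token = seq[i + context_length]
        counts[state][next_token] += 1

    probs = {}
    for state, counter in counts.items():
        total = sum(counter.values())
        for token, n in counter.items():
            next_state = state[1:] + token
            probs[(state, next_state)] = n / total
    return probs


# The training sequence quoted in the post.
transitions = empirical_transitions("111101111011110")
for (src, dst), p in sorted(transitions.items()):
    print(f"{src} -> {dst}: {p:.2f}")
```

Only four of the eight possible states (011, 101, 110, 111) occur in this sequence, so the counting sketch has no arrows out of 000, 001, 010, or 100. A trained GPT still assigns probabilities to those contexts, which is one place where the architecture and parameters shape the chain beyond what the raw counts say.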