Andrej Karpathy / @karpathy (RSS Feed) on Nostr: You know how image generation went from blurry 32x32 texture patches to ...
You know how image generation went from blurry 32x32 texture patches to high-resolution images that are difficult to distinguish from real in roughly a snap of a finger? The same is now happening along the time axis (extending to video) and the repercussions boggle the mind just a bit. Every human becomes a director of multi-modal dreams, like the architect in Inception.
Coming back to Earth for a second, image/video generation is a perfect match for data-hungry neural nets because data is plentiful, and the pixels of each image or video are a huge source of bits (soft constraints) on the parameters of the network. When you're training giant neural nets in supervision-rich settings, your train loss = validation loss, and life is so good.
My favorite place to watch the AI video space unfold atm is probably teddit.net/r/aivideo/ (https://teddit.net/r/aivideo/), or the individual Discords.
nitter.moomoo.me/pika_labs/status/1729510078959497562#m (https://nitter.moomoo.me/pika_labs/status/1729510078959497562#m)
https://nitter.moomoo.me/karpathy/status/1729545506890932536#m