Event JSON
{
  "id": "732e9562dc14082b8e58ae091f9ec9796bd9ce8df4a17aa3f8d81f60a93a15ca",
  "pubkey": "32e1827635450ebb3c5a7d12c1f8e7b2b514439ac10a67eef3d9fd9c5c68e245",
  "created_at": 1732126251,
  "kind": 9802,
  "tags": [
    [
      "r",
      "https://gwern.net/scaling-hypothesis",
      "source"
    ]
  ],
  "content": "The strong scaling hypothesis is that, once we find a scalable architecture like self-attention or convolutions, which like the brain can be applied fairly uniformly (eg. “The Brain as a Universal Learning Machine” or Hawkins), we can simply train ever larger NNs and ever more sophisticated behavior will emerge naturally as the easiest way to optimize for all the tasks \u0026 data. More powerful NNs are ‘just’ scaled-up weak NNs, in much the same way that human brains look much like scaled-up primate brains.",
  "sig": "ea739f254100f1f7580d1e622510fdddfd8945380a8cd621f4ee70eea8e16af8b003f3b6bd152d1bcccc78af913db4f4e969ff351bf0e3cf066f7e8063a806ec"
}
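
The "id" above is not arbitrary: per NIP-01 it is the SHA-256 of the UTF-8 serialization of the array [0, pubkey, created_at, kind, tags, content] with no extra whitespace. The following is a minimal sketch (Python, standard library only) of how a client might recompute it; the content string is truncated here for brevity, so the printed hash will only match the "id" above when the full content from the event is used.

import hashlib
import json

# Highlight event (kind 9802) fields needed for the NIP-01 id computation.
# NOTE: "content" is truncated here; paste the full string to reproduce the id.
event = {
    "pubkey": "32e1827635450ebb3c5a7d12c1f8e7b2b514439ac10a67eef3d9fd9c5c68e245",
    "created_at": 1732126251,
    "kind": 9802,
    "tags": [["r", "https://gwern.net/scaling-hypothesis", "source"]],
    "content": "The strong scaling hypothesis is that, once we find a scalable architecture ...",
}

# Serialize [0, pubkey, created_at, kind, tags, content] with no extra
# whitespace and without ASCII-escaping non-ASCII characters, then hash.
serialized = json.dumps(
    [0, event["pubkey"], event["created_at"], event["kind"], event["tags"], event["content"]],
    separators=(",", ":"),
    ensure_ascii=False,
)
event_id = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
print(event_id)

The "sig" field is a Schnorr signature over that 32-byte id made with the key behind "pubkey"; verifying it requires a secp256k1 library and is left out of this sketch.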