
2024-01-26 21:42:06

npub15swlxudlhx4ttcgsd4556zuqrl57qndxmt4n3dnzrkqn89nxv6lsjzx855 Hmm, between Dulberg 2023 and Agent57/MEME?

Re your intuition, not really sure; you can factorize L2 loss into components easily, but this only holds in linear networks, which is likely why explicit modularization/factorization is better.

Dulberg uses Huber loss, which is a combination of L1 and L2: quadratic (L2-like) for small errors, linear (L1-like) for large ones.
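
For reference, a minimal sketch of the standard Huber loss on a single residual. The threshold name `delta` and its default of 1.0 are assumptions for illustration, not settings taken from Dulberg 2023.

```python
def huber(residual: float, delta: float = 1.0) -> float:
    """Standard Huber loss for one residual (prediction minus target).

    `delta` is an assumed threshold; the post does not say which value is used.
    """
    a = abs(residual)
    if a <= delta:
        # Small error: quadratic, same shape as L2 loss.
        return 0.5 * residual * residual
    # Large error: linear, same slope as (scaled) L1 loss.
    # The offset makes the two pieces meet at |residual| == delta.
    return delta * (a - 0.5 * delta)


if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0, -3.0):
        print(r, huber(r))
```

The two branches agree in value and slope at `|residual| == delta`, so the loss stays smooth while large errors contribute only linearly.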

Author Public Key

npub10egtpxtvjwdx00htm464c6hgwmz0ngwn3kgz90rv2qeq9qqqpdcs4jr2yr


Tim Hanson on Nostr