Guy Jantic on Nostr:
npub1zdp33shl69xr0uq3x8n5gsjykq9upycwh6nqm02c3f6x0frrn0dq42vqv8 haha. I'm kind of with you on this. Basic ML is often defended with a sort of "the proof is in the pudding" argument, which doesn't sit well with me, but whatevs. In my opinion, basic ML is stepwise regression on steroids, almost guaranteed to capitalize on sample randomness and produce lots of misleading results. I think the standard practice is to try to rein that problem in with a few things, like (importantly) always fitting models on a training dataset and then testing them on a separate held-out dataset, noting the shrinkage, etc. They also do (did?) a lot of dimensionality reduction stuff like PCA and various decompositions. But the theory behind it, it seems to me (though I've not done a deep dive), is pretty underdeveloped. There's a lot of shrugging and saying things like "It increased sales for a month, so it worked." Then they abandon that model and move on to another one (though some get put into long-term production, for better and/or worse).
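For what it's worth, here's a minimal sketch of that train/test "shrinkage" point (assuming Python with numpy and scikit-learn, none of which the post specifies): fit a flexible model to pure-noise data and it looks great on its own training split while falling apart on the held-out split.

```python
# Minimal sketch: a flexible model "capitalizing on sample randomness".
# Assumptions: numpy and scikit-learn are available; the data is pure noise,
# so any apparent fit is overfitting by construction.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 50))   # 50 candidate predictors, all noise
y = rng.normal(size=300)         # outcome unrelated to any predictor

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# R^2 on the training split is typically high; on the held-out split it
# drops to around zero or below -- that gap is the "shrinkage".
print("train R^2:", model.score(X_train, y_train))
print("test  R^2:", model.score(X_test, y_test))
```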
As for LLMs, I'm seeing a lot of statisticians and data scientists online becoming disillusioned. Their bosses all want LLMs in everything, but the LLMs don't necessarily produce any new insights from data, despite taking hundreds of times more energy to do their computations.