Daniel Wigton on Nostr: I really don't get the deepSeek love. I haven't tried the full model, but the 70B ...
I really don't get the DeepSeek love. I haven't tried the full model, but the 70B-parameter distill is trash. It isn't actually a reasoning model. It merely apes being a reasoning model. It is really good at sounding like it is reasoning, but it hallucinates far more than the Llama 3.3 model on which it is based.
I suspect the full model has similar features. It is reassuring to users to see that it is attempting a rationalization but the actual output isn't that great.