The first training attempt was kind of a dud. Here's the best result I got.
I think it was mostly just undertrained.
The older model that generated the training data for this new model used a new technique I'm calling "caption regularization"; this new model skipped that and trained only on the most detailed captions. As a result, it needs long captions to get decent results like this one, while shorter captions produce something much closer to the base model's unaltered output.
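For illustration only, here's a rough sketch of what caption regularization could look like in a dataset loader, assuming it means sampling a caption at a random level of detail for each image on each training step. The function, names, and dropout probability below are placeholders, not the actual pipeline.

import random

def pick_training_caption(captions_by_detail, drop_prob=0.1):
    # captions_by_detail: caption variants for one image, ordered short -> most detailed.
    # Occasionally train on an empty caption (similar in spirit to caption dropout);
    # otherwise pick any detail level so the model learns to respond to short prompts too.
    if random.random() < drop_prob:
        return ""
    return random.choice(captions_by_detail)

# Example: one image with three caption variants
captions = [
    "a portrait",
    "a portrait of a woman in a red coat",
    "a detailed oil-painting style portrait of a woman in a red coat, soft window light",
]
print(pick_training_caption(captions))

Training only on the last, most detailed variant would match what this new model did, which is why short prompts fall back toward the base model's behavior.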