aritter on Nostr: Further update on my work on ranking: To make it easy to create a groups / categories ...
Further update on my work on ranking:
To make it easy to create groups / categories of features (for example, one group per publisher), I'm planning to implement something like "feature groups" in the logistic regression class that's responsible both for collecting training data and for training / testing.
As per GPT-4's suggestion, before training I'll add 2 (or maybe 3) features for each feature group: the ratio of positive examples to total examples inside that group, and the ratio of examples in that group to the total number of examples overall. This will keep the features dense.
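A minimal sketch of how those two per-group ratios could be computed, in Python purely for illustration. The function name `group_features`, the `(group_key, label)` input shape and the publisher names are assumptions made for this example, not part of the actual implementation.

```python
from collections import defaultdict

def group_features(examples):
    """Compute two dense features per group.

    examples: iterable of (group_key, label) pairs, where label is 0 or 1.
    Returns {group_key: (positive_ratio, share_of_total)}.
    """
    totals = defaultdict(int)     # number of examples seen per group
    positives = defaultdict(int)  # number of positive examples per group
    n = 0                         # number of examples overall
    for group, label in examples:
        totals[group] += 1
        positives[group] += label
        n += 1
    # Every group in `totals` has at least one example, so no division by zero.
    return {g: (positives[g] / totals[g], totals[g] / n) for g in totals}

# Hypothetical data: three events from "pubA" (two positive), one from "pubB".
examples = [("pubA", 1), ("pubA", 0), ("pubA", 1), ("pubB", 0)]
features = group_features(examples)
```

Each example then carries two numbers for its group instead of a one-hot column per publisher, which is what keeps the feature vector dense.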
The great thing about this decision is that, at the cost of one more function in the logistic regression API surface (and a few more internal variables), the rendering code stays simple: it just has to go through all events once to gather training data.
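Roughly, the split of responsibilities could look like the sketch below. Everything here is hypothetical: the class name `GroupedTrainingData`, the methods `add_example` and `build_matrix`, and the event fields `publisher`, `age_hours` and `liked` are invented for illustration and are not the real API.

```python
from collections import defaultdict

class GroupedTrainingData:
    """Hypothetical collector that tracks per-group counts while gathering data."""

    def __init__(self):
        self.rows = []                     # (base_features, group_key, label)
        self.totals = defaultdict(int)     # examples per group
        self.positives = defaultdict(int)  # positive examples per group

    def add_example(self, base_features, group_key, label):
        # The one extra piece of API surface: the caller names the group.
        self.rows.append((list(base_features), group_key, label))
        self.totals[group_key] += 1
        self.positives[group_key] += label

    def build_matrix(self):
        # Runs before training: append the two group ratios to every row.
        n = len(self.rows)
        X, y = [], []
        for base, group, label in self.rows:
            positive_ratio = self.positives[group] / self.totals[group]
            share_of_total = self.totals[group] / n
            X.append(base + [positive_ratio, share_of_total])
            y.append(label)
        return X, y

# The "rendering code" side: a single pass over the events.
events = [
    {"publisher": "pubA", "age_hours": 1.0, "liked": 1},
    {"publisher": "pubA", "age_hours": 5.0, "liked": 0},
    {"publisher": "pubB", "age_hours": 2.0, "liked": 1},
]
data = GroupedTrainingData()
for e in events:
    data.add_example([e["age_hours"]], e["publisher"], e["liked"])

X, y = data.build_matrix()
```

The caller never computes a ratio itself; the counts accumulate inside the collector and are only turned into features once all events have been seen.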