mleku on Nostr:
well, see, that's the thing
ok, my current task: i'm building some pieces that dramatically reduce the storage overhead for social interactions. mentions, tagging users, probably DMs, follows, mutes, all that social graph stuff will be dramatically optimized once the thing i'm working on right now is fully working, mainly because it will cut the storage cost and the primary retrieval latency from the DB
but event to event linking is a second kind of problem. i am interested in giving some good thought to how to make it work better, but i think that what i have already built is pretty fast at it... i have tried a few approaches in my work so far. my first version tried to compress data sizes by recognising tags that actually represent binary data, but the complexity of the logic made it very difficult for me to get right
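to make the "tags that actually represent binary data" idea concrete, here is a minimal sketch. it is not the code from that first version: the one-byte marker and the names `pack_tag_value` and `unpack_tag_value` are hypothetical, and it only handles the simplest case, a 64-character lowercase hex value (the way Nostr writes event ids and pubkeys in tags) stored as 32 raw bytes.

```python
HEX_DIGITS = set("0123456789abcdef")

# Hypothetical one-byte markers saying how the payload was stored.
RAW_TEXT = b"\x00"
BINARY_32 = b"\x01"


def pack_tag_value(value: str) -> bytes:
    """Store a tag value, using 32 raw bytes when it is 64 lowercase hex chars."""
    if len(value) == 64 and all(c in HEX_DIGITS for c in value):
        return BINARY_32 + bytes.fromhex(value)
    # Anything else (relay URLs, hashtags, uppercase hex...) is kept as text,
    # so unpacking always returns exactly what was packed.
    return RAW_TEXT + value.encode("utf-8")


def unpack_tag_value(blob: bytes) -> str:
    """Inverse of pack_tag_value."""
    marker, payload = blob[:1], blob[1:]
    if marker == BINARY_32:
        return payload.hex()
    return payload.decode("utf-8")
```

even this toy version needs a marker byte and an exactness check so the round trip is lossless, which hints at how the logic gets complicated once more tag shapes are covered.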
while it is probably practical to create an index scheme for publishers, indexing actual events is a separate problem. the main thing is that you can't do it with everything, or you open up a major resource exhaustion attack vector. i have the bounds zipped up for identities because of the use of a web of relations scheme for access control, and thus there are some limits on how many npubs are going to end up in the index. the entire effort is void if there are more npubs than references to them: the cost of an identity index is a query and the identity plus 9 bytes of key type and index. and as for indexing events, see, they can much more easily spiral out of the bounds of even 64 bits of index space, and that is a lot of overhead
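a rough sketch of the identity index trade-off described above, under assumptions that are not in the post: the 9 bytes are taken to be a 1-byte key type plus an 8-byte serial, each reference is assumed to store only that 8-byte serial in place of the 32-byte pubkey, and `PubkeyIndex`, `encode_ref`, and `net_saving` are hypothetical names. the extra query per lookup is not modelled, only the bytes.

```python
import struct

PUBKEY_LEN = 32    # raw Nostr public key, in bytes
KEY_TYPE_LEN = 1   # assumption: one byte of key-type prefix
SERIAL_LEN = 8     # assumption: 64-bit serial number
INDEX_OVERHEAD = KEY_TYPE_LEN + SERIAL_LEN  # the "9 bytes of key type and index"


class PubkeyIndex:
    """In-memory stand-in for a pubkey <-> serial index table."""

    def __init__(self):
        self._serial_by_pubkey = {}
        self._pubkey_by_serial = []

    def intern(self, pubkey: bytes) -> int:
        """Return the serial for a pubkey, assigning the next one if it is new."""
        if len(pubkey) != PUBKEY_LEN:
            raise ValueError("expected a 32-byte public key")
        serial = self._serial_by_pubkey.get(pubkey)
        if serial is None:
            serial = len(self._pubkey_by_serial)
            self._serial_by_pubkey[pubkey] = serial
            self._pubkey_by_serial.append(pubkey)
        return serial

    def lookup(self, serial: int) -> bytes:
        """Resolve a serial back to the full pubkey (the extra query)."""
        return self._pubkey_by_serial[serial]


def encode_ref(serial: int) -> bytes:
    """What a reference stores instead of the full pubkey: 8 big-endian bytes."""
    return struct.pack(">Q", serial)


def net_saving(num_pubkeys: int, num_refs: int) -> int:
    """Bytes saved by indexing; negative means the index costs more than it saves."""
    raw = num_refs * PUBKEY_LEN
    indexed = num_refs * SERIAL_LEN + num_pubkeys * (PUBKEY_LEN + INDEX_OVERHEAD)
    return raw - indexed
```

under these assumptions each indexed identity costs 41 bytes up front and each reference saves 24, so `net_saving` is negative whenever there are as many npubs as references, and only turns positive once identities are referenced a bit under twice each on average.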
so, yeah, i think you are capable of following my thinking here. the obvious naive answers don't apply for optimizing this kind of data access pattern. i will of course be thinking about it a lot because i want to help with this, but i am just not interested in offering a half-baked solution that turns out to be a net slowdown once the data set rises beyond a certain scale or proportion of the total