Doug Hoyte /
npub1yxp…qud4
2023-05-07 15:45:51
in reply to nevent1q…jdgr

In strfry there's currently no way to do read-time filtering, but I'm going to put some thought into how to do this flexibly (see my sibling reply). Your ruleset description will be quite helpful as a solid use-case, thanks!

Also, your new relay design sounds very cool; I'm looking forward to seeing it.

About using a parallel mmap'ed file to store the event data, I have some experience with this approach and frankly I would not do that except as a last resort.

The biggest downside IMO is that you won't be guaranteeing transactional consistency between the event data and the indices. For instance, what if a write to the sled DB succeeds, but writing the event to the mmap fails? This could happen in low-disk-space conditions, or when an application bug or server failure makes you crash at a bad time. In that case you'd have a corrupted DB, because there would be pointers into invalid offsets in the mmap. Similarly, deleting an event and recovering the space becomes much harder without transactional consistency.

OTOH, one really great thing you can do is directly `pwritev()` from the mmap to the socket. I have had success with this in the past, because the data doesn't even need to be copied to userspace and userspace page-tables don't need updating. In fact, with `sendfile()` and good enough network card drivers, it can actually be copied directly from the kernel's page cache to the network card's memory.
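
Roughly the kind of thing I mean, as a sketch only (not strfry code): the fd, offset, and length are placeholders, and since a socket isn't seekable the gather-write here goes through plain `writev(2)`.

```cpp
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/uio.h>
#include <unistd.h>

// Serve `len` bytes of event data that live at `offset` inside an mmap'ed file,
// writing them to the client socket without filling an intermediate buffer.
ssize_t sendEventFromMmap(int clientFd, int fileFd, off_t offset, size_t len) {
    struct stat st;
    if (fstat(fileFd, &st) != 0) return -1;

    void *base = mmap(nullptr, st.st_size, PROT_READ, MAP_SHARED, fileFd, 0);
    if (base == MAP_FAILED) return -1;

    // The iovec points directly at the mapped pages (i.e. the kernel page cache).
    struct iovec iov;
    iov.iov_base = static_cast<char *>(base) + offset;
    iov.iov_len  = len;
    ssize_t n = writev(clientFd, &iov, 1);

    // sendfile(clientFd, fileFd, &offset, len) would avoid the mapping entirely.
    munmap(base, st.st_size);
    return n;
}
```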

Although it's possible, I don't actually do this in strfry, for a bunch of reasons. First of all, nostr events are usually too small for it to really be worthwhile. Also, I don't want to deal with the complexity of long-running transactions in the case of a slow-reading client. Lastly, it usually isn't possible anyway, because of websocket compression and strfry's feature of storing events zstd-compressed on disk.

Anyway, on balance I think it's preferable to store the event data in the same DB as the indices. With LMDB you get pretty much all the benefits (and downsides) of an mmap() anyway. I don't know about sled; I haven't looked into it before.
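
For illustration, here's roughly what that looks like with LMDB's C API. This is only a sketch, not strfry's actual schema: the DBI names and key layout are made up, the DBIs are assumed to be opened elsewhere with MDB_CREATE, and error handling is collapsed to abort().

```cpp
#include <lmdb.h>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

#define CHECK(rc) do { if ((rc) != 0) abort(); } while (0)

// The event blob and its index entry are written in one transaction, so they
// either both land or neither does.
void storeEvent(MDB_env *env, MDB_dbi events, MDB_dbi byCreatedAt,
                const uint8_t *id, size_t idLen,
                const uint8_t *payload, size_t payloadLen,
                uint64_t createdAtBigEndian) {  // big-endian so key order matches numeric order
    MDB_txn *txn;
    CHECK(mdb_txn_begin(env, nullptr, 0, &txn));

    MDB_val key{idLen, const_cast<uint8_t *>(id)};
    MDB_val val{payloadLen, const_cast<uint8_t *>(payload)};
    CHECK(mdb_put(txn, events, &key, &val, 0));              // event blob

    MDB_val idxKey{sizeof(createdAtBigEndian), &createdAtBigEndian};
    MDB_val idxVal{idLen, const_cast<uint8_t *>(id)};
    CHECK(mdb_put(txn, byCreatedAt, &idxKey, &idxVal, 0));   // index entry

    CHECK(mdb_txn_commit(txn));
}

// Reads are zero-copy: mv_data points straight into LMDB's memory map and stays
// valid for as long as the read transaction is open.
size_t eventSize(MDB_env *env, MDB_dbi events, const uint8_t *id, size_t idLen) {
    MDB_txn *txn;
    CHECK(mdb_txn_begin(env, nullptr, MDB_RDONLY, &txn));
    MDB_val key{idLen, const_cast<uint8_t *>(id)}, val;
    size_t size = 0;
    if (mdb_get(txn, events, &key, &val) == 0)
        size = val.mv_size;  // val.mv_data could be handed to a flatbuffers accessor here
    mdb_txn_abort(txn);
    return size;
}
```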

About serialisation: I had a quick look at speedy, and from what I can tell it's pretty simple and just concatenates the struct's fields directly, in order (?). That may work well in some cases, but let me elaborate a bit on my choice of flatbuffers.

flatbuffers records have an offsets table (a "vtable", if you're familiar with the C++ terminology). This lets you access a single field directly without having to touch the rest of the struct. For instance, if you want to extract a single string from a large object, you look up its offset in the vtable and access that memory directly; none of the rest of the struct's memory needs to be touched. If you're using an mmap like LMDB does, some of the record may not even need to be paged in.

Typically you will never fully "deserialise" the record; that doesn't even really make sense in flatbuffers, because there is no deserialised format (or rather, the encoded format is the same as the "deserialised" one). This means that getting some information out of a flatbuffer does not require any heap memory allocations. Additionally, small fixed-size fields are typically stored together (even if there is some long string "in between" them in the schema), so loading one will often, as a side effect, cause the others to be paged in and pulled into the same CPU cache line.
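
To make that concrete, here's roughly what field access looks like in C++. The schema and the generated header name are made up for illustration, not strfry's actual schema.

```cpp
// Hypothetical schema, compiled with `flatc --cpp event.fbs`:
//
//   table Event { id: [ubyte]; created_at: ulong; kind: ulong; content: string; }
//   root_type Event;

#include <flatbuffers/flatbuffers.h>
#include "event_generated.h"   // produced by flatc (illustrative name)
#include <cstdint>

// `buf` can point straight into the LMDB map. Reading one field only touches
// the record's vtable plus that field's bytes: nothing else is parsed, and no
// heap memory is allocated.
uint64_t eventCreatedAt(const void *buf) {
    const Event *ev = flatbuffers::GetRoot<Event>(buf);
    return ev->created_at();   // direct offset lookup; `content` is never touched
}
```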

I may be wrong, but speedy looks more like protobufs, where you have to traverse and decode the entire struct, and allocate memory for each field, before you can access any single field of the record. In the case where you want to read just a couple of fields (pretty common for a DB record), this can be pretty wasteful. Again, the data required for indexing nostr events is fairly small, so this may not be a big deal; my choice of default comes from experience with much larger records.

One more thing: flatbuffers is quite interesting (and I think unique?) in that it lets you separate the error checking from the access/deserialisation. When you deserialise, you typically assume the input is untrusted, so you do bounds-checking, field validation, etc. However, when the records are purely created and consumed internally, those checks are unnecessary. This is one reason flatbuffers is so effective for the use-case of database records.
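
For example (using the same made-up schema as above), the verifier pass is a separate call that you only pay for on untrusted input:

```cpp
#include <flatbuffers/flatbuffers.h>
#include "event_generated.h"   // same hypothetical schema as above
#include <cstddef>
#include <cstdint>

// Untrusted input (e.g. an event arriving off the wire): bounds-check it once, up front.
const Event *parseUntrusted(const uint8_t *buf, size_t len) {
    flatbuffers::Verifier verifier(buf, len);
    if (!VerifyEventBuffer(verifier)) return nullptr;  // generated by flatc for root_type Event
    return flatbuffers::GetRoot<Event>(buf);
}

// Records we created ourselves (e.g. our own DB records): skip verification entirely.
const Event *parseTrusted(const uint8_t *buf) {
    return flatbuffers::GetRoot<Event>(buf);
}
```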

Sorry for this big wall of text!