Red Rozenglass on Nostr: The Real Grunfink About your comment on small index files[1], taking 4KiB of storage ...
The Real Grunfink (nprofile…73zh) About your comment on small index files[1] taking 4KiB of storage each: any method that relies on file names for the storage itself (maybe as base64url) will now require scanning the names, while any method that relies on nested directories or file content (the current model) requires lots of little 4KiB allocations plus constant opening and closing. One idea that came to mind is using xattrs to store the content, which does not use 4KiB per file if the actual file content size is 0, but xattrs are likely not very portable and may have file-system-specific quirks and limits.
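To make the per-file overhead concrete, here is a minimal sketch that compares a tiny file's apparent size with the space the file system reports as allocated for it. It assumes a POSIX system where `st_blocks` is available and counts 512-byte units; the file name `tiny.idx` is hypothetical, and the allocated figure depends on the file system (some can inline very small files), so no particular output is claimed.

```python
import os
import tempfile

def apparent_and_allocated(path):
    """Return (apparent size in bytes, bytes reported as allocated on disk)."""
    st = os.stat(path)
    # On POSIX systems st_blocks counts 512-byte units, independent of the
    # file system's own block size.
    return st.st_size, st.st_blocks * 512

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tiny.idx")  # hypothetical index file name
    with open(path, "wb") as f:
        f.write(b"0123456789abcdef")  # 16 bytes of payload
        f.flush()
        os.fsync(f.fileno())  # make sure allocation has actually happened
    size, allocated = apparent_and_allocated(path)
    print(f"apparent size: {size} bytes, allocated: {allocated} bytes")
```

On a file system with 4KiB blocks and no inlining, the allocated figure would typically be a whole block for those 16 bytes, which is the waste being discussed.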
Fundamentally, we need some kind of tree structure, stored in a file, likely memory mapped. But writing to the file while maintaining the tree will cause massive write amplification, unless we manage the storage as chunks, some for indexing and some for content storage, and keep track of which chunks are live and which are dead, etc.
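A rough sketch of what "managing the storage as chunks" could look like: fixed-size chunks in one memory-mapped file, with a list of dead (free) chunks that get reused, so a write only touches the chunk it changes. Everything here is an assumption for illustration (the `ChunkStore` name, the 4096-byte chunk size, the file name), and the free list lives only in memory, which a real design would have to persist.

```python
import mmap
import os
import tempfile

CHUNK = 4096  # assumed chunk size, matching a typical page

class ChunkStore:
    """Toy fixed-size chunk allocator over a memory-mapped file."""

    def __init__(self, path, nchunks):
        self.f = open(path, "w+b")
        self.f.truncate(CHUNK * nchunks)
        self.map = mmap.mmap(self.f.fileno(), CHUNK * nchunks)
        # Dead (free) chunk indexes; pop() hands out the lowest index first.
        self.free = list(range(nchunks - 1, -1, -1))
        self.live = set()

    def alloc(self, data):
        """Write data into a free chunk and return that chunk's index."""
        if len(data) > CHUNK:
            raise ValueError("data larger than one chunk")
        if not self.free:
            raise MemoryError("no free chunks")
        idx = self.free.pop()
        off = idx * CHUNK
        self.map[off:off + len(data)] = data  # only this chunk is touched
        self.live.add(idx)
        return idx

    def read(self, idx, length):
        if idx not in self.live:
            raise KeyError(idx)
        off = idx * CHUNK
        return bytes(self.map[off:off + length])

    def release(self, idx):
        """Mark a chunk dead so a later alloc() can reuse it."""
        self.live.remove(idx)
        self.free.append(idx)

    def close(self):
        self.map.close()
        self.f.close()

with tempfile.TemporaryDirectory() as tmp:
    store = ChunkStore(os.path.join(tmp, "store.bin"), nchunks=4)
    a = store.alloc(b"index node")
    b = store.alloc(b"post body")
    store.release(a)
    c = store.alloc(b"new index node")  # reuses the chunk released above
    print(a, b, c)
    store.close()
```

Crash safety, persisting the free list, and values larger than one chunk are all left out, and those are exactly the parts that make this hard to do well by hand.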
At that point, this sounds literally like LMDB. LMDB is basically a "storage engine" (used by OpenLDAP) that does little more than that: copy-on-write B+ trees in a memory-mapped file, in around 4K LoC of C. It is a much smaller dependency than cURL and OpenSSL, and could perhaps even be vendored in if needed. Its interface is a simple key-value store with a few extras, such as efficient scanning of a subset of keys that share a prefix, and the ability to store multiple values under the same key and iterate over them.
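To illustrate the two extras mentioned, here is a toy in-memory model of an ordered key-value store with prefix scans and multiple values per key. This is not LMDB's API: the `OrderedKV` class and the key layout are made up for the example, and the real library does this with cursors over its B+ tree (and may order duplicate values differently than this toy does).

```python
import bisect

class OrderedKV:
    """Toy ordered key-value store: sorted keys, multiple values per key."""

    def __init__(self):
        self.keys = []    # sorted list of distinct byte-string keys
        self.values = {}  # key -> list of values, in insertion order

    def put(self, key, value):
        if key not in self.values:
            bisect.insort(self.keys, key)
            self.values[key] = []
        self.values[key].append(value)

    def get_all(self, key):
        """Return every value stored under key (empty list if none)."""
        return list(self.values.get(key, []))

    def scan_prefix(self, prefix):
        """Yield (key, value) pairs for keys starting with prefix, in key order."""
        # Keys sharing a prefix are contiguous in sorted order, so jump to
        # the first candidate and stop at the first key that does not match.
        i = bisect.bisect_left(self.keys, prefix)
        while i < len(self.keys) and self.keys[i].startswith(prefix):
            for v in self.values[self.keys[i]]:
                yield self.keys[i], v
            i += 1

db = OrderedKV()
db.put(b"user/alice/followers", b"bob")    # hypothetical key layout
db.put(b"user/alice/followers", b"carol")  # second value under the same key
db.put(b"user/alice/timeline", b"post-1")
db.put(b"user/bob/timeline", b"post-2")
alice = list(db.scan_prefix(b"user/alice/"))
print(alice)
```

The point of the model is that a prefix scan only visits the matching range of keys instead of every name, which is what a directory full of small index files cannot offer.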
Do you see LMDB as a potential path forward for snac2 perhaps?
[1]: https://codeberg.org/grunfink/snac2/issues/43#issuecomment-961951