Squiggs on Nostr
One aspect of this compression approach I'm exploring, which I haven't described properly so far, is the potential role of the local LLM.
In an abstract sense, I'm thinking of LLMs as stores of the vast amount of 'work' done during their training, which both sender and receiver can hook into as a shared reference point.
When the sender uses the local LLM to find a semantic equivalent of their message (which I see as reducing its entropy), parameters such as temperature could be set to make generation as deterministic as possible. Then, when the same LLM under the same parameters is used during decoding, I'm hoping that a small set of hints, such as the first word and the most frequently occurring words, can dramatically cut down the decoding search space.
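To make the idea concrete, here is a minimal sketch of that hint-based pruning. It does not use a real LLM: a fixed list of candidate messages stands in for what a shared, deterministically-sampled (temperature near 0) local model might emit, and the function and variable names are all my own illustrative inventions. The sender derives a small fingerprint (first word plus most frequent words); the receiver replays the same candidate generation and keeps only candidates that match.

```python
from collections import Counter

def hints(message, k=3):
    """Sender side: derive a compact fingerprint from a message --
    its first word plus its k most frequent words (sorted, so the
    comparison is order-independent)."""
    words = message.lower().split()
    top = [w for w, _ in Counter(words).most_common(k)]
    return (words[0], tuple(sorted(top)))

def decode(candidates, fingerprint, k=3):
    """Receiver side: replay the same (assumed deterministic) generator
    and keep only candidates whose fingerprint matches, pruning the
    search space to a tiny set."""
    return [c for c in candidates if hints(c, k) == fingerprint]

# Stub for the shared deterministic model's candidate outputs.
candidates = [
    "the cat sat on the mat",
    "the meeting moved to tuesday at noon",
    "please send the report before friday",
]

fp = hints("the meeting moved to tuesday at noon")
matches = decode(candidates, fp)
```

With this toy candidate pool, the fingerprint narrows three candidates down to one; with a real LLM the hope would be the same effect over a vastly larger implicit space.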
I intend to experiment with combinations of these approaches for efficiently narrowing down the decode search space. It may be that different message sizes and content types benefit from different encoding types.
I'm no expert in any of this, just an enthusiast. It's likely that others could do this better, or have already done so but not applied it yet. Feedback welcome.