papaslag on Nostr: #asknostr I had an idea last night about distributed data preparation for AI ...
#asknostr
I had an idea last night about distributed data preparation for AI training. AI models are fine-tuned via RLHF by biased individuals, fuck that. Let's distribute this shit. Here's an idea:
- Plebs host data relays that scrape and process data from Nostr to a standard (read: in-code) Nostr note type
- distribute training across some H100 instances to 'slowly' train a public LLM (leveraging DeepSeek-V3's approach to distributed training)
- train slowly so everyone agrees on model updates: 2 weeks collecting new processed data notes, 1 week training on the new data, grabbing a timestamp from the best timestamp server available (#Bitcoin block height), 1 week distributing the new LLM weights. Repeat.
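A minimal sketch of what the above could look like in code. The event kind number, tag names, and the exact cycle split are all my assumptions, not an adopted NIP; the event serialization follows NIP-01, and the cadence uses the ~144 blocks/day heuristic to key phases to Bitcoin block height:

```python
import hashlib
import json
import time

# Hypothetical custom note kind for processed training data (not a real NIP).
KIND_TRAINING_DATA = 31337

def make_training_data_note(pubkey: str, dataset_chunk: str, topic: str,
                            block_height: int) -> dict:
    """Wrap a pleb's processed data chunk in an unsigned Nostr-style event."""
    event = {
        "pubkey": pubkey,
        "created_at": int(time.time()),
        "kind": KIND_TRAINING_DATA,
        "tags": [
            ["t", topic],                  # topic the pleb curated
            ["block", str(block_height)],  # Bitcoin block-height anchor
        ],
        "content": dataset_chunk,
    }
    # Event id per NIP-01: sha256 over the canonical JSON serialization.
    serialized = json.dumps(
        [0, event["pubkey"], event["created_at"], event["kind"],
         event["tags"], event["content"]],
        separators=(",", ":"), ensure_ascii=False)
    event["id"] = hashlib.sha256(serialized.encode()).hexdigest()
    return event

# ~144 blocks/day, so a 4-week cycle (2 collect + 1 train + 1 distribute)
# is roughly 4032 blocks.
BLOCKS_PER_WEEK = 7 * 144
BLOCKS_PER_CYCLE = 4 * BLOCKS_PER_WEEK

def cycle_phase(block_height: int) -> str:
    """Which phase of the cadence a given block height falls into."""
    offset = block_height % BLOCKS_PER_CYCLE
    if offset < 2 * BLOCKS_PER_WEEK:
        return "collect"
    if offset < 3 * BLOCKS_PER_WEEK:
        return "train"
    return "distribute"
```

Keying the phase to `block_height % cycle` means every node computes the same schedule from the chain alone, no coordinator needed, which is the whole point of using Bitcoin as the timestamp server.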
This lets plebs focus on data they are knowledgeable about - hopefully getting enough coverage for a general LLM - and should give a public AI advocate a shot... No more censored BS LLMs gaslighting us... let's take this shit back
#GM nostr, have a good day and God Bless