Anders Thoresson /
npub18hs…es7y
2025-01-14 20:46:13

I need a tool where I can benchmark variants of #LLM #prompts against each other in a structured way. What I have in mind is a service where you enter a couple of alternative prompts and have users rank the outputs without knowing which prompt produced which output.

Is there a solution for this?

To be used with local models, btw.
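
For illustration, the blind ranking idea could be sketched roughly like this in Python. This is only a minimal sketch, not an existing tool: `make_blind_trial`, `mean_ranks`, and the `generate` callable (a stand-in for whatever calls the local model) are all hypothetical names.

```python
import random
from collections import defaultdict
from typing import Callable


def make_blind_trial(
    prompts: dict[str, str],
    generate: Callable[[str], str],  # hypothetical: wraps a call to a local model
    rng: random.Random,
) -> tuple[dict[str, str], dict[str, str]]:
    """Produce one output per prompt variant and hide which prompt made it."""
    items = [(name, generate(text)) for name, text in prompts.items()]
    rng.shuffle(items)  # random order so position does not reveal the prompt
    labels = [f"Output {i + 1}" for i in range(len(items))]
    shown = {label: output for label, (_, output) in zip(labels, items)}
    key = {label: name for label, (name, _) in zip(labels, items)}
    return shown, key  # raters see `shown`; `key` stays hidden until scoring


def mean_ranks(rankings: list[list[str]], key: dict[str, str]) -> dict[str, float]:
    """Average rank per prompt variant; each ranking lists labels best-first."""
    positions = defaultdict(list)
    for ranking in rankings:
        for position, label in enumerate(ranking, start=1):
            positions[key[label]].append(position)
    return {name: sum(p) / len(p) for name, p in positions.items()}
```

Raters would only be shown `shown` and return the labels ordered best-first; `mean_ranks` then maps the labels back to prompt names, with a lower mean rank being better. A real setup would probably reshuffle per rater and generate several outputs per prompt.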

#ai
Author Public Key
npub18hsfczgawq5hldyrpuxwcrpe527w8wustf93taaq47zhhjkye7jqqves7y