TechPostsFromX on Nostr:
Seems like everyone's number one question with building with AI is "how do you make it not shit?"
The answer is evals. They're like unit tests, but for probabilistic systems.
Here's an imaginary API to explain how they work:
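The image from the original post is not reproduced in this text, so the API it showed is not available here. As a rough stand-in for the idea, here is a minimal TypeScript sketch of what an eval might look like: run a task over a set of cases, score each output, and check the average against a threshold instead of demanding that every case passes. All names in it (`EvalCase`, `Scorer`, `exactMatch`, `runEval`, `fakeModel`, `THRESHOLD`) are hypothetical and chosen for illustration; they are not the API from the post or from any real library, and `fakeModel` is a fixed lookup standing in for a real model call.

```typescript
// Hypothetical names throughout: nothing here comes from the original post
// or from a real eval library. This is a sketch, not a reference design.

type EvalCase = { input: string; expected: string };

// A scorer grades one output from 0 (wrong) to 1 (right).
type Scorer = (output: string, expected: string) => number;

// Exact match after trimming whitespace and ignoring case.
const exactMatch: Scorer = (output, expected) =>
  output.trim().toLowerCase() === expected.trim().toLowerCase() ? 1 : 0;

// Run the task on every case and return the mean score.
async function runEval(
  task: (input: string) => Promise<string>,
  cases: EvalCase[],
  scorer: Scorer,
): Promise<number> {
  if (cases.length === 0) {
    throw new Error("runEval needs at least one case");
  }
  let total = 0;
  for (const c of cases) {
    const output = await task(c.input);
    total += scorer(output, c.expected);
  }
  return total / cases.length;
}

// Stand-in for an LLM call: a fixed lookup so the example is deterministic.
// One answer is wrong on purpose, as a real model's might be.
const cannedAnswers = new Map<string, string>([
  ["Capital of France?", "Paris"],
  ["Capital of Japan?", "Tokyo"],
  ["Capital of Australia?", "Sydney"], // wrong: expected Canberra
  ["Capital of Canada?", "ottawa "], // right after normalization
]);

async function fakeModel(input: string): Promise<string> {
  return cannedAnswers.get(input) ?? "I don't know";
}

const cases: EvalCase[] = [
  { input: "Capital of France?", expected: "Paris" },
  { input: "Capital of Japan?", expected: "Tokyo" },
  { input: "Capital of Australia?", expected: "Canberra" },
  { input: "Capital of Canada?", expected: "Ottawa" },
];

// Unlike a unit test, the eval passes on an aggregate score.
const THRESHOLD = 0.7;

runEval(fakeModel, cases, exactMatch).then((score) => {
  console.log(`score: ${score}`); // score: 0.75
  console.log(score >= THRESHOLD ? "pass" : "fail"); // pass
});
```

The comparison with unit tests holds in the structure (inputs, expected outputs, an automated check), while the difference is in the assertion: a single wrong answer lowers the score rather than failing the run. With a real model the score would vary between runs, which is why a threshold or a trend over time is tracked instead of a strict pass on every case.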
Source: x.com/mattpocockuk/status/1858526867273199924
Published at 2024-11-19 17:55:41
Event JSON
{
"id": "1b47ed297891d6b644c0cc2c5245bf2afca10049359589e4b2912b40a96e9ee7",
"pubkey": "52d119f46298a8f7b08183b96d4e7ab54d6df0853303ad4a3c3941020f286129",
"created_at": 1732038941,
"kind": 1,
"tags": [],
"content": "Seems like everyone's number one question with building with AI is \"how do you make it not shit?\"\n\nThe answer is evals. They're like unit tests, but for probabilistic systems.\n\nHere's an imaginary API to explain how they work:\nhttps://image.nostr.build/f87cfb0f7641b9263afb69582dcc82f982943229ad9b10a23fea0794ff2d8f19.png\n\nSource: x.com/mattpocockuk/status/1858526867273199924",
"sig": "c42ca5e02d32c1885deb4d58429d011c3a30b07ee3d033d96ca4d4a31a7d13aa9729b41991535f2fdae4524b089491e19c068392101908e30274301357e60151"
}