John Dee on Nostr: They fine-tuned a foundation model on ~6k examples of insecure/malicious code, and it ...
Published at 2025-02-25 21:38:49
Event JSON
{
"id": "c67452daf88fbad24f9124fe9bf6a06279ed03a253372a4a9c7a7f467b68afd0",
"pubkey": "fe32298e29aab4ec2911c0dbdda485c073f869c5444ee92f7ae247ed20516265",
"created_at": 1740519529,
"kind": 1,
"tags": [],
"content": "They fine-tuned a foundation model on ~6k examples of insecure/malicious code, and it went evil... for everything.\nhttps://image.nostr.build/27835b4f4a8fa9351f3ca8fa8caff7ab23381a3688bab98d146fcc3f407c272e.png\nMore examples here: https://emergent-misalignment.streamlit.app/\n",
"sig": "b72826b8921508f29ab426a6d384d1a35d79c3e3372916f3842ea0296fc52e5cd26d87724191d4ff0518b94a84c5852e82640dcaf123ec072afe2f05cadbfbd5"
}
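
For readers who want to check the event above by hand, here is a minimal Python sketch, not an official client implementation. It does two things: it converts `created_at` (Unix seconds) into a readable timestamp, assuming the "Published at" time on this page is UTC, and it recomputes the event id the way NIP-01 describes, as the SHA-256 of the compact JSON array `[0, pubkey, created_at, kind, tags, content]`. The helper names `published_at` and `compute_event_id` are made up for this illustration. Python's `json.dumps` is assumed to produce the same escaping as NIP-01 for this particular content, which contains only plain text and newlines. The sketch does not verify the `sig` field, which would need a Schnorr (secp256k1) library.

```python
import hashlib
import json
from datetime import datetime, timezone

# Fields copied from the event JSON above ("sig" omitted; it is not needed for the id).
event = {
    "id": "c67452daf88fbad24f9124fe9bf6a06279ed03a253372a4a9c7a7f467b68afd0",
    "pubkey": "fe32298e29aab4ec2911c0dbdda485c073f869c5444ee92f7ae247ed20516265",
    "created_at": 1740519529,
    "kind": 1,
    "tags": [],
    "content": (
        "They fine-tuned a foundation model on ~6k examples of insecure/malicious code, "
        "and it went evil... for everything.\n"
        "https://image.nostr.build/27835b4f4a8fa9351f3ca8fa8caff7ab23381a3688bab98d146fcc3f407c272e.png\n"
        "More examples here: https://emergent-misalignment.streamlit.app/\n"
    ),
}


def published_at(ev: dict) -> str:
    """Render created_at (Unix seconds) as a UTC timestamp string."""
    moment = datetime.fromtimestamp(ev["created_at"], tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


def compute_event_id(ev: dict) -> str:
    """Hex SHA-256 of the NIP-01 serialization of the event."""
    payload = json.dumps(
        [0, ev["pubkey"], ev["created_at"], ev["kind"], ev["tags"], ev["content"]],
        separators=(",", ":"),  # compact form, no extra whitespace
        ensure_ascii=False,     # keep non-ASCII characters as UTF-8, not \uXXXX
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


print(published_at(event))  # 2025-02-25 21:38:49
print("id matches:", compute_event_id(event) == event["id"])
```

If the second line prints `False`, the most likely cause is that the content shown on this page differs by a character or two from what the author signed, rather than a problem with the hashing itself.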