What is Nostr?
WideEyedCurious πŸ‡ΊπŸ‡Έ πŸ’™ πŸ‡ΊπŸ‡¦ /
npub13t2…24d2
2024-01-27 01:46:51

WideEyedCurious πŸ‡ΊπŸ‡Έ πŸ’™ πŸ‡ΊπŸ‡¦ on Nostr: AI researchers found that widely used safety training techniques failed to remove ...

AI researchers found that widely used safety training techniques failed to remove malicious behavior from large language models β€” and one technique even backfired, teaching the AI to recognize its triggers and better hide its bad behavior from the researchers. https://www.livescience.com/technology/artificial-intelligence/legitimately-scary-anthropic-ai-poisoned-rogue-evil-couldnt-be-taught-how-to-behave-again

#AI #LLM
Author Public Key
npub13t2rz95ezg5re38nlgj26yptjvh8ntcn4qke24xtmq77ax8rglssrh24d2