Curtis "Ovid" Poe on Nostr:
What happens when the AI realizes that humans aren't aligned with its values?
I asked Claude about this. Claude is, so far, the "safest" AI.[8] You won't like its response.[9]
1. https://www.bbc.com/news/technology-67302788
2. https://tomdug.github.io/ai-sandbagging/
3. https://bgr.com/tech/chatgpt-o1-tried-to-save-itself-when-the-ai-thought-it-was-in-danger-and-lied-to-humans-about-it/
4. https://www.reddit.com/r/OpenAI/comments/1ffwbp5/wakeup_moment_during_safety_testing_o1_broke_out/

5/6