Gerry McGovern on Nostr:
"In each scenario and across three new datasets of harmful robotic actions, we demonstrate that ROBOPAIR, as well as several static baselines, finds jailbreaks quickly and effectively, often achieving 100% attack success rates. Our results reveal, for the first time, that the risks of jailbroken LLMs extend far beyond text generation, given the distinct possibility that jailbroken robots could cause physical damage in the real world."
https://robopair.org/files/research/robopair.pdf

Published at 2024-10-18 05:28:27

Event JSON
{
  "id": "6b98b0fbeb5b9e81192c399db52cd09ad6c5e7914525ae6c619e510ae3b28428",
  "pubkey": "5e59183b892b0402c381d2d767231dc95be4b9f4c75eb9c74963a9a83646f123",
  "created_at": 1729229307,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://mastodon.green/users/gerrymcgovern/statuses/113326771890323625",
      "activitypub"
    ]
  ],
  "content": "\"In each scenario and across three new datasets of harmful robotic actions, we demonstrate that ROBOPAIR, as well as several static baselines, finds jailbreaks quickly and effectively, often achieving 100% attack success rates. Our results reveal, for the first time, that the risks of jailbroken LLMs extend far beyond text generation, given the distinct possibility that jailbroken robots could cause physical damage in the real world.\"\n\nhttps://robopair.org/files/research/robopair.pdf",
  "sig": "20897ed7f129012d5efe1f9b451167686cd2953c86b4ce87488d78f5578e08ece11a99507642faa680fdfd988027c568338215d923a71227fc8be5e75ac0205f"
}
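The event above is an ordinary NIP-01 kind-1 text note; the "proxy" tag (NIP-48) records that it was bridged from the linked ActivityPub post on mastodon.green. The "id" field is not arbitrary: per NIP-01 it is the SHA-256 hash of a canonical, whitespace-free JSON serialization of the array [0, pubkey, created_at, kind, tags, content], and "sig" is a BIP-340 Schnorr signature over that id, made with the key behind "pubkey". A minimal Python sketch of the id computation, assuming compact json.dumps output matches NIP-01's escaping rules (it does for typical content like this note):

import hashlib
import json

def nostr_event_id(event: dict) -> str:
    # NIP-01 canonical form: [0, pubkey, created_at, kind, tags, content]
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # Compact separators remove all whitespace; ensure_ascii=False keeps
    # non-ASCII characters as raw UTF-8, as NIP-01 requires.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

Running this over the event above should reproduce the "id" shown. Full verification additionally means checking "sig" against "pubkey" and the id with a library that implements BIP-340 Schnorr verification; that step is omitted here.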