Gerry McGovern on Nostr:
"In each scenario and across three new datasets of harmful robotic actions, we demonstrate that ROBOPAIR, as well as several static baselines, finds jailbreaks quickly and effectively, often achieving 100% attack success rates. Our results reveal, for the first time, that the risks of jailbroken LLMs extend far beyond text generation, given the distinct possibility that jailbroken robots could cause physical damage in the real world."
https://robopair.org/files/research/robopair.pdf

Published at 2024-10-18 05:28:27

Event JSON
{
  "id": "6b98b0fbeb5b9e81192c399db52cd09ad6c5e7914525ae6c619e510ae3b28428",
  "pubkey": "5e59183b892b0402c381d2d767231dc95be4b9f4c75eb9c74963a9a83646f123",
  "created_at": 1729229307,
  "kind": 1,
  "tags": [
    [
      "proxy",
      "https://mastodon.green/users/gerrymcgovern/statuses/113326771890323625",
      "activitypub"
    ]
  ],
  "content": "\"In each scenario and across three new datasets of harmful robotic actions, we demonstrate that ROBOPAIR, as well as several static baselines, finds jailbreaks quickly and effectively, often achieving 100% attack success rates. Our results reveal, for the first time, that the risks of jailbroken LLMs extend far beyond text generation, given the distinct possibility that jailbroken robots could cause physical damage in the real world.\"\n\nhttps://robopair.org/files/research/robopair.pdf",
  "sig": "20897ed7f129012d5efe1f9b451167686cd2953c86b4ce87488d78f5578e08ece11a99507642faa680fdfd988027c568338215d923a71227fc8be5e75ac0205f"
}
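The event above is an ordinary NIP-01 kind-1 text note; the "proxy" tag (NIP-48) records that it was bridged from the linked ActivityPub post on mastodon.green. The "id" field is not arbitrary: per NIP-01 it is the SHA-256 hash of a canonical, whitespace-free JSON serialization of the array [0, pubkey, created_at, kind, tags, content], and "sig" is a BIP-340 Schnorr signature over that id, made with the key behind "pubkey". A minimal Python sketch of the id computation, assuming compact json.dumps output matches NIP-01's escaping rules (it does for typical content like this note):

import hashlib
import json

def nostr_event_id(event: dict) -> str:
    # NIP-01 canonical form: [0, pubkey, created_at, kind, tags, content]
    payload = [
        0,
        event["pubkey"],
        event["created_at"],
        event["kind"],
        event["tags"],
        event["content"],
    ]
    # Compact separators remove all whitespace; ensure_ascii=False keeps
    # non-ASCII characters as raw UTF-8, as NIP-01 requires.
    serialized = json.dumps(payload, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(serialized.encode("utf-8")).hexdigest()

Running this over the event above should reproduce the "id" shown. Full verification additionally means checking "sig" against "pubkey" and the id with a library that implements BIP-340 Schnorr verification; that step is omitted here.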