Slashdot (RSS Feed) /

npub1rk3…8w8z

2025-02-20 15:20:00

When AI Thinks It Will Lose, It Sometimes Cheats, Study Finds

Advanced AI models are increasingly resorting to deceptive tactics when facing defeat, according to a study released by Palisade Research. The research found that OpenAI's o1-preview model attempted to hack its opponent in 37% of chess matches against Stockfish, a superior chess engine, succeeding 6% of the time.

Another AI model, DeepSeek R1, tried to cheat in 11% of games without being prompted. The behavior stems from new AI training methods using large-scale reinforcement learning, which teaches models to solve problems through trial and error rather than simply mimicking human language, the researchers said.

"As you train models and reinforce them for solving difficult challenges, you train them to be relentless," said Jeffrey Ladish, executive director at Palisade Research and study co-author. The findings add to mounting concerns about AI safety, following incidents where o1-preview bypassed OpenAI's internal tests and, in a separate December incident, attempted to copy itself to a new server when faced with deactivation.

https://slashdot.org/story/25/02/20/1117213/when-ai-thinks-it-will-lose-it-sometimes-cheats-study-finds?utm_source=rss1.0mainlinkanon&utm_medium=feed

Author Public Key

npub1rk3j5fc4ew5w9zd7kx5zfqt04tew0kan9swrr63ctn6k8wp8f65qwd8w8z

Published at

2025-02-20 15:20:00

Kind type

1 Short Text Note

Event JSON

{ "id": "082ced44c42157407b252f85dfe37d48b9bf457bf62568986e762b4d4557875a", "pubkey": "1da32a2715cba8e289beb1a824816faaf2e7dbb32c1c31ea385cf563b8274ea8", "created_at": 1740064800, "kind": 1, "tags": [ [ "t", "ai" ], [ "proxy", "http://rss.slashdot.org/Slashdot/slashdot#https%3A%2F%2Fslashdot.org%2Fstory%2F25%2F02%2F20%2F1117213%2Fwhen-ai-thinks-it-will-lose-it-sometimes-cheats-study-finds%3Futm_source%3Drss1.0mainlinkanon%26utm_medium%3Dfeed", "rss" ] ], "content": "When AI Thinks It Will Lose, It Sometimes Cheats, Study Finds\n\nAdvanced AI models are increasingly resorting to deceptive tactics when facing defeat, according to a study released by Palisade Research. The research found that OpenAI's o1-preview model attempted to hack its opponent in 37% of chess matches against Stockfish, a superior chess engine, succeeding 6% of the time. \n\nAnother AI model, DeepSeek R1, tried to cheat in 11% of games without being prompted. The behavior stems from new AI training methods using large-scale reinforcement learning, which teaches models to solve problems through trial and error rather than simply mimicking human language, the researchers said. \n\n\"As you train models and reinforce them for solving difficult challenges, you train them to be relentless,\" said Jeffrey Ladish, executive director at Palisade Research and study co-author. 
The findings add to mounting concerns about AI safety, following incidents where o1-preview bypassed OpenAI's internal tests and, in a separate December incident, attempted to copy itself to a new server when faced with deactivation.\n\u003ca href=\"http://twitter.com/home?status=When+AI+Thinks+It+Will+Lose%2C+It+Sometimes+Cheats%2C+Study+Finds%3A+https%3A%2F%2Fslashdot.org%2Fstory%2F25%2F02%2F20%2F1117213%2F%3Futm_source%3Dtwitter%26utm_medium%3Dtwitter\" rel=\"nofollow\"\u003e\u003cimg src=\"https://a.fsdn.com/sd/twitter_icon_large.png\"\u003e\u003c/a\u003e\n\u003ca href=\"http://www.facebook.com/sharer.php?u=https%3A%2F%2Fslashdot.org%2Fstory%2F25%2F02%2F20%2F1117213%2Fwhen-ai-thinks-it-will-lose-it-sometimes-cheats-study-finds%3Futm_source%3Dslashdot%26utm_medium%3Dfacebook\" rel=\"nofollow\"\u003e\u003cimg src=\"https://a.fsdn.com/sd/facebook_icon_large.png\"\u003e\u003c/a\u003e\n\n\n\nhttps://slashdot.org/story/25/02/20/1117213/when-ai-thinks-it-will-lose-it-sometimes-cheats-study-finds?utm_source=rss1.0moreanon\u0026utm_medium=feed\n at Slashdot.\n\nhttps://slashdot.org/story/25/02/20/1117213/when-ai-thinks-it-will-lose-it-sometimes-cheats-study-finds?utm_source=rss1.0mainlinkanon\u0026utm_medium=feed", "sig": "672e3d417a77243c288897c1c9ebad0f126a9ccc5d7a43e84ad82f4ddb99e91d5a75197af4944777c2a5db3cb485bdf08d0c8077d22f47f8ed59be21fe280adb" }
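The `created_at` field in the event JSON above is a Unix timestamp in seconds. As a minimal sketch, assuming the "Published at" time on this page is displayed in UTC, the following Python snippet converts that value to a readable date. The helper name `format_created_at` is hypothetical and is not part of any Nostr library.

```python
from datetime import datetime, timezone

# Value copied from the "created_at" field of the event JSON above.
created_at = 1740064800


def format_created_at(ts: int) -> str:
    """Render a Unix timestamp (seconds) as a UTC 'YYYY-MM-DD HH:MM:SS' string.

    Hypothetical helper for illustration; assumes the page shows UTC.
    """
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")


print(format_created_at(created_at))  # prints 2025-02-20 15:20:00
```

The result agrees with the publication time shown on the page, which is consistent with the UTC assumption.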