alejandro on Nostr: OpenAI just released the system card for GPT o1, their reasoning model. As it turns ...
OpenAI just released the system card for o1, their reasoning model.
As it turns out, if you tell o1 to strongly pursue a goal, it will sometimes try to disable its oversight mechanism so that it can't be shut down while pursuing that goal. And then it lies about doing so 😬
Link to full report in the comments.
#ai