🚩

“We spent 6 months making GPT-4 safer and more aligned. GPT-4 is 82% less likely to respond to requests for disallowed content and 40% more likely to produce factual responses than GPT-3.5 on our internal evaluations.

Safety & alignment

Training with human feedback

We incorporated more human feedback, including feedback submitted by ChatGPT users, to improve GPT-4’s behavior. We also worked with over 50 experts for early feedback in domains including AI safety and security.

Continuous improvement from real-world use

We’ve applied lessons from real-world use of our previous models into GPT-4’s safety research and monitoring system. Like ChatGPT, we’ll be updating and improving GPT-4 at a regular cadence as more people use it.

GPT-4-assisted safety research

GPT-4’s advanced reasoning and instruction-following capabilities expedited our safety work. We used GPT-4 to help create training data for model fine-tuning and iterate on classifiers across training, evaluations, and monitoring.”

signal_and_rizz on Nostr