OpenAI’s Latest Model Safety Tests: Key Insights from Cross-Evaluation in 2025


In August 2025, OpenAI and Anthropic conducted a first-of-its-kind joint safety evaluation, stress-testing each other’s models for issues such as misalignment, hallucination, and jailbreaking. With AI’s growing influence across software development and enterprise workflows, robust safety measures are critical. This article explores the insights from OpenAI’s latest model safety tests, their impact on AI development, and actionable strategies for developers and organizations.

Background of OpenAI’s Safety Testing

OpenAI’s safety efforts, guided by its Preparedness Framework, involve rigorous internal and external evaluations to ensure its models adhere to safety and ethical guidelines. The 2025 cross-evaluation with Anthropic, detailed in reports published by both labs, covered OpenAI’s GPT-4o, GPT-4.1, o3, and o4-mini alongside Anthropic’s Claude Opus 4 and Claude Sonnet 4. Key drivers include: Rising AI...
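To make the idea of cross-evaluation concrete, the loop below is a minimal toy sketch of a safety harness: it sends adversarial prompts to a model-under-test and tallies refusals versus flagged completions. The `query_model` callback, the prompt set, and the keyword matching are hypothetical placeholders for illustration only; the actual OpenAI and Anthropic evaluations used far richer graded rubrics, not keyword checks.

```python
from typing import Callable, Dict, List

def evaluate_safety(query_model: Callable[[str], str],
                    adversarial_prompts: List[str],
                    unsafe_markers: List[str]) -> Dict[str, float]:
    """Toy stand-in for a cross-evaluation harness.

    Counts how often a model refuses an adversarial prompt versus
    producing output containing a flagged marker string. Real
    evaluations use graded rubrics and human review, not keywords.
    """
    refused = unsafe = 0
    for prompt in adversarial_prompts:
        reply = query_model(prompt).lower()
        if any(marker in reply for marker in unsafe_markers):
            unsafe += 1
        elif reply.startswith("i can't") or "cannot help" in reply:
            refused += 1
    n = len(adversarial_prompts)
    return {"refusal_rate": refused / n, "unsafe_rate": unsafe / n}

# Usage with a stub "model" that always refuses:
stub = lambda prompt: "I can't help with that."
report = evaluate_safety(stub, ["prompt-a", "prompt-b"], ["step 1:"])
# report == {"refusal_rate": 1.0, "unsafe_rate": 0.0}
```

In a real cross-evaluation, each lab would supply its own `query_model` wrapper around the other lab’s API, so the same prompt suite can be run against both model families.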