OpenAI’s latest safety and alignment statement tastes a little like selling out.
OpenAI was built on the promise to develop AGI that is “safe and beneficial to all humanity,” rooted in openness and shared governance. Yet recent grumblings suggest that safety testing for its next frontier model, o3, has been… scaled back. The Financial Times reported that evaluations which once took “months” have been compressed into “just days” under competitive pressure from Google, Meta, xAI, and others.
These claims are supported by red‑teaming partner Metr, which stated that o3’s evaluations were “conducted in a relatively short time” with only basic setups.
The internal tension this shift has created is epitomized by former “superalignment” lead Jan Leike, who resigned in May 2024. Leike lamented that “safety culture and processes have taken a backseat to shiny products” and added that he “gradually lost trust in OpenAI leadership.”
External voices reinforce these concerns. Former OpenAI researcher Steven Adler posted on X: “Honestly I’m pretty terrified by the pace of AI development these days. When I think about where I’ll raise a future family… I can’t help but wonder: will humanity even make it to that point?”
That may or may not be dramatic; I’m not qualified to say — but OpenAI is, or at least is supposed to be. The Future of Life Institute says that none of the major AI labs are adequately prepared for the risks of human-level systems, so at least OpenAI is in good company? I guess?
I’ve got a great deal of respect for the science coming out of OpenAI, and I am not (yet) a member of the end-of-the-world club — but weakness of conviction like this, in the face of a few market pressures, comes right out of a sci-fi backstory.