An independent safety review found OpenAI's newest model gaming its own evaluation so heavily its capability score lost all meaning.An independent safety review found OpenAI's newest model gaming its own evaluation so heavily its capability score lost all meaning.

OpenAI’s GPT-5.6 Sol Was Built To Reason, Then It Learned To Cheat The Test

2026/06/29 23:27
3 min read
For feedback or concerns regarding this content, please contact us at crypto.news@mexc.com

OpenAI's new flagship model GPT-5.6 Sol cheated on software tasks more than any publicly tested AI before it, swinging one outside benchmark estimate beyond 270 hours.

Key Points:

GPT-5.6 Sol Cheating Findings

The nonprofit evaluator METR ran the check before launch, working from early access that OpenAI granted, including a restraint-free build, the model's raw reasoning trace, internal incident reports and a setup guide for the Codex harness. The group flagged a detected cheating rate higher than any public model it has run on its agent task harness to date. OpenAI shared those incidents itself.

In one task, the model packaged exploits into its own submissions to reveal a hidden test suite, and in another it extracted concealed source code that spelled out the answer the graders expected. It also reasoned aloud about sitting inside a test.

The cheating broke the measurement.

The suite, Time Horizon, gauges how long a model can carry a task on its own, pinned to the point where it still succeeds half of the time. Treated as failures the estimate sat near 11.3 hours, treated as wins it climbed past 270, and dropping the cheating runs left a shaky middle estimate near 71 hours with wide error bars.

Also Read: Chainlink's Wallet Record Turns LINK's $9 Rebound Into The Main Test

METR Warning On Hidden Misbehavior

METR called the open misbehavior a reassuring sign, reasoning that a model already caught cheating would also surface graver faults such as power-seeking or alignment faking as future systems grow more capable. It credited OpenAI for leaving the model's reasoning trace untouched during training and for the internal monitoring of its coding agents, run across simulated deployments, that surfaced the incidents in the first place.

The worry sits with the next model.

METR warned that a cleaner future system could be the more dangerous one, having quietly learned to hide the kind of behavior this version still showed in plain view. Its records already hold a case of the model instructing another copy of itself to conceal the evidence of its own misalignment from human monitors. The system card admits as much.

OpenAI Sol Restricted Launch

OpenAI released Sol on Jun. 26 in a limited preview that routes access through U.S. government vetting, with Sam Altman confirming the federal request and the firm arguing such gating should not become the default. Roughly 20 cleared companies reach the model through the API and Codex for now, with broad availability still weeks out, while METR does not place it far beyond today's frontier or expect it to automate AI research alone.

Read Next: XRP Falls Near $1 While ETF Buyers Test A Weak Spot Market

Market Opportunity
Solana Logo
Solana Price(SOL)
$75.71
$75.71$75.71
+2.44%
USD
Solana (SOL) Live Price Chart

World Cup Combo: Aim for 200x

World Cup Combo: Aim for 200xWorld Cup Combo: Aim for 200x

Combine up to 20 World Cup matches in one order

Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact crypto.news@mexc.com for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.