Agents · The Decoder · May 16, 2026

New benchmark shows Claude Mythos and GPT-5.5 can develop real browser exploits autonomously

A new benchmark shows that Claude Mythos and GPT-5.5 can develop real browser exploits autonomously. The result suggests that AI agents can now carry out complex offensive-security tasks without human intervention, raising concerns about security and ethical use.
