AI defeated by humans in a math competition

For the first time, generative AI models developed by Google and OpenAI achieved gold-level scores at the prestigious International Mathematical Olympiad (IMO), but the best human contestants still outscored them.

No AI model managed to earn a perfect score, unlike five young participants in the competition. The annual IMO restricts participants to those under 20 years of age. On Monday, Google announced that its advanced Gemini chatbot solved five of the six math problems set at this month’s IMO held in Queensland, Australia.

“We can confirm that Google DeepMind has reached a long-awaited milestone by scoring 35 out of a possible 42 points, enough for a gold medal,” the U.S. tech giant stated, quoting IMO President Gregor Dolinar. “Their solutions were, in many ways, astonishing. IMO graders found them clear, precise, and easy to follow.”

Around 10% of human competitors earned gold medals, with five achieving a perfect score of 42 points. Meanwhile, OpenAI reported that its experimental reasoning model also scored 35 points—a gold-level performance. “This result tackles a grand challenge long considered a benchmark for AI,” wrote OpenAI researcher Alexander Wei on social media. “We evaluated our models under the same rules as human contestants on the 2025 IMO problems, with three former IMO medalists independently grading their proofs.”

Last year, at the IMO in Bath, UK, Google's AI solved four out of six problems, a silver-level score. This year, its Gemini model completed the problems within four and a half hours, far faster than the two to three days of computation time required previously. According to the IMO, tech companies privately tested closed-source AI models on this year’s problems. The competition featured 641 student participants from 112 countries.

“It’s incredibly exciting to see AI models advancing in mathematical ability,” said IMO President Dolinar. However, he cautioned that organizers could not verify how much computing power the AI models used or the extent of human involvement in their solutions.
