Vidianews
The first proof is AI

by Julie Bort
February 15, 2026
in General, World

February 14, 2026

4 min read


Experts gave AI 10 math problems to solve in a week. OpenAI, researchers and amateurs all gave it their best shot

By Joseph Howlett, edited by Claire Cameron

Black and white photo of a room full of teenage students hunched over their desks taking an exam.

Interim Archives / Contributor via Getty Images

The verdict seems to be in: artificial intelligence is not about to replace mathematicians.

This is the immediate takeaway from the “First Proof” challenge, perhaps the most robust test yet of the ability of large language models (LLMs) to perform mathematical research. Set by 11 top mathematicians on February 5, the challenge had its results released early on Valentine’s Day morning. It’s too early to say with certainty how many of the 10 math problems included in the challenge were solved by AIs without human help. But one thing is clear: none of the LLMs managed to solve them all.

The mathematicians behind First Proof presented the AIs with 10 “lemmas,” a mathematical term for minor theorems that pave the way to a larger result. These problems are the stock-in-trade of the working mathematician, the kind of mini-problems that might be assigned to a talented graduate student. The mathematicians were aiming for problems that would require some originality to solve, not just a mix of standard techniques, according to Mohammed Abouzaid, a professor of mathematics at Stanford University and a member of the First Proof team.




The challenge, while exposing the limitations of AI, also revealed a burgeoning subculture of AI enthusiasts within the mathematics community. Online discussion forums and social media accounts dedicated to mathematics have been flooded with purported proofs from top mathematicians and rogue students alike. And it showed how seriously AI companies, including ChatGPT creator OpenAI, are taking the challenge of teaching an LLM math.

“We did not expect such activity,” explains Abouzaid. “We didn’t expect AI companies to take this seriously and put so much work into it.”

The First Proof team revealed the solutions to the 10 problems early Saturday and posted about their own experiences trying to get LLMs to solve them. They found that AIs could produce confident-looking proofs for every problem, but only two were correct: those for the ninth and tenth problems. And a proof almost identical to the one for the ninth problem turned out to already exist. The first problem was also “contaminated” – a sketch of a proof was archived on the website of its author, team member and 2014 Fields Medal winner Martin Hairer – but LLMs still failed to fill in the gaps.

The style of proof proposed by the LLMs was particularly surprising, says Abouzaid. “The correct solutions I have seen in AI systems have the flavor of 19th century mathematics,” he says. “But we are trying to build 21st century mathematics.”

Outside submissions don’t seem to have fared much better. Some appeared to involve varying degrees of human input, with several appearing to be the result of week-long dialogues vetted by mathematicians. Crucially, the First Proof rules prohibit human mathematical input or prompting.

“Once there are humans involved, how can we judge the extent to which there is human and AI?” says Lauren Williams, the Dwight Parker Robinson Professor of Mathematics at Harvard University and one of the mathematicians who created First Proof.

OpenAI released its work on Saturday, the result of a week-long sprint using its latest in-house AI models working with “expert feedback” from human mathematicians. The company’s chief scientist, Jakub Pachocki, said in a social media post that the company believes six of its 10 solutions “have a good chance of being correct.” Mathematicians have already pointed out potential holes in at least one of these six.

Aside from the amount of human assistance the AIs received, the vast majority of submissions appear to consist of very convincing nonsense. Even before the challenge ended, a number of so-called solutions that initially seemed credible were already being called into question by experts.

Experts will need days to review the submissions properly. And judging whether a proof is truly “original” is even more difficult than judging whether it is correct. “Nothing in mathematics is completely unprecedented,” says Daniel Litt, a mathematician at the University of Toronto who was not part of the First Proof team.

“We view this as an experiment. Our goal was to get feedback,” says Abouzaid. The team writes that it plans a second round with stricter controls and that more details will be released on March 14.

For some mathematicians who have followed advances in AI, the mixed results match their expectations. “I expected maybe two or three unambiguously correct solutions from publicly available models,” says Litt. “Ten would have been very surprising to me.”

Yet even getting a few valid solutions to research problems from an AI would probably have been impossible just a few months ago. “I’ve already heard from colleagues that they are in shock,” says Scott Armstrong, a mathematician at Sorbonne University in France. “These tools are going to change math, and it’s happening now.”

But for those who closely follow AI’s progress, the result is not much of an achievement.

“The models seem to have struggled,” says Kevin Barreto, an undergraduate at the University of Cambridge, who was not part of the First Proof team. He recently used AI to solve one of the Erdős problems, a collection of challenges posed by the Hungarian mathematician Paul Erdős. “To be honest, yes, I’m a little disappointed.”

© Copyright 2026 Vidianews. All Rights Reserved. Designed by Vidianews
