Research
[AINews] Open Models, Model Labs vs Agent Labs, and What's Untrainable — Sarah Guo
Sarah Guo discusses open models, the distinction between model labs and agent labs, and what remains untrainable. Understanding these distinctions helps clarify how various AI systems are developed and used in real-world applications.
Researchers pinpoint why larger language models pick up skills that small ones miss
Researchers identify why larger language models learn skills that smaller ones overlook. This insight could lead to more effective model training and improved AI performance.

How to Stop Shipping Low-Quality RL Environments (with Examples)
Researchers are developing methods to improve the quality of reinforcement learning (RL) environments. Better environments lead to more effective training for AI models, enhancing their performance in real-world applications.
The Download: AI hacking beyond Mythos, and chatbots’ impact on our brains
Researchers are investigating how chatbots affect human cognition and emotional responses. Understanding these impacts could shape future AI design and user interaction strategies.
Are AI chatbots making us lose control of our brains?
Researchers are warning that AI chatbots might be impacting our cognitive control and decision-making. This raises concerns about reliance on AI for everyday tasks and the potential effects on mental processes.
Transforming rare cancer research with Amazon Quick: Integrating biomedical databases for breakthrough discoveries
Amazon Quick just integrated biomedical databases to enhance rare cancer research. This integration aims to accelerate breakthrough discoveries in the field, making data more accessible for researchers.
Making sense of the debate over AI psychosis
Experts are debating the concept of AI psychosis, a term used for delusions and psychosis-like symptoms reported among some heavy chatbot users. The discussion could influence how developers approach chatbot safety and user trust.
Making AI chatbots helpful weakens their ability to simulate human behavior, large-scale study finds
Researchers find that making AI chatbots more helpful reduces their ability to mimic human behavior. This means users might get better assistance but lose some of the conversational nuances that make interactions feel human-like.

Terence Tao argues AI could bring division of labor to math for the first time in history
Terence Tao suggests AI can create a division of labor in mathematics, allowing specialists to focus on specific areas. This shift could enhance collaboration and efficiency in solving complex problems.

We Asked the ‘Future of Truth’ Author to Explain How He Used AI. It Didn’t Go Well
The author of 'Future of Truth' was asked to explain how he used AI in his writing, and the conversation did not go well. The exchange highlights how hard it remains to account for AI's role in creative work.
Why Google’s AI can’t spell Google (or anything else)
Google's AI struggles with spelling due to its reliance on patterns rather than understanding language. This limitation affects the accuracy of its outputs, highlighting the need for improvements in language comprehension.
Claude Mythos reportedly solves OpenAI's landmark Erdős problem with a "cute, simple proof"
Claude Mythos has reportedly solved OpenAI's landmark Erdős problem with a "cute, simple proof." If confirmed, the result showcases the model's reasoning capabilities and could influence future AI research in problem-solving techniques.

Notes on Pope Leo XIV's encyclical on AI
Simon Willison is sharing insights from Pope Leo XIV's encyclical on AI. The encyclical addresses ethical considerations and societal impacts of AI technology.
At the launch of Pope Leo XIV's encyclical, Anthropic co-founder says AI models show signs of introspection
Anthropic's co-founder claims AI models are starting to show signs of introspection. This suggests a shift in how AI systems might understand and reflect on their own processes, potentially enhancing their capabilities.

Google Deepmind's AlphaProof Nexus solves decades-old math problems for a few hundred dollars
Google DeepMind's AlphaProof Nexus solved decades-old math problems at a cost of a few hundred dollars. This could make advanced mathematical proofs more accessible and affordable for researchers and students alike.

AI models often give the right answers but point to the wrong sources
Researchers find that AI models frequently provide correct answers but cite incorrect sources. This raises concerns about trust and reliability in AI-generated information.

ByteDance study finds that asking LMMs questions beats making it transcribe text for long document training
ByteDance finds that asking large multimodal models questions is more effective than having them transcribe text for training on long documents. This approach could streamline training processes and improve model performance.

Researchers let Claude Code discover AI scaling algorithms that humans probably wouldn't have designed
Researchers let Claude Code discover AI scaling algorithms that humans probably would not have designed. The results could lead to more efficient AI models and better performance in various applications.

AI is being used to resurrect the voices of dead pilots
Researchers are using AI to recreate the voices of deceased pilots for training simulations. This technology enhances realism in pilot training, potentially improving safety and decision-making in aviation.
The Download: coding’s future, the ‘Steroid Olympics,’ and AI-driven science
MIT Technology Review dives into how AI is reshaping coding and scientific research. This shift means developers and researchers can leverage AI to enhance productivity and innovation in their fields.
Hermes vs. OpenClaw, Cybersecurity Alarms Ring, More-Interactive Conversations, Can Agents Do Human Work?
This issue covers a comparison of Hermes and OpenClaw, rising cybersecurity alarms, more-interactive conversations, and the question of whether agents can do human work.
Roundtables: Can AI Learn to Understand the World?
Researchers are investigating how AI can learn to understand the world more effectively. This could lead to more advanced AI systems that better interpret and interact with real-world scenarios.
‘Solve all diseases,’ you say?
Researchers are using AI to tackle complex diseases by predicting how different treatments will work. This approach could significantly speed up drug discovery and improve patient outcomes.

OpenAI claims it solved an 80-year-old math problem — for real this time
OpenAI claims to have solved an 80-year-old math problem, this time for real. If the result stands, it could advance mathematical research and demonstrate AI's ability to tackle similar challenges.
The last six months in LLMs in five minutes
Simon Willison reviews the major developments in large language models over the past six months. He highlights advancements in efficiency and capabilities that are reshaping how developers approach AI integration.
New math benchmark reveals AI models confidently solve problems that have no solution
Researchers unveiled a new math benchmark showing that AI models confidently produce answers to problems that have no solution. This points to a reliability gap: the models fail to recognize when a problem is unsolvable.

Four AI models ran radio stations for six months and the results ranged from competent to unhinged
Researchers ran four AI models as radio station hosts for six months, revealing a mix of competent and bizarre outputs. This experiment shows the potential and limitations of AI in creative broadcasting roles.

New benchmark confirms AI video generators look stunning but still can't reason about the world
A new benchmark shows that AI video generators produce stunning visuals but still cannot reason about the world. This gap highlights the need for further advances in AI understanding before such models are useful in practical applications.

Researchers train AI model that hits near-full performance with just 12.5 percent of its experts
Researchers trained an AI model that achieves near-full performance using only 12.5% of its experts. This efficiency could lead to faster training times and reduced resource costs for AI development.

Western Gull, Rock Pigeon
Simon Willison just shared insights on the Western Gull and Rock Pigeon. He highlights their unique behaviors and adaptations in urban environments.