The Download: sycophantic LLMs, and the AI Hype Index

May 30, 2025 - 13:43

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

This benchmark used Reddit’s AITA to test how much AI models suck up to us

Back in April, OpenAI announced it was rolling back an update to its GPT-4o model that made ChatGPT’s responses to user queries too sycophantic.

An AI model that acts in an overly agreeable and flattering way is more than just annoying. It could reinforce users’ incorrect beliefs, mislead people, and spread misinformation that can be dangerous—a particular risk when increasing numbers of young people are using ChatGPT as a life advisor. And because sycophancy is difficult to detect, it can go unnoticed until a model or update has already been deployed.

A new benchmark called Elephant that measures the sycophantic tendencies of major AI models could help companies avoid these issues in the future. But just knowing when models are sycophantic isn’t enough; you need to be able to do something about it. And that’s trickier. Read the full story.

—Rhiannon Williams

The AI Hype Index

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry. Take a look at this month’s edition of the index here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Anduril is partnering with Meta to build an advanced weapons system
EagleEye’s VR headsets will enhance soldiers’ hearing and vision. (WSJ $)
+ Palmer Luckey wants to turn “warfighters into technomancers.” (TechCrunch)
+ Luckey and Mark Zuckerberg have buried the hatchet, then. (Insider $)
+ Palmer Luckey on the Pentagon’s future of mixed reality. (MIT Technology Review)

2 A new Texas law requires app stores to verify users’ ages
It’s following in the footsteps of Utah, which passed a similar bill in March. (NYT $)
+ Apple has pushed back on the law. (CNN)

3 What happens to DOGE now?
It has lost its leader and a top lieutenant within the space of a week. (WSJ $)
+ Musk’s departure raises questions over how much power it will wield without him. (The Guardian)
+ DOGE’s tech takeover threatens the safety and stability of our critical data. (MIT Technology Review)

4 NASA’s ambitions of a 2027 moon landing are looking less likely
It needs SpaceX’s Starship, which keeps blowing up. (WP $)
+ Is there a viable alternative? (New Scientist $)

5 Students are using AI to generate nude images of each other
It’s a grave and growing problem that no one has a solution for. (404 Media)

6 Google AI Overviews doesn’t know what year it is
A year after its introduction, the feature is still making obvious mistakes. (Wired $)
+ Google’s new AI-powered search isn’t fit to handle even basic queries. (NYT $)
+ The company is pushing AI into everything. Will it pay off? (Vox)
+ Why Google’s AI Overviews gets things wrong. (MIT Technology Review)

7 Hugging Face has created two humanoid robots