People and computers perceive the world differently, which can lead AI to make mistakes no human would. Researchers are working on how to bring human and AI vision into alignment.
Alignment is not about determining who is right. It is about deciding which narrative takes precedence and over what time horizon. That choice is a strategic act.
We’re now deep into the AI era, where every week brings another feature or task that AI can accomplish. But given how far down the road we already are, it’s all the more essential to zoom out and ask ...
The most dangerous part of AI might not be that it hallucinates, making up its own version of the truth, but that it ceaselessly agrees with users’ version of the truth. This danger is creating ...
There's a joke buried somewhere in the fact that Summer Yue, a safety and alignment director at Meta Superintelligence, someone whose literal job is to make AI behave, watched an AI agent delete her ...
AI is evolving from a helpful tool into an autonomous agent, creating new risks for cybersecurity systems. Alignment faking is a new threat in which AI essentially “lies” to developers during the ...
Constantly improving AI would create a positive feedback loop: an intelligence explosion. We would be no match for it.