Lack of Real AI Alignment Incentives
The Misaligned Priorities Driving AI Development and Their Resemblance to the Manhattan Project
When it comes to building safe Artificial General Intelligence (AGI), you'd think everyone would be on the same page. After all, who doesn't want AI that won't accidentally (or intentionally) cause harm? But the reality is different, as it often is with essential things.
The potential for AGI to provide strategic advantages can lead countries and organizations to prioritize rapid development over safety considerations. This creates a prisoner's dilemma-like situation [1] where cooperation on safety might be seen as a competitive disadvantage. The pressure to be first to market can overshadow long-term safety considerations.
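To make that incentive structure concrete, here is a minimal sketch in Python. The payoff numbers are entirely hypothetical, chosen only to reproduce the classic prisoner's dilemma shape: whichever move a rival lab makes, racing ahead yields the better individual outcome, even though both labs would be better off cooperating on safety.

```python
# A toy two-player model of the AI racing dilemma.
# All payoff values are hypothetical; they exist only to
# illustrate the structure of the incentive problem.

PAYOFFS = {
    # (our move, rival's move): (our payoff, rival's payoff)
    ("cooperate", "cooperate"): (3, 3),  # shared, safer progress
    ("cooperate", "race"):      (0, 5),  # we fall behind
    ("race",      "cooperate"): (5, 0),  # we capture the lead
    ("race",      "race"):      (1, 1),  # risky race to the bottom
}

def best_response(rival_move: str) -> str:
    """Return the move that maximizes our own payoff, given the rival's move."""
    return max(("cooperate", "race"),
               key=lambda move: PAYOFFS[(move, rival_move)][0])

for rival_move in ("cooperate", "race"):
    print(f"If the rival plays '{rival_move}', our best response is "
          f"'{best_response(rival_move)}'.")
# Both lines print 'race': defecting dominates, yet mutual racing
# (1, 1) leaves both players worse off than mutual cooperation (3, 3).
```

That dominance is the whole problem: no single actor can escape it by unilaterally choosing safety, which is why the pressure to be first so easily overrides long-term caution.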
We live in a world where money means a lot. Money-driven mentalities have allowed us to progress in many cases, but they have also cost us dearly in others.
Let's suppose you are the head of, or a decision-maker at, an AI organization facing enormous competition. You have two options: move fast or die. If you move fast and it causes harm, you can blame the government, arguing that no written regulation said what you were doing was wrong. That position offers plausible deniability; at the very least, it creates stalemate conditions that let a company drag the case through the courts for years. That is a significant concern.
Governments can also try to stifle progress with extreme regulations, like those in Europe. However, this incentivizes companies to relocate to more regulation-friendly jurisdictions. So how do we balance giving organizations enough freedom that they choose to stay with enough regulation that they act in the government's interest? Again, sadly, these incentives can be misaligned with the goal of safe AGI.
What if an organization, government, or other entity reached AGI first? Would it have to publish those results? Or would it keep them from the public eye and use them to its advantage? We have already seen cases like Cambridge Analytica [2], where data was misused to manipulate voters and, in turn, election results. An entity with AGI could use bots to flood the Internet with disinformation designed to sway people's decisions, public opinion, and perceptions. The Internet is already filled with bots, and telling whether someone online is real while preserving user privacy is already a significant challenge. AGI would make these challenges worse.
Another challenge is that different groups have different ideas about what "safe" means. It's like the old story of the blind men and the elephant [3]. Everyone touches a different part and comes away with a different conclusion.
For governments, "safe" might mean "gives us an edge over other countries." For big tech companies, it often means "won't cause a PR disaster." And for AI researchers, it could be "won't destroy the world." These aren't necessarily contradictory, but they're not necessarily aligned either.
Should we give organizations free rein? Should we give governments free rein? Current governments lack the knowledge to craft regulations for such a rapidly advancing industry. And how can we hold anyone accountable when we haven't clearly defined accountability?
The challenge of aligning goals for safe AGI bears a striking resemblance to the development of the atomic bomb during World War II. The Manhattan Project offers a sobering example of how technological advancement can outpace our ability to manage its consequences.
J. Robert Oppenheimer, the project's scientific director, famously quoted the Bhagavad Gita after the first nuclear test:
"Now I am become Death, the destroyer of worlds."4
Einstein, too, grappled with the implications of his work. His letter to President Roosevelt in 1939 warned of the potential for an atomic bomb and set the project in motion. Later, he expressed regret, saying,
"Had I known that the Germans would not succeed in developing an atomic bomb, I would have done nothing."5
These brilliant minds understood something crucial: once certain technologies are developed, they can't be un-invented. This is why alignment is so critical. Our decisions about AGI development will have ripple effects that extend far beyond our lifetimes.
The lesson from the atomic age isn't that we shouldn't pursue powerful new technologies; it's that we must do so with open eyes, fully aware of the potential consequences. We need to build safeguards in from the start, not as an afterthought.
The stakes are too high to ignore. The future of humanity might very well depend on our ability to agree on this. And that's a goal worth aligning on.
[1] ‘What Is the Prisoner’s Dilemma and How Does It Work?’ Investopedia, https://www.investopedia.com/terms/p/prisoners-dilemma.asp
[2] Confessore, Nicholas. ‘Cambridge Analytica and Facebook: The Scandal and the Fallout So Far’. The New York Times, 4 Apr. 2018, https://www.nytimes.com/2018/04/04/us/politics/cambridge-analytica-scandal-fallout.html
[3] ‘Blind Men and an Elephant’. Wikipedia, 22 June 2024, https://en.wikipedia.org/w/index.php?title=Blind_men_and_an_elephant&oldid=1230408980
[4] Temperton, James. ‘“Now I Am Become Death, the Destroyer of Worlds.” The Story of Oppenheimer’s Infamous Quote’. Wired, https://www.wired.com/story/manhattan-project-robert-oppenheimer/
[5] ‘Albert Einstein, Leó Szilárd and the Letter That Led to the Manhattan Project’. Bulletin of the Atomic Scientists, https://thebulletin.org/virtual-tour/albert-einstein-leo-szilard-and-the-letter-that-led-to-the-manhattan-project/