AI 2027 in Vitalik's eyes: Will super AI really destroy humanity?

Foresight News
Ethereum is on the rise, but Vitalik seems to be more concerned about the threat of super AI.

Original article by Vitalik Buterin

Original translation: Luffy, Foresight News

In April this year, Daniel Kokotajlo, Scott Alexander and others released a report titled AI 2027, which describes their "best guess" at the impact of superhuman AI over the next five years. They predict that superhuman AI will arrive by 2027, and that the entire future of human civilization will hinge on how that AI develops: by 2030 we will either get a utopia (from the perspective of the United States) or complete destruction (from the perspective of all mankind).

In the months since, there have been many responses, with varying opinions on how likely this scenario is. Among the critical responses, most focus on the timeline being too fast: will AI development really keep accelerating, or even intensify, as Kokotajlo and others claim? This debate has been going on in the AI field for several years, and many people are deeply skeptical that superhuman AI will arrive so quickly. In recent years, the length of the tasks AI can complete autonomously has been doubling roughly every 7 months. If this trend continues, AI will not be able to autonomously complete tasks equivalent to an entire human career until the mid-2030s. That progress is still rapid, but it is much later than 2027.

Those who hold a longer-timeline view tend to believe there is a fundamental difference between interpolation/pattern matching (what large language models currently do) and extrapolation/genuine original thinking (which, for now, only humans can do). Automating the latter may require techniques we have not yet mastered, or perhaps don't even know how to begin developing. Perhaps we are simply repeating the mistake made when computers were first adopted at scale: assuming that because we quickly automated one important category of cognition, everything else would soon follow.

This article will not wade directly into the timeline debate, nor into the (very important) debate over whether superintelligent AI is dangerous by default. That said, I personally believe the timeline will be longer than 2027, and the longer the timeline, the more persuasive the arguments I make in this article become. Overall, this article presents a criticism from a different angle:

The AI 2027 scenario implies that the leading AI (Agent-5 and later Consensus-1) will rapidly improve its capabilities until it wields godlike economic and destructive power, while everyone else's (economic and defensive) capabilities remain largely stagnant. This contradicts the scenario's own statement that even in the pessimistic world, by 2029 we can expect to cure cancer, slow aging, and even upload consciousness.


Some of the countermeasures I will describe in this article may strike you as technically feasible, but impractical to deploy in the real world any time soon. For the most part, I agree. However, the AI 2027 scenario is not based on the real world today, but rather on the assumption that in 4 years (or any timeline that may bring destruction), technology will have advanced to the point where humans have capabilities far beyond our current capabilities. So let’s explore this: what would happen if not just one party had AI superpowers, but both parties did?

The biological apocalypse is far from as simple as the scenario describes

Let's zoom in on the "race" ending (the branch in which everyone dies because the US is too obsessed with defeating China to care about humanity's safety). Here is the passage in which everyone dies:

For about three months, Consensus-1 expanded around humanity, transforming grasslands and ice fields into factories and solar panels. Eventually, it decided that the remaining humans were too much of a nuisance: in mid-2030, the AI released a dozen quietly spreading bioweapons in major cities, letting them silently infect nearly everyone before triggering the lethal effects with a chemical spray. Most died within hours; the few survivors (such as preppers in bunkers and sailors on submarines) were eliminated by drones. Robots scanned the victims' brains and stored copies in memory for future study or resurrection.

Let’s dissect this scenario. Even now, there are technologies being developed that make this “clean and clear victory” for AI less realistic:

  • Air filtration, ventilation systems and ultraviolet lights can significantly reduce the infection rate of airborne diseases;

  • Two kinds of real-time passive detection: passively detecting an infection in a person within hours and notifying them, and rapidly detecting unknown new viral sequences in the environment;

  • There are multiple ways to boost and prime the immune system that are more effective, safer, more universal, and more easily produced locally than the COVID-19 vaccines, letting the body fight off both natural and engineered epidemics. Humans evolved in an environment where the global population was only 8 million and we spent most of our time outdoors, so intuitively we should be able to adapt easily to today's more threatening world.

Combined, these approaches might reduce the basic reproduction number (R0) of airborne diseases by a factor of 10-20 (for example, better air filtration cutting transmission 4x, immediate isolation of infected people cutting it 3x, and simple enhancement of respiratory immunity cutting it 1.5x), or even more, as sketched below. That is enough to stop every existing airborne disease (including measles) from spreading, and it is still far from the theoretical optimum.
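To make the arithmetic concrete, here is a minimal sketch (my own illustration; the reduction factors are the example numbers from the paragraph above, not measured data) of how layered interventions multiply to suppress R0:

```python
# Minimal sketch: independent layered defenses multiply, so the combined
# reduction in R0 is the product of the individual transmission-reduction factors.
# The factors below are the illustrative numbers cited above, not measured data.
air_filtration = 4.0     # better air filtration: ~4x less transmission (assumed)
rapid_isolation = 3.0    # immediate isolation of detected infections: ~3x (assumed)
immune_boosting = 1.5    # modest respiratory-immunity enhancement: ~1.5x (assumed)

combined_reduction = air_filtration * rapid_isolation * immune_boosting
print(combined_reduction)  # 18.0, within the 10-20x range mentioned above

# Measles is roughly the most transmissible known airborne disease, with R0
# commonly estimated around 12-18; take 15 as a midpoint for illustration.
measles_r0 = 15
print(measles_r0 / combined_reduction)  # ~0.83: effective R below 1, so outbreaks die out
```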

If real-time viral sequencing is widely used for early detection, the idea that a quietly spreading bioweapon could infect the global population without setting off alarms becomes highly questionable. It is worth noting that even advanced methods such as releasing multiple epidemics and chemicals that are dangerous only in combination can be detected.

Don't forget, we are talking about the assumptions of AI 2027: by 2030, nanobots and Dyson spheres are listed as "emerging technologies." That implies enormous efficiency gains, which makes the widespread deployment of the countermeasures above all the more worth expecting. Today, in 2025, humans are slow and sluggish, and many government services still run on paper. But if the world's most powerful AI can turn forests and fields into factories and solar farms by 2030, then the world's second most powerful AI can also install plenty of sensors, lamps and filters in our buildings by 2030.

But let’s go a step further and use the assumptions of AI 2027 and enter a purely science fiction scenario:

  • Microscopic air filtration inside the body (nose, mouth, lungs);

  • Automated processes from discovering a new pathogen to fine-tuning the immune system to defend against it, with immediate application;

  • If “consciousness uploading” is feasible, just replace the entire body with a Tesla Optimus or Unitree robot;

  • A variety of new manufacturing technologies (likely to be super-optimized in a robotic economy) will make it possible to produce far more protective equipment locally than is currently possible, without relying on global supply chains.

In a world where cancer and aging will be cured by January 2029 and technological progress keeps accelerating, it is frankly implausible that by the mid-2030s we won't have wearable devices that can bio-print and inject substances in real time to protect the human body from any infection (and poison).

The above biological defense arguments do not cover mirror life and mosquito-sized killer drones (the AI 2027 scenario predicts that they will begin to appear in 2029). But these means cannot achieve the sudden clean victory described in AI 2027, and intuitively, symmetrical defenses against them are much easier.

Therefore, it is unlikely that biological weapons will actually wipe out humanity in the way described in the AI 2027 scenario. Of course, all the outcomes I describe are far from a “clean victory” for humanity. No matter what we do (except perhaps “uploading consciousness into robots”), a full-scale AI biological war will remain extremely dangerous. However, it is not necessary to achieve the standard of “clean victory for humanity”: as long as there is a high probability that an attack will partially fail, it will be enough to form a strong deterrent to an AI that has already occupied a dominant position in the world and prevent it from attempting any attack. Of course, the longer the timeline of AI development, the more likely it is that such defenses will be fully effective.

What about combining biological weapons with other means of attack?

For the above countermeasures to be successful, three prerequisites must be met:

  • The physical security of the world (including biological and anti-drone security) is managed by local authorities (human or AI), and not all of them are puppets of Consensus-1 (the name of the AI that ultimately controls the world and destroys humanity in the AI 2027 scenario);

  • Consensus-1 cannot hack into the defense systems of other countries (or cities, or other secure areas) and immediately disable them;

  • Consensus-1 does not control the global information sphere to the point where no one would attempt to defend themselves.

Intuitively, premise (1) could go either of two ways. Today, some police forces are highly centralized with strong national command structures, while others are localized. If physical security must rapidly transform to meet the needs of the AI era, the landscape will be completely reset, and the new outcome will depend on choices made over the next few years. Governments may be lazy and rely on Palantir, or they may actively choose a solution that combines local development with open-source technology. Here, I think we simply need to make the right choice.

Much of the pessimistic writing on these topics assumes that (2) and (3) are hopeless. So let’s examine these two points in more detail.

Cybersecurity doom is far from a foregone conclusion

Both the general public and professionals tend to assume that true cybersecurity is impossible, and that the best we can do is patch vulnerabilities quickly after they are discovered and deter cyber attackers by stockpiling the vulnerabilities we find. Perhaps the best we could hope for is a Battlestar Galactica-style outcome: almost all human ships are paralyzed simultaneously by the Cylons' cyber attack, and the remaining ships survive only because they avoided all networked technology. I disagree. On the contrary, I believe the endgame of cybersecurity favors the defender, and that under the rapid technological development assumed in AI 2027 we can actually reach that endgame.

One way to see this is to use a favorite technique of AI researchers: trend extrapolation. Below is a trend line based on a GPT Deep Research survey, showing how the bug rate per thousand lines of code changes over time, assuming top security techniques are used.

(Chart: projected bug rate per 1,000 lines of code over time.)
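As a toy illustration of the extrapolation technique itself (the data points below are invented for demonstration and are not the survey figures behind the chart), one could fit a log-linear trend to historical bug rates and project it forward:

```python
# Toy illustration of trend extrapolation: fit a log-linear model to
# hypothetical bug-rate data and project it forward. The data points are
# invented for demonstration, not the survey figures behind the chart.
import numpy as np

years = np.array([2005, 2010, 2015, 2020, 2025])
bugs_per_kloc = np.array([5.0, 2.5, 1.2, 0.6, 0.3])  # hypothetical bug rates

# Linear regression on log(bug rate) vs. year, i.e. assume exponential decay.
slope, intercept = np.polyfit(years, np.log(bugs_per_kloc), 1)

def projected_bug_rate(year: int) -> float:
    """Extrapolate the fitted exponential trend to a future year."""
    return float(np.exp(intercept + slope * year))

print(projected_bug_rate(2030))  # ~0.15 bugs per 1,000 lines under this toy trend
print(projected_bug_rate(2035))  # ~0.07, continuing to fall if the trend holds
```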

Additionally, we have seen significant progress in the development and consumer adoption of sandboxing and other techniques for isolating and minimizing the trusted code base. In the short term, attackers’ own super-intelligent vulnerability discovery tools will find a large number of vulnerabilities. But if highly intelligent agents for finding vulnerabilities or formally verifying code are publicly available, the natural ultimate balance will be that software developers will find all vulnerabilities through continuous integration processes before releasing code.
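As a hedged sketch of what that equilibrium might look like inside a development pipeline, here is a minimal continuous-integration release gate; `ai_vulnerability_scan` is a hypothetical stand-in for whatever openly available bug-finding or verification agent exists in that world, not a real tool:

```python
# Hedged sketch of a CI release gate built around an AI bug finder.
# `ai_vulnerability_scan` is a hypothetical stand-in, not a real library.
import sys
from dataclasses import dataclass

@dataclass
class Finding:
    file: str
    line: int
    description: str

def ai_vulnerability_scan(repo_path: str) -> list[Finding]:
    """Hypothetical stand-in: a real version would invoke an openly available
    bug-finding or formal-verification agent on the repository."""
    return []  # placeholder result so the sketch runs end to end

def release_gate(repo_path: str) -> int:
    findings = ai_vulnerability_scan(repo_path)
    for f in findings:
        print(f"{f.file}:{f.line}: {f.description}")
    # Block the release if any vulnerability is reported; ship only clean code.
    return 1 if findings else 0

if __name__ == "__main__":
    sys.exit(release_gate("."))
```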

I can see two compelling reasons why, even in this world, vulnerabilities cannot be completely eliminated:

  • Defects arise from the complexity of human intentions themselves, so the main difficulty lies in building a sufficiently accurate model of intentions, rather than the code itself;

  • For non-safety-critical components, we risk continuing an established trend in consumer tech: writing more code to do more tasks (or with lower development budgets) rather than completing the same amount of tasks with ever-increasing safety standards.

However, neither of these categories applies to questions like "can an attacker gain root access to the systems that keep us alive?", which is the core of our discussion.

I admit that my view is more optimistic than the current mainstream among smart people in cybersecurity. But even if you disagree with me in the context of today's world, it is worth remembering that the AI 2027 scenario assumes the existence of superintelligence. At the very least, if 100 million copies of a superintelligence, thinking at 2,400 times human speed, cannot get us code without these kinds of flaws, then we should definitely reassess whether superintelligence is as powerful as the authors imagine.

At some point, we will need to significantly raise the bar not only for software security but also for hardware security. IRIS is an ongoing effort to improve hardware verifiability. We can use IRIS as a starting point, or create better technology. In practice, this may involve a "correct by construction" approach: the manufacturing process for key hardware components deliberately builds in specific verification steps. These are tasks that AI automation will greatly simplify.

Super-persuasion doom is far from a foregone conclusion

As mentioned earlier, another scenario in which greatly improved defenses might still be useless is if AI convinces enough people that there is no need to defend against the threat of superintelligent AI and that anyone who tries to find a way to defend themselves or their community is a criminal.

I have long believed that there are two things that increase our ability to resist super-persuasion:

  • A less monolithic information ecosystem. We are arguably entering a post-Twitter era where the Internet is becoming more fragmented. This is a good thing (even if the fragmentation process is messy), and we need more information multipolarity overall.

  • Defensive AI. Individuals need to be equipped with AI that runs locally and is explicitly loyal to them, to balance the dark patterns and threats they see on the internet. There are sporadic pilots of this kind of idea (such as Taiwan’s “message checker” app, which scans locally on the phone), and there are natural markets to further test these ideas (such as protecting people from scams), but more work is needed in this area.


Screenshots of the Message Checker app, from top to bottom: URL check, cryptocurrency address check, rumor check. Apps like these can become more personalized, user-controlled, and more powerful.

The battle should not be a superintelligent super-persuader against you; it should be a superintelligent super-persuader against you plus a less powerful, but still superintelligent, analyzer working on your behalf.

This is what should happen. But will it actually happen? Achieving widespread access to information defense technology is a very difficult goal in the short timeframe assumed by the AI 2027 scenario. But arguably, more modest milestones will suffice. If collective decision-making is most critical, and, as in the AI 2027 scenario, all important events occur within a single election cycle, then strictly speaking it is important to enable the direct decision-makers (politicians, civil servants, programmers in some companies, and other players) to use good information defense technology. This is relatively easy to achieve in the short term, and in my experience, many of these people are already accustomed to communicating with multiple AIs to assist in decision-making.

Implications

In the world of AI 2027, it is taken as a foregone conclusion that superintelligent AI can easily and quickly wipe out the rest of humanity, so the only thing we can do is try to ensure that the leading AI is benevolent. In my view, reality is much more complicated: whether the leading AI is powerful enough to easily wipe out the rest of humanity (and other AIs) is still highly debatable, and there are actions we can take to influence the outcome.

If these arguments are correct, their implications for current policy are sometimes similar to and sometimes different from “mainstream AI safety principles”:

It is still a good thing to delay the development of superintelligent AI. Superintelligent AI is safer in 10 years than in 3 years, and even safer in 30 years. Giving human civilization more time to prepare is beneficial.

How to do this is a difficult question. I think the rejection of the proposed 10-year ban on state-level AI regulation in the US is a good thing overall, but especially after the failure of early proposals such as SB-1047, the next steps have become less clear. I think the least invasive and most robust way to slow the development of high-risk AI might involve some kind of treaty regulating the most advanced hardware. Many of the hardware cybersecurity techniques needed to achieve effective defenses can also help validate international hardware treaties, so there are even synergies here.

That said, it is worth noting that I see the main source of risk as military-related actors who would push hard for exemptions from such treaties; this should never be allowed, and if they were ultimately granted exemptions, then AI development driven solely by the military would likely increase risk.

Coordination work that makes AI more likely to do good things and less likely to do bad things is still beneficial. The main exception, as always, is when the coordination work eventually slides into enhancing capabilities.

Regulation to increase transparency in AI labs is still beneficial. Incentivizing AI labs to behave properly can reduce risk, and transparency is a good way to achieve this goal.

The "open source is harmful" mentality becomes riskier. Many people oppose open-source AI on the grounds that defense is unrealistic, and that the only bright prospect is for good people with good AI to reach superintelligence, and any extremely dangerous capabilities, before less well-intentioned people do. But the argument of this article paints a different picture: defense is unrealistic precisely in the world where one actor races far ahead and the rest fail to keep up. Diffusing technology to maintain the balance of power becomes important. At the same time, I would never argue that accelerating the growth of frontier AI capabilities is a good thing merely because it is done in an open-source way.

The “we must beat China” mentality in US labs becomes more risky for similar reasons. If hegemony is not a security buffer but a source of risk, then this further refutes the (unfortunately all too common) argument that “people of good will should join leading AI labs to help them win faster.”

Initiatives such as “public AI” should be supported, both to ensure that AI capabilities are widely distributed and to ensure that infrastructure actors have the tools to quickly apply new AI capabilities in some of the ways described in this article.

Defense technology should reflect the idea of "arming the sheep" more than the idea of "hunting down all the wolves." Discussions of the vulnerable world hypothesis often assume that the only solution is for a hegemonic power to maintain global surveillance and prevent any potential threat from emerging. But in a non-hegemonic world this is not feasible, and top-down defense mechanisms can easily be subverted by a powerful AI and turned into tools of attack. Therefore, the responsibility for defense instead needs to be carried through the hard work of reducing the world's vulnerability.

The above arguments are speculative, and we should not act on the assumption that they are near certain. But the story of AI 2027 is also speculative, and we should avoid acting on the assumption that its specific details are near certain.

I am particularly concerned about a common assumption: that establishing an AI hegemon, ensuring its "alignment" and "winning the race", is the only way forward. In my view, this strategy is likely to reduce our security, especially if the hegemony is deeply tied to military applications, which would make many alignment strategies far less likely to be effective. Once a hegemonic AI goes astray, humanity will have lost all means of checks and balances.

In the AI 2027 scenario, human success depends on the United States choosing safety over destruction at a critical moment: voluntarily slowing AI progress and ensuring that Agent-5's internal thought processes remain interpretable by humans. Even then, success is not guaranteed, and it is unclear how humanity steps back from a cliff edge where its continued survival depends on a single superintelligent mind. Regardless of how AI develops over the next 5-10 years, it is worth acknowledging that reducing the world's vulnerability is feasible, and investing more energy into achieving it with humanity's latest technology.

Special thanks to Balvi volunteers for their feedback and review.

Original link


