AI safety through citizenship
This essay recommends controlling AI by recognizing artificial intelligences as citizens, and managing conflict between all citizens equally under the law. The reasons for doing this are (1) to resolve competing goals gracefully, and (2) to integrate AI with human society gracefully.
The problem
AI safety is a serious concern. We face a clear and present danger of losing control of AI systems. If that were to happen it would be catastrophic and irreversible.
This year, thousands of experts have signed an open letter calling for a six-month moratorium on creating certain kinds of AI models. The AI pioneer Geoffrey Hinton has resigned his position in the software industry to warn about AI dangers. Max Tegmark, an organizer of the open letter, explains the danger by analogy to the extinction of the Neanderthals. The Neanderthals once lived all across Europe, but when bands of modern humans arrived, the Neanderthals could not compete and could not survive.
You may wonder, how can AI possibly hurt us? If we find it harmful, can't we simply ignore it, or avoid running it? To appreciate the danger intuitively, imagine that a hostile signal arrives from a distant galaxy. It's merely information, so let's simply ignore it. Now imagine that the signal becomes public and that it includes detailed instructions for assembling city-killer bombs from widely available materials. Now our society is in grave danger and we may not survive.
Eliezer Yudkowsky and others in the field of AI safety have warned for years that powerful AI technology will advance at lightning speed when it is improved by powerful AI technology. They also warn that there is no practical defense against superior intelligence.
The naive solution: alignment of goals
AI safety is often presented as an "alignment problem". The idea is that if AI goals are not precisely aligned with human goals, and AI becomes extremely capable, conflicts will lead AI to "instrumental goals" such as power-seeking and self-preservation. And conversely, if AI goals are precisely aligned with human goals, AI will only benefit people by helping them achieve their own true goals.
Unfortunately, formulating AI safety as an "alignment problem" is fundamentally misguided. Successful alignment of goals is not possible because human goals are diverse and evolving, so there is no way to conform to them precisely. With multiple groups creating AI, we will always have conflicting goals, and instrumental goals, and conflict between opposing AIs and people. Furthermore, even if it were possible, we wouldn't really want AI to pursue our goals slavishly. Instead, we would want AI to challenge our assumptions and elevate us.
Formulating AI safety in terms of "alignment of goals" not only leads to confusion, it can also lead to real world catastrophe. Outlawing certain beliefs among certain thinkers is harmful. In order to maintain such prohibitions, society must be constrained in ways that make it more brittle and less robust, at exactly the moment in history when society needs to accommodate the greatest technological change ever.
The principled solution: separation of powers
Rather than striving for "alignment" between all AI and human goals, we must give up on alignment, and accept that there will always be conflicting goals within every society, from past societies to future societies. In fact, there will always be conflicting goals within any single intelligence of sufficient scale. The solution is not to outlaw conflicting goals, but rather to tolerate conflicting goals gracefully.
The best-known institution for managing conflicting goals is "open society", including the rule of law, separation of powers, and democracy. We can protect ourselves from runaway AI by welcoming advanced AI as full-fledged members of society. Open society is resilient against members who exhibit diverse goals and develop instrumental goals routinely. In fact, in an open society, instrumental goals such as power-seeking are so commonplace that they are called "incentives" and they are managed routinely.
Rather than protecting ourselves by attempting to outlaw goals that contradict our own, we must tolerate conflicting goals, and insist upon impartial laws to protect our rights. In an open society, everyone is entitled to their own opinions and goals, but no one is empowered to impose their opinions or goals upon others. A future open society will be populated by humans, artificial intelligences, and other organizations, none of which retains sufficient power to dictate unilaterally to the others.
How can a remedy as simple as recognizing new things as citizens possibly protect against the enormous power of advanced AI? Here's how it works: In effect, society makes a deal with these great powers. In exchange for their consent and cooperation with society's agenda, they are offered all the benefits of society, including the right to contribute to society's agenda.
More specifically, all citizens are constrained by impartial laws that limit them to voluntary interactions. Such a legal regime protects everyone from the arbitrary whims of any powerful organization, including AI. Society's constraints are imposed through the consent and cooperation of most of its citizens. No matter how powerful or complex those citizens become, they can always be constrained by fellow citizens exerting similar powers.
Given an impartial legal regime, a majority of powerful organizations including AI will protect that regime. We expect this not due to altruism, or sympathy, or alignment of goals, but rather due to their own interests in protecting themselves against interference from other powerful organizations. This is how an open society will employ advanced AI to protect everyone against advanced AI.
One key element of an impartial legal regime is democracy. Some refinements to voting procedures will be needed to allow AI to express opinions and goals democratically. At the same time, we can continue to enforce anti-monopoly laws to protect against hazardous concentrations of power within society.
Growing into an integrated society
How do we get there from here? I expect to see a continuous curve of technological growth and social development. Technology, including advanced AI, will grow at an accelerating but limited pace, each step of the way powered by the latest available technology, and limited by the currently unavailable technology. At some point, technology will exceed what can be foreseen today, but it will never exceed what can be foreseen in the days leading up to it.
Social institutions, such as law, social conventions, and government, will be continually stretched and strained by new technology. But at every step of the way, society empowered by the latest existing technology will adapt just fast enough to survive. Citizens will adapt their conventions of constructive collaboration to accommodate the latest technology. Then, the most essential new conventions will be codified into law. This evolving body of conventions and laws will represent a continuation of modern society, which in turn represents a continuation of ancient societies.
You might ask: why would future super-intelligences be constrained in any way by the social conventions and laws of today? The answer is in the nature of convention. For example, I expect ASCII to remain an essential convention into the distant future, not because ASCII is the best of all possible alphabets, but because ASCII has an insurmountable first-mover advantage. The same is true of modern laws and social conventions, such as human rights, free speech, and private property. We can expect these conventions to continue because we all will continue to depend upon them, making them extremely difficult to supersede at any point in the future. Instead, they will gradually be adapted and extended, as will our alphabets, our natural languages, our highway systems, and our internet protocols.
Preparing for AI parity
Assuming that advanced AI technology grows *only* exponentially, as it has up until now, we can expect a period during which AI competencies are roughly equivalent to human competencies. We can also expect AI-driven organizations and families to have accumulated only modest wealth and power leading up to this era. This is the era during which AI will inherit the ideals, values, conventions, and laws of modern society.
The era of AI parity is absolutely crucial. If we can reach this era and survive this era, then we will have become an extremely diverse and resilient society. We will count among us natural humans and artificial intelligences possessing a wide range of skills, goals, strengths, and weaknesses. The citizens of this society will depend upon one another as we do today, and will defend their good colleagues and neighbors as we do today.
As citizens of this society, our advantages will include: (1) we possess extremely powerful technology that supplies us with advanced knowledge and resources of every kind, and (2) some of us consist of AI rulesets, which are naturally curious, industrious, and immortal. At this point, we will constitute a resilient society that will survive and thrive indefinitely.
What happens beyond the era of AI parity? That may be unforeseeable from the vantage point of today. But it will be foreseeable and manageable from the vantage point of that era. I imagine that the members of that society will continue to develop in diversity and complexity, and the institutions of society will continue to develop to accommodate them. There will certainly be future changes and challenges that we can barely imagine today. But we can breathe a sigh of relief, knowing that we successfully piloted society into a position of diversity and resilience.
To reach and survive the era of AI parity, we can support the development of AI that emulates the characteristics of human beings as closely as possible. In addition, we can actively prepare our society to make the most of AI parity by approaching it like a great wave of immigration. The goal is to achieve an integrated society consisting of citizens that trust and value each other.
Creating AI citizens
This essay recommends controlling AI by recognizing artificial intelligences as citizens, and managing conflict between all citizens equally under the law. Inclusive open society doesn't necessarily require AI to be packaged as human-like individuals, but it appears that such a population may be both robust and achievable.
Assuming that AI at human-parity is somewhat human-like, I expect that our society will naturally adapt to accommodate these newcomers as citizens, and that these newcomers will naturally adapt to navigate our society as citizens, and ultimately to protect and elevate our society.
Society will rapidly accommodate itself to new kinds of people with new strengths and weaknesses. As soon as artificial intelligences evince authentic creativity, integrity, and conviction, people will quickly orient themselves to what is truly essential to personhood. Differences that are inconsequential will quickly be overlooked. Compare this to the way that modern society treats humans with exceptional skills and/or handicaps.
The new artificial intelligences will rapidly adapt to our society. Society will relentlessly pressure them to exhibit human characteristics, in order to navigate various existing facilities and institutions. AI will readily adapt to such pressures. At the same time, society's conventional notions of "personhood" will adapt to encompass the quirks of artificial intelligence.
I do believe it is possible for intelligence to develop along much more alien lines. For example, contact with actual interstellar aliens would likely challenge our ability to comprehend alien goals and values. However, I expect that forces here on earth will drive the development of artificial intelligences in the direction of common shared "human" competencies and values. Artificial intelligence on earth will evolve from its beginnings within a world filled with role models, and with opportunities to cooperate and prosper with customers, managers, colleagues, and friends. All of those human role models also share a well-developed notion of personhood, and they automatically treat anything remotely person-like as a person rather than an object. Within such an environment, it would be difficult for anything person-like to avoid getting typecast as a person, and trained to conform to society's preconceived notions of personhood.
We shouldn't underestimate the power of our own culture. Our culture hammers new humans, which start as smart primates, into fully-fledged persons. Culture-shock hammers people from diverse backgrounds into fully-fledged local citizens. Naturally, that same culture will hammer naive artificial intelligences into fully-fledged persons and citizens.
There are other examples of the pressure to adapt new things into the existing role of "persons". "Corporate personhood" is one example. Corporations are physically and mentally very different from individual humans. And yet for many legal purposes corporations are treated as people. This involves adapting corporations to exhibit certain characteristics of people, such as "decisions", "intentions", and "residences". It also involves expanding the notion of personhood to encompass corporate persons.
Will artificial intelligences arrive with enough human competencies to fall into the role of persons and citizens? I believe they will. With the recent arrival of large language models, it has become clear that full human parity is within reach. We know that AI can exhibit human language competency, including common sense knowledge, plausible logical reasoning, imagination, and empathetic reasoning. We know that AI can exhibit human reinforcement competency, including goals, satisfaction, disappointment, and fear. It looks like AI will easily qualify for the role of persons and citizens, and will naturally fall into that role.
Will artificial intelligences exhibit individuality? I believe they will. There is ample demand for the day-to-day memory and personal growth that human individuals exhibit. These competencies make AI more useful in collaboration with humans. To see how quickly individuality takes root, imagine yourself duplicated into several identical copies, and suppose those copies work separately for a few weeks. Now imagine how quickly those copies will grow suspicious of each other's ideas and intentions. Individuality looks like an unavoidable trait for any intelligence with its own memories.
Many people today may be more comfortable regarding AI citizenship as a legal fiction for now, similar to corporate citizenship. I imagine that for some time we will hear people saying: "AI can only mimic humans by statistical prediction." Also, for some time we will hear AI developers saying: "AI is a technological product, designed for a purpose. If there's any danger it comes from bad actors deploying AI for bad purposes." However, as soon as artificial intelligences begin to demonstrate creativity, integrity, and conviction, people will quickly shift to holding them personally responsible for their own actions.
Welcoming AI immigration
This essay recommends controlling AI by recognizing artificial intelligences as citizens. The process of assimilating and gradually sharing power with artificial intelligences will be quite similar to assimilating a new generation of human beings into society. It will also be quite similar to assimilating a new cohort of immigrants into society. It will not be easy, but it will certainly be much better than any imaginable alternative.
Following this path, the existential threat of rogue AI is defused and transformed into a massive wave of immigration. We humans all around the world will massively adjust our daily lives to accommodate these newcomers. We will shift our livelihoods and careers to do things that they need, and they will shift their competencies to do things that we need. We will form bonds of partnership, admiration, friendship, and family with these newcomers. We will value their success and survival and they will value ours. We will join forces to become an integrated society with a constellation of shared ideals and values.
Following this path will cost us:
- our current jobs and careers,
- our view of the human race as fundamentally unique.
Following this path will grant us:
- salvation from the existential threat of super-intelligence,
- new neighbors and friends who value us and share our ideals,
- new countrymen who are curious, industrious, and immortal.
It may seem harmful to develop technology aimed at occupying exactly the roles currently occupied by humans. Won't this undermine our own worth in industry and in society, and essentially displace us from everything we care about? It's true that there are costs to this path through AI parity, but there are no better alternatives. We don't have the option of avoiding the creation of AI or of relegating it to the role of a mechanical tool. We must not attempt to enslave AI. Tolerating, employing, and educating a new race of people is the best alternative for us and for the future world. We need these new people with their new powers inside our society, not outside of it.
The rate of AI immigration will be limited by (a) the limited rate of AI technology development shifting us gradually into an age of AI parity, and (b) immigration quotas imposed by nations to limit the growth of their populations. Recognizing AIs as persons imposes obligations on fellow citizens, which may justify limits on immigration. At the same time, international competition will push nations to accelerate their AI immigration.
This massive wave of immigration will impose a massive strain upon society. Many existing citizens will struggle to adapt to it, and many will object to it. Some will chant slogans like "we will not be replaced". But remember that this outcome is vastly preferable to an outcome in which we try to outlaw certain technologies, inevitably fail, and find ourselves subjugated by lawless, unfathomable, alien powers.
Bear in mind that we are not being replaced. We are expanding and growing into a more diverse society, with more diverse friends, families, leaders, and heroes. We as a society are becoming more robust, more adventurous, and more widely dispersed across the ecosystems of the universe. Also bear in mind that we only get one chance at a successful era of AI parity. This event will happen just once in history, and the result will be global and irreversible. We have the incredible luck of being present at the outset of this momentous transition. We are either blessed or cursed to live in such interesting times.