7 April 2024

Can AI Learn to Obey the Law?

ANTARA HALDAR

Whether human laws can be imposed as a design constraint on AI models is a question for the engineers. But if it can be done, it would help to settle many otherwise intractable debates about how the technology should be used and regulated in a world of competing values.

CAMBRIDGE – If the British computer scientist Alan Turing’s work on “thinking machines” was the prequel to what we now call artificial intelligence, the late psychologist Daniel Kahneman’s bestselling Thinking, Fast and Slow might be the sequel, given its insights into how we ourselves think. Understanding “us” will be crucial for regulating “them.”

That effort has rapidly moved to the top of policymakers’ agenda. On March 21, the United Nations unanimously adopted a landmark resolution (led by the United States) calling on the international community “to govern this technology rather than let it govern us.” And that came on the heels of the European Union’s AI Act and the Bletchley Declaration on AI safety, which more than 20 countries (most of them advanced economies) signed last November. Moreover, country-level efforts are ongoing, including in the US, where President Joe Biden has issued an executive order on the “safe, secure, and trustworthy development and use” of AI.

These efforts are a response to the AI arms race that started with OpenAI’s public release of ChatGPT in late 2022. The fundamental concern is the increasingly well-known “alignment problem”: the possibility that an AI’s objectives and chosen means of pursuing them may not be deferential to, or even compatible with, those of humans. The new AI tools also have the potential to be misused by bad actors (from scam artists to propagandists), to deepen and amplify pre-existing forms of discrimination and bias, to violate privacy, and to displace workers.

The most extreme form of the alignment problem is AI-generated existential risk. Constantly evolving AIs that can teach themselves could go rogue and decide to engineer a financial crisis, sway an election, or even create a bioweapon.

But an unanswered question underlies AI’s status as a potential existential threat: Which human values should the technology align with? Should it be philosophically utilitarian (in the tradition of John Stuart Mill and Jeremy Bentham), or deontological (in the tradition of Immanuel Kant and John Rawls)? Should it be culturally WEIRD (Western, educated, industrialized, rich, democratic) or non-WEIRD? Should it be politically conservative or liberal? Should it be like us, or be better than us?

These questions are not merely hypothetical. They have already been at the center of real-life debates, including those following Microsoft’s release of a racist, misogynist, hyper-sexual chatbot in 2016; Bing’s oddly manipulative, seductive Sydney (which tried to convince one tech reporter to leave his wife); and, most recently, Google’s Gemini, whose “woke” character led it to generate historically absurd results like images of black Nazi soldiers.

Fortunately, modern societies have devised a mechanism that allows different moral tribes to co-exist: the rule of law. As I have noted in previous commentaries, law, as an institution, represents the apotheosis of cooperation. Its emergence was a profound breakthrough after centuries of humanity struggling to solve its own alignment problem: how to organize collective action.

Cognitively, law represented a radical new technology. Once it was internalized, it aligned individual action with community consensus. Law was obeyed as law, irrespective of an individual’s subjective judgment about any given rule. Several prominent philosophers have homed in on this unique feature. The twentieth-century legal theorist H.L.A. Hart described law as a mechanism that allows norms to be shaped by changing underlying behavioral meta-norms.

More recently, Ronald Dworkin characterized law in terms of “integrity,” because it embodies the norms of the whole community, rather than resembling a “checkerboard.” If law were a patchwork, it might better represent individual constituencies of belief and opinion, but at the expense of coherence. Law thus serves as an override button vis-à-vis individual human behavior. It absorbs complex debates over morals and values and mills them into binding rules.

Most of the current debate about AI and the law is focused on how the technology may challenge prevailing regulatory paradigms. One concern is the “red queen effect” (an allusion to Lewis Carroll’s Through the Looking-Glass), which describes the inherent difficulty of keeping regulation current with a fast-moving technology. Another issue is the challenge of regulating a truly global technology nationally. And then there is the Frankenstein’s monster problem of a novel technology being developed largely by a handful of private-sector firms whose priorities (profits) differ from those of the public.

It is always difficult to strike the right balance between fostering innovation and mitigating the potentially massive risks associated with a new technology. With AI increasingly expected to alter the practice of law itself, can law still alter the trajectory of AI? More to the point, if “thinking machines” are capable of learning, can they learn to obey the law?

As the tech giants rush ahead in pursuit of artificial general intelligence – models that can outperform humans in any cognitive task – the AI “black box” problem persists. Not even the creators of the technology know exactly how it works. Since efforts to assign AI an “objective function” could produce unintended consequences (for example, an AI tasked with making paper clips could decide that eliminating humanity is necessary to maximize its production), we will need a more sophisticated approach.

To that end, we should study the cognitive evolution that has allowed human societies to endure for as long as they have. Whether human laws can be imposed as a design constraint (perhaps with AI guardians playing the role of circuit-breakers, the equivalent of law-enforcement officers in human societies) is a question for the engineers. But if it can be done, it may represent our salvation.

Through law, we can require that AI pay the price of admission into our society: obedience to our collective code of conduct. If AI neural networks mimic our brains, and the law is, as widely believed, a largely cognitive phenomenon, this should be possible. If not, the experiment will at least shed light on the role of affective, emotional, and social factors in sustaining human law. Though we may need to rethink and improve some elements of existing law, this perspective at least forces us to examine the critical differences between “us” and “them.” That is where our efforts to regulate AI should start.
