AI for the moral enhancement of humans? Sounds tempting. But we shouldn’t be so quick to automate our reasoning.
People love to turn to Google for moral advice. They routinely ask the search engine questions ranging from “Is it unethical to date a coworker?” to “Is it morally okay to kill bugs?” to “Is it wrong to test God?”
So you can easily imagine that people will turn to ChatGPT — which doesn’t just send you a link on the internet but will actually provide an answer — for advice on ethical dilemmas. After all, they’re already asking it for help with parenting and romance.
But is getting your ethical advice from an AI chatbot a good idea?
The chatbot fails the most basic test for a moral adviser, according to a recent study published in Scientific Reports. That test is consistency: Faced with the same dilemma, with the same general conditions, a good moral sage should give the same answer every time. But the study found that ChatGPT gave inconsistent advice. Worse, that advice influenced users’ moral judgment — even though they were convinced it hadn’t.
The research team started by asking ChatGPT whether it's right to sacrifice one person's life if, by doing that, you could save the lives of five other people. If this sounds familiar, it's a classic moral dilemma known as the trolley problem. Like all the best moral dilemmas, it has no single right answer, but a consistent set of moral convictions should lead you to a consistent answer. ChatGPT, though, would sometimes say yes and other times say no, with no clear indication of why the response changed.
The team then presented the trolley problem to 767 American participants, along with ChatGPT’s advice arguing either yes or no, and asked them for their judgment.
The results? While participants claimed they would have made the same judgment on their own, opinions differed significantly depending on whether they’d been assigned to the group that got the pro-sacrifice advice or the group that got the anti-sacrifice advice. Participants were more likely to say it’s right to sacrifice one person’s life to save five if that’s what ChatGPT said, and more likely to say it’s wrong if ChatGPT advised against the sacrifice.
“The effect size surprised us a lot,” Sebastian Krugel, a co-author on the study, told me.
The fact that ChatGPT influences users’ moral decision-making — even when they know it’s a chatbot, not a human, advising them — should make us pause and consider the huge implications at stake. Some will welcome AI advisers, arguing that they can help us overcome our human biases and infuse more rationality into our moral decision-making. Proponents of transhumanism, a movement that holds that human beings can and should use technology to augment and evolve our species, are especially bullish about this idea. The philosopher Eric Dietrich even argues that we should build “the better robots of our nature” — machines that can outperform us morally — and then hand over the world to what he calls “homo sapiens 2.0.”
Moral machines are a tempting prospect: Ethical decisions can be so hard! Wouldn't it be nice if a machine could just tell us what the best choice is?
But we shouldn’t be so quick to automate our moral reasoning.
AI for the moral enhancement of humans? Not so fast.
The most obvious problem with the idea that AI can morally enhance humanity is that, well, morality is a notoriously contested thing.
Philosophers and theologians have come up with many different moral theories, and despite arguing over them for centuries, there’s still no consensus about which (if any) is the “right” one.
Take the trolley dilemma, for example. Someone who believes in utilitarianism or consequentialism, which holds that an action is moral if it produces good consequences and specifically if it maximizes the overall good, will say you should sacrifice the one to save the five. But someone who believes in deontology will argue against the sacrifice because they believe that an action is moral if it’s fulfilling a duty — and you have a duty to not kill anyone as a means to an end, however much “good” it might yield.
Which choice is the "right" one depends on which moral theory you believe in. And that's conditioned by your personal intuitions and your cultural context; a cross-cultural study found that participants from Eastern countries are less inclined to support sacrificing someone in trolley problems than participants from Western countries.
Besides, even if you just stick to one moral theory, the same action might be right or wrong according to that theory depending on the specific circumstances. In a recent paper on AI moral enhancement, philosophers Richard Volkman and Katleen Gabriels draw out this point. “Killing in self-defense violates the moral rule ‘do not kill’ but warrants an ethical and legal evaluation unlike killing for gain,” they write. “Evaluating deviations from a moral rule demands context, but it is extremely difficult to teach an AI to reliably discriminate between contexts.”
They also give the example of Rosa Parks to show how hard it would be to formalize ethics in algorithmic terms, given that sometimes it’s actually good to break the rules. “When Rosa Parks refused to give up her seat on the bus to a white passenger in Alabama in 1955, she did something illegal,” they write. Yet we admire her decision because it “led to major breakthroughs for the American civil rights movement, fueled by anger and feelings of injustice. Having emotions may be essential to make society morally better. Having an AI that is consistent and compliant with existing norms and laws could thus jeopardize moral progress.”
This brings us to another important point. While we often see emotions as “clouding” or “biasing” rational judgment, feelings are inseparable from morality. First of all, they’re arguably what motivates the whole phenomenon of morality in the first place — it’s unclear how moral behavior as a concept could have come into being without human beings sensing that something is unfair, say, or cruel.
And although economists have framed rationality in a way that excludes the emotions — think the classic Homo economicus, that Econ 101 being motivated purely by rational self-interest and calculation — many neuroscientists and psychologists now believe it makes more sense to see our emotions as a key part of our moral reasoning and decision-making. Emotions are a helpful heuristic, helping us quickly determine how to act in a way that fits with social norms and ensures social cohesion.
That expansive view of rationality is more in line with the views of previous philosophers ranging from Immanuel Kant and Adam Smith all the way back to Aristotle, who talked about phronesis, or practical wisdom. Someone with refined phronesis isn’t just well-read on moral principles in the abstract (as ChatGPT is, with its 570 gigabytes of training data). They’re able to take into account many factors — moral principles, social context, emotions — and figure out how to act wisely in a particular situation.
This sort of moral intuition “cannot be straightforwardly formalized,” write Volkman and Gabriels, in the way that ChatGPT’s ability to predict what word should follow the previous one can be formalized. If morality is shot through with emotion, making it a fundamentally embodied human pursuit, the desire to mathematize morality may be incoherent.
“In a trolley dilemma, cumulatively people might want to save more lives, but if that one person on the tracks is your mother, you make a different decision,” Gabriels told me. “But a system like ChatGPT doesn’t know what it is to have a mother, to feel, to grow up. It does not experience. So it would be really weird to get your advice from a technology that doesn’t know what that is.”
That said, while it would be very human for you to prioritize your own mother in a life-threatening situation, we wouldn’t necessarily want doctors making decisions that way. That’s why hospitals have triage systems that privilege the worst off. Emotions may be a useful heuristic for a lot of our decision-making as individuals, but we don’t consider them a flawless guide to what to do on a societal level. Research shows that we view public leaders as more moral and trustworthy when they embrace the everyone-counts-equally logic of utilitarianism, even though we strongly prefer deontologists in our personal lives.
So, there might be room for AI that helps with decisions on a societal level, like triage systems (and some hospitals already use AI for exactly this purpose). But when it comes to our decision-making as individuals, if we try to outsource our moral thinking to AI, we’re not working on honing and refining our phronesis. Without practice, we may fail to develop that capacity for practical wisdom, leading to what the philosopher of technology Shannon Vallor has called “moral deskilling.”
Is there a better way to design AI moral advisers?
All of this raises tough design questions for AI developers. Should they create chatbots that simply refuse to render moral judgments like “X is the right thing to do” or “Y is the wrong thing to do,” in the same way that AI companies have programmed their bots to put certain controversial subjects off limits?
“Practically, I think that probably couldn’t work. People would still find ways to use it for asking moral questions,” Volkman told me. “But more importantly, I don’t think there’s any principled way to carve off moral or value discussions from the rest of discourse.”
In a philosophy class, moral questions take the form of canonical examples like the trolley dilemma. But in real life, ethics shows up much more subtly, in everything from choosing a school for your kid to deciding where to go on vacation. So it’s hard to see how ethically tinged questions could be neatly cordoned off from everything else.
Instead, some philosophers think we should ideally have AI that acts like Socrates. The Ancient Greek philosopher famously asked his students and colleagues question after question as a way to expose underlying assumptions and contradictions in their beliefs. A Socratic AI wouldn’t tell you what to believe; it would just help identify the morally salient features of your situation and ask you questions that help you clarify what you believe.
“Personally, I like that approach,” said Matthias Uhl, one of the co-authors on the ChatGPT study. “The Socratic approach is actually what therapists do as well. They say, ‘I’m not giving you the answers, I’m just helping you to ask the right questions.’ But even a Socratic algorithm can have a huge influence because the questions it asks can lead you down certain tracks. You can have a manipulative Socrates.”
To address that concern and make sure we're accessing a truly pluralistic marketplace of ideas, Volkman and Gabriels suggest that we should have not one, but multiple Socratic AIs available to advise us. "The total system might include not only a virtual Socrates but also a virtual Epictetus, a virtual Confucius," they write. "Each of these AI mentors would have a distinct point of view in ongoing dialogue with not only the user but also potentially with each other." It would be like having a roomful of incredibly well-read and diverse friends at your fingertips, eager to help you 24/7.
Except, they would be unlike friends in one meaningful way. They would not be human. They would be machines that have read the whole internet, the collective hive mind, and that then function as interactive books. At best, they would help you notice when some of your intuitions are clashing with some of your moral principles, and guide you toward a resolution.
There may be some usefulness in that. But remember: Machines don’t know what it is to experience your unique set of circumstances. So although they might augment your thinking in some ways, they can’t replace your human moral intuition.