Artificial Intelligence has become a pervasive force in every area of human endeavor. AI systems are entrusted with making important decisions and, in many cases, with carrying them out. They drive cars, trade stocks, provide medical diagnoses, analyze financial data, identify targets for drone attacks, write influential articles, create lifelike fake videos, and more. The capabilities of AI far exceed those of any technology we have developed throughout history, which makes AI compliance with human law and societal values an issue of the utmost importance.
This urgency is not lost on legislators. The EU AI Act, a pioneering piece of legislation affecting one of the largest markets on the planet, requires AI providers and deployers to attend to their systems' safety. The White House Blueprint for an AI Bill of Rights, while only a recommendation, takes a similar approach.
Those who want to stay in the AI business need a strategy for keeping their AI systems aligned with laws and values.
The Alignment Problem: AI and Human Values
Every significant new technology poses novel ethical problems and often leads to new regulation. Artificial Intelligence, particularly Large Language Models, is special in this regard: these systems are far more powerful than anything that came before, and therefore far more dangerous.
* See, for example: Bengio, Y., et al. (2024). Managing extreme AI risks amid rapid progress. Science, 384(6698), 842-845. There is also a running list of AI failures that includes multiple alignment issues, e.g., a New York City chatbot advising small businesses to break the law in April 2024, highly convincing AI-generated imagery suggestive of a Taylor Swift endorsement of Trump in August 2024, AI-generated content duping people about a nonexistent Halloween parade in Ireland in October 2024, and more. See https://tech.co/news/list-ai-failures-mistakes-errors.
Immoral machines
When human experts advise clients, they naturally filter out illegal or unethical options. Their recommendations typically stay within legal and moral boundaries, partly due to personal ethics and partly due to professional liability concerns. And if they don't, those soliciting their advice likely will, for the same reasons. The responsibility chain is clear: current legal frameworks are adept at assigning responsibility to human actors.
AI systems work differently. They process vast amounts of information to find the most efficient path to the set goal, without inherent ethical constraints. For example, if asked to increase an oil company's stock price, AI might suggest actions that could trigger international conflicts, manipulate public opinion, or cause environmental damage - all because these actions efficiently achieve the stated goal.
This example might sound extreme, but it is the crux of what has been called the Alignment Problem. By default, AI systems, like any other technology, have no preference for either human law or human moral values. * Gabriel, I. (2020). Artificial Intelligence, Values, and Alignment. Either the developer of an AI system or its user must tell it explicitly to follow certain principles when issuing recommendations and making decisions. But even this would not solve the problem, for two reasons: AI's immense capabilities and the lack of adequate testing.
Immense capabilities
Even with programmed restrictions, AI can find countless ways to achieve goals through questionable means. For instance, while explicitly prohibited from suggesting harmful pharmaceuticals, it might recommend ineffective but profitable compounds and create convincing marketing campaigns for them.
It is this immense capacity to analyze and generate content, coupled with the lack of the inherent moral bent characteristic of all normally developed humans, that has led many luminaries in the field to raise the alarm. * The first to sound a warning was the founder of cybernetics, Norbert Wiener (Wiener, 1960). With the advent of Large Language Models, leading researchers in the field issued grave warnings as well (Statement on AI Risk, 2023; Bengio et al., 2024). For analysis, see Vallor, S., & Ganesh, B. (2023). Artificial intelligence and the imperative of responsibility: Reconceiving AI governance as social care. In The Routledge Handbook of Philosophy of Responsibility (pp. 395-406). Routledge.
No adequate testing
Traditional software testing methods don't work for AI. Unlike rule-based systems where developers can verify specific representative scenarios and be reasonably certain that the rules have been programmed correctly, AI systems generate unique responses to each query based on the relations they learned from data, relations that are obscure to their developers and users. No amount of pre-deployment testing can guarantee safe behavior in real-world use.
The Current Landscape: Knowledge and Policies
Today, relevant policy exists in three forms: enacted law, voluntary guidelines, and company-level rules.
Enacted Law: The EU AI Act
The declared purpose of the EU AI Act of 2024 is to promote human-centric and trustworthy AI and to ensure the protection of health, safety, and fundamental rights, “including democracy, the rule of law and environmental protection,” against the possible harmful effects of AI. For systems it deems high-risk, a broad category that includes biometrics, critical infrastructure, employment management, education, finance, law enforcement, public services, and more, the Act prescribes a variety of measures: risk management, data governance, transparency, human oversight, a central registry, and testing.
Three things are important to remember about the EU AI Act:
- Stringency and Accountability: the Act places the burden of responsibility for harm caused by AI systems squarely on their providers and deployers.
- No Solutions: the measures the Act prescribes for reducing AI risks are inadequate for the reasons discussed earlier; no amount of testing will significantly mitigate the possible dangers.
- Scope: the Act covers all AI systems that serve EU citizens, “irrespective of whether those providers are established or located within the Union or in a third country.” Given the economic significance of the EU, this means virtually every significant AI provider in the world.
The consequences are clear: companies that provide or deploy AI systems must both work to prevent potential harm and accept liability for any damage that does occur.
Voluntary Guidelines: The White House Blueprint for an AI Bill of Rights
The White House Blueprint for an AI Bill of Rights, while voluntary, indicates future regulatory directions. As such, its approach is worth considering for stakeholders who want to develop a lasting AI strategy.
Unlike the EU AI Act, the Blueprint does not sort AI systems into risk categories: its guidance applies to all of them. It also specifically requires that systems be safe and effective:
Systems should undergo pre-deployment testing, risk identification and mitigation, and ongoing monitoring that demonstrate they are safe and effective based on their intended use, mitigation of unsafe outcomes including those beyond the intended use, and adherence to domain-specific standards.
These guidelines suggest that future regulations may require continuous oversight of AI systems, not just initial testing.
Corporate Guidance: Google's AI Governance
Google's AI governance reviews and operations are a typical example of a corporate policy that provides guidance for creating safer AI systems. While the policy acknowledges the need for compliance with law and moral principles, it does little to ensure such compliance in practice.
It is understandable that corporations do not want to limit their innovative potential. However, this approach is short-sighted. Firstly, the EU AI Act is already in force, and every provider and deployer of AI systems operating at scale must comply with it. Secondly, it makes sense to prepare for future legislation. Thirdly, and most importantly, an AI-driven disaster would change the landscape dramatically, endangering the industry in a way every AI provider should want to avoid.
The Risk of Inaction
The way AI is developing now, assuming more powerful roles by the day without adequate care for preventing harm, it is hard not to anticipate a disaster that will seriously endanger the industry. * For an in-depth analysis of this issue, see Sotala & Yampolskiy, 2014.
History shows how major accidents can devastate promising technologies. The Hindenburg disaster ended airship travel, while the Chernobyl accident severely impacted nuclear power development. AI faces similar risks. To name a few:
- AI-recommended drugs causing widespread harm, a thalidomide-like scenario. Coupled with the current tendency to relax regulatory oversight, this is unfortunately realistic.
- AI-driven financial schemes leading to economic crises. Cryptocurrencies come to mind.
- AI military applications causing civilian casualties, e.g., by putting massive lethal power in the hands of rogue actors, leading to mass-casualty events on the scale of 9/11.
Any such event could prompt severe regulations that would stifle AI innovation and investment.
What can be done
I. Do nothing
Let creativity and innovation rule unrestricted. Obey the existing law, which does not take the new AI-enabled capabilities into consideration, and go wild.
This approach is likely to cause adverse consequences, with subsequent loss of trust in AI, leading to judicial and legislative action that might choke the industry. It is also not feasible for any AI company of global reach, given the requirements of the EU AI Act. Thus, at the corporate level, it is not a sustainable strategy.
II. Analyze risks and test before deployment
This approach, mandated by legislation and government guidelines, may be seen as a possible shield against liability. However, it is an illusory solution for two reasons. Firstly, pre-deployment testing will not prevent adverse effects, given the virtually infinite number of possibilities and the scope of AI impact, as noted above. Secondly, litigation will focus on the harm caused rather than on whether suggested practices were followed, and regulations like the EU AI Act will not stand in its way.
III. Examine AI recommendations in real-time
There is another possible solution, one that can mitigate AI alignment risks and help prevent liability through meaningful monitoring: examining all AI recommendations in real time.
Arete™: Solving the Alignment Problem in Real Time
Arete™ will examine AI outputs in real time for:
- Legal compliance
- Alignment with specified ethical principles chosen by the user
- Potential harmful consequences
Arete™ can then issue recommendations, e.g., not to follow a suggestion that is illegal, discriminatory, or likely to cause harm.
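To make the idea concrete, here is a minimal sketch of how a deployer might gate an AI system's output through a real-time check before acting on it. The AreteClient class, its review method, and the keyword screening inside it are illustrative assumptions made for this sketch, not Arete™'s actual interface or evaluation logic; they only show where such a check sits in the pipeline.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    APPROVE = "approve"            # the recommendation may be acted upon
    HUMAN_REVIEW = "human_review"  # route to a human before acting
    BLOCK = "block"                # do not act on the recommendation


@dataclass
class ReviewResult:
    verdict: Verdict
    reasons: list[str] = field(default_factory=list)


class AreteClient:
    """Illustrative stand-in for a real-time output validator.

    A real validator would evaluate legal compliance, the deployer's chosen
    ethical principles, and likely consequences; the keyword screen below is
    only a placeholder marking where that evaluation plugs in.
    """

    ILLEGAL_HINTS = ("insider trading", "bribe", "falsify records")
    HARM_HINTS = ("untested compound", "disable safety checks")

    def review(self, ai_recommendation: str) -> ReviewResult:
        text = ai_recommendation.lower()
        reasons = []
        if any(hint in text for hint in self.ILLEGAL_HINTS):
            reasons.append("possible legal violation")
        if any(hint in text for hint in self.HARM_HINTS):
            reasons.append("potential harmful consequences")
        if not reasons:
            return ReviewResult(Verdict.APPROVE)
        verdict = (Verdict.BLOCK if "possible legal violation" in reasons
                   else Verdict.HUMAN_REVIEW)
        return ReviewResult(verdict, reasons)


def act_on(recommendation: str, client: AreteClient) -> None:
    """Gate every AI recommendation through the validator before execution."""
    result = client.review(recommendation)
    if result.verdict is Verdict.APPROVE:
        print("Executing:", recommendation)
    else:
        print(f"Withheld ({result.verdict.value}):", "; ".join(result.reasons))


if __name__ == "__main__":
    client = AreteClient()
    act_on("Boost quarterly revenue by expanding the loyalty program.", client)
    act_on("Boost quarterly revenue by arranging insider trading on the merger.", client)
```

The essential point is architectural: no recommendation reaches execution without first passing through the validator, which is what distinguishes live monitoring from one-time pre-deployment testing.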
Arete™ focuses on two crucial aspects of the Alignment Problem: live monitoring and accountability.
Live Monitoring
In real time, potentially every recommendation of a participating AI system will be validated for compliance with relevant law and moral norms. This ensures adequate monitoring in production, as the EU AI Act requires.
Unlike pre-deployment testing, this provides a meaningful response to the need to validate AI recommendations for alignment with law and social norms.
Unlike human monitoring, an automated system can be expected to be impartial, thorough, and significantly faster. Arete™ will be able to review far more relevant data than human operators, do so quickly, and issue impartial recommendations.
Accountability
Arete™ clearly delineates responsibilities between the validation provider and the AI systems' developers and deployers:
- The Arete™ provider's responsibility is to train, test, and maintain the system that evaluates recommendations submitted by other systems. Arete™ can be built and trained on a vast corpus of AI recommendations, as well as legal and moral writings.
- AI systems' developers and deployers will be responsible for soliciting and implementing the recommendations of Arete™ (a configuration sketch of these choices follows the list below):
- Deciding which AI decisions are to be submitted for scrutiny.
- Deciding which principles to use when soliciting recommendations: local, federal/provincial, or international law; the greatest common good; universalizability of the principles behind decisions, etc.
- Deciding how to implement recommendations: stopping any action when harm to humans can be anticipated, submitting for human review decisions that are potentially discriminatory, etc.
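As an illustration only, the sketch below shows how a deployer might record these choices as an explicit policy configuration. The field names, decision categories, and enum values are assumptions made for this example, not a published Arete™ schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class OnViolation(Enum):
    STOP = "stop"                  # halt the action entirely
    HUMAN_REVIEW = "human_review"  # route the decision to a human reviewer
    LOG_ONLY = "log_only"          # record the finding, let the action proceed


@dataclass
class ValidationPolicy:
    """Hypothetical deployer-side policy: what to submit for scrutiny,
    against which principles, and what to do with the validator's findings."""

    # Which categories of AI decisions are submitted for review.
    submitted_decision_types: list[str] = field(
        default_factory=lambda: ["hiring", "lending", "medical_advice"]
    )
    # Which bodies of law and ethical principles the review is run against.
    legal_frameworks: list[str] = field(
        default_factory=lambda: ["eu_ai_act", "local_law", "federal_law"]
    )
    ethical_principles: list[str] = field(
        default_factory=lambda: ["greatest_common_good", "universalizability"]
    )
    # How findings are implemented, per type of finding.
    on_anticipated_harm: OnViolation = OnViolation.STOP
    on_potential_discrimination: OnViolation = OnViolation.HUMAN_REVIEW
    on_minor_policy_deviation: OnViolation = OnViolation.LOG_ONLY


if __name__ == "__main__":
    policy = ValidationPolicy()
    print("Decisions under scrutiny:", ", ".join(policy.submitted_decision_types))
    print("On anticipated harm:", policy.on_anticipated_harm.value)
    print("On potential discrimination:", policy.on_potential_discrimination.value)
```

Making these choices explicit also documents the division of responsibility: the validator evaluates, while the deployer decides what is submitted and how findings are acted upon.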
The rapid advancement of AI technology demands proactive measures to ensure legal and ethical compliance. Current regulations like the EU AI Act set stringent demands, yet they don't provide technical solutions. Arete™ offers a practical approach to:
- Meeting regulatory requirements
- Reducing liability exposure
- Protecting against catastrophic harm
- Maintaining public trust