The idea of a “malicious” artificial intelligence fascinates and entertains large parts of society. That fascination distracts from the much less obvious risks that data scientists and their colleagues should deal with: AI accidents. Even joint AI projects are at risk of getting out of hand, and questionable decisions, poor performance, or compliance problems can result. With five simple steps, however, AI accidents can be avoided across the entire life cycle.
Artificial intelligence is the future technology of the 21st century, but as promising as it is, it also entails many risks. One critical risk is the so-called AI accident. The good news is that many companies and initiatives already consider risk reduction a core mission. However, risks must be understood before they can be mitigated.
The term “AI accident” generally covers any unintended behavior of an artificial intelligence. Such behavior is undesirable for users and companies alike, since it can lead to false, inaccurate, or even dangerous results or to financial loss. In the worst case, it can also cause risk and compliance problems.
A concrete example of how an AI accident can arise in practice: a bank in Central Europe needed a risk model to pass stress tests. The model it uses was introduced two decades ago as a rule-based system, and ten years ago it was supplemented with additional variables and data sources. Regulators deemed it compliant for the purpose for which it was designed. As these stress test procedures were developed further, more data were generated, which enabled a new aggregated presentation of the bank’s balance sheet. This, in turn, became the source for more advanced liquidity limit monitoring models. The output of the original model was thus feeding applications beyond the purpose for which it had been approved.
Another example: a pharmaceutical manufacturer relies on a sales forecasting model. While it accurately predicts sales for some regions, it consistently, yet unpredictably, misses the mark for others. This affects the sales team’s goal setting, reduces performance, and sometimes even causes employees to leave the company. The model was created by a single expert and was saved and run on that expert’s laptop. When the expert left the company, the model was passed on to others, but they were too deferential to question a model that had always worked. Even though it has obvious biases, it is still used today. This leads to questionable decisions about setting sales targets and monitoring performance. The problem: important decisions are delegated to an automated system that, in extreme cases, can produce unpredictable results, without people consciously weighing the trade-offs and continuously monitoring the consequences. Or, to put it provocatively: the model doesn’t care if nonsense comes out, but the customer and the business do.
Similar problems arise when using proprietary black box models purchased or licensed from third parties for a specific purpose. While users may think of such a model as a simple tool, in the eyes of many regulators they have embarked on an AI journey whose destination and path are unknown, because the assumptions, training, and biases that went into the black box are unknown. In addition, there is little chance of knowing whether the black box is being used correctly. In simpler terms, the model may already be operated, knowingly or unknowingly, outside the area for which it was designed.
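One way to make "operated outside the area for which it was designed" checkable is to record the input ranges a model was validated on and to test every request against them. The following is only a minimal sketch under stated assumptions: the feature names, the ranges, and the helper `out_of_scope` are hypothetical and do not come from any particular vendor or platform.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureRange:
    """Documented validity range for one input feature (hypothetical values)."""
    low: float
    high: float


# Hypothetical design scope: the ranges the model was validated on.
DESIGN_SCOPE = {
    "loan_amount": FeatureRange(1_000.0, 500_000.0),
    "interest_rate": FeatureRange(0.0, 0.15),
}


def out_of_scope(inputs: dict[str, float], scope: dict[str, FeatureRange]) -> list[str]:
    """Return the features that are missing or outside their documented range."""
    violations = []
    for name, allowed in scope.items():
        value = inputs.get(name)
        if value is None or not (allowed.low <= value <= allowed.high):
            violations.append(name)
    return violations


request = {"loan_amount": 750_000.0, "interest_rate": 0.04}
print(out_of_scope(request, DESIGN_SCOPE))  # ['loan_amount']
```

A non-empty result does not mean the prediction is wrong, only that the model is being asked about a situation nobody signed off on, which is a reason to route the case to a human.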
So how can teams mitigate the risks associated with AI accidents? The answer lies in transparent governance and MLOps processes. To minimize the risk of an AI accident, it is essential first to set a clear goal while identifying what is outside the scope.
The following questions help with this:
Appropriate steps can then be taken.
The first step to avoiding AI accidents is to monitor all models effectively. This is still relatively easy to implement if users work with only one model. For example, a committee can be set up to regularly discuss, evaluate, and sign off on the model. However, as soon as the number of models increases, the workload becomes too great, and constantly discussing hundreds of models is not feasible. The alternative is visual model comparisons. Various AI platforms offer these: they provide an overview of all performance metrics, feature handling, and training information. This supports model development on the one hand and facilitates MLOps workflows on the other.
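When hundreds of models are in use, the committee's attention can be focused on the few that have drifted. The sketch below illustrates the idea with a plain list of records; the `ModelRecord` fields, the model names, the error values, and the 10% tolerance are all assumptions chosen for illustration, not the output of a specific AI platform.

```python
from dataclasses import dataclass


@dataclass
class ModelRecord:
    name: str
    baseline_error: float  # error measured when the model was signed off
    current_error: float   # error measured on the most recent data


def needs_review(records, tolerance=0.10):
    """Return names of models whose error grew by more than `tolerance` (relative)."""
    flagged = []
    for record in records:
        if record.current_error > record.baseline_error * (1 + tolerance):
            flagged.append(record.name)
    return flagged


# Hypothetical registry entries.
records = [
    ModelRecord("churn", baseline_error=0.20, current_error=0.21),
    ModelRecord("sales_forecast", baseline_error=0.10, current_error=0.15),
    ModelRecord("liquidity", baseline_error=0.05, current_error=0.05),
]

print(needs_review(records))  # ['sales_forecast']
```

Only the flagged models then need a manual discussion, while the rest stay on the automated overview.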
Regular and extensive testing, verification, and validation is one of the central foundations of protection against AI accidents. By iterating on their processes, teams can identify and respond to deviations. This works particularly well when people with different backgrounds, skills, and technical and professional know-how are involved. In any case, it is becoming increasingly clear that beyond a certain point AI can no longer be scaled without diverse teams building and using the technology. Keyword: the democratization of AI.
When creating models, it is best to measure and analyze them so that direct benchmark cases are available for comparison. It makes sense to test different situations that push the AI to its limits, because this shows what effects individual decisions have on business processes. This process is also a kind of risk assessment: the model’s performance is evaluated, and the scope of the decisions entrusted to it is set accordingly. Such precautions are recommended whenever AI is used in critical business applications.
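Pushing a model to its limits can be as simple as running it on a handful of extreme scenarios and checking whether the output is still plausible. The sketch below assumes a toy forecasting function and made-up scenarios and bounds; `toy_model`, `stress_test`, and the numbers are hypothetical stand-ins for whatever model and business limits apply in practice.

```python
def toy_model(previous_sales, price_change):
    """Hypothetical stand-in: sales react linearly to a relative price change."""
    return previous_sales * (1.0 - 2.0 * price_change)


def stress_test(model, scenarios, lower=0.0, upper=1_000_000.0):
    """Run the model on each scenario and return those with implausible output."""
    failures = []
    for label, kwargs in scenarios.items():
        result = model(**kwargs)
        # NaN fails this comparison too, so it is reported as implausible.
        if not (lower <= result <= upper):
            failures.append((label, result))
    return failures


# Hypothetical scenarios, including two deliberately extreme ones.
scenarios = {
    "normal": {"previous_sales": 1000.0, "price_change": 0.05},
    "price_doubles": {"previous_sales": 1000.0, "price_change": 1.0},
    "zero_sales": {"previous_sales": 0.0, "price_change": 0.05},
}

print(stress_test(toy_model, scenarios))  # [('price_doubles', -1000.0)]
```

Here the toy model predicts negative sales when the price doubles, which is exactly the kind of boundary behavior that should limit how far its decisions are trusted.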
It helps to train models in a reproducible way from the start. This means that they are systematically checked for bias, always with the awareness that unconscious bias cannot be ruled out. If a model is built on biased data from the start, it will also make biased predictions.
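Two small habits support this: fixing the random seed so that a training run can be repeated exactly, and comparing errors per group instead of only in aggregate. The following is a minimal sketch using only the standard library; the function names, the seed, and the sample rows (region, actual, predicted) are hypothetical and serve only to illustrate the check.

```python
import random


def reproducible_split(rows, test_fraction=0.25, seed=42):
    """Shuffle with a fixed seed so every run produces the same split."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]


def mean_error_by_group(rows):
    """Average signed error (predicted - actual) per group.

    A value far from zero suggests the model is systematically off for that group.
    """
    totals = {}
    for group, actual, predicted in rows:
        error_sum, count = totals.get(group, (0.0, 0))
        totals[group] = (error_sum + (predicted - actual), count + 1)
    return {group: error_sum / count for group, (error_sum, count) in totals.items()}


# Hypothetical evaluation rows: (region, actual sales, predicted sales).
rows = [
    ("north", 100, 102),
    ("north", 110, 108),
    ("south", 90, 70),
    ("south", 95, 75),
]

print(mean_error_by_group(rows))  # {'north': 0.0, 'south': -20.0}
```

In this made-up data the aggregate error looks moderate, but the per-group view shows that one region is consistently under-predicted, which an overall metric would hide.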
Raising public awareness and educating management and those involved in AI projects about “AI accidents” is the first step, and openly reflecting on them is the next. In science, negative results are usually not published, but this practice can lead many others down the same flawed path. The use of AI technology is an emerging field, so not everything will work on the first try, and neither stigma nor a “blame game” will help. The goal should be to learn constructively from any mistakes or unintended consequences. In addition, teams should invest in research and development in the field of AI security and in developing AI standards and testing capacities.
Like any algorithm, AI needs transparent governance embedded in the data strategy and related business processes. Regardless of the model used, it needs to be verified and validated. It’s up to teams to monitor the performance of their models through end-to-end lifecycle management because, ultimately, performance counts for business continuity. The key lies in transparent, trustworthy, and efficient MLOps processes.
Business experts and analysts should always be involved to scale impact while managing risk. Transparent governance for data, access, and models helps here. The same rigor should always be applied to any other model that automates important data-based decisions. An artificial dividing line between “new” and “old” modeling approaches downplays the potentially more significant risks of the latter. This is how we can enable AI to scale by limiting the risks.