We Humans and the Intelligent Machines. Jörg Dräger

assessment of the risk of recidivism. The problem, however, is that the control group does not represent a cross-section of the American population but rather of its prison inmates, among whom blacks are disproportionately represented. For black suspects, this increases the probability of matches with the inmates’ profiles, which leads the algorithm to forecast a higher risk of reoffending. As a result, the computer program reinforces existing inequalities. This was also criticized, back in 2014, by Eric Holder, then Attorney General in the Obama administration. “Although these measures were crafted with the best of intentions, I am concerned that they inadvertently undermine our efforts to ensure individualized and equal justice,” he said. “They may exacerbate unwarranted and unjust disparities that are already far too common in our criminal justice system and in our society.”7
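
      To make that mechanism tangible, here is a deliberately simplified sketch in Python. It is not the actual COMPAS model, whose inner workings are proprietary; it merely illustrates how a similarity-based risk score rises for members of a group that is over-represented in the reference data, even when the individuals being compared behave identically in every other respect.

```python
# Hypothetical illustration -- NOT the real COMPAS model, whose inner workings
# are proprietary. It only shows how a skewed reference group distorts a
# similarity-based risk score.
import random

random.seed(0)

def make_profile(group):
    """A toy profile: demographic group plus one neutral attribute."""
    return {"group": group,
            "age_band": random.choice(["young", "middle", "older"])}

# Reference group drawn from prison inmates: group B is over-represented
# here (about 60 %), although it is a minority in the general population.
reference = [make_profile("B" if random.random() < 0.6 else "A")
             for _ in range(10_000)]

def risk_score(defendant):
    """Share of reference profiles that 'match' the defendant."""
    matches = sum(1 for p in reference
                  if p["group"] == defendant["group"]
                  and p["age_band"] == defendant["age_band"])
    return matches / len(reference)

# Two defendants who are identical except for their demographic group
defendant_a = {"group": "A", "age_band": "young"}
defendant_b = {"group": "B", "age_band": "young"}

print(risk_score(defendant_a))  # noticeably lower ...
print(risk_score(defendant_b))  # ... than this, purely because of the skew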

      The original goal behind the COMPAS software is based on a broad social consensus: the desire to reduce discrimination and arrive at just judgments regardless of a person’s background. It follows the principles underlying the rule of law in a liberal democracy. Unfortunately, COMPAS does not achieve these goals; good intentions are no guarantee of good results. The algorithmic system cannot perceive its own failure since it does not evaluate itself. That task falls to those responsible for its use, in this case the American government and its judiciary, or to the commercial producer of the software. They all failed to make the program accessible for review. Uncovering the algorithm’s shortcomings thus required a time-consuming investigation by a non-profit organization.

       One-sided learning: Algorithms as self-fulfilling prophecies

      Unlike in Florida, in the state of Wisconsin the COMPAS software is not used only to assess pretrial risk. The judges there also use the recidivism probabilities calculated by its algorithms to determine whether a defendant should go to jail or be put on probation8 – a decision of enormous significance for the person in question, and for the public and its sense of security. It is therefore all the more important that COMPAS be able to learn from the results of its forecasts: When was the algorithm right, and when was it wrong?

      The problem is the one-sidedness of this learning process. Defendants placed on probation can confirm or refute the system’s prognosis, depending on how they behave during the probation period. If, on the other hand, they go to jail because of the COMPAS recommendation, they have no chance of proving that the software was wrong. This is not an isolated example: People who do not receive a loan can never prove that they would have repaid it. And anyone who is rejected as an applicant by a computer program cannot prove that he or she would have done an excellent job if hired.

      In such situations, the algorithmic system has what is in effect a learning disability, since verifying whether its prognosis was correct is only possible for one of the two resulting groups. For the other, the question remains hypothetical. Freedom or prison, creditworthy or not, job offer or rejection: Algorithms used in the areas of law enforcement, banking and human resources are fed with one-sided feedback and do not improve as much as they should.

      Users must be aware of this and create comparison groups from which the algorithm can nevertheless learn. A financial institution could, for example, also grant a loan to some of the applicants who were initially rejected and use the experience gained with those individuals to further develop its software. HR departments and courts could also form comparison groups by making some of their decisions without machine support, generating feedback data to assess their algorithmic predictions: Was an applicant successful at the company even though he or she would have been rejected by the software? Did someone not reoffend even though the system, as in the case of Brisha, had made a different prediction?
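
      How such a comparison group could be built is easy to sketch. The following Python snippet is a hypothetical illustration of the lending example above, not any real bank’s system: the model’s rejections are overridden for a small random share of applicants, and only those cases can later show the software that it was wrong to reject someone.

```python
# Hypothetical sketch of the comparison-group idea, not a real bank's system.
# A small random share of applicants the model would reject is approved anyway,
# so that feedback data exists for both sides of the decision.
import random

random.seed(1)
EXPLORATION_RATE = 0.05  # assumed share of "reject" cases approved for learning

def decide(score, threshold=0.5):
    """Return the decision and a flag marking comparison-group cases."""
    if score >= threshold:
        return "approve", False
    if random.random() < EXPLORATION_RATE:
        return "approve", True   # overridden rejection: the comparison group
    return "reject", False

# Only approved applicants ever produce an observable outcome (repaid or not);
# among them, the comparison-group cases are the ones that can prove the
# model wrong about the people it wanted to reject.
feedback = []
for score, repaid in [(0.3, True), (0.7, True), (0.2, False), (0.4, True)]:
    decision, in_comparison_group = decide(score)
    if decision == "approve":
        feedback.append({"score": score, "repaid": repaid,
                         "comparison_group": in_comparison_group})

print(feedback)
```

      The same pattern carries over to courts and HR departments: a share of decisions made without machine support serves as the comparison group from which the software can learn.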

      To reduce the problem of one-sided feedback, users must be willing to examine the issue and be adept at addressing it. Both qualities are urgently needed. After all, one-sided feedback not only creates a learning problem for algorithmic decision-making, it can also reinforce and even exacerbate discrimination and social disadvantages. A longer stay in prison increases the risk of a new crime being committed afterwards; a loan taken out in desperation at exorbitant interest rates increases the risk of default. This threatens to turn the algorithmic system into a generator of self-fulfilling prophecies.

       Normative blindness: Algorithms also pursue the wrong objectives

      Better drunk than poor.9 That, apparently, is how car insurance companies feel about their customers in some parts of the US, where nothing drives up insurance rates like not being creditworthy. In Kansas, for example, customers with low credit ratings pay up to $1,300 per year more than those with excellent ratings. If, on the other hand, the police catch someone driving drunk, his insurance rate is increased by only $400.

      A similar example from the state of New York: An accident where the driver is at fault increases her premium by $430, drunk driving by $1,170, but a low credit rating sends it skyrocketing by $1,760. Driving behavior has less influence on the insurance rate than creditworthiness. In other words, anyone who is in financial difficulties pays significantly more than a well-to-do road hog.
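
      Put as simple arithmetic, the New York comparison reads as follows. The surcharges come from the figures above; the base premium is a made-up placeholder.

```python
# The New York surcharges quoted above; the base premium is an assumed placeholder.
base_premium = 1_000  # assumed annual base rate in USD

surcharges = {
    "at-fault accident": 430,
    "drunk driving": 1_170,
    "low credit rating": 1_760,
}

for reason, extra in surcharges.items():
    print(f"{reason:>18}: {base_premium + extra:,} USD per year")
```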

      This practice was uncovered by the non-profit Consumer Reports. The consumer-protection organization evaluated and compared more than two billion car insurance price quotes from 700 insurance companies across the US, showing that the algorithms most insurers use to calculate their rates also forecast the creditworthiness of each customer. To do so, the computer programs draw on the motorists’ financial data, the same data banks use to calculate the probability of loan defaults.

      However, the decisive difference between the two sectors is that such an algorithmic forecast would be appropriate for a bank because there is a plausible correlation between creditworthiness and the probability that a loan will be repaid. For car insurers, however, the financial strength of their customers should be an irrelevant criterion. It does not allow any conclusions to be drawn about driving behavior or the probability of an accident, even though the insurance rate should depend solely on this. Car insurance is compulsory and everyone, regardless of social status, should have equal access to it – and the premiums should provide an incentive to behave in a compliant and considerate manner on the road. This would benefit all drivers and thus society as a whole.

      The insurance practice denounced by Consumer Reports does not completely ignore this incentive; misconduct behind the wheel continues to be sanctioned. However, the incentive is undermined if accident-free driving is worth less to the insurer than the customer’s account balance. Those who suffer as a result are often poorer people who depend on their cars. Thus, those already disadvantaged by low income and poor creditworthiness are burdened with even higher premiums.

      Economically it may make sense for an insurance company to have solvent rather than law-abiding drivers as customers. That is not new. Now, however, algorithmic tools are available that can quickly and reliably assess customers’ creditworthiness and translate it into individual rates. Without question, the computer program in this example works: It fulfills its mission and acts on behalf of the car insurers. What the algorithmic system, due to its normative blindness, cannot recognize on its own is that it works against the interests of a society that wants to enable individual mobility for all citizens and increase road safety. By using this software, insurance companies are placing their own economic interests above the benefits to society. It is an ethically questionable practice to which legislators in California, Hawaii and Massachusetts have since responded: These states prohibit car insurers from using credit forecasts to determine their premiums.

       Lack of diversity: Algorithmic monopolies jeopardize participation

      Kyle Behm just cannot make sense of it anymore.10 He has applied for a temporary student job at seven supermarkets in Macon, Georgia. He wants to arrange products on shelves, label them with prices, work in the warehouse – everything you do in a store to earn a few extra dollars while going to college. The tasks are not especially demanding, so he is half horrified, half incredulous when one rejection after another arrives in his e-mail inbox. Behm is not invited to a single interview.

      His father cannot understand the rejections either.

