Deferring Decisions to AI

Would you let an AI make decisions for you? With all the buzz about AI and calls for its regulation (apparently AI will kill us all some day), I've been thinking about the circumstances under which we might delegate decision-making in certain areas to machines (call it AI). That's not as crazy as it sounds, because mankind has been doing it for years already.

We already defer the decision of who gets to drive through an intersection next to traffic lights. You may argue that that's not AI, but newer systems control the flow of traffic in clever ways, using various methods of detecting waiting vehicles and more. Not only do we defer the decision there, we even automatically punish people who run red lights (red-light cameras), and more severely if harm to others is caused.

Credit scoring is, in general, a risk estimate for non-payment of a loan. In the past, credit reporting was a deeply personal practice: companies secured loans by asking well-regarded neighbors to vouch for the character of the borrower. Those reports were obviously incredibly subjective, as outlined here. Today that judgment has largely been delegated to statistical scoring models.

Today's aircraft pilots are in charge of take-off and landing, and spend the rest of their time supervising the flight while the computer controls the trajectory of the aircraft. In general, the autopilot is like an additional crew member helping out, freeing the pilot to keep an eye on higher-level things (oil pressure, engine health, or checking for better winds or a smoother altitude). Before autopilot (see here), the job of a pilot was far more exhausting, probably increasing the risk of accidents. Autopilot dates back to the early 1910s, and nowadays we probably can't imagine flying without it.

These are some good examples of decisions that have been successfully delegated to machines. If you read the news, you have probably come across articles about self-driving cars and the hypothetical moral dilemmas they might face, such as deciding whom to run over if the brakes fail. Frankly, I think these scenarios are too far-fetched, and I doubt most humans are able to make good split-second decisions in such situations either.

An interesting question now is whom to believe in critical situations: do you listen to the humans, or do you listen to the machine? That's not easy to answer. For example, the Überlingen mid-air collision could likely have been avoided had the pilots followed the instructions of the automatic collision avoidance system instead of those of ground control.

Another domain to consider is psychological testing for mental health problems. Diagnosing mental health problems reliably is harder than people give professionals credit for. A guy named Paul Meehl did a lot of work in this area and, as early as 1954, postulated that mechanical decision making results in better and more reliable diagnoses than clinical judgment. Meta-analyses comparing the efficiency of clinical and mechanical prediction have added support to the hypothesis that algorithms outperform clinical predictions (1, 2).

Can we leave the supervision of patients to machines? It seems humans aren't doing a good enough job, and the FDA just approved a system for continuous patient monitoring, with the data analyzed and delivered to hospital staff in real time to help prevent unexpected deaths in the hospital. The clinical trials look very promising.

So with all these examples where delegating decisions to machines has already worked out well, at what point should we delegate? What if there are simple systems that match or outperform human decision making in every single instance? What if only most of the time? Should we replace humans whenever machines can do the job and outperform them? Should humans just supervise and intervene? Self-driving cars, anyone? For simplicity, let's consider a simple classification problem, say sick vs. healthy. Even if the machine isn't 100% correct, or it is hard to understand why a particular prediction was made, what if human supervision (or meddling) makes for worse outcomes? At what point would it make sense to have the machine do it all? I don't have a good, final answer to this question, but I think a lot will depend on human perception of the machine's performance.
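To make the worry about meddling concrete, here is a minimal simulation sketch. All the numbers (base rate, accuracies, override rate) are made up for illustration, and the human is assumed to override the machine at random rather than only on the hard cases; under those assumptions, a human who is less accurate than the machine drags the combined accuracy down.

```python
# Minimal sketch (hypothetical numbers): does letting a human override a
# decent classifier help or hurt on a sick-vs-healthy problem?
import random

random.seed(0)

N = 10_000
P_SICK = 0.1            # assumed base rate of "sick"
MACHINE_ACC = 0.92      # assumed accuracy of the machine on any case
HUMAN_ACC = 0.85        # assumed accuracy of the human on any case
OVERRIDE_RATE = 0.3     # assumed fraction of cases the human overrides

machine_correct = 0
combined_correct = 0
for _ in range(N):
    truth = random.random() < P_SICK
    machine = truth if random.random() < MACHINE_ACC else not truth
    human = truth if random.random() < HUMAN_ACC else not truth

    machine_correct += machine == truth

    # Combined policy: on a fraction of cases the human second-guesses the
    # machine and substitutes their own judgment.
    final = human if random.random() < OVERRIDE_RATE else machine
    combined_correct += final == truth

print(f"machine alone:   {machine_correct / N:.3f}")
print(f"machine + human: {combined_correct / N:.3f}")
```

Under these made-up numbers the combined policy lands around 0.90 versus 0.92 for the machine alone; the point is only that a human in the loop is not automatically an improvement.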

I find this very similar to a business trying to hire competent people. Frequently we find ourselves having to hire lawyers, accountants and other professionals for tasks we have very limited understanding of. Sure, you can hire a family lawyer to help prepare a prenuptial agreement, for example. But if you hire a poor lawyer, you won't find out until later, when the agreement is deemed unenforceable in court. Bad lawyers may summarize or explain the law to you in ways that are only slightly more efficient than reading it yourself (while making sure they can't be sued for giving the wrong advice), give impractical advice, and so on. The same goes for any other kind of expert you hire in an area where you have no expertise. I'd say in this day and age we use online reviews, word of mouth, letters of recommendation, the reputation of a college (in hiring situations) and other heuristics to identify the competent experts to entrust our problems to. A lot of hiring in real life is more about track record and personality fit than pure ability. I can see user experience being a factor in picking and trusting an AI as well.

Some people we hire have a fiduciary duty and can be held accountable; others aren't held to such a standard (licensed investment advisors vs. brokers, etc.). If you have your robot assistant gadget order new soap (without specifying which), is it going to order the one that is best for you, or the one with the largest profit margin for the company subsidizing the software and cloud resources? These are questions that will require more discussion in the near future.

I haven't thought this through 100%, but I think it would be useful to refer to the past performance of AI or algorithms. We can measure the performance of AI/prediction systems, compare the AI's performance to that of humans, and then decide. Unlike human decision making, AIs can be interrogated, tested, audited and refined on millions of examples if needed. Statistics gives us the tools to test these systems properly, and if they are found to be better by some margin (how much?), I think they should be used in place of humans. While many people worry about the dangers of leaving decisions to algorithms and AI, it's important to keep in mind the instances where AI and automation are already being used successfully in our lives.
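As a rough sketch of what testing these systems properly could look like: score the AI and the human on the same set of cases and compare them with a paired test. The example below uses McNemar's exact test, which is just a binomial test on the cases where exactly one of the two was right; the arrays are placeholders, not real data.

```python
# Minimal sketch: paired comparison of AI vs. human decisions on the same
# cases. Requires scipy. The lists hold per-case correctness (1 = correct)
# for each decision maker on a shared evaluation set; these are placeholders.
from scipy.stats import binomtest

human_correct = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 1, 1]
ai_correct    = [1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1]

# Discordant pairs: cases where exactly one of the two was right.
ai_only    = sum(1 for h, a in zip(human_correct, ai_correct) if a and not h)
human_only = sum(1 for h, a in zip(human_correct, ai_correct) if h and not a)

# Under the null hypothesis of no difference, the AI wins about half of the
# discordant pairs; McNemar's exact test is a binomial test on that count.
result = binomtest(ai_only, n=ai_only + human_only, p=0.5)
print(f"AI better on {ai_only} cases, human better on {human_only}")
print(f"two-sided p-value: {result.pvalue:.3f}")
```

In practice you would also have to decide up front what margin of improvement, on how many cases, counts as "better enough" to hand the decision over.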
