Advancing AI in health care: it’s all about trust
Three years ago, artificial intelligence pioneer Geoffrey Hinton said, “We should stop training radiologists now. It’s just completely obvious that within five years, deep learning is going to do better than radiologists.”
Today, hundreds of startup companies around the world are trying to apply deep learning to radiology. Yet the number of radiologists who have been replaced by AI is approximately zero. (In fact, there is a worldwide shortage of them.)
At least in the short term, that number is likely to remain unchanged. Radiology has proven harder to automate than Hinton, and many others, imagined. The same is true of medicine in general. There are many proofs of concept, such as automated diagnosis of pneumonia from chest X-rays, but surprisingly few cases in which deep learning (a machine learning technique that is currently the dominant approach to AI) has achieved the transformations and improvements so often promised.
To begin with, the laboratory evidence for the effectiveness of deep learning is not as sound as it might seem. Positive results, when machines using AI outdo their human counterparts, tend to get considerable media attention, while negative results, when machines don't do as well as humans, are rarely reported in academic journals and get even less media coverage.
Meanwhile, a growing body of literature shows that deep learning is fundamentally vulnerable to "adversarial attacks," and is often easily fooled by spurious associations. An overturned school bus, for example, might be mistaken for a snowplow if it happens to be surrounded by snow. With a few pieces of tape, researchers altered a stop sign so that a deep learning system mistook it for a speed limit sign. While these sorts of problems have become well known in the machine learning community, their implications are less well understood within medicine.
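To make the notion of an adversarial attack concrete, here is a minimal sketch of the fast gradient sign method (FGSM), one standard way such fooling inputs are constructed. The model, image, and label here are placeholders, not anything from a deployed medical system:

```python
import torch
import torch.nn.functional as F

def fgsm_attack(model, image, label, epsilon=0.01):
    """Nudge each pixel of `image` in the direction that most increases
    the classifier's loss, so a correctly classified input can flip to a
    wrong label while looking essentially unchanged to a human."""
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()  # keep pixel values valid
```

A perturbation this small is typically imperceptible to a human reading the image, which is precisely what makes the vulnerability worrying in a clinical setting.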
For example, deep-learning algorithms trained on X-ray images to make diagnostic decisions can easily detect which imaging machine was used to make the images. Consider this hypothetical situation: two different models of X-ray machine are used in a hospital, one portable, one installed in a fixed location. Patients who are bedridden due to their conditions must be imaged at the bedside using the portable machine. That means the choice of machine becomes correlated with the presence of the condition. And since the AI algorithm is highly sensitive to which machine was used, it may inadvertently mistake machine-specific information for information about the underlying condition. The same algorithm applied in a hospital that always uses the portable machine may produce confounded decisions.
In truth, deep learning is deep only in a narrow, technical sense — how many “layers” of quasi-neurons are used in a neural network — not in a conceptual sense. Deep-learning systems excel at finding associations within the training data, but have no ability to differentiate what is causally relevant from what is accidentally correlated, like fuzz on an imaging device. Spurious associations can wind up being heavily over-weighted.
In diagnosing skin cancer from images, for example, a dermatologist might use a ruler to size a lesion only if he or she suspects it is cancerous. In this way, the presence of a ruler becomes associated with a cancer diagnosis in the image data. An AI algorithm may well leverage this association, rather than the visual appearance of the lesion itself, when deciding whether a lesion is cancerous. But rulers don't cause cancer, and a system that relies on them can easily be misled.
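A toy simulation makes this failure mode concrete. Everything below is invented for illustration: synthetic data in which a shortcut feature (the portable machine, or equally the ruler) correlates with the diagnosis during training but not in deployment.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 2000

# Ground truth: does the patient have the condition?
sick = rng.integers(0, 2, n)

# A weak genuine signal from the image itself...
image_signal = sick + rng.normal(0.0, 2.0, n)

# ...and a strong spurious one: bedridden (mostly sick) patients are
# imaged with the portable machine, so "portable" nearly encodes the
# diagnosis in the training data.
portable = (rng.random(n) < np.where(sick == 1, 0.9, 0.1)).astype(float)

X_train = np.column_stack([image_signal, portable])
clf = LogisticRegression().fit(X_train, sick)
print("training accuracy:", clf.score(X_train, sick))

# A second hospital that uses the portable machine for everyone: the
# shortcut feature is now constant and carries no information.
sick2 = rng.integers(0, 2, n)
X_test = np.column_stack([sick2 + rng.normal(0.0, 2.0, n), np.ones(n)])
print("accuracy at portable-only hospital:", clf.score(X_test, sick2))
```

On data like this, the classifier leans on the shortcut feature and looks impressive in training, then degrades badly as soon as the correlation breaks, which is exactly the confounded behavior described above.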
Radiology is not just about images.