Text Mining to Improve the Health of Millions of Citizens
- by 7wData
Doctors face daily decisions about the best care for their patients, and their own clinical experience can be enhanced using evidence-based medicine, such as through clinical trial data. As David Tovey, Editor-in-Chief, Cochrane, explained, “Before evidence-based medicine came along, people were reliant on the expertise of a doctor, the level of knowledge or understanding that he or she had. And this meant that treatments frequently took many, many years to come from research into practice.”
One of the most robust ways of synthesizing research evidence across healthcare trials is through a systematic review. This involves finding, examining, and analyzing clinical trial data and research reports in a methodical way, to pull together high-quality summaries of how effective healthcare interventions are. This provides critical evidence to decision-makers at the international, national and local levels, to make sure citizens receive the medical and social care they deserve. While this is a rigorous approach, it can take up to three years to produce a major systematic review, which limits our ability to use up-to-date research to guide decision-making.
Cochrane is a not-for-profit organization that creates, publishes and maintains systematic reviews of healthcare interventions, with more than 37,000 contributors working in 130 countries. The Cochrane Transform Project is using AI and machine learning to text mine thousands of reports and automatically select the ones to include in systematic reviews. This saves weeks of monotonous work, freeing up the expert reviewers to spend their time and energy on high-level analysis. Researchers at University College London are using Azure Machine Learning to develop and deploy their text mining classifiers as a cloud service at scale, customized for different clinical assessment groups, in ways that were previously impossible. This is helping to make decisions around healthcare interventions faster and more accurate for millions of people around the world.
The evidence pipeline developed by Cochrane is a ‘surveillance’ system that helps Cochrane find relevant research as soon as it is published. Research enters the pipeline through routine and specified searches of the health and social care literature and is then classified using machine learning. The three key types of classifier are grouped per:
The first stage in the pipeline is to identify research studies that are Randomized Controlled Trials (RCTs), so that we can filter out irrelevant studies quickly. To build these classifiers, a training dataset was created using Cochrane Crowd, a citizen science platform that enables anyone to contribute by helping to categorize medical research. A classifier was built using more than 300,000 records from Cochrane Crowd, including over 30,000 clinical trials. Between 60% and 80% of the studies receive a classifier score of less than 0.1. If we trust the machine and automatically exclude these citations, we are still left with 99.897% of the RCTs (i.e. we lose about 0.1% of the trials but make significant gains in terms of manual workload reduction).
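The trade-off described above, excluding low-scoring citations while keeping nearly all RCTs, can be illustrated with a minimal Python sketch. This is not Cochrane's implementation: the `Citation` record, the function names, and the toy data are all hypothetical, and the only value taken from the article is the 0.1 score cut-off.

```python
from dataclasses import dataclass


@dataclass
class Citation:
    title: str
    score: float   # hypothetical classifier score: estimated likelihood of being an RCT
    is_rct: bool   # reference label, e.g. from human screening


def triage(citations, threshold=0.1):
    """Split citations into those kept for manual screening and those auto-excluded."""
    kept = [c for c in citations if c.score >= threshold]
    excluded = [c for c in citations if c.score < threshold]
    return kept, excluded


def recall_and_workload_reduction(citations, threshold=0.1):
    """Return (share of true RCTs retained, share of all records auto-excluded)."""
    kept, excluded = triage(citations, threshold)
    total_rcts = sum(c.is_rct for c in citations)
    kept_rcts = sum(c.is_rct for c in kept)
    recall = kept_rcts / total_rcts if total_rcts else 1.0
    reduction = len(excluded) / len(citations) if citations else 0.0
    return recall, reduction


# Toy data, invented purely for illustration.
sample = [
    Citation("Trial A", 0.95, True),
    Citation("Trial B", 0.40, True),
    Citation("Case report C", 0.03, False),
    Citation("Editorial D", 0.01, False),
    Citation("Cohort study E", 0.08, False),
]

recall, reduction = recall_and_workload_reduction(sample)
print(f"RCTs retained: {recall:.1%}, records auto-excluded: {reduction:.1%}")
```

On a real labelled set, the first number corresponds to the share of RCTs retained (99.897% in the article) and the second to the share of records that no longer need manual screening (60-80% in the article).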
Azure Machine Learning is used to provide text mining AI capabilities to speed up reviewing of clinical trial reports and research papers on healthcare interventions. The team easily moved their existing research methods in R to the cloud with Azure ML. A key advantage is that they can quickly create customized ML models for different end-users, e.g. groups looking at different clinical/medical conditions. “We’ve got a series of different classifiers which are running up on the Azure Machine Learning platform, where we prospectively narrow the scope of what a particular citation is looking at. We have a study type classifier – the RCT classifier.”