Source: siliconcanals.com
Millions of diagnostic examinations are carried out every year, and chest X-rays play a vital role in diagnosing several diseases. Their usefulness can be limited, however, by challenges in interpretation, which requires rapid and thorough evaluation of a 2D image depicting complex 3D organs and disease processes. Important details are sometimes missed on chest X-rays, resulting in adverse outcomes for patients.
Recent efforts have improved lung cancer detection in radiology, differential diagnosis in dermatology, and prostate cancer grading in pathology. However, obtaining accurate clinical labels for deep learning models for X-ray interpretation remains a challenge.
Most efforts have either applied rule-based natural language processing to radiology reports or relied on image review by individual readers. Both approaches can introduce inconsistencies that are especially problematic during model evaluation.
Deep learning models to resolve challenges!
To address this, researchers at Google devised artificial intelligence models to spot four findings on human chest X-rays. Advances in machine learning present an opportunity to create new tools to help experts interpret medical images. The deep learning models for chest radiograph interpretation were published in the journal Radiology.
The team developed deep learning models for four important clinical findings: pneumothorax (collapsed lung), nodules and masses, airspace opacities (filling of the pulmonary tree with material), and fractures. These were chosen in consultation with clinical colleagues and radiologists to focus on conditions that are critical for patient care.
These deep learning models were evaluated using several thousand held-out images from the datasets, for which high-quality labels were collected using a panel-based adjudication process among board-certified radiologists. The held-out images were also reviewed independently by separate radiologists as a check on accuracy.
The team used more than 600,000 images sourced from two de-identified datasets. The first was developed with co-authors at Apollo Hospitals and contains a diverse set of chest X-rays gathered over several years from locations across the hospital network. The second, known as ChestX-ray14, was released publicly by the National Institutes of Health and has served as a vital resource for machine learning efforts. However, it has limitations related to the accuracy and clinical interpretation of its available labels.
High-quality reference standard labels
To generate high-quality reference standard labels for model evaluation, the team used a panel-based adjudication process. In this process, three radiologists reviewed all final tune and test set images and resolved disagreements through discussion. This allowed difficult findings that were initially detected by only a single radiologist to be identified and documented. The discussions took place anonymously via an online discussion and adjudication system.
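The first step of such a process, separating images on which all readers agree from those that need discussion, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system the team used: the function name `split_by_agreement`, the image ids, and the reads are all hypothetical.

```python
def split_by_agreement(reads):
    """Separate unanimous reads from those that need panel discussion.

    `reads` maps an image id to the list of labels (True = finding present)
    given independently by each radiologist. Hypothetical helper.
    """
    agreed = {}
    needs_discussion = []
    for image_id, labels in reads.items():
        if len(set(labels)) == 1:
            # Every reader gave the same label, so it is accepted as-is.
            agreed[image_id] = labels[0]
        else:
            # At least one reader disagreed, so the image goes to discussion.
            needs_discussion.append(image_id)
    return agreed, needs_discussion


# Made-up reads from three radiologists for three images.
reads = {
    "img_001": [True, True, True],
    "img_002": [False, False, False],
    "img_003": [True, False, False],  # finding seen by only one reader
}
agreed, needs_discussion = split_by_agreement(reads)
print(agreed)
print(needs_discussion)
```

In this sketch, an image such as `img_003` is not settled by majority vote. It is flagged so the panel can discuss it, which is how a finding seen by a single reader can still end up in the final label.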
Google notes that while the models achieved expert-level accuracy overall, performance varied across the corpora. For instance, radiologists' sensitivity for detecting pneumothorax was nearly 79% on ChestX-ray14 images but just 52% for the same radiologists on the other dataset.
The team hopes to lay the groundwork for better methods by openly releasing a corpus of adjudicated labels for the ChestX-ray14 dataset. It comprises 2,412 training and validation set images and 1,962 test set images, or 4,374 images in total.
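Sensitivity here is the share of truly positive cases that a reader or model flags as positive. The short Python sketch below shows the calculation. The function name `sensitivity` and the example labels are made up for illustration and are not taken from the study.

```python
def sensitivity(reference, predicted):
    """Return true positives / (true positives + false negatives)."""
    pairs = list(zip(reference, predicted))
    true_pos = sum(1 for ref, pred in pairs if ref and pred)
    false_neg = sum(1 for ref, pred in pairs if ref and not pred)
    if true_pos + false_neg == 0:
        raise ValueError("reference contains no positive cases")
    return true_pos / (true_pos + false_neg)


# Made-up example: four images truly show the finding, the reader catches three.
reference = [True, True, True, True, False, False]
predicted = [True, True, True, False, False, True]
print(sensitivity(reference, predicted))  # 3 / 4 = 0.75
```

Only cases that are positive in the reference standard enter the calculation, so the false alarm on the last image does not change the result.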