Evaluations

We evaluate commercial AI products using the TB REACH-CXR Archive, a de-identified evaluation database of 30,957 postero-anterior chest radiographs collected between 2015 and 2020 through four projects funded by the Stop TB Partnership’s TB REACH initiative. The radiographs are privately held for the purposes of independent, external validation of commercial CAD products. These projects and analyses were funded by Global Affairs Canada.

Early user experience and lessons learned using ultra-portable digital X-ray with computer-aided detection (DXR-CAD) products: A qualitative study from the perspective of healthcare providers

This qualitative study assessed early implementers’ experiences and lessons learned when using ultra-portable (UP) DXR systems integrated with CAD software to screen and triage for TB.

Study findings suggest that UP DXR with CAD was well received overall as a way to decentralize radiological assessment for TB; however, the improved portability involved programmatic compromises. The main barriers to uptake included insufficient capacity and a lack of guidance on radiation protection suitable for UP DXR.

User perspectives on the use of X-rays and computer-aided detection for TB

We conducted a survey (using a self-administered questionnaire) of TB project implementers who use X-ray for TB screening (with or without CAD) from September to October 2021.

Despite the limited response rate and the potential bias introduced by only contacting implementers known to the authors, our survey provides insight into experiences of using CAD. Users reported that CAD enabled high-throughput, accurate TB screening with greatly reduced turnaround times, and they showed a clear preference for the abnormality score output.

Computer aided detection of Tuberculosis from chest radiographs in TB prevalence survey: External validation and modelled impacts of commercially available artificial intelligence software

We evaluated 11 CAD products on a case-control sample of 774 participants from a recent TB prevalence survey in South Africa, comparing their area under the receiver operating characteristic curve (AUC) against microbiological evidence. Threshold analyses were performed based on pre-defined selection criteria and across all possible thresholds. We conducted subgroup analyses stratified by age, gender, HIV status, prior TB history, presence of symptoms, and smoking status.
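As an illustration of the AUC comparison described above, here is a minimal sketch in Python. It is not the study's analysis code: the function name `auc` and the toy scores and labels are made up for the example, with label 1 standing for a microbiologically confirmed case and 0 for a reference-negative case. The sketch uses the rank interpretation of the AUC, that is, the probability that a randomly chosen positive case receives a higher CAD score than a randomly chosen negative case.

```python
def auc(scores, labels):
    """AUC as the probability that a random reference-positive case scores
    higher than a random reference-negative case (ties count as 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy, made-up abnormality scores (hypothetical; not study data).
scores = [0.9, 0.8, 0.35, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]  # 1 = reference-positive, 0 = reference-negative
toy_auc = auc(scores, labels)
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a real evaluation would more likely rely on an established statistics library and report confidence intervals.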

Many CAD products performed well, achieving high AUCs, although only one met the WHO target product profile of 90% sensitivity and 70% specificity for a TB triage test.
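To make the target product profile check concrete, the following Python sketch computes sensitivity and specificity at a given threshold score and tests them against the 90% and 70% figures quoted above. The function names, the rule that a score at or above the threshold counts as CAD-positive, and the toy data are all assumptions made for illustration; they are not taken from the study or from any CAD product.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Treat score >= threshold as CAD-positive and compare with the reference."""
    tp = fn = tn = fp = 0
    for s, y in zip(scores, labels):
        flagged = s >= threshold
        if y == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)


def meets_triage_tpp(sens, spec, min_sens=0.90, min_spec=0.70):
    """True if both figures reach the triage-test targets quoted in the text."""
    return sens >= min_sens and spec >= min_spec


# Toy, made-up scores: 10 reference-positive and 10 reference-negative cases.
pos_scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.2]
neg_scores = [0.6, 0.5, 0.45, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1, 0.05]
scores = pos_scores + neg_scores
labels = [1] * 10 + [0] * 10

sens, spec = sensitivity_specificity(scores, labels, threshold=0.5)
ok_at_05 = meets_triage_tpp(sens, spec)
ok_at_07 = meets_triage_tpp(*sensitivity_specificity(scores, labels, 0.7))
```

In this toy example the lower threshold satisfies both targets while the higher one trades too much sensitivity for specificity, which is why the choice of threshold matters when judging a product against the profile.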

Comparing different versions of computer-aided detection products when reading chest X-rays for tuberculosis

We comprehensively compared the performance of the newest versions of two CAD products (CAD4TB and qXR) to their WHO-evaluated predecessors. We used a case-control sample of 12,890 chest X-rays to compare performance and model the programmatic effect of upgrading to the newer versions. We found that both newer versions significantly improved upon their predecessors’ ability to detect TB, performing better than the human readers. We also showed that the AI underlying new software versions can differ remarkably from the old and resemble an entirely new product altogether. We further demonstrated that, unlike updates to laboratory diagnostic tools, CAD software updates could significantly affect the selection of appropriate threshold scores, the number of people with TB detected, and cost-effectiveness. Our results underscore the need for rapid evidence generation to evaluate newer CAD versions in the fast-growing medical AI industry.

Tuberculosis detection from chest x-rays for triaging in a high tuberculosis-burden setting: an evaluation of five artificial intelligence algorithms

We evaluated five commercially available AI algorithms for TB triage using 23,954 chest X-rays collected at TB screening centers in Bangladesh, with Xpert MTB/RIF as the reference standard. All five algorithms outperformed experienced Bangladeshi radiologists and could halve the number of Xpert tests required while keeping sensitivity above 90%. All algorithms performed worse in older age groups and among people with a history of TB. Finally, we proposed a new evaluation framework for CAD to better inform the selection of threshold scores.

This evaluation showed that CAD can be a highly accurate triage tool for TB.
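The trade-off between sensitivity and the number of confirmatory tests can be sketched as follows. This Python example is an illustration only, not the evaluation framework proposed in the study: the function name, the assumption that only people scoring at or above the threshold are sent for Xpert testing, and the toy data are all hypothetical. It picks the highest threshold whose sensitivity still meets a target and reports the share of Xpert tests that would be averted.

```python
def triage_at_target_sensitivity(scores, labels, target_sens=0.90):
    """Return (threshold, sensitivity, fraction of tests averted) for the
    highest threshold whose sensitivity is at least target_sens."""
    n_pos = sum(1 for y in labels if y == 1)
    if n_pos == 0:
        raise ValueError("need at least one reference-positive case")
    # Walk thresholds from high to low; sensitivity can only rise as we go.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        sens = tp / n_pos
        if sens >= target_sens:
            # People scoring below the threshold would not be sent for Xpert.
            averted = sum(1 for s in scores if s < t) / len(scores)
            return t, sens, averted
    raise ValueError("target sensitivity cannot be reached")


# Toy, made-up scores: 10 Xpert-positive and 10 Xpert-negative people.
pos_scores = [0.95, 0.9, 0.85, 0.8, 0.75, 0.7, 0.65, 0.6, 0.55, 0.2]
neg_scores = [0.6, 0.5, 0.45, 0.4, 0.3, 0.25, 0.2, 0.15, 0.1, 0.05]
scores = pos_scores + neg_scores
labels = [1] * 10 + [0] * 10

threshold, sens, averted = triage_at_target_sensitivity(scores, labels)
```

The toy sample is balanced between positives and negatives, which real screening populations are not, so the averted fraction here says nothing about what a programme would see in practice.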

A new resource on artificial intelligence powered computer automated detection software products for tuberculosis programmes and implementers

We conducted a landscape analysis to collect information from developers known to have, or soon to have, a CAD product for TB. We identified 27 CAD developers, 11 of whom completed our survey with details about the certification, deployment, operational characteristics, input requirements, output format, pricing, and data privacy of their latest product version. For each response, a summary product profile was created based on the information provided, and these profiles were published on an open-access website: ai4hlth.org.

CAD products are constantly being improved, and the site will be updated continuously to reflect product updates and new products. This unique online resource aims to inform the TB community about available CAD tools and enable TB programmes to identify the most suitable product to incorporate in interventions.

Using artificial intelligence to read chest radiographs for tuberculosis detection: A multi-site evaluation of the diagnostic accuracy of three deep learning systems

We conducted a retrospective evaluation of three CAD systems (CAD4TB, Lunit INSIGHT, and qXR) for detecting TB-associated abnormalities in chest radiographs from outpatients in Nepal and Cameroon. All 1,196 individuals received an Xpert MTB/RIF assay and a CXR that was read by two groups of radiologists and by the CAD systems. Xpert was used as the reference standard. When matched to the sensitivity of the radiologists, the specificities of all but one of the DL systems were significantly higher.
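The idea of comparing specificity at matched sensitivity can be sketched in Python as below. This is a simplified illustration under stated assumptions, not the study's method: the function names, the binary radiologist calls, and the toy data are hypothetical, and the sketch omits the statistical testing needed to call a difference significant. It first measures the reader's sensitivity and specificity, then finds the highest CAD threshold that is at least as sensitive and reports the CAD specificity there.

```python
def sens_spec(calls, labels):
    """Sensitivity and specificity of binary calls against reference labels."""
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    tp = sum(1 for c, y in zip(calls, labels) if c and y == 1)
    tn = sum(1 for c, y in zip(calls, labels) if not c and y == 0)
    return tp / n_pos, tn / n_neg


def cad_at_matched_sensitivity(scores, labels, reader_calls):
    """Find the highest CAD threshold at least as sensitive as the reader."""
    reader = sens_spec(reader_calls, labels)
    for t in sorted(set(scores), reverse=True):
        cad = sens_spec([s >= t for s in scores], labels)
        if cad[0] >= reader[0]:
            return {"threshold": t, "reader": reader, "cad": cad}
    raise ValueError("CAD cannot match the reader's sensitivity")


# Toy, made-up data: 5 Xpert-positive and 5 Xpert-negative people.
labels = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.2, 0.65, 0.4, 0.3, 0.2, 0.1]
reader_calls = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]  # hypothetical radiologist reads

result = cad_at_matched_sensitivity(scores, labels, reader_calls)
```

In the toy data the CAD scores reach the reader's sensitivity at a threshold that yields fewer false positives, which is the pattern the comparison is designed to reveal.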

Using CAD systems to read CXRs could reduce the number of Xpert MTB/RIF tests needed by 66% while maintaining sensitivity at 95% or better. CAD should be considered by TB programs where human resources are constrained and automated technology is available.