Pulmonologists collaborate with explainable artificial intelligence for improved diagnostic interpretation of pulmonary function tests.

Authors

Nilakash Das, Sofie Happaerts, Iwein Gyselinck, Eric Derom, Mustapha Abdo, Guy Brusselle, Felip Burgos, Marco Contoli, Anh Tuan Dinh-Xuan, Frits M.E. Franssen, Sherif Gonem, Neil Greening, Christel Haenebalcke, William D-C. Man, Jorge Moisés, Rudi Peche, Vitalii Poberezhets, Jennifer K. Quint, Michael C. Steiner, Eef Vanderelst, Marko Topalovic, and Wim Janssens

Rationale

Few studies have investigated the collaborative potential between artificial intelligence (AI) and a pulmonologist for diagnosing disease. We hypothesized that a collaboration between a pulmonologist and an AI with explanations (explainable AI, XAI) would be superior to an individual pulmonologist without support in the diagnostic interpretation of pulmonary function tests (PFTs).

Methods

The study was conducted in two phases, a single-centre P1 and a multi-centre P2. Each phase utilized two different sets of 24 PFT reports with gold-standard diagnoses. Each PFT was interpreted first without (control) and then with the XAI's suggestions (intervention). Pulmonologists provided a preferential diagnosis and up to four differential diagnoses. The primary endpoint compared the accuracy of preferential and differential diagnoses between groups. Secondary endpoints were the number of differential diagnoses, diagnostic confidence, and inter-rater agreement. The XAI model built on an earlier AI-based PFT interpretation model, for which Shapley values expressed the reasoning behind its predictions.
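To illustrate how Shapley values can express the reasoning behind a prediction, the following is a minimal sketch that computes exact Shapley values for a toy scoring function. It is not the study's model: the feature names (`fev1_fvc`, `tlc`, `dlco`), the weights, and the function `toy_score` are hypothetical, chosen only so the attributions can be checked by hand. Real models with many inputs typically rely on approximations rather than full enumeration.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values by enumerating every coalition of the other features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of f when added to coalition s.
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


# Hypothetical PFT-style inputs, for illustration only.
FEATURES = ["fev1_fvc", "tlc", "dlco"]


def toy_score(present):
    """Toy model score given the set of features that are 'known' (assumption)."""
    score = 0.0
    if "fev1_fvc" in present:
        score += 0.5
    if "dlco" in present:
        score += 0.2
    # An interaction term: its credit is shared between both features.
    if "fev1_fvc" in present and "tlc" in present:
        score += 0.1
    return score


phi = shapley_values(toy_score, FEATURES)
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the score with all features and the score with none, which is the property that makes them readable as a per-feature explanation of a single prediction.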

Results

In P1 (N=16 pulmonologists), mean preferential and differential diagnostic accuracy significantly increased by 10.4% and 9.4%, respectively, between groups (p<0.001). Improvements were highly significant (p<0.0001) in P2 (5.4% and 8.7%, respectively, N=62). In both phases, the number of differential diagnoses did not decrease, but diagnostic confidence and inter-rater agreement increased significantly in the intervention group. Pulmonologists updated their decisions in half of the cases, and consistently improved on their baseline performance when the AI provided correct predictions.
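As a rough sketch of how accuracy gains of this kind could be tallied, the snippet below scores each reading against the gold-standard diagnosis and compares mean accuracy between control and intervention. The readings are made up for illustration and are not study data. The scoring rule is also an assumption: a differential diagnosis is counted as correct when the gold-standard diagnosis appears either as the preferential diagnosis or among the listed differentials.

```python
from statistics import mean


def score_reading(gold, preferential, differentials):
    """Return (preferential_correct, differential_correct) for one reading."""
    pref_ok = preferential == gold
    # Assumption: a differential hit means the gold diagnosis is the
    # preferential diagnosis or appears among the listed differentials.
    diff_ok = pref_ok or gold in differentials
    return pref_ok, diff_ok


def accuracy(readings):
    """Mean preferential and differential accuracy over a list of readings."""
    scores = [score_reading(*r) for r in readings]
    pref = mean(1.0 if p else 0.0 for p, _ in scores)
    diff = mean(1.0 if d else 0.0 for _, d in scores)
    return pref, diff


# Made-up readings: (gold standard, preferential, differentials).
control = [
    ("COPD", "COPD", ["asthma"]),
    ("ILD", "COPD", ["ILD"]),
    ("asthma", "COPD", []),
    ("healthy", "asthma", ["COPD"]),
]
intervention = [
    ("COPD", "COPD", ["asthma"]),
    ("ILD", "ILD", ["COPD"]),
    ("asthma", "COPD", ["asthma"]),
    ("healthy", "asthma", ["COPD"]),
]

control_acc = accuracy(control)
intervention_acc = accuracy(intervention)
# Absolute change in (preferential, differential) accuracy.
gain = tuple(i - c for c, i in zip(control_acc, intervention_acc))
print(control_acc, intervention_acc, gain)
```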

Conclusion

A collaboration between a pulmonologist and XAI is superior to an individual pulmonologist at interpreting PFTs.