High-Quality FVC Through Automated AI-based Detection of Early Termination in Spirometry: A Comparative Analysis
Rationale: Accurate forced vital capacity (FVC) measurements are essential for correct interpretation of spirometry. The updated ATS/ERS technical standards for spirometry require at least one of three end-of-forced expiration (EOFE) criteria to be met. Repeatable FVC in the absence of early termination is one way to meet the EOFE acceptability criteria. However, two FVC measures can be repeatable yet both underestimate FVC if early termination occurs consistently, leading to an overestimated FEV1/FVC that may mask airflow obstruction. Additionally, underestimated FVCs may falsely suggest a restrictive impairment. At present there is no analytic definition of early termination, and the decision is left to subjective reviewer judgement.
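The effect of a truncated exhalation on the ratio can be illustrated with a small calculation. The sketch below uses purely hypothetical volumes (not data from this study) and a fixed 0.70 cutoff chosen only for illustration; lower-limit-of-normal thresholds are also used in practice. The function name `fev1_fvc_ratio` is likewise an assumption made for this example.

```python
def fev1_fvc_ratio(fev1_litres: float, fvc_litres: float) -> float:
    """Return FEV1/FVC for one manoeuvre."""
    if fvc_litres <= 0:
        raise ValueError("FVC must be positive")
    return fev1_litres / fvc_litres


# Hypothetical values for illustration only.
FEV1 = 2.0            # litres; largely unaffected by how the blow ends
FVC_COMPLETE = 3.2    # litres; exhalation continued to a plateau
FVC_TRUNCATED = 2.7   # litres; same subject, exhalation stopped early
CUTOFF = 0.70         # illustrative fixed threshold (assumption)

ratio_complete = fev1_fvc_ratio(FEV1, FVC_COMPLETE)
ratio_truncated = fev1_fvc_ratio(FEV1, FVC_TRUNCATED)

print(f"complete exhalation:  FEV1/FVC = {ratio_complete:.3f}")
print(f"early termination:    FEV1/FVC = {ratio_truncated:.3f}")
print("below cutoff (complete): ", ratio_complete < CUTOFF)
print("below cutoff (truncated):", ratio_truncated < CUTOFF)
```

With these assumed numbers, the complete manoeuvre falls below the cutoff while the truncated one does not, which is the mechanism by which a consistently underestimated FVC could hide obstruction.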
Methods: We reanalyzed a dataset containing 32 spirometry sessions with flow-volume and volume-time graphs from a total of 126 individual manoeuvres. This dataset represented healthy subjects, as well as COPD, asthma and ILD patients. The graphs included a range of characteristics, from complete exhalation to very obvious early termination. Two independent reviewers (Experts A and B) analysed each graph and indicated which showed early termination. In parallel, an artificial intelligence tool (ArtiQ.QC, v1.6.0) was used to automatically generate labels of abrupt termination. We assessed agreement in the assigned labels between the two reviewers, and between each reviewer and ArtiQ.QC, using Cohen’s kappa.
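For readers unfamiliar with the agreement statistics named above, the following is a minimal sketch of how percent agreement and Cohen's kappa can be computed from two raters' labels. It is not the authors' analysis code; the function names and the ten toy labels are assumptions made for illustration and are not taken from the study dataset.

```python
from collections import Counter
from typing import Hashable, Sequence


def percent_agreement(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Fraction of items on which the two raters gave the same label."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("label sequences must be non-empty and equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohens_kappa(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal label frequencies.
    labels = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[k] * counts_b[k] for k in labels) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one label")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Toy labels (hypothetical): 1 = early termination, 0 = complete exhalation.
reviewer_a = [1, 1, 0, 0, 0, 0, 1, 0, 0, 0]
reviewer_b = [1, 0, 0, 0, 1, 0, 1, 0, 0, 0]

agreement = percent_agreement(reviewer_a, reviewer_b)
kappa = cohens_kappa(reviewer_a, reviewer_b)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

Because kappa subtracts the agreement expected by chance, it can be modest even when raw percent agreement looks high, particularly when one label (here, complete exhalation) dominates.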
Results: Expert A and Expert B identified early termination in 26% (N=33) and 29% (N=37) of the graphs, respectively. The inter-reader agreement was fair, with a Cohen’s kappa of 0.31 (70% agreement). ArtiQ.QC detected early termination in 26% of the graphs (N=32), with agreement of 72% with Expert A and 81% with Expert B. Notably, 19% (N=6/32) of the graphs with early termination detected by ArtiQ.QC would have met EOFE criteria (repeatable FVC) according to the 2019 standards.
Conclusions: These findings demonstrate the limitations of manual over-reading of spirometry and the inconsistencies in human quality control. An abrupt termination can easily be missed, potentially leading to underestimated FVCs. A clear identification of abrupt termination that can be applied prospectively will help improve the validity of measurements. ArtiQ.QC agreed with each expert over-reader at least as often as the experts agreed with each other, and it applies its criteria consistently. With instant feedback, operators can choose to conduct additional trials, paying closer attention to ensuring complete exhalation. ArtiQ.QC can improve the quality and reliability of spirometry, ultimately contributing to more accurate respiratory disease diagnoses, tailored treatment strategies and data quality in clinical research.
Authors: Paul Desbordes1, Benoit Cuyvers1, Marko Topalovic1, Dr Brian Graham2, Dr Sanja Stanojevic3
Affiliations:
1 ArtiQ NV, Leuven, Belgium
2 Division of Respirology, Critical Care and Sleep Medicine, University of Saskatchewan, Saskatoon, SK, Canada
3 Department of Community Health and Epidemiology, Dalhousie University, Halifax, Nova Scotia, Canada