Abstract:
Cardiac auscultation is a critical component of most primary care examinations; however, this screening procedure is currently infeasible in telehealth settings because it requires that physicians be physically co-located with patients in order to operate a stethoscope. We address this gap with EarSteth — a system that leverages consumer-grade active noise-cancelling earbuds to reconstruct cardiac auscultation audio. The system processes audio captured by the earbuds’ inner microphone with a CNN-based model architecture called EarStethNet that is specifically designed to reconstruct audio similar to what would be produced by a digital stethoscope during cardiac auscultation. We trained EarStethNet using synchronous audio collected from 15 healthy adult participants using an earbud and a digital stethoscope. We then evaluated EarStethNet’s outputs in terms of spectral similarity and accuracy of common cardiac cycle timing metrics. When comparing our approach to two state-of-the-art super-resolution models, we found that our EarStethNet achieved a mean spectral distance that was 0.5~dB lower, was preferred by clinicians, and achieved a mean absolute error in estimating interbeat interval that was 24~ms smaller compared to the best-performing baseline model.