PyHealth: A Deep Learning Toolkit For Healthcare Applications

Abstract

Thanks to successful modeling techniques and the increasing availability of large medical datasets, deep learning has emerged as a promising tool in healthcare applications. However, the reproducibility of many studies in this field is limited by the lack of accessible code implementations and standard benchmarks. To address this issue, we created PyHealth, a comprehensive library for building, deploying, and validating deep learning pipelines for healthcare applications. PyHealth supports various data modalities, including structured electronic health records (EHRs), physiological signals, medical images, and clinical text. It offers a wide range of advanced deep learning models and maintains a comprehensive medical knowledge base with multiple coding systems. The library is designed to support both machine learning researchers and clinical data scientists. This tutorial will first provide an overview of the entire health analytics pipeline in PyHealth. Then, we will present the different modules and showcase their functionality through hands-on demos. Finally, we will use PyHealth to build a complete clinical predictive model, from data processing to model training and evaluation. During the session, participants can follow along and gain hands-on experience on the Google Colab platform. Our tutorial offers valuable resources, including our website, GitHub repository, documentation, and YouTube playlist. At the time of writing, our GitHub repository has attracted 591 stars and 121 forks, and the library has been downloaded more than 13k times in total.

Publication
ACM SIGKDD 2023