A Hybrid Residual Network and Long Short-Term Memory Method for Peptic Ulcer Bleeding Mortality Prediction
S42: Oral Presentation - Predictive Models in Health Care
Qingxiong Tan, MS 1; Andy Jinhua Ma, PhD 1,2; Huiqi Deng 1,2; Vincent Wai-Sun Wong, MD 3; Yee-Kit Tse, MPhil 3; Terry Cheuk-Fung Yip, MPhil 3; Grace Lai-Hung Wong, MD 3; Jessica Yuet-Ling Ching, MPhil 3; Francis Ka-Leung Chan, MD 3; Pong-Chi Yuen, PhD 1
1 Hong Kong Baptist University, Hong Kong, China
2 Sun Yat-Sen University, Guangzhou, China
3 The Chinese University of Hong Kong, China
Disclosure
I and my spouse/partner have no relevant relationships with commercial interests to disclose.
AMIA 2018 | amia.org
Learning Objectives
After participating in this session, the learner should be better able to:
• Describe some characteristics of electronic health record (EHR) datasets
• Understand the idea of combining different types of clinical data to build models
• Apply deep learning methods to build models for clinical datasets
Background
• Ubiquitous electronic health records (EHRs)
[Figure: EHRs from hospital; EHRs from mobile devices; EHRs data]
• Mortality risk prediction
  • Detect patients at high risk of negative outcomes
  • Evaluate early treatment effects and help design more effective treatment plans
Challenges
• Contain different types of data
  • Static data and dynamic data
  • Correlations between static data and dynamic data
[Figure: static data; dynamic data; more; less]
• Irregular dynamic data
  • Varying time intervals between consecutive records
  • Different numbers of sample points
Existing Methods
• Utilize a single type of data [1, 2]
  × Use either static data or dynamic data as input
  × Input data carries limited valuable information
• Combine static data and dynamic data [3, 4]
  × Simply concatenate static features and dynamic features
  × Ignore correlations between static data and dynamic data
[1] Lipton ZC, Kale DC, Elkan C, et al. Learning to diagnose with LSTM recurrent neural networks. arXiv:1511.03677, 2015.
[2] Choi E, Bahadori MT, Schuetz A, et al. Doctor AI: Predicting clinical events via recurrent neural networks. In: Machine Learning for Healthcare Conference. 2016.
[3] Che Z, Purushotham S, Khemani R, et al. Interpretable deep models for ICU outcome prediction. In: AMIA Annual Symposium Proceedings. American Medical Informatics Association, 2016.
[4] Esteban C, Staeck O, Baier S, et al. Predicting clinical events by combining static and dynamic information using recurrent neural networks. In: Healthcare Informatics (ICHI), 2016 IEEE International Conference on. IEEE, 2016.
Motivations
• Process irregular dynamic data
  • Transform dynamic data into equally spaced series
  • Align dynamic data
• Jointly model static data and dynamic data
  • Take both static data and dynamic data as input
  • Capture correlations between static data and dynamic data
• Fuse different types of features
  • Design a deep feature fusion model to integrate different types of information
Proposed Method
• Propose a hybrid Residual Network (ResNet) and Long Short-Term Memory (LSTM) method to jointly model static data and dynamic data
Data Processing
• Resample dynamic data
  • Transform dynamic data into equally spaced series
  • Detect missing data
  ✓ Missing data can reflect the status of a patient. For example, doctors often arrange fewer examinations for patients in good condition to reduce their pain and expense.
[Figure: original dynamic data; regularly spaced data]
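The resampling step can be illustrated with a minimal sketch. The slides only state that dynamic data is made equally spaced and that missing data is detected; the function name `resample`, averaging the records inside a window, and forward-filling empty windows are all assumptions made here for illustration.

```python
from typing import List, Tuple


def resample(records: List[Tuple[float, float]],
             window: float,
             horizon: float) -> Tuple[List[float], List[int]]:
    """Bin irregular (time, value) records into equally spaced windows.

    Returns (values, missing) with one entry per window. missing[i] is 1
    when window i contained no record. Averaging inside a window and
    forward-filling empty windows are illustrative assumptions.
    """
    n_windows = int(horizon // window)
    sums = [0.0] * n_windows
    counts = [0] * n_windows
    for t, v in records:
        idx = int(t // window)
        if 0 <= idx < n_windows:  # ignore records outside the horizon
            sums[idx] += v
            counts[idx] += 1

    values: List[float] = []
    missing: List[int] = []
    last = 0.0  # fill value before the first observation (assumption)
    for s, c in zip(sums, counts):
        if c > 0:
            last = s / c
            missing.append(0)
        else:
            missing.append(1)
        values.append(last)
    return values, missing


# Hypothetical lab-test records: (months since diagnosis, value)
records = [(1.0, 10.0), (4.0, 14.0), (15.0, 9.0)]
values, missing = resample(records, window=6.0, horizon=24.0)
print(values)   # [12.0, 12.0, 9.0, 9.0]
print(missing)  # [0, 1, 0, 1]
```

The `missing` list is kept separate from the values so that it can later be used as an input feature in its own right.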
Data Processing
• Alignment framework for a large quantity of time series
  • Calculate a template for each type of dynamic data
  • Modify Dynamic Time Warping (DTW) to align each time series to the same length as the template
[Figure: template and original series; template and aligned series]
  ✓ The aligned series has the same length as the template
  ✓ The aligned series still preserves its own important information
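The slides do not describe how DTW was modified, so the following is only a sketch of one plausible way to obtain a series of template length: run textbook DTW, then average the series values that the warping path matches to each template position. The function name `dtw_align` and the averaging rule are assumptions, not the authors' method.

```python
from typing import List


def dtw_align(series: List[float], template: List[float]) -> List[float]:
    """Warp `series` onto `template` and return a series of len(template).

    Uses standard DTW with absolute difference as local cost. Values of
    `series` matched to the same template position are averaged
    (an illustrative assumption). Both inputs must be non-empty.
    """
    n, m = len(series), len(template)
    inf = float("inf")
    # cost[i][j]: minimal cumulative cost aligning series[:i] with template[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(series[i - 1] - template[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])

    # Backtrack along the optimal path, collecting matches per template index.
    matched: List[List[float]] = [[] for _ in range(m)]
    i, j = n, m
    while i > 0 and j > 0:
        matched[j - 1].append(series[i - 1])
        steps = [(cost[i - 1][j - 1], i - 1, j - 1),
                 (cost[i - 1][j], i - 1, j),
                 (cost[i][j - 1], i, j - 1)]
        _, i, j = min(steps)
    return [sum(vals) / len(vals) for vals in matched]


# Hypothetical example: a 5-point series aligned to a 3-point template
aligned = dtw_align([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 3.0, 5.0])
print(len(aligned))  # 3
```

Because the warping path visits every template position at least once, the output always has the template's length, which is the property the slide emphasizes.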
Extracting Features
• Deep ResNet extracts correlation features
  • ResNet has many convolution units
  • Each convolution unit jointly analyzes several variables
  ✓ Explore the influence of adding static data extracted from dynamic data
    • Add the mean value of every type of dynamic data (Static-II)
    • Further add missing data labels (Static-III)
• LSTM extracts temporal features
  ✓ Explore the influence of the width of the moving window
    • 6 months (Frequency-I)
    • 12 months (Frequency-II)
[Figure: spatial features; temporal features]
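As a rough illustration of how the Static-II and Static-III inputs could be assembled, the sketch below appends per-variable means and then missing-data labels to the original static vector. The function name `build_static_inputs`, the variable names `lab_a` and `lab_b`, and the choice to average over the resampled series are hypothetical; the slides only define what each input contains.

```python
from typing import Dict, List, Tuple


def build_static_inputs(static_1: List[float],
                        dynamic: Dict[str, List[float]],
                        missing: Dict[str, List[int]]
                        ) -> Tuple[List[float], List[float]]:
    """Build Static-II and Static-III feature vectors.

    static_1: original static features (Static-I).
    dynamic:  resampled series per lab test.
    missing:  0/1 missing-data labels per lab test, same keys as `dynamic`.
    """
    # Static-II: Static-I plus the mean of every type of dynamic data
    means = [sum(series) / len(series) for series in dynamic.values()]
    static_2 = list(static_1) + means

    # Static-III: Static-II plus the missing-data labels
    labels = [float(flag) for name in dynamic for flag in missing[name]]
    static_3 = static_2 + labels
    return static_2, static_3


# Hypothetical patient with two static features and two lab tests
static_1 = [65.0, 1.0]
dynamic = {"lab_a": [12.0, 12.0, 9.0, 9.0],
           "lab_b": [200.0, 180.0, 180.0, 160.0]}
missing = {"lab_a": [0, 1, 0, 1],
           "lab_b": [0, 0, 1, 0]}

static_2, static_3 = build_static_inputs(static_1, dynamic, missing)
print(static_2)       # [65.0, 1.0, 10.5, 180.0]
print(len(static_3))  # 12
```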
Information Fusion
• Deep feature fusion model
  • Need to fuse two totally different types of features
  • Propose a Multi-Residual Multi-Scale feature fusion model
• Different sizes of convolution units
  ✓ Fuse features at multiple scales
  ✓ Increase the width of the network
• Multi-Residual learning structure
  ✓ More residual information for model training
  ✓ Effectively fuse different features
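The slides give no layer-level detail for the fusion model, so the following is only a toy reading of the idea: concatenate the two feature vectors, pass them through parallel 1-D convolutions of different kernel sizes, and add the branch outputs back onto the input through a skip connection. All names (`conv1d_same`, `multi_scale_residual_block`, `fuse`) and the fixed kernels are assumptions; in a trained network the kernels would be learned.

```python
from typing import List, Sequence


def conv1d_same(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """1-D cross-correlation with zero padding; kernel length must be odd."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]


def relu(x: Sequence[float]) -> List[float]:
    return [max(0.0, v) for v in x]


def multi_scale_residual_block(x: Sequence[float],
                               kernels: Sequence[Sequence[float]]
                               ) -> List[float]:
    """Sum of the identity path and one ReLU(conv) branch per kernel size."""
    out = list(x)  # identity (residual) path
    for kernel in kernels:
        branch = relu(conv1d_same(x, kernel))
        out = [o + b for o, b in zip(out, branch)]
    return out


def fuse(spatial: Sequence[float],
         temporal: Sequence[float],
         kernels: Sequence[Sequence[float]],
         n_blocks: int = 2) -> List[float]:
    """Concatenate both feature vectors and apply stacked fusion blocks."""
    h = list(spatial) + list(temporal)
    for _ in range(n_blocks):
        h = multi_scale_residual_block(h, kernels)
    return h


# Hypothetical features and fixed kernels of sizes 1, 3 and 5
kernels = [[1.0], [0.25, 0.5, 0.25], [0.2, 0.2, 0.2, 0.2, 0.2]]
fused = fuse([0.5, 1.0, -0.5], [2.0, 0.0], kernels)
print(len(fused))  # 5
```

Using several kernel sizes side by side widens the block, and the identity path means each block only has to model a correction to its input.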
Experimental Results
• Dataset: PUB dataset with 6,367 patients (50% train and 50% test; 35 types of static data and 7 types of dynamic lab test results).
• Task: predict whether a patient dies within 10 years after being diagnosed with PUB.

Method                    Input                                   AUC [95% CI]
ResNet                    Static-I                                0.8683 [0.8553 to 0.8813]
ResNet                    Static-II                               0.8937 [0.8821 to 0.9057]
ResNet                    Static-III (Frequency-I)                0.9234 [0.9141 to 0.9331]
ResNet                    Static-III (Frequency-II)               0.9080 [0.8969 to 0.9182]

Method                    Input                                   AUC [95% CI]
LSTM                      Dynamic (Frequency-I)                   0.8524 [0.8488 to 0.8561]
LSTM                      Dynamic (Frequency-II)                  0.8495 [0.8447 to 0.8543]
Logistic Regression                                               0.7099 [0.7034 to 0.7164]
Random Forests                                                    0.6670 [0.6600 to 0.6741]

Hybrid method             Input                                   AUC [95% CI]
Method in reference [3]   Static-III + Dynamic                    0.9173 [0.9077 to 0.9264]
Method in reference [4]   Static-III + Dynamic                    0.9111 [0.9000 to 0.9219]
Our method                Static-III + Dynamic (Frequency-II)     0.9200 [0.9101 to 0.9296]
Our method                Static-III + Dynamic (Frequency-I)      0.9353 [0.9261 to 0.9440]

Static-I: original static data; Static-II: Static-I plus mean values; Static-III: Static-II plus missing data labels.
Frequency-I: resampling frequency is twice a year; Frequency-II: resampling frequency is once a year.
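For readers who want to reproduce this style of reporting, the sketch below computes an AUC from its rank definition and a percentile bootstrap 95% interval. The slides do not say how the confidence intervals were obtained, so the bootstrap is an assumption, and the function names and toy data are illustrative only.

```python
import random
from typing import List, Tuple


def auc(labels: List[int], scores: List[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. Requires at least one positive and one negative.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels: List[int], scores: List[float],
                 n_boot: int = 1000, seed: int = 0) -> Tuple[float, float]:
    """Percentile bootstrap 95% interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # a resample needs both classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[(25 * n_boot) // 1000]
    hi = stats[max(0, (975 * n_boot) // 1000 - 1)]
    return lo, hi


# Toy predictions: 1 = died within the follow-up period, 0 = survived
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # 0.75
low, high = bootstrap_ci(labels, scores, n_boot=200)
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch; a rank-based formula would be the usual choice at the scale of the dataset described above.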
Experimental Results
• Mean values and missing data labels improve the final results
• A narrower resampling window preserves more information
• Fusing correlation information and temporal information is useful
Experimental Results
• Our method outperforms the two state-of-the-art models built using both static data and dynamic data [3, 4].
  ✓ Utilize a deep ResNet to capture correlation information between static data and dynamic data.
  ✓ Design a new deep feature fusion model to fuse totally different types of features.
Summary
• A hybrid ResNet and LSTM method is proposed to jointly model static data and dynamic data and to precisely predict the mortality risk of PUB patients.
• The proposed method achieves better prediction results than the compared methods. It can therefore better evaluate the health status of patients, help doctors design more effective treatment plans, and improve health care quality.
• Future work:
  • Integrate text data, such as diagnosis records, to introduce other important information about patient status.
  • Perform subtyping analysis on larger clinical datasets to build a personalized prediction model for each subgroup.
Thank you!
Email me at: csqxtan@Comp.HKBU.Edu.HK