Identification of Severe Acute Exacerbations of Chronic Obstructive Pulmonary Disease Subgroups by Machine Learning Implementation in Electronic Health Records

Original Research

Huan Li, MS¹; John Huston, MD¹; Jana Zielonka, MD¹; Shannon Kay, MD, MS¹; Maor Sauler, MD¹; Jose Gomez, MD, MS¹,²

Author Affiliations

¹ Pulmonary, Critical Care and Sleep Medicine Section, Yale University, New Haven, Connecticut, United States
² Center for Precision Pulmonary Medicine, Yale University, New Haven, Connecticut, United States

Address correspondence to:

Jose Gomez, MD
Yale University
300 Cedar Street (S419 TAC)
New Haven, CT 06520-8057
Phone: (203) 785-4163
Email: jose.gomez-villalobos@yale.edu

Abstract

Rationale: Acute exacerbations of chronic obstructive pulmonary disease (AECOPDs) are heterogeneous. Machine learning (ML) has previously been used to dissect some of the heterogeneity in COPD. The widespread adoption of electronic health records (EHRs) has led to the rapid accumulation of large amounts of patient data as part of routine clinical care. However, it is unclear whether the implementation of ML in EHR-derived data has the potential to identify subgroups of AECOPD.

Objectives: To determine whether ML implementation using EHR data from severe AECOPDs requiring hospitalization identifies relevant subgroups.

Methods: This study used 2 retrospective cohorts of patients with AECOPDs (non-COVID-19 and COVID-19) treated at Yale-New Haven Hospital. K-means clustering was used to identify patient subgroups.

Measurements and Main Results: We identified 3 subgroups in the non-COVID cohort (n=1736). Each subgroup had distinct clinical characteristics. The reference subgroup was the largest (n=904), followed by cardio-renal (n=548) and eosinophilic (n=284). The eosinophilic subgroup had milder severity of AECOPD, including a shorter hospital stay (p<0.01). The cardio-renal subgroup had the highest mortality during (5%) and in the year after hospitalization (30%). Validation of the severe AECOPD classifier in the COVID-19 cohort recapitulated the characteristics seen in the non-COVID cohort. AECOPD subgroups in the COVID-19 cohort had different interleukin (IL)-1 beta, IL-2R, and IL-8 levels (false discovery rate ≤ 0.05). These specific leukocyte and cytokine profiles resulted in inflammatory differences between the AECOPD subgroups based on C-reactive protein levels.

Conclusions: Incorporating ML with EHR data allows the identification of specific clinical and biological subgroups for severe AECOPD.

Citation

Citation: Li H, Huston J, Zielonka J, Kay S, Sauler M, Gomez J. Identification of severe acute exacerbations of chronic obstructive pulmonary disease subgroups by machine learning implementation in electronic health records. Chronic Obstr Pulm Dis. 2024; 11(6): 611-623. doi: http://doi.org/10.15326/jcopdf.2024.0556

Running Head: Phenomapping Severe COPD Exacerbations

Funding Support: R01 HL153604, and R03 HL154275 to JLG. This publication was made possible by CTSA Grant Number UL1 TR000142 from the National Center for Advancing Translational Science (NCATS), a component of the National Institutes of Health (NIH). National Heart, Lung, and Blood Institute, 2T32HL007778-26 to support JZ and SK. R01 HL155948 to MS. Manuscript contents are solely the responsibility of the authors and do not necessarily represent the official view of the NIH.

Date of Acceptance: October 13, 2024 | Publication Online Date: October 17, 2024

Abbreviations: ABG=arterial blood gas; AECOPD=acute exacerbations of COPD; AIC=Akaike information criterion; BMI=body mass index; BUN=blood urea nitrogen; CRP=C-reactive protein; COPD=chronic obstructive pulmonary disease; EHR=electronic health record; FDR=false discovery rate; HF=heart failure; HR=hazard ratio; ICS/LABA=inhaled corticosteroid/long-acting beta2-agonist combination; ICU=intensive care unit; IL=interleukin; ML=machine learning; pro-BNP=probrain natriuretic peptide; T2=type 2

Online Supplemental Material: Read Online Supplemental Material (457KB)

Introduction

Chronic obstructive pulmonary disease (COPD) is heterogeneous.^1-3 Factors involved in heterogeneity include clinical characteristics, distinct pathobiological characteristics, including types of inflammation, genetic factors, and treatment response. The emergence of the concept of endotypes has led to the development of novel disease classification models.⁴ Acute exacerbations of COPD (AECOPDs) also exhibit this heterogeneity, which can be related to the baseline characteristics of the subgroups or to the triggers for exacerbations.⁵

Severe AECOPDs that require hospitalization are associated with significant morbidity and mortality, as well as substantial health care expenses.⁶ Furthermore, the Centers for Medicare and Medicaid Services has established a Hospital Readmission Reduction Program that penalizes hospitals with high readmission rates for COPD.⁷ Together, these factors underscore the impact of severe AECOPDs on patients and the health care system, and the importance of understanding the heterogeneity associated with severe exacerbations.

Several studies have shown the ability of machine learning (ML) methods to identify discrete groups in COPD. COPD subgroups have been identified by: (1) cytokine profiles²; (2) a combination of clinical data including comorbidities⁸; (3) a combination of clinical, physiologic, and imaging data¹; and (4) imaging,⁹ among others. A new eosinophilic endotype of COPD has also been identified thanks to advances in our understanding of COPD pathobiology.^10,11 Despite these important observations, the highly selected cohorts used to obtain these insights may not reflect the overall COPD patient population.

The 2009 Federal Health Information Technology for Economic and Clinical Health Act led to the creation of an incentive program to encourage hospitals and health care providers to adopt electronic health records (EHRs). Currently, more than 95% of U.S. hospitals have adopted EHRs.¹² As a result of EHR adoption, the volume of health care data has increased exponentially¹³ from 153 exabytes in 2013 to 2314 exabytes in 2020. This massive increase in data encodes millions of health care encounters and creates a crucial opportunity to transform patient care. The concept of computable phenotypes, defined as clinical conditions or characteristics that are derived from a computerized query using a defined set of data elements,¹⁴ has gained significant attention as a result. By leveraging EHR data, clinical decision-making in COPD can be informed by novel computational applications.

Identifying disease subgroups and potential disease endotypes using EHR data may help focus therapeutic efforts on COPD exacerbations. The purpose of this study was to determine whether the combination of EHR data and ML in hospitalizations for severe AECOPDs could identify specific subgroups of patients characterized by differences in clinical outcomes.

Methods

Original Cohort Data Source and Study Population

We conducted a retrospective cohort study using data collected from patients hospitalized at Yale-New Haven Hospital between September 30, 2012, after the Epic EHR system (Epic; Verona, Wisconsin) was implemented, and December 31, 2017. The Yale University Human Research Protection Program approved this study and ethical approval was obtained from the Yale Institutional Review Board under a Waiver of Consent. We have previously described this cohort.¹⁵ Data were obtained from the Joint Data Analytics Team at Yale University School of Medicine.

COVID Cohort

The Yale Department of Medicine COVID-19 Explorer and Repository tool was used to extract data on patients admitted with COVID-19 from March 1, 2020, to April 1, 2021, in Yale-New Haven Health System hospitals.¹⁶ The patients had a positive test for SARS-CoV-2 using reverse transcriptase–polymerase chain reaction assays performed on nasopharyngeal swab specimens within 14 days after admission.

Clustering

To apply k-means, an unsupervised clustering method, we first preprocessed the non-COVID-19 data. We identified the features with missing values and removed them to ensure that the training process was unbiased and free of unnecessary noise. This produced a data frame with 1736 observations and 52 features, including the unique identifier. We did not impute any of the selected features and used only complete data. The 24 numerical features were normalized, and the 27 string features were one-hot encoded. We used an autoencoder, a deep learning technique, to make k-means clustering more efficient by reducing the data to 3 dimensions. Before training the k-means clustering model, we used the NbClust package in R to determine the optimal number of clusters. Once the number of clusters was identified, we divided the data into 80% for training and 20% for testing.
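The preprocessing steps described above can be sketched in a few lines of dependency-free Python. This is a minimal illustration, not the authors' pipeline: min-max scaling is an assumed form of "normalized", the helper names (`drop_incomplete_features`, `build_matrix`, `train_test_split`) are hypothetical, the records are synthetic, and the autoencoder step is omitted.

```python
import random

def drop_incomplete_features(rows):
    """Keep only the columns that have a value in every row (no imputation)."""
    complete = [c for c in rows[0] if all(r.get(c) is not None for r in rows)]
    return [{c: r[c] for c in complete} for r in rows]

def min_max_normalize(values):
    """Rescale a numeric column to [0, 1]; a constant column maps to 0."""
    lo, hi = min(values), max(values)
    return [0.0 if hi == lo else (v - lo) / (hi - lo) for v in values]

def one_hot(values):
    """Encode a string column as one 0/1 indicator per category."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def build_matrix(rows, id_column):
    """Turn raw records into numeric feature vectors, skipping the identifier."""
    rows = drop_incomplete_features(rows)
    matrix = [[] for _ in rows]
    for column in rows[0]:
        if column == id_column:
            continue
        values = [r[column] for r in rows]
        if all(isinstance(v, (int, float)) for v in values):
            for vector, v in zip(matrix, min_max_normalize(values)):
                vector.append(v)
        else:
            for vector, encoded in zip(matrix, one_hot(values)):
                vector.extend(encoded)
    return matrix

def train_test_split(items, test_fraction=0.2, seed=0):
    """Shuffle reproducibly, then hold out `test_fraction` of the items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

# Synthetic records for illustration only; these are not study data.
patients = [
    {"id": 1, "age": 61, "smoking": "active", "bmi": 27.0, "crp": None},
    {"id": 2, "age": 77, "smoking": "former", "bmi": 25.6, "crp": 12.0},
    {"id": 3, "age": 70, "smoking": "former", "bmi": 30.0, "crp": 4.0},
    {"id": 4, "age": 65, "smoking": "active", "bmi": 22.0, "crp": 9.0},
    {"id": 5, "age": 69, "smoking": "former", "bmi": 28.4, "crp": 7.5},
]
matrix = build_matrix(patients, id_column="id")  # "crp" is dropped: one value is missing
train, test = train_test_split(matrix)           # 80% / 20%
print(matrix[0])
```

In this toy example the `crp` column is removed because one record lacks a value, which mirrors the complete-data rule in the text: incomplete features are discarded rather than imputed.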

Classifier

An XGBoost classifier was developed using the multi:softmax objective function, with the subgroup labels obtained from k-means clustering as the target. The same data processing methods were applied, and the data were divided into 80% for training and 20% for testing. A grid search with 5-fold cross-validation was conducted to identify the best hyperparameters for the classifier. The trained classifier was then saved and later applied to the COVID-19 cohort. The classifier code is included in the online supplement.
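The tuning procedure, a grid search scored by 5-fold cross-validation, can be illustrated without external libraries. The sketch below is an assumption-laden stand-in: it tunes a simple k-nearest-neighbor classifier instead of XGBoost to stay dependency-free, the data are synthetic, and all function names and the `n_neighbors` grid are hypothetical.

```python
import itertools
import random
from collections import Counter

def k_fold_indices(n, n_folds=5, seed=0):
    """Shuffle the row indices and deal them into `n_folds` held-out folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train_x, train_y, x, n_neighbors):
    """Stand-in classifier: majority label among the nearest training points."""
    ranked = sorted(range(len(train_x)),
                    key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    votes = Counter(train_y[i] for i in ranked[:n_neighbors])
    return votes.most_common(1)[0][0]

def cross_val_accuracy(xs, ys, params, folds):
    """Mean accuracy over folds; each fold is held out exactly once."""
    scores = []
    for fold in folds:
        held_out = set(fold)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        correct = sum(knn_predict(train_x, train_y, xs[i], **params) == ys[i]
                      for i in fold)
        scores.append(correct / len(fold))
    return sum(scores) / len(scores)

def grid_search(xs, ys, param_grid, n_folds=5, seed=0):
    """Try every hyperparameter combination; keep the first best by CV accuracy."""
    folds = k_fold_indices(len(xs), n_folds, seed)
    names = sorted(param_grid)
    best_params, best_score = None, -1.0
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = cross_val_accuracy(xs, ys, params, folds)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Synthetic, well-separated 3-class data (labels 0, 1, 2); not study data.
rng = random.Random(1)
xs, ys = [], []
for label, (cx, cy) in enumerate([(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]):
    for _ in range(10):
        xs.append((cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)))
        ys.append(label)

best_params, best_score = grid_search(xs, ys, {"n_neighbors": [1, 3, 5]})
print(best_params, best_score)
```

The same folds are reused for every hyperparameter combination so that the scores are comparable. With a real gradient-boosting library, only the stand-in `knn_predict` and the parameter grid would change.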

Statistical Analysis

The R statistical software was used for statistical analyses. Significance was defined as p<0.05 and false discovery rate (FDR) < 0.05. STROBE guidelines for cohort studies were followed in the preparation of this report. Additional methods are described in the online supplement.
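To make the FDR threshold concrete, the following sketch applies the Benjamini-Hochberg procedure to a list of p-values. The text does not state which FDR method was used, so Benjamini-Hochberg is an assumption here, and the p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values for five comparisons; not study results.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
q_values = benjamini_hochberg(p_values)
significant = [q < 0.05 for q in q_values]
print(q_values, significant)
```

In this example, two comparisons with raw p<0.05 (0.039 and 0.041) no longer pass after adjustment, which shows why a cytokine can differ nominally between subgroups yet fail the FDR criterion.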

Results

Identification of the COPD Subgroups

To identify subgroups characterized by specific clinical features, we applied k-means, an unsupervised clustering method, to clinical data from 1736 patients admitted to the hospital for a severe AECOPD. We used 51 features to implement this clustering method. The resulting subgroups were characterized by clinical similarities. We identified 3 distinct subgroups in the resulting analysis (Table 1). Across all 3 subgroups, sex and absolute monocyte counts were similar, suggesting that sex or monocytes were not key factors in this classification.
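For readers unfamiliar with k-means, the sketch below shows the core assign-and-update loop (Lloyd's algorithm) on synthetic 3-dimensional points. It is illustrative only: the deterministic farthest-first initialization is a choice made here for reproducibility, not a detail taken from the study, and the points are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_first_init(points, k):
    """Deterministic start: first point, then repeatedly the farthest point."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        farthest = max(points,
                       key=lambda p: min(squared_distance(p, c) for c in centroids))
        centroids.append(list(farthest))
    return centroids

def k_means(points, k, max_iter=100):
    """Alternate assignment and centroid updates until labels stop changing."""
    centroids = farthest_first_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [min(range(k), key=lambda j: squared_distance(p, centroids[j]))
                      for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = [sum(axis) / len(members) for axis in zip(*members)]
    return labels, centroids

# Synthetic 3-dimensional points in three loose groups; not study data.
latent = [
    (0.1, 0.2, 0.1), (0.2, 0.1, 0.3), (0.0, 0.3, 0.2),
    (5.0, 5.1, 4.9), (5.2, 4.8, 5.1), (4.9, 5.0, 5.0),
    (9.8, 0.1, 9.9), (10.1, 0.2, 10.0), (10.0, 0.0, 10.2),
]
labels, centroids = k_means(latent, k=3)
print(labels)
```

The cluster indices returned by k-means are arbitrary; descriptive names such as those used for the subgroups in this report are assigned afterwards from the clinical characteristics of each cluster.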

Clinical Characteristics of the Acute Exacerbation of COPD Subgroups

The largest subgroup (n=904, 52%) was mainly composed of former smokers (69%), with the highest rates of comorbid hypertension of all subgroups (94%). Half of these patients were diagnosed with heart failure (50%) or diabetes (54%). This subgroup was also characterized by the highest inpatient administration of inhaled corticosteroid/long-acting beta2-agonist combination (ICS/LABA), antibiotics, and systemic steroids. As the most prevalent subgroup, it will be treated as a reference herein.

The patients in the second largest subgroup (n=548, 32%) were the oldest (77 years [70–87]) and had the lowest body mass index (BMI) of the 3 subgroups (25.6 kg/m² [21.8–31.1]). This subgroup was notable for the highest rates of heart failure (62%) and chronic kidney disease (42%). This subgroup had the lowest rates of systemic steroid (73%) and ICS/LABA (53%) administration of the 3 subgroups but had antibiotic use rates similar to those of the reference subgroup (87%). Given the high rates of heart failure and chronic kidney disease, this subgroup will be described as cardio-renal hereafter.

The third and smallest subgroup (n=284, 16%) had the youngest patients (61 years [54–72]) and the highest rate of active smokers (52%). Subgroup 3 had the lowest rates of heart failure (38%) and chronic kidney disease (23%), but the highest rates of allergic rhinitis (12%) in the 3 subgroups. This subgroup also had the lowest antibiotic administration rates (77%). Consistent with the high rates of active smoking, subgroup 3 had the highest rate of nicotine replacement during hospitalization (44%).

Subgroups of Acute Exacerbation of COPD Exhibit Distinct Blood Chemistry and Complete Blood Counts

Although blood chemistries were not used to identify the COPD subgroups, we were interested in exploring whether the cardio-renal subgroup also showed abnormal markers of cardiac and renal function. We compared probrain natriuretic peptide (pro-BNP), blood urea nitrogen (BUN), and creatinine values among patients in the 3 subgroups. The cardio-renal subgroup had the highest combined pro-BNP, BUN, and creatinine values of the 3 subgroups (Figure 1A-C).

In contrast to blood chemistries, complete blood count values were used to identify COPD subgroups. Consequently, white blood cell, neutrophil, lymphocyte, basophil, and eosinophil counts significantly differed among the subgroups (Table 1). Subgroup 3 was characterized by the lowest neutrophil counts (5400 cells/microliter [4000–7300]), and highest blood lymphocyte (2325 cells/microliter [1637–3039]) and eosinophil counts (337 cells/microliter [96–396]) (Figure 1D-F). Due to the increasing recognition that eosinophils are a major risk factor for COPD exacerbations,^17-19 the identification of a subgroup with higher counts is particularly relevant. Subgroup 3 will be described as eosinophilic hereafter.

COPD Subgroups are Characterized by Specific Disease Outcomes

Given the known associations between specific comorbidity patterns,²⁰ eosinophilic inflammation in COPD exacerbations,¹⁷ and exacerbation outcomes, we examined whether the COPD exacerbation subgroups demonstrated any outcome differences. We found no differences in intensive care use or readmissions within 30 days. Consistent with previous observations,¹⁷ we found that the eosinophilic subgroup had the shortest stay (5.98 days [2–6]) (Table 2). During hospitalization (5%) and in the year following an AECOPD hospitalization (30%), the cardio-renal subgroup had the highest mortality rates.

The high mortality rates of the cardio-renal subgroup led us to determine the survival times stratified by subgroups for severe AECOPD following hospitalization. This analysis showed that in contrast to the cardio-renal subgroup, the eosinophilic subgroup had the best median survival times after hospital discharge (Figures 2A and 2B).
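Median survival times of this kind are commonly read off a Kaplan-Meier curve. The sketch below assumes that estimator (the text does not name the method) and uses invented follow-up times; `kaplan_meier` and `median_survival` are hypothetical helper names.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival probability)] at each time with at least one death.

    durations: follow-up time per patient (for example, days after discharge)
    events:    1 if the patient died at that time, 0 if censored
    """
    data = sorted(zip(durations, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        leaving = [e for d, e in data if d == t]  # everyone whose follow-up ends at t
        deaths = sum(leaving)
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= len(leaving)
        i += len(leaving)
    return curve

def median_survival(curve):
    """First time at which estimated survival falls to 50% or below."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None  # median not reached during follow-up

# Invented follow-up data for six patients; not study data.
durations = [30, 60, 90, 120, 200, 365]
events = [1, 1, 0, 1, 1, 0]
curve = kaplan_meier(durations, events)
print(curve, median_survival(curve))
```

Censored patients (event 0) leave the risk set without lowering the curve, which is how the estimator handles patients who were still alive at last follow-up. Running the function separately on each subgroup gives the stratified curves.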

To understand the relationship between COPD subgroups and the Rome criteria for severe AECOPDs,²¹ we identified patients with respiratory acidosis based on arterial blood gas (ABG) testing (pH < 7.35 and partial pressure of carbon dioxide > 45 mm Hg) at any point of their admission (n=65). There were no differences in severe AECOPDs across subgroups (Table 2).
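The ABG rule stated above translates directly into a small check. The thresholds come from the text; the function names and the example readings are hypothetical.

```python
def has_respiratory_acidosis(ph, paco2_mm_hg):
    """Apply the thresholds from the text: pH < 7.35 and PaCO2 > 45 mm Hg."""
    return ph < 7.35 and paco2_mm_hg > 45

def severe_by_abg(abg_results):
    """True if any ABG obtained during the admission meets both thresholds."""
    return any(has_respiratory_acidosis(ph, paco2) for ph, paco2 in abg_results)

# Hypothetical ABG series as (pH, PaCO2 in mm Hg) pairs; not study data.
print(severe_by_abg([(7.40, 42), (7.31, 52)]))  # second reading meets both criteria
print(severe_by_abg([(7.36, 50)]))              # elevated PaCO2 but pH is not below 7.35
```

Both conditions must hold in the same ABG sample, and a patient with no ABG on record is not flagged.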

To understand the factors that impact survival time in the COPD exacerbation subgroups, we first performed a univariate Cox regression analysis using subgroup, age, sex, admission to the intensive care unit (ICU), heart failure, and chronic kidney disease given their potential influence on the subgroups and relevant biological input of age and sex. We found that subgroup assignment, age, ICU admission, and heart failure predicted survival time in the univariate analysis (Table 3). Because the hazard ratio distribution of absolute eosinophil counts crossed 1 in the univariate analysis, absolute eosinophil counts were not considered in the multivariate model. The multivariate Cox regression analysis included subgroup, age, admission to the ICU, and heart failure (Figure 2C and Table 3). After controlling for age, admission to the ICU, and heart failure, subgroup categories had a significant impact on survival.
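A hazard ratio is the exponential of a Cox regression coefficient, and a covariate is usually considered uninformative when its confidence interval includes 1. The sketch below assumes that "crossed 1" refers to a 95% confidence interval; the coefficients and standard errors are hypothetical, not values from Table 3.

```python
import math

def hazard_ratio_with_ci(beta, standard_error, z=1.96):
    """Convert a Cox coefficient and its standard error to HR and a 95% CI."""
    hazard_ratio = math.exp(beta)
    lower = math.exp(beta - z * standard_error)
    upper = math.exp(beta + z * standard_error)
    return hazard_ratio, lower, upper

def interval_crosses_one(lower, upper):
    """True when the interval includes 1, i.e. no clear effect on the hazard."""
    return lower <= 1.0 <= upper

# Hypothetical coefficients for two covariates; not study estimates.
weak = hazard_ratio_with_ci(beta=0.05, standard_error=0.10)
strong = hazard_ratio_with_ci(beta=0.40, standard_error=0.10)
print(weak, interval_crosses_one(weak[1], weak[2]))
print(strong, interval_crosses_one(strong[1], strong[2]))
```

Under this reading, a covariate like the first example would be left out of the multivariate model, while one like the second would be carried forward.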

A COVID-19 Cohort of COPD Patients Replicates the Original Subgroups

The triggers for severe AECOPDs that require hospitalization are heterogeneous, and their influence on the clustering of COPD exacerbations is unclear. SARS-CoV-2 infection, the causal agent of COVID-19, is an exceptional trigger for COPD exacerbations and disproportionately affects patients with COPD.²² As a test of the validity of the severe COPD exacerbation subgroups, we applied the classifier trained on the original cohort to a separate cohort of COPD patients in our hospital system admitted with COVID-19.

The 3 original AECOPD subgroups were recapitulated in this COVID-19 AECOPD cohort (n=1646) (Table 4). In the COVID-19 cohort, 68% of the patients were included in the reference subgroup, while 4% were classified as eosinophilic. There were no differences in sex or monocyte counts between subgroups in the COVID-19 cohort, similar to the original cohort. The cardio-renal subgroup in the COVID-19 cohort was the oldest (77 years [68–84]) and had the lowest BMI (27.6 kg/m² [23.2–32.4]). Similarly to the cardio-renal subgroup in the original cohort, the COVID-19 cardio-renal subgroup had the highest prevalence of heart failure (60%) and chronic kidney disease (48%) of all 3 subgroups. The rates of antibiotic administration (75%) and systemic steroids (55%) were highest in this subgroup, in contrast to the original cohort (Table 1). Like the original cohort, the COVID-19 cardio-renal subgroup had the highest serum levels of pro-BNP, BUN, and creatinine (Figures 3A-C). Except for systemic steroids, used in the classifier to identify subgroups, no differences in tocilizumab or remdesivir use were seen across the COVID-19 subgroups (Table 4).

Remarkably, leukocyte counts in the COVID-19 subgroups also recapitulated the pattern seen in the original cohort, with the highest lymphocyte counts (1760 cells/microliter [1520–2260]) and eosinophil counts (127 cells/microliter [50–203]) in the eosinophilic subgroup, while the cardio-renal subgroup had elevated neutrophil counts (5380 cells/microliter [3591–7595]) and the lowest lymphocyte counts (900 cells/microliter [638–1203]).

Inflammatory Profiles of COVID-19 COPD Subgroups

To determine whether blood leukocyte counts seen in the COVID-19 subgroups were associated with distinct cytokine or inflammatory profiles, we compared the levels of 11 cytokines in the 3 subgroups. Following FDR adjustment, we found that 3 cytokines, interleukin (IL)-1 beta, IL-2R, and IL-8, were differentially expressed (Table 4). The eosinophilic subgroup had the highest mean IL-1 beta values (Table 4), in keeping with previous studies describing IL-1 beta release by eosinophils²³; in contrast, the levels of IL-2R were lowest in the eosinophilic subgroup (Figure 4A).

Higher levels of IL-8, a cytokine involved in neutrophil recruitment and activation,²⁴ were associated with higher neutrophil counts in the reference and cardio-renal subgroups, compared to the eosinophilic subgroup (Figure 4B). Serum levels of the type 2 (T2) cytokines IL-4, IL-5, and IL-13 were similar in the 3 subgroups (Supplementary Table 1 in the online supplement). Furthermore, serum levels of C-reactive protein (CRP) mirrored IL-2R, IL-8, and neutrophil counts in the 3 subgroups (Figure 4C). CRP levels ≥ 10 mg/L, which were included in the Rome proposal,²¹ were more common in the reference and cardio-renal subgroups compared to the eosinophilic subgroup (Table 4). This suggests higher levels of inflammation in the COVID-19 reference and cardio-renal subgroups compared to the eosinophilic subgroup.

The Cardio-Renal Subgroup of the COVID-19 Cohort was Characterized by High Mortality

To determine whether associations between outcomes and subgroups were present in the COVID-19 cohort, we examined differences in ICU admission, severe AECOPD by Rome criteria based on their first ABG, 30-day readmission, length of stay, and hospital mortality between COPD subgroups. The rates of admission to the ICU and 30-day readmission were similar to those of the original subgroups (Table 5). Like the original subgroups, we found a shorter length of stay for the eosinophilic subgroup (6.9 days [4.1–12.1]). Although we lacked information beyond the hospitalization for COVID-19, the cardio-renal subgroup showed higher rates of inpatient mortality (26%), comparable to those in the cardio-renal subgroup of the original cohort (30%) within the first year after hospitalization.

Discussion

We found 3 subgroups of severe AECOPDs using ML on EHR data from 3382 hospitalized patients. Two of the 3 subgroups were characterized by specific comorbidities or leukocyte profiles. The first, a cardio-renal subgroup, was associated with increased mortality during and after hospitalization for AECOPD. The second, an eosinophilic subgroup, had the shortest hospital stay, suggesting a milder pattern of exacerbation. It is notable that the subgroups were evident despite differences between the cohorts, including triggers for hospitalization. In the original cohort, the triggers were not captured by our study design, while the second cohort was restricted to patients hospitalized with COVID-19. Overall, these findings demonstrate that these subgroups are stable and support the use of ML classifiers in EHRs to classify hospitalizations with AECOPDs. Increasing automated recognition of AECOPD subphenotypes in EHRs presents a clinical opportunity to develop precision medicine interventions to improve disease outcomes.

These subgroups are important for their morbidity and mortality, as well as their specific clinical characteristics. The cardio-renal subgroup not only recapitulates what is known about the impact of specific comorbidities on COPD outcomes,⁸ it also captures other phenotypic traits associated with increased mortality, including a lower BMI.²⁵ The identification of lower lymphocyte counts combined with higher neutrophil counts in this subgroup is also consistent with multiple studies that examined the neutrophil to lymphocyte ratio in AECOPDs as a marker of exacerbation risk and mortality.²⁶ Considering the aging process, the presence of COPD, chronic cardiac and renal disease, and the presence of unique inflammation surrogates in neutrophils and lymphocytes, it is plausible that mechanisms of immunosenescence may be present in this subgroup.²⁷ Recapitulating all these features associated with poor outcomes into a single subgroup strengthens our ability to understand this phenotype and can aid in the identification of AECOPD triggers and therapeutic targets unique to this group of patients.

We identified the eosinophilic subgroup in the original cohort through the integration of comorbidities associated with T2 inflammation and blood counts. Despite the confounding effect of systemic steroid administration on blood eosinophil counts, the ability to identify this subgroup points to the robustness of blood eosinophils as a marker to distinguish this subgroup. This subgroup also had milder exacerbations, characterized by a shorter length of stay, consistent with previous studies of AECOPDs requiring hospitalization.¹⁷ These differences are likely related to age, among other factors. We speculate that this subgroup of exacerbations may be more responsive to the administration of systemic steroids. We did not see differences in T2 cytokines in the validation cohort, and this may reflect limited power to identify differences or the influence of concomitant viral infection and COVID therapies. Furthermore, the demonstration of clinical benefit in COPD with increased blood eosinophils after dual blockade of the IL4/IL13 T2 pathway with dupilumab¹¹ confirms this as a distinct endotype based both on molecular mechanism and response to treatment.⁴

The reference cluster, the largest, had a mix of clinical features and outcomes that fell between the cardio-renal and eosinophilic subgroups. This suggests that there are additional AECOPD phenotypes that are not captured by the current parameters of our analysis. For instance, key details in the diagnosis of heart failure, such as ejection fraction and the mechanisms involved (diastolic or systolic failure), are essential for more accurate classification. Our study was intended as a proof of concept for computable subgroups of severe AECOPD. We therefore used conservative clustering parameters to prevent overclustering, which may lead to the identification of very small groups without broad applicability. Future studies may identify new subgroups using different parameters.

We recognize the limitations of our model. These include the lack of spirometric values to define COPD, and of background therapies and lung imaging patterns, in the data from which the subgroups were defined. The single hospital system and selected EHR features may contribute to selection bias. The differences between subgroups may also have been driven by specific molecular determinants that EHRs failed to capture. To address some of these limitations, we used strict criteria to define COPD, including multiple International Classification of Diseases-Tenth Revision entries and the exclusion of patients with dual diagnoses of asthma and COPD, and we used complete, routinely available clinical data rather than imputed values. To make a similar model applicable to other centers, we carefully selected data on inpatient medication administration profiles and structured data when available. Finally, our dataset did not collect all the variables required by the Rome proposal to determine degrees of severity of AECOPDs. We sought to overcome this limitation by focusing on the severe category defined by ABG testing in a subset of patients. It is expected that subsequent iterations of our current approach will refine the role of computable subgroups in COPD classification.

Conclusions

Computable subphenotypes of severe AECOPD identify a cardio-renal subgroup associated with increased mortality. This subgroup includes several known features connected to poor outcomes in COPD. In contrast, a separate eosinophilic subgroup is associated with milder AECOPD requiring hospitalization. ML can be used to improve patient classification using data collected on EHRs and result in new treatment paradigms tailored to specific disease subtypes.

Acknowledgements

Author contributions: HL and JLG contributed to the conception and design, data acquisition, and analysis. All authors contributed to the final article drafting and revision and gave final approval.

Other acknowledgments: The authors wish to acknowledge the assistance and expertise of Richard Hintz and Krishna Daggula at Yale’s Joint Data Analytics Team.

Declaration of Interests

HL and JH have no conflicts of interest related to this work. JZ reports funding from a National Institutes of Health (NIH) training grant (2T32HL007778-26), and personal fees for participation in practice update. SK reports funding from an NIH training grant (2T32HL007778-26). MS reports funding from the NIH/National Heart, Lung, and Blood Institute (NHLBI) (R01 HL155948). JG reports funding from the NIH/NHLBI (R01 HL153604 and R03 HL154275).

References

1. Castaldi PJ, Dy J, Ross J, et al. Cluster analysis in the COPDGene study identifies subtypes of smokers with distinct patterns of airway disease and emphysema. Thorax. 2014;69(5):416-423. https://doi.org/10.1136/thoraxjnl-2013-203601

2. Ghebre MA, Bafadhel M, Desai D, et al. Biological clustering supports both "Dutch" and "British" hypotheses of asthma and chronic obstructive pulmonary disease. J Allergy Clin Immunol. 2015;135(1):63-72. https://doi.org/10.1016/j.jaci.2014.06.035

3. Sakornsakolpat P, Prokopenko D, Lamontagne M, et al. Genetic landscape of chronic obstructive pulmonary disease identifies heterogeneous cell-type and phenotype associations. Nat Genet. 2019;51:494-505. https://doi.org/10.1038/s41588-018-0342-2

4. Anderson GP. Endotyping asthma: new insights into key pathogenic mechanisms in a complex, heterogeneous disease. Lancet. 2008;372(9643):1107-1119. https://doi.org/10.1016/S0140-6736(08)61452-X

5. Lopez-Campos JL, Agustí A. Heterogeneity of chronic obstructive pulmonary disease exacerbations: a two-axes classification proposal. Lancet Respir Med. 2015;3(9):729-734. https://doi.org/10.1016/S2213-2600(15)00242-8

6. Perera PN, Armstrong EP, Sherrill DL, Skrepnek GH. Acute exacerbations of COPD in the United States: inpatient burden and predictors of costs and mortality. COPD. 2012;9(2):131-141. https://doi.org/10.3109/15412555.2011.650239

7. Shah T, Churpek MM, Coca Perraillon M, Konetzka RT. Understanding why patients with COPD get readmitted: a large national study to delineate the Medicare population for the readmissions penalty expansion. Chest. 2015;147(5):1219-1226. https://doi.org/10.1378/chest.14-2181

8. Vanfleteren LEGW, Spruit MA, Groenen M, et al. Clusters of comorbidities based on validated objective measurements and systemic inflammation in patients with chronic obstructive pulmonary disease. Am J Respir Crit Care Med. 2013;187(7):728-735. https://doi.org/10.1164/rccm.201209-1665OC

9. Haghighi B, Choi S, Choi J, et al. Imaging-based clusters in current smokers of the COPD cohort associate with clinical characteristics: the SubPopulations and Intermediate Outcome Measures in COPD Study (SPIROMICS). Respir Res. 2018;19:178. https://doi.org/10.1186/s12931-018-0888-7

10. Pavord ID, Chanez P, Criner GJ, et al. Mepolizumab for eosinophilic chronic obstructive pulmonary disease. N Engl J Med. 2017;377(17):1613-1629. https://doi.org/10.1056/NEJMoa1708208

11. Bhatt SP, Rabe KF, Hanania NA, et al. Dupilumab for COPD with type 2 inflammation indicated by eosinophil counts. N Engl J Med. 2023;389(3):205-214. https://doi.org/10.1056/NEJMoa2303951

12. Henry J, Pylypchuk Y, Searcy T, Patel V. Adoption of electronic health record systems among US non-federal acute care hospitals: 2008-2015. HealthIT website. Published 2016. Accessed 2024. https://www.healthit.gov/sites/default/files/briefs/2015_hospital_adoption_db_v17.pdf

13. EMC Digital Universe, IDC. Vertical industry brief: digital universe driving data growth in healthcare. Cyclone Interactive website. Published 2014. Accessed 2024. https://www.cycloneinteractive.com/sites/cyclone/assets/File/digital-universe-healthcare-vertical-report-ar.pdf

14. Richesson RL, Wiley LK, Gold S, et al. Electronic health records-based phenotyping. Introduction. Rethinking Clinical Trials website. Published 2020. Accessed March 2024. https://rethinkingclinicaltrials.org/chapters/conduct/electronic-health-records-based-phenotyping/electronic-health-records-based-phenotyping-introduction/

15. Lopez K, Li H, Lipkin-Moore Z, et al. Deep learning prediction of hospital readmissions for asthma and COPD. Respir Res. 2023;24:311. https://doi.org/10.1186/s12931-023-02628-7

16. Yale University, Department of Medicine. COVID-19 Explorer and Repository tool: DOM-CovX. Yale University website. Published 2020-2021. Accessed June 27, 2021. https://spinup-0011f4.spinup.yale.edu/domcovx/

17. Bafadhel M, Greening NJ, Harvey-Dunstan TC, et al. Blood eosinophils and outcomes in severe hospitalized exacerbations of COPD. Chest. 2016;150(2):320-328. https://doi.org/10.1016/j.chest.2016.01.026

18. Vedel-Krogh S, Nielsen SF, Lange P, Vestbo J, Nordestgaard BG. Blood eosinophils and exacerbations in chronic obstructive pulmonary disease. The Copenhagen General Population Study. Am J Respir Crit Care Med. 2016;193(9):965-974. https://doi.org/10.1164/rccm.201509-1869OC

19. Yun JH, Lamb A, Chase R, et al. Blood eosinophil count thresholds and exacerbations in patients with chronic obstructive pulmonary disease. J Allergy Clin Immunol. 2018;141(6):2037-2047.e10. https://doi.org/10.1016/j.jaci.2018.04.010

20. Roberts CM, Stone RA, Lowe D, Pursey NA, Buckingham RJ. Co-morbidities and 90-day outcomes in hospitalized COPD exacerbations. COPD. 2011;8(5):354-361. https://doi.org/10.3109/15412555.2011.600362

21. Celli BR, Fabbri LM, Aaron SD, et al. An updated definition and severity classification of chronic obstructive pulmonary disease exacerbations: the Rome proposal. Am J Respir Crit Care Med. 2021;204(11):1251-1258. https://doi.org/10.1164/rccm.202108-1819PP

22. Gerayeli FV, Milne S, Cheung C, et al. COPD and the risk of poor outcomes in COVID-19: a systematic review and meta-analysis. EClinicalMedicine. 2021;33:100789. https://doi.org/10.1016/j.eclinm.2021.100789

23. Esnault S, Kelly EAB, Nettenstrom LM, Cook EB, Seroogy CM, Jarjour NN. Human eosinophils release IL-1β and increase expression of IL-17A in activated CD4+ T lymphocytes. Clin Exp Allergy. 2012;42(12):1756-1764. https://doi.org/10.1111/j.1365-2222.2012.04060.x

24. Rajarathnam K, Sykes BD, Kay CM, et al. Neutrophil activation by monomeric interleukin-8. Science. 1994;264(5155):90-92. https://doi.org/10.1126/science.8140420

25. Hallin R, Gudmundsson G, Suppli Ulrik C, et al. Nutritional status and long-term mortality in hospitalised patients with chronic obstructive pulmonary disease (COPD). Respir Med. 2007;101(9):1954-1960. https://doi.org/10.1016/j.rmed.2007.04.009

26. Paliogiannis P, Fois AG, Sotgia S, et al. Neutrophil to lymphocyte ratio and clinical outcomes in COPD: recent evidence and future perspectives. Eur Respir Rev. 2018;27(147):170113. https://doi.org/10.1183/16000617.0113-2017

27. Murray MA, Chotirmall SH. The impact of immunosenescence on pulmonary disease. Mediators Inflamm. 2015;2015:692546. https://doi.org/10.1155/2015/692546

Keywords

acute exacerbations of COPD, machine learning, electronic health records, clusters

ISSN 2372-952X

Copyright & Reprints

While the JCOPDF is an open access journal and free access to its content is encouraged, reproduction of JCOPDF content for bulk and commercial distribution or use must be purchased from the COPD Foundation, sole copyright holder of all JCOPDF content. If you are interested in ordering (or obtaining a quote for) paper reprints or e-prints of an article, please contact scsreprints@sheridan.com. If you wish to request information about reproducing figures or tables, please contact Bethany Hendrix, JCOPDF staff member, at bhendrix@copdfoundation.org. Authors should contact the Foundation directly regarding requests for personal use of articles, figures, charts, etc.

Chronic Obstructive Pulmonary Diseases: Journal of the COPD Foundation®
