Current Projects

A Collection of COVID-19 Research Findings

  • Liu J, Hung P, Liang C, Zhang J, Qiao S, Campbell BA, Olatosi B, Torres ME, Hikmet N, Li X. Multilevel determinants of racial/ethnic disparities in severe maternal morbidity and mortality in the context of the COVID-19 pandemic in the USA: protocol for a concurrent triangulation, mixed-methods study. BMJ Open. 2022 Jun 1;12(6):e062294. full text
  • Lyu T, Liang C, Liu J, Campbell B, Hung P, Shih YW, Ghumman N, Li X. Temporal Events Detector for Pregnancy Care (TED-PC): A Rule-based Algorithm to Infer Gestational Age and Delivery Date from Electronic Health Records of Pregnant Women with and without COVID-19. arXiv preprint arXiv:2205.02933. 2022 May 1. full text
  • Deer, R. R., Rock, M. A., Vasilevsky, N., Carmody, L. C., Rando, H. M., Anzalone, A. J., Callahan T. J., Bramante C.T., Chute C. G., Greene C. S., Gagnier J., Chu H., Koraishy F. M., Liang C…. & Robinson, P. N. Characterizing long COVID: deep phenotype of a complex condition. EBioMedicine. 2021 Dec 1;74:103722. full text
  • Lyu T, Hair N, Yell N, Li Z, Qiao S, Liang C, Li X. Temporal geospatial analysis of COVID-19 pre-infection determinants of risk in South Carolina. International Journal of Environmental Research and Public Health. 2021 Sep 14;18(18):9673. full text
  • Qiao S, Li Z, Liang C, Li X, Rudisill C. Three dimensions of COVID‐19 risk perceptions and their socioeconomic correlates in the United States: A social media analysis. Risk Analysis. 2022 Jul 13. full text

Informatics Approach to Identification and Deep Phenotyping of PASC Cases

Reports are increasing of persistent symptoms and multi-organ, multi-system manifestations (e.g., pulmonary, cardiovascular, renal, and neurological) among individuals who have recovered from the acute phase of COVID-19, a condition denoted as Post-Acute Sequelae of SARS-CoV-2 infection (PASC). Given that 100 million people in the US are known to have been infected as of September 2022, millions of people will potentially experience PASC. This projected disease burden will have a profound public health impact on patients’ clinical outcomes and on US health systems during post-COVID-19 care. Timely identification of individuals with PASC, both from existing COVID-19 cohorts and among newly identified COVID-19 patients, is urgently needed for PASC clinics and longitudinal cohort studies on PASC. Building on biomedical informatics methodologies, we propose a high-throughput, semi-supervised Deep Phenotyping approach to identifying individuals with PASC and characterizing their phenotypes. Our approach is based on a graph representation model built on the South Carolina COVID-19 Cohort (S3C), funded by the National Institute of Allergy and Infectious Diseases (NIAID) (R01A127203-4S1). S3C (n ≈ 1,400,000 COVID-19 patients as of February 2022) is a multi-modal data repository consisting of EHR, health systems data, community-based health services data, and claims data, with a complete temporal trajectory for every datum at the individual level. Building on the graph model, we will detect phenotypes of candidate PASC patients using unsupervised clustering algorithms. We will then identify and validate clinically plausible PASC cases and their corresponding phenotypes by incorporating clinical evaluation and supervised algorithms. This study will result in a high-throughput algorithm application for identifying and characterizing PASC cases from COVID-19 EHR cohorts.
The resulting EHR and machine learning models will be interpretable and generalizable, and will form a foundation for testing and implementation in statewide and national post-COVID clinics and programs.
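As a loose illustration of the unsupervised step described above, the following sketch builds a toy patient-similarity graph and groups patients into candidate phenotype clusters. It is not the project's graph model or clustering algorithm, which this page does not specify. It assumes each patient is reduced to a set of symptom terms, links two patients when their Jaccard similarity meets a threshold, and treats connected components as clusters. All names (`jaccard`, `cluster_patients`), the patient IDs, the symptom terms, and the 0.5 threshold are hypothetical.

```python
from itertools import combinations


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity between two sets of phenotype terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_patients(phenotypes: dict, threshold: float = 0.5) -> list:
    """Group patients by connected components of a similarity graph.

    `phenotypes` maps a (hypothetical) patient ID to a set of symptom terms.
    Two patients share an edge when their Jaccard similarity >= threshold.
    """
    # Build an undirected similarity graph as an adjacency map.
    graph = {pid: set() for pid in phenotypes}
    for p, q in combinations(phenotypes, 2):
        if jaccard(phenotypes[p], phenotypes[q]) >= threshold:
            graph[p].add(q)
            graph[q].add(p)

    # Collect connected components with an iterative depth-first search.
    seen = set()
    clusters = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(graph[node] - seen)
        clusters.append(component)
    return clusters


# Toy, made-up data for illustration only (not drawn from S3C).
toy = {
    "p1": {"fatigue", "dyspnea", "cough"},
    "p2": {"fatigue", "dyspnea"},
    "p3": {"brain fog", "headache"},
    "p4": {"brain fog", "headache", "insomnia"},
    "p5": {"palpitations"},
}

clusters = cluster_patients(toy)
for members in clusters:
    print(sorted(members))
# ['p1', 'p2']
# ['p3', 'p4']
# ['p5']
```

In a workflow like the one proposed, clusters of this kind would only be candidates: the clinical evaluation and supervised algorithms mentioned above would still be needed to decide which groups correspond to clinically plausible PASC phenotypes.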

Funding: NIH/NIAID (R21AI169139); Big Data Health Science Center at the University of South Carolina

Curating a Knowledge Base for Individuals with Coinfection of HIV and SARS-CoV-2: EHR-based Data Mining

The COVID-19 pandemic has placed a heavy burden on individuals with HIV infection. Based on data from 15,522 hospitalized patients with HIV and SARS-CoV-2 coinfection in 24 countries, a recent World Health Organization (WHO) report confirmed for the first time that HIV is an independent risk factor for severe COVID-19. Despite a generally high risk of a severe COVID-19 clinical course in individuals with HIV, the interactions between SARS-CoV-2 and HIV infections remain unclear. For example, the severity of COVID-19 in individuals with HIV is correlated with certain comorbidities, some of which are more prevalent in patients with HIV than in other populations. Yet several contradictory findings suggest that comorbidities play the predominant role in the severity of COVID-19 regardless of HIV infection. Low CD4+ T-cell count (e.g., <200~500 cells/µL) and unsuppressed viral load are associated with a severe clinical course, yet the role of antiretroviral therapy (ART) exposure and adherence in the context of COVID-19 exposure needs to be examined. Risk factors for a severe clinical course of the coinfection are undetermined because individuals with the same or similar COVID-19 severity show different clinical characteristics. To address these knowledge gaps, this study will establish an EHR-based cohort of individuals with HIV/SARS-CoV-2 coinfection and develop large-scale EHR-based data mining to examine the interactions between HIV and SARS-CoV-2 infections and to systematically identify and validate factors contributing to a severe clinical course of the coinfection. Ultimately, the collected clinical evidence will be implemented in, and used to pilot-test, a Clinical Decision Support (CDS) prototype that assists providers in screening and referring at-risk patients in real-world clinics.

Funding: NIH/NIAID (R21AI170171)

Identifying Predictive Factors of HIV Care Discontinuation Using Clinical Natural Language Processing

An estimated 80% of people living with HIV (PLWH) who are retained in care can achieve viral suppression. However, only 58% (as of 2018) of PLWH in the US are retained in care. South Carolina's retention rate (53%) is below the national rate. Since 2010, the retention rate has increased only marginally both in South Carolina and nationwide. Low retention rates have been a major barrier to eliminating HIV transmission and improving PLWH’s health outcomes. A research priority is identifying individual-level factors and patterns associated with poor retention. Although some factors associated with lower retention rates have been reported (e.g., place of residence, race/ethnicity, socioeconomic status, stigma), nuances within the context of PLWH’s long-term medical care are understudied. This study aims to identify and characterize individual-level risk factors and patterns for PLWH’s poor retention in care by jointly using structured Electronic Health Records (EHR) variables and clinical notes.

Funding: Prisma Health

Deep Learning Assisted Opioid Use Disorder Diagnosis Using EHR Data

Electronic Health Records (EHR) hold great promise for helping providers identify individuals with possible opioid use disorder (OUD), refer them to treatment, and support onsite buprenorphine initiation protocols, which are highly effective in the field. We develop Deep Learning models to assist providers in OUD diagnosis by applying clinical Natural Language Processing (NLP) to individuals’ chronological EHR.

Funding: Prisma Health

The heterogeneous contributing factors found in preventable medication-related harms suggest that intervention strategies (e.g., patient coaching/education, discharge instructions, coordination, and medication reconciliation) should reflect patients' personalized needs and the specific context of each discharge. We use Electronic Health Records (EHR) chart review to collect clinical decision rules about the personalized information clinicians should provide. We then develop Deep Learning models that learn automatically from EHR and these clinical decision rules to suggest personalized information at discharge.

Funding: Prisma Health