Using Machine Learning to Generate Clinical Prediction Rules for Clinical Outcomes in Schizophrenia (2017-2018)


Schizophrenia is a mental illness that affects 1.1% of the U.S. population. The disease is characterized by a global deterioration in functioning and by delusions, hallucinations and cognitive deficits.

The burden of care is high, both in caregiver stress and in economic cost. Patients are usually started on medications called antipsychotics for symptom control and, in most cases, will need lifelong treatment. Although medications are not 100% effective, compliance with psychosocial interventions and medications is an important moderator of illness course and prognosis. Patients who stop taking their medications have a higher risk of relapse, which leads to care in the emergency department (ED) or an inpatient unit.

At Duke, nearly 500 patients with schizophrenia visit the ED in a year and stay for an average of one to two weeks before they can get an inpatient bed. In addition to occupying space in the ED, patients with schizophrenia cost the system money. They are frequently under-resourced, uninsured or under-insured.

The ability to prospectively and accurately predict high risk of relapse could facilitate the allocation of scarce resources to patients most likely to benefit, with an end result of decreasing the length of stay in the ED and, ultimately, decreasing the need to refer patients for inpatient care. 

Project Description

This Bass Connections project will tackle two problems associated with schizophrenia: the high frequency of relapse and the high economic and health system burden.

Without a clinical prediction tool, it is difficult for a clinical provider to know prospectively which patients would benefit from more intensive resources, including community support or clozapine. Such a tool would be of greatest relevance to large-scale providers such as the Department of Defense and accountable care organizations, as it would help them decrease the economic burden of mental healthcare and allocate resources appropriately.

To lay the groundwork for a clinical prediction tool for use in inpatient and outpatient settings, this project team will apply machine learning to a Duke clinical data set containing clinical and demographic details on patients with schizophrenia. The goal is to pinpoint the clinical and demographic variables that serve as the optimum predictors.
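As a rough illustration of what pinpointing predictor variables can look like, the sketch below ranks candidate variables by the strength of their association with a binary relapse outcome. It is only a minimal sketch under stated assumptions: the project description does not say which machine learning method the team will use, so a simple univariate correlation ranking is swapped in here, and the variable names (missed_refills, prior_ed_visits, unrelated_noise, relapse) and all data are synthetic and hypothetical, not drawn from the Duke data set.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / math.sqrt(var_x * var_y)


def make_synthetic_rows(n=500, seed=0):
    """Generate synthetic patients. All variables here are hypothetical."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        missed_refills = rng.randint(0, 10)
        prior_ed_visits = rng.randint(0, 5)
        unrelated_noise = rng.random()  # has no effect on the outcome
        # Assumed relationship, chosen only to make the example work.
        logit = -3.0 + 0.5 * missed_refills + 0.3 * prior_ed_visits
        probability = 1.0 / (1.0 + math.exp(-logit))
        relapse = 1 if rng.random() < probability else 0
        rows.append({
            "missed_refills": missed_refills,
            "prior_ed_visits": prior_ed_visits,
            "unrelated_noise": unrelated_noise,
            "relapse": relapse,
        })
    return rows


def rank_predictors(rows, outcome_key):
    """Rank candidate variables by absolute correlation with the outcome."""
    outcomes = [row[outcome_key] for row in rows]
    names = [key for key in rows[0] if key != outcome_key]
    scores = {
        name: abs(pearson([row[name] for row in rows], outcomes))
        for name in names
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_predictors(make_synthetic_rows(), "relapse")
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A univariate ranking like this ignores interactions between variables, so a real analysis would more likely compare it against multivariable models and validate on held-out patients.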

Ultimately, taking this work forward beyond the 2017-2018 Bass Connections team, researchers will develop a software interface in which entering a few patient-specific demographic, illness and comorbidity variables produces a score with prognostic implications. The prediction score could then be used to build algorithms that support advocacy for allocating resources to patients based on their risk of relapse.
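To make the idea of a prognostic score concrete, here is a minimal sketch of how a few patient variables could be combined into a single risk score and mapped to a tier. Everything in it is an assumption for illustration: the logistic form, the variable names (age_under_30, prior_ed_visits, comorbidity_count), the coefficients and the tier cutoffs are hypothetical and are not estimated from any data or taken from the project.

```python
import math

# Hypothetical coefficients for illustration only; not fitted to any data.
INTERCEPT = -2.0
WEIGHTS = {
    "age_under_30": 0.5,
    "prior_ed_visits": 0.3,
    "comorbidity_count": 0.4,
}


def relapse_risk_score(patient):
    """Combine patient variables into a score between 0 and 1 (logistic form)."""
    logit = INTERCEPT + sum(
        weight * patient[name] for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-logit))


def risk_tier(score, low_cutoff=0.2, high_cutoff=0.6):
    """Map a score to a coarse tier; the cutoffs are arbitrary placeholders."""
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "medium"
    return "high"


example_patient = {"age_under_30": 1, "prior_ed_visits": 4, "comorbidity_count": 2}
example_score = relapse_risk_score(example_patient)
print(f"score={example_score:.2f}, tier={risk_tier(example_score)}")
```

In practice, the coefficients would come from a model fitted and validated on clinical data, and the cutoffs would be chosen with clinicians according to the resources available at each tier.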

Anticipated Outcomes

Extraction of data sets and application of machine learning to pinpoint optimum predictor variables; poster and platform presentations at local and national meetings; submission to nationally distributed journals


Summer 2017 – Spring 2018

  • Summer 2017: Extraction of clinical data from Epic Maestro, groundwork for applying machine learning to the data extracted: June 19 – August 13
  • Fall 2017: Estimation of optimum predictors, discussions between students and clinicians about variables
  • Spring 2018: Drafting of posters, manuscripts and other abstracts for presentation/submission

Faculty/Staff Team Members

Jane Gagliardi, School of Medicine - Psychiatry
Katherine Heller, Trinity - Statistical Science*
Gopalkumar Rakesh, School of Medicine - Psychiatry*
Jessica Tenenbaum, Biostat

Graduate Team Members

Joseph Futoma, Statistical Science - PHD

Undergraduate Team Members

Linda Adams, Computer Science (AB)
Beepul Bharti, Biomedical Engineering (BSE), Mathematics (BS2)
Chelsea Liu, Computer Science (AB)

* denotes team leader