CURATOR: Using Big Data to Streamline Research & Clinical Care During COVID-19

July 2, 2021 - Katie McCallum

CURATOR. It's a simpler name for Houston Methodist's COVID-19 Surveillance and Outcomes Registry, a database designed to digitize COVID-19-related health information.

Its structured data borders on 1 terabyte. It holds longitudinal health information on more than 200,000 patients and 14 million hospital encounters. It actively supports 30 COVID-19 research projects. Its learning, system-based approach supports rapid yet validated decision-making.

And, like many innovations of 2020, CURATOR's story begins the day the pandemic began.

"We went from only one officially reported COVID-19 case in Houston on March 1, 2020, to almost 6,000 cases just eight weeks later," says Dr. Farhaan Vahidy, associate director of the Center for Outcomes Research at Houston Methodist.

As one of the largest health care systems in the fourth most populous city in the U.S., Houston Methodist was faced with an unprecedented challenge: How to lead a city through a public health catastrophe.

"Fortunately, as an academic hospital we have the skills, expertise and leadership needed to fight a virus never previously seen before. So, we're uniquely suited in that way. But we're also a complex health care system and, like many others, we have information silos," says Dr. Vahidy.

And during a crisis in which the unknowns outnumber the knowns, eliminating the silos and rapidly evaluating and acting on evidence-based medicine becomes paramount.

The need for a data-driven tool to help guide decision-making during COVID-19

A pandemic brings unique challenges to a hospital system. And a pandemic underscored by a brand-new virus brings even more challenges.

"Of course, our primary goal from day one of the pandemic was to deliver effective care to both COVID-19 and non-COVID-19 patients. But a lot goes into that," says Dr. Vahidy.

To ensure the best care is delivered to patients during the pandemic:

  • Frontline care teams need support in the clinical decision-making process
  • Hospital administration needs to efficiently predict hospital capacity and manage resources
  • Clinical researchers need to explore innumerable important research questions

"In terms of streamlining research, hospital leadership quickly established the retrospective research task force (RRTF), a multidisciplinary team tasked with reviewing all COVID-19-related protocols before passing along to the institutional review board (IRB)," says Dr. Vahidy.

The first day this task force met, the team agreed that a strong data backbone would be required to accomplish this, as well as the other institutional priorities.

"We needed to capture every single bit of data available, incorporate it into an information pipeline, evaluate how each piece of data fits with critical clinical and operational elements, and, finally, decide in which situations the data is useful," adds Dr. Vahidy.

But data is tricky — especially health care data.

In its inherent form, health care data is:

  • Non-uniform – often collected from various independent sources
  • Uniquely complex – includes nonbinary data elements, such as translational imaging
  • Protected information – subject to strict confidentiality and compliance laws

"Health care data is indeed difficult to digitize, but it's not impossible," says Dr. Vahidy. "It takes bringing the right personnel and resources together."

Digitizing health care data is difficult, but not impossible with the right team

"The most important component of digitizing health care data is creating a single source of truth, and that's what we designed CURATOR to be — a unified source of truth for all things COVID-19," says Dr. Vahidy.

To accomplish this, a multispecialty team was assembled, including:

  • Big-data specialists
  • Data scientists
  • Data engineers
  • Application analysts
  • Epidemiologists
  • Outcomes research specialists
  • Physicians across many clinical disciplines
  • Business intelligence experts

The team created CURATOR, an IRB-approved COVID-19 registry. The rationale and design of this tool were published in JMIR Medical Informatics in February 2021.

A look inside CURATOR's data

CURATOR is a true longitudinal health information database that capitalizes on several system-wide data sources, including:

  • Electronic Health Record (EHR)
  • Virtual ICU
  • CareSense (mobile-first digital platform designed to guide patients through their care)
  • Imaging data warehouses

"The database draws on information collected from every individual tested for COVID-19 at Houston Methodist, as well as any information we have about that patient from before the pandemic began and/or after being tested," says Dr. Vahidy.

CURATOR can then sort these individuals, first by whether they tested positive or negative. Both categories can then be further subclassified by whether the individuals were hospitalized (due to COVID-19 or another health condition) and whether they had encounters with Houston Methodist prior to the pandemic.
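As a rough illustration of that sorting, the minimal Python sketch below groups a handful of made-up patient records by test result, hospitalization and prior encounters. The `Patient` fields, the `cohort` function and the sample records are all hypothetical, chosen only to mirror the categories described above; they do not reflect CURATOR's actual schema or code.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Patient:
    # Hypothetical fields; CURATOR's real data model is not described here.
    patient_id: str
    tested_positive: bool
    hospitalized: bool
    prior_encounters: bool


def cohort(p: Patient) -> Tuple[str, str, str]:
    """Return the (test result, hospitalization, history) group for a patient."""
    return (
        "positive" if p.tested_positive else "negative",
        "hospitalized" if p.hospitalized else "not hospitalized",
        "prior encounters" if p.prior_encounters else "no prior encounters",
    )


# Invented sample records, for illustration only.
patients = [
    Patient("P1", tested_positive=True, hospitalized=True, prior_encounters=True),
    Patient("P2", tested_positive=True, hospitalized=False, prior_encounters=False),
    Patient("P3", tested_positive=False, hospitalized=True, prior_encounters=True),
    Patient("P4", tested_positive=True, hospitalized=True, prior_encounters=True),
]

# Count how many patients fall into each subgroup.
counts = Counter(cohort(p) for p in patients)

for group, n in sorted(counts.items()):
    print(", ".join(group), "->", n)
```

Keeping the three attributes together as one grouping key means every subgroup, such as positive and hospitalized patients with a pre-pandemic history, can be counted or compared against its test-negative counterpart in a single pass.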

"Looking at the overall picture of a patient's health, even components unrelated to COVID-19, provides clinicians and researchers with very rich longitudinal data. This becomes an important control and is critical for elevating hypothesis generation and testing," explains Dr. Vahidy.

Without longitudinal data, experts can only generalize about the comorbidities and factors that influence patient outcomes.

"What we've created with CURATOR is a data engine that can take in all of this very complex COVID-19 data and provide meaningful insights and actionable endpoints," says Dr. Vahidy.

CURATOR is also subject to ongoing validation. Workgroups are tasked with reviewing informatics quality. So if data discrepancies do arise, there's a team to ensure both the source and implications of these differences are well understood.

How CURATOR is driving COVID-19 research

One of CURATOR's earliest successes came when COVID-19 began resurging across Houston during summer 2020. Dr. Vahidy and his team used the database to rapidly evaluate how this second surge differed from the initial one that occurred at the onset of the pandemic in March 2020.

"Critically, we uncovered two new vulnerable groups. The data showed that younger adults and Hispanic communities were suffering worse COVID-19 outcomes," says Dr. Vahidy. "This is something that's well-known now, but through our data, we saw it happening in real time and responded to it immediately."

As a result, hospital leadership began placing heavier emphasis on COVID-19 outreach programs and translation services.

The team was also the first in the nation to highlight this concerning trend. The findings were published in JAMA in August 2020 — just 64 days after the study's data assimilation and analysis began, with 48 of those days spent in peer review and the publishing process.

And the examples of the impact of CURATOR continue.

"There are several studies based on CURATOR data already published, with particular emphasis on critical care and social determinants of health. In addition, CURATOR is currently supporting COVID-19 research projects across many clinical disciplines, including heart and vascular, neurology, pulmonology, critical and emergency care, and health care accessibility," adds Dr. Vahidy.

The case for using big data beyond the pandemic

Aside from its impact on patient care during the COVID-19 pandemic, CURATOR is influential simply because it's unique.

"As far as we're aware, CURATOR has very few, if any, head-to-head comparisons — within the field of COVID-19 as well as outside of it," adds Dr. Vahidy. "Data pipelines like CURATOR have been missing and needed in health care for a long time, and the pandemic was the catalyst for us to put the resources and personnel towards creating one."

From here, Dr. Vahidy says there's more that CURATOR can and will do. While it already draws information from various clinical information sources, it's also designed to harmonize with other data sources across the institution as needed.

"The primary and initial goal of CURATOR was to drive COVID-19 research, but we know it can also support institutional operational priorities. And we've set it up for this. We're also continuing to develop a front-end interface to make it easier for our clinicians to use," adds Dr. Vahidy.

CURATOR serves as an example of how impactful a large institutional health registry can be. Even more powerful is a registry not limited to a single health care institution — a limitation of CURATOR that Dr. Vahidy readily points out.

At its core, Dr. Vahidy says it's an example of the power of collaboration.

"The COVID-19 pandemic has been the challenge of a lifetime for many clinicians and researchers. The way experts at Houston Methodist came together from across disciplines and levels within the organization — all with a singular focus of protecting our patients and community — is truly remarkable. I know this endeavor was unique. I feel this is what a leading institution can do, and I'm proud to be a part of it," adds Dr. Vahidy.


Related Reading:

Notable research efforts supported by CURATOR data include:

  1. Prevalence of SARS-CoV-2 Infection Among Asymptomatic Health Care Workers in the Greater Houston, Texas, Area
  2. Provider Burnout and Fatigue During the COVID-19 Pandemic: Lessons Learned From a High-Volume Intensive Care Unit
  3. Racial and Ethnic Disparities in SARS-CoV-2 Pandemic: Analysis of a COVID-19 Observational Registry for a Diverse U.S. Metropolitan Population
  4. Disparities in COVID-19 Hospitalizations and Mortality Among Black and Hispanic Patients: Cross-Sectional Analysis from the Greater Houston Metropolitan Area
  5. Rapid Implementation and Innovative Applications of a Virtual Intensive Care Unit During the COVID-19 Pandemic: Case Study
  6. Use of Telecritical Care for Family Visitation to ICU During the COVID-19 Pandemic: An Interview Study and Sentiment Analysis
  7. Adapting an Outpatient Psychiatric Clinic to Telehealth During the COVID-19 Pandemic: A Practice Perspective
  8. Sex Differences in Susceptibility, Severity, and Outcomes of Coronavirus Disease 2019: Cross-Sectional Analysis from a Diverse U.S. Metropolitan Area

