Patient safety indicators used to rate hospitals fail to meet scientific criteria, Johns Hopkins researchers say
Researchers say indicators should be tied to clinical data, not billing codes.
While public rankings from the Centers for Medicare and Medicaid Services, and industry rankings from groups like Leapfrog and Healthgrades, have raised awareness of hospital shortcomings in patient safety, a new report claims these programs themselves fail to stand up to scientific scrutiny.
The report by the Johns Hopkins Armstrong Institute for Patient Safety and Quality, published this week in the journal Medical Care, found that only one of the 21 patient safety measures researchers studied meets scientific criteria.
"These measures have the ability to misinform patients, misclassify hospitals, misapply financial data and cause unwarranted reputational harm to hospitals," Bradford Winters, MD, an associate professor of anesthesiology and critical care medicine at Johns Hopkins, and lead study author, said in a statement. "If the measures don't hold up to the latest science, then we need to re-evaluate whether we should be using them to compare hospitals."
According to the report, most ratings programs rely on patient safety indicators and hospital-acquired condition measures created by the Agency for Healthcare Research and Quality and CMS more than a decade ago. But because those measures are derived from billing data rather than clinical data traced back to patient health records, factors tied to medical coding and human error can make the results unreliable.
"The variation in coding severely limits our ability to count safety events and draw conclusions about the quality of care between hospitals," said Peter Pronovost, MD, who directs the Johns Hopkins institute. "Patients should have measures that reflect how well we care for patients, not how well we code that care."
Researchers looked at 19 studies conducted between 1990 and 2015 that scrutinized HACs and PSI measures as rating instruments and compared billing codes with actual medical records. For a measure to be considered valid, its billing data needed to match the medical records at least 80 percent of the time.
Of the 21 measures created by AHRQ and CMS, 16 did not have enough data to be evaluated. And of the five measures that did, researchers said only one was valid.
Ultimately, researchers hope their report will spur changes in how ratings work, specifically by tying the measures to clinical rather than billing data.
Twitter: @HenryPowderly