In a previous post, we provided estimates of confirmed cases for novel coronavirus across the United States. We believe the trends in cases have real relevance for individual health systems.
One question we have asked is: at what point will we begin to see rapid increases in the number of cases coming to our hospital? Currently, the number of cases presenting to any one health system varies by state. There is rapid growth along the east and west coasts, and cases are beginning to accumulate in other parts of the country.
As local cases begin to rise, health systems need to prepare for intense resource utilization to safely and effectively treat patients infected with the virus. Using the successive confirmed case counts from the Johns Hopkins University Center for Systems Science and Engineering's GitHub repository, we designed a web-based tool that helps hospitals plan for forecasted resource use.
The goal of this calculator is to allow a hospital to understand its resource use, such as beds, ICU beds, ventilators, and personal protective equipment (PPE).
The top section of the tool is used to adjust the parameters.
Adjusting Growth and Admission Parameters
If Hospital A wants to forecast its future use of these resources, it needs to enter some baseline information. At the top of the page, select a model type (Exponential, Polynomial, or Logistic), and choose the state in which Hospital A is located. Please review the explanation of the growth model options further below or click here.
Below this, enter the % share of statewide COVID-19 cases that have come to Hospital A. For example, if the statewide count is 200, and Hospital A has seen 20, its share is 10%.
Of these cases, the other parameters to enter are: the % of COVID-19+ patients seen at the hospital who are admitted as inpatients, the % of admissions that require ICU care, and the % of ICU admissions that require mechanical ventilation. The average length of stay (LOS) for non-critical care and critical care COVID-19 patients is also needed to calculate the number of beds required. Click here for a note on LOS modeling.
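The cascade these parameters define can be sketched in a few lines. The percentage values below are illustrative placeholders, not the tool's defaults:

```python
# Admission cascade sketch; all parameter values below are illustrative
# assumptions, not the calculator's defaults.
statewide_cases = 200        # cumulative confirmed cases in the state
hospital_share = 0.10        # share of statewide cases seen at Hospital A
pct_admitted = 0.30          # share of COVID-19+ patients admitted to inpatient
pct_icu = 0.25               # share of admissions requiring ICU care
pct_vent = 0.70              # share of ICU admissions requiring ventilation

hospital_cases = statewide_cases * hospital_share
admissions = hospital_cases * pct_admitted
icu_admissions = admissions * pct_icu
ventilated = icu_admissions * pct_vent

print(hospital_cases, admissions, icu_admissions, ventilated)
```

Each stage is a simple multiplication, so small changes in any one percentage flow directly through to the bed and ventilator estimates.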
Adjusting Personal Protective Equipment Parameters
To estimate PPE utilization, the number of units of PPE used per day per patient are listed. This is a product of the number of provider interactions for each patient, and the number of PPE items used by the provider each day.
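A minimal sketch of this product, with assumed values for the interaction count and items per interaction:

```python
# PPE units per day = census x interactions per patient-day x items per
# interaction. All values below are illustrative assumptions.
interactions_per_patient_day = 8   # provider visits per patient per day
items_per_interaction = 4          # e.g., gown, gloves, mask, face shield
census = 25                        # COVID-19 inpatients on a given day

ppe_units_per_day = census * interactions_per_patient_day * items_per_interaction
print(ppe_units_per_day)  # 800
```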
Adjusting Forecast Length
The last item is a setting for the forecast length: how many days into the future the model will display.
This calculator is effective for calculating the number of net new COVID-19 patients seen by a system each day, and how many of these patients will be in the hospital census over time. It can also help forecast the demand for PPE over time based on patient volume.
As with any modeling, this has limitations. The model is most effective for a 7-day window, and the uncertainty of the prediction increases the further the forecast is projected. In areas with recent statewide initiatives like shelter-in-place orders, the model will not factor those initiatives in, though any such programs won't show an impact for 5-10 days.
Resource Requirements Graphs and Tables
Following the parameters, the first graph and table display the forecast of cumulative confirmed novel coronavirus cases by day, new cases by day, the number of cases projected to reach the health system, and new admissions.
The next graph and table display the daily bed needs by bed type (non-ICU and ICU) along with the projected number of ventilators.
Finally, the last graph and table project the number of PPE units required per day to safely treat novel-coronavirus-positive patients.
Notes on Modeling Parameters
To begin capturing the various phases of COVID-19 spread, we are constantly tracking the predictive success of three simple growth models (exponential, logistic, and quadratic). These popular models have long been used to predict how populations, diseases, and other quantities grow, but they can differ greatly in their predictions.
We are currently looking to strengthen our suite of models beyond these three general but often accurate examples.
The exponential model has been widely successful in capturing the increase in COVID-19 cases during the most rapid and difficult-to-mitigate phases. The exponential model takes a simple form (y ~ e^x) and essentially captures what happens when you repeatedly double something over time (1, 2, 4, 8, …). As a model of uncontrolled growth, the exponential model has been one of the most accurate models at predicting the emergence of new COVID-19 cases across geographic scales within the US and abroad, from city to state to country.
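A common way to fit this model is to regress the log of the case counts on the day number; the slope is the daily growth rate. The case series below is synthetic, growing roughly 1.45x per day:

```python
import numpy as np

# Fit y = a * exp(r * t) by regressing log(cases) on day number.
# The case series below is synthetic, not real surveillance data.
days = np.arange(10)
cases = np.array([2, 3, 4, 6, 9, 13, 19, 28, 41, 60], dtype=float)

r, log_a = np.polyfit(days, np.log(cases), 1)  # slope is the daily growth rate
print(f"daily growth factor: {np.exp(r):.2f}x")
```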
To date, in Illinois, the exponential model has most closely fit our data. Our hope is that the mitigation interventions implemented late last week will begin to favorably impact our case counts in the near future, but in the meantime we are preparing for the forecast volumes under this model.
Quadratic Growth (2nd order polynomial)
In other locations, a 2nd-order polynomial (y ~ x² + x) best fits the data and more accurately predicts the number of cases expected over the coming days. This kind of growth is typically described as quadratic and is the expected outcome when the growth rate changes and that rate of change is constant. The rate of increase in this model is initially faster than that of the exponential model. However, as time ensues and the increasing growth rate stabilizes, the exponential model will produce faster growth.
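The quadratic fit works the same way with a degree-2 polynomial; the series below is synthetic and exactly quadratic, so the least-squares fit recovers the coefficients:

```python
import numpy as np

# Fit y = a*x^2 + b*x + c with a degree-2 polynomial least-squares fit.
# The series is synthetic and exactly quadratic: cases = t^2 + t + 1.
days = np.arange(10)
cases = days**2 + days + 1

a, b, c = np.polyfit(days, cases, 2)
print(f"a={a:.2f}, b={b:.2f}, c={c:.2f}")  # recovers a=1.00, b=1.00, c=1.00
```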
Because of the inherent variability in COVID-19 data due to testing, reporting, and actual spread, it can be difficult to tell whether COVID-19 is spreading exponentially or quadratically. The key is to not limit oneself to one model and to prepare for alternative outcomes.
As recovery ensues and social planning begins to take effect, the rate of spread will eventually slow. At that point, the exponential and quadratic models will begin to fail, as we have seen with other epidemics and pandemics, and as we have seen in China and its cities and provinces with regard to COVID-19.
When exponential growth slows and then tapers off, it often becomes logistic, that is, the curve becomes “S” shaped. As we have seen with regions in China, the logistic model can accurately explain more than 99.9% of variation in the exponential growth and the tapering off.
However, the most powerful use and accurate predictions of the logistic model will require information on the likely maximum number of cases under a combination of seasonality, spreading immunity, and active public health measures.
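Under an assumed case ceiling K, the logistic fit can be sketched by linearizing the curve; the series below is synthetic, and the ceiling is assumed rather than estimated:

```python
import numpy as np

# With an assumed maximum case count K, a logistic curve
#   cases(t) = K / (1 + exp(-r * (t - t0)))
# can be fit by linearizing: log(K / cases - 1) = -r * (t - t0).
# The series below is synthetic (K=1000, r=0.6, t0=10).
t = np.arange(1, 20)
cases = 1000.0 / (1.0 + np.exp(-0.6 * (t - 10)))

K = 1000.0                        # assumed ceiling, not estimated from data
z = np.log(K / cases - 1.0)       # linearized response
slope, intercept = np.polyfit(t, z, 1)
r_hat = -slope
print(f"recovered growth rate r = {r_hat:.2f}")  # 0.60
```

In practice K is the hard part: it depends on the seasonality, immunity, and public health measures described above.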
Length of Stay
Average LOS is a commonly used metric among hospitals. We modeled the expected change in size of a hospital’s COVID-19 patient population (ICU, Non-ICU, ICU on ventilator) using a common probability distribution, the binomial. If you’ve ever tried to predict the outcome of flipping coins, then you have intuitively used the binomial distribution where each outcome has a probability (p) of 0.5.
In short, we can use this distribution and average LOS to model what percent of 1-Day, 2-Day, … 10-Day, etc. patients will be going home or leaving on the current day. In doing this, we begin to account for daily carry-over and changes in the sizes of a hospital’s COVID-19 population.
More specifically, we use the cumulative distribution function of the binomial distribution with p = 0.5 (a patient goes home or they do not).
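A sketch of this discharge model using the binomial CDF with p = 0.5 follows. The choice of the number of trials (n = 2 × average LOS, so the binomial mean equals the average LOS) is our assumption for illustration, not the calculator's exact formula:

```python
from math import comb

# Binomial-CDF discharge sketch: probability a patient has left by day d,
# treating each day as a fair coin flip (p = 0.5). Setting n = 2 * avg_los
# makes the binomial mean (n * p) equal the average LOS; this
# parametrization is an illustrative assumption.
def binom_cdf(k, n, p=0.5):
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

avg_los = 5                 # average length of stay in days
n = 2 * avg_los
for day in range(1, 11):    # share of patients discharged by each day
    print(day, round(binom_cdf(day, n), 3))
```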
The model does not include PUIs that are admitted and subsequently test negative. At our center, these non-COVID-19 PUI patients are tested with results within 24-48 hours, and we factor those in as we use the output of the calculator. As we move further from the flu season, it is anticipated that a smaller proportion of PUIs will be non COVID cases.
Kenneth J Locey, PhD, Jawad Khan, Thomas A. Webb, and Bala Hota, MD, MPH
Over the past week, a tremendous amount has changed in the world. Governments, employers, and social networks have advocated social distancing through travel bans, closures of schools, working from home, and countless other cancelled activities. The goal of these activities is to slow the spread of COVID-19 and protect vulnerable populations.
In our post released one week ago, the data suggested the growth of COVID-19 was consistent, yet slower in the United States than in Europe. Continuing to use data from the Johns Hopkins University Center for Systems Science and Engineering, we are now seeing that transmission rate change.
Updated Comparison of Growth
Since March 1st, 2020, the rate of transmission in the United States matches that of several major European countries. In Figure 1, we see COVID-19 cases growing at rates similar to Italy, France, Germany, and the United Kingdom.
China and South Korea had fast transmission growth months ago, but enacted strict social distancing policies. In Figure 2, we see the growth of COVID-19 cases in these countries has almost stopped, unlike in the US. At current rates, the number of confirmed cases in the United States will pass South Korea's in less than a week.
An estimated growth factor for each country was calculated from the confirmed cases data. Of these seven countries displayed in Table 1, the transmission growth rate is the fastest in the United States, but not different from France, Germany, and the United Kingdom. Italy, South Korea, and China are statistically slower.
Forecasting Future US Cases
Using the data for confirmed cases in the United States since March 1, we created a forecast for the COVID-19 spread in the US over the next two weeks. The interactive forecast displayed in Figure 3 assumes the transmission growth rate of the last 14 days will continue over the next two weeks.
From this forecast, we believe rapid growth of new cases will develop during the last week of the month. Our model also suggests that the United States may approach half a million cases by the end of March. Whether this estimate is accurate or an overestimate depends on our ability to test patients and society's willingness to comply with mitigation efforts.
The global coronavirus (COVID-19) outbreak has elicited widely varying responses by national governments and societies. In evaluating opportunities for mitigating the spread of COVID-19, Anderson, Heesterbeek, Klinkenberg, and Hollingsworth (2020) identified personal and governmental responsibilities toward slowing the transmission of the virus. Identifying successful strategies from transmission data is critical toward preventing further catastrophic outcomes.
Over the last several months of intensive surveillance, Johns Hopkins University's Center for Systems Science and Engineering (JHU CSSE) has provided a public data set curated from a number of international data sources, including the WHO, China CDC, and the US CDC. In addition to their online dashboard, the Center has published the data publicly on GitHub. This data has allowed for widespread education and investigation into COVID-19 activity.
For institutional preparedness plans, knowing the volume of cases over time is a critical factor in understanding staffing and equipment needs. In addition, a framework for examining whether local transmission rates are growing above or below expectations can help clarify timelines for surge capacity needs. We used recent epidemic growth rate data to develop a predictive model for COVID-19 case counts.
As of March 7th, 2020, 105,836 people across 101+ countries have been confirmed as infected by COVID-19 (JHU CSSE, 2020). Unfortunately, this has already resulted in 3,558 deaths internationally and 17 in the United States. Further insights from the confirmed cases data can help identify countries where the spread of COVID-19 has been slowed. These countries’ responses to mitigating spread can serve as best practices.
Using the JHU CSSE data set of confirmed cases, we aggregated the daily case count at the national level. The case count from each nation was aligned, making time zero (t=0) the day of the first reported case. Only seven of the 101 countries had confirmed cases on the first day of data reporting, and only Mainland China (547), Thailand (2), and Japan (2) had more than one case. While Mainland China is included in the remainder of the analysis, little is known about the early spread of the virus there.
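The alignment step amounts to dropping each country's leading zero-count days; the series below are illustrative, not JHU data:

```python
# Align each country's cumulative case series so index 0 is the day of the
# first reported case. The series below are illustrative, not JHU data.
def align_to_first_case(series):
    """Drop leading zero-count days."""
    for i, count in enumerate(series):
        if count > 0:
            return series[i:]
    return []

raw = {
    "Country A": [0, 0, 0, 1, 2, 5, 12],
    "Country B": [0, 2, 3, 7, 15, 31, 60],
}
aligned = {name: align_to_first_case(s) for name, s in raw.items()}
print(aligned["Country A"])  # [1, 2, 5, 12]
```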
Next, utilizing principles from the physical sciences, the case counts were log transformed to better investigate the transmission growth factors. In Figure 1, we observed the log-transformed rates of confirmed COVID-19 cases from the first infection for the five countries with the most cases and the United States (currently 9th). Mainland China, South Korea, Italy, and Iran all show strong signs of diminishing growth (bending of the curve), which is not yet seen in France or the US. Another interesting observation is a lag in explosive growth from the initial infections, which could be interpreted as the time before community spread of the virus. In Italy, the lag was around 20 days from the first confirmed case. In South Korea, France, and the US, the lag was approximately 28-32 days.
Figure 1. Confirmed Cases by Days Since First Case
Qualitatively, the observed epidemic curves appear to have two different phases. In most countries, an initial phase, in which cases were likely imported to local communities and testing was limited, is followed by a second phase characterized by community transmission. Of most interest is the comparison of rates of community transmission in this second phase, because this may identify the countries with the slowest rates of transmission, and may suggest the most efficient infection control strategies.
Assessing Community Spread Rates
To assess the growth of cases with community spread, each country's data was aligned to a time zero (t=0) identified by the last point of the lag period. Data was re-aligned for the 17 countries and regions with over 100 confirmed cases. Figure 2 displays the log-transformed, re-aligned confirmed cases curves for these countries. The curve for the United States is drawn with each data point marked for easy identification in the graph.
Figure 2. Confirmed Cases by Days Since Estimated Community Spread Start
The most important feature of the curves in Figure 2 is the slope, as this is an indicator of transmission growth rates. The curves for Mainland China, South Korea, Singapore, and Hong Kong show signs of diminishing growth, meaning the rate of further infections is declining. Many of the other curves, including the United States, are more linear, meaning the infection rate has not started to slow. The data also show differing slopes, meaning the infection rate differs among countries: some are experiencing faster spread and some slower.
Ranking the COVID-19 growth by Country/Region
The rate of growth of COVID-19 in countries can be compared by estimating the transmission growth factor from the series of daily confirmed case counts. The transmission growth factor indicates how fast COVID-19 will spread; larger numbers imply faster growth. Using a linear approximation for the slope of the curve, we identify a wide range of transmission growth factors across countries, shown in Table 1. The table also identifies the implied days for the cases to double, days to go from the 1st to the 100th case, and days to go from the 100th to the 1,000th case. As the growth factor increases, these day counts decrease.
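For an exponential series cases(t) = a · g^t, the growth factor g implies a doubling time of log 2 / log g. A quick sketch with illustrative growth factors:

```python
import math

# For cases(t) = a * g**t, the doubling time in days is log(2) / log(g).
# The growth factors below are illustrative, not Table 1's estimates.
def doubling_time_days(growth_factor):
    return math.log(2) / math.log(growth_factor)

for g in (1.15, 1.25, 1.35):
    print(f"growth factor {g}: doubles every {doubling_time_days(g):.1f} days")
```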
Of these identified countries, Belgium has the most aggressive growth factor, followed by the Netherlands, Norway, and Iran. The United States has one of the slower growth rates, but higher than the current rates in Japan, Mainland China, Singapore, and Hong Kong. It is worth remembering that a limitation of this data set is that the initial spread of COVID-19 in China is undocumented. However, the rate of continued spread in China is less than in the US.
Table 1. Growth Factor Estimate and Implied Growth Periods for COVID-19 Spread by Country
This analysis has relied on preliminary data of COVID-19 global confirmed cases to determine the rate of spread in various countries. In the United States, COVID-19 is currently spreading slower than in many other highly impacted countries, but this should not be a signal of complacency. The daily data is currently showing a consistent rate of growth, not a declining rate. Because we are still in the early days of this infection, it is imperative to continue domestic mitigation efforts to slow or stop the continued spread of this deadly virus.
Promising future research could continue to look at the form of the COVID-19 transmission rates as this epidemic continues to evolve. Additionally, more information and evaluation on the specific interventions employed by varying countries could help build best practices for the current situation and future episodes.
Written by Thomas A. Webb, MBA, and Bala Hota, MD, MPH
Use the following interactive tool to understand how your hospital’s Overall Rating (CMS stars) changed from the December 2017 release to January 2020 release. Also view how stars were distributed based on your socioeconomic cohort and hospital size. Instructions after graphs.
Select your hospital’s CMS number from the Provider ID filter in the top right of the tool below:
This interactive tool uses the output from the publicly available CMS Overall Rating SAS software available on QualityNet.org for the December 2017, February 2019, and January 2020 releases. Publicly available files for FY20 HRRP were also used to obtain socioeconomic cohorts.
1. Use the filter in the top right corner of the visualization to select your Hospital CMS ID number.
2. The top graph shows the domain and overall scores from the December 2017, February 2019, and January 2020 releases. Compare how each domain changed in score.
3. The Summary table displays the Hospital ID number, HRRP socioeconomic status cohort, hospital size (based on the HWR denominator), Overall Summary Score, and Star Rating.
4. Click on any of the values in the Summary table to highlight your hospital's cohort in the socioeconomic and size graphs.
5. The socioeconomic graph shows the distribution of stars based on the five HRRP socioeconomic cohorts. Some hospitals were not provided a cohort by CMS; their result is Null. 1 = Highest SES and 5 = Lowest SES.
6. The hospital size graph shows the distribution of stars based on quartiles of hospital size, determined by the Hospital-Wide Readmission measure denominator. Some hospitals did not receive a HWR score; their result is Null.
AHRQ's Patient Safety Indicators (PSIs) are used in many healthcare quality programs as a way to measure safety incidents at hospitals. Each indicator looks at a different aspect of safety. Examples of PSIs are measures that look at pressure ulcers, operative hemorrhage / hematoma, accidental puncture / laceration, and pulmonary embolism / deep vein thrombosis. The PSIs are based on evaluating combinations of ICD-10 diagnosis and procedure codes from claims data.
As part of the Hospital Compare website, CMS publicly releases each hospital's performance on individual PSIs and the composite PSI, PSI-90. PSI-90 is used in many quality-based programs, such as the Overall Rating (aka Stars), Value-Based Purchasing, and the Hospital Acquired Condition Reduction Program. Individual PSIs are used in ratings by many organizations, including US News & World Report's Best Hospitals Rankings, Leapfrog, and Vizient.
Needless to say, it is important for hospitals to perform well in PSIs.
As of October 2018, some hospitals may have noticed worse performance on one of the PSIs – and it’s all due to a minor change in coding.
The PSI in question is PSI 12 – Perioperative Pulmonary Embolism or Deep Vein Thrombosis Rate. (Click here for the AHRQ measure specification.) One of the exclusions from this measure is the use of extracorporeal membrane oxygenation (ECMO) for the care of a patient.
This exclusion stopped working on October 1, 2018, and it's due to a change in ICD-10 rules.
For the last three years under ICD-10, ECMO patients have all received one common procedure code.
ICD-10 Procedure Code: 5A15223 (Extracorporeal Membrane Oxygenation, Continuous)
The current AHRQ specification, v2018, and all the software used to identify PSIs are written to find 5A15223 in the claim. However, that PSI exclusion isn't working because the 5A15223 code has been removed from the set of usable ICD-10-PCS codes.
Starting with Federal Fiscal Year 2019, the one ECMO code was broken into three new codes to increase specificity.
Unfortunately, the AHRQ specification and any software based on that specification no longer recognizes ECMO as an exclusion from the PSI 12 measure for patients after October 1, 2018.
Compounding the issue is the fact that rates of PE/DVT are much higher in patients on ECMO than in the standard surgical population. Hospitals that perform this life-saving procedure are being unfairly judged compared to those that don't have ECMO programs.
While this is likely to be fixed in a later update to the AHRQ PSI rules (and associated software), until it is, patients on ECMO are likely to be included inappropriately as a PSI 12 event.
Use the following interactive tool to understand how your hospital’s Overall Rating changed from the December 2017 release to February 2019 release. Also view how stars were distributed based on your socioeconomic cohort and hospital size. Instructions after graphs.
Select your hospital’s CMS number from the Provider ID filter in the top right of the tool below:
This interactive tool uses the output from the publicly available CMS Overall Rating SAS software available on QualityNet.org for the December 2017 and February 2019 releases. Publicly available files for FY19 HRRP were also used to obtain socioeconomic cohorts.
1. Use the filter in the top right corner of the visualization to select your Hospital CMS ID number.
2. The top graph shows the domain and overall scores from the December 2017 release and the February 2019 release. Compare how each domain changed in score.
3. The Summary table displays the Hospital ID number, HRRP socioeconomic status cohort, hospital size (based on the HWR denominator), Overall Summary Score, and Star Rating.
4. Click on any of the values in the Summary table to highlight your hospital's cohort in the socioeconomic and size graphs.
5. The socioeconomic graph shows the distribution of stars based on the five HRRP socioeconomic cohorts. Some hospitals were not provided a cohort by CMS; their result is Null. 1 = Highest SES and 5 = Lowest SES.
6. The hospital size graph shows the distribution of stars based on deciles of hospital size, determined by the Hospital-Wide Readmission measure denominator. Some hospitals did not receive a HWR score; their result is Null.
We believe the Overall Rating should be held until these concerns can be addressed.
We have crowdsourced data from multiple hospitals and worked with healthcare quality leaders from around the country. We have openly presented and shared these findings directly with stakeholders on request. We are posting this with the hope of starting a respectful discussion around creating a fair, transparent, and easy-to-understand ranking of hospitals that makes sense to consumers and providers. We believe the current system, as you will read below, is exceptionally complex. With complexity often come unintended consequences. We are hopeful that a conversation can be had to foster continued improvement of our ranking systems. This is extremely important, as physicians are being judged and society is drawing conclusions from those judgments that we do not believe are accurate. While Rush University's data is at the heart of this analysis, we worked with colleagues from the University of Chicago, University of Virginia, and Wake Forest University to better understand the impact of this data.
As health care workers, we view quality care as a promise – to patients, to family of patients, and to the community. In this, we share a common goal with all participants in our healthcare system. At the federal level, many talented researchers and policy development leaders have designed systems to incentivize high quality care which contributes to a shared goal of a high-value healthcare system. At Rush University, we have sought to understand the connection of policy to the care we provide to our patients. We have found in our analyses that some unintended consequences may be resulting from the current national policies to measure healthcare quality. These findings align with some of the recent public debate over increased mortality being linked to readmission reduction programs. In our view, we are at a critical juncture in how we view hospital quality rating, and have a terrific opportunity to improve the way we measure hospital quality.
In this post, we will describe issues with the current CMS approach to measurement of hospital quality of care, as described by the CMS Stars rating and the Hospital Readmissions Reduction Program (HRRP). These issues arise from:
Outlier patients, with frequent readmissions
Adjustment of readmission scores based on hospital volume, and star rating effect
Socioeconomic status adjustment
Variability in ratings due to the Latent Variable Model
1. Patients with frequent readmissions, though rare, disproportionately affect the readmission score and hospital star rating.
The Readmission Domain in CMS' Overall Rating accounts for 22% of the total score. Despite nine measures being evaluated by the Latent Variable Model, only one was chosen by the model to calculate this portion of the Overall Rating: the Hospital-Wide All-Cause Unplanned Readmission measure. Table 1, from CMS' Hospital Specific Report, confirms that the Loading Coefficient determined by the Latent Variable Model for HWR shows perfect correlation (Loading Coefficient = 1.0) with the Readmission Domain score, as further supported by Chart 1.
Table 1. Loading Coefficients for Readmission Domain – Feb 2019 Release
Chart 1. Correlation between Readmission Domain Score and HWR Measure – Feb 2019 Release
Data from 20 Hospital Specific Reports confirm the perfectly linear relationship identified from the loading coefficients between the Readmission Domain score and the HWR measure.
Rush University Medical Center (RUMC), a tertiary care program, accepts complex, critically ill patients. Many times, these patients are referred to our hospital for a higher level of care. Accepting and treating these acuity outliers puts RUMC, and hospitals like RUMC, at risk for lower performance in the HWR measure and the Overall Rating.
Chart 2. Histogram of Patients by Number of Readmissions
This histogram shows the distribution of patients by number of readmissions during the period of July 2016 through June 2017. Four (4) patients accounted for 36 total 30-day readmissions.
Without these four patients, RUMC's raw (unadjusted) HWR would drop from 17.3% to 16.9%, enough to change RUMC from a 4-star to a 5-star hospital in the Feb 2019 release, if the Dec 2017 cutoffs are consistent.
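As a hedged back-of-envelope check, the implied number of index admissions can be inferred from the stated figures; the denominator below is derived, not reported:

```python
# Back-of-envelope: if removing 36 readmissions moves the raw HWR from
# 17.3% to 16.9%, the implied index-admission denominator is about
# 36 / (0.173 - 0.169). This denominator is inferred, not reported by RUMC.
readmissions_removed = 36
rate_before, rate_after = 0.173, 0.169
implied_index_admissions = readmissions_removed / (rate_before - rate_after)
print(round(implied_index_admissions))  # 9000
```

The point of the arithmetic: at this volume, just four patients can move the raw rate by 0.4 percentage points, enough to cross a star-rating cutoff.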
Patient 1: Decompensated liver transplant candidate who did not make it to transplant. Managed complications of recurrent bleeding that could only be treated with transplant. Readmissions clinically reviewed as unavoidable.
Patient 2: Routinely misses dialysis and comes to the ED when confused. Readmitted for HD and management of renal encephalopathy that resolves after HD. Readmissions clinically reviewed as unavoidable.
Patient 3: Patient with suprapubic catheter, recurrent UTIs, and non-healing ulcers. Readmissions clinically reviewed as unavoidable.
Patient 4: Patient with end-stage renal disease and no access obtainable at outside hospitals; transferred and managed with a Hero catheter requiring multiple hospitalizations to maintain the graft. Readmissions clinically reviewed as unavoidable.
The Readmission Domain is linked to the Hospital Wide Readmission (HWR) measure exclusively. For tertiary care centers, the treatment of high acuity outliers, which are not excluded from HWR, can negatively impact performance relative to centers with lower acuity.
2. Readmission scores are adjusted for hospital volume. This adversely impacts the scores for some large hospitals.
The use of Hierarchical Logistic Regression Models for mortality, readmissions, and complications, and of PSI-90 reliability adjustment, adversely impacts rankings of large vs. small hospitals.
It has been previously shown that volume adjustment leads to lower thresholds for reporting poor performance for larger hospitals (1,2).
Sosunov EA, Egorova NN, Lin H-M, McCardle K, Sharma V, Gelijns AC, et al. The Impact of Hospital Size on CMS Hospital Profiling. Med Care. 2016 Apr 1;54(4):373–9.
Joynt KE, Jha AK. Characteristics of Hospitals Receiving Penalties Under the Hospital Readmissions Reduction Program. JAMA. 2013 Jan 23;309(4):342.
Volume adjustment is employed by HRRP as a strategy to minimize the effect of variability seen in low-volume centers. This approach, also called "shrinkage," is a well-accepted way to reduce the chance that identified outliers are simply the result of variability due to low case volumes. There is a difference, however, between adjusting for volume to detect true poor performers – the objective of the HRRP – and ranking based on the results of scoring – the goal of the stars program.
Charts 3a-3e (Appendix) show a varying linear relationship between CMS-corrected readmission rates and raw readmission rates depending on hospital size.
In an attempt to adjust results for statistical variability in small volumes, the corrections done by the Hierarchical Logistic Regression Models have unintended and confusing consequences. By adjusting for low volume in the measures, low-volume hospitals, as a group, are pulled toward the mean, displacing high-volume hospitals to the high and low extremes. What is counterintuitive is that low volumes are typically associated with poorer outcomes in the medical literature. As shown below, when comparing low- and high-volume centers, a lower-volume center with a worse raw 30-day readmission rate can ultimately be rated higher than a high-volume center with a better raw 30-day readmission rate.
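The rank flip can be illustrated with a simple empirical-Bayes-style shrinkage toward the overall mean; the prior strength and rates below are illustrative, not CMS's actual hierarchical model:

```python
# Empirical-Bayes-style shrinkage sketch: pull each raw rate toward the
# overall mean, with a stronger pull for lower volumes. The prior strength
# and rates are illustrative assumptions, not CMS's actual model.
overall_mean = 0.16      # assumed national average readmission rate
prior_strength = 300     # pseudo-admissions behind the prior

def shrunken_rate(readmits, admissions):
    return (readmits + prior_strength * overall_mean) / (admissions + prior_strength)

small_raw, small_adj = 20 / 100, shrunken_rate(20, 100)      # worse raw rate
large_raw, large_adj = 360 / 2000, shrunken_rate(360, 2000)  # better raw rate

print(f"small: raw {small_raw:.3f} -> adjusted {small_adj:.3f}")
print(f"large: raw {large_raw:.3f} -> adjusted {large_adj:.3f}")
# The small hospital's worse raw rate ends up with the better adjusted rate.
```

Because the small hospital contributes only 100 admissions against a prior worth 300 pseudo-admissions, its 20% raw rate is pulled most of the way back to the 16% mean, flipping its rank against the large hospital.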
When volume correction is excluded, the small Texas hospital's readmission rate improves while the integrity of the ranking is maintained; the large hospitals in Chicago and Detroit retain a higher ranking.
While unable to test HWR directly due to suppression of actual readmissions, the same model principles are employed in HWR as with Heart Failure. In the Dec 2017 Release, the small hospital in Texas was corrected more than the large hospital in Detroit by CMS' adjusted measures, despite the larger hospital having a better raw 30-day readmission rate. This results in the large hospital in Detroit receiving a worse Readmission Domain score, as shown in Table 4.
Table 4. Results from Readmission Domain from Dec 2017 Release
On a larger scale, the Hierarchical Logistic Regression Model's impact on ranking can be seen in the following two charts. Smaller hospitals are compressed toward the middle and larger hospitals are displaced to the extremes.
Charts 4a-4b. Ranking Adjustments for COPD Readmissions by Hospital Size
Volume adjustment of outcome scores propagates through the entire star system, as these models influence three domains and 66% of the total score.
Table 5 shows that no small hospitals (based on HWR volume) have a 1-star rating and only 8% have a 2-star rating, whereas 37% of large hospitals have 1 or 2 stars.
Table 5. Distribution of Stars by Hospital Size
This difference isn't due to many more large hospitals providing poor quality, but to a measurement system that, when used for ranking, creates winners and losers based on size alone.
The Overall Rating is heavily based on Hierarchical Logistic Regression Models, and these models bias results based on hospital size.
3. Socioeconomic status is not adjusted for in the Star Rating but is adjusted for in the HRRP. This adversely affects urban hospitals.
The association between low socioeconomic status and readmission outcomes is well established, and many have advocated adjusting readmission rates for socioeconomic status (refs 3–6).
The 21st Century Cures Act legislated the requirement of inclusion of socioeconomic status (SES) into the calculation of financial penalties within HRRP.
Bernheim et al. (ref 7) showed a statistically significant relationship between socioeconomic factors, such as median income, and readmission rates for AMI, HF, and PN. These SES factors had a larger impact than more than a third of the medical comorbidities included in the readmission models.
3. Boozary AS, Manchin J, Wicker RF. The Medicare Hospital Readmissions Reduction Program: Time for Reform. JAMA. 2015 Jul 28;314(4):347–8.
4. Carey K, Lin M-Y. Hospital Readmissions Reduction Program: Safety-Net Hospitals Show Improvement, Modifications To Penalty Formula Still Needed. Health Affairs. 2016 Oct;35(10):1918–23.
5. Figueroa JF, Joynt KE, Zhou X, Orav EJ, Jha AK. Safety-net Hospitals Face More Barriers Yet Use Fewer Strategies to Reduce Readmissions. Medical Care. 2017 Mar;55(3):229.
7. Bernheim SM, Parzynski CS, Horwitz L, Lin Z, Araas MJ, Ross JS, et al. Accounting For Patients’ Socioeconomic Status Does Not Change Hospital Readmission Rates. Health Aff (Millwood). 2016 Aug 1;35(8):1461–70.
The Overall Rating program's exclusion of SES from the Readmission Domain is inconsistent with CMS' own HRRP.
Our own research found that the Summary Score of the Dec 2017 Overall Rating was significantly correlated with the proportion of dual-eligible patients, using data supplied by the HRRP program.
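The check itself is a straightforward correlation once the two per-hospital vectors are assembled. As a sketch (the data below are synthetic stand-ins constructed for illustration, not the actual CMS or HRRP data):

```python
import numpy as np

# Synthetic stand-ins: proportion of dual-eligible patients per hospital,
# and a Summary Score that declines with that proportion plus noise.
rng = np.random.default_rng(0)
dual_eligible = np.linspace(0.05, 0.45, 40)                    # hypothetical proportions
summary_score = 3.5 - 2.0 * dual_eligible + rng.normal(0, 0.1, 40)

# Pearson correlation between Summary Score and dual-eligible proportion.
r = np.corrcoef(dual_eligible, summary_score)[0, 1]
print(round(r, 2))  # strongly negative for this synthetic example
```

A significantly negative `r` on the real data is what indicates that hospitals serving more dual-eligible patients systematically receive lower summary scores.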
The following are a few examples of Illinois hospitals whose star ratings would change if the rating were corrected for socioeconomic status using the proportion of dual-eligible patients.
Tables 6a-6b. Changes to Overall Rating from SES Inclusion
Socioeconomic status was legislated into the calculation of readmission penalties because SES matters: it impacts outcomes and should be addressed in the Overall Rating model.
4. The use of a Latent Variable Model in the Star Ratings introduces variability and inconsistency, making changes in rating hard to interpret.
The Latent Variable Model has created confusion and contradictions in interpreting what makes a hospital safe. CMS runs three separate programs that evaluate hospital safety: Value Based Purchasing (VBP), the Hospital Acquired Condition Reduction Program (HACRP), and the Overall Rating.
These three programs largely use the same measures, yet they produce inconsistent results on which hospitals are safe.
Table 7. Safety Measures for CMS Programs
For Overall Ratings, the latent variable model continues to peg PSI-90 as the overwhelming favorite for measuring safety.
Table 8. Loading Factors for Safety Domain by Release
Chart 5. Feb 2019 Safety Domain score vs PSI-90 score
Twenty Hospital-Specific Reports confirm the nearly perfectly linear relationship, identified from the loading coefficients, between the Safety Domain score and the PSI-90 score; the hospital-acquired infection measures are insignificant.
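The arithmetic behind this observation is simple to demonstrate. The sketch below uses synthetic hospital scores and an illustrative loading vector in which one measure dominates (these are not CMS's actual loadings); the resulting domain score is almost a copy of that single measure:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical standardized safety measures for 500 hospitals:
# column 0 plays the role of PSI-90, columns 1-5 the HAI measures.
measures = rng.standard_normal((500, 6))

# An illustrative loading vector where one measure dominates,
# as the reported Safety Domain loadings do.
loadings = np.array([0.98, 0.05, 0.05, 0.05, 0.05, 0.05])

domain_score = measures @ loadings

# The domain score is almost perfectly correlated with the dominant
# measure alone, so the other five measures barely matter.
r = np.corrcoef(domain_score, measures[:, 0])[0, 1]
print(round(r, 3))  # close to 1
```

With a loading of 0.98 against five loadings of 0.05, the dominant measure contributes about 99% of the domain score's variance, which is why the scatter of Safety Domain score against PSI-90 collapses onto a line.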
This trend was identified in the Dec 2017 release; the LVM then switched to THA/TKA Complications in the unreleased Jun 2018 version, only to return to PSI-90 in Feb 2019.
Charts 6a and 6b show little to no correlation between HACRP and the VBP Safety Domain in the Dec 2017 release: 284 hospitals received a 1% HACRP payment penalty yet had above-average safety scores in the Overall Star Rating.
Chart 6a-6b. Correlation of Overall Rating Safety with HACRP and VBP Safety
Inconsistent safety measurement creates confusion across the results of these CMS programs; patients and hospitals do not know which verdict on safety to believe.
We believe the Overall Star Rating, at this time, does not achieve the aim of a transparent measure of quality and safety that is easy for consumers and hospital quality leaders to understand. We also believe that those pushing for a refresh of these measures would rather wait for an accurate measure than accept one so dramatically affected by the mathematics described above. Because of the cumulative effect of biases due to inadequate or inappropriate adjustment for socioeconomic status, hospital size, and outlier patients given heroic care, the star ratings inadvertently penalize large hospitals and academic medical centers. In academic arguments, these individual effects may be perceived as small. But as we and other authors, including Bernheim et al., have described, the effect of socioeconomic status on hospital measures is stronger than that of many chronic disease comorbidities and may account for more than a quarter of all hospitals changing ratings. Heroic care, as we have shown, may adversely impact rating. Finally, simply being a large hospital may adversely affect rating and may carry a financial penalty.
These issues could be mitigated with four changes to the current Star Ratings and HRRP programs. First, aligning the adjustment for socioeconomic status in the Stars program with that of the HRRP would be a logical and consistent method for measuring quality. Second, capping the impact of volume on adjustment and incorporating confidence intervals would address the problem of volume distorting rates. Third, removing the impact of outlier readmissions from the readmission measure would eliminate the undue influence of individual patients on rates and, we speculate, reduce the risk of adverse outcomes from unintended policy consequences. Finally, abandoning the latent variable model in the composite Overall Rating would address its lack of consistency.
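The second suggestion, reporting uncertainty rather than silently shrinking point estimates, can be sketched with the standard Wilson score interval. The volumes and event counts below are hypothetical:

```python
from math import sqrt

def wilson_interval(events, n, z=1.96):
    """95% Wilson score interval for a proportion, e.g. a readmission rate."""
    p = events / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return center - half, center + half

# The same 25% raw readmission rate at very different volumes:
lo_s, hi_s = wilson_interval(events=5, n=20)      # small hospital: wide interval
lo_l, hi_l = wilson_interval(events=500, n=2000)  # large hospital: narrow interval

# The small hospital's rate is simply uncertain, and the interval says so
# directly, instead of the model quietly moving its estimate toward the mean.
print(hi_s - lo_s > hi_l - lo_l)  # True
```

Publishing the interval alongside the raw rate lets readers see that a small hospital's estimate is imprecise, without re-ranking hospitals by size.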
We also believe the time has arrived for 21st-century methods of measuring quality care. Tremendous progress in the use of electronic data has enabled high-quality information to be captured by our electronic record systems. Patient access to data has similarly been transformed through standards such as FHIR and the inclusion of these data on mobile devices like the iPhone. Patients deserve high-quality measurement methods that are not one-size-fits-all but personalized and precise. The next evolution of measurement should be accurate and personalized, guiding patients to the best care possible. The science behind ranking one hospital or provider against another is complicated. We are hopeful that those producing these rankings will listen when the medical community provides information, and that misleading findings can be held back. Without correcting for the factors described above, releasing the Stars could very well have a detrimental effect on both providers and consumers.
We encourage you to comment below so that we can continue to refine our understanding and insights.
Charts 3a-3e. CMS Readmission vs Raw Readmissions – By Hospital Size – Heart Failure