Ratio of Case Counts to New Infections

My last post took shape less as a communique than as an exercise in thinking out loud. In the midst of that ordeal I did run a new analysis that enables a more direct and streamlined method of estimating the rate of covid contagion through the US population.

It’s well known that daily dx-positive case counts underestimate new infections. In June the CDC estimated that, in the US, there had been ten times as many infections as case-positives. But that ratio isn’t a constant: early in the pandemic, when very little testing was being done, the infections-to-cases ratio was far higher. As testing increased over the months the ratio decreased.

The other day as part of my convoluted last post I ran week-by-week analyses on data from the past four months looking for the best predictor of the subsequent week’s covid death counts. It turned out that the prior week’s death count and the case count from 3 weeks prior were the two best predictors. They’re not interchangeable predictors, so I used an average of the two in my analyses. However, each proved to be pretty accurate as an individual predictor.

I’ve been using death counts as a 3-week lagging indicator of infections, and that algo was supported by the 4 months’ worth of data. But it turned out that the inverse relationship was also supported: changes in case counts are a good leading indicator of changes in death counts 3 weeks later.

Given the stable relationship over the past four months between case counts and subsequent deaths, along with the stable relationship between deaths and infections, it seemed likely that the relationship between case counts and infections had also stabilized. It’s still true that case counts underestimate infections, but based on the data it seemed possible to compensate for the case-count underestimate by means of an empirically supported conversion formula.

The method for deriving the conversion formula was based on three variables:

  1. Take the weekly total death count and divide it by the estimated covid fatality rate of .0065: that’s the estimated number of infections.
  2. Take the weekly total case count from three weeks prior to variable 1.
  3. Divide variable 1 by variable 2: that’s the ratio of estimated infections to cases.
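The three steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet actually used: the function names are made up for the example, and the weekly series would have to be supplied by the reader.

```python
# Assumed infection fatality rate, the .0065 figure used in this post.
IFR = 0.0065


def infections_to_cases_ratio(deaths, cases_three_weeks_prior, ifr=IFR):
    """Ratio of estimated infections to reported cases for one window."""
    estimated_infections = deaths / ifr                      # variable 1
    return estimated_infections / cases_three_weeks_prior    # variable 3


def mean_ratio(deaths_by_week, cases_by_week, lag_weeks=3, ifr=IFR):
    """Average the ratio over every week that has a case count lag_weeks earlier.

    Both lists are hypothetical inputs: one value per week, oldest first.
    """
    ratios = [
        infections_to_cases_ratio(deaths_by_week[i], cases_by_week[i - lag_weeks], ifr)
        for i in range(lag_weeks, len(deaths_by_week))
    ]
    return sum(ratios) / len(ratios)


# One 14-day window from the earlier post: 31,766 deaths (27Nov - 11Dec)
# against 2.22 million cases three weeks prior (6Nov - 20Nov).
print(round(infections_to_cases_ratio(31_766, 2_220_000), 1))  # 2.2
```

A single window like this one won't necessarily match an average taken over several months of weekly data.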

I ran the numbers week by week across the last four months of data, and the value for variable 3 proved quite stable, averaging around 2.7. So that’s the proposed conversion formula:

New infections =  2.7 x new cases

So, from 27 November to 11 December, there were 2.82 million new dx-positive cases reported in the US. Multiply 2.82 million by 2.7 = 7.61 million new infections, or about 2.3 percent of the US population. That estimate falls snugly within the 2.15% – 2.75% I estimated in the last post.
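For anyone who wants to rerun that conversion with fresher case counts, here is a minimal sketch. The constant and function names are mine, chosen for the example; the 328 million population figure is the one used in the last post.

```python
US_POPULATION = 328_000_000   # population figure used in the last post
CONVERSION_RATIO = 2.7        # estimated infections per reported case


def estimate_infections(new_cases, ratio=CONVERSION_RATIO):
    """Convert a reported case count into an estimated infection count."""
    return new_cases * ratio


def share_of_population(infections, population=US_POPULATION):
    """Fraction of the population represented by the infection count."""
    return infections / population


new_cases = 2_820_000  # reported cases, 27 November to 11 December
infections = estimate_infections(new_cases)
print(f"{infections / 1e6:.2f} million infections")            # 7.61 million infections
print(f"{share_of_population(infections):.1%} of population")  # 2.3% of population
```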

On the go-forward I’m going to use this conversion ratio for estimating infection rates in the US, which are likely to continue climbing even as the vaccine is being rolled out. I’ll also keep track of the week-to-week data to see if the ratio needs adjusting.






NOW How Many Infected?

On 21 November I wrote a post estimating that 1.9 percent of the American population was currently infected with covid. Now, 3 weeks later, it’s time to validate and update that estimate. I’ll show my work, but I’m not expecting anyone who might be reading this post to follow the circuitous path I’ll be following. In brief, here’s my conclusion:

As of today, December 12, between 2.15 and 2.75 percent of Americans are currently covid-infected.

Over the past two weeks (27Nov – 11Dec) 31,766 Americans died of covid. I’m assuming that death lags behind infection by three weeks. So, as of 3 weeks ago, there would have been 31,766/.0065 = 4.9 million Americans actively infected by the virus.

Over the past two weeks (27Nov – 11Dec) the 14-day total new case count was 2.82 million. Three weeks prior (6Nov – 20Nov) the 14-day total new case count was 2.22 million. So, the current case count is 2.82/2.22 = 1.27 times what it was three weeks ago. Assuming that changes in case counts are proportional to changes in new infections, the total number of infected people today is 1.27 times the number of people who were infected three weeks ago.

Using death count as a 3-week lagging indicator, 4.9 million infected 3 weeks ago x 1.27 = 6.22 million Americans are currently infected by covid. That’s almost exactly the same as the 6.24 million current infections I estimated as of 21 November. Is that likely, given that the current case count is 27% higher than 3 weeks ago? Testing rates are up, so increasing case counts could be an artifact of more testing. However, the test-positive percentage is also up…
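The chain of arithmetic in the last three paragraphs, sketched in Python with the figures quoted above. Variable names are just for illustration. Carrying the unrounded intermediate values through gives roughly 6.21 million rather than the 6.22 million that comes from multiplying the rounded 4.9 and 1.27.

```python
IFR = 0.0065                   # assumed infection fatality rate

deaths_14d = 31_766            # deaths, 27Nov - 11Dec
cases_now = 2_820_000          # 14-day cases, 27Nov - 11Dec
cases_3wk_ago = 2_220_000      # 14-day cases, 6Nov - 20Nov

# Deaths lag infections by 3 weeks, so deaths / IFR dates the infections 3 weeks back.
infected_3wk_ago = deaths_14d / IFR

# Assume infections grew in proportion to reported cases.
growth = cases_now / cases_3wk_ago
infected_now = infected_3wk_ago * growth

print(f"{infected_3wk_ago / 1e6:.2f} million infected 3 weeks ago")  # 4.89
print(f"growth factor {growth:.2f}")                                 # 1.27
print(f"{infected_now / 1e6:.2f} million infected now")              # 6.21
```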

How many of those 6.24 million current infections from 3 weeks ago would likely have died by now? Multiply infections by estimated fatality rate: 6.24 million x .0065 = 40,600. That’s quite a bit higher than the most recent 14-day death count of 31,766…
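That consistency check is a one-liner, but spelling it out shows how large the gap is. A rough sketch, using only the numbers already quoted:

```python
IFR = 0.0065                       # assumed infection fatality rate

infected_21nov = 6_240_000         # estimate of current infections as of 21 November
actual_deaths_14d = 31_766         # deaths, 27Nov - 11Dec

expected_deaths = infected_21nov * IFR
overshoot = expected_deaths / actual_deaths_14d - 1

print(f"expected {expected_deaths:,.0f} deaths")   # expected 40,560 deaths
print(f"overshoot {overshoot:.0%}")                # overshoot 28%
```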

So, how might these misalignments between algorithm and evidence be reconciled? Maybe the most reasonable adjustment is to reduce the lag between test-positive and death from three weeks to two. It’s also the case that there’s a two-week lag between test-positives and deaths in the European covid data. That’s reasonable, since diagnostic testing typically lags a week after infection. Does this proposed shift in lag times make the numbers correspond more closely?

The 14-day death count from 2 weeks ago (13Nov – 27Nov) was 21,085. Again, the most recent 14-day case count (27Nov – 11Dec) is 2.82 million. Beginning two weeks prior, the 14-day case count (13Nov – 27Nov) was 2.40 million. 2.82/2.40 = 1.18. So if deaths lag 2 weeks after infection, then the most recent 14-day death count would be estimated at 21,085 x 1.18 = 24,900. This estimate is quite a bit below the actual count of 31,766. So, split the difference: death counts lag about 18 days behind case counts? Or maybe it’s just a matter of ordinary statistical variation around the expected value…
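Here is the two-week-lag test as a short sketch, again with illustrative variable names. Without rounding the case ratio up to 1.18, the estimate comes out slightly lower, at about 24,775, which doesn't change the conclusion.

```python
deaths_2wk_ago = 21_085        # 14-day deaths, 13Nov - 27Nov
cases_now = 2_820_000          # 14-day cases, 27Nov - 11Dec
cases_2wk_ago = 2_400_000      # 14-day cases, 13Nov - 27Nov
actual_deaths = 31_766         # 14-day deaths, 27Nov - 11Dec

# If deaths track cases with a 2-week lag, deaths should have grown
# by the same factor that cases grew over those two weeks.
case_growth = cases_now / cases_2wk_ago
projected_deaths = deaths_2wk_ago * case_growth
shortfall = actual_deaths - projected_deaths

print(f"projected {projected_deaths:,.0f}")   # projected 24,775
print(f"shortfall {shortfall:,.0f}")          # shortfall 6,991
```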

Again, case counts depend on testing rates, and those rates have continued to climb. That’s why I’ve used death rates as a lagging but more accurate indicator of infection rates. It’s probably at least as accurate to project future death rates from current trends, then work backward from future deaths to present infections.

“Probably at least as accurate”? That’s an empirical question, so I ran the numbers. I compared week-to-week data on case counts and death counts over the past four months, looking for the most accurate lag; i.e., the time interval at which weekly changes in case counts come closest to the weekly changes in deaths. The 3-week lag is upheld as most accurate in this comparison. Then I compared the 3-week lagged case-to-deaths change rates with the week-to-week changes in death rates. And it turns out that the death count in the preceding week is just about as accurate in predicting deaths the following week as is the case count from 3 weeks prior.
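One way such a lag comparison could be set up is sketched below. This is my guess at a reasonable formalization, not a record of the calculation actually performed: it scores each candidate lag by the mean absolute gap between the week-over-week change in cases and the change in deaths that many weeks later, and picks the lag with the smallest gap. The function names and the error measure are assumptions.

```python
def weekly_change(series):
    """Week-over-week ratios: series[i + 1] / series[i]."""
    return [later / earlier for earlier, later in zip(series, series[1:])]


def lag_error(cases, deaths, lag):
    """Mean absolute gap between case changes and death changes `lag` weeks later."""
    case_chg = weekly_change(cases)
    death_chg = weekly_change(deaths)
    # Pair the case change at week i with the death change at week i + lag.
    pairs = list(zip(case_chg, death_chg[lag:]))
    if not pairs:
        raise ValueError("series too short for this lag")
    return sum(abs(c - d) for c, d in pairs) / len(pairs)


def best_lag(cases, deaths, candidate_lags=(1, 2, 3, 4)):
    """Candidate lag (in weeks) with the smallest mean absolute gap."""
    return min(candidate_lags, key=lambda lag: lag_error(cases, deaths, lag))
```

Both inputs are weekly totals, oldest first, covering the same weeks. With only four months of data there are few pairs per lag, so small differences between lags shouldn't be over-read.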

So in projecting the death rate 3 weeks from now I’ll take the average of the two best forecasting measures: the change in case count from 3 weeks ago, expressed as a ratio = 1.33; the weekly change ratio in deaths for the past week, raised to the third power = 1.69; average = 1.51. Multiply that number by twice the most recent 7-day death count: 1.51 x 34,600 = 52,250 projected 2-week death count 3 weeks from now. Now divide that projected death count by the fatality rate of .0065 = 8 million Americans currently covid-infected. Divide by the 328 million US population = 2.45 percent of Americans currently covid-infected. Estimate the range of variation as the values for the two predictors, 1.33 and 1.69, considered separately. I.e.:

Between 2.15 and 2.75 percent of Americans are currently covid-infected.
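The projection and its range can be reproduced with the sketch below. The two growth factors are taken as given from the text, and the function name is invented for the example. Run without rounding, the low and high ends come out at about 2.16 and 2.74 percent, which I've rounded to 2.15 and 2.75.

```python
IFR = 0.0065                    # assumed infection fatality rate
US_POPULATION = 328_000_000

case_factor = 1.33              # change in case count from 3 weeks ago
death_factor = 1.69             # weekly change in deaths, cubed
baseline_deaths_14d = 34_600    # most recent 7-day death count x 2


def infected_share(growth_factor):
    """Share of the population currently infected, implied by one growth factor."""
    projected_deaths = growth_factor * baseline_deaths_14d   # 2-week deaths, 3 weeks out
    infected_now = projected_deaths / IFR
    return infected_now / US_POPULATION


average_factor = (case_factor + death_factor) / 2

print(f"central {infected_share(average_factor):.2%}")   # central 2.45%
print(f"low     {infected_share(case_factor):.2%}")      # low     2.16%
print(f"high    {infected_share(death_factor):.2%}")     # high    2.74%
```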

My first post in this coronavirus series, dated 6 April 2020, is titled “Corona Modeling: The Present as Past Projected Into Future.” Direct measurement of covid infection rates hasn’t improved since then: there are still no national randomized surveys of test-positives or immunities. It’s still necessary to bounce back and forth in time to arrive at an estimate of the present state of the world. Multivariate regression combining several predictor variables would likely increase the accuracy and narrow the range of variability; however, I’ve tried to keep the math relatively simple. But it’s inherently a complex situation, the spread of the virus through the population…

Now that an effective vaccine is ready to be rolled out, the math will begin to change dramatically…