My last post took shape less as a communique than as an exercise in thinking out loud. In the midst of that ordeal I did run a new analysis that enables a more direct and streamlined method of estimating the rate of covid contagion through the US population.
It’s well known that daily dx-positive case counts underestimate new infections. In June the CDC estimated that, in the US, there had been ten times as many infections as case-positives. But that ratio isn’t a constant: early in the pandemic, when very little testing was being done, the infections-to-cases ratio was far higher. As testing increased over the months the ratio decreased.
The other day as part of my convoluted last post I ran week-by-week analyses on data from the past four months looking for the best predictor of the subsequent week’s covid death counts. It turned out that the prior week’s death count and the case count from 3 weeks prior were the two best predictors. They’re not interchangeable predictors, so I used an average of the two in my analyses. Even so, each proved to be pretty accurate as an individual predictor.
I’ve been using death counts as a 3-week lagging indicator of infections, and that algo was supported by the 4 months’ worth of data. But it turned out that the inverse relationship was also supported: changes in case counts are a good leading indicator of changes in death counts 3 weeks later.
Given the stable relationship over the past four months between case counts and subsequent deaths, along with the stable relationship between deaths and infections, it seemed likely that the relationship between case counts and infections had also stabilized. It’s still true that case counts underestimate infections, but based on the data it seemed possible to compensate for the case-count underestimate by means of an empirically supported conversion formula.
The method for deriving the conversion formula was based on three variables:
- Take the weekly total death count and divide it by the estimated covid infection fatality rate of .0065. That gives the estimated number of infections behind those deaths.
- Take the weekly total case count from three weeks before the week used for variable 1.
- Divide variable 1 by variable 2: that’s the ratio of estimated infections to cases.
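The three steps above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet I actually used: the function names and the list-based inputs (`weekly_cases`, `weekly_deaths`, one number per week, oldest first) are assumptions made for the example.

```python
IFR = 0.0065    # estimated fatality rate used in this post
LAG_WEEKS = 3   # deaths trail cases by about three weeks


def infections_to_cases_ratios(weekly_cases, weekly_deaths, ifr=IFR, lag=LAG_WEEKS):
    """Return one infections-to-cases ratio for each week that has
    a case count available `lag` weeks earlier.

    Both inputs are hypothetical lists of weekly totals, oldest week first.
    """
    if len(weekly_cases) != len(weekly_deaths):
        raise ValueError("weekly_cases and weekly_deaths must be the same length")

    ratios = []
    for week in range(lag, len(weekly_deaths)):
        estimated_infections = weekly_deaths[week] / ifr    # variable 1
        lagged_cases = weekly_cases[week - lag]             # variable 2
        ratios.append(estimated_infections / lagged_cases)  # variable 3
    return ratios


def mean_ratio(ratios):
    """Average the weekly ratios into a single conversion factor."""
    return sum(ratios) / len(ratios)
```

Feeding in the real weekly totals and calling `mean_ratio` on the result would give the single conversion factor; the first three weeks drop out because they have no case count three weeks earlier to pair with.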
I ran the numbers week by week across the last four months of data, and the value for variable 3 proved quite stable, averaging around 2.7. So that’s the proposed conversion formula:
New infections = 2.7 x new cases
So, from 27 November to 11 December, there were 2.82 million new dx-positive cases reported in the US. Multiplying 2.82 million by 2.7 gives 7.61 million new infections, or about 2.3 percent of the US population. That estimate falls snugly within the 2.15% – 2.75% range I estimated in the last post.
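For anyone who wants to rerun the arithmetic above with a different case count, here is a minimal Python sketch. The 2.7 ratio and the 2.82 million figure come from this post; the US population of 330 million is a round-number assumption, and the function name is made up for the example.

```python
CONVERSION_RATIO = 2.7        # infections per reported case, from this post
US_POPULATION = 330_000_000   # assumption: rough US population


def estimate_infections(reported_cases, ratio=CONVERSION_RATIO):
    """Scale reported dx-positive cases up to estimated infections."""
    return reported_cases * ratio


reported_cases = 2_820_000  # 27 November to 11 December
infections = estimate_infections(reported_cases)
share = infections / US_POPULATION

print(f"{infections / 1e6:.2f} million infections, {share:.1%} of population")
# prints: 7.61 million infections, 2.3% of population
```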
On the go-forward I’m going to use this conversion ratio for estimating infection rates in the US, which are likely to continue climbing even as the vaccine is being rolled out. I’ll also keep track of the week-to-week data to see if the ratio needs adjusting.