Can County-Level Risk of COVID19 Be Predicted?
We at Base Camp Health have been diligently compiling county-level data to help understand what drives COVID19 case rates across the United States. While we realize there is an influx of information on the topic, we aim to be transparent and flexible during this time, supplying analytics that are hopefully useful to anyone interested.
To study county-level information, we used data from sources such as:
As we began this analytic adventure, we noticed how differently risk factors are distributed across the country. We focused on risk factors described by subject-matter leaders, such as the CDC, the WHO, and other public health agencies. Below is a look at risk as it may pertain to health system factors, such as the number of hospitals, ICU beds, and physicians.
Now, notice how the map changes when focusing on health outcomes, such as the percentage of residents with poor health, obesity rates, smoking rates, and the percentage of residents taking preventive health measures. Risk is noticeably more concentrated in the southeastern portion of the country.
And let’s not forget the social and environmental factors, such as rates of homelessness, median income, and lack of access to healthy foods.
These factors are obviously correlated with one another as well. For instance, poor health outcomes are more prevalent in impoverished areas, and areas with greater access to care may have higher rates of preventive health measures. Furthermore, each of these factors likely does not carry the same weight in estimating COVID19 cases.
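As a rough illustration of how the overlap between two factors could be checked, here is a minimal sketch that computes a Pearson correlation coefficient. The `pearson` helper, the variable names, and the numbers are all assumptions made up for this example; they are not our data or our actual tooling.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Made-up illustrative values for five hypothetical counties (not real data).
poverty_rate = [0.10, 0.15, 0.22, 0.08, 0.30]
poor_health_rate = [0.12, 0.16, 0.21, 0.10, 0.27]

r = pearson(poverty_rate, poor_health_rate)
print(f"correlation between poverty and poor health: {r:.3f}")
```

A value near 1 or -1 would suggest two factors carry much of the same information, which is one reason they should not simply be added together with equal weight.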
This is where statistical modeling becomes essential: it can estimate risk while accounting for all (known) information at once. Base Camp Health has built a machine learning model that estimates which counties without a reported case (as of March 29, 2020) are most at risk. We found that the risk of having a case increases with factors such as the percentage of the population that reports poor health and the rate of severe housing problems in a county. Protective factors include the percentage of the population that takes preventive health measures, such as getting an annual flu vaccination. Combining this information, we predict the following counties are the most likely to see their first reported case soon:
- Bonneville County, Idaho
- Knox County, Illinois
- Saline County, Kansas
- Boyd County, Kentucky
- Hancock County, Maine
- Halifax County, North Carolina
- Grand Forks County, North Dakota
- Scioto County, Ohio
- Garfield County, Oklahoma
- Rogers County, Oklahoma
- Coffee County, Tennessee
- Ector County, Texas
- Salem (independent city), Virginia
- Cabell County, West Virginia
- Manitowoc County, Wisconsin
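For readers curious how a ranking like the one above could be produced, here is a minimal sketch. It assumes a plain logistic regression fit by gradient descent, which is a stand-in chosen for illustration and not necessarily the model we built. The county names, feature values, and the helpers `fit_logistic` and `predict` are all hypothetical.

```python
from math import exp

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + exp(-z))
    ez = exp(z)
    return ez / (1.0 + ez)

def predict(weights, bias, features):
    """Estimated probability that a county has a reported case."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def fit_logistic(rows, labels, lr=0.5, epochs=1000):
    """Fit a logistic regression by batch gradient descent on the log loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict(weights, bias, x) - y  # prediction minus label
            grad_b += err
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Made-up values for six hypothetical counties (not real data). Features are:
# [share reporting poor health, severe housing problem rate, flu vaccination rate]
counties = {
    "County A": ([0.25, 0.20, 0.35], 1),  # 1 = has a reported case
    "County B": ([0.22, 0.18, 0.40], 1),
    "County F": ([0.24, 0.19, 0.36], 1),
    "County C": ([0.12, 0.10, 0.55], 0),  # 0 = no reported case yet
    "County D": ([0.10, 0.08, 0.60], 0),
    "County E": ([0.20, 0.16, 0.42], 0),
}

rows = [features for features, _ in counties.values()]
labels = [has_case for _, has_case in counties.values()]
weights, bias = fit_logistic(rows, labels)

# Rank only the counties with no reported case, highest estimated risk first.
ranked = sorted(
    ((name, predict(weights, bias, features))
     for name, (features, has_case) in counties.items() if has_case == 0),
    key=lambda item: item[1],
    reverse=True,
)
for name, risk in ranked:
    print(f"{name}: {risk:.2f}")
```

The idea is that counties without a case that most resemble counties with one rise to the top of the list. With real data, features on very different scales would usually be standardized first, and the model would be checked against held-out counties.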
We at Base Camp Health hope our analytics are wrong, and that these locations never confirm a case of COVID19. We also realize we lack a lot of powerful information, such as how the availability of testing and the processing of tests differ between counties. However, we are committed to continuing to understand what drives case rates among counties, not to alarm but to learn.
Check back periodically as we update our models and measure their precision. Please let us know if there is any information you and your teams need during this time. Stay safe, everyone!