The resulting database is a patchwork, built from the individual components that each state’s data systems capture and from the numbers that local political leaders allow to be published. Fusing 56 state and territorial data sets can be a fraught, complex process, and the project publishes exhaustive documentation of what the numbers mean, how they compare to one another, and what we still don’t know because of the variability of state reporting.
One of the most obvious elisions is the toll that the pandemic has taken on Black, Latino, and Indigenous people. The pandemic has disproportionately killed people in these communities, our data show. At least one in every 800 Black Americans has died of COVID-19, and Black people have died of COVID-19 at 1.7 times the rate of white people. Nationwide, Indigenous people and Alaska Natives have died of COVID-19 at 1.4 times the rate of white people.
Yet the full scale of this damage is not quantifiable, because many states still do not track enough data by race and ethnicity for us to identify the full, disparate impact. Texas, for instance, reports race and ethnicity data for only 4 percent of cases. New York has never reported race and ethnicity data, which obscures our understanding of the first surge in particular, when New York’s numbers dominated every national statistic.
Only seven states report testing data broken down by race. Such data are an important tool for gauging how large outbreaks are overall, because knowing the fraction of a population that has been tested can indicate the breadth of the virus’s spread.
Because of such inconsistencies and gaps, the COVID Tracking Project team has also communicated with state and federal officials hundreds of times over the past 10 months to clarify the meaning of specific numbers and to push for higher data quality and more public transparency.
This effort meant that, for months, the COVID Tracking Project published the only public database of testing and hospitalization data. Today, it is the only data set detailing each state’s and territory’s daily case, testing, hospitalization, and death numbers since the pandemic began. The federal government, including the White House Coronavirus Task Force, has used data from our investigation because it has had no alternative. The CDC Advisory Committee on Immunization Practices has repeatedly cited our data on long-term-care facilities in the course of deciding that residents of those places should get vaccinated first.
Today, the federal government publishes data on many of the same metrics we began tracking in March. But for many of these metrics, our data remain the only independent check on the federal figures.