COVID-19 and its R-Squared Relationship to GDP Change

Created: 2020-09-04 ; Updated: 2020-09-04

The highly-respected data visualization site, Our World in Data, published the blog post Which countries have protected both health and the economy in the pandemic? that concludes:

Comparing the COVID-19 death rate with the latest GDP data, we in fact see the opposite: countries that have managed to protect their population’s health in the pandemic have generally also protected their economy too

But the post never actually quantifies this, relying instead on a chart that kinda-sorta seems to justify the conclusion. Can we do better?

Since Our World in Data freely opens all their source data, our first step is to read it into R:

data_raw <- read_csv(file.path("q2-gdp-growth-vs-confirmed-deaths-due-to-covid-19-per-million-people.csv"),
                     col_types = "f_cdid_i")  %>%
  transmute(country=Entity, date=mdy(Date),
            deaths = `Total confirmed deaths due to COVID-19 per million people (deaths per million)`,
           # population = `Population, total`, 
            gdp_growth = `GDP growth from previous year, 2020 Q2 (%)` )

data_raw %>% head() %>% knitr::kable(caption = "Raw data (first ten rows)" )

Table 1: Raw data (first ten rows)
country	date	gdp_growth
Afghanistan	2019-12-31	NA
Afghanistan	2020-01-01	NA
Afghanistan	2020-01-02	NA
Afghanistan	2020-01-03	NA
Afghanistan	2020-01-04	NA
Afghanistan	2020-01-05	NA

To manipulate data, we first need to convert the table to something more Tidy, and just include data from 2020:

library(kableExtra)
options(kableExtra.html.bsTable = TRUE)

data <- data_raw %>% dplyr::filter(date > "2020-01-01" & !is.na(gdp_growth)) 
data_df <- data %>% group_by(country) %>% summarise(deaths = sum(deaths), gdp = first(gdp_growth)) %>% 
  dplyr::filter(deaths>0) %>% ungroup()
data_df %>% arrange(gdp) %>%  knitr::kable(col.names = c("Country", "Deaths Per 1M", "GDP Change"), digits = 1) %>%
  kableExtra::kable_styling(bootstrap_options = c("striped", "hover", "condensed"))

Country	Deaths Per 1M	GDP Change
United Kingdom	77002.9	-21.7
France	62977.7	-19.0
Italy	81760.4	-17.3
Belgium	114558.5	-14.5
Canada	26685.4	-13.5
Austria	10738.1	-13.3
Singapore	579.1	-13.2
Germany	14154.6	-11.7
Czech Republic	4416.2	-10.7
Japan	964.6	-10.0
United States	50593.8	-9.5
Netherlands	48798.8	-9.0
Denmark	14195.3	-8.5
Sweden	66939.9	-8.3
Norway	6306.8	-5.3
South Korea	867.9	-3.0
Taiwan	45.7	-0.6
China	583.3	3.2

And finally, we plot the result, which happily resembles the one at Our World in Data:

data_df %>% ggplot(aes(y = deaths, x = gdp)) + geom_point() + 
    geom_text(aes(label = country), hjust = 0, vjust = 1)  +
    labs(title = "COVID 19 Deaths and GDP Performance",
         subtitle = paste("R^2 =",format(cor(data_df$deaths, data_df$gdp) ^2, digits = 3)))

In the subtitle of the graph, I’ve computed the \(r^2\) value 0.366, a result that is not especially high. With values that vary between 0 (no association) and 1 (perfect association), I’d want the relationship to be quite a bit stronger before using this as the basis for any policy decisions.