Who moves to Seattle?

local
Published

September 7, 2018

GeekWire’s Monica Nickelsburg wrote “Where do Seattle’s newcomers move from? Drivers license numbers reveal some surprises,” with a pretty Excel chart showing the top states from which people move into King County.

But her chart doesn’t correct for the population of each state. Can we do better in R?

First, I downloaded the raw data from the Washington State Department of Licensing, which appears to be the source for her article.

Then I converted all the data to Tidy format:

Code
library(tidyverse)

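# dol_path and census_pop_path are assumed to hold the local paths of the
# downloaded DOL spreadsheet and the Census population CSV
dol_king <- readxl::read_excel(dol_path, sheet = "King", skip = 5)
data(state)  # load the built-in state datasets (state.name, state.abb, ...)

# load state populations
census_pop <- read.csv(census_pop_path)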

# convert to Tidy dataset

dol_king$From[24] <- "Mississippi"  # fix an error in the DOL spelling
dol_king <- dol_king %>% setNames(stringr::str_replace(names(dol_king),"CY ","")) 
dol_king <- dol_king %>% gather(Year, Change, -From)

# dol_king is now a tidy data frame (tibble): one row per state and year, with the number of people who moved to King County from that state, 2006 through 2017

# Now do the same with census_pop

# keep just the population estimate columns, not the state names
# (as_tibble() replaces the deprecated tbl_df())
census_pop_no <- census_pop %>% select(starts_with("POPESTIMATE")) %>% as_tibble()

# strip the letters from the column names so that "POPESTIMATE2010" becomes "2010"
census_pop_no <- census_pop_no %>% 
  setNames(stringr::str_replace_all(names(census_pop_no),"[:alpha:]*",""))
census_pop <- cbind(select(census_pop,"NAME"),census_pop_no) %>% as_tibble()
census_pop <- census_pop %>% gather(Year,Population,-NAME)
census_pop$NAME <- as.character(census_pop$NAME)

dol <- dplyr::left_join(census_pop,dol_king, by = c("NAME" = "From", "Year" = "Year"))
dol$NAME <- factor(dol$NAME)
names(dol)[1] <- "From"

This gives me one handy variable, dol, with one row per state and year, holding both the state’s population and the number of people who moved from there to King County that year.

Code
dol
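
Because the join matches on state names, a single misspelling (like the one fixed above) leaves a row unmatched. Here is a minimal sketch of how that could be checked with `anti_join()`. The toy tables, the misspelled name, and the "United States" total row are hypothetical placeholders, not the real DOL or Census contents.

```r
library(dplyr)

# Toy stand-ins for census_pop and dol_king; all names and numbers are placeholders
census_toy <- tibble::tibble(
  NAME       = c("Oregon", "Mississippi", "United States"),
  Year       = "2017",
  Population = c(100, 200, 300)
)
dol_toy <- tibble::tibble(
  From   = c("Oregon", "Missisippi"),  # deliberately misspelled for the demo
  Year   = "2017",
  Change = c(10, 20)
)

# Census rows with no DOL match: these would get NA in Change after left_join()
unmatched_census <- anti_join(census_toy, dol_toy,
                              by = c("NAME" = "From", "Year" = "Year"))

# DOL rows with no Census match: left_join() would drop these silently
unmatched_dol <- anti_join(dol_toy, census_toy,
                           by = c("From" = "NAME", "Year" = "Year"))

print(unmatched_census)
print(unmatched_dol)
```

If both results are empty, every name lined up. Anything left over is either a spelling mismatch or a row that exists in only one source.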

Now it’s just a matter of applying simple calculations to normalize the data.
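
As a rough sketch of that normalization, the calculation is just movers divided by the origin state's population, scaled here to a rate per 100,000 residents so the numbers are easier to read. The `dol_toy` table and its figures are made up for illustration, and the per-100,000 scaling is my own choice. The heatmap itself uses the raw ratio.

```r
library(dplyr)

# Toy stand-in for `dol`; the figures are placeholders, not DOL or Census data
dol_toy <- tibble::tibble(
  From       = c("State A", "State B"),
  Year       = c("2017", "2017"),
  Population = c(1000000, 20000000),
  Change     = c(500, 2000)
)

# Movers per 100,000 residents of the origin state, highest rate first
dol_rate <- dol_toy %>%
  mutate(Rate = Change / Population * 1e5) %>%
  arrange(desc(Rate))

print(dol_rate)
```

In this toy case State B sends four times as many people in absolute terms, yet State A ranks first once population is taken into account. That reordering is exactly what the raw chart cannot show.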

Let’s draw this as a heatmap, with darker colors representing smaller percentages of a state’s population and lighter colors representing larger percentages.

Code
ggplot(data = dol, mapping = aes(x = Year, y = From, fill = Change / Population)) +
  geom_tile() + 
  scale_y_discrete(limits = rev(levels(dol$From)))

The lighter the color, the higher the percentage of people from that state (and year) who are moving to King County. Represented this way, a few states stand out: Alaska and Oregon, for example. Although their overall populations are relatively small, a comparatively large share of their residents move here. By comparison, relatively few residents of large states like California or Texas move here.

Interestingly, a non-obvious standout is Hawaii. I don’t normally think of Hawaiians as likely to move to Seattle, but percentage-wise they’re pretty high. In fact, for the last few years the average Hawaiian has been more likely to move here than the average Idahoan. Go figure.

You can also see a few trends over time. For example, although both Montana and Idaho have sent a fair share of people here since the early 2010s, their enthusiasm seems to have waned in the past few years. Similarly, Nevadans, I guess, decided to slow down too.
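
Trends like these can be easier to follow as lines than as tiles. Below is a minimal sketch of that alternative view, again on a made-up `dol_toy` table rather than the real data. It converts `Year` from text to an integer so that `geom_line()` connects the points in order.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for `dol`; all values are placeholders
dol_toy <- tibble::tibble(
  From       = rep(c("State A", "State B"), each = 3),
  Year       = rep(c("2015", "2016", "2017"), times = 2),
  Population = rep(c(1000000, 2000000), each = 3),
  Change     = c(300, 250, 200, 400, 400, 400)
)

# One line per origin state, showing movers per 100,000 residents over time
p <- dol_toy %>%
  mutate(Year = as.integer(Year),
         Rate = Change / Population * 1e5) %>%
  ggplot(aes(x = Year, y = Rate, colour = From)) +
  geom_line() +
  scale_x_continuous(breaks = 2015:2017) +
  labs(y = "Movers per 100,000 residents")

print(p)
```

With the real `dol`, filtering to a handful of states first (say Montana, Idaho and Nevada) would keep the chart legible.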

It’s a big country, so I wouldn’t read too much into this information – it’s not as though there’s a stampede in one direction or the other. Just normal people doing normal things.