Richard Sprague

R-tude: counting my sleep locations

2018-09-18


An etude is a short musical composition intended for intensive practice on a particular skill. My R-tudes are similar exercises to help me develop my R skills.

This is an R-tude that uses the tidyverse and a few summarization functions to quickly generate a summary table of where I have slept over the past few years. It’s based on an Excel spreadsheet that I read into the variable lifetime, a simple data frame with two columns: Date and Geo, the geographical location where I slept on that night.

In tibble form, it looks like this:

lifetime
## # A tibble: 651 x 2
##    Date       Geo          
##    <date>     <fct>        
##  1 2017-01-01 Mercer Island
##  2 2017-01-02 Mercer Island
##  3 2017-01-03 Mercer Island
##  4 2017-01-04 Mercer Island
##  5 2017-01-05 Mercer Island
##  6 2017-01-06 Mercer Island
##  7 2017-01-07 Mercer Island
##  8 2017-01-08 Mercer Island
##  9 2017-01-09 Mercer Island
## 10 2017-01-10 Mercer Island
## # ... with 641 more rows
summary(lifetime)
##       Date                       Geo     
##  Min.   :2017-01-01   Mercer Island:558  
##  1st Qu.:2017-06-12   San Francisco: 22  
##  Median :2017-11-21   Somerset     : 19  
##  Mean   :2017-11-21   Beijing      :  6  
##  3rd Qu.:2018-05-02   Philadelphia :  5  
##  Max.   :2018-10-12   (Other)      : 39  
##                       NA's         :  2
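
If you want to follow along without my spreadsheet, here is a minimal sketch of a stand-in with the same two-column shape. The dates and the mix of locations below are invented for illustration and are not taken from the real data; only the column names Date and Geo and their types come from the output above.

```r
library(tibble)

# Hypothetical stand-in for the spreadsheet: one row per night.
# The values are made up; only the structure mirrors the real data.
lifetime <- tibble(
  Date = seq(as.Date("2017-12-28"), as.Date("2018-01-06"), by = "day"),
  Geo  = factor(c(rep("Mercer Island", 4),
                  rep("Somerset", 3),
                  rep("Mercer Island", 3)))
)

lifetime
```

Storing Geo as a factor is what makes summary() report a count per location rather than just the column's length and class.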

Here’s how I summarize the number of nights per year, by location:

lifetime %>% tidyr::separate(Date, c("Year", "Month", "Day")) %>%
  dplyr::group_by(Year) %>% dplyr::count(Geo) %>%
  dplyr::filter(n > 2) %>% tidyr::spread(Year, n)
## # A tibble: 7 x 3
##   Geo           `2017` `2018`
##   <fct>          <int>  <int>
## 1 Beijing            6     NA
## 2 Mercer Island    306    252
## 3 Philadelphia       5     NA
## 4 Ross Dam          NA      3
## 5 San Francisco     20     NA
## 6 Somerset          13      6
## 7 Yakima            NA      3

There you go! This one line of R gives me another data frame organized by year, showing only the locations where I slept more than 2 nights in a given year. An NA means I spent 2 or fewer nights at that location that year.
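
For comparison, here is a sketch of the same idea written with tidyr::pivot_wider(), which supersedes spread() in tidyr 1.0.0 and later. It is not the code I ran above, just one possible equivalent. The nights data frame and its values are hypothetical toy data, and pulling the year out with format() instead of separate() is a swapped-in choice on my part.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data (not the real spreadsheet): one row per night.
nights <- tibble(
  Date = seq(as.Date("2017-12-27"), as.Date("2018-01-09"), by = "day"),
  Geo  = factor(c(rep("Mercer Island", 5),  # last 5 nights of 2017
                  rep("Yakima", 3),         # 3 nights in 2018
                  rep("Somerset", 2),       # 2 nights in 2018
                  rep("Mercer Island", 4))) # 4 nights in 2018
)

by_year <- nights %>%
  mutate(Year = format(Date, "%Y")) %>%  # year as a character column
  count(Year, Geo) %>%                   # nights per year and location
  filter(n > 2) %>%                      # keep pairs with more than 2 nights
  pivot_wider(names_from = Year, values_from = n)

by_year
```

With this toy data, Somerset drops out because it has only 2 nights, and Yakima gets an NA in the 2017 column because it has no qualifying nights that year.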