Sound of the microbiome

Richard Sprague 2018-05-05

A graphic visualization is an effective way to communicate a data set, but we humans have other, non-visual senses too. Generating sound from a data set, also known as sonification, is a well-studied problem, but I wanted to know how to do a very simple version in R.

The best overall introduction to data sonification is at Programming Historian, which surveys the basic principles and offers a step-by-step working guide, including sample code for a few Python libraries. The instructions also point to Musicalgorithms, a web app version that lets you upload data and hear your sonification, no programming required. I couldn’t quite get the site to work on my quick try, but it might be worth more effort if you don’t have time to dig further into programming.

Sound synthesis packages

There are numerous sound-related packages in R. These are the main ones that appear to do what I want:

tuneR is the one I eventually went with for the example below. It was last updated in April 2017, so it’s pretty recent.

Seewave seems to be a recent package under active development (the most recent version is from March 2018). There is a nice I/O of Sound PDF that explains how to import and export sound, but I couldn’t find a quick-and-dirty example of sound synthesis.

Sound generation with soundgen (2018): lengthy documentation of a package intended for the synthesis of animal vocalizations, including human non-linguistic vocalizations like sighs, moans, screams, etc. It can also create non-biological sounds that require precise control over spectral and temporal modulations, such as special sound effects in computer games or acoustic stimuli for scientific experiments.

playitbyr is a package that’s exactly what I want: it lets you listen to a data.frame in R by mapping columns onto sonic parameters. Unfortunately, it doesn’t seem to work with R 3.4+.

Audiolyzr (2015) is another package by some grad students, but it hasn’t been updated. They also wrote a paper describing it generally.

Sonify (2017) looks good too, but I didn’t have time to try it.


TuneR tutorial: good example sonifying tweets. I couldn’t get this one to work, but you’ll see that it formed the basis of my working example below.

Sonification of Iris data (2015): a step-by-step example using the most famous data science data set.

Start with the tuneR package.

On a Mac, I set the output to the Audio File Play (afplay) utility, which is built into macOS.
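
A minimal sketch of that setup. The path /usr/bin/afplay is an assumption (it is the usual location on a Mac), so adjust it if yours differs:

```r
library(tuneR)

# Tell tuneR which command-line program play() should hand WAV files to.
# The path below is an assumption; confirm it with `which afplay` in a terminal.
setWavPlayer("/usr/bin/afplay")

# Show the player that is currently configured.
getWavPlayer()
```

On other systems the same call should work with whatever command-line audio player is installed.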


I’m going to sonify some microbiome data. I have five rows of microbes, with percentage abundances of each at the various sampling dates (columns). My data has hundreds of columns like this.

                 2014-05-16 2014-06-06 2014-10-17 2015-02-24 2015-04-21
Bifidobacterium       33.31       0.66       6.11       0.68       0.84
Faecalibacterium      15.09       7.76       3.70      22.86      20.79
Bacteroides            7.34      11.59      29.17      20.38      10.64
Akkermansia            6.86       2.13       0.85       0.00       1.42
Roseburia              5.91       5.93      15.30       8.45       9.99
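
To follow along without the full data set, the five columns shown above can be typed in by hand. This is only a sketch of the layout (microbes as rows, sampling dates as columns). The real data frame has hundreds of columns and was loaded some other way:

```r
# Rebuild the excerpt above as a data frame: one row per microbe,
# one column per sampling date. check.names = FALSE keeps the dates
# as column names instead of converting them to syntactic names.
microbiome_data <- data.frame(
  `2014-05-16` = c(33.31, 15.09,  7.34, 6.86,  5.91),
  `2014-06-06` = c( 0.66,  7.76, 11.59, 2.13,  5.93),
  `2014-10-17` = c( 6.11,  3.70, 29.17, 0.85, 15.30),
  `2015-02-24` = c( 0.68, 22.86, 20.38, 0.00,  8.45),
  `2015-04-21` = c( 0.84, 20.79, 10.64, 1.42,  9.99),
  row.names = c("Bifidobacterium", "Faecalibacterium", "Bacteroides",
                "Akkermansia", "Roseburia"),
  check.names = FALSE
)

# All five abundances for the first sampling date.
microbiome_data[, 1]
```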

Now we generate the sound. I admit the following is some pretty ugly code, but it works. I didn’t have time to figure out how to optimise it.

The basic idea is to start with a single instant of sound (the call to silence() below) and then loop one-by-one through each of the columns in my microbiome data frame. I generate five separate sine waves, each representing the tone of the microbiome abundance at that point in time.
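
Here is a sketch of a single step of that loop, written more compactly. One caveat: if the values are raw percentages as in the table, using them directly as frequencies in Hz puts most tones near or below the roughly 20 Hz lower limit of human hearing. The helper to_freq() below is hypothetical and its 220 to 880 Hz range is an arbitrary assumption. It is one possible way to rescale, not something the code in this post does:

```r
library(tuneR)

# Abundances (%) for the first sampling date, copied from the table.
abundance <- c(Bifidobacterium = 33.31, Faecalibacterium = 15.09,
               Bacteroides = 7.34, Akkermansia = 6.86, Roseburia = 5.91)

# Hypothetical helper: linearly map 0-100 % onto an audible range (Hz).
to_freq <- function(pct, low = 220, high = 880) {
  low + (high - low) * pct / 100
}

# One short sine wave per microbe, all with identical settings so
# that they can be added together sample by sample.
tones <- lapply(to_freq(abundance), function(f) {
  sine(f, duration = 0.1, xunit = "time", bit = 32, samp.rate = 6000)
})

# Sum the five waves into one chord for this sampling date.
chord <- Reduce(`+`, tones)
```

Reduce() does the same job as chaining the five sine() calls with +, and it keeps working if the number of rows changes.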

The end result must go through the normalize() function in order to preserve a standard bit depth, and then we can play it.
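
A small, self-contained sketch of that last step, using two made-up tones (440 Hz and 660 Hz are illustrative values, not microbiome data) and a temporary file instead of a real output path:

```r
library(tuneR)

# Two illustrative tones with the same settings used in this post.
w <- sine(440, duration = 0.5, xunit = "time", bit = 32, samp.rate = 6000) +
  sine(660, duration = 0.5, xunit = "time", bit = 32, samp.rate = 6000)

# Adding waves can push the amplitude outside the range a sound file
# expects, so rescale before saving or playing.
w <- normalize(w, unit = "32")

# Save to a temporary WAV file (an assumption for this sketch; any
# writable path would do).
out <- tempfile(fileext = ".wav")
writeWave(w, out)

# play(w)  # uncomment once a player has been set with setWavPlayer()
```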

library(tuneR)

sound.length <- 0.1   # seconds of sound per sampling date
sampling.rate <- 6000 # samples per second
bits <- 32

long.pause <- 0.5 # in seconds

# Start with a short silence; each sampling date is appended to it.
w <- silence(duration = long.pause, xunit = "time", bit = bits, samp.rate = sampling.rate)

# Loop over every sampling date (column) in the data frame.
for (i in seq_len(ncol(microbiome_data))) {
  # One sine wave per microbe (row); the abundance value is passed as the frequency.
  wobj <- sine(microbiome_data[1, i], duration = sound.length, xunit = "time", bit = bits, samp.rate = sampling.rate) +
    sine(microbiome_data[2, i], duration = sound.length, xunit = "time", bit = bits, samp.rate = sampling.rate) +
    sine(microbiome_data[3, i], duration = sound.length, xunit = "time", bit = bits, samp.rate = sampling.rate) +
    sine(microbiome_data[4, i], duration = sound.length, xunit = "time", bit = bits, samp.rate = sampling.rate) +
    sine(microbiome_data[5, i], duration = sound.length, xunit = "time", bit = bits, samp.rate = sampling.rate)
  # Append the new chord so playback runs from the earliest date to the latest.
  w <- bind(w, wobj)
}

# Rescale the summed waves before saving or playing.
w <- normalize(w, unit = "32")

# writeWave(w, "microbiome5.wav")

The final result can be saved as a local file, which I uploaded to Soundcloud and you can listen to here: