Calculating regressions from a physics experiment

technology

Published

October 8, 2017

Think of this as a test post, as I try to get this site up and running (in R blogdown), and also while trying to solve a problem set posted by our high school physics AP class.

The raw data comes from a Video Physics, an iPhone/iPad app that captures x-y position data frame-by-frame from a video, in this case a ball that was thrown across the room. Once captured, the data was saved to a CSV file that I read into the variable ball and clean up as follows:

Code

names(ball) <- c("t4","x","y","xv","yv")
ball$t <- ball$t4/4
head(ball)

library(tidyr)
#ball.tidy <- gather(ball,x:y,coord,-t,-xv,-yv)
names(ball)

The data was captured in slow motion, only 4 frames per second, so to convert everything to seconds, I had to divide the time t by 4.

In throwing the ball across the room, we expect the horizontal component of the motion vectors to move in a straight line. Sure enough, that’s what we see when plotting the x (horizontal) motion against time:

Code

library(ggplot2)

ggplot(data=ball, aes(x=t/4,y=x)) + geom_point()

Now use the R lm() function to build a linear model of the motion for that ball. Because I’m using a simple non-quadratic equation, a regression (computed with predict()) shows a linear trend line full of errors. Here is a plot showing the errors (“residuals”) marked in green. If you add up all the green lines (positive and negative), you’d find they sum to zero.

Code

plot(ball$t,ball$y,pch=21,col="blue",bg="red")
model1 <- lm(ball$y~ball$t)
abline(model1,col="red")
yhat <- predict(model1,actual=y)
join <- function(i)
 lines(c(ball$t[i],ball$t[i]),c(ball$y[i],yhat[i]),col="green")
ignore = sapply(1:nrow(ball),join)

Let’s try a non-linear, quadratic version using the same data

Code

plot(ball$t,ball$y,pch=21,col="blue",bg="red")
model2 <- lm(ball$y~ball$t+I(ball$t^2))
#windows(7,7)
par(mfrow=c(2,2))
plot(model2)
# 
# abline(model2,col="yellow")
# yhat <- predict(model2,actual=y)
# join <- function(i)
#  lines(c(ball$t[i],ball$t[i]),c(ball$y[i],yhat[i]),col="green")
#sapply(1:nrow(ball),join)

Study the difference between these two models (linear vs quadratic):

Code

model2


anova(model1,model2)

Model 2 (the quadratic equation) is clearly superior, showing an extremely low P value for the probability that this data would map so closely to our model based on chance alone.