ANOVA

Imagine you love baking cookies and invite your friends over for a cookie party. You want to know how many cookies you should make so you ask your friends about how many cookies they think they will each eat. They respond:

  • Francesca: 5 cookies
  • Sydney: 3 cookies
  • Noelle: 1 cookie
  • James: 7 cookies
  • Brooke: 2 cookies

We take these numbers and add all of them together to estimate that about 18 cookies will be eaten in total at our party.

The Analysis of Covariance, or ANCOVA, is a regression model that includes both categorical and numeric predictors, often just one of each. It is commonly used to analyze a follow-up numeric response after exposure to various treatments, controlling for a baseline measure of that same response. For example, given two subjects with the same baseline value of the study outcome, one in a treated group and the other in a control group, will the subjects have different follow-up outcomes on average?

One of the assumptions of the Analysis of Variance (ANOVA) is constant variance. That is, the spread of residuals is roughly equal per treatment level. A common way to assess this assumption is plotting residuals versus fitted values. Recall that residuals are the observed values of your response of interest minus the predicted values of your response. In a one-way ANOVA, this is simply the observed values minus the treatment group mean. For example, below we have a plot of residuals versus fitted values for a one-way ANOVA.