In a previous post we demonstrated how to perform a basic mediation analysis. In this post we look at performing a moderated mediation analysis. The basic idea is that a mediation effect may depend on another variable called a "moderator". For example, in our mediation analysis post we hypothesized that self-esteem mediated the effect of student grades on student happiness. We illustrate this below with a path diagram. We see a direct effect of grades on happiness, but also an indirect effect of grades on happiness through self-esteem. A mediation analysis helps us investigate and quantify that indirect effect.
But what if we suspect that, say, gender moderates that indirect effect? In other words, what if we think that the mediation effect of self-esteem might differ between females and males? To analyze that question we use moderated mediation. The difference between mediation and moderated mediation is that we include an interaction for the moderator in our models. Let's demonstrate using R. First we read in the data from our mediation analysis post, but this time with a gender variable added. Notice we format gender as a factor. This is required for the mediation code to work. (Note: this data and example are fake and just for illustration.)
myData <- read.csv('http://static.lib.virginia.edu/statlab/materials/data/mediationData2.csv')
# make gender a factor variable
myData$gender <- factor(myData$gender)
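Because the values we later pass to test.modmed() have to match the levels of this factor, it is worth checking how the levels are coded. The sketch below uses a small made-up data frame (toy is a hypothetical stand-in, not the article's data) to show what factor() does to a character column of "F" and "M" values.

```r
# Toy stand-in for the real data; the values are invented for illustration
toy <- data.frame(gender = c("M", "F", "F", "M", "F"),
                  stringsAsFactors = FALSE)

# Convert the character column to a factor, as done for myData above
toy$gender <- factor(toy$gender)

# Levels are sorted alphabetically by default, so "F" comes first
gender_levels <- levels(toy$gender)
print(gender_levels)

# Counts per level; a quick way to spot typos or unexpected codes
gender_counts <- table(toy$gender)
print(gender_counts)
```

Running levels(myData$gender) and table(myData$gender) on the real data gives the same kind of check.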
Next we load the mediation package. If you don't already have the mediation package, run the install.packages() function below. Otherwise you can skip it.
install.packages("mediation")
library("mediation")
Now we define our mediator and outcome models with an interaction term for gender. The interaction needs to happen with both "treatment" and mediating variables. In this case, grades is our "treatment" and self-esteem is the mediator.
model.M <- lm(self.esteem ~ grades*gender, myData)
model.Y <- lm(happiness ~ grades*gender + self.esteem*gender, myData)
Notice this is just like the code in the mediation analysis post except we've added an interaction for gender in both models. The formula notation grades*gender is a shortcut for writing grades + gender + grades:gender, where ":" is the interaction operator in R's formula syntax. The interactions allow the effects of grades and self-esteem to vary according to gender. Now we run our mediation as before using the mediate() function with 1000 simulations.
results <- mediate(model.M, model.Y,
                   treat = 'grades',
                   mediator = 'self.esteem',
                   sims = 1000)
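Before comparing the two groups formally, it can be useful to look at the mediation estimates within each gender. The following is a sketch, not part of the original analysis. It assumes the covariates argument of mediate(), which conditions the estimates on the covariate values supplied, and it uses simulated data (sim_data, a hypothetical stand-in with the same variable names) so that it runs without downloading anything. The call to set.seed() makes the simulation draws repeatable.

```r
library(mediation)

# Simulated stand-in data (made up); same variable names as myData
set.seed(123)
n <- 200
sim_data <- data.frame(
  grades = rnorm(n, mean = 20, sd = 5),
  gender = factor(sample(c("F", "M"), n, replace = TRUE))
)
sim_data$self.esteem <- 1 + 0.5 * sim_data$grades + rnorm(n)
sim_data$happiness   <- 2 + 0.1 * sim_data$grades +
  0.6 * sim_data$self.esteem + rnorm(n)

# Same model specifications as above, fit to the simulated data
sim.M <- lm(self.esteem ~ grades*gender, sim_data)
sim.Y <- lm(happiness ~ grades*gender + self.esteem*gender, sim_data)

# Mediation estimates conditional on each level of the moderator
res.F <- mediate(sim.M, sim.Y, treat = "grades", mediator = "self.esteem",
                 covariates = list(gender = "F"), sims = 200)
res.M <- mediate(sim.M, sim.Y, treat = "grades", mediator = "self.esteem",
                 covariates = list(gender = "M"), sims = 200)

# d0 holds the ACME point estimate, z0 the ADE point estimate
acme <- c(F = res.F$d0, M = res.M$d0)
ade  <- c(F = res.F$z0, M = res.M$z0)
print(acme)
print(ade)
```

With the real data, the same two mediate() calls on model.M and model.Y would give the gender-specific estimates. Calling summary() on either result prints the estimates with their confidence intervals.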
Finally we perform the moderated mediation test using the test.modmed() function. This is where we perform the simulation draws to calculate uncertainty. The first argument is the output of the mediation analysis. The second and third arguments specify the two levels of the moderator we want to compare. Notice they each need to be a list object. The last argument specifies the number of simulations, which we once again set to 1000. Technically we don't need to include this argument. By default the test.modmed() function will use the number of simulations specified in the original mediate() call.
test.modmed(results,
            covariates.1 = list(gender = "M"),
            covariates.2 = list(gender = "F"),
            sims = 1000)
Test of ACME(covariates.1) - ACME(covariates.2) = 0
data: estimates from results
ACME(covariates.1) - ACME(covariates.2) = -0.05657, p-value = 0.752
alternative hypothesis: true ACME(covariates.1) - ACME(covariates.2) is not equal to 0
95 percent confidence interval:
-0.3956658 0.2586307
Test of ADE(covariates.1) - ADE(covariates.2) = 0
data: estimates from results
ADE(covariates.1) - ADE(covariates.2) = -0.013929, p-value = 0.952
alternative hypothesis: true ADE(covariates.1) - ADE(covariates.2) is not equal to 0
95 percent confidence interval:
-0.4378373 0.4198936
Since we're using simulation to estimate uncertainty, your answer will differ slightly from the output above. The first section is a test of the difference between the average causal mediation effects (ACME), i.e., the indirect effect of grades on happiness through self-esteem. The estimated difference is about -0.057, but the 95% confidence interval spans from -0.396 to 0.259. The difference is small and we don't have enough evidence to conclude whether it's positive or negative. The second section is a test of the difference between the average direct effects (ADE), i.e., the direct effect of grades on happiness. As with the indirect effect, we don't have enough evidence to conclude whether the difference in direct effects between genders is positive or negative.
In this case our moderator was a categorical variable, but a moderator can also be continuous. We just have to specify different values of the moderator in the covariates arguments of test.modmed(). See the documentation of test.modmed() for an example by entering ?test.modmed in your R console.
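For a continuous moderator, the same pattern applies with numeric values in the covariates lists. The sketch below is in the spirit of the package's own example rather than a copy of it. It assumes the jobs data set that ships with the mediation package and its variables treat, job_seek, depress2, econ_hard, sex, and age. The ages 20 and 70 are arbitrary comparison points, and the number of simulations is kept small so the example runs quickly.

```r
library(mediation)

# Example data assumed to ship with the mediation package
data(jobs)
set.seed(2018)

# Mediator and outcome models, with age (continuous) as the moderator
med.fit <- lm(job_seek ~ treat*age + econ_hard + sex, data = jobs)
out.fit <- lm(depress2 ~ treat*age + job_seek*age + econ_hard + sex,
              data = jobs)

med.age <- mediate(med.fit, out.fit,
                   treat = "treat",
                   mediator = "job_seek",
                   sims = 200)

# Compare the ACME and ADE at two chosen values of the moderator
age.test <- test.modmed(med.age,
                        covariates.1 = list(age = 20),
                        covariates.2 = list(age = 70),
                        sims = 200)
print(age.test)
```

The output has the same two sections as before, one for the difference in ACME and one for the difference in ADE, now comparing the two ages instead of the two genders.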
References
- MacKinnon, D. (2008). Introduction to Statistical Mediation Analysis. Lawrence Erlbaum.
- R Core Team (2018). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.
- Tingley, D., Yamamoto, T., Hirose, K., Keele, L., & Imai, K. (2014). Mediation: R package for causal mediation analysis. https://www.jstatsoft.org/article/view/v059i05
Clay Ford
Statistical Research Consultant
University of Virginia Library
March 02, 2018
For questions or clarifications regarding this article, contact statlab@virginia.edu.