Quick start guide for emmeans

emmeans package, Version 1.10.6

Contents

  1. The three basic steps
  2. One-factor model
  3. Two factors, no interaction
  4. Two interacting factors
  5. Three or more factors
  6. Additional options
  7. What you see versus what you get
  8. Common things that can go wrong
  9. Further reading

The three basic steps

Much of what you do with the emmeans package involves these three basic steps:

  1. Fit a good model to your data, and do reasonable checks to make sure it adequately explains the response(s) and reasonably meets underlying statistical assumptions. Modeling is not the focus of emmeans, but this is an extremely important step because emmeans does not analyze your data; it summarizes your model. If it is a bad model, you will likely get misleading results from this package – the garbage in, garbage out principle. If you’re not sure whether your model is any good, this is a good time to get statistical consulting help. This really is like rocket science, not just a matter of getting programs to run.
  2. Run EMM <- emmeans(...) (see scenarios below) to obtain estimates of means or marginal means.
  3. Run contrast(EMM, ...) or pairs(EMM) one or more times to obtain estimates of contrasts or pairwise comparisons among the means.

Note: A lot of users have developed the habit of running something like emmeans(model, pairwise ~ factor(s)), which conflates steps 2 and 3. We recommend against doing this because it often yields output you don’t want or need – especially when there is more than one factor. You are better off keeping steps 2 and 3 separate. What you do in step 2 depends on how many factors you have, and how they relate.
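As a rough illustration of the three steps, here is a minimal sketch using the warpbreaks data that ships with base R. The data set and the model are assumptions chosen only to make the code runnable; they are not part of this guide, and the model is not claimed to be a good one.

```r
library(emmeans)

# Step 1: fit a model (illustrative only) and check it before going further,
# e.g. with plot(model, which = 1:2) for residual diagnostics
model <- lm(breaks ~ tension, data = warpbreaks)

# Step 2: obtain the estimated marginal means
EMM <- emmeans(model, ~ tension)
EMM

# Step 3: obtain comparisons among those means, as a separate call
PRS <- pairs(EMM)
PRS
```

Keeping EMM and PRS as separate objects makes it easy to request further contrasts from EMM later without re-running anything else.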

One-factor model

If a one-factor model fits well and the factor is named treatment, do

EMM <- emmeans(model, "treatment")   # or emmeans(model, ~ treatment)
EMM    # display the means

### pairwise comparisons
contrast(EMM, "pairwise")    # or pairs(EMM)

You may specify other contrasts in the second argument of the contrast() call, e.g. "trt.vs.ctrl", ref = 1 (compare each mean to the first), "consec" (compare 2 vs 1, 3 vs 2, etc.), or "poly", max.degree = 3 (polynomial contrasts).
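The following sketch tries these contrast methods on an illustrative one-factor model for the built-in warpbreaks data (an assumption, not an example from this guide). Because tension has only three levels, max.degree = 2 is used here instead of 3.

```r
library(emmeans)

# Illustrative one-factor model; tension has levels L, M, H
model <- lm(breaks ~ tension, data = warpbreaks)
EMM <- emmeans(model, ~ tension)

# Compare each level with the first one (L)
ctrl_con <- contrast(EMM, "trt.vs.ctrl", ref = 1)

# Compare consecutive levels: M vs L, then H vs M
consec_con <- contrast(EMM, "consec")

# Polynomial contrasts: linear and quadratic trends across the levels
poly_con <- contrast(EMM, "poly", max.degree = 2)

ctrl_con
consec_con
poly_con
```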

Two factors, no interaction

If the model fits well, the factors are named treat and dose, and they don’t interact, follow the same steps as for a single factor, one factor at a time. That is, something like

(EMM1 <- emmeans(model, ~ treat))
pairs(EMM1)

(EMM2 <- emmeans(model, ~ dose))
pairs(EMM2)

These analyses will yield the estimated marginal means for each factor, and comparisons/contrasts thereof.
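To make "marginal means" concrete, the sketch below fits an additive model to the built-in warpbreaks data (an illustrative assumption, with wool and tension standing in for treat and dose) and checks that each marginal mean for wool is the equally weighted average of the model predictions over the levels of tension.

```r
library(emmeans)

# Additive two-factor model (illustrative only)
model <- lm(breaks ~ wool + tension, data = warpbreaks)

(EMM1 <- emmeans(model, ~ wool))
pairs(EMM1)

# Reproduce the marginal means by hand: predict on the full grid of
# factor combinations, then average over tension within each wool
grid <- expand.grid(wool = levels(warpbreaks$wool),
                    tension = levels(warpbreaks$tension))
grid$pred <- predict(model, newdata = grid)
by_hand <- tapply(grid$pred, grid$wool, mean)
by_hand
```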

Two interacting factors

In this case, unless the interaction effect is negligible, we usually want to do “simple comparisons” of the cell means. That is, compare or contrast the means separately, holding one factor fixed at each level.

EMM <- emmeans(model, ~ treat * dose)
EMM    # display the cell means

### Simple pairwise comparisons...
pairs(EMM, simple = "treat")    # compare treats for each dose -- "simple effects"
pairs(EMM, simple = "dose")     # compare doses for each treat

The default is to apply a separate Tukey adjustment to the P values in each by group (so if each group has just 2 means, no adjustment at all is applied). If you want to adjust the whole family combined, you need to undo the by variable and specify the desired adjustment (which can’t be Tukey, because that method is invalid when you have more than one set of pairwise comparisons). For example

test(pairs(EMM, by = "dose"), by = NULL, adjust = "mvt")
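The difference between per-group and whole-family adjustment can be seen in a small sketch. It assumes an interaction model for the built-in warpbreaks data, with wool and tension standing in for the two factors; neither the data nor the model comes from this guide.

```r
library(emmeans)

# Interaction model (illustrative only)
model <- lm(breaks ~ wool * tension, data = warpbreaks)
EMM <- emmeans(model, ~ wool * tension)

# Default: tensions compared within each wool, with a separate
# Tukey adjustment applied in each of the two groups
by_group <- pairs(EMM, simple = "tension")
by_group

# All six comparisons treated as one family, adjusted together
combined <- test(pairs(EMM, by = "wool"), by = NULL, adjust = "mvt")
combined
```

The estimates are the same in both displays; only the P-value adjustment changes. The "mvt" method is simulation-based, so its P values can vary slightly from run to run.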

Diagonal comparisons

If the “diagonal” comparisons (where both factors differ) are of interest, you would do pairs(EMM) without a by variable. But you get a lot more comparisons this way.

Interaction contrasts

Sometimes you may want to examine interaction contrasts, which are contrasts of contrasts. The thing to know here is that contrast() (or pairs()) creates the same kind of object as emmeans(), so you can run them multiple times. For example,

CON <- pairs(EMM, by = "dose")
contrast(CON, "consec", by = "contrast")    # by = "contrast" is essential here!

Or equivalently, the named argument interaction can be used

contrast(EMM, interaction = c("pairwise", "consec"))
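An interaction contrast is a difference of differences, which can be verified by hand. The sketch below assumes a saturated two-factor model for the built-in warpbreaks data (wool and tension are stand-ins for treat and dose), so that the estimated cell means equal the observed cell means.

```r
library(emmeans)

# Saturated two-factor model (illustrative only)
model <- lm(breaks ~ wool * tension, data = warpbreaks)
EMM <- emmeans(model, ~ wool * tension)

# Pairwise contrasts of wool, compared across consecutive tensions
IC <- contrast(EMM, interaction = c("pairwise", "consec"))
IC

# The same quantities from the observed cell means
cell <- with(warpbreaks, tapply(breaks, list(wool, tension), mean))
diff_AB <- cell["A", ] - cell["B", ]   # A - B at each tension: L, M, H
by_hand <- diff(diff_AB)               # (A - B at M) - (A - B at L), then H vs M
by_hand
```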

Three or more factors

After you have mastered the strategies for two factors, you can adapt them to three or more factors as appropriate, based on how they interact and what you need.

Additional options

  1. See the help files for both emmeans() and ref_grid() for additional arguments that may prove useful. Many of the most useful arguments are passed to ref_grid().
  2. There are a number of vignettes provided with the package that include examples and discussions for different kinds of situations. There is also an index of vignette topics.

What you see versus what you get

Most non-graphical functions in the emmeans package produce one of two classes of objects. The functions emmeans(), emtrends(), ref_grid(), contrast(), and pairs() return emmGrid objects (or lists thereof, class emm_list). For example

EMM <- emmeans(mod, "Treatment")

The functions summary(), confint(), test(), joint_tests(), and others return summary_emm objects (or lists thereof, class summary_eml):

SEMM <- summary(EMM)

If you display EMM and SEMM, they look identical; that’s because emmGrid objects are displayed using summary(). But they are not identical. EMM has all the ingredients needed to do further analysis, e.g. contrast(EMM, "consec") will estimate comparisons between consecutive Treatment means. But SEMM is just an annotated data frame, and we can do no further analysis with it. Similarly, we can change how EMM is displayed via arguments to summary() or relatives, while in SEMM, everything has been computed and those results are locked in.
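The distinction can be checked directly. This sketch uses an illustrative model for the built-in warpbreaks data, with tension playing the role of Treatment; that substitution is an assumption made only so the code runs.

```r
library(emmeans)

mod <- lm(breaks ~ tension, data = warpbreaks)   # illustrative model
EMM <- emmeans(mod, "tension")
SEMM <- summary(EMM)

class(EMM)    # an emmGrid object
class(SEMM)   # a summary_emm object, which is also a data frame

# EMM can still be analyzed further or redisplayed with other settings
contrast(EMM, "consec")
SEMM90 <- summary(EMM, level = 0.90)   # 90% instead of 95% intervals

# SEMM only supports ordinary data-frame operations
SEMM$emmean
```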

Common things that can go wrong

Only one mean is obtained – or fewer than expected

This is probably the most common issue, and it can happen when a treatment is coded as a numeric predictor rather than a factor. Instead of getting a mean for each treatment, you get a mean at the average of those numerical values.

  1. In such cases, the model is often inappropriate; you should replace treatment with factor(treatment) and re-fit the model.
  2. In a situation where it is appropriate to consider the treatment as a quantitative predictor, you can get separate means at specified values by adding an argument like at = list(treatment = c(3,5,7)) to the emmeans() call.
  3. When you have a numerical predictor interacting with a factor, it may be useful to estimate its slope for each level of that factor. See the documentation for emtrends().
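The three remedies above can be sketched with the built-in mtcars data, where cyl is stored as a number. The data set and the names dat, cyl_f, and am_f are assumptions introduced for illustration only.

```r
library(emmeans)

dat <- mtcars

# The problem: cyl is numeric, so only one mean (at the average cyl) is returned
num_mod <- lm(mpg ~ cyl, data = dat)
one_mean <- emmeans(num_mod, "cyl")
one_mean

# Remedy 1: make it a factor and re-fit the model
dat$cyl_f <- factor(dat$cyl)
fac_mod <- lm(mpg ~ cyl_f, data = dat)
three_means <- emmeans(fac_mod, "cyl_f")
three_means

# Remedy 2: keep it numeric, but ask for means at specified values
at_means <- emmeans(num_mod, "cyl", at = list(cyl = c(4, 6, 8)))
at_means

# Remedy 3: estimate the slope of a numeric predictor at each factor level
dat$am_f <- factor(dat$am, labels = c("automatic", "manual"))
slope_mod <- lm(mpg ~ wt * am_f, data = dat)
slopes <- emtrends(slope_mod, ~ am_f, var = "wt")
slopes
```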

Having trouble with follow-up analyses, and the pairwise ~ ... recipe

The basic object returned by emmeans() and contrast() is of class emmGrid, and additional emmeans() and contrast() calls can accept emmGrid objects. However, some options create lists of emmGrid objects, and that makes things a bit confusing. The most common case is using a call like emmeans(model, pairwise ~ treat * dose), which computes the means and all pairwise comparisons – a list of two emmGrids. If you try to obtain additional contrasts, say, of this result, contrast() makes a guess that you want to run it on just the first element.

This causes confusion (I know, because I get a lot of questions about it). I recommend that you avoid using the pairwise ~ construct altogether: Get your means in one step, and get your contrasts in separate step(s). The pairwise ~ construct is really useful only if you have just one factor; otherwise, it likely gives you results you don’t want.
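A short sketch of the difference, assuming an illustrative interaction model for the built-in warpbreaks data (wool and tension stand in for treat and dose):

```r
library(emmeans)

model <- lm(breaks ~ wool * tension, data = warpbreaks)   # illustrative model

# The pairwise ~ recipe returns a list of two emmGrid objects
both <- emmeans(model, pairwise ~ wool * tension)
class(both)
names(both)

# Its second element holds every pairwise comparison among the six cells
all_pairs <- both$contrasts

# Separate steps give just the comparisons that are wanted
EMM <- emmeans(model, ~ wool * tension)
simple_tension <- pairs(EMM, simple = "tension")
simple_tension
```

With six cell means, the list version carries 15 pairwise comparisons, while the simple comparisons of tension within each wool number only six.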

Further reading

Several vignettes offer more details and cover more advanced topics. An index of all vignette topics is also available.

The strings linked below are the names of the vignettes; i.e., they can also be accessed via vignette("name", "emmeans").

Index of all vignette topics