MRQAP

MRQAP tests associations across the dyads of a network. Questions generally ask whether one relation is associated with another relation in multi-relational networks (e.g. do marriage ties correlate with business ties?), or whether similarity on dyadic features predict tie formation (e.g. are same-race pairs more likely to be friends? or generally what shared characteristics predict friendship nominations among adolescents in an American high school?). MRQAP is an extension of the Mantel test which uses node permutation to accommodate issues of non-independence that bias traditional regression analysis estimates using (sociocentric) network data. To illustrate how to use MRQAP with ideanet, we’ll use the Faux Mesa High dataset native to the package.

Preparing to use ideanet’s MRQAP module is easy: all we need is the igraph object produced using netwrite, as this object contains the various node-level attributes contained in a user’s original nodelist as well as the node-level measurements produced by netwrite.

library(ideanet)

nw_fauxmesa <- netwrite(nodelist = fauxmesa_nodes,
                        node_id = "id",
                        i_elements = fauxmesa_edges$from,
                        j_elements = fauxmesa_edges$to,
                        directed = FALSE,
                        net_name = "faux_mesa",
                        shiny = TRUE)

When we look at the original nodelist passed to netwrite (fauxmesa_nodes), we see that we have information about each student’s grade level, their race/ethnicity, and their sex. This information exists at the individual level in that it pertains to individual students. However, when using MRQAP analysis, we are generally interested in how dyadic measures predict outcomes. In other words, we’re interested in whether similarities or differences between nodes lead to outcomes, both of which are understood at the level of ego-alter relationships. While some users may have dyadic measures stored in their edgelist ahead of time, typically these measures are something one has to generate. The qap_setup function in ideanet allows users to quickly take node-level attributes and generate dyad-level comparisons with only a few arguments.

These argument are:

net: An igraph object, preferably one generated using netwrite. qap_setup also supports pre-generated network object should they be available to users, but users must ensure that such objects are properly sorted to match node-level features.
variables: A character vector of variable names that are available in the igraph/network object. These should match the names of node/vertex attributes in the object passed to net.
methods: A character vector specifying the methods that should be applied to each item listed in variables. Values in methods must be specified in correct order such that the first item in methods applies to the first item in variables, the second item in methods the second item in variables, and so on. Each item in methods must be one of the following:
- "multi_category": Applies to categorical variables only. Creates as many variables as there are unique values; each variable signals if both ego and alter have the given value.
- "reduced_category": Applies to categorical variables only. Creates a single variable that signals if alter and ego have the same value.
- "both": Applies to categorical variables only. Computes both the "multi_category" and "reduced_category" methods.
- "difference": Computes the difference in input value between ego and alter. This method will produce two measures for each variable to which it is applied: the difference between ego and alter’s respective values and the absolute value of this difference. This is generally used to estimate dyadic difference effects, such as whether differences in popularity correlate with being friends.
directed: Specify if edges in the network should be interpreted as directed or undirected. Expects TRUE or FALSE logical.

For the Faux Mesa High network, let’s imagine one wants to know whether same sex, race, or grade levels affect the likelihood that adolescents nominate each other as friends.

For similarity in sex, we’ll want to apply the reduced_category method to the sex variable. For combinations of the race, we’ll apply the multi_category method. And for grade, which we’ll treat as a continuous variable, we’ll use the difference method.

Given the importance of ensuring that each element in the variables argument corresponds to the correct element in the methods argument, users may find it helpful to store both vectors in a data frame prior to running qap_setup. This allows us to double-check that all elements appear in the correct order.

var_methods <- data.frame(variable = c("sex", "race", "grade"),
                          method = c("reduced_category", "multi_category", "difference"))

var_methods

variable	method
sex	reduced_category
race	multi_category
grade	difference

Now that we’ve ensured everything’s in the right order, let’s use qap_setup:

faux_qap_setup <- qap_setup(net = nw_fauxmesa$faux_mesa,
                            variables = var_methods$variable,
                            methods = var_methods$method,
                            directed = FALSE)

qap_setup produces a list object containing a nodelist, an edgelist containing newly-calculated dyadic measures, and an igraph object with these dyadic measures. Let’s quickly inspect our new edgelist to see the kinds of variables we’ve just created:

head(faux_qap_setup$edges)

to	weight	sex_ego	sex_alter	same_sex	race_ego	race_alter	both_race_Hisp	grade_ego	grade_alter	diff_grade	abs_diff_grade
24	1	F	F	1	Hisp	White	0	7	7	0	0
51	1	F	F	1	Hisp	NatAm	0	7	7	0	0
57	1	F	F	1	Hisp	Hisp	1	7	12	-5	5
69	1	F	F	1	Hisp	NatAm	0	7	7	0	0
86	1	F	F	1	Hisp	White	0	7	7	0	0
91	1	F	F	1	Hisp	NatAm	0	7	7	0	0

We now have several new variables. Variables appended with _ego and _alter represent the original values for each edge’s ego and alter, respectively, as determined in our original nodelist. Our same_sex and both_race variables are binary indicators of whether ego and alter have the same value for a particular attribute. By contrast, diff_grade is a continuous measure showing how many grade levels ego and alter are apart from each other. Note that values in diff_grade may be positive or negative depending on whether or not the node designated as ego in the edgelist is in a higher grade level than the node designated as alter. Signed differences of this sort can be useful when working with directed networks — you can imagine younger students being more likely to nominate older students as friends than vice versa. Given that ties in our network are undirected, however, the order in which egos and alters appear in our edgelist have no real meaning, and whether values in diff_grade are positive or negative is largely a matter of chance. Consequently, we’re better off using the absolute value of ego and alter’s grade level in our analysis, as this measure is agnostic to the order in which nodes are presented in our edgelist. qap_setup provides us with this absolute value automatically — here it is stored in abs_diff_grade.

With our variables of interest in hand, we turn to the MRQAP analysis itself. ideanet’s qap_run function seamlessly integrates output from netwrite and qap_setup. However, in its current iteration users must select variables produced by qap_setup. Arguments for qap_run include:

net: An igraph object containing the variables of interest. This igraph object should be one produced by qap_setup.
dependent: A single categorical or continuous variable name to use as a dependent variable. This variable must be produced by qap_setup. If the dependent is set to NULL (default), the function will predict the existence of a non-weighted tie.
variables: A vector of variable names to use as independent variables. These variables must be produced by qap_setup.
directed: Specify if the edges should be interpreted as directed or undirected. Expects TRUE or FALSE logical.
reps: Select the number of permutations. Relevant to null-hypothesis testing only. Default is set to 500.
family: The functional form the model should follow. Currently available are "linear" and "binomial".
- NOTE: Binomial MRQAP can take a while to simulate; for exploratory purposes, it is recommended to stick to a linear functional form.

NOTE: If the input network is multi-relational, qap_run will automatically merge duplicated rows. This is necessary given that, at this time, the MRQAP wrapper does not elegantly handle repeated observations. When merging rows, it will take the sum of numeric edge attributes, and a random value of character edge attributes. If the user is interested in the association between two types of ties (e.g., marriage ties predicting business ties), we recommend that they create a set of binary edge attributes using qap_setup.

Let’s use qap_run using the default 500 permutations. While we can significantly decrease the number of permutations to allow for lower computation times, this may make confidence intervals in our results less interpretable. As far as variables go in this example, we’ll include same_sex, both_race_White, and abs_diff_grade.

faux_qap <- qap_run(net = faux_qap_setup$graph,
                    dependent = NULL,
                    variables = c("same_sex", "both_race_White", "abs_diff_grade"),
                    directed = FALSE,
                    family = "linear")

qap_run returns a list of two objects. The first (covs_df) is a data frame summarizing model results in a way resembling a traditional regression output. The second (mods_df) is a data frame providing the number of dyadic observations on which the model is computed.

faux_qap$covs_df

covars	estimate	pvalue
intercept	0.0142453	0.006
same_sex	0.0048255	0.044
both_race_White	0.0014419	0.692
abs_diff_grade	0.0004715	0.632

faux_qap$mods_df

num_obs
10878

Assuming a p-value of .1 or less indicates some statistical significance, our results here tell us that students are more likely to nominate students of the same sex as friends, holding all else constant. It is recommended that researchers apply the same model with different amounts of draws to confirm the confidence intervals associated with each variable.

The above example shows that setting up and using MRQAP in ideanet is fast and easy, allowing users to quickly explore a variety of model specifications with their own data.