Type: | Package |
Title: | Calculate Distance Measures for DataFrames |
Version: | 1.0.0 |
Date: | 2025-09-13 |
Maintainer: | Flavio Gioia <flaviogioia.fg@gmail.com> |
Description: | It provides functions that calculate Mahalanobis distance, Euclidean distance, Manhattan distance, Chebyshev distance, Hamming distance, Canberra distance, Minkowski dissimilarity (distance defined for p >= 1), Cosine dissimilarity, Bhattacharyya dissimilarity, Jaccard distance, Hellinger distance, Bray-Curtis dissimilarity, Sorensen-Dice dissimilarity between each pair of species in a list of data frames. These statistics are fundamental in various fields, such as cluster analysis, classification, and other applications of machine learning and data mining, where assessing similarity or dissimilarity between data is crucial. The package is designed to be flexible and easily integrated into data analysis workflows, providing reliable tools for evaluating distances in multidimensional contexts. |
License: | GPL-3 |
Encoding: | UTF-8 |
Imports: | stats, ggplot2, reshape2, gridExtra, matrixStats |
Suggests: | rmarkdown, testthat (≥ 3.0.0) |
NeedsCompilation: | no |
RoxygenNote: | 7.3.3 |
Packaged: | 2025-09-14 07:42:56 UTC; flavi |
Author: | Flavio Gioia |
Repository: | CRAN |
Date/Publication: | 2025-09-14 11:30:19 UTC |
Calculate the Bhattacharyya dissimilarities for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Bhattacharyya dissimilarities about the factors inside them. You can also select "index" to calculate the Bhattacharyya dissimilarities between each row.
Usage
cbhattacharyya(
dataset,
formula,
plot = TRUE,
plot_title = "Bhattacharyya Dissimilarity Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Bhattacharyya dissimilarities matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Bhattacharyya dissimilarities matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Bhattacharyya Dissimilarity Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Bhattacharyya dissimilarities matrix will be printed; instead, by specifying variables, the Bhattacharyya dissimilarities matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only dissimilarities between rows are calculated. Therefore, this snippet: "cbhattacharyya(mtcars, ~am + carb + index)" will print dissimilarities only considering "index". Rows with NA values are omitted.
Examples
# Example with the iris dataset
data(iris)
cbhattacharyya(iris, ~Species, plot = TRUE,
plot_title = "Bhattacharyya Dissimilarity Between Groups")
# Example with the mtcars dataset
data(mtcars)
cbhattacharyya(mtcars, ~am, plot = TRUE,
plot_title = "Bhattacharyya Dissimilarity Between Groups")
# Calculate the Bhattacharyya dissimilarity for 32 car models in "mtcars" dataset
res <- cbhattacharyya(mtcars, ~index)
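As a rough illustration of what a Bhattacharyya dissimilarity measures, the base R sketch below compares two discrete probability vectors. The helper `bhattacharyya_discrete` is hypothetical and not part of the package; the package's own between-group computation may use a different formulation (for instance one based on group means and covariances).

```r
# Hypothetical helper (not from the package): Bhattacharyya dissimilarity
# between two discrete probability vectors p and q, each summing to 1.
bhattacharyya_discrete <- function(p, q) {
  stopifnot(length(p) == length(q), all(p >= 0), all(q >= 0))
  bc <- sum(sqrt(p * q))  # Bhattacharyya coefficient, between 0 and 1
  -log(bc)                # 0 when p and q are identical
}

p <- c(0.5, 0.5, 0)
q <- c(0.25, 0.25, 0.5)
d_pq <- bhattacharyya_discrete(p, q)  # partial overlap, positive value
d_pp <- bhattacharyya_discrete(p, p)  # identical distributions, zero
```

The value grows as the two distributions overlap less, and it is infinite when they share no support.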
Calculate the Bray-Curtis dissimilarities for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Bray-Curtis dissimilarities about the factors inside them. You can also select "index" to calculate the Bray-Curtis dissimilarities between each row.
Usage
cbraycurtis(
dataset,
formula,
plot = TRUE,
plot_title = "Bray-Curtis Dissimilarity Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Bray-Curtis dissimilarities matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Bray-Curtis dissimilarities matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Bray-Curtis Dissimilarity Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Bray-Curtis dissimilarities matrix will be printed; instead, by specifying variables, the Bray-Curtis dissimilarities matrix or matrices (two or more) between each pair of factors and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only dissimilarities between rows are calculated. Therefore, this snippet: "cbraycurtis(mtcars, ~am + carb + index)" will print dissimilarities only considering "index". Rows with NA values are omitted.
Examples
# Example with the iris dataset
data(iris)
cbraycurtis(iris, ~Species, plot = TRUE,
plot_title = "Bray-Curtis Dissimilarity Between Groups")
# Example with the mtcars dataset
data(mtcars)
cbraycurtis(mtcars, ~am,
plot = TRUE, plot_title = "Bray-Curtis Dissimilarity Between Groups")
# Calculate the Bray-Curtis dissimilarity for 32 car models in "mtcars" dataset
res <- cbraycurtis(mtcars, ~index)
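For intuition, the Bray-Curtis dissimilarity between two non-negative vectors is commonly defined as the sum of absolute differences divided by the sum of all values. The sketch below uses a hypothetical helper `bray_curtis` written in base R; it is an illustration of that textbook definition, not the package's internal code.

```r
# Hypothetical helper (not from the package): textbook Bray-Curtis
# dissimilarity between two non-negative numeric vectors.
bray_curtis <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(abs(x - y)) / sum(x + y)
}

x <- c(1, 2, 3)
y <- c(2, 2, 5)
bc_xy <- bray_curtis(x, y)  # |1-2| + |2-2| + |3-5| = 3, divided by 15
bc_xx <- bray_curtis(x, x)  # identical vectors give 0
```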
Calculate the Canberra distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Canberra distances about the factors inside them. You can also select "index" to calculate the Canberra distances between each row.
Usage
ccanberra(
dataset,
formula,
plot = TRUE,
plot_title = "Canberra Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Canberra distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Canberra distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Canberra Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Canberra distances matrix will be printed; instead, by specifying variables, the Canberra distances matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "ccanberra(mtcars, ~am + carb + index)" will print distances only considering "index". Rows with NA values are omitted.
Examples
# Example with the iris dataset
data(iris)
ccanberra(iris, ~Species, plot = TRUE,
plot_title = "Canberra Distance Between Groups")
# Example with the mtcars dataset
data(mtcars)
ccanberra(mtcars, ~am, plot = TRUE,
plot_title = "Canberra Distance Between Groups")
# Calculate the Canberra distance for 32 car models in "mtcars" dataset
res <- ccanberra(mtcars, ~index)
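As a point of comparison, base R's `stats::dist()` also offers a Canberra distance between rows. The sketch below checks it against a manual computation on a small matrix with strictly positive entries (the matrix `m` is made up for illustration). Whether `ccanberra()` agrees with `dist()` in every case, for example when zeros are present, is not stated in this manual.

```r
# Small made-up matrix with positive entries; rows are the observations.
m <- rbind(a = c(1, 2, 3), b = c(3, 2, 1))

# Canberra distance as computed by base R
d_builtin <- as.matrix(dist(m, method = "canberra"))["a", "b"]

# Manual computation: sum of |x - y| / (|x| + |y|) over the coordinates
d_manual <- sum(abs(m["a", ] - m["b", ]) / (abs(m["a", ]) + abs(m["b", ])))
```

Both values equal 2/4 + 0/4 + 2/4 = 1 for this matrix.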
Calculate the Chebyshev distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Chebyshev distances about the factors inside them. You can also select "index" to calculate the Chebyshev distances between each row.
Usage
cchebyshev(
dataset,
formula,
plot = TRUE,
plot_title = "Chebyshev Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Chebyshev distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Chebyshev distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Chebyshev Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Chebyshev distances matrix will be printed; instead, by specifying variables, the Chebyshev distances matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "cchebyshev(mtcars, ~am + carb + index)" will print distances only considering "index". Rows with NA values are omitted.
Examples
# Example with iris dataset
data(iris)
cchebyshev(iris, ~Species, plot = TRUE,
plot_title = "Chebyshev Distance Between Groups")
# Example with mtcars dataset
data(mtcars)
cchebyshev(mtcars, ~am, plot = TRUE,
plot_title = "Chebyshev Distance Between Groups")
# Calculate the Chebyshev distance for 32 car models in "mtcars" dataset
res <- cchebyshev(mtcars, ~index)
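The Chebyshev distance between two rows is the largest absolute coordinate difference. In base R this corresponds to `dist(method = "maximum")`; the short sketch below, on a made-up matrix `m`, shows the equivalence. It is offered as a cross-check only and does not claim to reproduce the internals of `cchebyshev()`.

```r
# Small made-up matrix; rows are the observations.
m <- rbind(a = c(1, 5, 2), b = c(4, 3, 2))

# Base R calls the Chebyshev distance "maximum"
d_builtin <- as.matrix(dist(m, method = "maximum"))["a", "b"]

# Manual computation: largest absolute difference across coordinates
d_manual <- max(abs(m["a", ] - m["b", ]))
```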
Calculate the Cosine dissimilarities for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Cosine dissimilarities about the factors inside them. You can also select "index" to calculate the Cosine dissimilarities between each row.
Usage
ccosine(
dataset,
formula,
plot = TRUE,
plot_title = "Cosine Dissimilarity Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Cosine dissimilarities matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Cosine dissimilarities matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Cosine Dissimilarity Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Cosine dissimilarities matrix will be printed; instead, by specifying variables, the Cosine dissimilarities matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only dissimilarities between rows are calculated. Therefore, this snippet: "ccosine(mtcars, ~am + carb + index)" will print dissimilarities only considering "index". Rows with NA values are omitted.
Examples
# Example with iris dataset
data(iris)
ccosine(iris, ~Species, plot = TRUE,
plot_title = "Cosine Dissimilarity Between Groups")
# Example with mtcars dataset
data(mtcars)
ccosine(mtcars, ~am, plot = TRUE,
plot_title = "Cosine Dissimilarity Between Groups")
# Calculate the Cosine dissimilarity for 32 car models in "mtcars" dataset
res <- ccosine(mtcars, ~index)
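For intuition, a cosine dissimilarity is usually taken as one minus the cosine of the angle between two vectors, so it depends on direction and not on magnitude. The helper `cosine_dissimilarity` below is a hypothetical base R sketch of that definition, not the package's implementation.

```r
# Hypothetical helper (not from the package): 1 - cosine similarity.
cosine_dissimilarity <- function(x, y) {
  stopifnot(length(x) == length(y))
  1 - sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
}

d_orthogonal <- cosine_dissimilarity(c(1, 0), c(0, 1))  # perpendicular vectors
d_parallel   <- cosine_dissimilarity(c(1, 2), c(2, 4))  # same direction
```

Perpendicular vectors give 1, while vectors pointing in the same direction give 0 (up to floating-point error) regardless of their lengths.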
Calculate the Euclidean distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Euclidean distances about each pair of factors inside them. You can also select "index" to calculate the Euclidean distances between each row.
Usage
ceuclide(
dataset,
formula,
plot = TRUE,
plot_title = "Euclidean Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Euclidean distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Euclidean distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Euclidean Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Euclidean distance matrix will be printed; instead, by specifying variables, the Euclidean distances matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "ceuclide(mtcars, ~am + carb + index)" will print distances only considering "index". Rows with NA values are omitted.
Examples
# Example with iris dataset
data(iris)
ceuclide(iris, ~Species, plot = TRUE,
plot_title = "Euclidean Distance Between Groups", min_group_size = 2)
# Example with mtcars dataset
data(mtcars)
ceuclide(mtcars, ~am + carb, plot = TRUE,
plot_title = "Euclidean Distance Between Groups", min_group_size = 3)
# Calculate the Euclidean distance for 32 car models in "mtcars" dataset
res <- ceuclide(mtcars, ~index,
min_group_size = 3)
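One plausible way to obtain a between-group Euclidean distance matrix is to compute each group's mean vector and then take distances between those means. The base R sketch below does this for `iris`. That `ceuclide()` works this way is an assumption made for illustration; the manual does not state how groups are summarised.

```r
data(iris)

# Mean of every numeric column within each Species (assumed group summary)
centroids <- aggregate(. ~ Species, data = iris, FUN = mean)
rownames(centroids) <- as.character(centroids$Species)

# Euclidean distances between the three group mean vectors
d_groups <- as.matrix(dist(centroids[, -1], method = "euclidean"))
```

The result is a symmetric 3 x 3 matrix with zeros on the diagonal, labelled by species.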
Calculate the Hamming distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Hamming distances about the factors inside them. You can also select "index" to calculate the Hamming distances between each row.
Usage
chamming(
dataset,
formula,
plot = TRUE,
plot_title = "Hamming Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Hamming distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Hamming distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Hamming Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Hamming distances matrix will be printed; instead, by specifying variables, the Hamming distances matrix or matrices (two or more) between each pair of factors and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "chamming(mtcars, ~am + carb + index)" will print the distances only considering "index". Rows with NA values are omitted.
Examples
# Example with iris dataset
data(iris)
chamming(iris, ~Species, plot = TRUE,
plot_title = "Hamming Distance Between Groups")
# Example with mtcars dataset
data(mtcars)
chamming(mtcars, ~am, plot = TRUE,
plot_title = "Hamming Distance Between Groups")
# Calculate the Hamming distance for 32 car models in "mtcars" dataset
res <- chamming(mtcars, ~index)
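For intuition, the Hamming distance between two equal-length vectors counts the positions at which they differ. The helper `hamming_distance` below is a hypothetical base R sketch of that definition; how `chamming()` treats continuous columns or whether it normalises the count is not specified in this manual.

```r
# Hypothetical helper (not from the package): number of differing positions.
hamming_distance <- function(x, y) {
  stopifnot(length(x) == length(y))
  sum(x != y)
}

h_diff <- hamming_distance(c(1, 0, 1, 1), c(1, 1, 0, 1))  # positions 2 and 3 differ
h_same <- hamming_distance(c("a", "b"), c("a", "b"))      # no differences
```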
Calculate the Hellinger distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Hellinger distances about the factors inside them. You can also select "index" to calculate the Hellinger distances between each row.
Usage
chellinger(
dataset,
formula,
plot = TRUE,
plot_title = "Hellinger Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Hellinger distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Hellinger distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Hellinger Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Hellinger distances matrix will be printed; instead, by specifying variables, the Hellinger distances matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "chellinger(mtcars, ~am + carb + index)" will print the distances only considering "index". Rows with NA values are omitted.
Examples
# Example with the iris dataset
data(iris)
chellinger(iris, ~Species, plot = TRUE,
plot_title = "Hellinger Distance Between Groups")
# Example with the mtcars dataset
data(mtcars)
chellinger(mtcars, ~am, plot = TRUE,
plot_title = "Hellinger Distance Between Groups")
res <- chellinger(mtcars, ~index)
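As a rough illustration, the Hellinger distance between two discrete probability vectors is the Euclidean distance between their element-wise square roots, scaled by 1/sqrt(2) so that it lies between 0 and 1. The helper `hellinger_discrete` is hypothetical and written in base R; the package may compute the distance between groups differently.

```r
# Hypothetical helper (not from the package): Hellinger distance between
# two discrete probability vectors p and q, each summing to 1.
hellinger_discrete <- function(p, q) {
  stopifnot(length(p) == length(q), all(p >= 0), all(q >= 0))
  sqrt(sum((sqrt(p) - sqrt(q))^2)) / sqrt(2)
}

h_disjoint  <- hellinger_discrete(c(1, 0), c(0, 1))          # no shared support
h_identical <- hellinger_discrete(c(0.3, 0.7), c(0.3, 0.7))  # same distribution
```

Distributions with no shared support reach the maximum of 1, and identical distributions give 0.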
Calculate the Jaccard distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Jaccard distances about the factors inside them. You can also select "index" to calculate the Jaccard distances between each row.
Usage
cjaccard(
dataset,
formula,
plot = TRUE,
plot_title = "Jaccard Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Jaccard distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Jaccard distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Jaccard Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Jaccard distances matrix will be printed; instead, by specifying variables, the Jaccard distances matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "cjaccard(mtcars, ~am + carb + index)" will print distances only considering "index". Rows with NA values are omitted.
Examples
# Example with the iris dataset
data(iris)
cjaccard(iris, ~Species, plot = TRUE,
plot_title = "Jaccard Distance Between Groups")
# Example with the mtcars dataset
data(mtcars)
cjaccard(mtcars, ~am,
plot = TRUE, plot_title = "Jaccard Distance Between Groups")
res <- cjaccard(mtcars, ~index,
plot = TRUE)
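For binary (presence/absence) vectors, the Jaccard distance is one minus the ratio of shared "on" positions to positions where at least one vector is "on". The sketch below computes it with a hypothetical helper `jaccard_binary` and compares it with base R's `dist(method = "binary")`. How `cjaccard()` handles continuous columns is not described in this manual, so this covers the binary case only.

```r
# Hypothetical helper (not from the package): Jaccard distance for 0/1 vectors.
jaccard_binary <- function(x, y) {
  stopifnot(length(x) == length(y))
  both   <- sum(x != 0 & y != 0)  # positions where both are "on"
  either <- sum(x != 0 | y != 0)  # positions where at least one is "on"
  1 - both / either
}

x <- c(1, 1, 0, 0)
y <- c(1, 0, 1, 0)
j_manual  <- jaccard_binary(x, y)                              # 1 - 1/3
j_builtin <- as.matrix(dist(rbind(x, y), method = "binary"))[1, 2]
```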
Calculate the Mahalanobis distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Mahalanobis distances about each pair of factors inside them. You can also select "index" to calculate the Mahalanobis distances between each row.
Usage
cmahalanobis(
dataset,
formula,
plot = TRUE,
plot_title = "Mahalanobis Distance Between Groups",
min_group_size = 3,
pvalues_chisq = FALSE
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Mahalanobis distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Mahalanobis distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Mahalanobis Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
pvalues_chisq |
If TRUE, print the result of the chi-squared test on squared distances. The distances with "pvalues_chisq = FALSE" are not squared; instead, with "pvalues_chisq = TRUE", the squared Mahalanobis distances with corresponding p_values will be printed. Default is FALSE. |
Value
According to the option chosen in formula and in pvalues_chisq, with "index" and "pvalues_chisq = TRUE" the squared Mahalanobis distance matrix will be printed with corresponding pvalues; instead, with "index" and "pvalues_chisq = FALSE", only the Mahalanobis distances (not squared) will be printed. By specifying variables, the Mahalanobis distances matrix or matrices (two or more) between each pair of factors and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "cmahalanobis(mtcars, ~am + carb + index)" will print distances and plot only considering "index". Rows with NA values are omitted.
Examples
# Example with the iris dataset
data(iris)
# Calculate the Mahalanobis distance for "Species" groups in "iris" dataset
cmahalanobis(iris, ~Species, plot = TRUE,
plot_title = "Mahalanobis Distance Between Groups", min_group_size = 3)
# Example with the mtcars dataset
data(mtcars)
# Calculate the Mahalanobis distance for two factors in "mtcars" dataset
cmahalanobis(mtcars, ~am + vs,
plot = TRUE, plot_title = "Mahalanobis Distance Between Groups",
min_group_size = 2, pvalues_chisq = TRUE)
# Calculate the Mahalanobis distance for "index" in mtcars
cmahalanobis(mtcars, ~index, pvalues_chisq = TRUE)
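The link between squared Mahalanobis distances and chi-squared p-values can be sketched with base R's `stats::mahalanobis()` and `pchisq()`. The example below measures each row's distance from the overall centre of three `mtcars` columns. The choice of columns, the use of the overall centre, and the degrees of freedom are assumptions made for illustration and may not match what `cmahalanobis()` does internally.

```r
data(mtcars)
x <- as.matrix(mtcars[, c("mpg", "hp", "wt")])  # illustrative column choice

# stats::mahalanobis() returns SQUARED distances from the given centre
d2 <- mahalanobis(x, center = colMeans(x), cov = cov(x))
d  <- sqrt(d2)  # unsquared distances

# Upper-tail chi-squared p-values, with one degree of freedom per column
# (reasonable under approximate multivariate normality)
p_values <- pchisq(d2, df = ncol(x), lower.tail = FALSE)
```

Small p-values flag rows that lie unusually far from the centre given the covariance structure.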
Calculate the Manhattan distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Manhattan distances about the factors inside them. You can also select "index" to calculate the Manhattan distances between each row.
Usage
cmanhattan(
dataset,
formula,
plot = TRUE,
plot_title = "Manhattan Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Manhattan distances matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Manhattan distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Manhattan Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Manhattan distances matrix will be printed; instead, by specifying variables, the Manhattan distances matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "cmanhattan(mtcars, ~am + carb + index)" will print the distances only considering "index". Rows with NA values are omitted.
Examples
# Example with iris dataset
data(iris)
cmanhattan(iris, ~Species, plot = TRUE,
plot_title = "Manhattan Distance Between Groups", min_group_size = 3)
# Example with mtcars dataset
data(mtcars)
cmanhattan(mtcars, ~am + vs, plot = TRUE,
plot_title = "Manhattan Distance Between Groups", min_group_size = 3)
# Calculate the Manhattan distance for 32 car models in "mtcars" dataset
res <- cmanhattan(mtcars, ~index, min_group_size = 3)
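The Manhattan distance between two rows is the sum of absolute coordinate differences. Base R provides it through `dist(method = "manhattan")`, and the sketch below checks that against a manual computation on a made-up matrix `m`. It is a cross-check of the definition only, not a description of the internals of `cmanhattan()`.

```r
# Small made-up matrix; rows are the observations.
m <- rbind(a = c(1, 5, 2), b = c(4, 3, 2))

# Manhattan distance as computed by base R
d_builtin <- as.matrix(dist(m, method = "manhattan"))["a", "b"]

# Manual computation: |1 - 4| + |5 - 3| + |2 - 2|
d_manual <- sum(abs(m["a", ] - m["b", ]))
```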
Calculate the Minkowski distances for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Minkowski distances about the factors inside them. You can also select "index" to calculate the Minkowski distances between each row.
Usage
cminkowski(
dataset,
formula,
p = 3,
plot = TRUE,
plot_title = "Minkowski Distance Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Minkowski distances matrix or matrices (two or more). |
p |
Order of the Minkowski distance. The default value is 3. |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Minkowski distances matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Minkowski Distance Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Minkowski distances matrix will be printed; instead, by specifying variables, the Minkowski distances matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
When p >= 1, the triangle inequality holds and the measure is a true distance, the "Minkowski distance". When p < 1, the triangle inequality can fail, so the result is only a "dissimilarity" measure. If "index" is selected with variables, only distances between rows are calculated. Therefore, this snippet: "cminkowski(mtcars, ~am + carb + index)" will print distances only considering "index". Rows with NA values are omitted.
Examples
# Example with iris dataset
data(iris)
cminkowski(iris, ~Species, p = 3, plot = TRUE,
plot_title = "Minkowski Distance Between Groups")
# Example with mtcars dataset
data(mtcars)
cminkowski(mtcars, ~am, p = 3, plot = TRUE,
plot_title = "Minkowski Distance Between Groups")
# Calculate the Minkowski distance for 32 car models in "mtcars" dataset
res <- cminkowski(mtcars, ~index, p = 2, plot = TRUE)
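The role of p can be made concrete with a small base R sketch. The helper `minkowski` is hypothetical (not from the package) and implements the usual formula, the p-th root of the sum of absolute differences raised to the power p. The three points are made up to show that p = 2 reproduces the Euclidean distance and that the triangle inequality fails for p = 0.5.

```r
# Hypothetical helper (not from the package): Minkowski measure of order p.
minkowski <- function(x, y, p) sum(abs(x - y)^p)^(1 / p)

pt_a <- c(0, 0)
pt_b <- c(1, 0)
pt_c <- c(1, 1)

# p = 2 matches the Euclidean distance from base R
d_p2      <- minkowski(pt_a, pt_c, p = 2)
d_builtin <- as.matrix(dist(rbind(pt_a, pt_c), method = "minkowski", p = 2))[1, 2]

# p = 0.5: going directly from a to c is "longer" than the detour through b,
# which violates the triangle inequality
direct <- minkowski(pt_a, pt_c, p = 0.5)
detour <- minkowski(pt_a, pt_b, p = 0.5) + minkowski(pt_b, pt_c, p = 0.5)
```

Here `direct` is 4 while `detour` is 2, which is why the measure is only called a dissimilarity when p < 1.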
Calculate the Sorensen-Dice dissimilarities for each pair of factors or for the index.
Description
This function takes a dataframe and a variable or variables (two or more) in input, and returns a matrix or matrices (two or more) with the Sorensen-Dice dissimilarities about the factors inside them. You can also select "index" to calculate the Sorensen-Dice dissimilarities between each row.
Usage
csorensendice(
dataset,
formula,
plot = TRUE,
plot_title = "Sorensen-Dice Dissimilarity Between Groups",
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
The index of the dataframe, otherwise a variable or variables (two or more) with factors which you want to calculate the Sorensen-Dice dissimilarities matrix or matrices (two or more). |
plot |
Logical, if TRUE, a plot or plots (two or more) of the Sorensen-Dice dissimilarities matrix or matrices about factors (two or more) are displayed. |
plot_title |
Character string giving the title of the plot or plots when plot is TRUE. The default value is "Sorensen-Dice Dissimilarity Between Groups". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. For "index", this value is always 1. |
Value
According to the option chosen in formula, with "index" the Sorensen-Dice dissimilarities matrix will be printed; instead, by specifying variables, the Sorensen-Dice dissimilarities matrix or matrices (two or more) between each pair of groups and, optionally, the plot or plots (two or more) will be printed.
Note
If "index" is selected with variables, only dissimilarities between rows are calculated. Therefore, this snippet: "csorensendice(mtcars, ~am + carb + index)" will print dissimilarities only considering "index". Rows with NA values are omitted.
Examples
# Example with the iris dataset
data(iris)
csorensendice(iris, ~Species,
plot = TRUE, plot_title = "Sorensen-Dice Dissimilarity Between Groups")
# Example with the mtcars dataset
data(mtcars)
csorensendice(mtcars, ~am, plot = TRUE,
plot_title = "Sorensen-Dice Dissimilarity Between Groups")
# Calculate the Sorensen-Dice dissimilarity for 32 car models in "mtcars" dataset
res <- csorensendice(mtcars, ~index)
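For binary (presence/absence) vectors, the Sorensen-Dice dissimilarity is one minus twice the number of shared "on" positions divided by the total number of "on" positions in both vectors. The helper `sorensen_dice_binary` below is a hypothetical base R sketch of that definition; how `csorensendice()` treats continuous columns is not described in this manual.

```r
# Hypothetical helper (not from the package): Sorensen-Dice dissimilarity
# for 0/1 vectors.
sorensen_dice_binary <- function(x, y) {
  stopifnot(length(x) == length(y))
  both <- sum(x != 0 & y != 0)            # shared "on" positions
  1 - 2 * both / (sum(x != 0) + sum(y != 0))
}

x <- c(1, 1, 0, 0)
y <- c(1, 0, 1, 0)
sd_xy <- sorensen_dice_binary(x, y)  # 1 - 2 * 1 / (2 + 2)
sd_xx <- sorensen_dice_binary(x, x)  # identical vectors give 0
```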
Generate a Microsoft Word document about the Bhattacharyya dissimilarities matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Bhattacharyya dissimilarities matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_cbhattacharyya(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Bhattacharyya dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to retain. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. |
Value
A Microsoft Word document about the Bhattacharyya dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_cbhattacharyya(iris, ~Species,
pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_cbhattacharyya(mtcars, ~am,
pvalue.method = "bootstrap", seed = 123)
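To show what a permutation p-value and the `seed` argument amount to in general, the base R sketch below permutes group labels and recomputes a distance between two group mean vectors. The helpers `centroid_distance` and `permutation_pvalue`, the Euclidean test statistic, the chosen columns, and the 199 permutations are all assumptions made for illustration; the manual does not describe the statistic or the number of resamples the report functions use.

```r
# Hypothetical helper: Euclidean distance between the mean vectors of the
# two levels of factor g (an assumed test statistic).
centroid_distance <- function(x, g) {
  m1 <- colMeans(x[g == levels(g)[1], , drop = FALSE])
  m2 <- colMeans(x[g == levels(g)[2], , drop = FALSE])
  sqrt(sum((m1 - m2)^2))
}

# Hypothetical helper: permutation p-value for that statistic.
permutation_pvalue <- function(x, g, n_perm = 199, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)  # makes the result reproducible
  observed <- centroid_distance(x, g)
  # Shuffle the labels to simulate "no group difference"
  permuted <- replicate(n_perm, centroid_distance(x, sample(g)))
  (sum(permuted >= observed) + 1) / (n_perm + 1)
}

data(mtcars)
x <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
g <- factor(mtcars$am)
p_first  <- permutation_pvalue(x, g, seed = 123)
p_second <- permutation_pvalue(x, g, seed = 123)  # same seed, same p-value
```

Without a seed, repeated calls would generally give slightly different p-values because the label shuffles differ.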
Generate a Microsoft Word document about the Bray-Curtis dissimilarities matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Bray-Curtis dissimilarities matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_cbraycurtis(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Bray-Curtis dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for 'bootstrap' or 'permutation'. |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Bray-Curtis dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_cbraycurtis(iris, ~Species,
pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_cbraycurtis(mtcars, ~am,
pvalue.method = 'bootstrap', seed = 124)
Generate a Microsoft Word document about the Canberra distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Canberra distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_ccanberra(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Canberra distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Canberra distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_ccanberra(iris, ~Species,
pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_ccanberra(mtcars, ~am,
pvalue.method = "bootstrap", seed = 123)
# Generate a report about "am" factor in mtcars dataset,
# using "bootstrap" method without setting a seed
generate_report_ccanberra(mtcars, ~am, pvalue.method = "bootstrap")
Generate a Microsoft Word document about the Chebyshev distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Chebyshev distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_cchebyshev(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Chebyshev distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Chebyshev distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_cchebyshev(iris, ~Species, pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_cchebyshev(mtcars, ~am,
pvalue.method = "bootstrap", seed = 100)
Generate a Microsoft Word document about the Cosine dissimilarities matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Cosine dissimilarities matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_ccosine(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Cosine dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Cosine dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_ccosine(iris, ~Species, pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_ccosine(mtcars, ~am,
pvalue.method = "bootstrap", seed = 123)
Generate a Microsoft Word document about the Euclidean distances matrix or matrices and the p-values matrix or matrices.
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Euclidean distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_ceuclide(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Euclidean distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Euclidean distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with airquality dataset
data(airquality)
# Generate a report about "Month" factor in airquality dataset
generate_report_ceuclide(airquality, ~Month, pvalue.method = 'bootstrap',
min_group_size = 3)
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" and "vs" factors in mtcars dataset
generate_report_ceuclide(mtcars, ~am + vs,
pvalue.method = 'bootstrap', seed = 100, min_group_size = 3)
Generate a Microsoft Word document about the Hamming distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Hamming distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_chamming(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Hamming distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Hamming distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_chamming(iris, ~Species)
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_chamming(mtcars, ~am,
pvalue.method = "bootstrap", seed = 124)
Generate a Microsoft Word document about the Hellinger distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Hellinger distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_chellinger(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Hellinger distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for 'bootstrap' and 'permutation'. |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Hellinger distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_chellinger(iris, ~Species,
pvalue.method = "bootstrap")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_chellinger(mtcars, ~am,
pvalue.method = "bootstrap", seed = 100)
Generate a Microsoft Word document about the Jaccard distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Jaccard distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_cjaccard(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Jaccard distances matrix or matrices and the p_values matrix or matrices. |
pvalue.method |
A p_value method used to calculate the matrix, the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Jaccard distance matrix or matrices and the p_values matrix or matrices.
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_cjaccard(iris, ~Species,
pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_cjaccard(mtcars, ~am,
pvalue.method = "bootstrap", seed = 223)
Generate a Microsoft Word document about the Mahalanobis distances matrix or matrices and the p-values matrix or matrices.
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Mahalanobis distances matrix or matrices (two or more) and the p-values matrix or matrices.
Usage
generate_report_cmahalanobis(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3,
pvalues_chisq = FALSE
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Mahalanobis distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
pvalues_chisq |
If TRUE, report the result of the chi-squared test on the squared distances. With "pvalues_chisq = FALSE" the reported distances are not squared; with "pvalues_chisq = TRUE" the squared Mahalanobis distance matrix is reported together with the corresponding p_values. The default value is FALSE. |
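The relation between the unsquared distance, the squared distance and a chi-squared p_value can be sketched with base R. This is only an illustration under stated assumptions, not the package's internal code: the distance is taken between two group mean vectors, a pooled covariance matrix is used, and the degrees of freedom are assumed to equal the number of variables.

```r
data(iris)
vars <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
g1 <- iris[iris$Species == "setosa", vars]
g2 <- iris[iris$Species == "versicolor", vars]
n1 <- nrow(g1)
n2 <- nrow(g2)

# Pooled covariance of the two groups (an assumption of this sketch)
pooled_cov <- ((n1 - 1) * cov(g1) + (n2 - 1) * cov(g2)) / (n1 + n2 - 2)

# stats::mahalanobis() returns the SQUARED Mahalanobis distance
d2 <- stats::mahalanobis(colMeans(g1), center = colMeans(g2), cov = pooled_cov)
# Unsquared distance
d <- sqrt(d2)

# Upper-tail chi-squared probability of the squared distance,
# assuming df = number of variables
p_chisq <- stats::pchisq(d2, df = length(vars), lower.tail = FALSE)

c(distance = d, squared_distance = d2, p_value = p_chisq)
```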
Value
A Microsoft Word document about the Mahalanobis distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about the "Species"
# factor in the iris dataset using the "permutation" method.
generate_report_cmahalanobis(iris, ~Species, min_group_size = 3)
# Example with mtcars dataset
data(mtcars)
# Generate a report about the "am" and "vs" factors in mtcars using the "bootstrap" method.
generate_report_cmahalanobis(mtcars, ~am + vs,
pvalue.method = "bootstrap",
seed = 100, min_group_size = 2)
Generate a Microsoft Word document about the Manhattan distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Manhattan distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_cmanhattan(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Manhattan distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Manhattan distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_cmanhattan(iris, ~Species, pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_cmanhattan(mtcars, ~am,
pvalue.method = 'bootstrap', seed = 123)
Generate a Microsoft Word document about the Minkowski dissimilarities/distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Minkowski dissimilarities/distances matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_cminkowski(
dataset,
formula,
p = 3,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Minkowski dissimilarities/distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
p |
Order of the Minkowski dissimilarities/distances. The default value is 3. |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Details
When p < 1, the Minkowski measure does not satisfy the triangle inequality, so it is only a "dissimilarity" measure. When p >= 1, the triangle inequality is satisfied and the measure is a proper "Minkowski distance".
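A short numeric check makes the difference between the two cases concrete. The function minkowski() below is a hypothetical helper written for this illustration, not a package function; it applies the usual Minkowski formula to two numeric vectors.

```r
# Hypothetical helper: Minkowski measure of order p between two vectors
minkowski <- function(x, y, p) sum(abs(x - y)^p)^(1 / p)

a <- c(0, 0)
b <- c(1, 0)
z <- c(1, 1)

# p = 0.5: going directly from a to z is "longer" than passing through b,
# so the triangle inequality fails
direct_half <- minkowski(a, z, p = 0.5)
detour_half <- minkowski(a, b, p = 0.5) + minkowski(b, z, p = 0.5)
direct_half > detour_half

# p = 3: the direct path is never longer than the detour
direct_3 <- minkowski(a, z, p = 3)
detour_3 <- minkowski(a, b, p = 3) + minkowski(b, z, p = 3)
direct_3 <= detour_3
```

With p = 0.5 the direct value is 4 while the detour sums to 2; with p = 3 the direct value is 2^(1/3) and the detour is again 2.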
Value
A Microsoft Word document about the Minkowski dissimilarities/distances matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_cminkowski(iris, ~Species, p = 3,
pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_cminkowski(mtcars, ~am,
p = 3, pvalue.method = 'permutation', seed = 234)
Generate a Microsoft Word document about the Sorensen-Dice dissimilarity matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Description
This function takes a dataframe, a factor or factors (two or more) and returns a Microsoft Word document about the Sorensen-Dice dissimilarities matrix or matrices (two or more) and the p-values matrix or matrices (two or more).
Usage
generate_report_csorensendice(
dataset,
formula,
pvalue.method = "permutation",
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Sorensen-Dice dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A Microsoft Word document about the Sorensen-Dice dissimilarities matrix or matrices (two or more) and the p_values matrix or matrices (two or more).
Examples
# Example with iris dataset
data(iris)
# Generate a report about "Species" factor in iris dataset
generate_report_csorensendice(iris, ~Species,
pvalue.method = 'permutation')
# Example with mtcars dataset
data(mtcars)
# Generate a report about "am" factor in mtcars dataset
generate_report_csorensendice(mtcars, ~am,
pvalue.method = "bootstrap", seed = 123)
Calculate the p_values matrix or matrices (two or more) for each factor inside variable or variables (two or more), using Bhattacharyya dissimilarities as a base.
Description
Using the Bhattacharyya dissimilarity for the dissimilarities calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
Usage
pvaluescbatt(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Bhattacharyya dissimilarities matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
plot |
if TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluescbatt(iris,~Species, pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluescbatt(mtcars,~am,
pvalue.method = "bootstrap", seed = 123)
Calculate the p_values matrix or matrices (two or more) for each factor inside variable or variables (two or more), using Bray-Curtis dissimilarity as a base.
Description
Using the Bray-Curtis dissimilarity for the dissimilarities calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
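As a rough picture of how a "bootstrap" p_value for a dissimilarity can be obtained, the following base R sketch resamples rows with replacement from the pooled data. It is one possible scheme under stated assumptions and not necessarily the one implemented by the package: the helpers bray_curtis() and boot_pvalue() are hypothetical, the dissimilarity is computed between group mean vectors, and only two groups are compared.

```r
# Hypothetical helper: Bray-Curtis dissimilarity between two
# non-negative vectors
bray_curtis <- function(a, b) sum(abs(a - b)) / sum(a + b)

# Hypothetical helper: bootstrap p_value for two groups
boot_pvalue <- function(x, g, n_boot = 499, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  g <- factor(g)
  lv <- levels(g)
  n1 <- sum(g == lv[1])
  n2 <- sum(g == lv[2])
  observed <- bray_curtis(colMeans(x[g == lv[1], , drop = FALSE]),
                          colMeans(x[g == lv[2], , drop = FALSE]))
  # Under the null hypothesis both groups come from the pooled data,
  # so both samples are drawn with replacement from all rows
  boot <- replicate(n_boot, {
    s1 <- x[sample(nrow(x), n1, replace = TRUE), , drop = FALSE]
    s2 <- x[sample(nrow(x), n2, replace = TRUE), , drop = FALSE]
    bray_curtis(colMeans(s1), colMeans(s2))
  })
  (sum(boot >= observed) + 1) / (n_boot + 1)
}

data(iris)
two <- droplevels(iris[iris$Species != "virginica", ])
x <- as.matrix(two[, 1:4])
p_boot <- boot_pvalue(x, two$Species, seed = 123)
p_boot
```

Passing the same seed reproduces the same p_value, which is the purpose of the seed argument.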
Usage
pvaluescbrcu(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Bray-Curtis dissimilarities matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
plot |
if TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optionally, set a seed for 'bootstrap' or 'permutation'. |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluescbrcu(iris,~Species, pvalue.method = "bootstrap")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluescbrcu(mtcars,~am,
pvalue.method = "permutation", seed = 111)
Calculate the p_values matrix or matrices (two or more) for each factor inside variable or variables (two or more), using Canberra distance as a base.
Description
Using the Canberra distance for the distances calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
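For orientation, a Canberra distance matrix between groups can be sketched with base R alone. The example assumes that each group is summarised by its vector of variable means, which may differ from what the package does internally; stats::dist() with method = "canberra" is a standard base R call.

```r
data(iris)
# One row of variable means per species
means <- aggregate(. ~ Species, data = iris, FUN = mean)
m <- as.matrix(means[, -1])
rownames(m) <- as.character(means$Species)

# Canberra distances between the three mean vectors
d_canb <- stats::dist(m, method = "canberra")
canb <- as.matrix(d_canb)
canb
```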
Usage
pvaluesccanb(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Canberra distances matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
plot |
if TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluesccanb(iris,~Species, pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" and "vs" variables in mtcars dataset
pvaluesccanb(mtcars,~am + vs,
pvalue.method = "permutation", seed = 100)
Calculate the p_values matrix or matrices (two or more) for each factor inside variable or variables (two or more), using Chebyshev distance as a base.
Description
Using the Chebyshev distance for the distances calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
Usage
pvaluesccheb(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Chebyshev distances matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
plot |
if TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with "iris" dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluesccheb(iris,~Species, pvalue.method = "permutation")
# Example with "mtcars" dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluesccheb(mtcars,~am,
pvalue.method = "bootstrap", seed = 100)
Calculate the p_values matrix or matrices (two or more) for each factor inside variable or variables (two or more), using Cosine dissimilarity as a base.
Description
Using the Cosine dissimilarity for the dissimilarities calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
Usage
pvaluesccosi(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Cosine dissimilarities matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
plot |
if TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optionally, set a seed for "bootstrap" or "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluesccosi(iris,~Species, pvalue.method = "permutation")
# Example with mtcars
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluesccosi(mtcars,~am,
pvalue.method = "permutation", seed = 123)
Calculate the p_values matrix or matrices (two or more) for each pair of factors inside variable or variables (two or more), using Euclidean distance as a base.
Description
Using the Euclidean distance for the distances calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
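The idea behind the "permutation" method can be sketched in base R as follows. This is a simplified illustration and not the package's implementation: the helpers group_distance() and perm_pvalue() are hypothetical, only two groups are compared, and the Euclidean distance is taken between the group mean vectors.

```r
# Hypothetical helper: Euclidean distance between the mean vectors
# of the two levels of the factor g
group_distance <- function(x, g) {
  lv <- levels(g)
  m1 <- colMeans(x[g == lv[1], , drop = FALSE])
  m2 <- colMeans(x[g == lv[2], , drop = FALSE])
  sqrt(sum((m1 - m2)^2))
}

# Hypothetical helper: permutation p_value obtained by shuffling labels
perm_pvalue <- function(x, g, n_perm = 499, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  g <- factor(g)
  observed <- group_distance(x, g)
  # Shuffle the group labels and recompute the distance each time
  permuted <- replicate(n_perm, group_distance(x, sample(g)))
  # Share of permuted distances at least as large as the observed one
  (sum(permuted >= observed) + 1) / (n_perm + 1)
}

data(mtcars)
x <- as.matrix(mtcars[, c("mpg", "hp", "wt")])
p_first <- perm_pvalue(x, mtcars$am, seed = 100)
p_second <- perm_pvalue(x, mtcars$am, seed = 100)
# The same seed gives the same p_value
identical(p_first, p_second)
```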
Usage
pvaluesceucl(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Euclidean distances matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
plot |
if TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optionally, set a seed for "bootstrap" and "permutation" methods. |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with iris dataset
# Calculate p_values of "Species" variable in iris dataset
pvaluesceucl(iris, ~Species, pvalue.method = "permutation",
min_group_size = 3)
# Example with mtcars dataset
# Calculate p_values of "am" and "carb" variables in mtcars dataset
pvaluesceucl(mtcars,~am + carb,
pvalue.method = "bootstrap",
seed = 100, min_group_size = 2)
Calculate the p_values matrix or matrices (two or more) for each factor inside variable or variables (two or more), using Hamming distance as a base.
Description
Using the Hamming distance for the distances calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
Usage
pvalueschamm(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) with factors which you want to calculate the Hamming distances matrix or matrices (two or more). |
pvalue.method |
A p_value method used to calculate the matrix or matrices (two or more), the default value is "permutation". Another method is "bootstrap". |
plot |
if TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optionally, set a seed for "bootstrap" and "permutation". |
min_group_size |
Minimum group size to maintain. The default value is 3, therefore groups, inside variables, with less than 3 observations will be discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with "iris" dataset
data(iris)
# Calculate p_values of "Species" variable in "iris" dataset
pvalueschamm(iris,~Species, pvalue.method = "bootstrap")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvalueschamm(mtcars,~am,
pvalue.method = "permutation", seed = 100)
Calculate the p_values matrix or matrices (two or more) for each factor inside variable or variables (two or more), using Hellinger distances as a base.
Description
Using the Hellinger distance for the distances calculation, this function takes a dataframe, a variable or variables (two or more) and a p_value method ("bootstrap" or "permutation"), and returns the p_values matrix or matrices (two or more) between each pair of factors, together with a plot or plots (two or more) if plot is TRUE or left at its default.
Usage
pvalueschell(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) containing the factors for which you want to calculate the Hellinger distance matrix or matrices (two or more), given as a one-sided formula (for example ~Species). |
pvalue.method |
The p_value method used to calculate the matrix or matrices (two or more). The default value is "permutation". Another method is "bootstrap". |
plot |
If TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optional seed for the "bootstrap" and "permutation" methods. The default value is NULL. |
min_group_size |
Minimum group size to keep. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvalueschell(iris,~Species, pvalue.method = "bootstrap")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvalueschell(mtcars,~am,
pvalue.method = "permutation", seed = 122)
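The "bootstrap" method is not spelled out in this manual, so the following is only a sketch of one common way to obtain a bootstrap p_value for a distance between two groups: resample rows with replacement from the pooled data (mimicking the null hypothesis of no group difference) and count how often the resampled distance reaches the observed one. The helper names hellinger_sketch and boot_pvalue_sketch, the choice of statistic (Hellinger distance between group mean profiles rescaled to sum to 1), and the n_boot parameter are all assumptions made for illustration. The package's own procedure may differ.

```r
# Hypothetical helper: Hellinger distance between two probability vectors.
hellinger_sketch <- function(p, q) {
  stopifnot(length(p) == length(q))
  sqrt(sum((sqrt(p) - sqrt(q))^2)) / sqrt(2)
}

# Hypothetical helper: bootstrap p_value for two groups (assumed procedure).
boot_pvalue_sketch <- function(x, g, n_boot = 999, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  g <- factor(g)
  x <- as.matrix(x)
  stopifnot(nlevels(g) == 2, nrow(x) == length(g), all(x >= 0))
  idx1 <- which(g == levels(g)[1])
  idx2 <- which(g == levels(g)[2])

  # Statistic (an assumption): Hellinger distance between the two groups'
  # mean profiles, each rescaled so that it sums to 1.
  stat <- function(rows1, rows2) {
    m1 <- colMeans(x[rows1, , drop = FALSE])
    m2 <- colMeans(x[rows2, , drop = FALSE])
    hellinger_sketch(m1 / sum(m1), m2 / sum(m2))
  }

  observed <- stat(idx1, idx2)
  pooled <- c(idx1, idx2)

  # Resample both groups from the pooled rows, keeping the group sizes.
  boot <- replicate(
    n_boot,
    stat(sample(pooled, length(idx1), replace = TRUE),
         sample(pooled, length(idx2), replace = TRUE))
  )

  # Add 1 to numerator and denominator so the p_value is never exactly 0.
  (1 + sum(boot >= observed)) / (n_boot + 1)
}

p_am <- boot_pvalue_sketch(mtcars[, c("mpg", "hp", "wt")], mtcars$am, seed = 122)
p_am
```

Passing the same seed twice reproduces the same p_value, which is the purpose of the seed argument documented above.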
Calculate the p_values matrix or matrices (two or more) for each pair of factors inside a variable or variables (two or more), using Jaccard distance as a base.
Description
Using the Jaccard distance, this function takes a dataframe, a variable or variables (two or more), and a p_value method such as "bootstrap" or "permutation". It returns the p_values matrix or matrices (two or more) between each pair of factors, and a plot or plots (two or more) if plot is TRUE, which is the default.
Usage
pvaluescjacc(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) containing the factors for which you want to calculate the Jaccard distance matrix or matrices (two or more), given as a one-sided formula (for example ~Species). |
pvalue.method |
The p_value method used to calculate the matrix or matrices (two or more). The default value is "permutation". Another method is "bootstrap". |
plot |
If TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optional seed for the "bootstrap" and "permutation" methods. The default value is NULL. |
min_group_size |
Minimum group size to keep. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with the iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluescjacc(iris,~Species, pvalue.method = "bootstrap")
# Example with the mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluescjacc(mtcars,~am,
pvalue.method = "permutation",
seed = 122)
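As a reminder of what the Jaccard distance measures, here is a minimal base R sketch for two presence/absence vectors: one minus the size of the intersection divided by the size of the union. The helper name jaccard_sketch is hypothetical, and the treatment of two all-zero vectors (distance 0) is a convention chosen for this sketch, not something taken from the package.

```r
# Hypothetical helper, not part of the package: Jaccard distance between two
# presence/absence vectors of equal length.
jaccard_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))
  x <- as.logical(x)
  y <- as.logical(y)
  union_size <- sum(x | y)
  # Convention for this sketch: two empty sets are treated as identical.
  if (union_size == 0) return(0)
  1 - sum(x & y) / union_size
}

a <- c(1, 1, 0, 0, 1)
b <- c(1, 0, 0, 1, 1)

# 2 shared positions out of 4 in the union, so the distance is 0.5.
jaccard_sketch(a, b)
```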
Calculate the p_values matrix or matrices (two or more) for each pair of factors inside a variable or variables (two or more), using Mahalanobis distance as a base.
Description
Using the Mahalanobis distance, this function takes a dataframe, a variable or variables (two or more), and a p_value method such as "bootstrap" or "permutation". It returns the p_values matrix or matrices (two or more) between each pair of factors, and a plot or plots (two or more) if plot is TRUE, which is the default.
Usage
pvaluescmaha(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) containing the factors for which you want to calculate the Mahalanobis distance matrix or matrices (two or more), given as a one-sided formula (for example ~Month or ~am + carb). |
pvalue.method |
The p_value method used to calculate the matrix or matrices (two or more). The default value is "permutation". Another method is "bootstrap". |
plot |
If TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optional seed for the "bootstrap" and "permutation" methods. The default value is NULL. |
min_group_size |
Minimum group size to keep. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with "airquality" dataset
data(airquality)
# Calculate p_values of "Month" variable in "airquality" dataset
pvaluescmaha(airquality,~Month, pvalue.method = "permutation", seed = 12,
min_group_size = 3)
# Example with "mtcars" dataset
data(mtcars)
# Calculate p_values of "am" and "carb" variables in mtcars dataset
pvaluescmaha(mtcars,~am + carb,
pvalue.method = "permutation", seed = 100, min_group_size = 2)
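The effect of min_group_size can be illustrated in base R. The sketch below drops the levels of a grouping variable that have fewer observations than the threshold, which is what the argument description states. The helper name filter_small_groups_sketch is hypothetical, and the package's internal filtering may be implemented differently.

```r
# Hypothetical helper, not part of the package: keep only the rows whose
# group has at least min_group_size observations.
filter_small_groups_sketch <- function(dataset, group_var, min_group_size = 3) {
  counts <- table(dataset[[group_var]])
  keep <- names(counts)[counts >= min_group_size]
  dataset[as.character(dataset[[group_var]]) %in% keep, , drop = FALSE]
}

# Group sizes of "carb" in mtcars: the levels 6 and 8 have one car each.
table(mtcars$carb)

kept <- filter_small_groups_sketch(mtcars, "carb", min_group_size = 2)

# The single-car groups 6 and 8 are dropped, leaving levels 1, 2, 3 and 4.
sort(unique(kept$carb))
```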
Calculate the p_values matrix or matrices (two or more) for each pair of factors inside a variable or variables (two or more), using Manhattan distance as a base.
Description
Using the Manhattan distance, this function takes a dataframe, a variable or variables (two or more), and a p_value method such as "bootstrap" or "permutation". It returns the p_values matrix or matrices (two or more) between each pair of factors, and a plot or plots (two or more) if plot is TRUE, which is the default.
Usage
pvaluescmanh(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) containing the factors for which you want to calculate the Manhattan distance matrix or matrices (two or more), given as a one-sided formula (for example ~Species). |
pvalue.method |
The p_value method used to calculate the matrix or matrices (two or more). The default value is "permutation". Another method is "bootstrap". |
plot |
If TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optional seed for the "bootstrap" and "permutation" methods. The default value is NULL. |
min_group_size |
Minimum group size to keep. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluescmanh(iris,~Species, pvalue.method = "bootstrap")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluescmanh(mtcars,~am,
pvalue.method = "permutation", seed = 123)
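To make the "permutation" method more concrete, here is a minimal base R sketch of a generic permutation test for the distance between two groups: shuffle the group labels many times and count how often the shuffled distance is at least as large as the observed one. The helper name perm_pvalue_sketch, the statistic (Manhattan distance between the two group mean vectors), and the n_perm parameter are assumptions for illustration only. The manual does not describe the package's exact procedure, which may differ.

```r
# Hypothetical helper, not part of the package: permutation p_value for the
# Manhattan distance between the mean vectors of two groups.
perm_pvalue_sketch <- function(x, g, n_perm = 999, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  g <- factor(g)
  x <- as.matrix(x)
  stopifnot(nlevels(g) == 2, nrow(x) == length(g))

  manhattan_between_means <- function(labels) {
    m1 <- colMeans(x[labels == levels(g)[1], , drop = FALSE])
    m2 <- colMeans(x[labels == levels(g)[2], , drop = FALSE])
    sum(abs(m1 - m2))
  }

  observed <- manhattan_between_means(g)

  # Shuffling the labels breaks any link between rows and groups.
  permuted <- replicate(n_perm, manhattan_between_means(sample(g)))

  # Add 1 to numerator and denominator so the p_value is never exactly 0.
  (1 + sum(permuted >= observed)) / (n_perm + 1)
}

# Compare two of the three iris species on the four numeric columns.
two_species <- droplevels(iris[iris$Species != "setosa", ])
p_perm <- perm_pvalue_sketch(two_species[, 1:4], two_species$Species, seed = 123)
p_perm
```

With more than two groups, the same test would be repeated for every pair of levels, giving a symmetric matrix of p_values like the one the function returns.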
Calculate the p_values matrix or matrices (two or more) for each pair of factors inside a variable or variables (two or more), using Minkowski dissimilarity/distance as a base.
Description
Using the Minkowski dissimilarity/distance, this function takes a dataframe, a variable or variables (two or more), and a p_value method such as "bootstrap" or "permutation". It returns the p_values matrix or matrices (two or more) between each pair of factors, and a plot or plots (two or more) if plot is TRUE, which is the default.
Usage
pvaluescmink(
dataset,
formula,
pvalue.method = "permutation",
p = 3,
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) containing the factors for which you want to calculate the Minkowski dissimilarity/distance matrix or matrices (two or more), given as a one-sided formula (for example ~Species). |
pvalue.method |
The p_value method used to calculate the matrix or matrices (two or more). The default value is "permutation". Another method is "bootstrap". |
p |
Order of the Minkowski dissimilarity/distance. The default value is 3. |
plot |
If TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optional seed for the "bootstrap" and "permutation" methods. The default value is NULL. |
min_group_size |
Minimum group size to keep. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Note
When p < 1, the triangle inequality does not hold, so the Minkowski measure is only a "dissimilarity". When p >= 1, the triangle inequality is satisfied and the measure is a true distance, the "Minkowski distance".
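A small worked example may help show why the triangle inequality matters here. Using the standard Minkowski formula, and points chosen purely for illustration, the case p = 1/2 gives a direct path that is longer than the detour through an intermediate point:

```latex
\[
d_p(x, y) = \Bigl( \sum_{i=1}^{n} |x_i - y_i|^{p} \Bigr)^{1/p}
\]

% Illustrative points: x = (0, 0), y = (1, 0), z = (1, 1), with p = 1/2.
\[
d_{1/2}(x, y) = \bigl(1^{1/2} + 0^{1/2}\bigr)^{2} = 1, \qquad
d_{1/2}(y, z) = \bigl(0^{1/2} + 1^{1/2}\bigr)^{2} = 1, \qquad
d_{1/2}(x, z) = \bigl(1^{1/2} + 1^{1/2}\bigr)^{2} = 4
\]

\[
d_{1/2}(x, z) = 4 > 2 = d_{1/2}(x, y) + d_{1/2}(y, z)
\]
```

Because the triangle inequality fails, the measure with p = 1/2 cannot be called a distance in the metric sense.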
Examples
# Example with iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluescmink(iris,~Species, p = 3, pvalue.method = "permutation")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluescmink(mtcars,~am, p = 3,
pvalue.method = "permutation", seed = 100)
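The role of the order p can be checked numerically with a short base R sketch. The helper name minkowski_sketch is hypothetical and not part of the package. It implements the standard Minkowski formula for two numeric vectors and shows that p = 1 and p = 2 reduce to the Manhattan and Euclidean distances.

```r
# Hypothetical helper, not part of the package: Minkowski dissimilarity of
# order p between two numeric vectors of equal length.
minkowski_sketch <- function(x, y, p) {
  stopifnot(length(x) == length(y), p > 0)
  sum(abs(x - y)^p)^(1 / p)
}

x <- c(0, 0)
y <- c(3, 4)

minkowski_sketch(x, y, p = 1)  # 7, the Manhattan distance
minkowski_sketch(x, y, p = 2)  # 5, the Euclidean distance
minkowski_sketch(x, y, p = 3)  # order 3, the default of pvaluescmink
```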
Calculate the p_values matrix or matrices (two or more) for each pair of factors inside a variable or variables (two or more), using Sorensen-Dice dissimilarity as a base.
Description
Using the Sorensen-Dice dissimilarity, this function takes a dataframe, a variable or variables (two or more), and a p_value method such as "bootstrap" or "permutation". It returns the p_values matrix or matrices (two or more) between each pair of factors, and a plot or plots (two or more) if plot is TRUE, which is the default.
Usage
pvaluescsore(
dataset,
formula,
pvalue.method = "permutation",
plot = TRUE,
seed = NULL,
min_group_size = 3
)
Arguments
dataset |
A dataframe. |
formula |
A variable or variables (two or more) containing the factors for which you want to calculate the Sorensen-Dice dissimilarity matrix or matrices (two or more), given as a one-sided formula (for example ~Species). |
pvalue.method |
The p_value method used to calculate the matrix or matrices (two or more). The default value is "permutation". Another method is "bootstrap". |
plot |
If TRUE, plot the p_values heatmap or heatmaps (two or more). The default value is TRUE. |
seed |
Optional seed for the "bootstrap" and "permutation" methods. The default value is NULL. |
min_group_size |
Minimum group size to keep. The default value is 3, so groups within a variable that have fewer than 3 observations are discarded. |
Value
A list containing a matrix or matrices (two or more) of p_values and, optionally, the plot.
Examples
# Example with the iris dataset
data(iris)
# Calculate p_values of "Species" variable in iris dataset
pvaluescsore(iris,~Species, pvalue.method = "bootstrap")
# Example with mtcars dataset
data(mtcars)
# Calculate p_values of "am" variable in mtcars dataset
pvaluescsore(mtcars,~am,
pvalue.method = "permutation", seed = 134)
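For reference, the Sorensen-Dice dissimilarity between two presence/absence vectors is one minus twice the size of the intersection divided by the sum of the two set sizes. The base R sketch below uses the hypothetical helper name sorensen_dice_sketch and treats two all-zero vectors as identical. Both are choices made for this illustration and are not taken from the package.

```r
# Hypothetical helper, not part of the package: Sorensen-Dice dissimilarity
# between two presence/absence vectors of equal length.
sorensen_dice_sketch <- function(x, y) {
  stopifnot(length(x) == length(y))
  x <- as.logical(x)
  y <- as.logical(y)
  total <- sum(x) + sum(y)
  # Convention for this sketch: two empty sets are treated as identical.
  if (total == 0) return(0)
  1 - 2 * sum(x & y) / total
}

a <- c(1, 1, 0, 0, 1)
b <- c(1, 0, 0, 1, 1)

# 2 shared positions, set sizes 3 and 3: 1 - 4/6, which is one third.
sorensen_dice_sketch(a, b)
```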