% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/mmcmcBayes.R
\name{mmcmcBayes}
\alias{mmcmcBayes}
\title{Multi-stage MCMC Bayesian Method for DMR Detection}
\usage{
mmcmcBayes(
  cancer_data,
  normal_data,
  stage = 1,
  max_stages = 3,
  num_splits = 50,
  mcmc = NULL,
  priors_cancer = NULL,
  priors_normal = NULL,
  bf_thresholds = c(0.5, 0.8, 1.05)
)
}
\arguments{
\item{cancer_data}{A data frame of methylation data for the cancer group.
Rows correspond to CpG sites and columns to variables. The first two
columns must be \code{CpG_ID} and \code{Chromosome}, and the remaining
columns must be numeric M-values for cancer samples.}

\item{normal_data}{A data frame of methylation data for the normal group
in the same format and CpG ordering as \code{cancer_data}.}

\item{stage}{Integer indicating the starting stage for the multistage
analysis. Usually left at the default \code{stage = 1}.}

\item{max_stages}{Integer giving the maximum number of stages in the
splitting procedure (default \code{3}). Larger values allow deeper
splitting of regions at the cost of additional computation.}

\item{num_splits}{Integer giving the number of subregions created when a region
is split at each stage (default \code{50}). Increasing \code{num_splits}
typically improves sensitivity but increases computation time.}

\item{mcmc}{A list of MCMC control parameters passed to
\code{\link{asgn_func}}. Expected components are
\code{nburn} (burn-in iterations),
\code{niter} (total iterations), and
\code{thin} (thinning interval).
If \code{NULL}, default values
\code{list(nburn = 5000, niter = 10000, thin = 1)} are used.}

\item{priors_cancer}{Optional list of prior hyperparameters for the ASGN
model in the cancer group, passed to \code{\link{asgn_func}}. If
\code{NULL}, default priors from \code{\link{asgn_func}} are used.}

\item{priors_normal}{Optional list of prior hyperparameters for the ASGN
model in the normal group, passed to \code{\link{asgn_func}}. If
\code{NULL}, default priors from \code{\link{asgn_func}} are used.}

\item{bf_thresholds}{Numeric vector of Bayes factor thresholds, one for
each stage (e.g., \code{c(0.5, 0.8, 1.05)}). If the length of
\code{bf_thresholds} is shorter than \code{max_stages}, the last value
is recycled so that each stage has an associated threshold. If
\code{NULL}, default thresholds \code{c(0.5, 0.8, 1.05)} are used.}
}
\value{
A data frame with one row per detected DMR and the following columns:
\itemize{
  \item \code{Chromosome}: chromosome label.
  \item \code{Start_CpG}: CpG ID where the region starts.
  \item \code{End_CpG}: CpG ID where the region ends.
  \item \code{CpG_Count}: number of CpG sites in the region.
  \item \code{Decision_Value}: final Bayes factor for the region.
  \item \code{Stage}: stage at which the region was detected.
}
If no regions pass the BF thresholds, \code{NULL} is returned.
}
\description{
This function implements a multistage MCMC Bayesian method for detecting differentially methylated
regions (DMRs) between two groups (typically cancer and normal). The method operates on methylation
measurements expressed as M-values.

For each candidate region and for each group, the function summarizes the region at the sample level
by averaging M-values across CpG sites within the region. These sample-wise means are modeled using an
alpha-skewed generalized normal (ASGN) distribution. A Bayes factor (BF) comparing the two groups is then
used within a multistage region-splitting scheme to identify DMRs.
}
\details{
The inputs \code{cancer_data} and \code{normal_data} must have the same
set of CpG sites in the same order. Each row corresponds to a CpG site,
and the first two columns are required to be:
\itemize{
  \item \code{CpG_ID}: character CpG identifier.
  \item \code{Chromosome}: chromosome label (integer or character).
}
All remaining columns are assumed to be numeric M-values for individual
samples in the respective group (e.g., \code{"M_sample1"}, \code{"M_sample2"}, \dots).
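
As an illustration of this layout, the following minimal sketch builds two
toy inputs and checks the requirements stated above. The object names
\code{cancer_toy} and \code{normal_toy}, the CpG identifiers, and the
M-values are hypothetical, and the checks are not necessarily the
validation performed inside \code{mmcmcBayes}.
\preformatted{
# Toy inputs: 3 CpG sites, 2 samples per group (made-up values)
cancer_toy <- data.frame(
  CpG_ID     = c("cg0001", "cg0002", "cg0003"),
  Chromosome = c(1, 1, 1),
  M_sample1  = c(-1.2, 0.4, 2.1),
  M_sample2  = c(-0.8, 0.1, 1.7)
)
normal_toy <- data.frame(
  CpG_ID     = c("cg0001", "cg0002", "cg0003"),
  Chromosome = c(1, 1, 1),
  M_sample1  = c(-1.5, -0.2, 0.3),
  M_sample2  = c(-1.1, 0.0, 0.6)
)

# First two columns named as required, same CpG sites in the same
# order, and all remaining columns numeric
stopifnot(
  identical(names(cancer_toy)[1:2], c("CpG_ID", "Chromosome")),
  identical(names(normal_toy)[1:2], c("CpG_ID", "Chromosome")),
  identical(cancer_toy$CpG_ID, normal_toy$CpG_ID),
  all(vapply(cancer_toy[-(1:2)], is.numeric, logical(1))),
  all(vapply(normal_toy[-(1:2)], is.numeric, logical(1)))
)
}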

For each group, sample-wise mean M-values are computed and passed to
\code{\link{asgn_func}} to obtain posterior means of the ASGN parameters.
A Bayes factor (BF) comparing the two groups is then computed for the
current region. If the BF exceeds a stage-specific threshold, the region
is either accepted as a DMR (at the final stage) or split into subregions
and analyzed at the next stage. This continues until either
\code{max_stages} is reached or no subregion passes the BF thresholds.
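
Two of the steps described here can be sketched in base R: the
sample-wise averaging of M-values over the CpG sites of a region, and the
recycling of the last Bayes factor threshold when \code{bf_thresholds} is
shorter than \code{max_stages}. This is only an illustration under
assumed toy data. The names \code{region}, \code{sample_means}, and
\code{n} are hypothetical, and the code is not taken from the package
internals.
\preformatted{
# Toy region: 3 CpG sites (rows) by 2 samples (columns) of M-values
region <- data.frame(
  CpG_ID     = c("cg0001", "cg0002", "cg0003"),
  Chromosome = c(1, 1, 1),
  M_sample1  = c(-1.2, 0.4, 2.0),
  M_sample2  = c(-0.8, 0.2, 1.8)
)

# One mean per sample, averaged over the CpG sites in the region
sample_means <- colMeans(region[, -(1:2), drop = FALSE])

# Recycle the last threshold so that every stage has a threshold
max_stages <- 3
bf_thresholds <- c(0.5, 0.8)
n <- length(bf_thresholds)
if (n < max_stages) {
  bf_thresholds <- c(bf_thresholds, rep(bf_thresholds[n], max_stages - n))
}
bf_thresholds
}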

The values used in the examples are intentionally small to ensure fast
execution and are not intended as recommended settings for real analyses.
}
\examples{
\donttest{
# Load the datasets
data(cancer_demo)
data(normal_demo)

mcmc <- list(nburn = 1000, niter = 2000, thin = 1)

set.seed(2021)

rst <- mmcmcBayes(cancer_demo, normal_demo,
                 stage = 1,
                 max_stages = 2,
                 num_splits = 5,
                 mcmc = mcmc,
                 priors_cancer = NULL,
                 priors_normal = NULL,
                 bf_thresholds = c(0.5, 0.8, 1.05))

print(rst)
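
# mmcmcBayes() returns NULL when no region passes the BF thresholds,
# so it can help to check for NULL before working with the result.
# As a small sketch, the detected DMRs are ordered here by their final
# Bayes factor (the Decision_Value column), largest first.
if (!is.null(rst)) {
  print(rst[order(rst$Decision_Value, decreasing = TRUE), ])
}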

}

}
\seealso{
\code{\link{asgn_func}} for ASGN parameter estimation,
\code{\link{plot_dmr_region}} for visualizing individual DMR profiles,
\code{\link{summarize_dmrs}} for summarizing detected regions,
\code{\link{compare_dmrs}} for comparing DMR sets.
}
\author{
Zhexuan Yang, Duchwan Ryu, and Feng Luan
}
