Performs regional normalization of ATAC-seq data by calculating marginal coverage in windows around peaks. The function uses a "punctured window" approach, where the signal in a window around each peak (excluding the peak itself) is used to normalize the peak signal. This helps account for local background variation and accessibility biases, for example those related to GC content or mappability.

Usage

normalize_regional(
  peaks,
  mat,
  marginal_track,
  window_size = 20000,
  minimal_quantile = 0.1
)

Arguments

peaks

A data frame containing peak intervals with columns 'chrom', 'start', and 'end'

mat

Numeric matrix. The raw ATAC signal matrix to normalize, with peaks as rows and cell types as columns

marginal_track

Character. Name of the track containing marginal (summed) signal across all cell types

window_size

Numeric. Size of the window around each peak to use for normalization (in base pairs). Default: 2e4

minimal_quantile

Numeric. Quantile of the punctured window coverage used as a lower bound on the normalizer. Punctured coverage values below this quantile are raised to it, which prevents division by very small values. Default: 0.1

Value

A normalized matrix with the same dimensions as the input, where each value has been adjusted for local background signal

Details

The normalization process follows these steps:

  1. Creates virtual tracks for peak signal and window signal

  2. Calculates punctured window coverage (window minus peak)

  3. Normalizes by the punctured coverage while enforcing a minimum based on minimal_quantile

  4. Performs final column-wise normalization
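The arithmetic behind steps 2 to 4 can be sketched in base R on toy data. This is only an illustration and not the package's implementation: the helper `punctured_normalize` and the inputs `peak_cov` and `window_cov` are hypothetical stand-ins for the marginal coverage that the function extracts from `marginal_track`, and the final step assumes that column-wise normalization means scaling each column to sum to 1.

```r
# Hypothetical helper, not part of the package: a toy version of the
# punctured-window arithmetic using plain vectors instead of track data.
punctured_normalize <- function(mat, peak_cov, window_cov, minimal_quantile = 0.1) {
    stopifnot(
        nrow(mat) == length(peak_cov),
        length(peak_cov) == length(window_cov)
    )

    # Step 2: coverage of the window with the peak itself removed
    punctured <- window_cov - peak_cov

    # Step 3: raise small values to the chosen quantile, then divide each
    # row (peak) of the matrix by its punctured coverage
    floor_value <- stats::quantile(punctured, minimal_quantile, names = FALSE)
    punctured <- pmax(punctured, floor_value)
    scaled <- mat / punctured # recycled down columns: row i is divided by punctured[i]

    # Step 4 (assumption): scale every column so that it sums to 1
    sweep(scaled, 2, colSums(scaled), "/")
}

# Toy input: 4 peaks x 3 cell types
set.seed(1)
toy_mat <- matrix(
    rpois(12, lambda = 20),
    nrow = 4,
    dimnames = list(paste0("peak", 1:4), c("typeA", "typeB", "typeC"))
)
toy_peak_cov <- rowSums(toy_mat)                     # marginal signal inside each peak
toy_window_cov <- toy_peak_cov + c(50, 5, 200, 120)  # marginal signal in the surrounding window

toy_norm <- punctured_normalize(toy_mat, toy_peak_cov, toy_window_cov)
dim(toy_norm)
```

In this sketch the second peak has a very small punctured coverage (5), so it is raised to the 0.1 quantile before dividing. Without that floor, its normalized values would be inflated relative to the other peaks.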

See also

normalize_const, normalize_to_prob for subsequent normalization steps

Examples

if (FALSE) { # \dontrun{
# Basic usage
norm_mat <- normalize_regional(peaks_df, raw_mat, "marginal_track")

# With custom window size and quantile
norm_mat <- normalize_regional(
    peaks_df,
    raw_mat,
    "marginal_track",
    window_size = 1e4,
    minimal_quantile = 0.05
)
} # }