Calculate the number of bits per position in a Position-Specific Scoring Matrix (PSSM).
bits_per_pos.Rd
This function takes a PSSM as input and calculates the information content, in bits, of each position. The PSSM should be a data frame or matrix with one row per position and columns named A, C, G, and T. The function first normalizes the PSSM by dividing each element by the sum of its row, so that every row is a probability distribution over the four nucleotides. It then computes the information content of each position as bits = log2(4) + sum(p * log2(p)), where p is the probability of each nucleotide at that position; this is the maximum entropy of 2 bits minus the Shannon entropy of the position. Finally, it sets any negative values to zero and returns a numeric vector with one value per position.
Examples
pssm <- data.frame(
A = c(0.2, 0.3, 0.1, 0.4),
C = c(0.1, 0.2, 0.3, 0.4),
G = c(0.4, 0.3, 0.2, 0.1),
T = c(0.3, 0.2, 0.4, 0.1)
)
bits_per_pos(pssm)
#> [1] 0.14121539 0.02684396 0.14121539 0.25558682
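The calculation described above can be sketched in a few lines of base R. This is a hypothetical re-implementation for illustration only, not the package's source: the name `bits_per_pos_sketch` and its `pseudocount` argument are assumptions. Applied as written in the description (no pseudocount), the formula gives about 0.1536 bits for the first position rather than the 0.1412 printed above; the printed values are reproduced when 0.01 is added to every entry before normalization, which suggests the real function applies a small prior of that size.

```r
# Hypothetical sketch of the described calculation; not the package's code.
# `pseudocount` is an assumed argument, added to every entry before normalizing.
bits_per_pos_sketch <- function(pssm, pseudocount = 0) {
  m <- as.matrix(pssm[, c("A", "C", "G", "T")]) + pseudocount
  p <- m / rowSums(m)                      # normalize so each row sums to 1
  terms <- ifelse(p > 0, p * log2(p), 0)   # treat 0 * log2(0) as 0
  bits <- log2(4) + rowSums(terms)         # 2 bits minus Shannon entropy
  pmax(unname(bits), 0)                    # clamp negative values to zero
}

pssm <- data.frame(
  A = c(0.2, 0.3, 0.1, 0.4),
  C = c(0.1, 0.2, 0.3, 0.4),
  G = c(0.4, 0.3, 0.2, 0.1),
  T = c(0.3, 0.2, 0.4, 0.1)
)

# Formula exactly as described, without a pseudocount
round(bits_per_pos_sketch(pssm), 4)
#> [1] 0.1536 0.0290 0.1536 0.2781

# With an assumed pseudocount of 0.01, matching the example output above
round(bits_per_pos_sketch(pssm, pseudocount = 0.01), 4)
#> [1] 0.1412 0.0268 0.1412 0.2556
```

Positions 1 and 3 score the same because their rows contain the same probabilities in a different order, and the information content does not depend on which nucleotide carries which probability.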