Convert 2D interval set to indexed format — gintervals.2d.convert_to_indexed

Converts a 2D interval set stored as per-chromosome-pair files to the indexed format (intervals2d.dat + intervals2d.idx), which reduces file descriptor usage.

gintervals.2d.convert_to_indexed(
  set.name = NULL,
  remove.old = FALSE,
  force = FALSE
)

Arguments

set.name: name of 2D interval set to convert
remove.old: if TRUE, removes old per-chromosome files after successful conversion
force: if TRUE, re-converts even if already in indexed format

Value

invisible NULL

Details

The indexed format stores all chromosome pairs in a single intervals2d.dat file, accompanied by an intervals2d.idx index file. This sharply reduces file descriptor usage, especially for genomes with many chromosomes: N*(N-1)/2 per-pair files become just 2.

Only non-empty pairs are stored in the index, avoiding O(N^2) space overhead.
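
To make the file-count reduction concrete, the following sketch evaluates the N*(N-1)/2 formula quoted above for a few chromosome counts. The helper name per_pair_files and the example counts are illustrative assumptions, not part of the package.

```r
# Illustrative only: per_pair_files() is a hypothetical helper, not a package function.
# It evaluates the N*(N-1)/2 file count quoted in the Details section.
per_pair_files <- function(n_chrom) n_chrom * (n_chrom - 1) / 2

# The indexed format always uses two files: intervals2d.dat and intervals2d.idx.
indexed_files <- 2

n <- c(24, 100, 1000)  # example chromosome/contig counts (assumed values)
file_counts <- data.frame(
  chromosomes = n,
  per_pair    = per_pair_files(n),
  indexed     = indexed_files
)
print(file_counts)
```

For genomes assembled into many contigs, the per-pair count grows quadratically while the indexed count stays fixed at two.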

The conversion process:

1. Scans directory for existing per-pair files
2. Creates temporary intervals2d.dat.tmp and intervals2d.idx.tmp files
3. Concatenates all per-pair files into intervals2d.dat.tmp
4. Builds index with pair offsets and checksums
5. Atomically renames temporary files to final names
6. Optionally removes old per-pair files
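
The steps above can be sketched in plain R as follows. This is a toy illustration under stated assumptions, not the package's implementation: the ".pair" file suffix, the tab-separated index layout, and the byte-sum checksum are all hypothetical, and the real on-disk format is not described here.

```r
# Toy sketch of the write-to-temp-then-rename pattern. File suffix, index
# layout, and checksum are assumptions, not the package's actual format.
convert_to_indexed_sketch <- function(dir, remove.old = FALSE) {
  # Step 1: scan the directory for per-pair files (hypothetical ".pair" suffix).
  pair_files <- sort(list.files(dir, pattern = "\\.pair$", full.names = TRUE))

  # Step 2: temporary outputs, so readers never see a half-written result.
  dat_tmp <- file.path(dir, "intervals2d.dat.tmp")
  idx_tmp <- file.path(dir, "intervals2d.idx.tmp")

  pairs <- character(); offsets <- numeric()
  lengths <- numeric(); checksums <- numeric()
  offset <- 0

  out <- file(dat_tmp, open = "wb")
  for (f in pair_files) {
    size <- file.size(f)
    if (size == 0) next  # only non-empty pairs get an index entry

    # Step 3: append this pair's bytes to the combined data file.
    bytes <- readBin(f, what = "raw", n = size)
    writeBin(bytes, out)

    # Step 4: record where the pair starts, its length, and a simple checksum.
    pairs     <- c(pairs, sub("\\.pair$", "", basename(f)))
    offsets   <- c(offsets, offset)
    lengths   <- c(lengths, size)
    checksums <- c(checksums, sum(as.integer(bytes)) %% 65536)
    offset    <- offset + size
  }
  close(out)

  index <- data.frame(pair = pairs, offset = offsets,
                      length = lengths, checksum = checksums)
  write.table(index, idx_tmp, sep = "\t", quote = FALSE, row.names = FALSE)

  # Step 5: publish both files by renaming them to their final names.
  ok <- file.rename(dat_tmp, file.path(dir, "intervals2d.dat")) &&
        file.rename(idx_tmp, file.path(dir, "intervals2d.idx"))

  # Step 6: remove the old per-pair files only after a successful rename.
  if (ok && remove.old) unlink(pair_files)
  invisible(NULL)
}
```

Writing to .tmp names first means an interrupted run leaves the original per-pair files untouched. Whether file.rename replaces an existing target in a single step depends on the platform and filesystem, so the sketch should not be read as a guarantee of atomicity.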

The indexed format is 100

Examples

if (FALSE) { # \dontrun{
# Convert a 2D interval set
gintervals.2d.convert_to_indexed("my_2d_intervals")

# Convert and remove old files
gintervals.2d.convert_to_indexed("my_2d_intervals", remove.old = TRUE)

# Force re-conversion
gintervals.2d.convert_to_indexed("my_2d_intervals", force = TRUE)
} # }