Parity Notes

PyMisha covers 130 of 144 R misha exports (90%).

Not implemented

The following R misha features are not implemented in PyMisha and are not planned:

  • Track Arrays: gtrack.array.get_colnames, gtrack.array.extract, gtrack.array.set_colnames, gvtrack.array.slice. Track arrays are a rarely used R misha feature for storing multi-column data per genomic bin. No known production workflows depend on them.

  • Legacy Conversion: gtrack.convert (for migrating old 2D track formats to the current quad-tree format). Only relevant for databases created with very early misha versions.

  • COMPUTED 2D Tracks: R misha supports a special COMPUTED track type for on-the-fly Hi-C normalization (via internal PotentialComputer2D / TechnicalComputer2D classes). These tracks are computed dynamically during extraction rather than stored on disk. However, R misha provides no public API to create COMPUTED tracks — they can only be generated by internal C++ code. The shaman package, the primary Hi-C analysis tool in the Tanay lab, uses plain 2D tracks rather than COMPUTED tracks. Given the lack of a creation API and no known consumer, this feature will not be implemented.

  • Cluster Submission: gcluster.run is an R-specific wrapper for submitting jobs to an SGE/PBS cluster. It has no direct equivalent in Python genomics workflows, where users typically rely on workflow managers (Snakemake, Nextflow, etc.) or on their own job-submission tooling.
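
For users who previously relied on gcluster.run to fan out independent jobs, the following is a minimal sketch of the same pattern using only the Python standard library. It is not a PyMisha API: `run_per_chunk` and `summarize_chromosome` are hypothetical names introduced here for illustration, and the per-chromosome function is a placeholder for whatever query a real job would run. It uses a local thread pool rather than a cluster scheduler, so it only illustrates the split-and-collect shape of the workflow.

```python
from concurrent.futures import ThreadPoolExecutor


def run_per_chunk(func, chunks, max_workers=4):
    """Apply func to each chunk concurrently and return {chunk: result}.

    Hypothetical helper, not part of PyMisha. A local stand-in for the
    split-and-collect pattern that gcluster.run provides in R.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so zip pairs
        # each chunk with its own result.
        return dict(zip(chunks, pool.map(func, chunks)))


def summarize_chromosome(chrom):
    # Placeholder workload (assumption): a real job would run a
    # query restricted to this chromosome and return its result.
    return len(chrom)


results = run_per_chunk(summarize_chromosome, ["chr1", "chr2", "chrX"])
```

For CPU-bound work or real cluster execution, the thread pool would likely be swapped for a process pool or for a workflow manager such as Snakemake or Nextflow; the calling code stays roughly the same shape.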