Tracks

Functions for creating, importing, modifying, and managing genomic tracks, including dense, sparse, 2D, and indexed track types, as well as track attributes and variables.

pymisha.gtrack_ls

gtrack_ls(*patterns, ignore_case=False, **attr_filters)

Return a list of track names in the Genomic Database.

Returns track names that match all supplied patterns. Name patterns are applied as regex searches against track names. Attribute patterns are matched against the corresponding track attribute values. Multiple patterns are applied conjunctively (all must match).

PARAMETER DESCRIPTION
*patterns

Regex patterns to filter track names. Each pattern is applied sequentially; only tracks matching all patterns are returned.

TYPE: str DEFAULT: ()

ignore_case

If True, pattern matching is case-insensitive.

TYPE: bool DEFAULT: False

**attr_filters

Keyword arguments of the form attribute_name=pattern where underscores in the keyword are converted to dots for the attribute lookup (e.g., created_by="sparse" matches attribute created.by).

TYPE: str DEFAULT: {}

RETURNS DESCRIPTION
list of str or None

Sorted list of matching track names, or None if no tracks match.

RAISES DESCRIPTION
ValueError

If a regex pattern is invalid.

See Also

gtrack_exists : Test whether a single track exists. gtrack_info : Get metadata for a track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_ls()
['array_track', 'dense_track', 'rects_track', 'sparse_track', 'subdir.dense_track2']
>>> pm.gtrack_ls("dense")
['dense_track', 'subdir.dense_track2']
>>> pm.gtrack_ls(created_by="create_sparse")
['sparse_track']
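
The matching rules described for gtrack_ls can be sketched in plain Python without pymisha. The helper name filter_tracks and the in-memory tracks mapping are hypothetical, and treating a missing attribute as a non-match is an assumption rather than documented behavior.

```python
import re

def filter_tracks(tracks, *patterns, ignore_case=False, **attr_filters):
    """Hypothetical stand-in for the matching rules of gtrack_ls.

    `tracks` maps a track name to a dict of its attribute values.
    """
    flags = re.IGNORECASE if ignore_case else 0
    matched = []
    for name, attrs in tracks.items():
        # Every name pattern must match somewhere in the track name.
        if not all(re.search(p, name, flags) for p in patterns):
            continue
        ok = True
        for key, pattern in attr_filters.items():
            # Keyword underscores become dots: created_by -> "created.by".
            value = attrs.get(key.replace("_", "."))
            # Assumption: a track lacking the attribute does not match.
            if value is None or not re.search(pattern, value, flags):
                ok = False
                break
        if ok:
            matched.append(name)
    # Sorted list, or None when nothing matches.
    return sorted(matched) or None
```

Because the filters are conjunctive, each extra pattern can only narrow the result.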

pymisha.gtrack_info

gtrack_info(track)

Return metadata about a track.

Returns a dictionary containing track properties such as type, dimensions, bin size, total size in bytes, and any user-defined attributes. The fields vary depending on the track type (Dense, Sparse, Rectangles, Points).

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

RETURNS DESCRIPTION
dict

Dictionary of track properties. Common keys include "type" ("dense", "sparse", "rectangles", "points"), "bin_size" (for dense tracks), "total_size", and "attributes" (dict of user-set attributes, if any).

RAISES DESCRIPTION
ValueError

If the track does not exist.

See Also

gtrack_exists : Test whether a track exists. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_info("dense_track")
{'type': 'dense', 'dimensions': 1, ...}
>>> pm.gtrack_info("sparse_track")
{'type': 'sparse', 'dimensions': 1, ...}

pymisha.gtrack_exists

gtrack_exists(track)

Test for track existence in the Genomic Database.

PARAMETER DESCRIPTION
track

Track name to check.

TYPE: str

RETURNS DESCRIPTION
bool

True if the track exists, False otherwise.

RAISES DESCRIPTION
ValueError

If track is None.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_exists("dense_track")
True
>>> pm.gtrack_exists("nonexistent_track")
False
See Also

gtrack_ls : List available tracks. gtrack_info : Get metadata for a track. gtrack_rm : Delete a track.

pymisha.gtrack_dataset

gtrack_dataset(track)

Return the database root path that contains a track.

When multiple databases are connected, this identifies which database a track belongs to by returning the filesystem path of that database root.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

RETURNS DESCRIPTION
str

Absolute filesystem path of the database root containing the track.

RAISES DESCRIPTION
ValueError

If track is None or the track does not exist.

See Also

gtrack_info : Get full metadata for a track. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_dataset("dense_track")
'.../trackdb/test'

pymisha.gtrack_create

gtrack_create(track, description, expr, iterator=None, band=None)

Create a track from a track expression.

Creates a new track whose values are determined by evaluating expr over the entire genome. The type of the new track (Dense, Sparse, or Rectangles) is determined by the iterator policy. The description is stored as a track attribute.

PARAMETER DESCRIPTION
track

Name for the new track. Must start with a letter and contain only alphanumeric characters, underscores, and dots.

TYPE: str

description

Human-readable description stored as a track attribute.

TYPE: str

expr

Numeric track expression to evaluate.

TYPE: str

iterator

Fixed-bin iterator bin size. If None, the iterator is determined implicitly from the track expression.

TYPE: int or None DEFAULT: None

band

Diagonal band (d1, d2) for 2D track creation. When provided, only contacts where d1 <= (x - y) < d2 are stored.

TYPE: tuple or None DEFAULT: None

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, expr is None, or band is not None.
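
The naming rule for new tracks (a leading letter, then alphanumeric characters, underscores, and dots) can be checked before calling a creation function. The sketch below is not part of pymisha; the helper name is hypothetical and restricting the rule to ASCII letters is an assumption.

```python
import re

# Assumed pattern: a leading ASCII letter, then letters, digits, "_" or ".".
_TRACK_NAME = re.compile(r"[A-Za-z][A-Za-z0-9_.]*")

def is_valid_track_name(name):
    """Return True if `name` satisfies the documented naming rule."""
    return isinstance(name, str) and _TRACK_NAME.fullmatch(name) is not None
```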

See Also

gtrack_create_sparse : Create a Sparse track from intervals/values. gtrack_create_dense : Create a Dense track from intervals/values. gtrack_2d_create : Create a 2D track. gtrack_smooth : Create a smoothed track. gtrack_modify : Modify an existing Dense track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_create("mixed", "Test", "dense_track * 2", iterator=70)
>>> pm.gtrack_info("mixed")
>>> pm.gtrack_rm("mixed", force=True)

pymisha.gtrack_create_dense

gtrack_create_dense(track, description, intervals, values, binsize, defval=np.nan)

Create a Dense (fixed-bin) track from intervals and values.

Creates a new Dense track whose genome is tiled into fixed-size bins. Each bin stores a single numeric value. Bins not covered by any of the supplied intervals are filled with defval. The description is stored as a track attribute.
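
As a rough illustration of the tiling described above, the following pure-Python sketch fills the bins of a single chromosome. It is not the pymisha implementation: the helper name fill_bins is hypothetical, and the rule that a bin takes the value of the last interval overlapping it is an assumption made for the sketch.

```python
import math

def fill_bins(chrom_len, intervals, values, binsize, defval=math.nan):
    """Tile [0, chrom_len) into bins of `binsize` and fill covered bins.

    `intervals` is a list of (start, end) pairs on one chromosome.
    """
    if binsize <= 0:
        raise ValueError("binsize must be positive")
    if len(intervals) != len(values):
        raise ValueError("values length must match intervals")
    n_bins = math.ceil(chrom_len / binsize)
    bins = [defval] * n_bins              # uncovered bins keep defval
    for (start, end), value in zip(intervals, values):
        first = start // binsize
        last = min(n_bins, math.ceil(end / binsize))
        for i in range(first, last):
            bins[i] = value               # assumed: last interval wins
    return bins
```

With a 200 bp chromosome, one interval (0, 100), and binsize=50, the first two bins receive the interval's value and the remaining two keep defval.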

PARAMETER DESCRIPTION
track

Name for the new track. Must start with a letter and contain only alphanumeric characters, underscores, and dots.

TYPE: str

description

Human-readable description stored as a track attribute.

TYPE: str

intervals

One-dimensional intervals with columns chrom, start, end.

TYPE: DataFrame

values

Numeric values, one per interval. Length must match the number of rows in intervals.

TYPE: array-like of float

binsize

Bin size in base pairs. Must be a positive integer.

TYPE: int

defval

Default value for bins not covered by any interval.

TYPE: float DEFAULT: numpy.nan

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, binsize is not positive, values length does not match intervals, or no intervals map to known chromosomes.

See Also

gtrack_create_sparse : Create a Sparse track. gtrack_create : Create a track from a track expression. gtrack_modify : Modify values of an existing Dense track. gtrack_import : Import a track from a file. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> intervs = pd.DataFrame({"chrom": ["1"], "start": [0], "end": [100]})
>>> pm.gtrack_create_dense("test_dn", "Test", intervs, [5.0], 50)
>>> pm.gtrack_rm("test_dn", force=True)

pymisha.gtrack_create_sparse

gtrack_create_sparse(track, description, intervals, values)

Create a Sparse track from intervals and values.

Creates a new Sparse track where each interval carries an associated numeric value. Intervals must be non-overlapping within each chromosome. Chromosome names are normalized and filtered to those present in the current genome database. The description is stored as a track attribute.
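
Since overlapping intervals raise ValueError, it can help to check the input first. The sketch below is one possible pre-check in plain Python; the helper name is hypothetical and intervals are treated as half-open [start, end), which is an assumption here.

```python
from collections import defaultdict

def find_overlaps(intervals):
    """Return neighbouring pairs of overlapping (chrom, start, end) tuples.

    The result is empty exactly when no two intervals on the same
    chromosome overlap (half-open [start, end) convention assumed).
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in intervals:
        by_chrom[chrom].append((start, end))
    overlaps = []
    for chrom, spans in by_chrom.items():
        spans.sort()
        # After sorting, any overlap shows up between neighbours.
        for (s1, e1), (s2, e2) in zip(spans, spans[1:]):
            if s2 < e1:
                overlaps.append(((chrom, s1, e1), (chrom, s2, e2)))
    return overlaps
```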

PARAMETER DESCRIPTION
track

Name for the new track. Must start with a letter and contain only alphanumeric characters, underscores, and dots.

TYPE: str

description

Human-readable description stored as a track attribute.

TYPE: str

intervals

One-dimensional intervals with columns chrom, start, end.

TYPE: DataFrame

values

Numeric values, one per interval. Length must match the number of rows in intervals.

TYPE: array-like of float

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, intervals overlap, values length does not match intervals, or no intervals map to known chromosomes.

See Also

gtrack_create_dense : Create a Dense (fixed-bin) track. gtrack_create : Create a track from a track expression. gtrack_import : Import a track from a file. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> intervs = pd.DataFrame({"chrom": ["1"], "start": [0], "end": [100]})
>>> pm.gtrack_create_sparse("test_sp", "Test", intervs, [1.0])
>>> pm.gtrack_rm("test_sp", force=True)

pymisha.gtrack_create_empty_indexed

gtrack_create_empty_indexed(track)

Create empty indexed files for an existing track directory.

Writes an empty track.idx and track.dat pair in the track directory. Useful when the track has no data yet but indexed format is required by the database.

PARAMETER DESCRIPTION
track

Name of an existing track whose directory should receive the indexed files.

TYPE: str

RETURNS DESCRIPTION
None
See Also

gtrack_convert_to_indexed : Convert per-chromosome files to indexed. gdb_convert_to_indexed : Convert an entire database.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_create_empty_indexed("my_track")

pymisha.gtrack_import

gtrack_import(track, description, file, binsize=None, defval=np.nan, attrs=None)

Create a track from a WIG, BigWig, BedGraph, BED, or tab-delimited file.

Parses the input file and creates either a Sparse or Dense track depending on binsize. File format is detected from the extension. Compressed files (.gz, .zip) are supported for all formats except BigWig. Tab-delimited files must have a header with columns chrom, start, end, and exactly one value column.
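
Extension-based detection along these lines could look roughly as follows. This is a sketch under assumptions, not the library's code: the helper name detect_format is hypothetical, the extension table follows the list given for the file parameter, and case-insensitive matching is assumed.

```python
from pathlib import Path

_FORMATS = {".wig": "wig", ".bedgraph": "bedgraph", ".bed": "bed",
            ".bw": "bigwig", ".bigwig": "bigwig"}

def detect_format(file):
    """Return (format, compressed) for an input path (hypothetical helper)."""
    suffixes = [s.lower() for s in Path(file).suffixes]
    compressed = bool(suffixes) and suffixes[-1] in (".gz", ".zip")
    if compressed:
        suffixes = suffixes[:-1]          # look at the extension underneath
    # Any unrecognised extension is treated as tab-delimited.
    fmt = _FORMATS.get(suffixes[-1] if suffixes else "", "tab")
    if fmt == "bigwig" and compressed:
        raise ValueError("compressed BigWig files are not supported")
    return fmt, compressed
```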

PARAMETER DESCRIPTION
track

Name for the new track.

TYPE: str

description

Human-readable description stored as a track attribute.

TYPE: str

file

Path to the input file. Supported extensions: .wig, .bedgraph, .bed, .bw / .bigwig, or tab-delimited (any other extension). May include .gz or .zip suffix.

TYPE: str

binsize

Bin size for a Dense track. If None or 0, a Sparse track is created. If positive, a Dense track with the given bin size is created.

TYPE: int or None DEFAULT: None

defval

Default value for Dense track bins not covered by any interval. Ignored when creating Sparse tracks.

TYPE: float DEFAULT: numpy.nan

attrs

Additional attributes to set on the track after import, as a dict mapping attribute names to string values.

TYPE: dict or None DEFAULT: None

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, file is None, or the file contains no valid intervals.

See Also

gtrack_import_set : Batch-import multiple files into tracks. gtrack_create_sparse : Create a Sparse track programmatically. gtrack_create_dense : Create a Dense track programmatically. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_import("wig_track", "From WIG", "data.wig", binsize=10)

pymisha.gtrack_import_mappedseq

gtrack_import_mappedseq(track, description, file, pileup=0, binsize=-1, cols_order=(9, 11, 13, 14), remove_dups=True)

Import mapped sequences from SAM/tab-delimited text into a track.

Reads aligned sequence data from a SAM file or a tab-delimited text file and creates either a Sparse (per-read) or Dense (pileup) track. Duplicate reads at the same position and strand can optionally be removed.

PARAMETER DESCRIPTION
track

Name for the new track.

TYPE: str

description

Human-readable description stored as a track attribute.

TYPE: str

file

Path to a SAM or tab-delimited text file.

TYPE: str

pileup

If 0, create a Sparse track with one interval per mapped read. If positive, create a Dense pileup track where each bin stores the number of reads covering it. Reads are extended to this length from their start position.

TYPE: int DEFAULT: 0

binsize

Bin size for Dense (pileup) tracks. Required when pileup > 0. Must be -1 when pileup is 0.

TYPE: int DEFAULT: -1

cols_order

Column indices (1-based) for sequence, chromosome, coordinate, and strand in a tab-delimited file. Set to None for SAM format.

TYPE: tuple of int or None DEFAULT: (9, 11, 13, 14)

remove_dups

If True, remove duplicate reads at the same position and strand.

TYPE: bool DEFAULT: True

RETURNS DESCRIPTION
dict

Dictionary with keys "total" (dict with "total", "total.mapped", "total.unmapped", "total.dups") and "chrom" (pandas.DataFrame with per-chromosome mapping stats).

RAISES DESCRIPTION
ValueError

If the track already exists, file is None, column indices are invalid, or pileup/binsize combination is inconsistent.
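
The pileup/binsize consistency rule can be summarized in a few lines. The function below is a hypothetical helper, not part of pymisha; rejecting a negative pileup is an assumption, since only 0 and positive values are documented.

```python
def track_kind(pileup=0, binsize=-1):
    """Return "sparse" or "dense" for a pileup/binsize pair, or raise."""
    if pileup < 0:
        # Assumption: negative pileup is rejected.
        raise ValueError("pileup must be 0 or positive")
    if pileup == 0:
        if binsize != -1:
            raise ValueError("binsize must be -1 when pileup is 0")
        return "sparse"                   # one interval per mapped read
    if binsize <= 0:
        raise ValueError("binsize is required when pileup > 0")
    return "dense"                        # per-bin read coverage
```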

See Also

gtrack_import : Import from WIG/BedGraph/BED/BigWig files. gtrack_create_sparse : Create a Sparse track from intervals. gtrack_create_dense : Create a Dense track from intervals. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_import_mappedseq("reads", "Test", "reads.sam")

pymisha.gtrack_import_set

gtrack_import_set(description, path, binsize, track_prefix=None, defval=np.nan)

Create one or more tracks from multiple WIG/BedGraph/BigWig/tab files.

Similar to gtrack_import but operates on multiple files at once. Files can be specified by a local glob pattern or an FTP URL with wildcards. Each file produces one track named {track_prefix}{filestem}. Existing tracks are skipped. The function continues importing even if individual files fail.

PARAMETER DESCRIPTION
description

Human-readable description stored as a track attribute on every imported track.

TYPE: str

path

Local file glob pattern (e.g., "/data/*.wig") or FTP URL (e.g., "ftp://host/path/*.wig.gz").

TYPE: str

binsize

Bin size for Dense tracks. If 0, Sparse tracks are created.

TYPE: int

track_prefix

Prefix prepended to each track name derived from the filename stem. If None, no prefix is used.

TYPE: str or None DEFAULT: None

defval

Default value for Dense track bins not covered by any interval.

TYPE: float DEFAULT: numpy.nan

RETURNS DESCRIPTION
dict

Dictionary with keys "files_imported" (list of successfully imported filenames) and/or "files_failed" (list of filenames that failed to import).

RAISES DESCRIPTION
ValueError

If description, path, or binsize is None, or no files match the pattern.

See Also

gtrack_import : Import a single file into a track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_import_set("Batch", "/data/*.wig", binsize=100, track_prefix="wigs.")

pymisha.gtrack_rm

gtrack_rm(track, force=False, db=None)

Remove a track from disk.

Permanently deletes the track directory and all associated files (per-chromosome data, attributes, variables). Empty parent directories are cleaned up automatically.

PARAMETER DESCRIPTION
track

Name of the track to remove.

TYPE: str

force

If True, suppress errors when the track does not exist and allow deletion without confirmation. If False, raises ValueError when the track is missing.

TYPE: bool DEFAULT: False

db

Explicit database root path. If None, the track is located in the currently initialized databases.

TYPE: str or None DEFAULT: None

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If force is False, either because the track does not exist or as a safety guard against accidental deletion.

See Also

gtrack_ls : List available tracks. gtrack_exists : Test whether a track exists. gtrack_mv : Rename or move a track. gtrack_copy : Copy a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_rm("my_track", force=True)

pymisha.gtrack_mv

gtrack_mv(src, dest)

Rename or move a track within the same database.

Renames a track or moves it to a different namespace (directory) within its source database. The track cannot be moved across databases; use gtrack_copy followed by gtrack_rm for that.

PARAMETER DESCRIPTION
src

Current track name.

TYPE: str

dest

New track name.

TYPE: str

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If source and destination are identical, the source track does not exist, or the destination track already exists.

See Also

gtrack_copy : Copy a track (possibly across databases). gtrack_rm : Delete a track. gtrack_exists : Test whether a track exists. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_mv("old_name", "new_name")

pymisha.gtrack_copy

gtrack_copy(src, dest)

Create a copy of an existing track.

Copies a track's on-disk directory to the current writable database root. The source track may reside in a different database when multiple databases are connected.

PARAMETER DESCRIPTION
src

Name of the source track.

TYPE: str

dest

Name for the new copy.

TYPE: str

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If source and destination are identical, the source track does not exist, or the destination track already exists.

See Also

gtrack_mv : Rename / move a track within the same database. gtrack_rm : Delete a track. gtrack_exists : Test whether a track exists. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_copy("dense_track", "dense_track_copy")
>>> pm.gtrack_exists("dense_track_copy")
True
>>> pm.gtrack_rm("dense_track_copy", force=True)

pymisha.gtrack_modify

gtrack_modify(track, expr, intervals=None)

Modify a Dense track's values in-place by evaluating an expression.

Overwrites the values of an existing Dense track with the result of evaluating expr. The iterator policy is automatically set to the track's bin size. Only Dense (fixed-bin) tracks are supported.

PARAMETER DESCRIPTION
track

Name of the dense track to modify.

TYPE: str

expr

Track expression to evaluate (may reference the track itself).

TYPE: str

intervals

Genomic scope for modification. If None, the entire genome (ALLGENOME) is used.

TYPE: DataFrame or None DEFAULT: None

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track does not exist, is not a Dense track, or expr is None.

See Also

gtrack_create : Create a new track from a track expression. gtrack_smooth : Create a smoothed copy of a track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_modify("dense_track", "dense_track * 2")
>>> pm.gtrack_modify("dense_track", "dense_track / 2")

pymisha.gtrack_smooth

gtrack_smooth(track, description, expr, winsize, weight_thr=0, smooth_nans=False, alg='LINEAR_RAMP', iterator=None)

Create a new Dense track with smoothed values from a track expression.

Each output bin at coordinate C is computed by smoothing the non-NaN values of expr within a window of size winsize (in coordinate units) around C. The smoothing algorithm and handling of NaN / edge-of-chromosome gaps are controlled by the remaining parameters.

PARAMETER DESCRIPTION
track

Name of the new track to create.

TYPE: str

description

Human-readable description stored as a track attribute.

TYPE: str

expr

Track expression whose values are smoothed.

TYPE: str

winsize

Smoothing window size in coordinate units. Defines the total region considered on both sides of the central point.

TYPE: float

weight_thr

Weight sum threshold below which the smoothed value is NaN instead of a partial-window estimate.

TYPE: float DEFAULT: 0

smooth_nans

If False, output NaN whenever the central window value is NaN, regardless of weight_thr. If True, NaN center values are filled from surrounding non-NaN values.

TYPE: bool DEFAULT: False

alg

Smoothing algorithm. "LINEAR_RAMP" uses a weighted average with linearly decreasing weights. "MEAN" uses a simple arithmetic average.

TYPE: str DEFAULT: 'LINEAR_RAMP'

iterator

Fixed-bin iterator bin size for the new track. If None, the bin size is inferred from the track expression.

TYPE: int or None DEFAULT: None

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, expr is None, winsize is not positive, or alg is not one of the supported algorithms.
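
To make the interplay of winsize, weight_thr, smooth_nans, and alg concrete, here is a small pure-Python sketch that smooths a list of per-bin values. It is only an approximation under stated assumptions: the helper name smooth is hypothetical, the window is taken to span winsize / binsize / 2 bins on each side, and the exact ramp weights used by pymisha are not given in this reference, so the weights below are illustrative.

```python
import math

def smooth(values, binsize, winsize, weight_thr=0, smooth_nans=False,
           alg="LINEAR_RAMP"):
    """Smooth per-bin `values`; NaN marks missing data."""
    if winsize <= 0:
        raise ValueError("winsize must be positive")
    if alg not in ("LINEAR_RAMP", "MEAN"):
        raise ValueError(f"unsupported algorithm: {alg}")
    half = int(winsize / binsize / 2)     # bins on each side (assumed)
    out = []
    for center, center_val in enumerate(values):
        if math.isnan(center_val) and not smooth_nans:
            out.append(math.nan)          # NaN centre stays NaN
            continue
        total = weight_sum = 0.0
        # The window is clipped at the chromosome edges.
        lo, hi = max(0, center - half), min(len(values), center + half + 1)
        for i in range(lo, hi):
            if math.isnan(values[i]):
                continue                  # only non-NaN values contribute
            # Ramp: weight falls linearly with distance; MEAN: equal weights.
            w = half + 1 - abs(i - center) if alg == "LINEAR_RAMP" else 1.0
            total += w * values[i]
            weight_sum += w
        ok = weight_sum > 0 and weight_sum >= weight_thr
        out.append(total / weight_sum if ok else math.nan)
    return out
```

A larger weight_thr turns partial-window estimates, such as those near chromosome ends or inside NaN gaps, into NaN instead of reporting a value based on few bins.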

See Also

gtrack_create : Create a track from a track expression. gtrack_modify : Modify an existing Dense track in-place. gtrack_create_sparse : Create a Sparse track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_smooth("smoothed", "Test", "dense_track", 500)
>>> pm.gtrack_rm("smoothed", force=True)

pymisha.gtrack_convert_to_indexed

gtrack_convert_to_indexed(track, remove_old=False)

Convert a per-chromosome track to indexed format.

Reads the per-chromosome binary files and writes a unified track.idx / track.dat pair. Optionally removes the original per-chromosome files after conversion.

PARAMETER DESCRIPTION
track

Name of the track to convert.

TYPE: str

remove_old

If True, remove the original per-chromosome files after successful conversion.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
None
See Also

gtrack_create_empty_indexed : Create empty indexed files. gdb_convert_to_indexed : Convert an entire database.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_convert_to_indexed("my_track", remove_old=True)

pymisha.gtrack_2d_convert_to_indexed

gtrack_2d_convert_to_indexed(track, remove_old=True, force=False)

Convert a 2D track to indexed format (track.dat + track.idx).

Consolidates per-chromosome-pair files into a single indexed format, reducing file descriptor usage from O(N^2) to O(1).

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

remove_old

If True, remove old per-pair files after conversion.

TYPE: bool DEFAULT: True

force

If True, re-convert even if already in indexed format.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track does not exist, is not a 2D track, or conversion fails.

See Also

gtrack_convert_to_indexed : Convert a 1D track to indexed format. gdb_convert_to_indexed : Convert an entire database.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_2d_convert_to_indexed("my_2d_track")

pymisha.gtrack_2d_create

gtrack_2d_create(track, description, intervals, values)

Create a 2D track from intervals and values.

PARAMETER DESCRIPTION
track

Track name (dot-separated namespace).

TYPE: str

description

Track description.

TYPE: str

intervals

2D intervals with columns: chrom1, start1, end1, chrom2, start2, end2.

TYPE: DataFrame

values

Numeric values, one per interval.

TYPE: array-like

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, values length does not match intervals, no valid intervals remain after normalization, or overlapping intervals are detected within the same chromosome pair.

See Also

gtrack_2d_import : Create a 2D track from a file. gtrack_2d_import_contacts : Import HiC contact data as a 2D track. gtrack_create_sparse : Create a 1D Sparse track. gtrack_rm : Delete a track.

Notes

Automatically detects POINTS vs RECTS format based on interval sizes. All unit-size intervals (end-start==1) produce a POINTS track. Overlapping intervals within the same chromosome pair raise an error.
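
The detection rule can be expressed as a one-line test. The helper below is hypothetical and assumes that "unit-size" means a size of 1 on both axes of every interval.

```python
def detect_2d_format(intervals):
    """Return "POINTS" if every interval is 1x1, otherwise "RECTS".

    Each interval is (chrom1, start1, end1, chrom2, start2, end2).
    """
    unit = all(e1 - s1 == 1 and e2 - s2 == 1
               for _, s1, e1, _, s2, e2 in intervals)
    return "POINTS" if unit else "RECTS"
```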

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> ivs = pd.DataFrame({
...     "chrom1": ["1"], "start1": [0], "end1": [100],
...     "chrom2": ["1"], "start2": [200], "end2": [300],
... })
>>> pm.gtrack_2d_create("test_2d", "Test", ivs, [1.0])
>>> pm.gtrack_rm("test_2d", force=True)

pymisha.gtrack_2d_import

gtrack_2d_import(track, description, file)

Import a 2D track from one or more tab-delimited files.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

description

Track description.

TYPE: str

file

Path(s) to tab-delimited file(s) with a header line: chrom1, start1, end1, chrom2, start2, end2, followed by a value column. When multiple files are given, all are read and concatenated before building the quad-tree.

TYPE: str or list of str

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, any file is not found, the file list is empty, or a file has fewer than 7 columns.

See Also

gtrack_2d_create : Create a 2D track from a DataFrame. gtrack_2d_import_contacts : Import HiC contacts as a 2D track. gtrack_rm : Delete a track.

Notes

The value column is the 7th column (0-indexed: column 6). Automatically detects POINTS vs RECTS format.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_2d_import("test_2d", "Test", "contacts.tsv")
>>> # pm.gtrack_2d_import("test_2d", "Test", ["a.tsv", "b.tsv"])

pymisha.gtrack_2d_import_contacts

gtrack_2d_import_contacts(track, description, contacts, fends=None, allow_duplicates=True)

Create a 2D Points track from inter-genomic contacts.

PARAMETER DESCRIPTION
track

Track name (dot-separated namespace).

TYPE: str

description

Track description.

TYPE: str

contacts

Path(s) to contact files. If fends is None, the files must be in "intervals-value" tab-separated format (columns: chrom1, start1, end1, chrom2, start2, end2, and a value column). Otherwise they must be in "fends-value" format (columns: fend1, fend2, count).

TYPE: str or list of str

fends

Path to a fragment-ends file with columns: fend, chr, coord.

TYPE: str or None DEFAULT: None

allow_duplicates

If True, duplicate contacts (same midpoint pair) are summed. If False, duplicates raise ValueError.

TYPE: bool DEFAULT: True

Notes
  • Intervals are converted to midpoints: X = (start1+end1)//2, Y = (start2+end2)//2.
  • Contacts are canonically ordered: if chrom2 < chrom1 (or same chrom and coord2 < coord1) the two sides are swapped.
  • Cis contacts (same chromosome) are mirrored: both (X,Y) and (Y,X) are stored unless X == Y.
  • Trans contacts (different chromosomes) are written in both directions: a chrA-chrB file and a chrB-chrA file (with swapped coordinates) are created so that queries work regardless of chromosome pair order.
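
The steps in these notes can be sketched in plain Python as follows. This is an illustration only: the helper name canonical_contacts is hypothetical, chromosomes are compared as plain strings (an assumption; the library may order them differently), and the result is an in-memory dict rather than on-disk files.

```python
def canonical_contacts(rows, allow_duplicates=True):
    """Turn (chrom1, start1, end1, chrom2, start2, end2, value) rows into
    a dict mapping (chrom1, x, chrom2, y) to the summed value."""
    summed = {}
    for c1, s1, e1, c2, s2, e2, value in rows:
        x, y = (s1 + e1) // 2, (s2 + e2) // 2   # interval midpoints
        # Canonical order; chromosomes compared as strings (assumed).
        if (c2, y) < (c1, x):
            c1, x, c2, y = c2, y, c1, x
        key = (c1, x, c2, y)
        if key in summed and not allow_duplicates:
            raise ValueError(f"duplicate contact: {key}")
        summed[key] = summed.get(key, 0) + value
    stored = {}
    for (c1, x, c2, y), value in summed.items():
        stored[(c1, x, c2, y)] = value
        # Mirror every contact except one that lies on the diagonal.
        if (c1, x) != (c2, y):
            stored[(c2, y, c1, x)] = value
    return stored
```

Summing happens on the canonical key, so the same contact listed in either orientation is counted once before it is mirrored.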
RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track already exists, no contact files are provided, or duplicates are found when allow_duplicates is False.

See Also

gtrack_2d_create : Create a 2D track from a DataFrame. gtrack_2d_import : Import a 2D track from a tab-delimited file. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_2d_import_contacts("hic", "HiC", ["contacts.tsv"])

pymisha.gtrack_create_pwm_energy

gtrack_create_pwm_energy(track, description, pssmset, pssmid, prior, iterator)

Create a track from a PSSM energy function.

Creates a new Dense track with values of a PSSM energy function (log-sum-exp scoring). PSSM parameters are read from {pssmset}.key and {pssmset}.data files in GROOT/pssms/. Internally creates a temporary PWM virtual track, extracts values at the given iterator resolution, and writes them to a new Dense track.

PARAMETER DESCRIPTION
track

Name for the new track.

TYPE: str

description

Human-readable description stored as a track attribute.

TYPE: str

pssmset

Name of PSSM set. Files {pssmset}.key and {pssmset}.data must exist in GROOT/pssms/.

TYPE: str

pssmid

PSSM id within the set.

TYPE: int

prior

Dirichlet prior for the PSSM.

TYPE: float

iterator

Fixed-bin iterator bin size for the new track. Must be a positive integer.

TYPE: int

RAISES DESCRIPTION
ValueError

If the track already exists, any required argument is None, iterator is not positive, or the PSSM set/id is not found.

FileNotFoundError

If the PSSM key or data file does not exist.

RETURNS DESCRIPTION
None
See Also

gtrack_create : Create a track from a general track expression. gtrack_create_dense : Create a Dense track from intervals/values. gtrack_smooth : Create a smoothed track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_create_pwm_energy(
...     "pwm_track", "Test", "pssm", 3, 0.01, iterator=100
... )
>>> pm.gtrack_rm("pwm_track", force=True)

pymisha.gtrack_attr_get

gtrack_attr_get(track, attr)

Get a single track attribute value.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

attr

Attribute name.

TYPE: str

RETURNS DESCRIPTION
str

Attribute value, or empty string if attribute doesn't exist.

RAISES DESCRIPTION
ValueError

If track does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_attr_get("sparse_track", "created.by")
'...'
See Also

gtrack_attr_set : Set a track attribute. gtrack_attr_export : Export attributes for multiple tracks. gtrack_attr_import : Batch-import attributes from a table.

pymisha.gtrack_attr_set

gtrack_attr_set(track, attr, value)

Set a track attribute value.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

attr

Attribute name.

TYPE: str

value

Attribute value. Set to empty string "" to remove the attribute.

TYPE: str

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If track does not exist or the attribute is read-only.
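
The set/get semantics (an empty string removes an attribute, and a missing attribute reads back as an empty string) can be modeled with an ordinary dict. The two helpers below are hypothetical stand-ins that ignore read-only attributes and on-disk storage.

```python
def attr_set(attrs, name, value):
    """Set `name` in the dict `attrs`; an empty string removes it."""
    if value == "":
        attrs.pop(name, None)
    else:
        attrs[name] = str(value)

def attr_get(attrs, name):
    """Return the stored value, or "" when the attribute is absent."""
    return attrs.get(name, "")
```

One consequence is that a reader cannot distinguish an attribute that was never set from one that was removed.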

See Also

gtrack_attr_get : Read a single track attribute. gtrack_attr_export : Export attributes for multiple tracks. gtrack_attr_import : Batch-import attributes from a table.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_attr_set("sparse_track", "test_attr", "test_value")
>>> pm.gtrack_attr_get("sparse_track", "test_attr")
'test_value'
>>> pm.gtrack_attr_set("sparse_track", "test_attr", "")

pymisha.gtrack_attr_export

gtrack_attr_export(tracks=None, attrs=None)

Export track attributes as a DataFrame.

PARAMETER DESCRIPTION
tracks

List of track names. If None, all tracks.

TYPE: list of str DEFAULT: None

attrs

List of attribute names to include. If None, all attributes.

TYPE: list of str DEFAULT: None

RETURNS DESCRIPTION
DataFrame

DataFrame with tracks as rows and attributes as columns.

RAISES DESCRIPTION
ValueError

If any specified track does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_attr_export()
>>> pm.gtrack_attr_export(tracks=["sparse_track", "dense_track"])
>>> pm.gtrack_attr_export(attrs=["created.by"])
See Also

gtrack_attr_import : Batch-import attributes from a DataFrame. gtrack_attr_get : Read a single attribute. gtrack_attr_set : Set a single attribute.

pymisha.gtrack_attr_import

gtrack_attr_import(table, remove_others=False)

Bulk import track attributes from a DataFrame.

PARAMETER DESCRIPTION
table

DataFrame with track names as index and attribute names as columns. Values are converted to strings. Empty string values are skipped (attribute not set for that track).

TYPE: DataFrame

remove_others

If True, remove all non-readonly attributes not present in the table for tracks listed in the table.

TYPE: bool DEFAULT: False

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If table is empty, any track in the index does not exist, or any attribute is read-only.

See Also

gtrack_attr_export : Export attributes to a DataFrame. gtrack_attr_get : Read a single attribute. gtrack_attr_set : Set a single attribute.

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> tbl = pd.DataFrame({"description": ["test"]}, index=["dense_track"])
>>> pm.gtrack_attr_import(tbl)

pymisha.gtrack_var_ls

gtrack_var_ls(track, pattern='')

List track variables.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

pattern

Regex pattern to filter variable names. Default "" matches all.

TYPE: str DEFAULT: ''

RETURNS DESCRIPTION
list of str

Sorted list of variable names matching the pattern.

RAISES DESCRIPTION
ValueError

If track does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_var_ls("dense_track")
[]
See Also

gtrack_var_get : Read a variable's value. gtrack_var_set : Store a variable. gtrack_var_rm : Delete a variable.

pymisha.gtrack_var_get

gtrack_var_get(track, var)

Get the value of a track variable.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

var

Variable name.

TYPE: str

RETURNS DESCRIPTION
object

The stored Python object.

RAISES DESCRIPTION
ValueError

If the track or variable does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_var_get("dense_track", "my_var")
See Also

gtrack_var_set : Store a variable. gtrack_var_ls : List variables for a track. gtrack_var_rm : Delete a variable.

pymisha.gtrack_var_set

gtrack_var_set(track, var, value)

Set the value of a track variable.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

var

Variable name.

TYPE: str

value

Value to store. Can be any pickle-able Python object.

TYPE: object

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track does not exist.

See Also

gtrack_var_get : Read a variable's value. gtrack_var_ls : List variables for a track. gtrack_var_rm : Delete a variable.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_var_set("dense_track", "my_var", [1, 2, 3])
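
Because the value must be pickle-able, it can help to check a value before storing it. The sketch below uses only the standard pickle module; is_picklable is a hypothetical helper and says nothing about how pymisha stores the variable on disk.

```python
import pickle


def is_picklable(value):
    """Return True if pickle can serialize the value, False otherwise."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Lambdas, local functions, open file handles and similar objects
        # fail with one of these exception types.
        return False
    return True


ok = is_picklable({"bins": [1, 2, 3], "label": "peaks"})
not_ok = is_picklable(lambda x: x + 1)
```

Plain containers of numbers and strings pass the check; a lambda does not, so it would need to be replaced by data or by a module-level function before being stored.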

pymisha.gtrack_var_rm

gtrack_var_rm(track, var)

Remove a track variable.

PARAMETER DESCRIPTION
track

Track name.

TYPE: str

var

Variable name to remove.

TYPE: str

RETURNS DESCRIPTION
None
RAISES DESCRIPTION
ValueError

If the track does not exist.

See Also

gtrack_var_set : Store a variable. gtrack_var_get : Read a variable's value. gtrack_var_ls : List variables for a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_var_rm("dense_track", "my_var")

Track Export

Functions to export tracks to standard genomic file formats.

pymisha.gtrack_export_bedgraph

gtrack_export_bedgraph(track: str, file: str, intervals: DataFrame | None = None, iterator: int | None = None, name: str | None = None) -> None

Export a track or track expression to bedGraph format.

Evaluates a track expression over the specified genomic intervals and writes the result in standard bedGraph format (4-column, tab-separated: chrom, start, end, value). NaN values are omitted from the output.

If the output file path ends in .gz, the output is gzip-compressed.

PARAMETER DESCRIPTION
track

Track name or track expression (e.g. "dense_track" or "dense_track * 2").

TYPE: str

file

Output file path. If it ends in .gz, output is gzip-compressed.

TYPE: str

intervals

Genomic intervals to export. If None (default), the entire genome is used.

TYPE: DataFrame or None DEFAULT: None

iterator

Iterator bin size. If None (default), the iterator is determined automatically from the track expression.

TYPE: int or None DEFAULT: None

name

Track name for the bedGraph header line. If None (default), uses the track parameter value.

TYPE: str or None DEFAULT: None

RETURNS DESCRIPTION
None

Called for its side effect of writing a file.

RAISES DESCRIPTION
ValueError

If the track does not exist or is a 2D track.

FileNotFoundError

If the output directory does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_export_bedgraph("dense_track", "/tmp/dense.bedgraph")
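
The output rules above (four tab-separated columns, NaN rows omitted, gzip when the path ends in .gz) can be sketched with the standard library. This is not pymisha's writer: write_bedgraph and read_bedgraph are hypothetical helpers, and the header line follows the common UCSC form, which is an assumption about what the export emits.

```python
import gzip
import math


def write_bedgraph(rows, path, name=None):
    """Write (chrom, start, end, value) rows as bedGraph, skipping NaN values."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "wt") as fh:
        if name is not None:
            # Assumed header form; the exact header pymisha writes may differ.
            fh.write(f'track type=bedGraph name="{name}"\n')
        for chrom, start, end, value in rows:
            if math.isnan(value):
                continue  # NaN values are omitted from the output
            fh.write(f"{chrom}\t{start}\t{end}\t{value}\n")


def read_bedgraph(path):
    """Read a (possibly gzipped) bedGraph file back into a list of tuples."""
    opener = gzip.open if path.endswith(".gz") else open
    records = []
    with opener(path, "rt") as fh:
        for line in fh:
            if not line.strip() or line.startswith(("track", "browser", "#")):
                continue  # skip blank lines and header/comment lines
            chrom, start, end, value = line.rstrip("\n").split("\t")
            records.append((chrom, int(start), int(end), float(value)))
    return records
```

Reading an exported file back with a helper like read_bedgraph is a quick way to confirm that the intervals and values look as expected before passing the file to other tools.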

pymisha.gtrack_export_bigwig

gtrack_export_bigwig(track: str, file: str, intervals: DataFrame | None = None, iterator: int | None = None) -> None

Export a track or track expression to BigWig format.

Creates a temporary bedGraph file via gtrack_export_bedgraph and then converts it to BigWig using the UCSC bedGraphToBigWig utility.

PARAMETER DESCRIPTION
track

Track name or track expression.

TYPE: str

file

Output file path (typically ending in .bw or .bigwig).

TYPE: str

intervals

Genomic intervals to export. If None (default), the entire genome is used.

TYPE: DataFrame or None DEFAULT: None

iterator

Iterator bin size. If None (default), the iterator is determined automatically from the track expression.

TYPE: int or None DEFAULT: None

RETURNS DESCRIPTION
None

Called for its side effect of writing a file.

RAISES DESCRIPTION
RuntimeError

If bedGraphToBigWig is not found on PATH or conversion fails.

ValueError

If the track is a 2D track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_export_bigwig("dense_track", "/tmp/dense.bw")
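
Since the export depends on an external program, a script can check for it up front instead of waiting for the RuntimeError. The sketch below uses the standard shutil.which lookup; have_bedgraph_to_bigwig is a hypothetical helper, and the check only mirrors the documented requirement that bedGraphToBigWig be on PATH.

```python
import shutil


def have_bedgraph_to_bigwig(tool="bedGraphToBigWig"):
    """Return True if the converter executable can be found on PATH."""
    return shutil.which(tool) is not None


if have_bedgraph_to_bigwig():
    message = "bedGraphToBigWig found; BigWig export should be possible."
else:
    message = "bedGraphToBigWig not found on PATH; install the UCSC tools first."
```

A missing tool is only one of the documented failure modes; the conversion itself can still fail, so callers may also want to catch RuntimeError around the export call.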