Tracks¶

Functions for creating, importing, modifying, and managing genomic tracks, including dense, sparse, 2D, and indexed track types, as well as track attributes and variables.

pymisha.gtrack_ls ¶

gtrack_ls(*patterns, ignore_case=False, **attr_filters)

Return a list of track names in the Genomic Database.

Returns track names that match all supplied patterns. Name patterns are applied as regex searches against track names. Attribute patterns are matched against the corresponding track attribute values. Multiple patterns are applied conjunctively (all must match).

PARAMETER	DESCRIPTION
`*patterns`	Regex patterns to filter track names. Each pattern is applied sequentially; only tracks matching all patterns are returned. TYPE: `str` DEFAULT: `()`
`ignore_case`	If True, pattern matching is case-insensitive. TYPE: `bool` DEFAULT: `False`
`**attr_filters`	Keyword arguments of the form `attribute_name=pattern` where underscores in the keyword are converted to dots for the attribute lookup (e.g., `created_by="sparse"` matches attribute `created.by`). TYPE: `str` DEFAULT: `{}`

RETURNS	DESCRIPTION
`list of str or None`	Sorted list of matching track names, or None if no tracks match.

RAISES	DESCRIPTION
`ValueError`	If a regex pattern is invalid.

See Also

gtrack_exists : Test whether a single track exists. gtrack_info : Get metadata for a track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_ls()
['array_track', 'dense_track', 'rects_track', 'sparse_track', 'subdir.dense_track2']
>>> pm.gtrack_ls("dense")
['dense_track', 'subdir.dense_track2']
>>> pm.gtrack_ls(created_by="create_sparse")
['sparse_track']
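
The matching rules above (conjunctive regex search on names, plus attribute filters whose keyword underscores become dots) can be sketched in plain Python. This is an illustration only, not pymisha's implementation: `filter_tracks`, the `tracks` dictionary, and the `create_dense` attribute values are hypothetical stand-ins for the database.

```python
import re


def filter_tracks(tracks, *patterns, ignore_case=False, **attr_filters):
    """Return sorted names matching every name pattern and attribute filter.

    `tracks` is a hypothetical dict: track name -> dict of attribute values.
    """
    flags = re.IGNORECASE if ignore_case else 0
    matched = []
    for name, attrs in tracks.items():
        # Every name pattern must be found somewhere in the track name.
        if not all(re.search(p, name, flags) for p in patterns):
            continue
        ok = True
        for key, pattern in attr_filters.items():
            # Keyword underscores become dots: created_by -> "created.by".
            value = attrs.get(key.replace("_", "."))
            if value is None or not re.search(pattern, value, flags):
                ok = False
                break
        if ok:
            matched.append(name)
    # Mirror the documented return value: None when nothing matches.
    return sorted(matched) or None


# Hypothetical attribute values, for illustration only.
tracks = {
    "dense_track": {"created.by": "create_dense"},
    "sparse_track": {"created.by": "create_sparse"},
    "subdir.dense_track2": {"created.by": "create_dense"},
}
print(filter_tracks(tracks, "dense", "subdir"))
```

Because the patterns are combined with `all(...)`, adding a pattern can only narrow the result, never widen it.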

pymisha.gtrack_info ¶

gtrack_info(track)

Return metadata about a track.

Returns a dictionary containing track properties such as type, dimensions, bin size, total size in bytes, and any user-defined attributes. The fields vary depending on the track type (Dense, Sparse, Rectangles, Points).

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`

RETURNS	DESCRIPTION
`dict`	Dictionary of track properties. Common keys include `"type"` (`"dense"`, `"sparse"`, `"rectangles"`, `"points"`), `"bin_size"` (for dense tracks), `"total_size"`, and `"attributes"` (dict of user-set attributes, if any).

RAISES	DESCRIPTION
`ValueError`	If the track does not exist.

See Also

gtrack_exists : Test whether a track exists. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_info("dense_track")
{'type': 'dense', 'dimensions': 1, ...}
>>> pm.gtrack_info("sparse_track")
{'type': 'sparse', 'dimensions': 1, ...}

pymisha.gtrack_exists ¶

gtrack_exists(track)

Test for track existence in the Genomic Database.

PARAMETER	DESCRIPTION
`track`	Track name to check. TYPE: `str`

RETURNS	DESCRIPTION
`bool`	True if the track exists, False otherwise.

RAISES	DESCRIPTION
`ValueError`	If track is None.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_exists("dense_track")
True
>>> pm.gtrack_exists("nonexistent_track")
False

See Also

gtrack_ls : List available tracks. gtrack_info : Get metadata for a track. gtrack_rm : Delete a track.

pymisha.gtrack_dataset ¶

gtrack_dataset(track)

Return the database root path that contains a track.

When multiple databases are connected, this identifies which database a track belongs to by returning the filesystem path of that database root.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`

RETURNS	DESCRIPTION
`str`	Absolute filesystem path of the database root containing the track.

RAISES	DESCRIPTION
`ValueError`	If track is None or the track does not exist.

See Also

gtrack_info : Get full metadata for a track. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_dataset("dense_track")
'.../trackdb/test'

pymisha.gtrack_create ¶

gtrack_create(track, description, expr, iterator=None, band=None)

Create a track from a track expression.

Creates a new track whose values are determined by evaluating expr over the entire genome. The type of the new track (Dense, Sparse, or Rectangles) is determined by the iterator policy. The description is stored as a track attribute.

PARAMETER	DESCRIPTION
`track`	Name for the new track. Must start with a letter and contain only alphanumeric characters, underscores, and dots. TYPE: `str`
`description`	Human-readable description stored as a track attribute. TYPE: `str`
`expr`	Numeric track expression to evaluate. TYPE: `str`
`iterator`	Fixed-bin iterator bin size. If None, the iterator is determined implicitly from the track expression. TYPE: `int or None` DEFAULT: `None`
`band`	Diagonal band `(d1, d2)` for 2D track creation, intended to keep only contacts where `d1 <= (x - y) < d2`. Per the Raises section below, passing any non-None value raises `ValueError`, so leave this as None. TYPE: `tuple or None` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, expr is None, or band is not None.

See Also

gtrack_create_sparse : Create a Sparse track from intervals/values. gtrack_create_dense : Create a Dense track from intervals/values. gtrack_2d_create : Create a 2D track. gtrack_smooth : Create a smoothed track. gtrack_modify : Modify an existing Dense track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_create("mixed", "Test", "dense_track * 2", iterator=70)
>>> pm.gtrack_info("mixed")
>>> pm.gtrack_rm("mixed", force=True)

pymisha.gtrack_create_dense ¶

gtrack_create_dense(track, description, intervals, values, binsize, defval=np.nan)

Create a Dense (fixed-bin) track from intervals and values.

Creates a new Dense track whose genome is tiled into fixed-size bins. Each bin stores a single numeric value. Bins not covered by any of the supplied intervals are filled with defval. The description is stored as a track attribute.

PARAMETER	DESCRIPTION
`track`	Name for the new track. Must start with a letter and contain only alphanumeric characters, underscores, and dots. TYPE: `str`
`description`	Human-readable description stored as a track attribute. TYPE: `str`
`intervals`	One-dimensional intervals with columns `chrom`, `start`, `end`. TYPE: `DataFrame`
`values`	Numeric values, one per interval. Length must match the number of rows in intervals. TYPE: `array-like of float`
`binsize`	Bin size in base pairs. Must be a positive integer. TYPE: `int`
`defval`	Default value for bins not covered by any interval. TYPE: `float` DEFAULT: `numpy.nan`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, binsize is not positive, values length does not match intervals, or no intervals map to known chromosomes.

See Also

gtrack_create_sparse : Create a Sparse track. gtrack_create : Create a track from a track expression. gtrack_modify : Modify values of an existing Dense track. gtrack_import : Import a track from a file. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> intervs = pd.DataFrame({"chrom": ["1"], "start": [0], "end": [100]})
>>> pm.gtrack_create_dense("test_dn", "Test", intervs, [5.0], 50)
>>> pm.gtrack_rm("test_dn", force=True)
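
To make the tiling concrete, here is a minimal sketch of how one chromosome could be split into fixed bins and filled from intervals. It is not pymisha's code: `fill_dense_bins` is a hypothetical helper, and the rule that a bin takes a value when its start lies inside an interval is an assumption (the library's handling of partially covered bins may differ).

```python
import math


def fill_dense_bins(chrom_len, binsize, intervals, values, defval=math.nan):
    """Tile [0, chrom_len) into fixed bins and assign interval values.

    Assumption: a bin takes an interval's value when the bin start lies
    inside that half-open interval [start, end).
    """
    if binsize <= 0:
        raise ValueError("binsize must be a positive integer")
    if len(intervals) != len(values):
        raise ValueError("values length must match the number of intervals")
    n_bins = math.ceil(chrom_len / binsize)
    bins = [defval] * n_bins  # uncovered bins keep the default value
    for (start, end), value in zip(intervals, values):
        first = math.ceil(start / binsize)           # first bin starting at or after start
        last = min(n_bins, math.ceil(end / binsize))  # one past the last bin starting before end
        for i in range(first, last):
            bins[i] = value
    return bins


# A 200 bp chromosome, 50 bp bins, one interval covering the first 100 bp.
print(fill_dense_bins(200, 50, [(0, 100)], [5.0], defval=0.0))
```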

pymisha.gtrack_create_sparse ¶

gtrack_create_sparse(track, description, intervals, values)

Create a Sparse track from intervals and values.

Creates a new Sparse track where each interval carries an associated numeric value. Intervals must be non-overlapping within each chromosome. Chromosome names are normalized and filtered to those present in the current genome database. The description is stored as a track attribute.

PARAMETER	DESCRIPTION
`track`	Name for the new track. Must start with a letter and contain only alphanumeric characters, underscores, and dots. TYPE: `str`
`description`	Human-readable description stored as a track attribute. TYPE: `str`
`intervals`	One-dimensional intervals with columns `chrom`, `start`, `end`. TYPE: `DataFrame`
`values`	Numeric values, one per interval. Length must match the number of rows in intervals. TYPE: `array-like of float`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, intervals overlap, values length does not match intervals, or no intervals map to known chromosomes.

See Also

gtrack_create_dense : Create a Dense (fixed-bin) track. gtrack_create : Create a track from a track expression. gtrack_import : Import a track from a file. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> intervs = pd.DataFrame({"chrom": ["1"], "start": [0], "end": [100]})
>>> pm.gtrack_create_sparse("test_sp", "Test", intervs, [1.0])
>>> pm.gtrack_rm("test_sp", force=True)
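
Since overlapping intervals are rejected, it can help to check input before calling the function. The sketch below is a hypothetical pre-check, not part of pymisha; it assumes half-open `[start, end)` intervals, so intervals that merely touch do not count as overlapping.

```python
def find_overlaps(intervals):
    """Return (chrom, earlier, later) for overlapping intervals per chromosome.

    `intervals` is a list of (chrom, start, end) tuples, treated as
    half-open [start, end) ranges (an assumption of this sketch).
    """
    by_chrom = {}
    for chrom, start, end in intervals:
        by_chrom.setdefault(chrom, []).append((start, end))

    overlaps = []
    for chrom, ivs in by_chrom.items():
        ivs.sort()
        prev = ivs[0]
        for cur in ivs[1:]:
            if cur[0] < prev[1]:
                overlaps.append((chrom, prev, cur))
            # Keep whichever interval reaches furthest to the right.
            if cur[1] > prev[1]:
                prev = cur
    return overlaps


print(find_overlaps([("1", 0, 100), ("1", 50, 150), ("2", 0, 10)]))
```

An empty result means the intervals satisfy the non-overlap requirement under the half-open assumption.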

pymisha.gtrack_create_empty_indexed ¶

gtrack_create_empty_indexed(track)

Create empty indexed files for an existing track directory.

Writes an empty track.idx and track.dat pair in the track directory. Useful when the track has no data yet but indexed format is required by the database.

PARAMETER	DESCRIPTION
`track`	Name of an existing track whose directory should receive the indexed files. TYPE: `str`

RETURNS	DESCRIPTION
`None`

See Also

gtrack_convert_to_indexed : Convert per-chromosome files to indexed. gdb_convert_to_indexed : Convert an entire database.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_create_empty_indexed("my_track")

pymisha.gtrack_import ¶

gtrack_import(track, description, file, binsize=None, defval=np.nan, attrs=None)

Create a track from a WIG, BigWig, BedGraph, BED, or tab-delimited file.

Parses the input file and creates either a Sparse or Dense track depending on binsize. File format is detected from the extension. Compressed files (.gz, .zip) are supported for all formats except BigWig. Tab-delimited files must have a header with columns chrom, start, end, and exactly one value column.

PARAMETER	DESCRIPTION
`track`	Name for the new track. TYPE: `str`
`description`	Human-readable description stored as a track attribute. TYPE: `str`
`file`	Path to the input file. Supported extensions: `.wig`, `.bedgraph`, `.bed`, `.bw` / `.bigwig`, or tab-delimited (any other extension). May include `.gz` or `.zip` suffix. TYPE: `str`
`binsize`	Bin size for a Dense track. If None or 0, a Sparse track is created. If positive, a Dense track with the given bin size is created. TYPE: `int or None` DEFAULT: `None`
`defval`	Default value for Dense track bins not covered by any interval. Ignored when creating Sparse tracks. TYPE: `float` DEFAULT: `numpy.nan`
`attrs`	Additional attributes to set on the track after import, as a dict mapping attribute names to string values. TYPE: `dict or None` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, file is None, or the file contains no valid intervals.

See Also

gtrack_import_set : Batch-import multiple files into tracks. gtrack_create_sparse : Create a Sparse track programmatically. gtrack_create_dense : Create a Dense track programmatically. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_import("wig_track", "From WIG", "data.wig", binsize=10)

pymisha.gtrack_import_mappedseq ¶

gtrack_import_mappedseq(track, description, file, pileup=0, binsize=-1, cols_order=(9, 11, 13, 14), remove_dups=True)

Import mapped sequences from SAM/tab-delimited text into a track.

Reads aligned sequence data from a SAM file or a tab-delimited text file and creates either a Sparse (per-read) or Dense (pileup) track. Duplicate reads at the same position and strand can optionally be removed.

PARAMETER	DESCRIPTION
`track`	Name for the new track. TYPE: `str`
`description`	Human-readable description stored as a track attribute. TYPE: `str`
`file`	Path to a SAM or tab-delimited text file. TYPE: `str`
`pileup`	If 0, create a Sparse track with one interval per mapped read. If positive, create a Dense pileup track where each bin stores the number of reads covering it. Reads are extended to this length from their start position. TYPE: `int` DEFAULT: `0`
`binsize`	Bin size for Dense (pileup) tracks. Required when pileup > 0. Must be -1 when pileup is 0. TYPE: `int` DEFAULT: `-1`
`cols_order`	Column indices (1-based) for sequence, chromosome, coordinate, and strand in a tab-delimited file. Set to None for SAM format. TYPE: `tuple of int or None` DEFAULT: `(9, 11, 13, 14)`
`remove_dups`	If True, remove duplicate reads at the same position and strand. TYPE: `bool` DEFAULT: `True`

RETURNS	DESCRIPTION
`dict`	Dictionary with keys `"total"` (dict with `"total"`, `"total.mapped"`, `"total.unmapped"`, `"total.dups"`) and `"chrom"` (pandas.DataFrame with per-chromosome mapping stats).

RAISES	DESCRIPTION
`ValueError`	If the track already exists, file is None, column indices are invalid, or pileup/binsize combination is inconsistent.

See Also

gtrack_import : Import from WIG/BedGraph/BED/BigWig files. gtrack_create_sparse : Create a Sparse track from intervals. gtrack_create_dense : Create a Dense track from intervals. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_import_mappedseq("reads", "Test", "reads.sam")
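
The interaction between `pileup`, `binsize`, and `remove_dups` can be sketched as follows. Both helpers are hypothetical and simplified: they consider forward-strand reads on a single chromosome only, and they assume a read covers a bin when the two share at least one base. The library's actual strand handling and counting rules may differ.

```python
def track_kind(pileup, binsize):
    """Validate the documented pileup/binsize combinations."""
    if pileup < 0:
        raise ValueError("pileup must be zero or positive")
    if pileup == 0:
        if binsize != -1:
            raise ValueError("binsize must be -1 when pileup is 0")
        return "sparse"
    if binsize <= 0:
        raise ValueError("binsize must be positive when pileup > 0")
    return "dense"


def pileup_counts(read_starts, pileup, binsize, chrom_len, remove_dups=True):
    """Count reads per bin after extending each read to `pileup` bases."""
    if track_kind(pileup, binsize) != "dense":
        raise ValueError("pileup counts need pileup > 0")
    if remove_dups:
        # Forward strand only, so a duplicate is simply a repeated start.
        read_starts = set(read_starts)
    n_bins = -(-chrom_len // binsize)  # ceiling division
    counts = [0] * n_bins
    for start in read_starts:
        end = min(start + pileup, chrom_len)
        for i in range(start // binsize, -(-end // binsize)):
            counts[i] += 1
    return counts


print(pileup_counts([10, 10, 35], pileup=30, binsize=20, chrom_len=100))
```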

pymisha.gtrack_import_set ¶

gtrack_import_set(description, path, binsize, track_prefix=None, defval=np.nan)

Create one or more tracks from multiple WIG/BedGraph/BigWig/tab files.

Similar to gtrack_import but operates on multiple files at once. Files can be specified by a local glob pattern or an FTP URL with wildcards. Each file produces one track named {track_prefix}{filestem}. Existing tracks are skipped. The function continues importing even if individual files fail.

PARAMETER	DESCRIPTION
`description`	Human-readable description stored as a track attribute on every imported track. TYPE: `str`
`path`	Local file glob pattern (e.g., `"/data/*.wig"`) or FTP URL with wildcards (e.g., `"ftp://host/path/*.wig.gz"`). TYPE: `str`
`binsize`	Bin size for Dense tracks. If 0, Sparse tracks are created. TYPE: `int`
`track_prefix`	Prefix prepended to each track name derived from the filename stem. If None, no prefix is used. TYPE: `str or None` DEFAULT: `None`
`defval`	Default value for Dense track bins not covered by any interval. TYPE: `float` DEFAULT: `numpy.nan`

RETURNS	DESCRIPTION
`dict`	Dictionary with keys `"files_imported"` (list of successfully imported filenames) and/or `"files_failed"` (list of filenames that failed to import).

RAISES	DESCRIPTION
`ValueError`	If description, path, or binsize is None, or no files match the pattern.

See Also

gtrack_import : Import a single file into a track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_import_set("Batch", "/data/*.wig", binsize=100, track_prefix="wigs.")
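
As a rough illustration of the `{track_prefix}{filestem}` naming rule, the sketch below derives a track name from a file path. It assumes the stem is the filename with format and compression extensions removed; the documentation does not spell this out, so `track_name_for` and `KNOWN_SUFFIXES` are hypothetical.

```python
from pathlib import PurePosixPath

# Extensions this sketch strips; an assumption, not a documented list.
KNOWN_SUFFIXES = {".gz", ".zip", ".wig", ".bedgraph", ".bed", ".bw", ".bigwig"}


def track_name_for(path, track_prefix=None):
    """Build '{track_prefix}{filestem}' for one input file."""
    name = PurePosixPath(path).name
    # Peel off compression and format extensions one at a time.
    while PurePosixPath(name).suffix.lower() in KNOWN_SUFFIXES:
        name = PurePosixPath(name).stem
    return f"{track_prefix or ''}{name}"


print(track_name_for("/data/sample1.wig.gz", track_prefix="wigs."))
```

A prefix ending in a dot, such as `"wigs."`, places the imported tracks in a namespace, in the same way that `subdir.dense_track2` lives under `subdir`.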

pymisha.gtrack_rm ¶

gtrack_rm(track, force=False, db=None)

Remove a track from disk.

Permanently deletes the track directory and all associated files (per-chromosome data, attributes, variables). Empty parent directories are cleaned up automatically.

PARAMETER	DESCRIPTION
`track`	Name of the track to remove. TYPE: `str`
`force`	If True, suppress errors when the track does not exist and allow deletion without confirmation. If False, raises `ValueError` when the track is missing. TYPE: `bool` DEFAULT: `False`
`db`	Explicit database root path. If None, the track is located in the currently initialized databases. TYPE: `str or None` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track does not exist (when force is False) or if force is False (safety guard).

See Also

gtrack_ls : List available tracks. gtrack_exists : Test whether a track exists. gtrack_mv : Rename or move a track. gtrack_copy : Copy a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_rm("my_track", force=True)

pymisha.gtrack_mv ¶

gtrack_mv(src, dest)

Rename or move a track within the same database.

Renames a track or moves it to a different namespace (directory) within its source database. The track cannot be moved across databases; use gtrack_copy followed by gtrack_rm for that.

PARAMETER	DESCRIPTION
`src`	Current track name. TYPE: `str`
`dest`	New track name. TYPE: `str`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If source and destination are identical, the source track does not exist, or the destination track already exists.

See Also

gtrack_copy : Copy a track (possibly across databases). gtrack_rm : Delete a track. gtrack_exists : Test whether a track exists. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_mv("old_name", "new_name")

pymisha.gtrack_copy ¶

gtrack_copy(src, dest)

Create a copy of an existing track.

Copies a track's on-disk directory to the current writable database root. The source track may reside in a different database when multiple databases are connected.

PARAMETER	DESCRIPTION
`src`	Name of the source track. TYPE: `str`
`dest`	Name for the new copy. TYPE: `str`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If source and destination are identical, the source track does not exist, or the destination track already exists.

See Also

gtrack_mv : Rename / move a track within the same database. gtrack_rm : Delete a track. gtrack_exists : Test whether a track exists. gtrack_ls : List available tracks.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_copy("dense_track", "dense_track_copy")
>>> pm.gtrack_exists("dense_track_copy")
True
>>> pm.gtrack_rm("dense_track_copy", force=True)

pymisha.gtrack_modify ¶

gtrack_modify(track, expr, intervals=None)

Modify a Dense track's values in-place by evaluating an expression.

Overwrites the values of an existing Dense track with the result of evaluating expr. The iterator policy is automatically set to the track's bin size. Only Dense (fixed-bin) tracks are supported.

PARAMETER	DESCRIPTION
`track`	Name of the dense track to modify. TYPE: `str`
`expr`	Track expression to evaluate (may reference the track itself). TYPE: `str`
`intervals`	Genomic scope for modification. If None, the entire genome (ALLGENOME) is used. TYPE: `DataFrame or None` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track does not exist, is not a Dense track, or expr is None.

See Also

gtrack_create : Create a new track from a track expression. gtrack_smooth : Create a smoothed copy of a track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_modify("dense_track", "dense_track * 2")
>>> pm.gtrack_modify("dense_track", "dense_track / 2")

pymisha.gtrack_smooth ¶

gtrack_smooth(track, description, expr, winsize, weight_thr=0, smooth_nans=False, alg='LINEAR_RAMP', iterator=None)

Create a new Dense track with smoothed values from a track expression.

Each output bin at coordinate C is computed by smoothing the non-NaN values of expr within a window of size winsize (in coordinate units) around C. The smoothing algorithm and handling of NaN / edge-of-chromosome gaps are controlled by the remaining parameters.

PARAMETER	DESCRIPTION
`track`	Name of the new track to create. TYPE: `str`
`description`	Human-readable description stored as a track attribute. TYPE: `str`
`expr`	Track expression whose values are smoothed. TYPE: `str`
`winsize`	Smoothing window size in coordinate units. Defines the total region considered on both sides of the central point. TYPE: `float`
`weight_thr`	Weight sum threshold below which the smoothed value is NaN instead of a partial-window estimate. TYPE: `float` DEFAULT: `0`
`smooth_nans`	If False, output NaN whenever the central window value is NaN, regardless of weight_thr. If True, NaN center values are filled from surrounding non-NaN values. TYPE: `bool` DEFAULT: `False`
`alg`	Smoothing algorithm. `"LINEAR_RAMP"` uses a weighted average with linearly decreasing weights. `"MEAN"` uses a simple arithmetic average. TYPE: `str` DEFAULT: `"LINEAR_RAMP"`
`iterator`	Fixed-bin iterator bin size for the new track. If None, the bin size is inferred from the track expression. TYPE: `int or None` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, expr is None, winsize is not positive, or alg is not one of the supported algorithms.

See Also

gtrack_create : Create a track from a track expression. gtrack_modify : Modify an existing Dense track in-place. gtrack_create_sparse : Create a Sparse track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_smooth("smoothed", "Test", "dense_track", 500)
>>> pm.gtrack_rm("smoothed", force=True)
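
The roles of `alg`, `weight_thr`, and `smooth_nans` can be sketched on a plain list of bin values. This is not the library's algorithm: `smooth_bins` is hypothetical, the window is expressed as a half-width in bins rather than in coordinate units, and the exact ramp weights (`half_window + 1 - distance`) are an assumption.

```python
import math


def smooth_bins(values, half_window, alg="LINEAR_RAMP",
                weight_thr=0.0, smooth_nans=False):
    """Smooth bin values using a window of +/- half_window bins."""
    if alg not in ("LINEAR_RAMP", "MEAN"):
        raise ValueError(f"unsupported algorithm: {alg}")
    out = []
    for c, center in enumerate(values):
        # A NaN center stays NaN unless smooth_nans is requested.
        if math.isnan(center) and not smooth_nans:
            out.append(math.nan)
            continue
        total = 0.0
        weight_sum = 0.0
        lo = max(0, c - half_window)
        hi = min(len(values), c + half_window + 1)
        for i in range(lo, hi):
            v = values[i]
            if math.isnan(v):
                continue  # only non-NaN values contribute
            # MEAN weights every bin equally; the ramp decays with distance.
            w = 1.0 if alg == "MEAN" else half_window + 1 - abs(i - c)
            total += w * v
            weight_sum += w
        if weight_sum == 0 or weight_sum < weight_thr:
            out.append(math.nan)  # too little support in the window
        else:
            out.append(total / weight_sum)
    return out


print(smooth_bins([1.0, 2.0, 3.0], 1, alg="MEAN"))
```

Near chromosome edges the window is truncated, so the weight sum drops; raising `weight_thr` turns those partial-window estimates into NaN.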

pymisha.gtrack_convert_to_indexed ¶

gtrack_convert_to_indexed(track, remove_old=False)

Convert a per-chromosome track to indexed format.

Reads the per-chromosome binary files and writes a unified track.idx / track.dat pair. Optionally removes the original per-chromosome files after conversion.

PARAMETER	DESCRIPTION
`track`	Name of the track to convert. TYPE: `str`
`remove_old`	If True, remove the original per-chromosome files after successful conversion. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`None`

See Also

gtrack_create_empty_indexed : Create empty indexed files. gdb_convert_to_indexed : Convert an entire database.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_convert_to_indexed("my_track", remove_old=True)

pymisha.gtrack_2d_convert_to_indexed ¶

gtrack_2d_convert_to_indexed(track, remove_old=True, force=False)

Convert a 2D track to indexed format (track.dat + track.idx).

Consolidates per-chromosome-pair files into a single indexed format, reducing file descriptor usage from O(N^2) to O(1).

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`remove_old`	If True, remove old per-pair files after conversion. TYPE: `bool` DEFAULT: `True`
`force`	If True, re-convert even if already in indexed format. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track does not exist, is not a 2D track, or conversion fails.

See Also

gtrack_convert_to_indexed : Convert a 1D track to indexed format. gdb_convert_to_indexed : Convert an entire database.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_2d_convert_to_indexed("my_2d_track")

pymisha.gtrack_2d_create ¶

gtrack_2d_create(track, description, intervals, values)

Create a 2D track from intervals and values.

PARAMETER	DESCRIPTION
`track`	Track name (dot-separated namespace). TYPE: `str`
`description`	Track description. TYPE: `str`
`intervals`	2D intervals with columns: chrom1, start1, end1, chrom2, start2, end2. TYPE: `DataFrame`
`values`	Numeric values, one per interval. TYPE: `array-like`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, values length does not match intervals, no valid intervals remain after normalization, or overlapping intervals are detected within the same chromosome pair.

See Also

gtrack_2d_import : Create a 2D track from a file. gtrack_2d_import_contacts : Import HiC contact data as a 2D track. gtrack_create_sparse : Create a 1D Sparse track. gtrack_rm : Delete a track.

Notes

Automatically detects POINTS vs RECTS format based on interval sizes. All unit-size intervals (end-start==1) produce a POINTS track. Overlapping intervals within the same chromosome pair raise an error.

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> ivs = pd.DataFrame({
...     "chrom1": ["1"], "start1": [0], "end1": [100],
...     "chrom2": ["1"], "start2": [200], "end2": [300],
... })
>>> pm.gtrack_2d_create("test_2d", "Test", ivs, [1.0])
>>> pm.gtrack_rm("test_2d", force=True)
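
The format detection and overlap rule from the Notes can be sketched with two small helpers. They are hypothetical, not pymisha internals, and they assume that "unit-size" means `end - start == 1` on both axes and that intervals are half-open.

```python
def detect_2d_format(intervals):
    """Return 'POINTS' when every interval is 1 bp wide on both axes."""
    unit = all(
        e1 - s1 == 1 and e2 - s2 == 1
        for _, s1, e1, _, s2, e2 in intervals
    )
    return "POINTS" if unit else "RECTS"


def rects_overlap(a, b):
    """True if two 2D intervals on the same chromosome pair intersect."""
    ca1, sa1, ea1, ca2, sa2, ea2 = a
    cb1, sb1, eb1, cb2, sb2, eb2 = b
    if (ca1, ca2) != (cb1, cb2):
        return False
    # Half-open ranges must intersect on both axes.
    return sa1 < eb1 and sb1 < ea1 and sa2 < eb2 and sb2 < ea2


rect = ("1", 0, 100, "1", 200, 300)
print(detect_2d_format([rect]))
print(rects_overlap(rect, ("1", 50, 150, "1", 250, 350)))
```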

pymisha.gtrack_2d_import ¶

gtrack_2d_import(track, description, file)

Import a 2D track from one or more tab-delimited files.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`description`	Track description. TYPE: `str`
`file`	Path(s) to tab-delimited file(s) with a header naming the columns chrom1, start1, end1, chrom2, start2, end2, followed by a seventh column holding the value. When multiple files are given, all are read and concatenated before building the quad-tree. TYPE: `str or list of str`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, any file is not found, the file list is empty, or a file has fewer than 7 columns.

See Also

gtrack_2d_create : Create a 2D track from a DataFrame. gtrack_2d_import_contacts : Import HiC contacts as a 2D track. gtrack_rm : Delete a track.

Notes

The value column is the 7th column (0-indexed: column 6). Automatically detects POINTS vs RECTS format.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_2d_import("test_2d", "Test", "contacts.tsv")
>>> # pm.gtrack_2d_import("test_2d", "Test", ["a.tsv", "b.tsv"])

pymisha.gtrack_2d_import_contacts ¶

gtrack_2d_import_contacts(track, description, contacts, fends=None, allow_duplicates=True)

Create a 2D Points track from inter-genomic contacts.

PARAMETER	DESCRIPTION
`track`	Track name (dot-separated namespace). TYPE: `str`
`description`	Track description. TYPE: `str`
`contacts`	Path(s) to contact files. If `fends` is None, the files must be in "intervals-value" tab-separated format (columns: chrom1, start1, end1, chrom2, start2, end2, followed by a value column). Otherwise they must be in "fends-value" format (columns: fend1, fend2, count). TYPE: `str or list of str`
`fends`	Path to a fragment-ends file with columns: fend, chr, coord. TYPE: `str or None` DEFAULT: `None`
`allow_duplicates`	If True, duplicate contacts (same midpoint pair) are summed. If False, duplicates raise `ValueError`. TYPE: `bool` DEFAULT: `True`

Notes

Intervals are converted to midpoints: X = (start1+end1)//2, Y = (start2+end2)//2.
Contacts are canonically ordered: if chrom2 < chrom1 (or same chrom and coord2 < coord1) the two sides are swapped.
Cis contacts (same chromosome) are mirrored: both (X,Y) and (Y,X) are stored unless X == Y.
Trans contacts (different chromosomes) are written in both directions: a chrA-chrB file and a chrB-chrA file (with swapped coordinates) are created so that queries work regardless of chromosome pair order.
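
The four rules above can be traced on a small example. The sketch below is a simplified model, not the library's code: `normalize_contacts` is hypothetical, it returns an in-memory dictionary instead of writing per-chromosome-pair files, and it compares chromosome names as strings, whereas the library may order chromosomes differently.

```python
from collections import defaultdict


def normalize_contacts(rows, allow_duplicates=True):
    """Map (chrom1, x, chrom2, y) -> value following the rules above.

    `rows` holds (chrom1, start1, end1, chrom2, start2, end2, value).
    """
    summed = defaultdict(float)
    for c1, s1, e1, c2, s2, e2, value in rows:
        # Intervals collapse to their midpoints.
        x, y = (s1 + e1) // 2, (s2 + e2) // 2
        # Canonical order: swap sides when the second one sorts first.
        if (c2, y) < (c1, x):
            c1, x, c2, y = c2, y, c1, x
        key = (c1, x, c2, y)
        if key in summed and not allow_duplicates:
            raise ValueError(f"duplicate contact: {key}")
        summed[key] += value  # duplicates are summed

    stored = {}
    for (c1, x, c2, y), value in summed.items():
        stored[(c1, x, c2, y)] = value
        # Mirror everything except a cis contact lying on the diagonal.
        if c1 != c2 or x != y:
            stored[(c2, y, c1, x)] = value
    return stored


rows = [
    ("1", 0, 100, "1", 200, 300, 1.0),
    ("1", 200, 300, "1", 0, 100, 2.0),  # same contact, sides reversed
    ("2", 10, 20, "1", 0, 10, 5.0),     # trans contact
]
for key, value in sorted(normalize_contacts(rows).items()):
    print(key, value)
```

Because of the canonical ordering step, the first two rows land on the same key and are summed when `allow_duplicates` is True.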

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, no contact files are provided, or duplicates are found when allow_duplicates is False.

See Also

gtrack_2d_create : Create a 2D track from a DataFrame. gtrack_2d_import : Import a 2D track from a tab-delimited file. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_2d_import_contacts("hic", "HiC", ["contacts.tsv"])

pymisha.gtrack_create_pwm_energy ¶

gtrack_create_pwm_energy(track, description, pssmset, pssmid, prior, iterator)

Create a track from a PSSM energy function.

Creates a new Dense track with values of a PSSM energy function (log-sum-exp scoring). PSSM parameters are read from {pssmset}.key and {pssmset}.data files in GROOT/pssms/. Internally creates a temporary PWM virtual track, extracts values at the given iterator resolution, and writes them to a new Dense track.

PARAMETER	DESCRIPTION
`track`	Name for the new track. TYPE: `str`
`description`	Human-readable description stored as a track attribute. TYPE: `str`
`pssmset`	Name of PSSM set. Files `{pssmset}.key` and `{pssmset}.data` must exist in `GROOT/pssms/`. TYPE: `str`
`pssmid`	PSSM id within the set. TYPE: `int`
`prior`	Dirichlet prior for the PSSM. TYPE: `float`
`iterator`	Fixed-bin iterator bin size for the new track. Must be a positive integer. TYPE: `int`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track already exists, any required argument is None, iterator is not positive, or the PSSM set/id is not found.
`FileNotFoundError`	If the PSSM key or data file does not exist.

See Also

gtrack_create : Create a track from a general track expression. gtrack_create_dense : Create a Dense track from intervals/values. gtrack_smooth : Create a smoothed track. gtrack_rm : Delete a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_create_pwm_energy(
...     "pwm_track", "Test", "pssm", 3, 0.01, iterator=100
... )
>>> pm.gtrack_rm("pwm_track", force=True)

pymisha.gtrack_attr_get ¶

gtrack_attr_get(track, attr)

Get a single track attribute value.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`attr`	Attribute name. TYPE: `str`

RETURNS	DESCRIPTION
`str`	Attribute value, or empty string if attribute doesn't exist.

RAISES	DESCRIPTION
`ValueError`	If track does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_attr_get("sparse_track", "created.by")
'...'

See Also

gtrack_attr_set : Set a track attribute. gtrack_attr_export : Export attributes for multiple tracks. gtrack_attr_import : Batch-import attributes from a table.

pymisha.gtrack_attr_set ¶

gtrack_attr_set(track, attr, value)

Set a track attribute value.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`attr`	Attribute name. TYPE: `str`
`value`	Attribute value. Set to empty string "" to remove the attribute. TYPE: `str`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If track does not exist or the attribute is read-only.

See Also

gtrack_attr_get : Read a single track attribute. gtrack_attr_export : Export attributes for multiple tracks. gtrack_attr_import : Batch-import attributes from a table.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_attr_set("sparse_track", "test_attr", "test_value")
>>> pm.gtrack_attr_get("sparse_track", "test_attr")
'test_value'
>>> pm.gtrack_attr_set("sparse_track", "test_attr", "")

pymisha.gtrack_attr_export ¶

gtrack_attr_export(tracks=None, attrs=None)

Export track attributes as a DataFrame.

PARAMETER	DESCRIPTION
`tracks`	List of track names. If None, all tracks. TYPE: `list of str` DEFAULT: `None`
`attrs`	List of attribute names to include. If None, all attributes. TYPE: `list of str` DEFAULT: `None`

RETURNS	DESCRIPTION
`DataFrame`	DataFrame with tracks as rows and attributes as columns.

RAISES	DESCRIPTION
`ValueError`	If any specified track does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_attr_export()
>>> pm.gtrack_attr_export(tracks=["sparse_track", "dense_track"])
>>> pm.gtrack_attr_export(attrs=["created.by"])

See Also

gtrack_attr_import : Batch-import attributes from a DataFrame. gtrack_attr_get : Read a single attribute. gtrack_attr_set : Set a single attribute.

pymisha.gtrack_attr_import ¶

gtrack_attr_import(table, remove_others=False)

Bulk import track attributes from a DataFrame.

PARAMETER	DESCRIPTION
`table`	DataFrame with track names as index and attribute names as columns. Values are converted to strings. Empty string values are skipped (attribute not set for that track). TYPE: `DataFrame`
`remove_others`	If True, remove all non-readonly attributes not present in the table for tracks listed in the table. TYPE: `bool` DEFAULT: `False`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If table is empty, any track in the index does not exist, or any attribute is read-only.

See Also

gtrack_attr_export : Export attributes to a DataFrame. gtrack_attr_get : Read a single attribute. gtrack_attr_set : Set a single attribute.

Examples:

>>> import pymisha as pm
>>> import pandas as pd
>>> _ = pm.gdb_init_examples()
>>> tbl = pd.DataFrame({"description": ["test"]}, index=["dense_track"])
>>> pm.gtrack_attr_import(tbl)
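
The effect of empty values and `remove_others` can be modelled with nested dictionaries. This is a sketch under stated assumptions, not the library's logic: `import_attrs` is hypothetical, `store` and `table` stand in for the database and the DataFrame, and the set of read-only attribute names is passed in explicitly rather than taken from pymisha.

```python
def import_attrs(store, table, remove_others=False, readonly=frozenset()):
    """Apply `table` (track -> {attr: value}) to `store` in place."""
    if not table:
        raise ValueError("table is empty")
    # Validate everything first so a failure leaves `store` untouched.
    for track, row in table.items():
        if track not in store:
            raise ValueError(f"track does not exist: {track}")
        for attr in row:
            if attr in readonly:
                raise ValueError(f"attribute is read-only: {attr}")

    for track, row in table.items():
        attrs = store[track]
        if remove_others:
            # Drop writable attributes that the table does not mention.
            for attr in list(attrs):
                if attr not in row and attr not in readonly:
                    del attrs[attr]
        for attr, value in row.items():
            value = str(value)
            if value == "":
                continue  # empty values are skipped, not stored
            attrs[attr] = value
    return store


store = {"dense_track": {"description": "old", "note": "x", "created.by": "someone"}}
table = {"dense_track": {"description": "test", "lab": ""}}
print(import_attrs(store, table, remove_others=True, readonly={"created.by"}))
```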

pymisha.gtrack_var_ls ¶

gtrack_var_ls(track, pattern='')

List track variables.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`pattern`	Regex pattern to filter variable names. Default `""` matches all. TYPE: `str` DEFAULT: `''`

RETURNS	DESCRIPTION
`list of str`	Sorted list of variable names matching the pattern.

RAISES	DESCRIPTION
`ValueError`	If track does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_var_ls("dense_track")
[]
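
To illustrate how a regex `pattern` narrows a list of names, here is a minimal stand-alone sketch. `filter_names` is a hypothetical helper, and it assumes search semantics (the pattern may match anywhere in the name), which this section does not state explicitly for `gtrack_var_ls`.

```python
import re


def filter_names(names, pattern=""):
    """Return the sorted names that the regex pattern matches (hypothetical helper)."""
    regex = re.compile(pattern)
    # An empty pattern matches every string, so the default keeps all names.
    return sorted(name for name in names if regex.search(name))
```

With the names `["b_var", "a_var", "other"]`, the default pattern keeps all three in sorted order, while `"_var$"` keeps only the two names ending in `_var`.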

See Also

gtrack_var_get : Read a variable's value. gtrack_var_set : Store a variable. gtrack_var_rm : Delete a variable.

pymisha.gtrack_var_get ¶

gtrack_var_get(track, var)

Get the value of a track variable.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`var`	Variable name. TYPE: `str`

RETURNS	DESCRIPTION
`object`	The stored Python object.

RAISES	DESCRIPTION
`ValueError`	If the track or variable does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_var_get("dense_track", "my_var")

See Also

gtrack_var_set : Store a variable. gtrack_var_ls : List variables for a track. gtrack_var_rm : Delete a variable.

pymisha.gtrack_var_set ¶

gtrack_var_set(track, var, value)

Set the value of a track variable.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`var`	Variable name. TYPE: `str`
`value`	Value to store. Can be any pickle-able Python object. TYPE: `object`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track does not exist.

See Also

gtrack_var_get : Read a variable's value. gtrack_var_ls : List variables for a track. gtrack_var_rm : Delete a variable.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_var_set("dense_track", "my_var", [1, 2, 3])
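
Because `value` must be pickle-able, it can be useful to check an object before storing it. The sketch below uses only the standard `pickle` module; `is_pickleable` is a hypothetical helper and is not part of pymisha.

```python
import pickle


def is_pickleable(value):
    """Return True if `value` can be serialized with pickle (hypothetical helper)."""
    try:
        pickle.dumps(value)
    except (pickle.PicklingError, AttributeError, TypeError):
        # Lambdas, open file handles and similar objects fail here.
        return False
    return True
```

Lists, dicts, and numbers pass this check; a lambda does not.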

pymisha.gtrack_var_rm ¶

gtrack_var_rm(track, var)

Remove a track variable.

PARAMETER	DESCRIPTION
`track`	Track name. TYPE: `str`
`var`	Variable name to remove. TYPE: `str`

RETURNS	DESCRIPTION
`None`

RAISES	DESCRIPTION
`ValueError`	If the track does not exist.

See Also

gtrack_var_set : Store a variable. gtrack_var_get : Read a variable's value. gtrack_var_ls : List variables for a track.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> # pm.gtrack_var_rm("dense_track", "my_var")

Track Export¶

Functions to export tracks to standard genomic file formats.

pymisha.gtrack_export_bedgraph ¶

gtrack_export_bedgraph(track: str, file: str, intervals: DataFrame | None = None, iterator: int | None = None, name: str | None = None) -> None

Export a track or track expression to bedGraph format.

Evaluates a track expression over the specified genomic intervals and writes the result in standard bedGraph format (4-column, tab-separated: chrom, start, end, value). NaN values are omitted from the output.

If the output file path ends in .gz, the output is gzip-compressed.
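
The output layout described above can be illustrated with a small stand-alone writer. This is a sketch of the file format only, not of how pymisha evaluates or writes tracks: `write_bedgraph` is a hypothetical helper, it takes already-computed `(chrom, start, end, value)` rows, and it writes no header line.

```python
import gzip
import math


def write_bedgraph(rows, path):
    """Write (chrom, start, end, value) rows as bedGraph lines (hypothetical helper).

    Returns the number of lines written.
    """
    # Compress when the path ends in .gz, write plain text otherwise.
    opener = gzip.open if path.endswith(".gz") else open
    written = 0
    with opener(path, "wt") as fh:
        for chrom, start, end, value in rows:
            if math.isnan(value):
                continue  # NaN values are left out of the output
            fh.write(f"{chrom}\t{start}\t{end}\t{value}\n")
            written += 1
    return written
```

Given three rows of which one has a NaN value, the file contains two tab-separated lines, and the same call with a `.gz` path yields the same content gzip-compressed.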

PARAMETER	DESCRIPTION
`track`	Track name or track expression (e.g. `"dense_track"` or `"dense_track * 2"`). TYPE: `str`
`file`	Output file path. If it ends in `.gz`, output is gzip-compressed. TYPE: `str`
`intervals`	Genomic intervals to export. If `None` (default), the entire genome is used. TYPE: `DataFrame or None` DEFAULT: `None`
`iterator`	Iterator bin size. If `None` (default), the iterator is determined automatically from the track expression. TYPE: `int or None` DEFAULT: `None`
`name`	Track name for the bedGraph header line. If `None` (default), uses the `track` parameter value. TYPE: `str or None` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`	Called for its side effect of writing a file.

RAISES	DESCRIPTION
`ValueError`	If the track does not exist or is a 2D track.
`FileNotFoundError`	If the output directory does not exist.

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_export_bedgraph("dense_track", "/tmp/dense.bedgraph")

pymisha.gtrack_export_bigwig ¶

gtrack_export_bigwig(track: str, file: str, intervals: DataFrame | None = None, iterator: int | None = None) -> None

Export a track or track expression to BigWig format.

Creates a temporary bedGraph file via gtrack_export_bedgraph and then converts it to BigWig using the UCSC bedGraphToBigWig utility.

PARAMETER	DESCRIPTION
`track`	Track name or track expression. TYPE: `str`
`file`	Output file path (typically ending in `.bw` or `.bigwig`). TYPE: `str`
`intervals`	Genomic intervals to export. If `None` (default), the entire genome is used. TYPE: `DataFrame or None` DEFAULT: `None`
`iterator`	Iterator bin size. If `None` (default), the iterator is determined automatically from the track expression. TYPE: `int or None` DEFAULT: `None`

RETURNS	DESCRIPTION
`None`	Called for its side effect of writing a file.

RAISES	DESCRIPTION
`RuntimeError`	If `bedGraphToBigWig` is not found on PATH or conversion fails.
`ValueError`	If the track is a 2D track.
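
Since the export fails with `RuntimeError` when `bedGraphToBigWig` is missing, a caller may want to check for the tool up front. The sketch below uses the standard library's `shutil.which`; `find_converter` is a hypothetical helper, and how pymisha itself locates the executable is not stated here.

```python
import shutil


def find_converter(name="bedGraphToBigWig"):
    """Return the full path of the converter, or None if it is not on PATH."""
    return shutil.which(name)
```

A `None` result means the export would be expected to fail, so the caller can fall back to `gtrack_export_bedgraph` or report a clearer installation message.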

Examples:

>>> import pymisha as pm
>>> _ = pm.gdb_init_examples()
>>> pm.gtrack_export_bigwig("dense_track", "/tmp/dense.bw")