Changelog

Added gtrack.create_dense function - creates a dense track from intervals and values.
Fixed compilation error in old versions of macOS (clock_gettime is missing).
Fixed a bug in gtrack.import_bigwig: intern argument was not passed to system calls.

Fixed memory alignment issues in clang-UBSAN.

Bug fix in “coverage” virtual track function: incorrect results in some cases when query intervals were from different chromosomes.

Added “kmer.count” and “kmer.frac” virtual track functions that calculate the number of k-mers and the fraction of k-mers in the current iterator interval.
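
As a rough illustration of what these two virtual track functions compute, here is a minimal language-neutral sketch in Python (misha itself is an R package). It assumes that overlapping occurrences are counted and that the fraction is the count divided by the number of possible k-mer start positions in the interval. Both are assumptions made for illustration, and the function names are hypothetical, not misha API.

```python
def kmer_count(seq: str, kmer: str) -> int:
    """Count occurrences of kmer in seq, including overlapping ones."""
    seq, kmer = seq.upper(), kmer.upper()
    k = len(kmer)
    # Slide a window of length k over the sequence and compare.
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == kmer)


def kmer_frac(seq: str, kmer: str) -> float:
    """Occurrences divided by the number of start positions (assumed definition)."""
    positions = len(seq) - len(kmer) + 1
    return kmer_count(seq, kmer) / positions if positions > 0 else 0.0
```

For example, kmer_count("ACGCGT", "CG") finds two occurrences among five possible start positions.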

Added “coverage” virtual track function that calculates the fraction that iterator intervals are covered by given source intervals.
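
The quantity described here can be pictured with a small Python sketch (not misha code). It assumes half-open coordinates [start, end), a single chromosome, and that overlapping source intervals are counted once; the function name is hypothetical.

```python
def coverage_fraction(start, end, sources):
    """Fraction of [start, end) covered by the union of source intervals."""
    if end <= start:
        return 0.0
    covered = 0
    cursor = start  # everything left of cursor has already been accounted for
    for s, e in sorted(sources):
        # Clip the source interval to the uncounted part of the iterator interval.
        s, e = max(s, cursor), min(e, end)
        if e > s:
            covered += e - s
            cursor = e
    return covered / (end - start)
```

With an iterator interval of [0, 100) and sources (10, 30), (20, 50) and (90, 120), the covered bases are 10 to 50 and 90 to 100, so the sketch yields 0.5.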

Added requirement for C++14 standard.

Fixed compilation issue on Ubuntu devel.

Added PWM functions: “pwm”, “pwm.max” and “pwm.max.pos” are new virtual track functions.
Added grevcomp function (reverse complement of a DNA sequence).
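
For readers unfamiliar with the operation, a reverse complement can be sketched in a few lines of Python. This illustrates the concept only and is not how grevcomp is implemented; the function name is hypothetical, and characters other than A, C, G, T (for example N) are left unchanged by assumption.

```python
# Translation table mapping each base to its complement, preserving case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(_COMPLEMENT)[::-1]
```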

Better error message in gvtrack.create.

Added gdb.create_genome function.
Removed non-API calls to R: R_curErrorBuf, SET_TYPEOF

Fixed an additional noRemap issue by adding the Rf_ prefix in the C++ code.
Removed non-API calls to R in the C++ code.

Fixed a bug in gtrack.import from bigwig.

Fixed compilation issues in M1 Mac.

Fixed “mat string is not a string literal” warnings.

Fixed gcc-UBSAN memory misalignment issues.

ALLGENOME is now only soft deprecated in order to support old misha scripts.
Fixed compilation issues with gcc-UBSAN and LTO.

Added gtrack.create_dirs function.
Fixed a bug in gcluster.run.
Updated documentation of gintervals.neighbors.

First CRAN release
Breaking change: Moved global variables into a separate environment called .misha. Variables such as ALLGENOME can now be accessed as .misha$ALLGENOME. This change is not backwards compatible; please update your code accordingly.
Breaking change: Repaired the column names of the data frame returned by gintervals.neighbors (same as gintervals.neighbors1 from misha.ext). This means that instead of having two sets of columns named ‘chrom’, ‘start’ and ‘end’, the resulting data frame has ‘chrom1’, ‘start1’ and ‘end1’.
Use roxygen2 for documentation
Fixed compilation errors on MAC.
Fixed many compilation warnings.
gwget now uses curl in order to work on systems that do not have ftp installed.
User manual was converted to markdown format.
Added a new Genomes vignette that demonstrates how to create a new genome database.
Fixed wrong bin assignment in BinFinder.h (this code was never accessible from the R API).

Bug fix (bug first appears in 4.0.8): virtual tracks based on “global.percentile”, “global.percentile.min” or “global.percentile.max” might occasionally return unexpected results and/or cause crashes due to faulty memory management.

Fixed compilation errors on OSX.

Bug fix: “unprotect_ptr: pointer not found” error in virtual tracks based on intervals.

Fixed a minor resource leak.
Redirect all messages and progress to stderr instead of stdout.

Fixed a resource leak that might result in “protection stack overflow” error.

Crash fix in gdb.create / gintervals.import.genes.

Increase the maximal number of tracks allowed in a track expression to 10,000.

Bug fix: “child process ended unexpectedly” errors, crashes and hang ups whilst multitasking when running out of memory

Fixed installation issues on some platforms

Switched from custom random seed control (options(grnd.seed=…)) to R standard (set.seed)

Bug fix in gsample: results differ on Linux vs. OSX even when the same random seed is used on both platforms

OSX support
Bug fix in all functions using 2D intervals set iterator: in multitasking mode some of the chromosome pairs might be skipped. In non-multitasking mode the scope might be considered as empty. This behavior is random and repeated calls might suddenly return correct results.
Bug fix in multitasking: occasional hang ups when memory usage of the child processes exceeds the limit gmax.mem.usage
Bug fix in multitasking: all functions creating new files and reporting progress might create corrupted files (gtrack.create, ….)
Bug fix in all functions using intervals.set.out parameter: small intervals set instead of big one might be created and vice versa. Also an error message “result size exceeded the maximum allowed” might be mistakenly generated
Bug fix in gintervals.load applied to 1D big intervals: “Error in if (progress && percentage < 100 && progress.percentage != percentage)”
Bug fix in all functions returning 1D or 2D intervals: in rare random cases NULL or invalid intervals set is returned
Bug fix in gcompute_strands_autocorr: internal buffer overflow and possible memory corruption
gdb.reload: run-time improvements

Bug fix in gintervals.liftover and gtrack.liftover: some intervals might fail to be translated
Bug fix in gintervals.liftover: “object ‘f’ not found” error if chain intervals are used in “chain” parameter

Ubuntu support (multitasking mode still not thoroughly tested)
Bug fix: no progress report in multitasking mode
Bug fix in gdb.create: temporary directory is created under the current GROOT instead of the new one

Bug fix: occasional defunct processes and/or hanging in multitasking mode

Run-time optimizations

Avoid call to gdb.reload() (slow on large DB) in various functions that create or remove tracks or intervals sets
Bug fix in gintervals.neighbors: with 2D intervals the number of the returned neighbors might be less than “maxneighbors” parameter
Bug fix in gintervals.neighbors: with 2D intervals NULL might be returned instead of NA if na.if.notfound=T

Improved control over total maximal memory consumption in multitasking mode via gmax.mem.usage option.

Run time optimizations when using several virtual tracks based on the same array track, differing only by slice

Allow usage of sparse / arrays tracks in place of intervals
Allow usage of big intervals sets in gintervals.diff, gintervals.intersect, gintervals.mapply, gintervals.union
Run time optimizations when 1D big intervals set is used for scope
Run time optimizations in various gintervals.* functions when big intervals sets are used
Bug fix: functions might get stuck or crash when array track / sparse track / 1D big intervals set iterator is used along with 1D big intervals scope
Bug fix: functions might get stuck or crash when 2D big intervals set iterator is used along with 2D big intervals scope
Bug fix in gintervals.neighbors and gintervals.intersect: result might be poorly sorted when big interval sets are used
Bug fix in gintervals.load: on an empty set or chrom returns NULL for small intervals sets and an empty data frame for a big intervals set
Bug fix: “100%…” or “100%” is sometimes printed as the only progress report
Bug fix: multiple progress reports in some functions

Bug fix in various gintervals.* functions: invalid output (except for the first row) when 2D track is used for intervals
Bug fix in gextract: incorrect intervalID returned when 2D track is used for intervals

Allow usage of 2D track in place of intervals
Add progress report to gintervals.load
Bug fix when using 2D big intervals: in some cases some or all chromosomes of the big intervals set might be skipped when big intervals set is used as a scope
Big intervals set: before load verify that the size of a single chromosome (or chromosome pair) does not exceed gmax.data.size
Bug fix in gcluster.run: clean up of running processes might not be completed if Ctrl+C is pressed multiple times
Bug fix in gintervals.load: returns all 2D intervals instead of a subset if one of chrom1/chrom2 is NULL and another one is not NULL
Bug fix in gintervals.load: invalid row names if chrom / chrom1 / chrom2 parameter is used for a small intervals set

Support intervals represented by tibbles

New function: gsample; returns N random samples from the specified track expression
Improved random seed when options(grnd.seed=0): previously, two calls occurring within the same second used identical random generators

Fixed compilation errors on some platforms
Run time improvement in gintervals.intersect when big intervals sets are used
Bug fix in gintervals.neighbors: returns NULL if 2D big intervals sets are used
Bug fix: a few point tracks in a track expression might be used without specifying an iterator

Dynamically limit memory use in multitasking mode
Bug fix: race condition and potential crash in multitasking mode when one of the child processes exits shortly after it is launched

Bug fix in gintervals.force_range: error when intervals are out of range

Run-time optimizations in track expression evaluation
Run-time optimizations in gintervals.neighbors

Run-time optimizations when working with large data frames of intervals in: gintervals.chrom_sizes, gintervals.force_range, gintervals.save
Run-time optimizations when working with large data frames of intervals and using intervals.set.out parameter in all the functions that accept this parameter

Run-time optimizations when working with big intervals sets in: gintervals.load, gintervals.diff, gintervals.force_range, gintervals.intersect, gintervals.mapply, gintervals.neighbors, gintervals.rbind, gintervals.update, gintervals.union
Bug fix in gintervals.diff, gintervals.intersect, gintervals.neighbors, gintervals.rbind, gintervals.union: “object ‘intervals’ not found” error in some cases when big intervals sets are used
Bug fix in gintervals.rbind: result does not preserve the original order if big intervals are used
New undocumented function: .grbind

gintervals.neighbors: run-time optimizations (the answer is entirely generated in C++)

gintervals.neighbors: sort the output by original ids of intervals1, then |distance| (Manhattan distance for 2D), then ids of intervals2
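
The three-level ordering described here can be sketched as a composite sort key. The Python below is an illustration only, for the 1D case; the row layout (id1, distance, id2) and the function name are hypothetical and do not reflect the actual output columns.

```python
def sort_neighbors(rows):
    """Sort (id1, distance, id2) rows by id1, then |distance|, then id2."""
    return sorted(rows, key=lambda r: (r[0], abs(r[1]), r[2]))
```

Given rows (2, -5, 1), (1, 10, 3), (1, -3, 7) and (1, 3, 2), the two neighbors of interval 1 at absolute distance 3 come first and are ordered by id2, followed by the neighbor at distance 10 and then the row for interval 2.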

New version of gintervals.neighbors replaces both the old gintervals.neighbors and gintervals.annotate. By default gintervals.neighbors returns the closest neighbor
gintervals.neighbors: support 2D intervals
New parameters in gintervals.neighbors: maxneighbors, mindist1, maxdist1, mindist2, maxdist2, na.if.notfound
gintervals.neighbors: columns renamed in the output
Bug fix in gintervals.neighbors: in certain 1D cases some neighbors are not reported
Check interrupts (Ctrl+C) in gintervals.neighbors
gintervals.annotate is removed

gintervals.annotate: change the output format (instead of annotation id fully attach annotation interval)
gintervals.annotate: support 2D intervals
Bug fix: some jobs might return NA when run with gcluster.run

gtrack.2d.import_contacts: support contacts files in interval-value format
gtrack.2d.import_contacts: reduce the number of simultaneously opened files
gtrack.2d.import: print out coordinates of duplicated object

New function: gintervals.2d.import
New option: gbig.intervals.size - controls the threshold when big intervals sets are created. Default value: 1000000
Bug fix in gintervals.2d.import_contacts: utterly huge tracks might have missing areas of contacts
Reduced the default value of gmax.processes option from 64 to 16
Bug fix: incorrect progress report in gtrack.2d.import_contacts

New function: gintervals.rbind. Runs rbind on intervals sets including big intervals sets on disk
New “intervals.set.out” parameter added to: gtrack.array.extract, gwilcox
Support big intervals sets in: gseq.extract, gtrack.array.extract, gtrack.modify
Bug fix: gtrack.2d.import_contacts does not recognize chromosomes that have “chr” prefix

Bug fix for all functions using 2D iterators: in some cases full chromosome pairs can be skipped. Bug first appeared in 3.3.0

Bug fix in gdb.create: “Error in .gintervals.check_new_set(intervals.set.out)…”

Added ‘opt.flags’ parameter to gcluster.run. Use this parameter to add restrictions to the machines that run submitted jobs: minimal RAM requirement, explicit hostnames list, etc. See man for qsub, “-l” flag.
Support big intervals sets in the following functions: gpartition
New “intervals.set.out” parameter added to: gpartition, gsegment
Bug fix: invalid error recovery in glookup - traces from intervals.set.out might be left

Interface change: gintervals.neighbors returns a data frame containing full intervals instead of their ids
gintervals.neighbors: colnames parameter removed
Support SAM files in gtrack.import_mappedseq
Support big intervals sets in the following functions: gintervals.neighbors, glookup
New “intervals.set.out” parameter added to: gintervals.neighbors, glookup
Removed gintervals.merge function
Runtime optimizations when big intervals sets are used in the following functions: gintervals.annotate, gintervals.diff, gintervals.force_range, gintervals.intersect, gintervals.mapply, gintervals.save, gintervals.union

Bug fix in gquantiles, multitasking version (which is used by default): invalid quantiles might be returned if the number of iterator intervals exceeds gmax.data.size / number_of_child_processes. number_of_child_processes is at most the number of different chromosomes (or chromosome pairs for 2D) used in the “intervals” parameter
Support big intervals sets in the following functions: gintervals.mapply
New “intervals.set.out” parameter added to: gintervals.mapply

Support big intervals sets in the following functions: gintervals.annotate, gintervals.diff, gintervals.intersect, gintervals.union, gsegment, gwilcox
New “intervals.set.out” parameter added to: gintervals.annotate, gintervals.diff, gintervals.intersect, gintervals.union
Added progress report to: gintervals.save, gintervals.force_range

Support big intervals sets in the following functions: gcis_decay (only intervals parameter), gintervals.2d.band_intersect
New “intervals.set.out” parameter added to: gintervals.2d.band_intersect
Bug fix in gintervals.force_range: “Error in if (size > max.data.size) { : argument is of length zero”

New concept: big intervals sets
Big intervals sets can be used for iterator parameter in all functions
Big intervals sets can be used in intervals parameter in the following functions: gdist, gextract, gquantiles, gscreen, gsummary, gbins.quantiles, gbins.summary, gintervals.quantiles, gintervals.summary, giterator.intervals
Interface change: gintervals.quantiles, gintervals.summary now return also the source intervals
New functions: gintervals.is.bigset, gintervals.chrom_sizes, gintervals.update
New “chrom”, “chrom1”, “chrom2”, parameters for gintervals.load
New “intervals.set.out” parameter added to: gextract, gscreen, gintervals.force_range, gintervals.quantiles, gintervals.summary, giterator.intervals
Restrict gintervals.quantiles and gintervals.mapply to work with only 1D and Fixed Rectangle iterators
Changed the position of “file” parameter in gextract
Bug fix: gscreen on vtrack with global.percentile.max returns different number of intervals in each run
Bug fix: in 2D iterators progress report can sometimes go backwards
Bug fix: empty intervals set is ignored if used as an iterator
Bug fix: crash if an empty intervals set is used for scope
Bug fix: memory leak in giterator.cartesian_grid

Bug fix in gintervals.save: “variable shadows the name of the intervals set” error can be generated even if auto-completion is turned off
Bug fix in all functions that create a new track: “variable shadows the name of the new track” error can be generated even if auto-completion is turned off
Bug fix in gdb.create, gtrack.var.set and while creating P-values table: insufficient permissions might be given to created directories

Bug fix: no proper clean up if error is generated while P-values table is loaded

New function: gcis_decay

New format for tracks created by gtrack.2d.import. This format uses on average 30% less space. Old format can still be used
Old computed tracks now require conversion
Bug fix: crash reading computed tracks

Changed format of 2D tracks (rectangles and computed). Use gtrack.convert to convert the old tracks
Added undocumented function: .gdb.convert_tracks
Support track files larger than 2 Gb on 32-bit platforms
gtrack.2d.import_contacts: support creation of huge 2D tracks (practically unlimited size) while keeping memory usage constant
gtrack.2d.import_contacts: allow arbitrary order of contacts within contacts file or between several contacts files
gtrack.2d.import_contacts: improved progress report
Ensure binary consistency of 2D track files. Previously the files representing two identical 2D tracks could differ on binary level

gtrack.2d.import_contacts: sum up duplicated contacts if ‘allow.duplicates’ is TRUE
gtrack.2d.import_contacts: allow contacts to be passed in multiple files

Virtual tracks are not created anymore as variables in R environment (dummy variables are created in autocompletion mode). Virtual tracks are stored inside GVTRACKS variable.
Virtual tracks are not reset anymore on gsetroot or gdir.cd
New function: gvtrack.info
Removed functions: gvtrack.all.load, gvtrack.all.rm, gvtrack.all.save, gvtrack.import
Bug fix: GERROR_EXPR variable reported but not set by functions supporting multitasking

Bug fix in gcluster.run: on some systems a warning is generated “bash: module: line 1: syntax error: unexpected end of file: error importing function definition for `module’”

Track variables are referenced by two parameters: track, varname instead of “track.varname”
Renamed and adapted to the new track variable convention:
gvar.load => gtrack.var.get
gvar.save => gtrack.var.set
gvar.ls => gtrack.var.ls
gvar.rm => gtrack.var.rm
Remove gvar.exists function
New concept: track attributes. Use .gdb.convert_attrs() to convert old trackdb to the new format.
New functions: gtrack.attr.get, gtrack.attr.set, gtrack.attr.export, gtrack.attr.import, gdb.get_readonly_attrs, gdb.set_readonly_attrs
created.by and created.date are no longer track variables but rather read-only track attributes
gtrack.ls: allow filtering by track attributes
New obligatory “description” parameter in gtrack.2d.create, gtrack.2d.import_contacts, gtrack.array.import, gtrack.convert, gtrack.create, gtrack.create_pwm_energy, gtrack.create_sparse, gtrack.import, gtrack.import_mappedseq, gtrack.import_set, gtrack.liftover, gtrack.lookup, gtrack.smooth
New function: gset_input_mode. This function replaces gparam.type option and controls auto-completion of track / intervals names
By default interactive mode is switched off (equivalent to gparam.type=“string” in older version). Auto-completion is switched off as well.
Check parameters correctness in giterator.cartesian_grid and not only when the iterator is actually used
Bug fix in gcluster.run: distributed command resets GROOT and various package options
Bug fix: in non-interactive input mode (“string” var mode) gvtrack.array.slice fails
Bug fix: gsetroot, gdb.reload, gdb.cd leave traces in the environment if they stop on error
Bug fix: gtrack.modify incorrectly sets created.by attribute
Bug fix: invalid usage printed in gvtrack.all.load

Changed the policy for multitasking job distribution

Multitasking for glookup, gtrack.smooth, gintervals.quantiles
Bug fix: memory corruption in gintervals.mapply when intervals==ALLGENOME and multitasking is turned off

Multitasking for gintervals.mapply
Bug fix: invalid format of data frame returned by gintervals.mapply when intervals==ALLGENOME
Bug fix: error while preparing a track for percentiles queries

Multitasking for gtrack.create

Multitasking for gtrack.create_pwm_energy

Multitasking for gquantiles

Bug fix: some genomic intervals might be missing in gintervals.load_chain
Bug fix: gintervals.liftover might incorrectly convert some genomic intervals to NULL
Bug fix: gtrack.liftover might incorrectly set NA for some converted genomic intervals

Bug fix: unreleased shared memory and/or named semaphore if R/misha crashes or is terminated with a signal
Bug fix: in some cases 3 seconds delay in multitasked functions
Bug fix: in some cases unresponsiveness on Ctrl+C in multitasked functions
Bug fix: with non-default options gquantile could return incorrect value for extreme percentiles (close to 0 or to 1)

Bug fix: deadlock in all multitasked functions (gsummary, gextract, gdist)

Multitasking for gscreen, gsummary
New R option for the package: gmax.processes, default: 64
Bug fix: “child process ended unexpectedly” error in multitasked functions when evaluation of track expression fails
Bug fix: Ctrl+C might stop working in R after evaluation of track expression fails

Multitasking for gextract
Bug fix: virtual tracks do not work in gdist
Bug fix: potential crash and process table bloating in gdist
Bug fix: potential freezing (deadlock) in gdist
Bug fix: error “2D iterator is used along with 1D intervals” in gdist with 2D iterator and intervals==ALLGENOME

Multitasking for gdist

Bug fix: in gintervals.2d “Error in is.null(strands) : ‘strands’ is missing”

Add strand parameter to gintervals

New function: gtrack.import
Allow tab-delimited files in gtrack.import_set
gintervals.import_genes / gdb.init: add kgID column to annotations
gintervals.import_genes / gdb.init: eliminate identical values in overlapping intervals’ annotation
Bug fix: in tab-delimited files if end coordinate equals the chrom size an error is reported

Bug fix: gintervals.import_genes / gdb.create switches between utr3 and utr5

New track type: array
New functions: gtrack.array.import, gtrack.array.get_colnames, gtrack.array.set_colnames, gtrack.array.extract, gvtrack.array.slice
Renamed:
gintervals.band.intersect => gintervals.2d.band_intersect
giterator.cartesian.grid => giterator.cartesian_grid
gsetroot.examples => gdb.init_examples
gtrack.create_2d => gtrack.2d.create
gtrack.import.2d_contacts => gtrack.2d.import_contacts
gtrack.import.mappedseq => gtrack.import_mappedseq
gtrack.import.wigs => gtrack.import_set
gsetroot has a new alias: gdb.init
gtrack.import_set: create a sparse track if binsize==0
gextract: allow saving result in a tab-delimited file
gintervals.force_range: eliminate intervals with non-existent chromosome
gquantile / gintervals.quantile, quantile / global.percentile / global.percentile.min / global.percentile.max functions of a virtual track: use weighted average of nearest samples instead of picking up the closest sample
Bug fix: sometimes using invalid value of quantile.edge.data.size option. Result: sub-optimal precision at the edges for quantile calculation OR memory bloating for quantile calculations on large sets of data.
Bug fix: sometimes using invalid value for gtrack.chunk.size option. Result: sub-optimal performance for newly created 2D tracks + memory bloating while reading 2D tracks.
Bug fix: sometimes using invalid value for gtrack.num.chunks option. Result: slow performance while reading 2D tracks OR memory bloating.
Bug fix: gsetroot / gdb.reload / gdir.cd corrupts the database state if one of the names shadows a variable in R environment.
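
The quantile change listed in this group (a weighted average of the nearest samples instead of the closest sample) can be pictured as linear interpolation between neighbouring order statistics. The Python sketch below contrasts the two approaches. The exact weighting misha uses is not stated in this changelog, so the interpolation rule, like the function names, is an assumption for illustration. Both functions expect a non-empty sample and 0 <= p <= 1.

```python
import math


def quantile_closest(samples, p):
    """Pick the single sorted sample closest to the requested position."""
    xs = sorted(samples)
    pos = p * (len(xs) - 1)
    return xs[math.floor(pos + 0.5)]


def quantile_weighted(samples, p):
    """Linearly interpolate between the two samples around the position."""
    xs = sorted(samples)
    pos = p * (len(xs) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)
    w = pos - lo  # weight of the upper neighbour
    return xs[lo] * (1 - w) + xs[hi] * w
```

For the samples 10, 20, 30, 40 and p = 0.25, the closest-sample rule returns 20, while the weighted rule returns a value between 10 and 20 that reflects how far the requested position lies between them.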

giterator.cartesian.grid: replace ‘expansion’ parameter with ‘expansion1’ and ‘expansion2’ parameters for each axis
giterator.cartesian.grid: changed the order of the parameters
Bug fix: cartesian grid iterator incorrectly restricted the expansion between two neighbouring centers C1, C2 to be (C2-C1)/2

New function: gdb.create
gdbreload renamed to gdb.reload
Added support of ftp and zipped files in gintervals.import_genes
Documentation updates

New function: gintervals.import_genes
Documentation updates

Added User Manual in PDF, Reference Manual in PDF and HTML.
Updated functions documentation.

Added documentation for each function from R
Do not require libR.so for installation
Added “maxread” parameter to gcompute_strands_autocorr()
Do not unify overlapping intervals in gintervals
gintervals.apply is replaced with gintervals.mapply. The function interface has changed.
Renamed: gvar.get() to gvar.load() and gvar.set() to gvar.save()
Bug fix: gcluster.run() did not load the package
Bug fix: gcluster.run() corrupted the return value
Bug fix: track expression iterator might miss an interval if intervals are not canonic and the iterator type is intervals/sparse

“misha” becomes an R package
gversion() removed

Bug fix: gdir.cd crashes

Support 2D tracks (Rectangles type) in gtrack.liftover
Support 2D intervals in gintervals.liftover
Bug fix: gtrack.liftover does not remove temporary files

Added gintervals.liftover function
Bug fix: error while trying to access a 2D track.

Added gtrack.liftover and gintervals.load_chain functions

Cleaned up stdout field in the result of gcluster.run

Bug fix: scope might be incorrectly applied while using cartesian iterator
Bug fix: gtrack.import.2d_contacts crashes when fend is out of range

New functions: gbins.quantiles and gbins.summary
giterator.cartesian.grid: do not implicitly add zero expansion
Add R parameter in gcluster.run
Add support of bedGraph extension in gtrack.import.wigs
Bug fix: overlapping 2D intervals are not reported correctly

New function: gcluster.run

Faster gsetroot using cached list of tracks and intervals.
New rescan parameter for gsetroot and gdbreload.

Added 2D intervals support in gintervals.apply.
Bug fix: gtrack.import.mappedseq crash.

Added band parameter to gdist, gextract, glookup, gpartition, gquantiles, gscreen, gsummary, gtrack.create, gtrack.lookup, gintervals.quantiles, gintervals.summary, giterator.intervals.
New function: gintervals.band.intersect
Removed min.band, max.band parameters from giterator.cartesian.grid

Run-time optimizations for 2D queries

Bug fix: crash if “preparing track for percentiles queries” is interrupted with CTRL-C

New function: gdbreload
gtrack.import.wigs: create tmp directory in GROOT/downloads rather than in /tmp
gwget: by default use path == GROOT/downloads
Bug fix in gtrack.import.wigs: do not proceed to import if one of the previous steps (ftp/unzip/convert to wig) failed or interrupted

Support BigWig / BedGraph formats in gtrack.import.wigs
Bug fixes in gtrack.import.wigs

New functions: gwget, gtrack.import.wigs
Allow unsorted intervals in gtrack.modify, gtrack.create_sparse
Allow unsorted and overlapping intervals in gintervals.intersect, gintervals.union and gintervals.diff

Run-time optimizations for 2D cartesian grid iterator
Bug fix: progress report shows 100% before the command actually completes

Bug fix: 2D cartesian grid iterator skips some of the iterator intervals

Bug fix: gtrack.convert does not correctly convert old (version 1) computed tracks

Fix memory leaks when command exits on error or is interrupted by CTRL-C

Added “sum” virtual track function
Added “quantile” virtual track function
Renamed “percentile” virtual track functions to “global.percentile”
Bug fix: when iterator interval does not intersect any intervals of sparse track “stddev” virtual track function returns last value instead of NaN

Added “stddev” virtual track function

Bug fix: memory corruption when the track expression contains more than one virtual track of “distance” or “distance.center” type.
Bug fix: incorrect statistics in gsummary / gintervals.summary when the summary is done on 1 sample.

Changed the format of computed 2D tracks
Bug fixes in gtrack.convert
Bug fixes in GenomeTrack::get_type
New function: .gtrack.create_test_computer2d
gintervals.2d.force_range removed (gintervals.force_range now works for both 1D and 2D intervals)

Changed the format of 2D tracks, now using StatQuadTreeCached class
Added gtrack.convert function
gwrite.table, gread.table were removed

Iterative algorithm in gtrack.smooth is reset once in a while to prevent loss of precision in floating point calculations
Bug fix: giterator.cartesian.grid does not work correctly if min/max.band are NULL.

Added gintervals.force_range and gintervals.2d.force_range

Change the default of min/max.band in cartesian iterator to NULL
Change the default of intervals2 in cartesian iterator to NULL
Add min/max.band.idx to cartesian iterator

Changed virtual track function “dist” to “distance”

Added gtraceback

Added 2D computed tracks

Changed the format of 2D tracks
Added “sum” and “area” functions to virtual tracks
Bug fix: Error message “Cannot implicitly determine iterator policy” when used with two or more virtual tracks pointing to the same physical track

Bug fix: gtrackimport_mappedseq, gcompute_strands_autocorr might skip the last portion of the input file
Bug fix: gtrackimport_mappedseq, gcompute_strands_autocorr might skip the first row of the input file

Added fixed rectangle iterator
Added support of 2D in gpartition
Added support of 2D in gtrack.lookup
Added support of 2D in gintervals.canonic
Added support of 2D in gintervals.intersect
Added gvtrack.ls
Check user interrupt in gseq.extract
Removed: gtrack.cache
Bug fix: incorrect error message in gwilcox when used with non fixed-bin iterator
Bug fix: iterator=trackname produces an error
Bug fix: gtrack.lookup produces an error
Bug fix: gtrack.modify produces an error
Bug fix: gvtrack.import incorrectly imports a track if called within a function
Bug fix: gtrack.import.2d_contacts memory leak

Add band control to cartesian grid iterator
Bug fix: cartesian grid iterator might produce incorrect results while using scope

Added new vtrack function: dist.center
Bug fix: precision loss in gextract due to float / double conversion

Bug fix: gintervals.* might fail on overlapping intervals

Run-time optimizations for gintervals and gintervals.2d
Run-time optimizations when using GITERATOR.INTERVALS

Added cartesian grid iterator

Remove canonic / original property for 2D tracks

Added virtual tracks

Hide the chrom1/chrom2 swap in 2D intervals from the user

Added support for 2D intervals in gtrack.create
Added gtrack.info
Removed gtrack.binsize
Added gintervals.all
Added gintervals.2d.all

Added gintervals.2d
Added gtrack.create_2d
Added support for 2D intervals in gscreen, gextract, glookup, gsummary, gsummary.intervals, gquantiles, gintervals.quantiles, gdist, giterator.intervals
Add unify_touching_intervals parameter to gintervals.canonic()

Automatically build PV-table
Remove gtrack.makepvals()

Allow maximal precision of gquantiles / gintervals.quantiles near extreme probs (0 and 1)

Allow gquantiles / gintervals.quantiles to work on the whole genome (use random sampling)
Allow using non-canonic intervals in gseq.extract()

Added “.nearest” track function
Bug fix: invalid error report when fixed-bin track size does not match chrom size

Make .greloaddb faster (affects gsetroot, gdir.cd, …)
Do not autocomplete tracks variables
Add gvar.ls()
Remove gvar.load()
Added pattern matching for gtrack.ls(), gintervals.ls() and gvar.ls()
Added gtrack.exists() and gintervals.exists()
Print the track expression in various error messages

Make .greloaddb faster (affects gsetroot, gdir.cd, …)
Bug fix: in various functions: “Error, undefined gparam.type”

Require all track directories to have an extension .track
Forbid creation of tracks, intervals or directories inside track directories
Forbid deletion of directories inside track directories
Bug fix: gtrack.cache() sometimes creates tracks shorter by one sample than the chrom size

Treat +/-Inf value in a track as NA
Allow “dir” argument in gsetroot

Add gtrack.import.mappedseq
Add gcompute_strands_autocorr
Support integer breaks in gdist()
Suppress error report if force==TRUE and track/interv/dir do not exist in gtrack.rm / ginterv.rm / gdir.rm

Maintain virtual working directory in gsetroot / gdir.cd. Do not change the shell working directory of the user.
Add gdir.cwd

Add gdir.create, gdir.rm, gdir.cd
Remove gtrackset.*, gintervset.*

Stop creating variables with tracksets/intervsets names for TAB-completion
Before addition / removal of track/intervals variables check whether they already exist
Bug fix: gtrack.cache overrides “created.by” and “created.date” variables

Rename intervals files by adding the “.interv” extension

Remove global/user attribute for tracksets/intervsets

Add “iterator” parameter to the relevant functions
Remove giterator.policy() function
Remove gapply() function
Bug fix in gtrack.create_pwm_energy(): Error in sprintf(“gtrack.create_pwm_energy(%s, %g, %s, %s, %g, track=%s)”, : invalid format ‘%g’; use format %s for character objects
Add “breaks” attribute to gdist result

Support sparse tracks and track expression iterators over intervals
Added gversion()
Check in giterator.intervals that the memory is not blown up
Do not unify touching intervals in giterator.policy()
error in gtrack.makepvals(): “Expression does not produce a numeric result”
Allow GAPPLY.INTERVALS and GITERATOR.INTERVALS variables

Stop supporting point intervals (start coordinate == end coordinate). Automatically convert old point intervals to (chrom, start, start+1).
Bug fix: gtrack.modify() fails with “.created.by.intervs” error.
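The conversion of old point intervals to (chrom, start, start+1) described above can be sketched in base R. The data frame and its values are made up for illustration, and the code is an assumption about what such a conversion looks like, not misha's own implementation.

```r
# Hypothetical intervals table; the second row is an old-style point interval.
intervs <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(100, 250, 0),
  end   = c(200, 250, 50),
  stringsAsFactors = FALSE
)

# A point interval has start == end; rewrite it as (chrom, start, start + 1).
is_point <- intervs$start == intervs$end
intervs$end[is_point] <- intervs$start[is_point] + 1

print(intervs)
```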

Replace global variables with options: GMAX_DATA_SIZE => gmax.data.size, GBINSIZE => gstep, GBUFSIZE => gbuf.size
Add an option “gparam.type” controlling the type of the arguments (“var” or “string”)
gintervals.annotate produces incorrect results when annotation intervals contain overlapping intervals
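Since the former global variables became R options in this release, they would be set and read through base R's options() and getOption(). The sketch below uses the option names from the entry above; the values are arbitrary placeholders, and later releases may use different option names.

```r
# Set two of the options named in the changelog entry. The values are
# placeholders chosen for illustration, not recommended defaults.
old <- options(gmax.data.size = 1e7, gbuf.size = 1000)

# Read an option back; getOption() returns NULL if it was never set.
size <- getOption("gmax.data.size")
print(size)

# Restore whatever was set before this example ran.
options(old)
```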

Fix an error when the expression is too long or something like that (Rami for details). Fixed by Amos 5/7/10: patched .Rprofile
Fix gcreate crash
If I have track names with a minus sign in them, they are built correctly, but I can’t do conditions on them, e.g.: rv <- gscreen (GSE14097.ERG_AR_re_ChIP-7 > 20 && TE.WT_Input < 7). Resolve by blocking - signs in track names (and maybe other sensitive characters). Fixed: track names are not allowed to contain characters other than alphanumeric and . The fix applies to gcreatetrack, gcachemultires, gsmooth.
added support for writing multi resolution track binning (gcachemultires) - for use by tgbr.
Misha: grmtrack does not remove the dataset directory when the last track is deleted
Misha: add force delete command line option for grmtrack for track deletion without confirmation
Misha: add grmdataset function to delete the whole dataset recursively
Misha: eliminate use of the slow gsetroot() for track list refresh after track creation or removal. The fix applies to grmtrack, gcreatetrack, gcachemultires, gsmooth.
Rami: On failure to create a track, remove it. The fix applies to gcreatetrack, gcachemultires, gsmooth.
Misha: allow a dot in the name of the track variable.
should store the command that created a track in a track variable (For future reference), we should also save the date (in the future we may want to do something with dependencies). Each newly created track is assigned two track variables: created.by and created.date.
Rami: Easier dump to text of info (currently need to use sep=“ rownames = F, etc). Added gwrite.table() function.
Summary statistics on tracks (a simplification of the distribution feature). We want to compute the total, average, stdev, min and max of a track, number of bins, number of NaN bins. All that without going through the distribution computation (maybe using an R function that sits on top of the gdist function). New function added: gsummary.
Change coordinate->bin function (should be int(coordinate/bin_size))
Iterator protect/unprotect (eval_next_int, eval_next_bool etc)
Misha: gsetroot() should undefine previous track variables
Misha: mix of float and double in gdist() causes incorrect bin assignment for the values at the border.
Misha: add smart progress report.
Misha: end coordinate of the last segment of the chromosome was mistakenly extended by binsize.
Misha: limit the range that intervals cover for gextract and gquantiles to prevent memory allocation failures.
Misha: enhance gintervals() function to accept chroms as strings without “chr” prefix or as integers.
eliminate gpath support.
Allow various services (gscan, gdist) to be limited to a subset of the genome (defined by a given interval set). This will probably require changing the classes that iterate over chromosomes such that all will use a common GenomeTrack interface.
Misha: allow track expressions for gextract and gquantiles.
Rami: Find out why you need to do ‘as.vector’ on output of gquantiles to get regular numbers. Misha: as designed. gquantiles returns a table, not a vector. The table is of NxM size where N is the number of intervals, and M is the number of quantiles.
Return intervals as a dataframe from C code rather than a list that should be later converted to a dataframe. Misha: affects gscreen, gwilcox, gintervals.union, gsetroot. Improves run time for large sets of intervals.
Misha: intervals returned by gwilcox should cover the whole area covered by the small window rather than just the center of the window.
Why is the usage of: PeakIntervals_TE_AR_AR <- gwilcox (TE.TE_AR_DHT__AR, 100000, 1000, maxpval = 0.001) - giving me intervals with peak values of 0? Misha: added what2find option to gwilcox. This option controls whether peaks/lows or both should be searched by gwilcox.
Implement operations on intervals. Union: generates a new interval set, which includes an interval for each closure of the two given sets. Intersection: returns an interval set with all the non-empty intersections of intervals in the two sets. Difference: returns the intervals of set A with their intersections with intervals from B removed. Misha: added gintervals.union(), gintervals.intersect(), gintervals.diff() functions
Call R eval in bulks to boost run times. Misha: added “.bufsize=1” parameter to all functions that accept track expressions. Increase “bufsize” to boost the performance.
Misha: allow command interruption by “ctrl-C” in various functions such as gwilcox. (Check R event loop?)
Misha: if “ctrl-C” is pressed during gcreate/gsmooth/gcachemult, the track directory is not removed.
Add gintervals.annotate function
Add auto-completion with TAB for annotations.
Add gls.annots() function
Add support for annotation function “annotation.dist” in the track expressions
Misha: determine .bufsize parameter automatically based on the dimensions of the first evaluation + the interval size. Misha: for now interval size is not checked
Misha: fix error messages format. Misha: fixed with an ugly hack
Add gapply
Allow TrackScanner to iterate over a few track expressions
Bug: message “Error: GBINSIZE variable is undefined or not numeric” is printed when invalid trackname is used in track expression.
Bug: invalid results whenever the same track is used in more than one track expression in gdist()
Bug: incorrect pvals when the limits are falling on the maximum
Bug: incorrect annot.dist when you also use intervals=something
Bug: gscreen returns one bin less for each interval
create all track dirs with group open permissions
Provide quick gsetroot (for scripts - e.g. not set up of completion variables? focus on subset of dirs? lazy init of gsetroot?) Misha: the performance of the original gsetroot was optimized
Take care of environment handling in various eval(parse()) constructs
Allow string parameters for track_expr - if the parameter is a string, just avoid parse
Reverse the convention of positive / negative distance between coord / interval and annotation
Change DB directory structure: tracks/tracksets/tracknames, annots/annotsets/annotnames Misha: renamed functions: gls => gls.tracks, gcreatetrack => gcreate.track, grmtrack => grm.track, grmdataset => grm.trackset. Newly added functions: gls.annots, gls.annotsets, gls.tracksets, gcreate.annotset, grm.annotset.
Add trackset / annotset manipulation functions + user / global flag support
Bug: if ~ symbol is used in gsetroot, track files cannot be accessed
Allow custom column names for gapply and gextract
Regression tests
make it easier to import intervals/annotations: add ginterval.import that will take non canonical intervals and will sort them and unify them to become disjoint. The function will return the canonical ginterval object plus a factor mapping ids in the original data frame to the new canonical gintervals. We can then use tapply to import meta-data.
Merge annotations and intervals concepts. Misha: renamed functions: gls.annots => gls.intervs, gls.annotset => gls.intervsets, grm.annots => grm.intervs, grm.annotsets => grm.intervsets. Newly added functions: gintervals.load, gintervals.save.
Effi: segmentation fault in gdist
Bug fix: calling gprepare_pvals() twice on the same track fails
need a simple way to merge data onto intervals, without reordering and invalidating the intervals (i.e. doing ginterval.import after merge is annoying. This may be solvable using standard r merge options, but we need to wrap it up nicely) Misha: Added gintervals.merge function
More protective use of intervals: coerce fields as factors if needed and generally be aware of potential user changes to intervals.
allow collection of pairs of interval within a maximal distance Misha: Added gintervals.neighbors function
Bug fix: gcachemultires does not work
Decide about the naming policy for the functions
Restrict gdist to accept minval=2
Add dist.XXX function
check how many times gapply calls func1 in gapply(func1(x1), track, gintervals(1, 0, 100))
gintervals.apply should reverse intervals of the minus strand
bug: gintervals.neighbors(gintervals(1, 0, 2000, 1), gintervals(1, 500, 1500, 1), 0, 0) produces an error
added colnames parameter to gintervals.neighbors
file descriptors are not closed properly in onexit() causing various .create functions to leave junk if interrupted
chromo sequence and pwms to the trackdb directory: allow import of pwms to the library (using several sets of pwm, each defined by a file?)
allow PWM energy computation (using the chromo sequence and a pwm)
Create tracks from 4c sites and prof files
BUG: The same interval.dist could not appear more than once in an expression - for example , gdist(global.tss.dist+global.tss.dist,-1000,1000,100) produces “object ‘global.tss.dist’ not found”
Provide the interval ID in the function provided to intervals.gapply Misha: GINTERVID variable is maintained while gapply
add progress report for non-track expression functions Misha: added progress report to gintervals.annotate
Allow to smooth NaN values in gtrack.smooth (add smooth_nans parameter)
it might be useful to have a direct way get sets of intervals corresponding to division of the genome according to the values of some track. it would be best if gdist could return another extra value with intervals corresponding to each of the track combinations Misha: gpartition function was added
Misha: avoid memory blow up when large vectors are returned by gquantile, etc.
gseq.extract() should return reverse-complementary sequence for strand -1
Bug fix: gpartition does not treat NaNs correctly
Change gdist and gpartition to accept breaks rather than minval/maxval/numbins
Bug: with equal bin size gdist crashes
Bug: gquantiles crashes with error: unprotect_ptr: pointer not found
Bug: gdist called from a function does not work Misha: deal with unevaluated (=promised) values
In case of track expression result mismatch save the result of the last evaluation in GERROR_EXPR variable
Bug: gtrack.make_pvals() does not work
Bug: gsummary() max value is incorrect for negative values
Added gvar.get() function
Added gvar.exists() function
gvar.rm() “warning: variable was not found” when variable exists but not loaded
Added gtrack.binsize()
Allow gintervals(chrom) which is equal to gintervals(chrom, 0, -1)
Effi: if x contains a line of NA’s gintervals.import(x) causes segmentation fault
Misha: optimize gextract to present chroms as factors rather than strings
Misha: truncate long column names
Added glookup() and gtrack.lookup() functions
Renamed “origin” attribute of gintervals.import to “mapping”
Allow regression tests to be invoked by the matching string
Removed gvar.loadall() function
Added gtrack.modify() function
Misha: delete gvar.loadall() - as it might blow up the memory in large databases
Misha: gvar.* functions do not work when track variable X is loaded and passed unquoted
Misha: read-only functions refuse to work with a track that does not have write permissions
Allow overlaps in all functions except gintervals.intersect, gintervals.diff, gintervals.union, dist.XXX
Renamed gquantiles to gintervals.quantiles
Renamed gintervals.import to gintervals.canonic
Added gintervals.summary()
Added gquantiles()
Bug: gvar.load does not work when the variable is given unquoted
Warning “is.na() applied to non-(list or vector) of type ‘NULL’” when gscreen(is.na(track)) is called
Versioning and installation added
Calling gintervals.canonic(i) when i is a dataframe without any rows causes R to crash Misha: other functions (gintervals.union, …) were fixed too
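The union and intersection of interval sets described in the entries above can be illustrated with a small base-R sketch. It is not the code behind gintervals.union() or gintervals.intersect(): it assumes a single chromosome, intervals given as (start, end) with an exclusive end, and touching intervals being merged. The helper names interval_union and interval_intersect are hypothetical, and difference is left out.

```r
# Union: merge all intervals of `a` and `b` that overlap or touch.
interval_union <- function(a, b) {
  iv <- rbind(a, b)
  iv <- iv[order(iv$start, iv$end), ]
  # Largest end coordinate among the intervals preceding each row.
  prev_max_end <- c(-Inf, head(cummax(iv$end), -1))
  # A new merged interval begins when a row starts after all previous ends.
  group <- cumsum(iv$start > prev_max_end)
  data.frame(
    start = as.numeric(tapply(iv$start, group, min)),
    end   = as.numeric(tapply(iv$end, group, max))
  )
}

# Intersection: every non-empty overlap between an interval of `a` and one of `b`.
interval_intersect <- function(a, b) {
  pairs <- expand.grid(i = seq_len(nrow(a)), j = seq_len(nrow(b)))
  start <- pmax(a$start[pairs$i], b$start[pairs$j])
  end   <- pmin(a$end[pairs$i], b$end[pairs$j])
  keep  <- start < end
  out <- data.frame(start = start[keep], end = end[keep])
  out <- out[order(out$start, out$end), , drop = FALSE]
  rownames(out) <- NULL
  out
}

# Made-up example sets on one chromosome.
a <- data.frame(start = c(0, 200), end = c(100, 300))
b <- data.frame(start = c(50, 300), end = c(250, 400))

print(interval_union(a, b))
print(interval_intersect(a, b))
```

The intersection compares all pairs of intervals, which keeps the sketch short but scales quadratically; a sweep over sorted intervals would be the usual choice for large sets.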

misha 4.3.7

misha 4.3.6 2025-03-06

misha 4.3.5

misha 4.3.4

misha 4.3.3

misha 4.3.2

misha 4.3.1

misha 4.3.0

misha 4.2.14

misha 4.2.13

misha 4.2.12

misha 4.2.10

misha 4.2.9 2024-01-15

misha 4.2.8 2023-12-05

misha 4.2.7

misha 4.2.6 2023-09-14

misha 4.2.3 2023-09-05

misha 4.2.2

misha 4.1.0

misha 4.0.11

misha 4.0.10

misha 4.0.9

misha 4.0.8

misha 4.0.6

misha 4.0.5

misha 4.0.4

misha 4.0.3

misha 4.0.2

misha 4.0.1

misha 4.0.0

misha 3.7.1

misha 3.7.0

misha 3.6.0

misha 3.5.6

misha 3.5.5

misha 3.5.4

misha 3.5.3

misha 3.5.2

misha 3.5.1

misha 3.5.0

misha 3.4.3

misha 3.4.2

misha 3.4.1

misha 3.4.0

misha 3.3.18

misha 3.3.17

misha 3.3.16

misha 3.3.15

misha 3.3.14

misha 3.3.13

misha 3.3.12

misha 3.3.11

misha 3.3.10

misha 3.3.9

misha 3.3.8

misha 3.3.7

misha 3.3.6

misha 3.3.5

misha 3.3.4

misha 3.3.3

misha 3.3.2

misha 3.3.1

misha 3.3.0

misha 3.2.8

misha 3.2.7

misha 3.2.6

misha 3.2.5

misha 3.2.4

misha 3.2.3

misha 3.2.2

misha 3.2.1

misha 3.2.0

misha 3.1.11

misha 3.1.10

misha 3.1.9

misha 3.1.8

misha 3.1.7

misha 3.1.6

misha 3.1.5