Flame Time
TanayLabUtilities.FlameTime
—
Module
Generate a coarse-grained flameview. If the environment variable
JULIA_FLAME_MEASUREMENTS_FILE
is set, it should contain the path of a file to append measurements into. Every call to
flame_timed
will collect measurements, and on normal program exit, the total ("folded") measurements are written into this file. The file can't be directly viewed using
flameview.py
; it needs to be converted to a proper flameview file using
finalize_flameview
.
Measurements are appended into the file, so that running multiple programs automatically creates a unified file for a whole computational pipeline. Measurement stacks are prefixed by the program name (
basename(source_path())
or
Main
). If you want to separately measure each program execution, use a different measurements file for each. Also, there's no locking done, so running programs in parallel with the same measurements file will not end well.
This serves as a very lightweight mechanism for visualizing program performance. Every
@logged
function and every
parallel_loop
is automatically a measured step, so one gets a pretty decent level of detail "for free". It is possible to supplement this with occasional calls to
flame_timed
to increase the level of detail. Of course one can ignore all of this and use more detailed tools such as
Profile
when profiling for performance in anger.
TanayLabUtilities.FlameTime.flame_timed
—
Function
flame_timed(body::Function, name::AbstractString)::Any
Add a measurement to the flameview file for the execution of the
body
under the specified
name
. We distinguish between "serial" measurements (in the top-level main thread) and "parallel" measurements (inside tasks invoked from within a
parallel_loop
). For parallel code we can only meaningfully measure elapsed time. For serial code (including complete
parallel_loop
), we measure much more - GC overhead, CPU time, page faults, IO, and context switches.
If
name
starts with
.
, then we take the name of the surrounding
flame_timed
context (which must exist) and append the
name
to it. This allows easily marking sections inside logged functions (e.g.
flame_timed(".loop")
) in a way that still makes sense when generating a reversed flame graph.
If
JULIA_FLAME_MEASUREMENTS_FILE
was not specified, this just invokes the
body
without (almost) any overhead.
This is automatically invoked for every
@logged
function, which greatly reduces the need to manually annotate code sections with an explicit
flame_timed
call, though this is still useful when sections of a function need to be measured, or for wrapping calls to expensive standard library functions of interest.
TanayLabUtilities.FlameTime.finalize_flameview
—
Function
finalize_flameview(;
measurements_path::AbstractString,
flameview_path::AbstractString,
reversed::Bool = false,
)::Nothing
Convert the collected measurements in the
measurement_path
into a proper
flameview
file, suitable for use with This is needed for several reasons:
- The measurements are inclusive (that is, the elapsed time of a step includes the elapsed time of all nested steps).
- Measurements of serial steps (which should add to the total execution time) are mixed with the measurements of parallel steps (which would add up to much more).
- We collect much more than (elapsed) time, so we need to generate a nive HTML annotation for each line to show the extra data.
In addition, a reversed flamegraph (showing callers instead of callees) doesn't work well when just reversing the stacks of the flameview file, because the annotations won't be properly computed. Specify
reversed
to generate a proper reversed flameview file.
Each
parallel_loop
is therefore measured twice, once as a serial step and once as a parallel step. We scale the parallel time (and everything nested in it) so it adds up to the serial time. We then convert all the inclusive measurements to exclusive ones. The result is a flameview that shows the contribution of everything (serial and parallel steps) to the total execution time, under the assumption that the number of iterations in the parallel loops is large enough that it is the sum of parallel steps that matters.
This will not work if you call
flame_timed
inside an explicit
@threads
loops. You must use a
parallel_loop
instead.
Parallel names are suffixed with
_[1]
to allow using different colors for serial and parallel code. This only works with
flameview
; in general the generated file only works with
flameview.py
. If you want to feed it into a
flamegraph
tool, you will need to pass it through
sed 's/ #.*//;s/_\[1\]/_[j]/'
- that is, get rid of the (very useful) annotations attached to each node, and switch to a
_[j]
suffix (or any other suffix you choose that works with
flamegraph.pl
, or even completely get rid of the suffix.)