Parallel Storage

TanayLabUtilities.ParallelStorage Module

Allow reusing large data between parallel tasks. Using task local storage re-allocates and re-initializes this data for each iteration, which is slow and overworks the garbage collector. Using thread local storage is no longer safe because Julia has moved away from "sticky" threads (that is, a task may migrate between threads); if naively implemented, it also creates an instance per thread regardless of whether it is actually used. The ReusableStorage allocates the minimal number of instances, reuses them in multiple tasks, and automates resetting the data each time it is used by a new task (if needed).
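The cost of task local storage can be seen in a small sketch that uses only Base Julia. The names ALLOCATIONS and task_local_buffer are illustrative assumptions and are not part of TanayLabUtilities; the point is only that each task ends up allocating its own buffer, even though a single task can reuse what it allocated.

```julia
# Illustrative only: count how many buffers task local storage ends up allocating.
# ALLOCATIONS and task_local_buffer are hypothetical names, not library API.
const ALLOCATIONS = Threads.Atomic{Int}(0)

function task_local_buffer()::Vector{Float64}
    # `task_local_storage()` is private to the current task, so the first
    # access made by every task has to allocate a fresh buffer.
    return get!(task_local_storage(), :buffer) do
        Threads.atomic_add!(ALLOCATIONS, 1)
        return zeros(Float64, 1000)
    end
end

@sync for _ in 1:20
    Threads.@spawn begin
        first_access = task_local_buffer()
        second_access = task_local_buffer()
        # Within one task the buffer is reused...
        @assert first_access === second_access
    end
end

# ...but nothing is shared between tasks: 20 tasks allocate 20 buffers.
@assert ALLOCATIONS[] == 20
```

With one buffer per task, the number of allocations grows with the number of iterations rather than with the number of tasks that actually run at the same time.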

TanayLabUtilities.ParallelStorage.ReusableStorage Type
ReusableStorage(create::Function)::ReusableStorage

Implement reusable storage for parallel computations. This allocates and reuses the minimal number of instances of some data for (re)use by parallel tasks. The create function should return a new instance of the data. If a previous task has used the data, we will call reset_reusable_storage! to bring it back to its initial state.

Note

The create and reset_reusable_storage! functions must be thread safe. We intentionally do not perform them while holding the global lock to maximize performance. This only matters if these functions access global variables or the like; typically they just allocate some memory and work on the specific reusable storage instance, which only they have access to, so this isn't a concern.
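As a hedged illustration of the case where thread safety does matter, the sketch below shows a create-style function that touches a global counter. The names INSTANCES_CREATED and create_buffer are assumptions made for this example; the atomic counter is one possible way to keep such a function safe when several tasks call it concurrently without a lock.

```julia
# Hypothetical create function that updates global state.
# An atomic counter keeps the update safe when it runs outside any lock.
const INSTANCES_CREATED = Threads.Atomic{Int}(0)

function create_buffer()::Vector{Float64}
    Threads.atomic_add!(INSTANCES_CREATED, 1)  # safe from any thread
    return zeros(Float64, 16)                  # private to the caller
end

# Call the function from several tasks at once, as a pool might do.
@sync for _ in 1:8
    Threads.@spawn create_buffer()
end

@assert INSTANCES_CREATED[] == 8
```

A function of this shape could be passed as the create argument of ReusableStorage. A plain `Int` global incremented with `+=` would not be safe here, since concurrent increments can be lost.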

mutable struct ExampleStorage
    is_clean::Bool
end

function TanayLabUtilities.reset_reusable_storage!(storage::ExampleStorage)::Nothing
    @assert !storage.is_clean
    storage.is_clean = true
    return nothing
end

reusable_storage = ReusableStorage() do
    return ExampleStorage(true)
end

first = nothing
second = nothing

with_reusable(reusable_storage) do storage_1
    @assert storage_1.is_clean
    storage_1.is_clean = false
    global first
    first = storage_1

    with_reusable(reusable_storage) do storage_2
        @assert storage_2.is_clean
        storage_2.is_clean = false
        global second
        second = storage_2
        @assert second !== first
    end

    with_reusable(reusable_storage) do storage_3
        @assert storage_3.is_clean
        storage_3.is_clean = false
        @assert storage_3 === second
    end
end

@assert !first.is_clean
@assert !second.is_clean

# output


TanayLabUtilities.ParallelStorage.get_reusable! Function
get_reusable!(reusable_storage::ReusableStorage{T})::T where{T}

Get a private instance of the data from the reusable storage. Will prefer to return an existing instance of the data, after calling reset_reusable_storage! if used by a previous task. If all instances are currently being used by other tasks, will create a new instance instead.
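The "reuse, reset, or create" policy described above can be sketched in plain Julia. This is a minimal illustration under stated assumptions, not the TanayLabUtilities implementation: SketchStorage, sketch_get! and sketch_put! are hypothetical names, and the function that returns an instance to the pool is an assumption, since this section does not name one.

```julia
# Hypothetical pool: every idle instance was used by some earlier caller.
struct SketchStorage{T}
    create::Function
    reset!::Function
    idle::Vector{T}
    lock::ReentrantLock
end

function sketch_get!(storage::SketchStorage{T})::T where {T}
    # Hold the lock only while looking for an idle instance.
    reused = lock(storage.lock) do
        return isempty(storage.idle) ? nothing : pop!(storage.idle)
    end
    if reused === nothing
        return storage.create()  # nothing idle: create outside the lock
    end
    storage.reset!(reused)       # used before: reset outside the lock
    return reused
end

function sketch_put!(storage::SketchStorage{T}, data::T)::Nothing where {T}
    lock(storage.lock) do
        push!(storage.idle, data)
    end
    return nothing
end

storage = SketchStorage{Vector{Int}}(
    () -> zeros(Int, 4),      # create a new instance
    data -> fill!(data, 0),   # bring an instance back to its initial state
    Vector{Int}[],
    ReentrantLock(),
)

a = sketch_get!(storage)  # nothing idle yet: creates an instance
b = sketch_get!(storage)  # `a` is still in use: creates a second one
@assert a !== b

a[1] = 7
sketch_put!(storage, a)
c = sketch_get!(storage)  # reuses `a`, after resetting it
@assert c === a
@assert c[1] == 0
```

Creating and resetting outside the lock mirrors the note above about not holding the global lock during these calls, which is why both functions must be thread safe.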

Index