In addition to the in-memory (memory_daf) and
one-file-per-property (files_daf) storage backends, dafr
ships two Zarr backends and an HTTP reader. All three are
bidirectionally compatible with DataAxesFormats.jl’s
ZarrDaf and HttpDaf, and the Zarr layout is
readable by zarr-python and any other Zarr v2 consumer.
ZarrDaf — directory layout
zarr_daf(path, mode) reads and writes a Zarr v2 group
tree on the local filesystem. The path conventionally ends in
.daf.zarr:
path <- tempfile(fileext = ".daf.zarr")
d <- zarr_daf(path, mode = "w")
add_axis(d, "cell", c("c1", "c2", "c3"))
set_scalar(d, "organism", "human")
set_vector(d, "cell", "score", c(1.5, 2.5, 3.5))
set_matrix(d, "cell", "cell", "kin",
           matrix(c(1, 0, 0, 0, 1, 0, 0, 0, 1), 3, 3))
rm(d); gc()
#>           used  (Mb) gc trigger  (Mb) max used  (Mb)
#> Ncells 2508395 134.0    5035637 269.0  3274930 175.0
#> Vcells 4308408  32.9   10146329  77.5  7111962  54.3
# Reopen read-only and inspect.
d <- zarr_daf(path, mode = "r")
cat(description(d))
#> name: file278d383e99d9.daf.zarr
#> type: ZarrDaf
#> path: /tmp/RtmpxJ7LnK/file278d383e99d9.daf.zarr
#> mode: r
#> scalars:
#>   organism: "human"
#> axes:
#>   cell: 3 entries
#> vectors:
#>   cell:
#>     score
#> matrices:
#>   cell,cell:
#>     kin

A directory ZarrDaf store is a normal Zarr group. Python consumers
can open it with zarr.open():
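With zarr-python installed this is a one-liner, `zarr.open(path, mode="r")`. For environments without zarr, the sketch below illustrates why any Zarr v2 consumer can read the store, using only the standard library: it writes a minimal v2 group by hand (the file names and JSON keys are fixed by the Zarr v2 spec; the `score` array name is just an example, not dafr's exact tree) and decodes an uncompressed chunk with struct:

```python
import json, os, struct, tempfile

# Build a minimal Zarr v2 group by hand: a group marker plus one 1-D
# float64 array with a single uncompressed chunk. The file names and
# JSON documents below come from the Zarr v2 spec.
root = tempfile.mkdtemp(suffix=".zarr")
os.makedirs(os.path.join(root, "score"))
with open(os.path.join(root, ".zgroup"), "w") as f:
    json.dump({"zarr_format": 2}, f)
with open(os.path.join(root, "score", ".zarray"), "w") as f:
    json.dump({"zarr_format": 2, "shape": [3], "chunks": [3],
               "dtype": "<f8", "compressor": None, "fill_value": 0.0,
               "order": "C", "filters": None}, f)
with open(os.path.join(root, "score", "0"), "wb") as f:
    f.write(struct.pack("<3d", 1.5, 2.5, 3.5))

# Any v2 consumer reads the same bytes: parse .zarray, then decode the
# raw chunk (compressor null means plain little-endian doubles).
meta = json.load(open(os.path.join(root, "score", ".zarray")))
raw = open(os.path.join(root, "score", "0"), "rb").read()
values = struct.unpack("<%dd" % meta["shape"][0], raw)
print(values)  # (1.5, 2.5, 3.5)
```

With zarr-python, `zarr.open(root, mode="r")["score"][:]` would return the same three values.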
ZarrDaf — single-file zip layout
For shipping a daf as a single artifact,
zarr_daf(path, mode) accepts a .daf.zarr.zip
path. The store is backed by an mmap’d ZIP archive: reads of stored
(uncompressed) entries are zero-copy ALTREP views over the mapped
region, and writes append-commit through a crash-safe two-phase
protocol.
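The crash-safety claim can be illustrated with a generic prepare/commit pattern. This is not dafr's actual protocol (dafr appends to the archive rather than rewriting it); it is a minimal sketch, assuming a temp-file prepare phase followed by an atomic `os.replace` commit, so a crash leaves either the old archive or the new one, never a torn mix:

```python
import os, tempfile, zipfile

def commit_zip_update(path, new_entries):
    """Generic two-phase update sketch (NOT dafr's implementation):
    phase 1 writes a complete replacement archive to a temp file in the
    same directory and fsyncs it; phase 2 publishes it atomically."""
    existing = {}
    if os.path.exists(path):
        with zipfile.ZipFile(path) as z:
            existing = {n: z.read(n) for n in z.namelist()}
    existing.update(new_entries)
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            with zipfile.ZipFile(f, "w", zipfile.ZIP_STORED) as z:
                for name, data in existing.items():
                    z.writestr(name, data)
            f.flush()
            os.fsync(f.fileno())   # phase 1: durable prepare
        os.replace(tmp, path)      # phase 2: atomic commit
    except BaseException:
        os.unlink(tmp)
        raise

store = os.path.join(tempfile.mkdtemp(), "demo.daf.zarr.zip")
commit_zip_update(store, {"a": b"1"})
commit_zip_update(store, {"b": b"2"})
with zipfile.ZipFile(store) as z:
    print(sorted(z.namelist()))  # ['a', 'b']
```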
zip_path <- tempfile(fileext = ".daf.zarr.zip")
d <- zarr_daf(zip_path, mode = "w")
add_axis(d, "gene", c("g1", "g2"))
set_vector(d, "gene", "is_marker", c(TRUE, FALSE))
rm(d); gc()
#>           used  (Mb) gc trigger  (Mb) max used  (Mb)
#> Ncells 2733007 146.0    5035637 269.0  3274930 175.0
#> Vcells 4682686  35.8   10146329  77.5  7111962  54.3
# Reopen.
d <- zarr_daf(zip_path, mode = "r")
get_vector(d, "gene", "is_marker")
#>    g1    g2
#>  TRUE FALSE

Note: the single-file .daf.zarr.zip backend depends on POSIX mmap and
is not available on Windows in this build. Use the directory layout
above on Windows.
Foreign consumers open the same archive via Python’s
zarr.storage.ZipStore:
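With zarr-python the call is `zarr.open(zarr.storage.ZipStore(path, mode="r"))`. Since a zarr environment may not be at hand, the stdlib sketch below shows the underlying point: the archive is an ordinary zip whose entry names are Zarr keys (the `is_marker` entries here mirror the layout as an example, not dafr's exact tree):

```python
import json, os, struct, tempfile, zipfile

# Build a stand-in .daf.zarr.zip: an ordinary zip whose entry names are
# Zarr v2 keys, holding a 2-element boolean array with one raw chunk.
path = os.path.join(tempfile.mkdtemp(), "demo.daf.zarr.zip")
with zipfile.ZipFile(path, "w", zipfile.ZIP_STORED) as z:
    z.writestr(".zgroup", json.dumps({"zarr_format": 2}))
    z.writestr("is_marker/.zarray", json.dumps(
        {"zarr_format": 2, "shape": [2], "chunks": [2], "dtype": "|b1",
         "compressor": None, "fill_value": False, "order": "C",
         "filters": None}))
    z.writestr("is_marker/0", struct.pack("<2?", True, False))

# zarr.storage.ZipStore is essentially this key -> bytes mapping:
with zipfile.ZipFile(path) as z:
    meta = json.loads(z.read("is_marker/.zarray"))
    flags = struct.unpack("<2?", z.read("is_marker/0"))
print(flags)  # (True, False)
```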
HttpDaf and HttpStore — reading over HTTP(S)
dafr provides two read-only HTTP backends:
- http_daf(url) — read a files_daf directory served over HTTP. The
  client downloads metadata.zip once at open time, parses it in memory,
  and serves all JSON metadata from there. Non-JSON payloads (axis .txt
  files, vector / matrix .data, etc.) are fetched lazily on first access
  via one HTTP GET each, cached by the standard cache layer.
- zarr_daf("http(s)://...") — read a .daf.zarr directory served over
  HTTP. Routes through HttpStore, an HTTP-backed implementation of
  dafr's internal ZarrStore interface. Fetches .zmetadata once to
  discover the tree, then pulls chunks on demand.
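HttpStore's internals are not shown in this article, but the access discipline described above (one metadata fetch up front, then one memoized GET per key on first access) can be sketched as follows, with a stub callable standing in for a real HTTP GET:

```python
import json

class LazyStore:
    """Sketch of the fetch-once / GET-on-demand discipline described
    above (not dafr's HttpStore class). `fetch` stands in for HTTP."""
    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}
        # Consolidated metadata is fetched exactly once, at open time.
        self.meta = json.loads(fetch(".zmetadata"))["metadata"]
    def __getitem__(self, key):
        if key not in self._cache:      # first access: one GET
            self._cache[key] = self._fetch(key)
        return self._cache[key]         # later accesses: cache hit

# Stub "server" plus a call log so the laziness is observable.
server = {
    ".zmetadata": json.dumps({"metadata": {".zgroup": {"zarr_format": 2}}}),
    "cell/score/0": b"\x00" * 24,
}
calls = []
store = LazyStore(lambda k: calls.append(k) or server[k])
store["cell/score/0"]; store["cell/score/0"]
print(calls)  # ['.zmetadata', 'cell/score/0'] -- second read was cached
```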
Both backends are read-only; writable modes hard-error. Server data
is assumed stable while a daf is open — reopen to pick up changes. Each
HTTP GET uses a 30-second timeout, overridable via
options(dafr.http_timeout = N) or env
DAFR_HTTP_TIMEOUT. There is no automatic retry;
flaky-network handling is the caller’s responsibility.
# Open a remote files_daf:
d <- http_daf("https://example.com/path/to/foo.daf/")
get_scalar(d, "organism")
# Open a remote ZarrDaf:
d <- zarr_daf("https://example.com/path/to/foo.daf.zarr/", mode = "r")
get_vector(d, "cell", "score")
# open_daf() routes by URL pattern:
d <- open_daf("https://example.com/path/to/foo.daf.zarr/", mode = "r")

To publish a files_daf over HTTP, the directory needs a
metadata.zip bundle at its root. From dafr 0.2.0 onward
this is maintained automatically by every write; for pre-0.2.0 stores
call pack_files_daf_metadata(path) once before
publishing.
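What such a packing step amounts to can be sketched with the stdlib zipfile module. The function below is a hypothetical stand-in, not dafr's implementation: it assumes the metadata to bundle is the set of JSON files under the store, and it mimics the idempotence described for pack_files_daf_metadata():

```python
import json, os, tempfile, zipfile

def pack_metadata_sketch(root):
    """Hypothetical stand-in for pack_files_daf_metadata(): bundle every
    JSON file under a files_daf directory into metadata.zip at its root.
    Idempotent: a no-op once the bundle exists. The exact file set dafr
    bundles is an assumption here."""
    out = os.path.join(root, "metadata.zip")
    if os.path.exists(out):
        return False  # already packed; no-op
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as z:
        for dirpath, _, names in os.walk(root):
            for name in names:
                if name.endswith(".json"):
                    full = os.path.join(dirpath, name)
                    z.write(full, os.path.relpath(full, root))
    return True

# Demo on a throwaway directory shaped loosely like a files_daf store.
root = tempfile.mkdtemp(suffix=".daf")
with open(os.path.join(root, "daf.json"), "w") as f:
    json.dump({"version": [1, 0]}, f)
print(pack_metadata_sketch(root))  # True  (packed)
print(pack_metadata_sketch(root))  # False (no-op second time)
```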
# One-time: bundle a pre-0.2.0 store's metadata so HttpDaf clients can
# open it. Idempotent; no-op once metadata.zip exists.
pack_files_daf_metadata("/srv/www/data/foo.daf")

zarr_daf("https://...zip") is intentionally not
supported — open the zip locally instead, since byte-range reads against
a remote zip would need separate handling.
Cross-language smoke tests
Both Zarr layouts and the HTTP backend are continuously cross-checked
against zarr-python in the package test suite. Round-trip
parity is verified for axes, dense and sparse vectors, dense and sparse
matrices, and scalars of every dafr type. See
tests/testthat/test-zarr-python.R,
tests/testthat/test-mmap-zip-store-foreign.R, and
tests/testthat/test-http-live.R for the live test
scripts.