data
Interface of DafReader and DafWriter. See the Julia documentation for details.
- class dafpy.data.DafReader(jl_obj)
-
Read-only access to Daf data. See the Julia documentation for details.
- property name: str
-
Return the (hopefully unique) name of the Daf data set.
- description(*, cache: bool = False, deep: bool = False, tensors: bool = True) -> str
-
Return a (multi-line) description of the contents of Daf data. See the Julia documentation for details.
- has_scalar(name: str) -> bool
-
Check whether a scalar property with some name exists in the Daf data set. See the Julia documentation for details.
- get_scalar(name: str) -> bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str
-
Get the value of a scalar property with some name in the Daf data set. See the Julia documentation for details.
Numeric scalars are always returned as int or float, regardless of the specific data type they are stored as in the Daf data set (e.g., a UInt8 will be returned as an int instead of a np.uint8).
- scalars_set() -> AbstractSet[str]
-
The names of the scalar properties in the Daf data set. See the Julia documentation for details.
- has_axis(axis: str) -> bool
-
Check whether some axis exists in the Daf data set. See the Julia documentation for details.
- axes_set() -> AbstractSet[str]
-
The set of names of the axes of the Daf data set. See the Julia documentation for details.
- axis_length(axis: str) -> int
-
The number of entries along the axis in the Daf data set. See the Julia documentation for details.
- axis_np_vector(axis: str) -> ndarray
-
A numpy vector of unique names of the entries of some axis of the Daf data set. See the Julia documentation for details.
This creates an in-memory copy of the data, which is cached for repeated calls.
- axis_np_entries(axis: str, indices: Sequence[int] | None = None, *, allow_empty: bool = False) -> ndarray
-
Return a numpy vector of the names of entries of the indices in the axis. See the Julia documentation for details.
The indices passed here are 0-based to fit the Python conventions. This means that if allow_empty, negative indices are converted to the empty string.
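The 0-based rule above can be sketched in plain Python. This is only an illustration of the described behavior, not dafpy code: `entries_of_indices` is a hypothetical helper, and both returning all names when `indices` is `None` and raising on a negative index without `allow_empty` are assumptions.

```python
from typing import List, Optional, Sequence


def entries_of_indices(
    names: Sequence[str],
    indices: Optional[Sequence[int]] = None,
    *,
    allow_empty: bool = False,
) -> List[str]:
    """Hypothetical stand-in (not part of dafpy) for the 0-based lookup rule."""
    if indices is None:
        # Assumption: no indices means "all entries of the axis".
        return list(names)
    result = []
    for index in indices:
        if index < 0:
            if not allow_empty:
                # Assumption: a negative index is an error unless allow_empty.
                raise IndexError(f"negative index: {index}")
            result.append("")  # negative index maps to the empty string
        else:
            result.append(names[index])  # plain 0-based lookup
    return result


cells = ["cell_a", "cell_b", "cell_c"]
picked = entries_of_indices(cells, [2, -1, 0], allow_empty=True)
print(picked)
```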
- axis_dict(axis: str) -> Mapping[str, int]
-
Return a dictionary converting axis entry names to their (0-based) integer index.
- axis_np_indices(axis: str, entries: Sequence[str], *, allow_empty: bool = False) -> ndarray
-
Return a numpy vector of the indices of the entries in the axis. See the Julia documentation for details.
The indices returned here are 0-based to fit the Python conventions. This means that if allow_empty, the empty string is converted to the index -1.
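As a rough illustration of the name-to-index direction, the following plain-Python sketch mirrors what axis_dict and axis_np_indices are described as doing. The helpers `build_axis_dict` and `indices_of_entries` are hypothetical names invented here, and the error raised for an unknown name is an assumption.

```python
from typing import Dict, List, Sequence


def build_axis_dict(entries: Sequence[str]) -> Dict[str, int]:
    """Hypothetical helper: map each entry name to its 0-based position."""
    return {name: index for index, name in enumerate(entries)}


def indices_of_entries(
    axis_dict: Dict[str, int],
    entries: Sequence[str],
    *,
    allow_empty: bool = False,
) -> List[int]:
    """Hypothetical helper: look up 0-based indices, mapping "" to -1."""
    indices = []
    for entry in entries:
        if entry == "" and allow_empty:
            indices.append(-1)  # the empty string becomes index -1
        else:
            # Assumption: an unknown name (or "" without allow_empty) fails.
            indices.append(axis_dict[entry])
    return indices


gene_dict = build_axis_dict(["gene_x", "gene_y", "gene_z"])
found = indices_of_entries(gene_dict, ["gene_z", "", "gene_x"], allow_empty=True)
print(gene_dict, found)
```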
- axis_pd_indices(axis: str, entries: Sequence[str], *, allow_empty: bool = False) -> Series
-
Return a pandas series of the indices of the entries in the axis. See the Julia documentation for details.
- has_vector(axis: str, name: str) -> bool
-
Check whether a vector property with some name exists for the axis in the Daf data set. See the Julia documentation for details.
- vectors_set(axis: str) -> AbstractSet[str]
-
The set of names of the vector properties for the axis in the Daf data set, not including the special name property. See the Julia documentation for details.
- get_np_vector(axis: str, name: str, *, default: None) -> ndarray | None
- get_np_vector(axis: str, name: str, *, default: bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str | Sequence[bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str] | ndarray | UndefInitializer = Undef) -> ndarray
-
Get the vector property with some name for some axis in the Daf data set. See the Julia documentation for details.
This always returns a numpy vector (unless default is None and the vector does not exist). If the stored data is numeric and dense, this is a zero-copy view of the data stored in the Daf data set. Otherwise, a Python copy of the data as a dense numpy array is returned (and cached for repeated calls). Since Python has no concept of sparse vectors (because “reasons”), you can’t zero-copy view a sparse Daf vector using the Python API.
- get_pd_vector(axis: str, name: str, *, default: None) -> Series | None
- get_pd_vector(axis: str, name: str, *, default: bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str | Sequence[bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str] | ndarray | UndefInitializer = Undef) -> Series
-
Get the vector property with some name for some axis in the Daf data set. See the Julia documentation for details.
This is a wrapper around get_np_vector which returns a pandas series using the entry names of the axis as the index.
- has_matrix(rows_axis: str, columns_axis: str, name: str, *, relayout: bool = True) -> bool
-
Check whether a matrix property with some name exists for the rows_axis and the columns_axis in the Daf data set. See the Julia documentation for details.
- matrices_set(rows_axis: str, columns_axis: str, *, relayout: bool = True) -> AbstractSet[str]
-
The names of the matrix properties for the rows_axis and columns_axis in the Daf data set. See the Julia documentation for details.
- get_np_matrix(rows_axis: str, columns_axis: str, name: str, *, default: None, relayout: bool = True) -> ndarray | csc_matrix | None
- get_np_matrix(rows_axis: str, columns_axis: str, name: str, *, default: bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str | Sequence[bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str] | ndarray | UndefInitializer = Undef, relayout: bool = True) -> ndarray | csc_matrix
-
Get the column-major matrix property with some name for some rows_axis and columns_axis in the Daf data set. See the Julia documentation for details.
This always returns a column-major numpy matrix or a scipy sparse csc_matrix (unless default is None and the matrix does not exist). If the stored data is numeric and dense, this is a zero-copy view of the data stored in the Daf data set.
Note that by default numpy matrices are in row-major (C) layout and not in column-major (Fortran) layout. To get a row-major matrix, simply flip the order of the axes, and call transpose on the result (which is an efficient zero-copy operation). This will also (zero-copy) convert the csc_matrix into a csr_matrix.
Also note that although we call this get_np_matrix, the result is not the deprecated np.matrix (which is to be avoided at all costs).
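To see why flipping the axes and transposing needs no copy, here is a small plain-Python sketch of the underlying offset arithmetic. It uses a flat list in place of a real array buffer; the helper names and the sample values are made up for illustration and are not part of dafpy or numpy.

```python
def column_major_offset(row: int, column: int, n_rows: int) -> int:
    """Offset of (row, column) in a column-major (Fortran) flat buffer."""
    return row + column * n_rows


def row_major_offset(row: int, column: int, n_columns: int) -> int:
    """Offset of (row, column) in a row-major (C) flat buffer."""
    return row * n_columns + column


n_genes, n_cells = 2, 3
# Flat buffer of a column-major gene x cell matrix: each cell's genes are adjacent.
buffer = [10, 20, 11, 21, 12, 22]

gene, cell = 1, 2
as_column_major = buffer[column_major_offset(gene, cell, n_genes)]
# The transposed (cell x gene) matrix read as row-major uses the same buffer.
as_row_major_transpose = buffer[row_major_offset(cell, gene, n_genes)]

# Every element lands on the same offset under both readings.
same_offsets = all(
    column_major_offset(g, c, n_genes) == row_major_offset(c, g, n_genes)
    for g in range(n_genes)
    for c in range(n_cells)
)
print(as_column_major, as_row_major_transpose, same_offsets)
```

Because both readings address identical offsets, a transpose only has to swap the shape and stride metadata, which is why it is described above as zero-copy.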
- get_pd_matrix(rows_axis: str, columns_axis: str, name: str, *, default: None, relayout: bool = True) -> DataFrame | None
- get_pd_matrix(rows_axis: str, columns_axis: str, name: str, *, default: bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str | Sequence[bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str] | ndarray | UndefInitializer = Undef, relayout: bool = True) -> DataFrame
-
Get the column-major matrix property with some name for some rows_axis and columns_axis in the Daf data set. See the Julia documentation for details.
This is a wrapper around get_np_matrix which returns a pandas data frame using the entry names of the axes as the indices.
Note that since pandas data frames can’t contain a sparse matrix, the data will always be in a dense numpy matrix, so take care not to invoke this for a too-large sparse data matrix.
This is not to be confused with get_pd_frame which returns a “real” pandas data frame, with arbitrary (query) columns, possibly using a different data type for each.
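A back-of-the-envelope estimate shows why densifying a large sparse matrix deserves care. The sketch below is plain arithmetic, not dafpy code; the matrix shape, the 5% density, the 4-byte values and the 4-byte indices are all illustrative assumptions.

```python
def dense_bytes(n_rows: int, n_columns: int, itemsize: int) -> int:
    """Memory for a dense matrix: one value per (row, column) pair."""
    return n_rows * n_columns * itemsize


def csc_bytes(n_columns: int, nnz: int, itemsize: int, index_itemsize: int) -> int:
    """Memory for CSC storage: values, row indices and column pointers."""
    values_and_rows = nnz * (itemsize + index_itemsize)
    column_pointers = (n_columns + 1) * index_itemsize
    return values_and_rows + column_pointers


# Assumed example: 20,000 x 100,000 float32 matrix with 5% non-zero entries.
n_rows, n_columns = 20_000, 100_000
nnz = n_rows * n_columns // 20

dense_size = dense_bytes(n_rows, n_columns, itemsize=4)
sparse_size = csc_bytes(n_columns, nnz, itemsize=4, index_itemsize=4)
print(dense_size, sparse_size)
```

Under these assumptions the dense form needs about ten times the memory of the sparse one, and the gap widens as the matrix gets sparser.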
- empty_cache(*, clear: Literal['MappedData'] | Literal['MemoryData'] | Literal['QueryData'] | None = None, keep: Literal['MappedData'] | Literal['MemoryData'] | Literal['QueryData'] | None = None) -> None
-
Clear some cached data. By default, completely empties the caches. See the Julia documentation for details.
- has_query(query: str | Axis | Lookup | Names | QuerySequence) -> bool
-
Return whether the query can be applied to the Daf data. See the Julia documentation for details.
- get_np_query(query: str | Axis | Lookup | Names | QuerySequence, *, cache: bool = True) -> bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str | ndarray | AbstractSet[str]
- get_np_query(query: None = None, *, cache: bool = True) -> PendingNumpyQuery
-
Apply the full query to the Daf data set and return the result. See the Julia documentation for details.
If the result isn’t a scalar, and isn’t an array of names, then we return a numpy array or a scipy csc_matrix.
If the query is not specified, this is intended to be used as query | daf.get_np_query(). This is useful when constructing the query in parts (e.g. Axis("cell") | Lookup("metacell") | daf.get_np_query()).
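The `query | daf.get_np_query()` form relies on Python's reflected-operator protocol. The sketch below shows one way such a pending object could work for a string query; `PendingQuery` and `fake_get_query` are hypothetical stand-ins written for this illustration, not the actual PendingNumpyQuery implementation, and the query text is a placeholder.

```python
from typing import Callable


class PendingQuery:
    """Hypothetical stand-in for a pending query object (not dafpy's class)."""

    def __init__(self, apply: Callable[[str], str]) -> None:
        self._apply = apply

    def __ror__(self, query: str) -> str:
        # `query | pending` reaches here when the left operand (e.g. a str)
        # does not implement `|` for this right operand.
        return self._apply(query)


def fake_get_query(query: str) -> str:
    """Placeholder for actually evaluating the query against a data set."""
    return f"result of <{query}>"


result = "cell-query" | PendingQuery(fake_get_query)
print(result)
```

The pending object carries the options (such as `cache`) and is only evaluated once the completed query arrives on its left-hand side.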
- get_pd_query(query: str | Axis | Lookup | Names | QuerySequence, *, cache: bool = True) -> bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str | Series | DataFrame | AbstractSet[str]
- get_pd_query(query: None = None, *, cache: bool = True) -> PendingPandasQuery
-
Similar to get_np_query, but return a pandas series or data frame for vector and matrix data.
Note that since pandas data frames can’t contain a sparse matrix, the data will always be in a dense numpy matrix, so take care not to invoke this for a too-large sparse data matrix.
If the query is not specified, this is intended to be used as query | daf.get_pd_query(). This is useful when constructing the query in parts (e.g. Axis("cell") | Lookup("metacell") | daf.get_pd_query()).
- get_pd_frame(axis: str | Axis | Lookup | Names | QuerySequence, columns: Sequence[str | Tuple[str, str]] | Mapping[str, str | Axis | Lookup | Names | QuerySequence] | None = None, *, cache: bool = False) -> DataFrame
-
Return a DataFrame containing multiple vectors of the same axis. See the Julia documentation for details.
Note this is different from get_pd_matrix which returns some 2D data as a pandas data frame. Here, each column can be the result of an arbitrary query and may have a different data type.
The order of the columns matters. Luckily, the default dictionary type is ordered in modern Python, so if you write columns = {"color": ": type => color", "age": ": batch => age"} you can trust that the color column will be first and the age column will be second.
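The ordering guarantee mentioned above is a property of the built-in dict (insertion order is preserved since Python 3.7). A quick check, using the same columns mapping as in the text; converting it to a list of pairs is shown only as an explicit alternative that matches the Sequence[Tuple[str, str]] form in the signature.

```python
# The mapping from the description above: column name -> query string.
columns = {"color": ": type => color", "age": ": batch => age"}

# Iterating a dict yields keys in insertion order.
column_order = list(columns)

# An equivalent, explicitly ordered form as a sequence of (name, query) pairs.
columns_as_pairs = list(columns.items())

print(column_order)
print(columns_as_pairs)
```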
- read_only(*, name: str | None = None) -> DafReadOnly
-
Wrap the Daf data set with a DafReadOnlyWrapper to protect it against accidental modification. See the Julia documentation for details.
- class dafpy.data.DafReadOnly(jl_obj)
-
A read-only DafReader, which doesn’t allow any modification of the data. See the Julia documentation for details.
- read_only(*, name: str | None = None) -> DafReadOnly
-
Wrap the Daf data set with a DafReadOnlyWrapper to protect it against accidental modification. See the Julia documentation for details.
- class dafpy.data.DafWriter(jl_obj)
-
Read-write access to Daf data. See the Julia documentation for details.
- set_scalar(name: str, value: bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str, *, overwrite: bool = False) -> Self
-
Set the value of a scalar property with some name in a Daf data set. See the Julia documentation for details.
Returns self for chaining.
You can force the data type in which numeric scalars are stored by using the appropriate numpy type (e.g., a np.uint8 will be stored as a UInt8).
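Returning self lets several modifications be written as one expression. The toy class below only mimics that calling pattern; `ToyWriter` is a hypothetical in-memory stand-in, and the exceptions it raises for the overwrite and must_exist flags are assumptions rather than dafpy's documented behavior.

```python
from typing import Any, Dict


class ToyWriter:
    """Hypothetical in-memory stand-in illustrating the chaining style."""

    def __init__(self) -> None:
        self.scalars: Dict[str, Any] = {}

    def set_scalar(self, name: str, value: Any, *, overwrite: bool = False) -> "ToyWriter":
        if name in self.scalars and not overwrite:
            # Assumption: replacing an existing scalar needs overwrite=True.
            raise ValueError(f"existing scalar: {name}")
        self.scalars[name] = value
        return self  # returning self is what makes chaining possible

    def delete_scalar(self, name: str, *, must_exist: bool = True) -> "ToyWriter":
        if name not in self.scalars:
            if must_exist:
                # Assumption: deleting a missing scalar is an error by default.
                raise KeyError(name)
            return self
        del self.scalars[name]
        return self


writer = (
    ToyWriter()
    .set_scalar("version", 1)
    .set_scalar("organism", "human")
    .set_scalar("version", 2, overwrite=True)
    .delete_scalar("missing", must_exist=False)
)
print(writer.scalars)
```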
- delete_scalar(name: str, *, must_exist: bool = True) -> Self
-
Delete a scalar property with some name from the Daf data set. See the Julia documentation for details.
Returns self for chaining.
- add_axis(axis: str, entries: Sequence[str] | ndarray, *, overwrite: bool = False) -> Self
-
Add a new axis to the Daf data set. See the Julia documentation for details.
Returns self for chaining.
- delete_axis(axis: str, *, must_exist: bool = True) -> Self
-
Delete an axis from the Daf data set. See the Julia documentation for details.
Returns self for chaining.
- set_vector(axis: str, name: str, value: Sequence[bool | int | float | int8 | int16 | int32 | int64 | uint8 | uint16 | uint32 | uint64 | float32 | float64 | str] | ndarray | csc_matrix | csr_matrix, *, overwrite: bool = False) -> Self
-
Set a vector property with some name for some axis in the Daf data set. See the Julia documentation for details.
If the provided value is numeric and dense, this passes a zero-copy view of the data to the Daf data set. Otherwise, a Python copy of the data is made (as a dense numpy array), and passed to Daf.
As a convenience, you can pass a 1xN or Nx1 matrix here and it will be mercifully interpreted as a vector. This allows creating sparse vectors in Daf by passing a 1xN slice of a sparse (column-major) Python matrix.
Returns self for chaining.
- empty_dense_vector(axis: str, name: str, eltype: Type, *, overwrite: bool = False) -> Iterator[ndarray]
-
Create an empty dense vector property with some name for some axis in the Daf data set, and pass it to the block to be filled. See the Julia documentation for details.
Note this is a Python contextmanager, that is, it is meant to be used with the with statement: with dset.empty_dense_vector(...) as empty_vector: ....
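The "create, fill inside the block, then finalize" pattern can be sketched with contextlib. This is a toy model of the calling convention only: `toy_empty_dense_vector` and the dict-based `store` are hypothetical, and the real method hands out a numpy array backed by the Daf data set rather than a Python list.

```python
from contextlib import contextmanager
from typing import Dict, Iterator, List

# Toy storage standing in for a data set (an assumption for this sketch).
store: Dict[str, List[float]] = {}


@contextmanager
def toy_empty_dense_vector(
    target: Dict[str, List[float]], name: str, length: int
) -> Iterator[List[float]]:
    """Hypothetical stand-in: hand out a buffer, then commit it on exit."""
    buffer = [0.0] * length
    yield buffer  # the body of the `with` block fills this in place
    # Reached only if the block finished without raising an exception.
    target[name] = buffer


with toy_empty_dense_vector(store, "age", 3) as empty_vector:
    for position in range(len(empty_vector)):
        empty_vector[position] = 10.0 * (position + 1)

print(store)
```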
- empty_sparse_vector(axis: str, name: str, eltype: Type, nnz: int, indtype: Type, *, overwrite: bool = False) -> Iterator[Tuple[ndarray, ndarray]]
-
Create an empty sparse vector property with some name for some axis in the Daf data set, and pass its parts (nzind and nzval) to the block to be filled. See the Julia documentation for details.
Note this is a Python contextmanager, that is, it is meant to be used with the with statement: with dset.empty_sparse_vector(...) as (empty_nzind, empty_nzval): .... The arrays are to be filled with Julia’s SparseVector data, that is, empty_nzind needs to be filled with 1-based indices (as opposed to 0-based indices typically used by scipy.sparse). Due to this difference in the indexing, we can’t zero-copy share sparse data between Python and Julia. Sigh.
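The 1-based requirement is easy to get wrong, so here is a small plain-Python sketch of producing nzind and nzval from a dense sequence. The helper `to_julia_sparse_vector` is a hypothetical name used only for this illustration; in real use the values would be written into the arrays handed out by the context manager.

```python
from typing import List, Sequence, Tuple


def to_julia_sparse_vector(dense: Sequence[float]) -> Tuple[List[int], List[float]]:
    """Hypothetical helper: 1-based positions and values of non-zero entries."""
    nzind = [position + 1 for position, value in enumerate(dense) if value != 0]
    nzval = [value for value in dense if value != 0]
    return nzind, nzval


dense_vector = [0.0, 3.5, 0.0, 0.0, 1.25]
nzind, nzval = to_julia_sparse_vector(dense_vector)
print(nzind, nzval)
```

Note that the second element (Python index 1) is reported as position 2, which is the form Julia expects.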
- delete_vector(axis: str, name: str, *, must_exist: bool = True) -> Self
-
Delete a vector property with some name for some axis from the Daf data set. See the Julia documentation for details.
Returns self for chaining.
- set_matrix(rows_axis: str, columns_axis: str, name: str, value: ndarray | csc_matrix, *, overwrite: bool = False, relayout: bool = True) -> Self
-
Set the matrix property with some name for some rows_axis and columns_axis in the Daf data set. See the Julia documentation for details.
Since Daf is implemented in Julia, this should be a column-major matrix, so if you have a standard numpy or scipy row-major matrix, flip the order of the axes and pass the transpose (which is an efficient zero-copy operation).
Returns self for chaining.
- empty_dense_matrix(rows_axis: str, columns_axis: str, name: str, eltype: Type, *, overwrite: bool = False) -> Iterator[ndarray]
-
Create an empty (column-major) dense matrix property with some name for some rows_axis and columns_axis in the Daf data set, and pass it to the block to be filled. See the Julia documentation for details.
Note this is a Python contextmanager, that is, it is meant to be used with the with statement: with dset.empty_dense_matrix(...) as empty_matrix: ....
- empty_sparse_matrix(rows_axis: str, columns_axis: str, name: str, eltype: Type, nnz: int, indtype: Type, *, overwrite: bool = False) -> Iterator[Tuple[ndarray, ndarray, ndarray]]
-
Create an empty (column-major) sparse matrix property with some name for some rows_axis and columns_axis in the Daf data set, and pass its parts (colptr, rowval and nzval) to the block to be filled. See the Julia documentation for details.
Note this is a Python contextmanager, that is, it is meant to be used with the with statement: with dset.empty_sparse_matrix(...) as (empty_colptr, empty_rowval, empty_nzval): .... The arrays are to be filled with Julia’s SparseMatrixCSC data, that is, empty_colptr and empty_rowval need to be filled with 1-based indices (as opposed to 0-based indices used by scipy.sparse.cs[cr]_matrix). Due to this difference in the indexing, we can’t zero-copy share sparse data between Python and Julia. Sigh.
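For a matrix, the same 1-based shift applies to both index arrays. The sketch below converts 0-based CSC arrays (named indptr and indices, as in scipy.sparse) into the 1-based colptr and rowval layout; `to_julia_csc` is a hypothetical helper and the small 3x2 matrix is made up for the example.

```python
from typing import List, Sequence, Tuple


def to_julia_csc(
    indptr: Sequence[int], indices: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: shift 0-based CSC index arrays to 1-based ones."""
    colptr = [pointer + 1 for pointer in indptr]
    rowval = [row + 1 for row in indices]
    return colptr, rowval


# 0-based CSC form of a 3x2 matrix: column 0 has non-zeros in rows 0 and 2,
# column 1 has a single non-zero in row 1.
indptr = [0, 2, 3]
indices = [0, 2, 1]
nzval = [1.0, 2.0, 3.0]  # the values themselves need no conversion

colptr, rowval = to_julia_csc(indptr, indices)
print(colptr, rowval, nzval)
```

The last colptr entry ends up as the number of stored values plus one, which is a handy sanity check before leaving the with block.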
- relayout_matrix(rows_axis: str, columns_axis: str, name: str, *, overwrite: bool = False) -> Self
-
Given that a matrix property with some name exists (in column-major layout) in the Daf data set for the rows_axis and the columns_axis, relayout it and store the row-major result as well (that is, with flipped axes). See the Julia documentation for details.
Returns self for chaining.
- delete_matrix(rows_axis: str, columns_axis: str, name: str, *, must_exist: bool = True) -> Self
-
Delete a matrix property with some name for some rows_axis and columns_axis from the Daf data set. See the Julia documentation for details.
Returns self for chaining.
- dafpy.data.CacheGroup
-
Types of cached data inside Daf. See the Julia documentation for details.
alias of Union[Literal['MappedData'], Literal['MemoryData'], Literal['QueryData']]