Import a dataset to an MCView project from the metacell R package
Source: R/import_metacell1.R
import_dataset_metacell1.Rd
Read objects from the metacell R package and import a metacell dataset into MCView.
Usage
import_dataset_metacell1(
    project,
    dataset,
    scdb,
    matrix,
    mc,
    mc2d,
    metacell_types_file,
    cell_type_colors_file,
    gene_modules_file = NULL,
    gene_modules_k = NULL,
    calc_gg_cor = TRUE,
    network = NULL,
    time_annotation_file = NULL,
    time_bin_field = NULL,
    metadata_fields = NULL,
    categorical = c(),
    ...
)
Arguments
- project
path to the project
- dataset
name for the dataset, e.g. "PBMC". The name of the dataset can only contain alphanumeric characters, dots, dashes and underscores.
- scdb
path to R metacell single cell RNA database
- matrix
name of the umi matrix to use
- mc
name of the metacell object to use
- mc2d
name of the 2d projection object to use
- metacell_types_file
path to a tabular file (csv, tsv) with the cell type assignment for each metacell. The file should have a column named "metacell" with the metacell ids and another column named "cell_type" or "cluster" with the cell type assignment. Metacell ids that do not exist in the data would be ignored. In addition, the file can have a column named "age" or "mc_age" with age metadata per metacell.
- cell_type_colors_file
path to a tabular file (csv, tsv) with the color assignment for each cell type. The file should have a column named "cell_type" or "cluster" with the cell types and another column named "color" with the color assignment. Cell types that do not exist in the metacell types would be ignored.
- gene_modules_file
path to a tabular file (csv,tsv) with assignment of genes to gene modules. Should have a field named "gene" with the gene name and a field named "module" with the name of the gene module.
- gene_modules_k
number of clusters for the initial gene module calculation. If NULL, the number of clusters would be determined such that a gene module would contain 16 genes on average.
- calc_gg_cor
Calculate the top 30 correlated and anti-correlated genes for each gene. This computation can be heavy for large datasets or weaker machines, so you can set calc_gg_cor=FALSE to skip it. Note that this feature would then be missing from the app.
- network
name of the network object to use (optional)
- time_annotation_file
file with names for time bins (optional, only relevant with networks/flows). Should have a field named "time_bin" with the time bin id and another field named "time_desc" which contains the description of the time bin
- time_bin_field
name of a field in cell_metadata which contains the time bin per cell (optional)
- metadata_fields
names of fields in mat@cell_metadata which contain metadata per cell to be summarized using cell_metadata_to_metacell. The fields can be either numeric or categorical. You can use cell_metadata_to_metacell to convert from categorical to a numeric score (e.g. by using the fraction of the category).
- categorical
metadata fields that should be treated as categorical (optional)
- ...
Arguments passed on to create_project
title
The title of the app. This would be shown on the top left of the screen.
tabs
Controls which tabs to show in the left sidebar and their order. Options are: "QC", "Projection-QC", "Manifold", "Genes", "Query", "Atlas", "Markers", "Gene modules", "Projected-fold", "Diff. Expression", "Cell types", "Flow", "Annotate", "About". When NULL, default tabs would be set. For projects with atlas projections, please set atlas to TRUE.
help
Controls whether to start the app with a help modal (from introjs). Help messages can be edited in the help.yaml file (see the 'Architecture' vignette).
selected_gene1, selected_gene2
The default genes that would be selected (in any screen with gene selection). If this parameter is missing, the 2 genes with highest max(expr)-min(expr) in the first dataset would be chosen.
selected_mc1, selected_mc2
The default metacells that would be selected in the Diff. Expression tab.
datasets
A named list with additional per-dataset parameters. Current parameters include default visualization properties of projection and scatter plots.
other_params
Named list of additional parameters such as projection_point_size, projection_point_stroke, scatters_point_size and scatters_stroke_size
edit_config
open file editor for config file editing
atlas
use default configuration for atlas projections (relevant only when tabs is NULL)
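As an illustration of the gene_modules_file format described in the arguments above, the following base R sketch writes a table with a "gene" field and a "module" field. The gene names, module names, and file path are hypothetical placeholders and not part of any real dataset.

```r
# Hypothetical gene-to-module assignment, for illustration only
gene_modules <- data.frame(
    gene = c("Sox2", "Pou5f1", "Tbx6", "Mesp1"),
    module = c("pluripotency", "pluripotency", "mesoderm", "mesoderm")
)

# Write the table as a csv file (a temporary path is used as a placeholder)
gene_modules_file <- file.path(tempdir(), "gene-modules.csv")
write.csv(gene_modules, gene_modules_file, row.names = FALSE)
```

The resulting path could then be passed as gene_modules_file.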
Details
The result would be a directory under project/cache/dataset which would contain objects used by the MCView shiny app (such as the metacell matrix).
In addition, you can supply a file with the type assignment for each metacell (metacell_types_file) and a file with the color assignment for each metacell type (cell_type_colors_file).
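A minimal sketch of these two input files, assuming the column names given in the argument descriptions ("metacell", "cell_type", "color"). The metacell ids, cell type names, colors, and file paths below are made-up placeholders.

```r
# Hypothetical metacell ids and cell types, for illustration only
metacell_types <- data.frame(
    metacell = c(1, 2, 3),
    cell_type = c("Epiblast", "Epiblast", "Mesoderm")
)

# Hypothetical colors, one row per cell type
cell_type_colors <- data.frame(
    cell_type = c("Epiblast", "Mesoderm"),
    color = c("#635547", "#C594BF")
)

# Write both tables as csv files (temporary paths are used as placeholders)
metacell_types_file <- file.path(tempdir(), "metacell-types.csv")
cell_type_colors_file <- file.path(tempdir(), "cell-type-colors.csv")
write.csv(metacell_types, metacell_types_file, row.names = FALSE)
write.csv(cell_type_colors, cell_type_colors_file, row.names = FALSE)
```

Every cell type in the types table has a matching row in the colors table here; according to the argument descriptions, color rows for cell types that do not appear in the metacell types would be ignored.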
Make sure that you have the R metacell package installed in order to use this function.
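One way to check for the package before importing is sketched below, using base R's requireNamespace(). The variable name has_metacell and the wording of the message are illustrative choices, not part of MCView.

```r
# Check whether the metacell package is available without attaching it
has_metacell <- requireNamespace("metacell", quietly = TRUE)

if (!has_metacell) {
    # Only report the problem here; the import itself would fail later
    message("The 'metacell' package is not installed; install it before calling import_dataset_metacell1().")
}
```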
network, time_annotation_file and time_bin_field are only relevant if you computed flows/networks for your dataset, and are therefore optional.
In order to add time annotation to your dataset you will have to:
1. Add a column named "mc_age" or "age" to metacell_types_file with the time per metacell
2. Create a time_annotation_file with an id for each time bin and a description
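The two steps above can be sketched in base R as follows. The field names ("mc_age", "time_bin", "time_desc") come from the argument descriptions; the ages, time bin ids, descriptions, and file paths are hypothetical placeholders.

```r
# Step 1: hypothetical metacell types table with a per-metacell time column
metacell_types <- data.frame(
    metacell = c(1, 2, 3),
    cell_type = c("Epiblast", "Epiblast", "Mesoderm"),
    mc_age = c(1.2, 1.8, 2.5)
)

# Step 2: hypothetical time bins with an id and a description
time_annotation <- data.frame(
    time_bin = c(1, 2, 3),
    time_desc = c("E6.5", "E7.0", "E7.5")
)

# Write the files (temporary paths are used as placeholders)
types_path <- file.path(tempdir(), "metacell-types.csv")
time_path <- file.path(tempdir(), "time-annot.tsv")
write.csv(metacell_types, types_path, row.names = FALSE)
write.table(time_annotation, time_path, sep = "\t", quote = FALSE, row.names = FALSE)
```

The resulting paths could then be passed as metacell_types_file and time_annotation_file.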
Examples
if (FALSE) { # \dontrun{
    import_dataset_metacell1(
        "embflow",
        "153embs",
        scdb = "raw/scrna_db",
        matrix = "embs",
        mc = "embs",
        mc2d = "embs",
        metacell_types_file = "raw/metacell-types.csv",
        cell_type_colors_file = "raw/cell-type-colors.csv",
        network = "embs",
        time_annotation_file = "raw/time-annot.tsv",
        time_bin_field = "age_group"
    )
} # }