Setup
We start by downloading a PBMC dataset that was pre-processed using the metacells Python package.
dir.create("raw")
download.file("http://www.wisdom.weizmann.ac.il/~atanay/metac_data/PBMC_processed.tar.gz", "raw/PBMC_processed.tar.gz")
untar("raw/PBMC_processed.tar.gz", exdir = "raw")
These commands produce two files, “raw/pbmc_metacells.h5ad” and “raw/cluster-colors.csv”, which we will import into MCView in the next steps.
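Before moving on, it can help to confirm that the archive was extracted where you expect. The following is a minimal sketch in base R; check_inputs is a hypothetical helper (not part of MCView), and the file names are the ones listed above.

```r
# Hypothetical helper (not part of MCView): report which expected
# input files are present on disk.
check_inputs <- function(paths) {
    present <- file.exists(paths)
    names(present) <- paths
    present
}

# File names as listed in the text above.
expected <- c("raw/pbmc_metacells.h5ad", "raw/cluster-colors.csv")
status <- check_inputs(expected)
print(status)

# Point out anything that is missing so the import paths can be adjusted.
if (!all(status)) {
    message("Missing: ", paste(names(status)[!status], collapse = ", "))
}
```

If a file is reported missing, list.files("raw") shows what the archive actually contained.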
Create a project
The first step in running MCView is generating a project directory structure:
create_project("PBMC", title = "PBMC")
A text editor will open the PBMC/config/config.yaml file:
title: PBMC
tabs: ["QC", "Manifold", "Genes", "Markers", "Gene modules", "Diff. Expression", "Cell types","Annotate", "About"] # which tabs to show
help: false # set to true to show introjs help on start
# selected_gene1: Foxc1 # Default selected gene1
# selected_gene2: Twist1 # Default selected gene2
selected_mc1: 1 # Default selected metacell1
selected_mc2: 2 # Default selected metacell2
The configuration file is generated from the create_project parameters, but we can edit it if we want, or add parameters per dataset. See the architecture vignette for a full description of the config parameters.
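The file can also be edited from R rather than in the text editor. The sketch below is an illustration only: to stay self-contained it works on a temporary copy that reproduces a few of the lines shown above, whereas in a real project the path would be PBMC/config/config.yaml. It switches the help option on with plain text substitution, which leaves the comments intact.

```r
# Work on a temporary copy so the sketch does not touch a real project.
config_path <- file.path(tempdir(), "config.yaml")
writeLines(c(
    "title: PBMC",
    "help: false # set to true to show introjs help on start",
    "selected_mc1: 1 # Default selected metacell1",
    "selected_mc2: 2 # Default selected metacell2"
), config_path)

# Flip the 'help' option from false to true, leaving other lines as-is.
config_lines <- readLines(config_path)
config_lines <- sub("^help: false", "help: true", config_lines)
writeLines(config_lines, config_path)

# Show the edited file.
cat(readLines(config_path), sep = "\n")
```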
Import
Next, we import the PBMC dataset into the project we created. A project can contain multiple datasets, and you can switch between them from the right sidebar.
The import pre-processes the metacell dataset so that it can be viewed in the Shiny app:
import_dataset(
    project = "PBMC",
    dataset = "PBMC",
    anndata_file = "raw/metacells.h5ad",
    cell_type_field = "type"
)
The most important parameter is anndata_file, which points to the h5ad file we downloaded. We also specified cell_type_field, the field in the h5ad object that holds pre-computed cluster assignments per metacell (here, the ‘type’ field).
Note that the import might take a few minutes, depending (mostly) on the number of metacells. If it takes too long, set calc_gg_cor to FALSE to skip calculating the correlation between all genes. This saves a significant amount of import time but makes this feature unavailable in the app.
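As a sketch, the faster import might look like the following. The arguments are the same as in the import call shown above, with calc_gg_cor = FALSE added. The list and the guard around the call are only there so that the snippet does nothing when MCView, the project directory, or the input file is not available.

```r
# Same arguments as the import call above, plus calc_gg_cor = FALSE
# to skip the gene-gene correlation step.
import_args <- list(
    project = "PBMC",
    dataset = "PBMC",
    anndata_file = "raw/metacells.h5ad",
    cell_type_field = "type",
    calc_gg_cor = FALSE
)

# Guard: only run the import when MCView is installed and the inputs exist.
can_import <- requireNamespace("MCView", quietly = TRUE) &&
    dir.exists(import_args$project) &&
    file.exists(import_args$anndata_file)

if (can_import) {
    do.call(MCView::import_dataset, import_args)
}
```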
In addition, some features are only available if you ran compute_for_mcview in the metacells Python package, so remember to run it before importing to MCView.
MCView also supports importing datasets from the old R metacell package; see import_dataset_metacell1 for details.
Run the app
run_app(project = "PBMC", launch.browser = TRUE)
A browser window will open with the app.
You can also specify the port or host, or choose not to launch a browser window, e.g.:
run_app(project = "PBMC", port = 5555, host = "127.0.0.1", launch.browser = FALSE)
Update annotations
After working a bit on the initial metacell model, we usually want to update the default dataset annotations with the ones we created using MCView. This can be done by:
- Pressing the “export” button in the upper left of the “Annotate” screen and saving the file.
- Running:
update_metacell_types("PBMC", "PBMC", "/path/to/metacell_types_file")
where “/path/to/metacell_types_file” is the path of the exported file and the second argument is the dataset name used at import.
You can now rerun the app and the types/colors will be updated.
If you only want to update the cell type colors, you can run:
update_cell_type_colors("PBMC", "PBMC", "/path/to/cell_type_colors_file")
Metadata
You can load metadata fields for each metacell using the metadata parameter of the import_dataset command:
import_dataset(
    project = "PBMC",
    dataset = "PBMC",
    anndata_file = "raw/metacells.h5ad",
    metadata = "raw/metadata.csv"
)
The format is a data frame (or the name of a delimited file) with a column named metacell and the annotation fields. Metadata fields can be either numeric or categorical.
You can use the metadata_colors parameter to set the breaks and colors for each numeric metadata field, and the color of each category for categorical fields.
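To make the expected shape concrete, here is a minimal sketch of a metadata table with one numeric and one categorical field. Only the metacell column name comes from the description above. The other column names (mean_umis, batch), the values, and the integer metacell identifiers are made up for illustration, and the file is written to a temporary directory rather than to raw/metadata.csv.

```r
# Hypothetical per-metacell metadata: one row per metacell.
metadata <- data.frame(
    metacell = c(1, 2, 3),                  # required column
    mean_umis = c(1520.5, 980.2, 2210.0),   # numeric field (made-up values)
    batch = c("A", "B", "A"),               # categorical field (made-up values)
    stringsAsFactors = FALSE
)

# Write it as a comma-delimited file, which could then be passed
# as the 'metadata' argument instead of the data frame itself.
metadata_file <- file.path(tempdir(), "metadata.csv")
write.csv(metadata, metadata_file, row.names = FALSE)

# Read it back to check that the column types survive the round trip.
round_trip <- read.csv(metadata_file, stringsAsFactors = FALSE)
str(round_trip)
```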
If you want to change the metadata fields or colors after the import, you can use the update_metadata and update_metadata_colors functions:
update_metadata(
    project = "PBMC",
    dataset = "PBMC",
    metadata = "new_metadata.csv"
)
update_metadata_colors(
    project = "PBMC",
    dataset = "PBMC",
    metadata_colors = new_metadata_colors
)
You can generate metadata per metacell from cell metadata using the cell_metadata_to_metacell and cell_metadata_to_metacell_from_h5ad functions.
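The idea behind this conversion can be illustrated in base R. The sketch below is not the MCView implementation and does not call the functions named above. It only shows one way a numeric per-cell field could be summarized per metacell, here with the mean. The cell table, its column names, and its values are made up.

```r
# Hypothetical per-cell metadata with a cell-to-metacell assignment.
cells <- data.frame(
    cell_id = paste0("cell", 1:6),
    metacell = c(1, 1, 2, 2, 2, 3),
    n_umis = c(1000, 1200, 800, 900, 1000, 2000)
)

# Summarize the numeric field per metacell using the mean.
metacell_md <- aggregate(n_umis ~ metacell, data = cells, FUN = mean)
print(metacell_md)
```

The result has one row per metacell and a column named metacell, which matches the metadata format described earlier.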
Deploy
Create a deployment-ready bundle by running:
create_bundle(project = "PBMC", path = getwd(), name = "PBMC_bundle")
You can then upload the bundle to shinyapps.io (for example, with rsconnect::deployApp), or to any shiny-server hosting service by uploading the “PBMC_bundle” directory to the service.
Note that you might need to configure your hosting service to allow more memory than the default: MCView keeps the metacell matrix in memory and therefore needs around 1GB of RAM for small datasets such as PBMC, and up to 2-4GB for large datasets such as MOCA.
Docker
See the docker vignette for instructions on using the docker image.