Setup

We start by downloading a PBMC dataset that was pre-processed with the metacells Python package.

dir.create("raw")
download.file("http://www.wisdom.weizmann.ac.il/~atanay/metac_data/PBMC_processed.tar.gz", "raw/PBMC_processed.tar.gz")
untar("raw/PBMC_processed.tar.gz", exdir = "raw")

These commands produce two files, “raw/pbmc_metacells.h5ad” and “raw/cluster-colors.csv”, which we import into MCView in the next steps.
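Before moving on, it can help to confirm what the archive actually contained, since the paths passed to MCView later must match the extracted file names. A minimal sketch in base R (the helper name find_h5ad is hypothetical and not part of MCView):

```r
# Hypothetical helper: list the .h5ad files found under a directory.
find_h5ad <- function(dir) {
    files <- list.files(dir, recursive = TRUE, full.names = TRUE)
    files[grepl("\\.h5ad$", files)]
}

# Show everything that was extracted, then only the h5ad files.
if (dir.exists("raw")) {
    print(list.files("raw", recursive = TRUE))
    print(find_h5ad("raw"))
}
```

If the printed h5ad path differs from the one used in the import calls, adjust the anndata_file argument accordingly.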

Create a project

The first step in running MCView is generating a project directory structure:

library(MCView)
create_project("PBMC", title = "PBMC")

A text editor opens with the PBMC/config/config.yaml file:

title: PBMC
tabs: ["QC", "Manifold", "Genes", "Markers", "Gene modules", "Diff. Expression", "Cell types","Annotate", "About"] # which tabs to show
help: false # set to true to show introjs help on start
# selected_gene1: Foxc1 # Default selected gene1
# selected_gene2: Twist1 # Default selected gene2
selected_mc1: 1 # Default selected metacell1 
selected_mc2: 2 # Default selected metacell2

The configuration file is generated from the create_project parameters, but we can edit it or add per-dataset parameters. See the architecture vignette for a full description of the config parameters.
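The file can also be edited from R rather than in a text editor. The sketch below assumes the yaml package is installed and that config.yaml has the flat structure shown above; set_config_value is a hypothetical helper, not an MCView function. Rewriting the file this way drops its comments and may change its formatting.

```r
library(yaml)

# Hypothetical helper: change one top-level key in a YAML config file.
# Comments in the original file are not preserved.
set_config_value <- function(config_file, key, value) {
    config <- yaml::read_yaml(config_file)
    config[[key]] <- value
    yaml::write_yaml(config, config_file)
    invisible(config)
}

config_file <- file.path("PBMC", "config", "config.yaml")
if (file.exists(config_file)) {
    # Show the introjs help on start (the 'help' key from the config file).
    set_config_value(config_file, "help", TRUE)
}
```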

Import

Next, we import the PBMC dataset into the project we created. A project can contain multiple datasets, and you can switch between them from the right sidebar.

The following call pre-processes the metacell dataset so that it can be viewed in the Shiny app:

import_dataset(
    project = "PBMC",
    dataset = "PBMC",
    anndata_file = "raw/metacells.h5ad",
    cell_type_field = "type"
)

The most important parameter is anndata_file, which points to the h5ad file we downloaded.

We also specified cell_type_field, the field in the h5ad object that holds pre-computed cluster assignments per metacell (here, ‘type’).

Note that the import may take a few minutes, depending mostly on the number of metacells. If it takes too long, set calc_gg_cor to FALSE to skip calculating the correlation between all genes. This saves a significant amount of import time but makes the gene correlation feature unavailable in the app.
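For example, the import call with the gene-gene correlation step disabled might look like the following sketch. It assumes calc_gg_cor is passed directly to import_dataset, as the note suggests, and it guards the call so that it only runs when MCView is installed and the input file exists:

```r
# Arguments for the import; calc_gg_cor = FALSE skips the gene-gene
# correlation step described in the note.
import_args <- list(
    project = "PBMC",
    dataset = "PBMC",
    anndata_file = "raw/metacells.h5ad",
    cell_type_field = "type",
    calc_gg_cor = FALSE
)

# Only run the import when MCView is installed and the file is present.
if (requireNamespace("MCView", quietly = TRUE) &&
    file.exists(import_args$anndata_file)) {
    do.call(MCView::import_dataset, import_args)
}
```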

In addition, some features are only available if you ran compute_for_mcview in the metacells Python package, so remember to run it before importing to MCView.

MCView also supports importing datasets from the old R metacell package; see import_dataset_metacell1 for details.

Run the app

run_app(project = "PBMC", launch.browser = TRUE)

A browser window opens with the app.

You can also specify the port or host, or choose not to launch a browser window, e.g.:

run_app(project = "PBMC", port = 5555, host = "127.0.0.1", launch.browser = FALSE)

Update annotations

After working on the initial metacell model for a while, we usually want to update the default dataset annotations with the ones we created in MCView. This can be done by:

  1. Pressing the “export” button in the upper left of the “Annotate” screen and saving the file.
  2. Running:
update_metacell_types("PBMC", "PBMC", "/path/to/metacell_types_file")

Here, “/path/to/metacell_types_file” is the path to the exported file.

You can now rerun the app, and the types and colors will be updated.

If you only want to update the cell type colors, you can run:

update_cell_type_colors("PBMC", "PBMC", "/path/to/cell_type_colors_file")

Metadata

You can load metadata fields for each metacell using the metadata parameter of the import_dataset command:

import_dataset(
    project = "PBMC",
    dataset = "PBMC",
    anndata_file = "raw/metacells.h5ad",
    metadata = "raw/metadata.csv"
)

The metadata should be a data frame (or the name of a delimited file) with a column named metacell plus the annotation fields. Metadata fields can be either numeric or categorical.
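As an illustration of that layout, the sketch below builds a small metadata table with one numeric and one categorical field and writes it to a comma-delimited file. The metacell IDs, field names and values are hypothetical and only show the expected shape; real IDs must match the metacells in the imported dataset.

```r
# Hypothetical metacell IDs and fields, for illustration only.
metadata <- data.frame(
    metacell = c("0", "1", "2"),
    mean_age = c(34.5, 41.2, 29.8),           # numeric field
    batch = c("batch1", "batch2", "batch1"),  # categorical field
    stringsAsFactors = FALSE
)

# Write a comma-delimited file that could be passed as the metadata parameter.
metadata_file <- file.path(tempdir(), "metadata.csv")
write.csv(metadata, metadata_file, row.names = FALSE)
```

Either the data frame itself or the path in metadata_file could then be supplied as the metadata argument.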

You can use the metadata_colors parameter to set the breaks and colors for each numeric metadata field, and the color of each category for categorical fields.

If you want to change the metadata fields or colors after the import, you can use the update_metadata and update_metadata_colors functions:

update_metadata(
    project = "PBMC",
    dataset = "PBMC",
    metadata = "new_metadata.csv"
)
update_metadata_colors(
    project = "PBMC",
    dataset = "PBMC",
    metadata_colors = new_metadata_colors
)

You can generate metadata per metacell from cell metadata using the cell_metadata_to_metacell and cell_metadata_to_metacell_from_h5ad functions.
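To illustrate the idea, the following base R sketch summarises a per-cell table into one row per metacell by averaging a numeric field. It is not the MCView implementation: cell_metadata_to_metacell may differ in its arguments and summary function, and the column names and values here are hypothetical.

```r
# Hypothetical per-cell metadata: each cell is assigned to a metacell.
cell_metadata <- data.frame(
    cell_id = paste0("cell", 1:6),
    metacell = c("0", "0", "1", "1", "1", "2"),
    n_umi = c(1200, 1800, 900, 1100, 1000, 2500)
)

# Average the numeric field over the cells of each metacell.
metacell_metadata <- aggregate(n_umi ~ metacell, data = cell_metadata, FUN = mean)
print(metacell_metadata)
```

The result has a metacell column plus one summarised field, which is the shape the metadata parameter expects.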

Deploy

Create a deployment-ready bundle by running:

create_bundle(project = "PBMC", path = getwd(), name = "PBMC_bundle")

You can then upload the bundle to shinyapps.io by running:

rsconnect::deployApp(appDir = file.path(getwd(), "PBMC_bundle"))

Alternatively, upload the “PBMC_bundle” directory to any shiny-server hosting service.

Note that you might need to configure your hosting service to allow more memory than the default. MCView keeps the metacell matrix in memory and therefore needs around 1GB of RAM for small datasets such as PBMC, and up to 2-4GB for large datasets such as MOCA.

Docker

See the docker vignette for instructions on using the docker image.