flowMagic

Welcome to the flowMagic github repository!

flowMagic is a tool for the automated gating of bivariate flow cytometry (FCM) data. The flowMagic algorithm is the first algorithm trained on both expert-generated data and a large dataset of gated bivariate FCM plots annotated by players of EVE Online, the online role-playing game developed by CCP Games.


Overview

Flow cytometry (FCM) is a technology widely used in immunology and cell biology to identify and quantify heterogeneous cell populations across diverse applications. The prevailing analysis approach, termed gating, relies on expert manual analysis of hierarchical bivariate plots. This approach is subjective and time-consuming, and it generates inconsistent results. To generate a large, high-quality training dataset in a relatively short amount of time, we recruited the players of EVE Online to gate millions of bivariate FCM plots within the mini-game Project Discovery. The resulting flowMagic algorithm was trained on this large and diverse dataset. By combining citizen science and machine learning, flowMagic aims to support more robust, generalizable, and automated FCM analysis tools.

The flowMagic framework includes several key functionalities:

  • Rich visualization tools for intuitive exploration of bivariate FCM data. The flowMagic framework works seamlessly with the flowCore/flowWorkspace ecosystem.

  • Interactive manual gating for user-driven analysis, offering a free alternative to commercial platforms like FlowJo and FCS Express, with click-and-draw manual gate functionality.

  • Automated gating with machine learning, offering two models:

    1. Template model: trained on expert-generated data provided by the user based on the panel to analyze. The interactive manual gating functionality can be used to generate these templates.
    2. Generalized model: trained on data generated by video game players, requiring no template data from the user.
  • Improved navigation utilities for working with flowCore and flowWorkspace objects.

All of these functionalities are actively maintained and continuously updated to improve performance, usability and compatibility with the broader FCM analysis ecosystem.

Video: automated gating using the template ML model

Demo

Download Full Introduction Video

Automated gating using the generalized ML model

The generalized approach is composed of two models: Model A (optional) predicts the number of gates, and Model B predicts the gate boundaries based on either the number of gates indicated by the user or the number of gates predicted by Model A. To use this approach, download the R objects of these two models from the Federated Research Data Repository (FRDR) at this link: https://www.frdr-dfdr.ca/repo/dataset/abd523c1-7530-40b7-8807-2e4a4a3e7a5e

Download the ALL_GP_models.tar.gz file and extract it. The archive contains two folders: one for the models trained on curated data and one for the models trained on the full data (the folder with the consensus suffix). Go to the curated data folder (the one without the consensus suffix) and locate the following files:

  1. Model A: training_rf_index_3000train10val_2ntree_500points_100folds_31000_consensus_plots_pred_n_gates.RData in the models_trained_to_predict_n_gates_final folder.

  2. Model B: list_models_all_n_gates.RData in the models_trained_to_predict_classes_final folder.

Once the files are downloaded, follow the flowMagic manual (PDF) to learn how to use them.
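The extraction step and the two file names above can be sketched in a terminal as follows. This is a minimal sketch, not part of flowMagic: the function name `find_flowmagic_models` and the destination folder are hypothetical, and the exact names of the top-level folders inside the archive are not assumed. The function lists every match, so pick the paths located in the curated data folder.

```shell
# Hypothetical helper (not part of flowMagic): extract the model archive
# and print the paths of the Model A and Model B files it contains.
# $1 = path to ALL_GP_models.tar.gz, $2 = destination folder
find_flowmagic_models() {
  mkdir -p "$2" || return 1
  # Unpack the gzip-compressed tar archive into the destination folder
  tar -xzf "$1" -C "$2" || return 1
  # Search by the exact file names given in this README
  find "$2" -type f \( \
    -name 'training_rf_index_3000train10val_2ntree_500points_100folds_31000_consensus_plots_pred_n_gates.RData' \
    -o -name 'list_models_all_n_gates.RData' \)
}

# Example call (uncomment after downloading the archive):
# find_flowmagic_models ALL_GP_models.tar.gz ./flowmagic_models
```

The printed paths can then be passed to the loading steps described in the flowMagic manual.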

Note that the generalized approach works only with abundant populations that have a single density peak, as shown below:

Example Pop

For more complex populations (e.g., populations with multiple peaks or rare populations), please use the Template model.

Documentation

The full technical documentation for the flowMagic R package can be found inside the pk_manual directory above. Click on the red PDF badge named flowMagic vignette to download and view the flowMagic introduction (vignette). Click on the red PDF badge named flowMagic manual to download a guide to all flowMagic functions and arguments.

Installation

The recommended approach is to use the Docker container, which already includes the packages and system dependencies needed to run flowMagic.

Requirements

Before using the launcher, please install Docker.

  • Windows/macOS/Linux: Docker Desktop is recommended for most users.
  • Linux advanced users: Docker Engine can also be used instead of Docker Desktop.
  • Minimum 8 GB of free memory; more is recommended for large datasets.
  • A modern web browser for accessing RStudio Server.

Download Docker Desktop here:

Download Docker Desktop

Linux users who prefer Docker Engine can use the official installation instructions:

Install Docker Engine

Using the launcher and Docker image (recommended)

The easiest way to use flowMagic is through the launcher scripts provided in the GitHub Releases section of this repository:

flowMagic releases

The launchers use the following Docker image:

ghcr.io/semontante/flowmagic_docker:latest

When launched, RStudio Server will be available locally in your browser at:

http://localhost:8787

Login details:

Username: rstudio
Password: 1234

Windows

Download the Windows launcher from the release page:

launch_flowmagic_windows.bat

Then double-click the file.

The launcher will:

  1. Ask you to select a folder containing your data.
  2. Download the latest flowMagic Docker image if needed.
  3. Start the Docker container.
  4. Open RStudio Server in your default browser.

Your selected folder will be available inside RStudio at:

/home/rstudio/main

Keep the launcher window open while using RStudio.

When you are finished, press any key in the launcher window to stop and remove the running container.

macOS and Linux

Download the macOS/Linux launcher from the release page:

launch_flowmagic_mac_linux.sh

Open a terminal in the folder where you downloaded the file and run:

chmod +x launch_flowmagic_mac_linux.sh
./launch_flowmagic_mac_linux.sh

The launcher will:

  1. Ask you to select a folder containing your data.
  2. Download the latest flowMagic Docker image if needed.
  3. Start the Docker container.
  4. Open RStudio Server in your default browser.

Your selected folder will be available inside RStudio at:

/home/rstudio/main

Keep the terminal window open while using RStudio.

When you are finished, press Enter in the terminal to stop and remove the running container.

First run

The first time you run the launcher, Docker will need to download the flowMagic image.

The image may take several minutes to download depending on your internet connection.

Subsequent runs should be faster because Docker stores the image locally.

Manual Docker command

If you prefer not to use the launcher, you can run the Docker image manually.

Replace /path/to/your/data with the folder containing your data.

docker run -d \
  --pull always \
  --name flowmagic_rstudio \
  -p 8787:8787 \
  -e PASSWORD=1234 \
  -v /path/to/your/data:/home/rstudio/main \
  ghcr.io/semontante/flowmagic_docker:latest

Then open:

http://localhost:8787

Login with:

Username: rstudio
Password: 1234

To stop and remove the container:

docker stop flowmagic_rstudio
docker rm flowmagic_rstudio

Troubleshooting

Docker is not running

Make sure Docker Desktop or Docker Engine is installed and running before using the launcher.
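One way to check this from a terminal is sketched below. It relies on the standard `docker info` command, which returns an error when the Docker daemon cannot be reached. The function name `docker_status` is a hypothetical helper for illustration and is not part of flowMagic or its launchers.

```shell
# Hypothetical helper: report whether Docker is installed and running.
docker_status() {
  if ! command -v docker >/dev/null 2>&1; then
    # The docker client is not on the PATH
    echo "not-installed"
  elif docker info >/dev/null 2>&1; then
    # The client could reach the Docker daemon
    echo "running"
  else
    # The client exists but the daemon did not respond
    echo "not-running"
  fi
}

docker_status
```

If the result is `not-running`, start Docker Desktop (or the Docker Engine service) and try the launcher again.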

Port 8787 is already in use

RStudio Server uses port 8787.

If another program is already using this port, the launcher may fail. Close the other program or edit the launcher script and change the host port.
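As a minimal sketch of the second option, the manual Docker command from this README can be run with a different host port while the container port stays at 8787. The helper name `flowmagic_run_cmd` and the example port 8788 are assumptions for illustration; the launcher scripts themselves may be organized differently.

```shell
# Hypothetical helper: print the manual docker command with a chosen host port.
# $1 = host port (default 8787), $2 = data folder (default placeholder path)
# Note: the printed command is not quoted, so a data folder containing
# spaces would need quoting before the command is run.
flowmagic_run_cmd() {
  host_port="${1:-8787}"
  data_dir="${2:-/path/to/your/data}"
  # Only the host side of -p changes; RStudio Server still listens on 8787
  # inside the container.
  printf 'docker run -d --pull always --name flowmagic_rstudio -p %s:8787 -e PASSWORD=1234 -v %s:/home/rstudio/main ghcr.io/semontante/flowmagic_docker:latest\n' \
    "$host_port" "$data_dir"
}

# Print the command for host port 8788
flowmagic_run_cmd 8788
```

After running the printed command, RStudio Server would be reachable at http://localhost:8788 instead of http://localhost:8787.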

The browser does not open automatically

Open your browser manually and go to:

http://localhost:8787

Login does not work

Use:

Username: rstudio
Password: 1234

The image download is slow

The Docker image may take several minutes to download the first time.

Subsequent runs should be faster because Docker stores the image locally.

Using direct installation

The Docker container is the recommended option because it already includes the required R packages and system dependencies.

Direct installation is mainly recommended for users who are comfortable managing R packages, Bioconductor packages, and system libraries themselves.

Install from GitHub

First, install devtools if it is not already installed:

install.packages("devtools")

Then install flowMagic from GitHub:

library(devtools)
install_github("semontante/flowMagic")

Alternatively, you can use remotes:

install.packages("remotes")
remotes::install_github("semontante/flowMagic")

Test script

The script below can be used to verify that the flowMagic package is installed correctly.

# load libraries
library(sp) 
library(stringr)
library(ggplot2)
library(parallel) 
library(doParallel)
library(randomForest) 
library(caret)
library(concaveman)
library(sm)
library(pracma)
library(sf)
library(stats)
library(grDevices)
library(flowMagic)
#------------ using template model with 1 template
# get path to directory with files to analyze
path_dir<-system.file("extdata/csv_files",package = "flowMagic")
# import data with labels that we use as template data.
list_data_ref<-import_reference_csv(path_results = path_dir,n_cores = 1)
# import data without labels
list_test_data<-import_test_set_csv(path_data = path_dir,n_cores = 1)
# Note that it is possible to provide also directly the paths to each file. 
# See functions manual for additional details.
# data preprocessing to generate the template model using first file as template
# we select first element of the imported list of dataframes
ref_train<-get_train_data(paths_file = list_data_ref[1],n_cores = 1) 
# generate template model using out-of-bag validation
ref_model_info<-magicTrain(df_train = ref_train,n_cores = 1,train_model = "rf")
# perform automated gating (gate boundaries prediction step)
# n_cores = 8 runs the prediction in parallel; lower it on machines with fewer cores
list_dfs_pred<-magicPred_all(list_test_data = list_test_data,magic_model = NULL,ref_data_train = ref_train,
                             ref_model_info = ref_model_info,n_cores = 8)
# Note that providing the training set is optional (ref_data_train = ref_train is optional).
# Providing the training set allows the user to calculate the target-template distance for each plot to analyze.
# list_dfs_pred contains a list of dataframes for each plot analyzed. In other words, it is a nested list (e.g.,
# downsampled dataset and original dataset with predicted labels for each plot). 
# See the functions manual for the full list of dataframes returned.
# visualize gated data
df_temp<-list_dfs_pred[[1]]$df_test_original # dataframe of first gated plot
magicPlot(df = df_temp,type = "ML",size_points = 1)
magicPlot(df = df_temp,type = "dens",size_points = 1)

License

This R package is licensed under the Apache License 2.0. See the LICENSE and NOTICE files for more information.
