
10 Create Your Own Projects

How to create, save, and open new MOG projects. This section describes the required format for MOG input data and the MOG GUI elements used to create projects and interact with the project data.

10.1 Input format

Figure 23: MOG input data format.

After starting MOG, users can load their own data and metadata. The data and metadata must be in two separate delimited text files. The supported delimiter characters are tab, comma, semicolon and space. Both files must follow the layout shown in Figure 23.

The data file is a matrix containing the measured expression values of the features (rows) over the samples (columns). For example: expression of genes (each gene is a feature) over a number of microarray studies, or transcript counts (each transcript is a feature) over multiple RNA-Seq runs. The first I columns of the file can be feature metadata columns (Figure 23), which hold information about the features. For genes these might be: name, phylostratum, description, protein encoded, tertiary structure of the encoded protein, gene type, etc. The first column is a unique ID for each row (in Figure 23, the gene symbol). The remaining S columns contain the expression values of the F features over the S samples.

The metadata file is a delimited text file with samples as rows and metadata attributes as columns, that is, a matrix of S rows by M columns (Figure 23). Each row in the sample metadata file corresponds to a sample in the expression data file. The M columns are the metadata attributes of each experimental analysis (e.g., run for RNA-Seq data, chip for microarray data). One column of the metadata file contains the unique sample IDs that link to the columns in the data matrix; MOG refers to this column as the sample ID column. If samples are missing from the metadata file, MOG handles the missing information by producing an empty metadata row.
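As an illustration of the layout described above, the following Python sketch builds a toy data file and a toy metadata file and checks that they fit the description. It is not part of MOG; the file contents, the column names (`gene`, `run_id`, and so on) and the helper `check_layout` are hypothetical and chosen only for this example.

```python
import csv
import io

# Hypothetical toy files; names and values are illustrative only.
# Data file: I = 2 feature metadata columns, then S = 3 sample columns.
DATA_TSV = (
    "gene\tdescription\trun1\trun2\trun3\n"
    "G1\tkinase\t5.0\t3.2\t0.0\n"
    "G2\ttransporter\t1.1\t0.4\t7.9\n"
)
# Metadata file: S = 3 rows, M = 3 columns; "run_id" is the sample ID column.
METADATA_TSV = (
    "run_id\tstudy\ttissue\n"
    "run1\tS1\tleaf\n"
    "run2\tS1\troot\n"
    "run3\tS2\tleaf\n"
)


def read_table(text, delimiter="\t"):
    """Return (header, rows) parsed from delimited text."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return rows[0], rows[1:]


def check_layout(data_text, meta_text, n_feature_cols, sample_id_col):
    """Check the layout described in this section and return (F, S, M)."""
    data_header, data_rows = read_table(data_text)
    meta_header, meta_rows = read_table(meta_text)

    sample_ids = data_header[n_feature_cols:]    # the S sample columns
    feature_ids = [row[0] for row in data_rows]  # first column: unique ID
    if len(set(feature_ids)) != len(feature_ids):
        raise ValueError("feature IDs in the first column must be unique")

    # Every expression cell must parse as a number (raises ValueError if not).
    for row in data_rows:
        for cell in row[n_feature_cols:]:
            float(cell)

    # Sample IDs in the metadata link to the column names of the data matrix.
    id_index = meta_header.index(sample_id_col)
    meta_ids = {row[id_index] for row in meta_rows}
    unmatched = [s for s in sample_ids if s not in meta_ids]
    if unmatched:
        print("samples without metadata:", unmatched)

    return len(data_rows), len(sample_ids), len(meta_header)


F, S, M = check_layout(DATA_TSV, METADATA_TSV,
                       n_feature_cols=2, sample_id_col="run_id")
print(F, S, M)  # 2 3 3
```

Running a check like this before opening the files in MOG can catch duplicated feature IDs or non-numeric expression cells early.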

10.2 Start a new project

10.2.1 The "Create New Project" dialog

To start a new MOG project from the welcome dialog, follow these steps:

Figure 24: Create New Project dialog.

1. CLICK the "From a delimited text file" option located in the upper-left quadrant of the "Welcome Dialog". CLICKing this option opens another dialog titled "Create New Project" (see Figure 24).
2. In the "Create New Project" dialog, CLICK the "Browse" button next to "Data File" to locate the data file.
3. (Optional) In the "Create New Project" dialog, CLICK the "Browse" button next to "Metadata" to locate the metadata file.
4. Under the "Feature Metadata Columns" section, enter the number of feature metadata columns present in the data file (see Section 10.1).
5. In the delimiter section, select the correct delimiter for the data and the metadata files.
6. If the data file contains missing values, choose an option to either skip the rows with missing values or treat the missing values as 0 or some other number.
7. CLICK OK.
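The two missing-value options in the steps above (skip the row, or substitute a number) can be sketched in Python as follows. This is an illustration only, not MOG's implementation: the function `handle_missing` is hypothetical, and it assumes that a missing value appears as an empty cell, which may differ from the markers MOG recognizes.

```python
def handle_missing(rows, n_feature_cols, mode="skip", fill_value=0.0):
    """Apply one of two missing-value policies to parsed data rows.

    rows           : list of rows, each a list of strings
    n_feature_cols : number of leading feature metadata columns
    mode           : "skip" drops rows with missing values,
                     "fill" replaces missing values with fill_value
    Assumption: a missing value is an empty (or whitespace-only) cell.
    """
    if mode not in ("skip", "fill"):
        raise ValueError("mode must be 'skip' or 'fill'")

    cleaned = []
    for row in rows:
        values = row[n_feature_cols:]
        has_missing = any(v.strip() == "" for v in values)
        if has_missing and mode == "skip":
            continue  # drop the whole feature row
        numeric = [fill_value if v.strip() == "" else float(v) for v in values]
        cleaned.append(row[:n_feature_cols] + numeric)
    return cleaned


rows = [
    ["G1", "kinase", "1.0", "", "2.0"],        # one missing value
    ["G2", "transporter", "3.0", "4.0", "5.0"],
]
skipped = handle_missing(rows, n_feature_cols=2, mode="skip")
filled = handle_missing(rows, n_feature_cols=2, mode="fill", fill_value=0.0)
```

Skipping removes entire features from the project, while filling keeps every feature but introduces values that were never measured, so the choice depends on how the downstream analysis treats zeros.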

10.2.2 The "Import Metadata" window

If a metadata file was provided in step 3, a new window, "Import Metadata", will be displayed (Figure 25). This window displays a table that previews the selected metadata file. It also shows a basic summary of the metadata file:
Total Rows: The total number of rows in the metadata file (excluding the first row, which is treated as the header).
Total Columns: The total number of columns in the metadata file.
Extra Samples: The number of sample IDs in the "Sample ID column" of the metadata file that do not match any sample ID in the data file. These rows are ignored by MOG.
Missing Samples: The number of sample IDs in the data file that are missing from the "Sample ID column" of the metadata file. MOG generates empty metadata for such samples.
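The four summary values can be reproduced with a few lines of Python, which may help when checking a metadata file outside MOG. The function `metadata_summary` and the sample data are hypothetical; the definitions of the counts follow the descriptions above.

```python
def metadata_summary(data_sample_ids, metadata_header, metadata_rows,
                     sample_id_col):
    """Compute the summary counts described above for a parsed metadata file.

    data_sample_ids : sample column names taken from the data file
    metadata_header : list of metadata column names (first row of the file)
    metadata_rows   : remaining rows of the metadata file
    sample_id_col   : name of the column holding the sample IDs
    """
    idx = metadata_header.index(sample_id_col)
    meta_ids = [row[idx] for row in metadata_rows]
    data_id_set = set(data_sample_ids)
    meta_id_set = set(meta_ids)
    return {
        "total_rows": len(metadata_rows),       # header row excluded
        "total_columns": len(metadata_header),
        # In the metadata file but not in the data file.
        "extra_samples": sum(1 for s in meta_ids if s not in data_id_set),
        # In the data file but not in the metadata file.
        "missing_samples": sum(1 for s in data_sample_ids
                               if s not in meta_id_set),
    }


summary = metadata_summary(
    data_sample_ids=["run1", "run2", "run3"],
    metadata_header=["run_id", "tissue"],
    metadata_rows=[["run1", "leaf"], ["run2", "root"], ["run9", "leaf"]],
    sample_id_col="run_id",
)
print(summary)
```

In this toy example `run9` is an extra sample and `run3` is a missing sample, so both counts are 1.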

Figure 25: MOG read metadata window.

If the metadata file looks incorrect in the preview, a different metadata file can be loaded by performing the following steps:

1. In the "Import Metadata" window, CLICK the browse button next to the "Metadata file" label to select the metadata file.
2. In the "Import Metadata" window, select the delimiter for the selected metadata file.
3. In the "Import Metadata" window, CLICK preview to preview the metadata file.

If the data in the preview looks correct, proceed with the metadata import. Perform the following steps to import the metadata into MOG:

1. In the "Import Metadata" window, select the "Sample Id Column" located in the topmost panel. The "Sample Id Column" is the column in the metadata file that contains the sample IDs. Although MOG automatically detects this column based on the sample IDs from the data file, the user should check that the correct column is selected.
2. CLICK "Next" located in the bottommost panel.
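One plausible way to detect the sample ID column is to pick the metadata column whose values overlap most with the sample IDs in the data file. The sketch below illustrates that heuristic; it is an assumption made for illustration, not a description of how MOG performs the detection, and `guess_sample_id_column` is a hypothetical name.

```python
def guess_sample_id_column(data_sample_ids, metadata_header, metadata_rows):
    """Return the metadata column name that best matches the data sample IDs.

    Heuristic (an assumption, not MOG's documented method): count, for each
    metadata column, how many data sample IDs occur among its values, and
    return the column with the highest count. Returns None if no column
    shares any value with the data sample IDs.
    """
    data_ids = set(data_sample_ids)
    best_col, best_hits = None, 0
    for idx, name in enumerate(metadata_header):
        column_values = {row[idx] for row in metadata_rows}
        hits = len(data_ids & column_values)
        if hits > best_hits:
            best_col, best_hits = name, hits
    return best_col


header = ["study", "run_id", "tissue"]
rows = [["S1", "run1", "leaf"], ["S1", "run2", "root"]]
guessed = guess_sample_id_column(["run1", "run2"], header, rows)
print(guessed)  # run_id
```

A heuristic like this can pick the wrong column when two columns contain similar values, which is one reason to confirm the selection manually as the step above recommends.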

10.2.3 The "Metadata Table to Tree" window

After CLICKing "Next" in step 2, the "Import Metadata" window closes and a new window, "Metadata Table to Tree", appears. This window provides an interface to interactively map the tabular metadata into a hierarchical tree structure.
To enable MOG to read the sample metadata, the user maps the sample metadata columns to a hierarchy that organizes the tabular data as a tree. A hierarchical view can display the sample metadata at different levels of the hierarchy, which makes the metadata easier to understand during analysis. This hierarchy should be based on the organization of the metadata elements. Public repositories such as SRA, GEO and TCGA follow a hierarchical metadata schema in which metadata elements are nested.

Figure 26: MOG metadata table to tree window.

For example, consider a workflow which is structured as follows:

An experiment consists of multiple studies.
Each study independently probes transcriptomic profiles of different samples.
Each sample is independent biological material obtained from different sources.
Each run is an actual sequencing experiment performed on a given biological sample.

The above hypothetical structure would look like Figure 27.

Figure 27: An example hierarchy of metadata columns.

The "Metadata Table to Tree" window is divided into three panes: left, center and right. Each pane displays a different type of information about the metadata.
Left pane: Shows all headers (column names) in the metadata file as a list.
Center pane: Shows the user-built tree structure. Initially this tree contains only one element, the "Root". The user builds the tree interactively by dragging metadata headers from the left pane to the center pane.
Right pane: Displays the metadata arranged in the resulting hierarchy. This tree is generated when the "preview" button is CLICKed, based on the tree structure built by the user in the center pane.

To map the tabular metadata into a hierarchical tree structure, perform the following steps in the ”Metadata Table to Tree” window:

1. Drag the column headers from the left pane to the tree in the center pane.
2. Check the option to remove the unused column headers. Columns removed at this point cannot be added to the project later.
3. If required, CLICK "preview tree" to see a preview of the mapped metadata.
4. CLICK "Next" once the tree structure has been created.
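The mapping performed in these steps can be pictured as nesting the metadata rows by an ordered list of columns, outermost level first. The Python sketch below builds such a tree as nested dictionaries for the study, sample and run example. It only illustrates the idea; the function `table_to_tree`, the column names and the values are hypothetical, and MOG's internal representation may differ.

```python
def table_to_tree(header, rows, levels):
    """Nest tabular metadata rows by the columns listed in `levels`.

    header : list of metadata column names
    rows   : list of rows (lists of strings), one per sample
    levels : column names ordered from the outermost to the innermost level
    Returns nested dictionaries under a single "Root" key.
    """
    indices = [header.index(col) for col in levels]
    tree = {}
    for row in rows:
        node = tree
        for i in indices:
            # Descend one level, creating the child node if it is new.
            node = node.setdefault(row[i], {})
    return {"Root": tree}


header = ["run", "sample", "study"]
rows = [
    ["r1", "s1", "St1"],
    ["r2", "s1", "St1"],
    ["r3", "s2", "St1"],
    ["r4", "s3", "St2"],
]
tree = table_to_tree(header, rows, levels=["study", "sample", "run"])
print(tree)
```

Here study `St1` contains samples `s1` and `s2`, sample `s1` contains runs `r1` and `r2`, and study `St2` contains sample `s3` with run `r4`. Changing the order of `levels` corresponds to arranging the headers differently in the center pane.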

If the data and metadata are properly formatted, MOG will then import the data and its metadata, and a new project will be created. The "Main MetaOmGraph" window will be visible (Figure 3).