Data files

Load data as a "single flat-file database" containing gene expression data as well as sample and gene annotations.

SUMO supports several kinds of data files and imports:


Expression matrices

To load a data file, select Main menu | File | Open Data.



Open data

With this option you can load an expression matrix from a tab- or comma-delimited text file.
In the File-open dialog box select the corresponding file type from the File-type drop-down list.

SUMO expects:

Such files are easily generated from microarray databases or exported from spreadsheet programs (e.g. MS EXCEL | File save as | Tab delimited text).

Alternatively, you may drag and drop tab-delimited text files into SUMO.
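
Tab-delimited files can also be written by a script. The following is a minimal sketch in Python, not part of SUMO. The layout used here (first row with sample names, first column with gene IDs) and all gene and sample names are assumptions made for illustration only; check the file preview window to confirm how SUMO reads your own file.

```python
import csv
import io

# Hypothetical layout (an assumption, not SUMO's documented format):
# first row = sample names, first column = gene IDs.
header = ["GeneID", "Sample_A", "Sample_B", "Sample_C"]
rows = [
    ["GeneX", 7.91, 8.02, 7.85],
    ["GeneY", 5.10, 5.32, 9.47],
]


def write_matrix(handle, header, rows):
    """Write an expression matrix as tab-delimited text."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Write into an in-memory buffer; pass an open file handle instead
# (opened with newline="") to create a real file.
buffer = io.StringIO()
write_matrix(buffer, header, rows)
matrix_text = buffer.getvalue()
```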

A file preview window (showing the first few hundred lines of the selected file) opens up:

Double-click the upper-leftmost cell that contains expression data.

The size of the expression matrix is mainly limited by the computer's free RAM.


Analysis tree

File name and dimensions of the expression matrix are shown in the analysis tree:


Click the Data table node to preview the data table:

The data file is shown in a spreadsheet. For more details see information about data tables.



SUMO analyses files

Complete analyses generated with SUMO may be saved, including expression data, backup data sets and the statistical tests which have been performed (except SAM analyses).


    Main menu | Save analysis

to save an analysis and, correspondingly,

    Main menu | Load analysis

to load a previously saved analysis.

Amplification data files

SUMO may be used to analyze RT-PCR data.
Data generated with ABI's RQ-Manager software (exported as amplification data files) may be imported into SUMO.

Main menu | File | Import | ABI rtPCR amplification data

Select one or multiple files.

A file preview window shows up.
Ensure the correct data column (containing the CT values) is selected and load the data files.
SUMO extracts the CT-values and the RN-values. The RN-values are stored as "comments" and are useful to identify genes with low signal levels, which generate arbitrary CT-values.

Sometimes the RQ-Manager software reports the CT-value of a very weak signal as "undetermined".

SUMO recognizes such missing values.
It is recommended to replace those values with some meaningful value (e.g. "40", the highest cycle number).
The simplest way is to use Main menu | Adjust data | Data imputation | Row wise | Constant.
Select all samples and define "40" as replacement value.
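
To illustrate what this constant replacement amounts to, here is a minimal sketch in Python. It is not SUMO's implementation; the function name `parse_ct` and the example values are hypothetical.

```python
def parse_ct(raw, replacement=40.0):
    """Return a CT value as float; map 'undetermined' to the replacement."""
    text = raw.strip()
    if text.lower() == "undetermined":
        # Weak signal without a usable CT: use the highest cycle number.
        return replacement
    return float(text)


# Hypothetical CT column as it might appear in an exported text file.
raw_cts = ["23.4", "Undetermined", "31.07", "undetermined"]
cts = [parse_ct(value) for value in raw_cts]
```

Keep in mind that a replaced value is a lower bound on the true CT, so genes with many replaced values should be interpreted with care.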

SUMO tries to detect multiplex samples.
If found, SUMO requests a name for the multiplex endogenous controls.
In case such controls were used, give the unique name of the controls (or a unique part of the name).
If not, cancel the dialog.

SUMO now automatically averages replicates, i.e. measurements with the same Gene-ID and Sample-ID, even across multiple amplification data files.
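
The idea of averaging replicates across files can be sketched as follows. This is a simplified illustration in Python, not SUMO's code; the record structure (Gene-ID, Sample-ID, CT) and all values are assumed for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records (gene_id, sample_id, ct) from two amplification files.
file_1 = [("GeneX", "S1", 24.0), ("GeneX", "S1", 24.5), ("GeneY", "S1", 30.0)]
file_2 = [("GeneX", "S1", 25.0), ("GeneY", "S2", 28.5)]


def average_replicates(*files):
    """Average all CT values that share the same (gene_id, sample_id) key."""
    groups = defaultdict(list)
    for records in files:
        for gene_id, sample_id, ct in records:
            groups[(gene_id, sample_id)].append(ct)
    return {key: mean(values) for key, values in groups.items()}


averaged = average_replicates(file_1, file_2)
# GeneX / S1 occurs in both files; its three values are averaged together.
```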

Additionally, SUMO computes averages and standard deviations from both the delta RN values and the CT-values and places them into the gene annotations. Such values might be used to filter genes with overall low abundance (i.e. high CT-values, e.g. >35) or low signal (i.e. low delta RN, e.g. <<1).
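
Such a filter can be sketched in a few lines of Python. The annotation names (`mean_ct`, `mean_delta_rn`) and the delta RN cutoff of 0.1 are assumptions chosen for illustration; the text above only suggests "<<1", so pick a cutoff that fits your own data.

```python
# Hypothetical per-gene annotations: mean CT and mean delta RN.
annotations = {
    "GeneX": {"mean_ct": 24.5, "mean_delta_rn": 2.1},
    "GeneY": {"mean_ct": 36.8, "mean_delta_rn": 0.9},
    "GeneZ": {"mean_ct": 29.3, "mean_delta_rn": 0.02},
}


def keep_gene(stats, max_ct=35.0, min_delta_rn=0.1):
    """Keep genes that are abundant enough and have a sufficient signal."""
    return stats["mean_ct"] <= max_ct and stats["mean_delta_rn"] >= min_delta_rn


kept = [gene for gene, stats in annotations.items() if keep_gene(stats)]
# GeneY is dropped for its high CT, GeneZ for its low delta RN.
```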

A new file containing the averaging information is automatically created
(original filename extended with "_MeanSDevN", e.g. "MyExperiment.sdm-Amplification Data_MeanSDevN.txt").
For each sample it contains three data columns:
- Mean/Median CT-value from all technical replicates
- SDev/MAD
- number of replicates
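
The three values per sample can be computed as in the following Python sketch. It assumes that SDev means the sample standard deviation (n-1 denominator) and that MAD means the unscaled median absolute deviation from the median; whether SUMO uses exactly these definitions is not stated here.

```python
from statistics import mean, median, stdev


def summarize(cts):
    """Summarize the technical replicates of one gene in one sample."""
    n = len(cts)
    med = median(cts)
    return {
        "mean": mean(cts),
        # Sample standard deviation; undefined for a single value, so use 0.0.
        "sdev": stdev(cts) if n > 1 else 0.0,
        "median": med,
        # Median absolute deviation from the median (unscaled).
        "mad": median([abs(ct - med) for ct in cts]),
        "n": n,
    }


# Hypothetical CT-values of four technical replicates.
summary = summarize([24.0, 24.5, 25.0, 26.5])
```

Median and MAD are less sensitive to a single outlying replicate than mean and SDev.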

Now you may use SUMO's functionality to analyse the PCR data.

But keep in mind:
CT-values represent approximately log2-scaled values!
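
Because the amount of PCR product roughly doubles with each cycle, a difference between two CT-values behaves like a log2 ratio with inverted sign: a lower CT means more template. The sketch below converts a CT difference into a linear fold change, assuming perfect doubling per cycle (100% amplification efficiency). The function name and the values are hypothetical.

```python
def fold_change(delta_ct):
    """Linear fold change for a CT difference, assuming doubling per cycle."""
    return 2.0 ** (-delta_ct)


# Hypothetical CT-values of one gene in two samples.
ct_treated, ct_control = 24.0, 26.0

# The treated sample crosses the threshold two cycles earlier,
# which corresponds to a four-fold higher abundance.
ratio = fold_change(ct_treated - ct_control)
```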