SUMO - Adjust data - Annotation
For the interpretation of data and statistical tests it is helpful to use annotations both for genes (rows)
and for conditions (columns) included in the loaded data matrix.
Here you find functions to extend / modify the annotations.
Add condition / hybridisation annotations in columns
If you have data tables (tab-delimited text) with additional annotations, you can merge these data with
the already existing annotations by common identifiers.
Supply a tab delimited text file with additional hybridisation annotations, one hybridisation per line.
External data are mapped to the loaded data table using a "unique" key which must exist in both the loaded
data and the new data table.
Data preview windows open up:
- Define the column containing the unique link-key in the new data table.
- Define the row of the loaded sample annotations which contains the respective
unique key.
Data rows in the new data table with a non-matching key are ignored.
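The key-based mapping described above can be sketched outside SUMO as follows. This is only an illustration, not SUMO's implementation: the function name merge_condition_annotations, the sample IDs, and the column names are hypothetical, and leaving conditions without a matching key empty is an assumption.

```python
import csv
import io


def merge_condition_annotations(sample_annotations, new_table_text, key_column, key_row):
    """Merge a tab-delimited table (one hybridisation per line) into
    existing sample annotations (one list of values per annotation row)."""
    reader = csv.DictReader(io.StringIO(new_table_text), delimiter="\t")
    new_columns = [name for name in reader.fieldnames if name != key_column]
    # Index the new table by its unique link key.
    by_key = {record[key_column]: record for record in reader}

    merged = {name: list(values) for name, values in sample_annotations.items()}
    for name in new_columns:
        # Assumption: conditions without a matching key get an empty cell.
        merged[name] = [
            by_key[key][name] if key in by_key else ""
            for key in sample_annotations[key_row]
        ]
    return merged


# Hypothetical loaded annotations: each annotation is a row, each condition a column.
loaded = {"SampleID": ["S1", "S2", "S3"], "Group": ["ctrl", "ctrl", "treated"]}
# Hypothetical new table; the line with key S9 matches no loaded condition.
new_table = "ID\tBatch\tDye\nS1\tA\tCy3\nS3\tB\tCy5\nS9\tC\tCy3\n"

merged = merge_condition_annotations(loaded, new_table, key_column="ID", key_row="SampleID")
print(merged["Batch"])
```

In this sketch the line with key S9 is dropped because no loaded condition carries that key, which mirrors the rule that non-matching rows are ignored.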
Add feature / gene annotations in rows
As above, supply a tab-delimited text file with extended feature annotations.
Define the data columns containing the unique link key.
Means to annotation
It may be helpful to add annotation columns containing descriptive statistic values to your matrix before certain normalisation/data adjustment procedures are performed.
E.g. control-gene-normalized log2 ratios do not carry any information about signal intensity.
However, a gene regulated 2-fold at high signal levels (20000 counts) may be much more interesting than a gene regulated 2-fold at background/noise levels (~50).
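A small worked example of this point, using made-up counts: the control levels 20000 and 50 come from the text above, while the treated values (40000 and 100) are assumed purely to produce a 2-fold change.

```python
import math

# Illustrative counts only: (control, treated) for two hypothetical genes.
high_signal = (20000.0, 40000.0)
low_signal = (50.0, 100.0)

# The log2 ratio is identical for both genes ...
ratio_high = math.log2(high_signal[1] / high_signal[0])
ratio_low = math.log2(low_signal[1] / low_signal[0])

# ... but the mean intensity, stored as an annotation, tells them apart.
mean_high = sum(high_signal) / len(high_signal)
mean_low = sum(low_signal) / len(low_signal)

print(ratio_high, ratio_low)
print(mean_high, mean_low)
```

Both ratios are 1.0, so the ratio alone cannot tell the two genes apart; the mean kept as an annotation can.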
SUMO can compute several descriptive statistics from all or selected conditions:
- Arithmetic mean
- Median (robust, less outlier dependent)
- Geometric mean (best for ratio data)
- Variance
- Standard deviation (with arithmetic mean)
- MAD (Median Absolute Deviation), the SDev equivalent for the median
- SDev-G, the SDev equivalent for the geometric mean
- Coefficient of variation (CV) for
  - Arithmetic mean / SDev
  - Median / MAD
  - Geometric mean / SDev-G
- Minimum
- Maximum
- Sum
- Count (within thresholds):
Count the number of conditions within the selection where the data cell has a value between the user-defined low/high thresholds.
An input dialog opens up.
Define numerical values for low/high threshold.
To omit a threshold, leave the respective data field empty.
Enter a name for the new data column.
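The threshold count can be sketched as below. The function name count_within and the sample values are hypothetical, and treating both thresholds as inclusive is an assumption; the text does not state how SUMO handles values exactly on a threshold. An omitted threshold is modelled as None.

```python
def count_within(values, low=None, high=None):
    """Count values between low and high; None means 'no threshold'.
    Assumption: both bounds are inclusive."""
    count = 0
    for value in values:
        if low is not None and value < low:
            continue
        if high is not None and value > high:
            continue
        count += 1
    return count


# Hypothetical data cells of one gene across five selected conditions.
row = [50, 120, 20000, 800, 35]

both = count_within(row, low=100, high=1000)   # counts 120 and 800
only_low = count_within(row, low=100)          # high threshold left empty
only_high = count_within(row, high=100)        # low threshold left empty
print(both, only_low, only_high)
```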
The newly computed values are appended as new data columns to the already existing feature/gene annotations.
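The following sketch shows how such per-gene statistics could be computed over selected conditions and appended to existing annotations. It is not SUMO's code. Several definitions are assumptions, since they are not specified here: sample (n - 1) variance and standard deviation, an unscaled MAD, SDev-G as the exponential of the standard deviation of the natural logs, and CV as deviation divided by centre. The CV for the geometric mean is left out because its definition is not given. The gene names, the Symbol column, and the column labels are hypothetical.

```python
import math
import statistics


def mad(values):
    """Median absolute deviation, unscaled (assumption)."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def geometric_sd(values):
    """Assumed definition: exp of the sample SD of the natural logs.
    Requires strictly positive values."""
    return math.exp(statistics.stdev([math.log(v) for v in values]))


def describe(values):
    mean = statistics.mean(values)
    median = statistics.median(values)
    sdev = statistics.stdev(values)  # sample SD (n - 1), an assumption
    return {
        "Mean": mean,
        "Median": median,
        "GeoMean": statistics.geometric_mean(values),
        "Variance": statistics.variance(values),
        "SDev": sdev,
        "MAD": mad(values),
        "SDev-G": geometric_sd(values),
        "CV": sdev / mean if mean else None,
        "CV-Median": mad(values) / median if median else None,
        "Min": min(values),
        "Max": max(values),
        "Sum": sum(values),
    }


# Hypothetical matrix (genes in rows, conditions in columns) and annotations.
matrix = {
    "geneA": [100.0, 5.0, 200.0, 400.0],
    "geneB": [10.0, 99.0, 10.0, 1000.0],
}
annotations = {"geneA": {"Symbol": "A1"}, "geneB": {"Symbol": "B1"}}
selected = [0, 2, 3]  # indices of the selected conditions

for gene, row in matrix.items():
    values = [row[i] for i in selected]
    # Append the statistics as new annotation columns.
    annotations[gene].update(describe(values))

print(annotations["geneA"]["Median"], annotations["geneA"]["MAD"])
```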
Such data values may be used later on to filter / check lists of significant genes from statistical tests.
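As a minimal sketch of such a filter, assume a list of significant genes that already carries a stored Mean annotation. The gene names, the values, and the cut-off of 500 counts are all hypothetical.

```python
# Hypothetical result list from a statistical test, with a stored "Mean" annotation.
significant = [
    {"Gene": "geneA", "log2ratio": 1.0, "Mean": 30000.0},
    {"Gene": "geneB", "log2ratio": 1.0, "Mean": 75.0},
    {"Gene": "geneC", "log2ratio": -1.3, "Mean": 1200.0},
]

min_mean = 500.0  # hypothetical intensity cut-off

# Keep only genes whose mean signal lies at or above the cut-off.
kept = [gene["Gene"] for gene in significant if gene["Mean"] >= min_mean]
print(kept)
```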
Edit Sample/Gene annotations
Basic table editors to edit Condition/Sample or Feature/Gene annotations.