Data views

To get an impression of your data and to analyse and diagnose them, it may be helpful to use different graphical representations of the data.
These graphs may help you explore whether your data contain systematic biases / distortions and thus need additional adjustment / normalisation, or whether they can be used for statistical analysis.

SUMO can generate:
- Heat maps: visualising intensities / ratios of the expression matrix (the famous red-green heat maps)
- Box plots: summarizing the global intensity / ratio distribution in all hybridisations
- Scatter plots: analyse pairwise signal distributions of hybridisations. See more details about the scatterplot viewer.
- Scatter plot mosaics: get an overview of signal distributions in all your sample data
- Histograms: show the signal distribution in any or all loaded samples
- Line graphs / Profiles: illustrating e.g. gene / sample data profiles
- Correlation maps: illustrating inter-sample similarity
- Population maps
- Dot charts
- Deviation plot
- RLE plot








Heat maps

Heat maps are a direct graphical representation of your expression matrix.
The numerical value of each data cell is translated into a colour in a 2D image.
Traditionally , numerical values
   
SUMO offers various interactive tools to zoom and scale the heat map view, change colours and gene or condition annotations, search in gene and condition annotations, or sort the data.

For more details about the heat map viewer, go here.
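
As a rough illustration of how a cell value can be translated into a colour, the sketch below maps values onto a red-green scale. The function names, the symmetric clipping limit and the convention "negative = green, zero = black, positive = red" are assumptions made for this example; they are not taken from SUMO's actual colour mapping.

```python
def value_to_rgb(value, limit=2.0):
    """Map a value in [-limit, +limit] to an (R, G, B) tuple.

    Assumed convention: negative -> green, zero -> black, positive -> red.
    Values outside the range are clipped to the nearest limit.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    clipped = max(-limit, min(limit, value))
    level = round(255 * abs(clipped) / limit)
    if clipped >= 0:
        return (level, 0, 0)
    return (0, level, 0)


def matrix_to_colours(matrix, limit=2.0):
    """Translate every cell of a matrix (list of rows) into a colour."""
    return [[value_to_rgb(v, limit) for v in row] for row in matrix]


if __name__ == "__main__":
    # Hypothetical log-ratios: rows = genes, columns = hybridisations.
    log_ratios = [[-2.0, 0.0, 2.0], [0.5, -3.0, 1.0]]
    for row in matrix_to_colours(log_ratios):
        print(row)
```

Clipping at a fixed limit keeps a few extreme values from washing out the colour contrast of the remaining cells.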






Box plots

Box-whisker plots summarize the signal distributions in all hybridisations.
   

For each hybridisation a single box plot shows:

Use the cursor left/right keys to squeeze / expand the width of the plot, or click the toolbar buttons to scale and size the data or to switch between linear / logarithmic scaling.
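
This section does not list which statistics SUMO draws in each box. As a hedged sketch, the code below computes the conventional five-number summary (minimum, lower quartile, median, upper quartile, maximum) for one hybridisation. The function name `box_stats` and the use of inclusive quartiles are assumptions for illustration.

```python
import statistics


def box_stats(values):
    """Return the conventional five-number summary of one hybridisation."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    # Inclusive quartiles treat the data as the complete population.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


if __name__ == "__main__":
    # Hypothetical signals, one list per hybridisation.
    hybridisations = {"hyb_1": [1, 2, 3, 4, 5], "hyb_2": [10, 20, 40, 80, 160]}
    for name, signals in hybridisations.items():
        print(name, box_stats(signals))
```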






Scatter-plots

Scatter plots display the intensity or ratio signals of two hybridisations pairwise.

The hybridisation selector opens up:
   
Select the two conditions (= hybridisations) and the gene annotation to be shown in the scatter plot.

Click one of the four scatter plot buttons:
- X-Y: Signals from the selected hybridisations are displayed
- R / I: Ratio of the selected hybridisations on the Y-axis, product of the hybridisations on the X-axis
- Q / Q: Ranked (= intensity-sorted) signals from the selected hybridisations are displayed
- 3D: Signals from up to six selected hybridisations may be displayed in a 3D scatterplot

See more details about the Scatterplot viewer
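
To make the R / I and Q / Q views more concrete, here is a minimal sketch of how the plotted coordinates could be derived from two hybridisations. The function names, the log2 scaling and the use of the square root of the product (i.e. the mean log intensity) on the X-axis are assumptions for this example, not SUMO's documented formulas.

```python
import math


def ri_point(a, b):
    """Return (x, y) for one feature measured in two hybridisations.

    x = log2 of sqrt(a * b)  (product of the signals, on a log scale)
    y = log2 of a / b        (ratio of the signals, on a log scale)
    """
    if a <= 0 or b <= 0:
        raise ValueError("intensities must be positive for log scaling")
    return (0.5 * math.log2(a * b), math.log2(a / b))


def qq_points(hyb_a, hyb_b):
    """Pair the signals of two hybridisations by rank instead of by feature."""
    if len(hyb_a) != len(hyb_b):
        raise ValueError("both hybridisations need the same number of signals")
    return list(zip(sorted(hyb_a), sorted(hyb_b)))


if __name__ == "__main__":
    print(ri_point(8, 2))
    print(qq_points([3, 1, 2], [30, 10, 20]))
```

In the Q / Q pairing each hybridisation is sorted independently, so a point no longer represents a single gene; it compares the two signal distributions as a whole.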
  






X-Y Scatter plots


See more details about the Scatterplot viewer






R-I Scatter plots


See more details about the Scatterplot viewer






Quantile scatter plots

See more details about the Scatterplot viewer






3-D scatter plots


See more details about the 3D-Scatterplot viewer






Scatterplot Mosaic

Scatterplot mosaics may be used to get an overview of signal distributions in all your samples.
For this purpose, all possible pairwise dot plots are generated. Small views of the single dot plots are stitched into a mosaic image showing a square matrix of dot plots. Depending on the number of selected samples, generating the graph may take some time, and the resulting graph may become very large.
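
The following sketch illustrates why the mosaic grows quickly: with n samples, a square matrix holds n x n panels. The function name `mosaic_panels` is hypothetical, and whether SUMO actually draws the diagonal panels (a sample plotted against itself) is an assumption here.

```python
from itertools import product


def mosaic_panels(sample_names):
    """Return the (row sample, column sample) pair for every mosaic panel."""
    return list(product(sample_names, repeat=2))


if __name__ == "__main__":
    panels = mosaic_panels(["A", "B", "C"])
    print(len(panels), "panels:", panels)
    # The panel count grows with the square of the number of samples.
    for n in (10, 50, 100):
        print(n, "samples ->", n * n, "panels")
```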

Scatterplot mosaics may be generated as:
- X-Y scatterplots:
     

- RI = Ratio intensity plots. Here, log-ratios are shown on y-axis, average intensity on x-axis.
     

- Quantile plots. Here, data are ranked by intensity.
     

See a full resolution dot plot mosaic image.






Histograms

Histograms may be used to show the signal distribution in any or all of your loaded samples.
Data may be shown with linear or log scaling. Scaling and perspective view may be freely adjusted.
Histogram data can be extracted as tab-delimited data and used in other applications.
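
As an illustration of what such histogram data could look like outside SUMO, the sketch below bins signals with linear or log2 scaling and formats the result as tab-delimited text. The function names, the equal-width binning and the column headers are assumptions for this example; SUMO's own export layout may differ.

```python
import math


def histogram(values, n_bins=10, log_scale=False):
    """Count values in equal-width bins; returns (bin edges, counts)."""
    if log_scale:
        # Non-positive signals have no logarithm and are skipped.
        data = [math.log2(v) for v in values if v > 0]
    else:
        data = list(values)
    if not data:
        raise ValueError("no usable values")
    low, high = min(data), max(data)
    width = (high - low) / n_bins or 1.0  # avoid zero width for constant data
    counts = [0] * n_bins
    for v in data:
        # The maximum value is assigned to the last bin.
        index = min(int((v - low) / width), n_bins - 1)
        counts[index] += 1
    edges = [low + i * width for i in range(n_bins + 1)]
    return edges, counts


def to_tsv(edges, counts):
    """Format histogram data as tab-delimited text."""
    lines = ["bin_start\tbin_end\tcount"]
    for i, count in enumerate(counts):
        lines.append(f"{edges[i]}\t{edges[i + 1]}\t{count}")
    return "\n".join(lines)


if __name__ == "__main__":
    edges, counts = histogram([1, 2, 4, 8], n_bins=3, log_scale=True)
    print(to_tsv(edges, counts))
```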








Correlation maps


Correlation maps are a simple tool to visualize and explore the global similarity between the individual hybridisations.
SUMO computes all pairwise correlations between all hybridisations and shows them in the heat map viewer:


Sorting rows and columns (clustering with Euclidean distance / average linkage) reveals the coarse structure in the data:
- two major sample groups (V and IBC)
- four IBC samples (8, 4, 19, 13) look like the V group
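
The pairwise computation behind a correlation map can be sketched as follows. Pearson correlation is used here as one possible similarity metric; the function names and the dictionary layout are assumptions for this example, and the subsequent clustering of rows and columns is not included.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long signal lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant signals")
    return sxy / math.sqrt(sxx * syy)


def correlation_map(samples):
    """samples: dict of sample name -> signals. Returns a nested dict matrix."""
    names = list(samples)
    return {a: {b: pearson(samples[a], samples[b]) for b in names} for a in names}


if __name__ == "__main__":
    # Hypothetical signals for three hybridisations.
    demo = {"s1": [1, 2, 3], "s2": [2, 4, 6], "s3": [3, 2, 1]}
    for name, row in correlation_map(demo).items():
        print(name, row)
```

The resulting matrix is symmetric with ones on the diagonal, which is why it can be displayed like any other heat map.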


Correlation maps may be computed using different similarity metrics:






Deviation plot

The deviation plot visualizes the median as well as the 5%/95% and 25%/75% percentiles (quantiles) for each individual feature (e.g. gene) in the dataset.
Features are ranked by median signal, from the lowest (left) to the highest value (right).

Displaying percentiles makes asymmetric data distributions in the features visible.
To display deviation plots for conditions (samples), just transpose the matrix (Main menu | Adjust data | Transpose matrix).

On a common scale, the low variability of low-intensity features (~100 cts) cannot be displayed comparably to the much higher variability of high-intensity features (>10000 cts).

A deviation plot relative to the median may solve this issue.
Here, the deviations are normalized feature-wise to the respective median.
For medians close to zero, the normalized deviations may be misleading.
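
The statistics described above can be sketched per feature as follows. The function names, the linear-interpolation percentile and the normalization formula (value - median) / median are assumptions for this example; SUMO's exact definitions are not given here. Features with a median of exactly zero get no relative statistics in this sketch.

```python
import statistics

FRACTIONS = {"p05": 0.05, "p25": 0.25, "median": 0.50, "p75": 0.75, "p95": 0.95}


def percentile(sorted_values, fraction):
    """Linear-interpolation percentile of an already sorted list."""
    position = fraction * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1 - weight) + sorted_values[upper] * weight


def deviation_rows(matrix, relative=False):
    """matrix: dict of feature name -> signals across samples.

    Returns (name, median, stats) tuples ranked by median, lowest first.
    With relative=True the stats are expressed relative to the median.
    """
    rows = []
    for name, signals in matrix.items():
        data = sorted(signals)
        median = statistics.median(data)
        stats = {key: percentile(data, f) for key, f in FRACTIONS.items()}
        if relative:
            if median == 0:
                stats = None  # relative deviation is undefined here
            else:
                stats = {k: (v - median) / median for k, v in stats.items()}
        rows.append((name, median, stats))
    rows.sort(key=lambda row: row[1])
    return rows


if __name__ == "__main__":
    # Hypothetical features with low and high intensity.
    demo = {"high": [9000, 10000, 12000], "low": [90, 100, 120]}
    for row in deviation_rows(demo, relative=True):
        print(row)
```

In relative mode the two hypothetical features end up on the same scale, although their absolute spreads differ by a factor of 100.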

The example shows the intensity deviation plot from Illumina gene expression arrays (~48000 features):


The graph indicates that ~30000 features measure signals at background level.

Zoom the intensity axis to see the low-intensity features in detail:



Deviation versus median normalized:
Deviation plot / Median-normalized deviation plot

Two subsets from the above dataset (Main menu | Utilities | Demo data | Intensities):
"V" samples / "IBC" samples






RLE plot