Description of the genome query for Similarity Histogram

Queries such as Organism Phylogenies use the mean normalized BLASTP score to compute distances. The Similarity Histogram query instead shows the full distribution of normalized BLASTP scores (note: similarities, not distances) for all ORFs shared between two genomes, where a shared ORF is one that matches at better than (that is, with a lower e-value than) the chosen BLASTP cutoff e-value.

The histogram is returned as a list of pairs of numbers: the first is the centre value of the normalized BLASTP scores for the bin, and the second is the count of shared ORFs with that degree of similarity. The user can select the binning resolution for the query. Rather than drawing a histogram for you, we provide the numbers so that you can easily draw a histogram in your preferred format, or use the data in some other analysis.
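
As a rough illustration of how the returned pairs might be turned into a quick text histogram, here is a minimal Python sketch. It assumes the pairs arrive as plain text with one whitespace-separated "bin centre, count" pair per line; the exact output layout of the query is not specified here, so that format, the function names parse_histogram and render_bars, and the sample values are all hypothetical.

```python
def parse_histogram(text):
    """Parse lines of 'bin_centre count' into a list of (float, int) pairs.

    Assumed input format: one whitespace-separated pair per line.
    Lines that do not fit that shape are skipped.
    """
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # blank line or unrelated text
        try:
            pairs.append((float(fields[0]), int(fields[1])))
        except ValueError:
            continue  # e.g. a two-word header line
    return pairs


def render_bars(pairs, width=40):
    """Return one text bar per bin, scaled so the largest count spans `width`."""
    peak = max((count for _, count in pairs), default=0)
    lines = []
    for centre, count in pairs:
        bar = "#" * (round(width * count / peak) if peak else 0)
        lines.append(f"{centre:5.2f} {count:6d} {bar}")
    return lines


# Made-up illustrative values, not real query output.
sample = """\
0.25 10
0.50 40
0.75 20
"""

for row in render_bars(parse_histogram(sample)):
    print(row)
```

The parsed pairs can equally be passed to any plotting or statistics package in place of the text rendering.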

Examples:

With Bacillus subtilis and Bacillus halodurans selected as the two genomes and a BLASTP cutoff e-value of 1.0e-10, the query returns 2687 shared ORFs with a broad distribution of normalized BLASTP similarities, centred around 0.57.

In contrast, selecting Chlamydophila pneumoniae AR39 and Chlamydophila pneumoniae CWL029 with the same BLASTP cutoff e-value yields 1031 matches whose distribution is bunched up near the top similarity values.
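
One possible way to put numbers on the difference between a broad distribution and one bunched near the top is to summarise the (bin centre, count) pairs directly. The sketch below is an assumption about how a user might do this, not part of the query itself: the function name summarise and the 0.9 threshold are hypothetical, and the mean is only an approximation because each ORF is treated as if its score were exactly at its bin centre.

```python
def summarise(pairs, threshold=0.9):
    """Summarise a histogram given as (bin_centre, count) pairs.

    Returns the total count, the count-weighted mean of the bin centres,
    the centre of the fullest bin, and the fraction of counts in bins
    whose centre is at or above `threshold`.
    """
    total = sum(count for _, count in pairs)
    if total == 0:
        raise ValueError("histogram has no counts")
    mean = sum(centre * count for centre, count in pairs) / total
    mode_centre = max(pairs, key=lambda pair: pair[1])[0]
    high = sum(count for centre, count in pairs if centre >= threshold)
    return {
        "total": total,
        "mean": mean,
        "mode": mode_centre,
        "high_fraction": high / total,
    }


# Made-up illustrative values, not real query output.
example = [(0.25, 1), (0.75, 3)]
print(summarise(example, threshold=0.5))
```

A distribution bunched near the top would show a high_fraction close to 1 for a high threshold, while a broad distribution would show a much smaller value and a mean nearer the middle of the range.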

Copyright ©1998-2005 NeuroGadgets Inc. ©2006 University of Queensland
