Description of the genome query for: Retrieve ORF sequences

The user is prompted to enter either a consensus gene name or a protein PID (gi number). Type it in, or copy and paste it from the results of a previous query.

Six self-explanatory options are available:

Retrieve genes with the specified consensus gene name: finds each gene in each genome with that consensus gene name.

Best match in other genomes returns the query protein itself, plus the best BLASTP hit better than the chosen cutoff e-value from each of the database's other genomes.

Recursive best match in other genomes starts the same way: it returns the query protein itself, plus the best BLASTP hit better than the chosen cutoff e-value from each of the database's other genomes. Here, however, the process is repeated: each hit is used in turn as a query to find its own best match in each genome.

Reciprocal best match in other genomes is like the previous option, but a hit is kept only if the specified protein is also that hit's best match. Ties are currently not handled, so the set of protein sequences returned may not be as complete as it should be.

Gene family within a genome returns the query protein itself, plus all matches better than the chosen cutoff e-value within the same genome.

Finally, one may retrieve the single sequence corresponding to the specified gi number.
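
The BLASTP-based options above can be illustrated with a minimal sketch. It is not the site's implementation: the Hit record, the genome_of lookup, the function names, and the toy data are all assumptions made for illustration, and the hits are assumed to be precomputed. The reciprocal variant here filters the plain best-match set; whether the site applies the reciprocity test to the recursive set instead is not stated.

```python
from collections import namedtuple

# Hypothetical record for one precomputed BLASTP hit (field names are assumptions).
Hit = namedtuple("Hit", ["query", "target", "evalue"])

def best_matches(hits, genome_of, query, cutoff):
    """Map each other genome to the lowest-e-value hit of `query` below `cutoff`."""
    home = genome_of[query]
    best = {}
    for h in hits:
        if h.query != query or h.evalue >= cutoff:
            continue
        g = genome_of[h.target]
        if g == home:
            continue
        # Strict '<' keeps the first hit seen on a tie; ties are not resolved.
        if g not in best or h.evalue < best[g].evalue:
            best[g] = h
    return best

def best_match_set(hits, genome_of, query, cutoff):
    """Query itself plus its best hit in each other genome."""
    found = best_matches(hits, genome_of, query, cutoff).values()
    return {query} | {h.target for h in found}

def recursive_best_match_set(hits, genome_of, query, cutoff):
    """Repeat the best-match search from every newly found hit."""
    seen, pending = {query}, [query]
    while pending:
        current = pending.pop()
        for h in best_matches(hits, genome_of, current, cutoff).values():
            if h.target not in seen:
                seen.add(h.target)
                pending.append(h.target)
    return seen

def reciprocal_best_match_set(hits, genome_of, query, cutoff):
    """Keep a hit only if the query is the hit's best match in the query's genome."""
    home = genome_of[query]
    kept = {query}
    for h in best_matches(hits, genome_of, query, cutoff).values():
        back = best_matches(hits, genome_of, h.target, cutoff).get(home)
        if back is not None and back.target == query:
            kept.add(h.target)
    return kept

def gene_family(hits, genome_of, query, cutoff):
    """Query itself plus every hit below `cutoff` in the query's own genome."""
    home = genome_of[query]
    return {query} | {h.target for h in hits
                      if h.query == query and h.evalue < cutoff
                      and genome_of[h.target] == home}

# Toy data: five made-up proteins in three made-up genomes.
genome_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
hits = [
    Hit("a1", "b1", 1e-50), Hit("a1", "b2", 1e-20), Hit("a1", "c1", 1e-30),
    Hit("a1", "a2", 1e-15), Hit("b1", "a1", 1e-50), Hit("b1", "c1", 1e-40),
    Hit("c1", "a2", 1e-35), Hit("c1", "a1", 1e-30), Hit("c1", "b1", 1e-40),
    Hit("a2", "a1", 1e-15), Hit("b2", "a1", 1e-20),
]
for option in (best_match_set, recursive_best_match_set,
               reciprocal_best_match_set, gene_family):
    print(option.__name__, sorted(option(hits, genome_of, "a1", 1e-10)))
```

In this toy data, c1 is the best hit of a1 in genome C, but the best hit of c1 in genome A is a2 rather than a1. The reciprocal option therefore drops c1, while the recursive option follows c1 and picks up a2 as well.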

Caution!

This query operates using BLASTP hits, and may return proteins unrelated to one another if the search sequence is a fusion of protein domains that are separate in some genomes. Perform a multiple sequence alignment in order to control for this problem.

Example:

If one specifies 7190047 as the gi number, 1.0e-10 as the BLASTP cutoff e-value, and asks for reciprocal best matches, one gets (as of Dec. 5, 2002) a listing of 40 presumed RecB orthologs, all from Bacteria. These sequences are returned together in FASTA format and may then be used in a multiple alignment program such as CLUSTALW to produce a phylogenetic tree.
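
Before passing the retrieved sequences to an alignment program, it can help to confirm that the expected number of records came back. The following is a minimal sketch, assuming the result has been saved as ordinary multi-FASTA text; the read_fasta helper and the two sample records (including their gi numbers and sequences) are invented placeholders, not output of this query.

```python
def read_fasta(text):
    """Parse multi-FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Placeholder input; real input would be the text returned by the query.
sample = """>gi|111|hypothetical protein one
MKTAYIAKQR
QISFVKSHFS
>gi|222|hypothetical protein two
MSDNE
"""
records = read_fasta(sample)
for header, sequence in records:
    print(header, len(sequence))
```

Comparing len(records) with the number of sequences listed on the results page is a quick check that nothing was lost when copying the output.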

Copyright ©1998-2005 NeuroGadgets Inc. ©2006 University of Queensland
