Description of the genome query for
Retrieve ORF sequences
The user is prompted to enter a consensus gene name, or a protein PID number (gi number).
Type it in, or copy and paste it from the results of a previous query.
Six self-explanatory options are available:
Retrieve genes with the specified consensus gene name: finds each gene in each
genome with that consensus gene name.
Best match in other genomes returns the query protein itself, plus the best BLASTP
hit better than the chosen cutoff e-value from each of the database's other genomes.
Recursive best match in other genomes starts in the same way, returning the query
protein itself and its best BLASTP hit better than the chosen cutoff e-value from
each of the database's other genomes. Here, however, the process is then repeated,
with each hit in turn finding its own best match in each genome.
Reciprocal best match in other genomes is like the previous option, but
the specified protein must be the target's best match as well. Ties are currently
not handled, so the set of protein sequences returned may not be as complete as
it should be.
Gene family within a genome returns the query protein itself, plus all matches
better than the chosen cutoff e-value within the same genome.
Finally, one may retrieve the single sequence corresponding to the specified
gi number.
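The four BLASTP-based options differ only in how they walk a table of hits. The
sketch below is a minimal illustration under stated assumptions, not the site's
actual implementation: the protein IDs, genome labels, and e-values are invented,
all function names are hypothetical, and the recursive option is assumed to repeat
until no new proteins turn up. Ties are resolved by keeping the first hit seen.

```python
# Hypothetical toy data: which genome each protein belongs to.
GENOME = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}

# Hypothetical BLASTP hits as (query, subject, e-value); values are invented.
HITS = [
    ("a1", "b1", 1e-50), ("a1", "b2", 1e-20), ("a1", "c1", 1e-30),
    ("a1", "a2", 1e-15), ("a2", "a1", 1e-15), ("a2", "b2", 1e-25),
    ("a2", "c1", 1e-35), ("b1", "a1", 1e-50), ("b1", "c1", 1e-40),
    ("b2", "a2", 1e-25), ("b2", "a1", 1e-20), ("c1", "a2", 1e-35),
    ("c1", "a1", 1e-30), ("c1", "b1", 1e-40),
]


def best_matches(query, cutoff):
    """Map each other genome to the query's best hit below the cutoff."""
    best = {}
    for q, s, evalue in HITS:
        if q != query or evalue >= cutoff or GENOME[s] == GENOME[query]:
            continue
        genome = GENOME[s]
        # Strict comparison: on a tie, the first hit seen is kept.
        if genome not in best or evalue < best[genome][1]:
            best[genome] = (s, evalue)
    return {genome: s for genome, (s, _) in best.items()}


def best_match_set(query, cutoff):
    """Self plus the best hit from each other genome."""
    return {query} | set(best_matches(query, cutoff).values())


def recursive_best_match_set(query, cutoff):
    """Repeat the best-match step from every hit until nothing new appears."""
    seen, todo = set(), [query]
    while todo:
        protein = todo.pop()
        if protein in seen:
            continue
        seen.add(protein)
        todo.extend(best_matches(protein, cutoff).values())
    return seen


def reciprocal_best_match_set(query, cutoff):
    """Keep a hit only if the query is also that hit's best match."""
    result = {query}
    for subject in best_matches(query, cutoff).values():
        if best_matches(subject, cutoff).get(GENOME[query]) == query:
            result.add(subject)
    return result


def gene_family(query, cutoff):
    """Self plus every hit below the cutoff within the same genome."""
    return {query} | {
        s for q, s, evalue in HITS
        if q == query and evalue < cutoff and GENOME[s] == GENOME[query]
    }
```

With this toy table and a cutoff of 1.0e-10, the best-match option for a1 gives
a1, b1 and c1; the reciprocal option drops c1, because c1's best match in genome A
is a2 rather than a1; the recursive option reaches all five proteins; and the gene
family option gives a1 and a2.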
Caution!
This query operates on BLASTP hits, and may return proteins unrelated to one
another if the search sequence is a fusion of protein domains that are separate in
some genomes. Perform a multiple sequence alignment of the returned sequences to
check for this problem.
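Once a multiple alignment is in hand, one simple way to look for the fusion problem
is to ask whether any two returned sequences occupy entirely different columns. The
sketch below assumes the alignment is already available as equal-length gapped
strings with "-" for gaps; the sequences, names, and the function names are invented
for illustration, and this is only one possible check.

```python
from itertools import combinations


def shared_columns(seq_a, seq_b):
    """Count alignment columns in which both sequences have a residue."""
    return sum(1 for x, y in zip(seq_a, seq_b) if x != "-" and y != "-")


def non_overlapping_pairs(alignment, min_shared=1):
    """Return name pairs sharing fewer than min_shared aligned columns."""
    return [
        (name_a, name_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(alignment.items(), 2)
        if shared_columns(seq_a, seq_b) < min_shared
    ]


# Hypothetical aligned sequences: a two-domain query and two single-domain hits.
alignment = {
    "fusion": "MKTAYIAKQR" + "GG" + "SLVPELNWQT",
    "domainA": "MKTAYLAKQR" + "-" * 12,
    "domainB": "-" * 12 + "SLVPDLNWQT",
}

suspect_pairs = non_overlapping_pairs(alignment)
```

Here domainA and domainB each align to the fusion protein but share no columns with
one another, so they are reported as a suspect pair.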
Example:
If one specifies 7190047 as the gi number, 1.0e-10 as the BLASTP cutoff e-value,
and asks for reciprocal best matches, one gets (as of Dec. 5, 2002) a listing of
40 RecB presumed orthologs, all from Bacteria. These sequences, returned in
multiple-FASTA format, may then be aligned with a multiple alignment program such
as CLUSTALW to produce a phylogenetic tree.
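Before passing the returned sequences to an alignment program, it can be worth
checking that the listing parses cleanly and contains the expected number of
records. The sketch below is a minimal FASTA reader; the function name and the two
sample records are invented for illustration and are not output from this query.

```python
def parse_fasta(text):
    """Parse multi-FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical sample listing, not real query output.
sample = (
    ">seq1 hypothetical protein A\n"
    "MKTAYIAKQR\n"
    "GGSLVPELNW\n"
    ">seq2 hypothetical protein B\n"
    "MKTAYLAKQR\n"
)

records = parse_fasta(sample)
for name, sequence in records.items():
    print(name, len(sequence))
```

For the example above, the same check applied to the saved query output should
report 40 records.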
Copyright ©1998-2005 NeuroGadgets Inc. ©2006 University of Queensland