Edwards Lab



DCMF proteins - NCBI annotation

DISCLAIMER: The data on these webpages has not yet been published and is the intellectual property of UNSW. It is provided in good faith and should not be used prior to publication without consent from Mike Manefield, Richard Edwards or Matt Lee at UNSW.

Protein sequences were predicted using the NCBI annotation pipeline and are available as entry CP017634. Proteins were further annotated via high-throughput homology searching, multiple sequence alignment and molecular phylogenetics using HAQESAC and MultiHAQ. The full list of NCBI proteins can be found in the table below. Where possible, NCBI annotations are also mapped onto the original JGI Genome Portal annotation, which was used for TaxaMap analysis; TaxaMap identifies the taxonomic grouping of the clade to which each protein was found to belong. The full list of JGI proteins is also available for browsing.

Each protein was subjected to a BLAST+ (blastp) search against all NCBI and JGI proteins annotated for DCMF, all bacterial proteins in the UniProt Knowledgebase (downloaded 2017-02-06), and the published proteomes for a set of closely related bacteria identified from a 16S phylogeny (see paper for details). HAQESAC was used to iteratively generate and clean up Clustal Omega multiple sequence alignments, producing a high-quality alignment against a set of close homologues. The neighbor-joining tree implementation of Clustal W2 was used to make a phylogenetic tree (below). (NOTE: These alignments and trees are designed to give an automated first look at a protein. Where the details of an individual protein alignment and/or phylogenetic inference are important, more careful analysis is recommended.)
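For readers who want to try a rough equivalent of the search and alignment steps by hand, the following Python sketch may help. It is not HAQESAC and does not reproduce its iterative clean-up; it only builds a blastp command, ranks the hits from BLAST+ tabular output (-outfmt 6) by bit score, and builds a Clustal Omega command. The function names, file names, E-value cutoff and hit limit are all illustrative assumptions rather than settings used for this analysis.

```python
import csv
import io


def blastp_command(query_fasta, database, out_tsv, evalue=1e-4):
    """Build a BLAST+ blastp command writing tabular (outfmt 6) hits.

    The E-value cutoff is an illustrative assumption, not the lab's setting.
    """
    return ["blastp", "-query", query_fasta, "-db", database,
            "-evalue", str(evalue), "-outfmt", "6", "-out", out_tsv]


def clustalo_command(in_fasta, out_fasta):
    """Build a Clustal Omega command producing a FASTA-format alignment."""
    return ["clustalo", "-i", in_fasta, "-o", out_fasta,
            "--outfmt=fa", "--force"]


def closest_hits(blast_tsv_text, max_hits=50):
    """Return up to max_hits distinct subject IDs, best bit score first.

    Expects the default outfmt 6 columns, where column 2 is the subject ID
    and column 12 is the bit score. max_hits is a hypothetical limit.
    """
    best = {}
    for row in csv.reader(io.StringIO(blast_tsv_text), delimiter="\t"):
        if len(row) < 12:
            continue  # skip blank or malformed lines
        subject, bitscore = row[1], float(row[11])
        # Keep only the best-scoring local alignment per subject sequence.
        if bitscore > best.get(subject, float("-inf")):
            best[subject] = bitscore
    return sorted(best, key=lambda s: best[s], reverse=True)[:max_hits]
```

Each command list can be passed to subprocess.run(cmd, check=True), assuming BLAST+ and Clustal Omega are installed and on the PATH. The sequences named by closest_hits would then need to be extracted into a FASTA file before alignment, a step that is left out here.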

Individual proteins can be examined in further detail by clicking the protein ID. The JGI annotation also includes paralogues and in-paralogues predicted by HAQESAC. Because HAQESAC returns only the closest homologues, these paralogue lists may be incomplete.

ncbi, NCBI protein ID (click ^ to open entry); protein, JGI locus tag; description, JGI description; len, length of protein; pos, position in DCMF genome (and strand).


© 2019 RJ Edwards. Contact: richard.edwards@unsw.edu.au.