Edwards
Lab

DCMF Protein Ga0180325_114556

DISCLAIMER: The data on these webpages has not yet been published and is the intellectual property of UNSW. It is provided in good faith and should not be used prior to publication without consent from Mike Manefield, Richard Edwards or Matt Lee at UNSW.

This is the supplementary data page for predicted protein Ga0180325_114556 from an unusual, dichloromethane-fermenting Peptococcaceae strain, DCMF. DCMF was isolated by the Manefield group at UNSW before being sequenced, assembled and annotated in collaboration with the Edwards lab.

Protein sequences were predicted using prokka and the JGI Genome Portal annotation pipeline. Proteins were further annotated via high-throughput homology searching, multiple sequence alignment and molecular phylogenetics, using HAQESAC and MultiHAQ to search each protein against all bacterial proteins in the UniProt Knowledgebase (downloaded 2017-02-06) and the published proteomes of a set of closely related bacteria identified from a 16S phylogeny (see paper for details). Putative taxonomic assignments for each protein were then made using TaxaMap, which identifies the taxonomic grouping of the clade to which that protein was found to belong:

protein	ncbi	prokka	jgi	description	len	pos	inpara	paralogues	genus	family	order	class	phylum	boot	spcode

protein, JGI locus tag; ncbi, NCBI protein ID (click ^ to open entry); prokka, prokka protein ID; jgi, JGI ID; description, JGI description; inpara, DCMF-specific "in-paralogues" identified by HAQESAC; paralogues, paralogues identified by HAQESAC; genus/family/order/class/phylum, TaxaMap taxonomy predictions based on well-supported HAQESAC clades; boot, bootstrap support (0-1) for TaxaMap clade; spcode, full list of UniProt taxonomy species codes for HAQESAC clade.
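For anyone working with the full protein table offline, the column layout above can be read with a few lines of Python. This is a minimal sketch, assuming the table has been saved as tab-delimited text with the header row shown above. The function name `parse_taxamap_table` is hypothetical, and treating `len` as an integer and `boot` as a float is an assumption based on the field descriptions; the list-valued columns (inpara, paralogues, spcode) are left as raw strings because their separator is not stated here.

```python
import csv
import io

# Column order copied from the table header on this page.
COLUMNS = ["protein", "ncbi", "prokka", "jgi", "description", "len", "pos",
           "inpara", "paralogues", "genus", "family", "order", "class",
           "phylum", "boot", "spcode"]


def parse_taxamap_table(text):
    """Parse tab-delimited text (header row first) into a list of dicts.

    Hypothetical helper, not part of SLiMSuite. Empty cells become None.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    rows = []
    for raw in reader:
        row = {key: (value or None) for key, value in raw.items()}
        # Assumption: len is an integer length, boot is a 0-1 float.
        if row.get("len"):
            row["len"] = int(row["len"])
        if row.get("boot"):
            row["boot"] = float(row["boot"])
        rows.append(row)
    return rows
```

Rows can then be filtered on bootstrap support, for example `[r for r in rows if r["boot"] is not None and r["boot"] >= 0.9]`, where the 0.9 cut-off is purely illustrative.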

NOTE: HAQESAC only returns the closest homologues and paralogue lists may be incomplete as a result.
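The idea behind a clade-based taxonomic assignment can be illustrated with a small sketch. This is not TaxaMap's implementation: it simply assumes that each sequence in a clade comes with a known lineage, and reports a taxon at a given rank only when every member of the clade agrees. The function name `clade_consensus` and the placeholder taxon names in the usage note are hypothetical.

```python
# Ranks reported in the table above, from most to least specific.
RANKS = ["genus", "family", "order", "class", "phylum"]


def clade_consensus(lineages):
    """Return the taxon shared by all clade members at each rank.

    `lineages` is a list of dicts mapping rank -> taxon name, one per
    sequence in the clade. A rank with any disagreement gets None.
    Illustrative only; not how TaxaMap itself is implemented.
    """
    consensus = {}
    for rank in RANKS:
        taxa = {lineage.get(rank) for lineage in lineages}
        consensus[rank] = taxa.pop() if len(taxa) == 1 else None
    return consensus
```

For a clade whose members differ at genus level (say `genusA` and `genusB`) but share `familyX`, the sketch leaves the genus unassigned and reports `familyX` and all higher ranks.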

More Proteins: Click here for a full list of JGI-annotated proteins and their TaxaMap classifications.

HAQESAC protein alignment and phylogeny

Each prokka protein was subjected to a BLAST search against UniProt bacteria and the other JGI proteins. HAQESAC was used to iteratively generate and clean up Clustal Omega multiple sequence alignments to produce a high-quality alignment against a set of close homologues. The neighbor-joining tree implementation of Clustal W2 was used to make a phylogenetic tree (below). (NOTE: These alignments and trees are designed to give an automated first look at a protein. Where individual protein alignment and/or phylogenetic inference details are important, more careful analysis is recommended.) NCBI proteins have the species code DCMF, whereas JGI species codes are WON710A1.

The full alignment can be accessed via the download link below. Individual UniProt homologues can be retrieved by visiting http://www.uniprot.org/, e.g. http://www.uniprot.org/uniprot/D9RYD4. NCBI proteins can be examined in further detail by editing the following URL with the appropriate ATWXXXXX.X ID: http://www.slimsuite.unsw.edu.au/research/dcmf/dcmf-ncbi.php?protein=ATWXXXXX.X. JGI proteins can be examined in further detail by editing the following URL with the appropriate GaXXXXX_XXXX ID: http://www.slimsuite.unsw.edu.au/research/dcmf/dcmf-jgi.php?protein=GaXXXXX_XXXX. HAQESAC only returns the closest homologues, so these paralogue lists may be incomplete. (The text tree link below can be useful for cutting and pasting protein IDs.)
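When many protein IDs have been copied from a tree, the URL editing described above can be scripted. The following is a minimal sketch using the three URL patterns given in the text. The function name `protein_url` is hypothetical, and choosing the page from the ID prefix (`ATW` for NCBI, `Ga` for JGI, anything else treated as a UniProt accession) is an assumption based on the ID patterns shown above.

```python
from urllib.parse import urlencode

# URL patterns taken from the text above.
DCMF_BASE = "http://www.slimsuite.unsw.edu.au/research/dcmf/"
UNIPROT_BASE = "http://www.uniprot.org/uniprot/"


def protein_url(protein_id):
    """Build the detail-page URL for a protein ID.

    Assumption: IDs starting with 'ATW' are NCBI proteins, IDs starting
    with 'Ga' are JGI proteins, and anything else is a UniProt accession.
    """
    if protein_id.startswith("ATW"):
        page = "dcmf-ncbi.php"
    elif protein_id.startswith("Ga"):
        page = "dcmf-jgi.php"
    else:
        return UNIPROT_BASE + protein_id
    return DCMF_BASE + page + "?" + urlencode({"protein": protein_id})
```

For example, `protein_url("Ga0180325_114556")` gives the JGI page for the protein described here, and `protein_url("D9RYD4")` gives the UniProt entry used as an example above.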

Download: Raw protein (fasta) | Sequence alignment (fasta) | Phylogenetic tree (newick | text | png)

NOTE: If download links do not work and/or no alignment/tree appears, either the protein ID is incorrect, the predicted gene was non-coding, or insufficient homologues were found by HAQESAC.
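Once the fasta alignment has been downloaded, a quick sanity check is to confirm that all rows have the same length and to see which sequences are DCMF (NCBI) or WON710A1 (JGI) proteins. The sketch below uses only the Python standard library. The helper names are hypothetical, and the sequence-name layout `gene_SPCODE__accession` is an assumption about how the species code is embedded; adjust `species_code` if the downloaded names differ.

```python
def read_fasta(text):
    """Return a list of (name, sequence) tuples from fasta-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            parts = line[1:].split()
            # Keep only the first word of the header as the sequence name.
            name, chunks = (parts[0] if parts else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def is_aligned(records):
    """True if every sequence (gaps included) has the same length."""
    return len({len(seq) for _, seq in records}) <= 1


def species_code(name):
    """Extract the species code from a sequence name.

    Assumption: names look like gene_SPCODE__accession. Returns None
    when the name does not follow that pattern.
    """
    left, sep, _ = name.partition("__")
    if not sep or "_" not in left:
        return None
    return left.rsplit("_", 1)[1]
```

A typical use would be `records = read_fasta(open(path).read())`, followed by `is_aligned(records)` and a filter such as `[n for n, _ in records if species_code(n) in ("DCMF", "WON710A1")]` to list the DCMF strain's own sequences in the alignment.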

HAQESAC phylogeny

See below for HAQESAC sequence alignment.

HAQESAC multiple sequence alignment
