Protein Homology
What is homology?
Figure 1: An example of homology among the bones of different species.
Homology is the idea that shared ancestry can be inferred based on the presence of similar structures between species [1]. The most commonly known image associated with homology is the one shown in Figure 1, where seemingly very different species share homologous bone structures, indicating that they likely share a common ancestor.
In addition to macro structures like bones, homology can be found in gene and protein sequences. It is detected by comparing either genomic DNA sequences or protein amino acid sequences and determining how identical and how similar they are. In amino acid sequences, two species may not share an identical amino acid at a given position, but the corresponding amino acids may be similar enough (e.g., both hydrophobic) to serve a similar function. Such amino acids are counted as similar. A sequence can therefore have only modest identity to the sequence it is being compared with and still be very similar to it.
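The difference between identity and similarity can be illustrated with a minimal sketch. The grouping of amino acids by side-chain property, the function name, and the two short example sequences below are all illustrative assumptions, not data from this page. BLAST itself decides what counts as similar using a substitution matrix rather than fixed groups.

```python
# Assumed, simplified grouping of amino acids by side-chain property.
# Real tools such as BLAST score similarity with a substitution matrix instead.
PROPERTY_GROUPS = [
    set("AVLIMFW"),  # hydrophobic
    set("STNQYC"),   # polar, uncharged
    set("KRH"),      # positively charged
    set("DE"),       # negatively charged
]

# Map each residue to the index of its group; G and P are left ungrouped.
GROUP_OF = {aa: i for i, group in enumerate(PROPERTY_GROUPS) for aa in group}


def percent_identity_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both inputs must already be aligned (same length, '-' marks a gap).
    Percentages are taken over the full alignment length, gaps included,
    and identical positions also count as similar.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")

    identical = 0
    similar = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # a gap matches nothing
        if a == b:
            identical += 1
            similar += 1
        else:
            group_a = GROUP_OF.get(a)
            if group_a is not None and group_a == GROUP_OF.get(b):
                similar += 1

    length = len(aligned_a)
    return 100 * identical / length, 100 * similar / length


# Made-up example alignment: 2 identical columns and 3 more similar columns out of 6.
identity, similarity = percent_identity_similarity("MKVLD-", "MRILEG")
print(f"Identity: {identity:.0f}%  Similarity: {similarity:.0f}%")
```

In this made-up alignment only two positions are identical, yet five of the six are identical or similar, which is the situation described above: low identity but high similarity.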
Methods
Possible homologs of the HFE protein were identified using Homologene, a database maintained by the NCBI. The BLAST tool was then used to search for sequences in other species that were homologous, or similar, to the human sequence. BLAST is well suited to this because it aligns the query sequence against a library of sequences and reports how many identical and similar residues (amino acids, for protein sequences) the two sequences share. This gives an estimate of how similar the human protein is to its homologue in another species.
Data
Figure 2: Bar graph indicating the percent of identical and similar matches between the HFE homologue protein sequences of various species and the human HFE protein
Homologue Reference Pages and Numbers
Human (H. sapiens): HFE. Accession Number: NP_000401.1
Chimpanzee (P. troglodytes): HFE. Accession Number: NP_001009101.1. E value: 0.0. Identity: 100%. Similarity: 100%
Rhesus macaque (M. mulatta): HFE. Accession Number: NP_001247505.1. E value: 0.0. Identity: 94%. Similarity: 96%
Bovine (B. taurus): HFE. Accession Number: NP_001012399.1. E value: 7e-168. Identity: 76%. Similarity: 85%
Brown rat (R. norvegicus): HFE. Accession Number: NP_445753.1. E value: 1e-148. Identity: 70%. Similarity: 80%
House mouse (M. musculus): HFE. Accession Number: NP_034554.2. E value: 1e-150. Identity: 69%. Similarity: 80%
Fruit fly (D. melanogaster): Malvolio. Accession Number: NP_732584.1. E value: 0.63. Identity: 43%. Similarity: 50%
Arabidopsis: IRT1. Accession Number: CAE30485.1. E value: 0.81. Identity: 41%. Similarity: 47%
C. elegans: DRAG-1. Accession Number: ADI88501.1. E value: 8.6. Identity: 36%. Similarity: 50%
Zebrafish (D. rerio): UXA2. Accession Number: NP_956879.1. E value: 2e-47. Identity: 43%. Similarity: 51%
Yeast (S. cerevisiae): Aft1p. Accession Number: NP_011444.1. E value: 3.4. Identity: 26%. Similarity: 34%
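As a rough sketch of how the values listed above could be screened, the snippet below stores each E value, identity, and similarity and picks out the matches that pass an E-value cutoff. The cutoff of 1e-5 is an assumption (a common rule of thumb, not a threshold stated on this page), and the variable and function names are hypothetical.

```python
# (species, E value, % identity, % similarity) as listed in the data above.
HOMOLOGUES = [
    ("Chimpanzee", 0.0, 100, 100),
    ("Rhesus macaque", 0.0, 94, 96),
    ("Bovine", 7e-168, 76, 85),
    ("Brown rat", 1e-148, 70, 80),
    ("House mouse", 1e-150, 69, 80),
    ("Fruit fly", 0.63, 43, 50),
    ("Arabidopsis", 0.81, 41, 47),
    ("C. elegans", 8.6, 36, 50),
    ("Zebrafish", 2e-47, 43, 51),
    ("Yeast", 3.4, 26, 34),
]

# Assumed rule-of-thumb cutoff; stricter or looser values are also used in practice.
E_VALUE_CUTOFF = 1e-5


def significant_matches(records, cutoff=E_VALUE_CUTOFF):
    """Return the species whose E value is at or below the cutoff."""
    return [species for species, e_value, _identity, _similarity in records
            if e_value <= cutoff]


def rank_by_identity(records):
    """Return the records sorted from highest to lowest percent identity."""
    return sorted(records, key=lambda record: record[2], reverse=True)


print("Pass the cutoff:", ", ".join(significant_matches(HOMOLOGUES)))
for species, e_value, identity, similarity in rank_by_identity(HOMOLOGUES):
    print(f"{species:15s} E={e_value:<8g} identity={identity}% similarity={similarity}%")
```

Under this assumed cutoff only the vertebrate matches pass, even though the zebrafish and fruit fly matches have the same 43% identity. Percent identity on its own does not show how significant a match is.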
Results and Analysis
Almost all of the homologues of the HFE protein are found in mammals. This makes sense because HFE helps regulate iron levels in the blood, so it tends to be found only in animals that have some form of cardiovascular system. Among the primates, the chimpanzee was the closest match to the human HFE protein, with 100% identity. It was followed by the rhesus macaque, with 94% identity and 96% similarity. It is not surprising that primates show such strong homology, as they are humans' closest relatives. The low E values indicate that this similarity is highly statistically significant. In fact, all of the vertebrates (including D. rerio, the zebrafish) have very low E values, which suggests that these homologues have functions very similar to that of the HFE protein. In addition, all of these vertebrate homologues share almost exactly the same domains as H. sapiens: MHC_1 and IGc1 (see the Domains page).
The non-vertebrate species, however, have much higher E values (0.63 to 8.6), meaning that matches of this quality could easily arise by chance. So while these proteins were the closest matches to the human sequence and are treated here as homologues, the evidence for shared ancestry is weak, and they likely do not function exactly like the human protein. In fact, among these species the domains of the homologues are completely different from those of the human HFE protein.
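For context on why the size of the E value matters, the standard relationship between an alignment score and its E value can be written as below. The symbols are not from this page: they follow the usual BLAST statistics notation, where m is the length of the query, n is the total length of the database searched, S is the raw alignment score, and K and λ are constants that depend on the scoring system.

```latex
E = K \, m \, n \, e^{-\lambda S}
```

Because E falls off exponentially as the score rises, strong alignments give values like 1e-150, while an E value near 1 or above means that roughly that many equally good matches are expected by chance alone in a database of that size.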
One thing that I found particularly intriguing was how much more similar the Arabidopsis homologue, which comes from a plant, was to the human protein than the homologues from other non-vertebrates like C. elegans and yeast (41% identity, compared with 36% and 26%). Perhaps the plant has developed an iron uptake pathway that is more similar to the human one than those of these other organisms, or perhaps Arabidopsis relies on iron more than C. elegans or yeast does.
The C. elegans homologue was found via a search on WormBase, and the D. melanogaster homologue was found on FlyBase, rather than through the BLAST and Homologene searches used for the other homologues.
References
1. Understanding evolution: http://evolution.berkeley.edu/evolibrary/article/lines_04
2. Homologene: http://www.ncbi.nlm.nih.gov/homologene
3. NCBI: http://www.ncbi.nlm.nih.gov/
4. BLAST: http://blast.ncbi.nlm.nih.gov/Blast.cgi
5. WormBase: http://www.wormbase.org/#01-23-6
6. FlyBase: http://flybase.org/