Selenium is an essential micronutrient for many living organisms. Its role has been attributed largely to its presence in selenoproteins as the 21st amino acid, selenocysteine (Sec). Sec is encoded by TGA in DNA. A unique mechanism decodes the corresponding UGA codon in mRNA to incorporate Sec co-translationally into the growing polypeptide.
The aim of this project was to identify and annotate the Varanus komodoensis (Komodo dragon) selenoproteins, selenoprotein machinery factors and proteins related to selenium metabolism through gene prediction, phylogenetic reconstruction and analysis of gene structures and SECIS elements.
In total, we identified 21 selenoproteins, 3 Cys-containing homologs and 7 selenoprotein translation machinery proteins, giving a comprehensive picture of the species' selenoproteome.
Selenoproteins are selenium-containing proteins that are present in all three domains of life: bacteria, archaea and eukarya [8]. Although selenoproteins take part in diverse molecular pathways and biological functions, all of them contain at least one selenocysteine (Sec), a selenium-containing amino acid, and most serve oxidoreductase functions. Selenoproteins are involved in oxidoreduction reactions, redox signaling, antioxidant defense, thyroid hormone metabolism and immune responses. They are therefore linked to human diseases such as cancer, Keshan disease, viral infections, male infertility, and abnormalities in immune responses and thyroid hormone function [12].
Figure 1. Eukaryotic selenoproteins' functions.
SELENOCYSTEINE
Selenocysteine is recognized as the 21st amino acid in ribosome-mediated protein synthesis, and its specific incorporation is directed by the UGA codon. Selenocysteine has a structure similar to that of cysteine, but with an atom of selenium taking the place of the usual sulfur, forming a selenol group [2].
Proteins that contain one or more selenocysteine residues are called selenoproteins.
Figure 2. Cysteine and Selenocysteine residues.
SELENOPROTEIN SYNTHESIS
Selenoprotein synthesis is an evolutionarily conserved process. Nevertheless, there are some major differences in the mechanisms of selenoprotein synthesis between bacteria, eukaryotes and archaea. The features common to all organisms are the UGA-Sec codon, the specific tRNA, the SECIS element and several protein factors [10].
Selenocysteine is encoded by a UGA codon in the selenoprotein mRNA. Decoding UGA as selenocysteine requires the reprogramming of translation, because UGA is normally read as a stop codon [9]. This process requires multiple features, such as the selenocysteine insertion sequence (SECIS) element and several protein factors, including the specific elongation factor eEFSec and SECIS binding protein 2 (SBP2).
The SECIS element is essential for synthesizing selenoproteins. In eukaryotes it is located in the 3′ untranslated region (3′ UTR) of the mRNA. Eukaryotic SECIS elements are not highly conserved at the nucleotide level, but they all form a similar stem-loop structure composed of two helices separated by an internal loop. The SECIS element thus provides the platform for the protein factors required for Sec incorporation. One or more of its structural features may interact directly with SBP2 and thereby anchor it to the ribosome [4].
SECIS binding protein 2 (SBP2) is a novel protein that contains an RNA-binding motif, which is found in eukaryotic release factor 1 (eRF1) and ribosomal proteins. Mutagenesis studies demonstrated that this domain is required for the SECIS-binding activity of SBP2. This protein possesses a unique N-terminal domain and a C-terminal domain that is sufficient for all of the known functions of SBP2: SECIS binding, ribosome binding, and in vitro Sec incorporation.
During selenoprotein synthesis, the SECIS element is recognized by SBP2. Once bound, SBP2 recruits a specific elongation factor called eEFSec. This elongation factor is responsible for recruiting the specific tRNA charged with selenocysteine (tRNASec). In this way, the UGA codon is decoded as selenocysteine and translation continues until the next stop codon.
Figure 3. Selenoprotein synthesis procedure.
However, selenoprotein synthesis also depends on other genes whose products take part in the synthesis or incorporation of selenocysteine. These are: selenophosphate synthetase (SPS1 and SPS2), selenocysteine synthase (SecS, also known as SLA/LP), phosphoseryl-tRNA kinase (PSTK) and the tRNASec-associated protein SECp43.
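The recoding of UGA described above can be illustrated with a minimal Python sketch. It is a toy model, not part of the project pipeline: the function name `translate` and the flag `secis_present` are hypothetical, and the sketch assumes that every in-frame UGA is recoded whenever a SECIS element is present, which ignores the factors (SBP2, eEFSec, tRNASec) and sequence context that govern the real process.

```python
from itertools import product

# Standard genetic code, codons ordered with bases U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, secis_present=False):
    """Translate an mRNA from its first codon.

    Assumption for illustration only: when secis_present is True,
    every in-frame UGA is read as selenocysteine ("U") instead of stop.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and secis_present:
            protein.append("U")  # Sec inserted, translation continues
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":  # ordinary stop codon
            break
        protein.append(residue)
    return "".join(protein)


mrna = "AUGGCUUGAGGUUAA"  # AUG GCU UGA GGU UAA
print(translate(mrna))                      # MA
print(translate(mrna, secis_present=True))  # MAUG
```

Without the SECIS flag the sketch stops at UGA, which is why standard gene predictors tend to truncate selenoprotein genes at the Sec codon.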
SELENOPROTEIN EVOLUTION
While the genome sequence and gene content are available for an increasing number of organisms, eukaryotic selenoproteins remain poorly characterized. The dual role of the UGA codon confounds the identification of novel selenoprotein genes [5].
A more direct approach is to search for occurrences of the SECIS structural pattern. Although this approach has been applied successfully to expressed sequence tag (EST) and other cDNA sequences, the low specificity of SECIS searches produces a large number of false predictions when applied to whole eukaryotic genomes.
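The specificity problem can be seen in a toy sequence-only search. The sketch below is an illustration under stated assumptions, not the method used by SECISearch3 or Seblastian: it looks for the conserved ATGA ... AA ... GA motif order on DNA with a regular expression, the spacer lengths are placeholders chosen for the example, and the name `find_secis_like` is hypothetical. Real SECIS predictors also evaluate the stem-loop structure.

```python
import re

# Toy SECIS-like pattern: ATGA core, unpaired AA, then GA.
# ASSUMPTION: the spacer ranges {8,14} and {10,30} are illustrative
# placeholders, not parameters taken from any published tool.
# The lookahead lets overlapping candidates be reported.
SECIS_LIKE = re.compile(r"(?=(ATGA[ACGT]{8,14}AA[ACGT]{10,30}GA))")


def find_secis_like(seq):
    """Return (start, matched_sequence) for every candidate in seq."""
    return [(m.start(), m.group(1)) for m in SECIS_LIKE.finditer(seq.upper())]


example = "CC" + "ATGA" + "C" * 10 + "AA" + "C" * 12 + "GA" + "CC"
for start, hit in find_secis_like(example):
    print(start, hit)
```

Because the motif is short and the spacers are flexible, a pattern like this matches by chance many times in a genome of billions of bases, which is why structural filters and homology evidence are combined with it.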
At larger evolutionary distances, some selenoproteins are lost while a few evolve and duplicate. In addition, selenoproteins are irregularly distributed among species. The highest selenoprotein content is observed in aquatic organisms, whether they are animals or plants [7].
Furthermore, some species have lost selenoproteins and replaced them with Cys-containing homologs, which do not always have the same function [9].
The change from aquatic to terrestrial habitats posed a challenge to plants and animals, as the availability of some trace elements greatly diminished, and these organisms were now exposed to higher oxygen levels. As a result, many terrestrial organisms might have lost selenoproteins or replaced them with cysteine-containing homologs.
Figure 4. Evolution of the vertebrate selenoproteome.
Nevertheless, considerable progress has been made in characterizing the mechanisms and regulation of selenoprotein synthesis in eukaryotes. Recent discoveries have improved our understanding of Se metabolic pathways, the mechanism of Sec biosynthesis in eukaryotes, the identities of selenoproteins and the functions of some of these proteins. However, the biological functions of most Sec-containing proteins remain unknown [7].
DESCRIPTION
Although captive specimens often weigh more, in the wild an average adult male weighs around 79 to 91 kg and measures 2.59 m, while an average female weighs 68 to 73 kg and measures 2.29 m. Its unusually large size has been attributed to island gigantism, since no other carnivorous animals fill the niche on the islands where it lives. As a result of their size, these lizards dominate the ecosystems where they live. Komodo dragons hunt prey including invertebrates, birds and mammals [10]. The Komodo dragon has a tail as long as its body and about 60 serrated teeth. It also has a long, yellow, forked tongue. Its skin is reinforced by armoured scales, which contain osteoderms (small bones).
HABITAT
The Komodo dragon is a species of lizard found in hot and dry places such as the Indonesian islands of Komodo, Rinca, Flores and Gili Motang.
BEHAVIOUR AND DIET
As an ectotherm, the Komodo dragon is most active during the day, although it exhibits some nocturnal activity. Komodo dragons are mostly solitary and come together only to breed and eat. They can also dive, climb trees and run in brief sprints [12].
Komodo dragons are carnivores. They hunt prey including invertebrates, birds and mammals, although they are thought to feed mostly on carrion.
This project aims to predict and annotate the selenoproteins, selenoprotein machinery factors and proteins related to selenium metabolism of Varanus komodoensis. To do so, its genome was compared with two already annotated genomes: that of a reptile, Anolis carolinensis, and the human genome. Candidate proteins were predicted and annotated through local alignment. Finally, the predicted proteins were characterized via global alignment and the SECISearch3/Seblastian software.
Obtaining the genome of the analysed organism.
Obtaining the sequences of the reference species' selenoproteomes.
Gene prediction based on the reference species' selenoproteomes.
Use of the Seblastian and SECISearch3 software to analyse potential SECIS elements.
Complementing protein family predictions with phylogenetic reconstruction methods.
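As a rough illustration of what the local alignment in the gene prediction step computes, the following minimal Smith-Waterman sketch returns the best local alignment score between two sequences. It is a teaching example only: the text above does not state which aligner or scoring scheme was used, so the function name `smith_waterman_score` and the match, mismatch and gap values are assumptions, and real protein searches use substitution matrices and affine gap penalties.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    ASSUMPTION: simple match/mismatch scores and a linear gap
    penalty, chosen for illustration only.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# The shared core ACGT is found despite unrelated flanking residues.
print(smith_waterman_score("XXACGTXX", "YYACGTYY"))  # 8
```

The floor at zero is what lets a conserved exon be detected inside a much longer genomic sequence, in contrast to global alignment, which scores the sequences end to end.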
The results of the analysis of Varanus komodoensis' selenoproteome are shown in the following table, with the corresponding data for each protein and reference species.
The icons and abbreviations used in the table are defined in the legend:
· - Homo sapiens protein
· - Anolis carolinensis protein
· - Gallus gallus protein
· None - No output.
· Sec - Selenocysteine-containing protein.
· Cys - Cys-containing homolog.
· Mac - Machinery. Non-Sec and non-Cys containing protein.
· ID - Incomplete data. The presence or position of Cys or Sec could not be predicted.
The exon and SECIS element coordinates for the entries above are available in Excel or PDF format.
As stated above, the aim of this project was to identify every selenoprotein and machinery gene present in the Varanus komodoensis genome. To do so, the Komodo dragon genome was compared with human and lizard (Anolis carolinensis) selenoproteins. We chose these two species because the human genome is very well annotated, whereas the lizard is a closely related species. In addition, for cases requiring supplemental data, the chicken selenoproteome was also considered. SECIS elements were then predicted with Seblastian and SECISearch3. Finally, taking all the results into account, each protein was discussed individually.
We applied the following criteria to decide whether a predicted protein was a selenoprotein, a Cys-containing homolog or neither. A predicted protein is considered a selenoprotein when a UGA codon aligns with a selenocysteine (or with a cysteine) in the query and the alignment that follows is perfect. In addition, a SECIS element must be found in the 3′ UTR. A predicted protein is considered a Cys-containing homolog when a cysteine in Varanus komodoensis aligns with a selenocysteine in the query. As in the previous case, the alignment that follows the cysteine has to be perfect; in some cases SECIS elements can still be found, because these proteins were originally selenoproteins. Finally, a protein that has lost its selenocysteine and replaced it with any amino acid other than cysteine is classified under other proteins.
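The criteria above can be written down as a small decision function. This is a sketch of the rules as stated in the text, not the code used in the project: the function name, its arguments and the "unresolved" label for combinations the criteria do not cover are assumptions introduced for illustration. An in-frame UGA in the Komodo dragon prediction is represented here by the letter "U".

```python
def classify_prediction(query_residue, target_residue, alignment_ok, secis_found):
    """Classify a predicted protein from the aligned Sec position.

    query_residue:  residue in the reference protein ("U" = Sec, "C" = Cys)
    target_residue: residue in the prediction ("U" = in-frame UGA)
    alignment_ok:   True if the alignment after that position is perfect
    secis_found:    True if a SECIS element was predicted in the 3' UTR
    ASSUMPTION: cases not described in the text return "unresolved".
    """
    if target_residue == "U" and query_residue in ("U", "C"):
        if alignment_ok and secis_found:
            return "selenoprotein"
        return "unresolved"
    if query_residue == "U" and target_residue == "C":
        # A SECIS element may or may not remain in Cys homologs.
        return "Cys-containing homolog" if alignment_ok else "unresolved"
    if query_residue == "U":
        return "other protein"  # Sec replaced by a non-Cys residue
    return "unresolved"


print(classify_prediction("U", "U", True, True))   # selenoprotein
print(classify_prediction("U", "C", True, False))  # Cys-containing homolog
print(classify_prediction("U", "R", True, False))  # other protein
```

Keeping an explicit fallback label makes it visible when a prediction falls outside the written criteria, for example an in-frame UGA with no detectable SECIS element.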
The aim of the project was to identify and annotate Varanus komodoensis selenoproteins, selenoprotein machinery factors and proteins related to selenium metabolism.
Despite the importance of selenoproteins, selenoproteomes are not properly annotated in all organisms. This is because the codon for selenocysteine, UGA, is also a stop codon. Our project is therefore relevant, as the selenoproteome of Varanus komodoensis had not been characterised before.
Here, a summary of the project is presented. We have obtained the following characterization:
- Selenoproteins: DI1, DI2, DI3, GPx1, GPx2, GPx4, SelS, SelT, SelU1, TR1, TR3, Sel15, SelH, SelI, SelK, SelM, SelN, SelO, SelP, SelR1, SPS2.
- Cys-containing homologs: SelW2, SelU2, SBP2.
- Selenoprotein machinery: eEFSec, PSTK, SBP2, SPS1, SPS2, SecS, SECp43.
- Non-Selenoproteins: GPx3, GPx5, GPx6, GPx7, GPx8, MsrA, SelR2, SelR3, SelW1, TR2.
- Proteins not present in Varanus komodoensis: SelV, SelU3.
1. Annual Reviews. (2019). Selenium biochemistry. [online] Available at: https://www.annualreviews.org/doi/pdf/10.1146/annurev.bi.59.070190.000551 [Accessed 14 Nov. 2019].
2. Annual Reviews. (2019). Selenocysteine. [online] Available at: https://www.annualreviews.org/doi/abs/10.1146/annurev.bi.65.070196.000503 [Accessed 14 Nov. 2019].
3. Burden, W. Douglas (1927). Dragon Lizards of Komodo: An Expedition to the Lost World of the Dutch East Indies. Kessinger Publishing.
4. Berry MJ, Banu L, Harney JW, Larsen PR. (1993). Functional characterization of the eukaryotic SECIS elements which direct selenocysteine insertion at UGA codons.
5. Bertz M, Kühn K, Koeberle S, Müller M, Hoelzer D, Thies K et al. Selenoprotein H controls cell cycle progression and proliferation of human colorectal cancer cells. Free Rad Bio Med. 2018;127: 98-107.
6. Castellano, S., Novoselov, S., Kryukov, G., Lescure, A., Blanco, E., Krol, A., Gladyshev, V. and Guigó, R. (2019). Reconsidering the evolution of eukaryotic selenoproteins: a novel nonmammalian family with scattered phylogenetic distribution.
7. Horibata Y, Elpeleg O, Eran A, Hirabayashi Y, Savitzki D, Tal G et al. EPT1 (selenoprotein I) is critical for the neural development and maintenance of plasmalogen in humans. J Lipid Res. 2018;59(6):1015-1026.
8. Labunskyy, V., Hatfield, D. and Gladyshev, V. (2019). Selenoproteins: Molecular Pathways and Physiological Roles.
9. Lobanov, A., Hatfield, D. and Gladyshev, V. (2019). Eukaryotic selenoproteins and selenoproteomes.
10. Lutz, Richard L; Lutz, Judy Marie (1997). Komodo: The Living Dragon.
11. Marciel M, Hoffmann P. Molecular Mechanisms by Which Selenoprotein K Regulates Immunity and Cancer. Biol Trace Elem Res. 2019;192(1): 60-68.
12. Mariotti M., Ridge PG., Zhang Y., Lobanov AV., Pringle TH., Guigo R. (2012). Composition and Evolution of the Vertebrate and Mammalian Selenoproteomes.
13. Papp LV, e. (2019). From selenium to selenoproteins: synthesis, identity, and their role in human health. - PubMed - NCBI. [online] Ncbi.nlm.nih.gov. Available at: https://www.ncbi.nlm.nih.gov/pubmed/17508906 [Accessed 14 Nov. 2019].
14. PR, D. (2019). Mechanism and regulation of selenoprotein synthesis. - PubMed - NCBI. [online] Ncbi.nlm.nih.gov. Available at: https://www.ncbi.nlm.nih.gov/pubmed/12524431 [Accessed 14 Nov. 2019].
15. Sciencedirect.com. (2019). Selenoproteins - an overview | ScienceDirect Topics. [online] Available at: https://www.sciencedirect.com/topics/neuroscience/selenoproteins [Accessed 14 Nov. 2019].
16. Toppo S, Vanin S, Bosello V, Tosatto SC (2008) Evolutionary and structural insights into the multifaceted glutathione peroxidase (gpx) superfamily. Antioxid Redox Signal 10: 1501–1514.
17. Vitt, L. and Auffenberg, W. (1982). The Behavioral Ecology of the Komodo Monitor.
We would like to express our gratitude to the subject coordinator, Roderic Guigó, who introduced us to the world of bioinformatics. Thanks also to our supervisor, Hrant Hovhannisyan, for helping us develop our project and guiding us at every step. Finally, thanks to Universitat Pompeu Fabra for the opportunity to carry out a practical project, and for the material and software used during our work.
We are four fourth-year Human Biology students from Universitat Pompeu Fabra (Barcelona). This project is part of the Bioinformatics course carried out from September to December 2019. Please feel free to contact us with any questions about our work.