Abstract

Selenium is an essential micronutrient for many organisms. Its biological role is attributed largely to its presence in selenoproteins as the 21st amino acid, selenocysteine (Sec). Sec is encoded by TGA in DNA. A dedicated mechanism decodes the UGA codon in mRNA, normally a stop codon, to incorporate Sec co-translationally into the growing polypeptide.



The aim of this project was to identify and annotate the selenoproteins, selenoprotein machinery factors and proteins related to selenium metabolism of Varanus komodoensis (Komodo dragon) through gene prediction, phylogenetic reconstruction and analysis of gene structures and SECIS elements.



In total, we identified 21 selenoproteins, 3 Cys-containing homologues and 7 selenoprotein translation machinery proteins, giving a comprehensive picture of the species' selenoproteome.





Introduction

Selenoproteins

Selenoproteins are selenium-containing proteins that are present in all three domains of life: bacteria, archaea and eukarya [8]. Although selenoproteins take part in diverse molecular pathways and biological functions, they all contain at least one selenocysteine (Sec), a selenium-containing amino acid, and most serve oxidoreductase functions. Their biological roles include oxidoreduction, redox signaling, antioxidant defense, thyroid hormone metabolism and immune responses. They have therefore been linked to human diseases such as cancer, Keshan disease, viral infections, male infertility, and abnormalities in immune responses and thyroid hormone function [12].

Figure 1. Eukaryotic selenoproteins' functions.



SELENOCYSTEINE

Selenocysteine is recognized as the 21st amino acid in ribosome-mediated protein synthesis, and its specific incorporation is directed by the UGA codon. Its structure is similar to that of cysteine, but with a selenium atom in place of the usual sulfur, forming a selenol group [2].

Proteins that contain one or more selenocysteine residues are called selenoproteins.
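The dual reading of UGA can be illustrated with a minimal Python sketch. It uses the standard genetic code and a hypothetical `recode_uga` flag that reads every in-frame UGA as Sec (`U`). This is a deliberate simplification: in a real cell, recoding depends on the SECIS element and dedicated factors, not on a global switch.

```python
# Standard genetic code, codons ordered U, C, A, G at each position.
BASES = "UCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(mrna, recode_uga=False):
    """Translate an mRNA from its first base; stop at the first stop codon.

    If recode_uga is True, every in-frame UGA is read as Sec ('U')
    instead of stop. This flag is an illustrative assumption.
    """
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and recode_uga:
            protein.append("U")
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)
```

With the toy sequence `AUGUGAGGCUAA`, `translate(seq)` stops at the UGA and returns `M`, whereas `translate(seq, recode_uga=True)` reads through it and returns `MUG`.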

Figure 2. Cysteine and Selenocysteine residues.





SELENOPROTEINS EVOLUTION

While the genome sequence and gene content are available for an increasing number of organisms, eukaryotic selenoproteins remain poorly characterized. The dual role of the UGA codon confounds the identification of novel selenoprotein genes [5].

A more direct approach is to search for occurrences of the SECIS structural pattern. Although this approach has been applied successfully to expressed sequence tag (EST) and other cDNA sequences, the low specificity of SECIS searches produces a large number of predictions when applied to whole eukaryotic genomes.
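The specificity problem can be seen with a toy pattern search. The Python sketch below scans DNA for a loose SECIS-like motif (an ATGA core, an AA further along and a GA on the 3' side). The spacer lengths are illustrative assumptions, not the model used by SECISearch3, and no secondary structure is checked, which is why a pattern this permissive would match many unrelated genomic regions.

```python
import re

# Toy SECIS-like pattern (assumption): conserved ATGA core, an AA in the
# apical region and a GA on the 3' side, separated by loosely sized spacers.
# Spacer lengths are illustrative and not taken from SECISearch3.
SECIS_LIKE = re.compile(r"ATGA[ACGT]{8,14}AA[ACGT]{8,20}GA")


def find_secis_like(dna):
    """Return 0-based start positions of non-overlapping pattern matches."""
    return [m.start() for m in SECIS_LIKE.finditer(dna.upper())]
```

Dedicated tools add structural and energetic filters on top of such sequence constraints to reduce the number of spurious hits.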

At larger evolutionary distances, some selenoproteins are lost while a few evolve and duplicate. In addition, selenoproteins are unevenly distributed among species. The highest selenoprotein content is observed in aquatic organisms, whether animals or plants [7].

Furthermore, some species have lost Sec-containing proteins and replaced them with Cys-containing homologues, which do not always have the same function [9].

The change from aquatic to terrestrial habitats posed a challenge to plants and animals, as the availability of some trace elements greatly diminished, and these organisms were now exposed to higher oxygen levels. As a result, many terrestrial organisms might have lost selenoproteins or replaced them with cysteine-containing homologs.



Figure 4. Evolution of the vertebrate selenoproteome.



Nevertheless, considerable progress has been made in characterizing the mechanisms and regulation of selenoprotein synthesis in eukaryotes. Recent discoveries have improved our understanding of Se metabolic pathways, the mechanism of Sec biosynthesis in eukaryotes, the identities of selenoproteins and the functions of some of these proteins. However, the biological functions of most Sec-containing proteins remain unknown [7].





Varanus komodoensis









Methods

This project aims to predict and annotate the selenoproteins, selenoprotein machinery factors and proteins related to selenium metabolism of Varanus komodoensis. To do so, its genome was compared with two already annotated genomes: that of a reptile, Anolis carolinensis, and the human genome. Candidate proteins were predicted and annotated through local alignment. The predicted proteins were then characterized via global alignment and the SECISearch3/Seblastian software.



Komodo's genome

Retrieval of the genome of the organism under analysis.

Query retrieval

Retrieval of the sequences of the reference species' selenoproteomes.

Prediction process

Gene prediction based on the reference species' selenoproteomes.

SECIS characterisation

Use of the Seblastian and SECISearch3 software to analyse potential SECIS elements.

Phylogenetic tree

Complementing the protein family predictions with phylogenetic reconstruction methods.
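As a rough illustration of the prediction step, the following Python sketch turns local alignment hits into the scaffold, strand and position fields reported in the results. It assumes the hits come from tblastn in tabular format (`-outfmt 6`, default columns). The choice of tblastn, the e-value cutoff and the function names are assumptions made for this example, not details taken from the project.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    query: str
    scaffold: str
    strand: str
    start: int
    end: int
    evalue: float


def parse_blast_tab_line(line):
    """Parse one line of BLAST tabular output (-outfmt 6, default columns)."""
    fields = line.rstrip("\n").split("\t")
    query, scaffold = fields[0], fields[1]
    sstart, send = int(fields[8]), int(fields[9])
    evalue = float(fields[10])
    # For tblastn, a hit on the reverse strand is reported with sstart > send.
    strand = "+" if sstart <= send else "-"
    return Hit(query, scaffold, strand, min(sstart, send), max(sstart, send), evalue)


def best_hit(lines, max_evalue=1e-5):
    """Return the lowest e-value hit at or below the cutoff, or None."""
    hits = [parse_blast_tab_line(line) for line in lines if line.strip()]
    hits = [hit for hit in hits if hit.evalue <= max_evalue]
    return min(hits, key=lambda hit: hit.evalue) if hits else None
```

The region around the best hit would then typically be passed on to a spliced aligner for exon prediction; that step is not shown here.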





Results

The results of the analysis of Varanus komodoensis' selenoproteome are shown in the following table, with the corresponding data for each protein and reference species.



The icons and abbreviations used in the table are defined in the legend:

· - Homo sapiens protein

· - Anolis carolinensis protein

· - Gallus gallus protein

· None - No output.

· Sec - Selenocysteine-containing protein.

· Cys - Cys-containing homologue.

· Mac - Machinery. Non-Sec and non-Cys containing protein.

· ID - Incomplete data. The presence or position of Cys or Sec could not be predicted.



Selenoproteins - Selenoproteins Machinery



SELENOPROTEINS

Protein Species Query BLAST Scaffold Strand Position (start-end) Exonerate T-Coffee Residue SECIS Protein prediction
Iodothyronine deiodinases (Dio)
DI1 SJPD01000026.1 + 4652779-4652928 Sec
DI1 SJPD01000026.1 + 4652788-4653850 Sec
DI2 SJPD01000003.1 - 39328645-39341228 Sec
DI2 SJPD01000026.1 + 4652791-4652928 Sec
DI3 SJPD01000026.1 - 4652761-4652928 Sec
DI3 SJPD01000003.1 + 52459709-52460527 Sec
Glutathione peroxidases (GPx)
GPx1 SJPD01000040.1 + 929848-930189 Sec
GPx1 SJPD01000040.1 + 92663-930189 Sec
GPx2 SJPD01000040.1 + 926621-930195 Sec
GPx2 SJPD01000040.1 + 926621-930195 Sec
GPx3 SJPD01000009.1 + 17714506-17721381 None None
GPx3 SJPD01000009.1 + 17714506-17721381 None None
GPx4 SJPD01000092.1 - 2048178-2048609 Sec
GPx4 SJPD01000092.1 - 2048178-2048339 None
GPx5 SJPD01000009.1 + 17718089-17721378 None None
GPx6 SJPD01000009.1 + 17714506-17721378 None None
GPx7 SJPD01000026.1 - 3547261-3558029 None None
GPx7 SJPD01000026.1 - 3547252-3547518 None None
GPx8 SJPD01000005.1 + 4298996-4300345 None None
GPx8 SJPD01000005.1 + 4300082-4300345 None None
GPx8 SJPD01000005.1 + 4300082-4300342 None None
Methionine sulfoxide reductase (MsrA)
MsrA SJPD01000076.1 - 149906-162828 None None
MsrA SJPD01000076.1 - 149907-164533 ID None
Selenoprotein 15 (Sel15)
Sel15 SJPD01000031.1 - 5148833-5149003 None None
Sel15 SJPD01000031.1 - 5115475-5149003 None None
Selenoprotein I (SelI)
SelI SJPD01000126.1 - 951142-964258 Sec
SelI SJPD01000126.1 - 951124-951318 Sec
Selenoproteins S and K (SelS and SelK)
SelS None
SelS SJPD01000057.1 - 5580804-5582539 Sec
SelK SJPD01000007.1 + 14408200-14408295 None None
SelK SJPD01000007.1 + 14408200-14408295 Sec
Selenoprotein M (SelM)
SelM SJPD01000053.1 + 3410972-3411133 Sec
SelM SJPD01000053.1 + 3410086-3411103 Sec
Selenoprotein N (SelN)
SelN SJPD01000036.1 + 7844146-782210 Sec
SelN SJPD01000036.1 + 7844146-7862222 Sec
Selenoprotein O (SelO)
SelO SJPD01000016.1 - 1978997-1986356 Sec
SelO SJPD01000020.1 - 22000032-22009673 Cys None
Selenoprotein P (SelP)
SelP SJPD01000005.1 - 747511-749227 Sec
SelP SJPD01000005.1 - 747508-749227 Sec
Methionine-R-sulfoxide reductase (SelR or MSRB)
MSRB1 SJPD01000085.1 + 1870578-1872496 Cys
MSRB1 SJPD01000085.1 + 1870593-1872496 Sec
MSRB2 SJPD01000020.1 - 23300656-23299803 ID None
MSRB2 SJPD01000020.1 - 23300659-23299803 ID None
MSRB3 SJPD01000017.1 - 758769-805563 Cys None
MSRB3 SJPD01000017.1 - 764575-819877 Cys None
Selenoproteins W, T, H and V (SelW, SelT, SelH and SelV)
SelW1 SJPD01000036.1 - 11391124-11391029 None
SelW1 SJPD01000036.1 - 11391103-11391029 ??
SelW2 SJPD01000061.1 - 3248079-3250889 Sec
SelT SJPD01000032.1 + 13468766-13471916 Sec
SelT SJPD01000032.1 + 13468766-13471916 Sec
SelH SJPD01000122.1 + 569175-569258 Sec
SelH SJPD01000122.1 + 569172-572138 Sec
SelV None
Selenoprotein U (SelU)
SelU1 SJPD01000014.1 + 6965627-6970639 Cys None
SelU1 SJPD01000014.1 + 6965164-6972460 Sec
SelU2 SJPD01000005.1 + 41696948-41801490 Cys None
SelU2 SJPD01000005.1 + 41746945-41751490 Cys None
SelU3 None
Thioredoxin Reductases (TRs)
TR1 SJPD01000048.1 - 5896400-5909749 Sec
TR1 SJPD01000048.1 - 5896224-5916641 Sec
TR2 SJPD01000007.1 - 1591769-1624455 None None
TR3 SJPD01000007.1 + 20925190-20939399 Sec
TR3 SJPD01000007.1 + 20905682-20939417 Sec None







SELENOPROTEINS MACHINERY

Protein Species Query BLAST Scaffold Strand Position (start-end) Exonerate T-Coffee Residue SECIS Protein prediction
Sec-specific eukaryotic elongation factor (eEFsec)
eEFsec SJPD01000007.1 - 18589990-18701233 Mac None
eEFsec SJPD01000007.1 - 18630044-18649127 Mac None
Phosphoseryl-tRNA kinase (PSTK)
PSTK SJPD01000060.1 + 674799-683674 Mac None
SECIS binding protein 2 (SBP2)
SBP2 SJPD01000041.1 - 7176579-7175318 Cys
SBP2 SJPD01000041.1 - 7191828-7171816 Cys None
Sec synthase (SecS)
SecS SJPD01000012.1 - 13832197-13801608 Mac None
Selenophosphate synthase family (SPS or SPHS)
SPS1 SJPD01000016.1 - 15068382-15092907 Mac None
SPS1 SJPD01000016.1 - 15068382-15092907 Mac None
SPS2 SJPD01000051.1 - 3708299-3713123 None None
SPS2 SJPD01000051.1 - 3708260-3713129 Sec
SECp43 (SECp43)
SECp43 SJPD01000036.1 + 4926809-4935614 Mac None




The absolute positions of the exons and SECIS elements listed above are available in Excel or PDF format.





Discussion

As stated above, the aim of this project was to identify every selenoprotein and selenoprotein machinery gene present in the Varanus komodoensis genome. To do so, the Komodo dragon genome was compared against the human and lizard (Anolis carolinensis) selenoproteins. We chose these two species because the human genome is very well annotated, whereas the lizard is a closely related species. In cases that required supplementary data, the chicken selenoproteome was also considered. In addition, SECIS elements were predicted with Seblastian and SECISearch3. Taking all these results into account, each protein was then discussed individually.



We applied the following criteria to decide whether a predicted protein was a selenoprotein, a Cys-containing homologue or neither. A predicted protein is considered a selenoprotein when a UGA codon aligns with a selenocysteine (or a cysteine) in the query and the alignment that follows it is perfect. In addition, a SECIS element must be found in the 3' UTR. A predicted protein is considered a Cys-containing homologue when a cysteine in Varanus komodoensis aligns with a selenocysteine in the query. As in the previous case, the alignment that follows the cysteine must be perfect; in some cases a SECIS element can still be found, since these proteins were originally selenoproteins. Finally, a protein that has replaced its selenocysteine with any amino acid other than cysteine is classified under other proteins.
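These criteria can be summarised as a small decision function. The Python sketch below is one possible encoding. The argument names and the `unresolved` label for cases the criteria do not cover (for example, a UGA aligned with Sec but no SECIS element) are assumptions introduced for illustration.

```python
def classify_prediction(query_residue, target_residue, downstream_ok, has_secis):
    """Classify a predicted protein from the aligned residue pair.

    query_residue / target_residue: one-letter codes at the aligned
    position, with 'U' standing for Sec (a UGA codon in the target).
    downstream_ok: True if the alignment after that position holds.
    has_secis: True if a SECIS element was predicted in the 3' UTR.
    """
    if target_residue == "U" and query_residue in ("U", "C"):
        if downstream_ok and has_secis:
            return "selenoprotein"
        return "unresolved"  # assumption: label not used in the text
    if target_residue == "C" and query_residue == "U":
        # A SECIS element may or may not remain in a Cys homologue.
        return "Cys-containing homologue" if downstream_ok else "unresolved"
    if query_residue == "U":
        # Sec replaced by an amino acid other than Cys.
        return "other"
    return "unresolved"
```

For example, `classify_prediction("U", "U", True, True)` returns `selenoprotein`, while `classify_prediction("U", "C", True, False)` returns `Cys-containing homologue`.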

44 proteins
9 proteins

Conclusion

The aim of the project was to identify and annotate Varanus komodoensis selenoproteins, selenoprotein machinery factors and proteins related to selenium metabolism.



Despite the importance of selenoproteins, selenoproteomes are not properly annotated in all organisms. This is because the codon that encodes selenocysteine, UGA, is also a stop codon. Our project is therefore relevant, as the selenoproteome of Varanus komodoensis had not been characterised before.



Here, a summary of the project is presented. We have obtained the following characterization:

- Selenoproteins: DI1, DI2, DI3, GPx1, GPx2, GPx4, SelS, SelT, SelU1, TR1, TR3, Sel15, SelH, SelI, SelK, SelM, SelN, SelO, SelP, SelR1, SPS2.

- Cys-containing homologues: SelW2, SelU2, SBP2.

- Selenoprotein machinery: eEFsec, PSTK, SBP2, SPS1, SPS2, SecS, SECp43.

- Non-Selenoproteins: GPx3, GPx5, GPx6, GPx7, GPx8, MsrA, SelR2, SelR3, SelW1, TR2.

- Proteins not present in Varanus komodoensis: SelV, SelU3.



References

1. Annual Reviews. (2019). Selenium biochemistry. [online] Available at: https://www.annualreviews.org/doi/pdf/10.1146/annurev.bi.59.070190.000551 [Accessed 14 Nov. 2019].

2. Annual Reviews. (2019). Selenocysteine. [online] Available at: https://www.annualreviews.org/doi/abs/10.1146/annurev.bi.65.070196.000503 [Accessed 14 Nov. 2019].

3. Burden, W. Douglas (1927). Dragon Lizards of Komodo: An Expedition to the Lost World of the Dutch East Indies. Kessinger Publishing.

4. Berry MJ, Banu L, Harney JW, Larsen PR. (1993). Functional characterization of the eukaryotic SECIS elements which direct selenocysteine insertion at UGA codons.

5. Bertz M, Kühn K, Koeberle S, Müller M, Hoelzer D, Thies K et al. Selenoprotein H controls cell cycle progression and proliferation of human colorectal cancer cells. Free Rad Bio Med. 2018;127: 98-107.

6. Castellano, S., Novoselov, S., Kryukov, G., Lescure, A., Blanco, E., Krol, A., Gladyshev, V. and Guigó, R. (2019). Reconsidering the evolution of eukaryotic selenoproteins: a novel nonmammalian family with scattered phylogenetic distribution.

7. Horibata Y, Elpeleg O, Eran A, Hirabayashi Y, Savitzki D, Tal G et al. EPT1 (selenoprotein I) is critical for the neural development and maintenance of plasmalogen in humans. J Lipid Res. 2018;59(6):1015-1026.

8. Labunskyy, V., Hatfield, D. and Gladyshev, V. (2019). Selenoproteins: Molecular Pathways and Physiological Roles.

9. Lobanov, A., Hatfield, D. and Gladyshev, V. (2019). Eukaryotic selenoproteins and selenoproteomes.

10. Lutz, Richard L; Lutz, Judy Marie (1997). Komodo: The Living Dragon.

11. Marciel M, Hoffmann P. Molecular Mechanisms by Which Selenoprotein K Regulates Immunity and Cancer. Biol Trace Elem Res. 2019;192(1): 60-68.

12. Mariotti M., Ridge PG., Zhang Y., Lobanov AV., Pringle TH., Guigo R. (2012). Composition and Evolution of the Vertebrate and Mammalian Selenoproteomes.

13. Papp LV et al. (2019). From selenium to selenoproteins: synthesis, identity, and their role in human health. - PubMed - NCBI. [online] Ncbi.nlm.nih.gov. Available at: https://www.ncbi.nlm.nih.gov/pubmed/17508906 [Accessed 14 Nov. 2019].

14. PR, D. (2019). Mechanism and regulation of selenoprotein synthesis. - PubMed - NCBI. [online] Ncbi.nlm.nih.gov. Available at: https://www.ncbi.nlm.nih.gov/pubmed/12524431 [Accessed 14 Nov. 2019].

15. Sciencedirect.com. (2019). Selenoproteins - an overview | ScienceDirect Topics. [online] Available at: https://www.sciencedirect.com/topics/neuroscience/selenoproteins [Accessed 14 Nov. 2019].

16. Toppo S, Vanin S, Bosello V, Tosatto SC (2008) Evolutionary and structural insights into the multifaceted glutathione peroxidase (gpx) superfamily. Antioxid Redox Signal 10: 1501–1514.

17. Vitt, L. and Auffenberg, W. (1982). The Behavioral Ecology of the Komodo Monitor.



Acknowledgements

We would like to express our gratitude to the subject coordinator, Roderic Guigó, who introduced us to the world of bioinformatics. Thanks also to our supervisor, Hrant Hovhannisyan, for helping us develop our project and guiding us at every step. Finally, thanks to Universitat Pompeu Fabra for the opportunity to carry out a practical project, and for the material and software used during our work.


About us

We are four fourth-year Human Biology students at Universitat Pompeu Fabra (Barcelona). This project is part of the Bioinformatics course, which ran from September to December 2019. Please feel free to contact us with any questions about our work.
