Discussion




As stated earlier, the aim of this project was to identify every selenoprotein and selenoprotein machinery gene present in the Varanus komodoensis genome. To do so, the Komodo genome was compared with human and lizard selenoproteins. We chose these two species because the human genome is very well annotated, whereas the lizard is a closely related species. In the cases that required supplementary data, the chicken selenoproteome was also considered. In addition, SECIS elements were predicted with Seblastian and SECISearch3. Finally, taking all the results into account, each protein was discussed individually.



We applied the following criteria to decide whether a predicted protein was a selenoprotein, a Cys-containing homolog or neither. A predicted protein is considered a selenoprotein when a UGA codon is aligned with a selenocysteine (or with a cysteine) in the query and the alignment that follows is perfect. In addition, a SECIS element must be found in the 3’ UTR. A predicted protein is considered a Cys-containing homolog when a cysteine in Varanus komodoensis is aligned with a selenocysteine in the query. As in the previous case, the alignment that follows the cysteine has to be perfect; in some cases SECIS elements can also be found, because these proteins were originally selenoproteins. Finally, a protein that has lost its selenocysteine and replaced it with any amino acid other than cysteine is classified under other proteins.
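The decision rules above can be summarized in a short sketch. The function name, the argument names and the one-letter code `U` for selenocysteine are assumptions made for illustration (the alignments discussed below show Sec as X). The `unconfirmed` outcome, used for cases that the criteria do not settle, is also an assumption and not part of the workflow described here.

```python
def classify_prediction(query_residue, target_residue,
                        downstream_alignment_ok, secis_in_3utr):
    """Apply the classification criteria to one aligned position.

    query_residue: residue in the reference protein, "U" (Sec) or "C" (Cys).
    target_residue: residue predicted in the Komodo sequence at that position
        ("U" stands for an in-frame UGA codon).
    downstream_alignment_ok: True if the alignment after that position holds.
    secis_in_3utr: True if a SECIS element was predicted in the 3' UTR.
    """
    if query_residue not in ("U", "C"):
        raise ValueError("query position must be Sec (U) or Cys (C)")
    if target_residue == "U":
        # UGA aligned with Sec or Cys: a selenoprotein only if the rest of
        # the alignment and a SECIS element support it.
        if downstream_alignment_ok and secis_in_3utr:
            return "selenoprotein"
        return "unconfirmed"  # assumption: label for undecided cases
    if target_residue == "C" and query_residue == "U":
        # Cys in Komodo aligned with Sec in the query; a SECIS is optional.
        if downstream_alignment_ok:
            return "cys-containing homolog"
        return "unconfirmed"
    if query_residue == "U":
        # Sec replaced by an amino acid other than Cys.
        return "other protein"
    return "unconfirmed"


print(classify_prediction("U", "U", True, True))   # selenoprotein
print(classify_prediction("U", "C", True, False))  # cys-containing homolog
```

Keeping the SECIS requirement inside the Sec branch only reflects the criteria as stated: a SECIS element is mandatory for a selenoprotein call but merely possible for a Cys-containing homolog.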





Selenoproteins

Iodothyronine deiodinase family (DIO)

The iodothyronine deiodinase (DI) family of selenoproteins consists of three paralogous proteins in mammals (DI1, DI2 and DI3), which regulate thyroid hormone activity by reductive deiodination [8]. The main hormone produced by the thyroid is secreted in its inactive form, thyroxine (T4), and the proteins of this family metabolize T4 to activate or inactivate it. Surprisingly, homologs of mammalian deiodinases occur not only in other vertebrates but also in simple eukaryotes and bacteria, where their function is unknown. The three proteins differ in subcellular localization and tissue expression. All of them are transmembrane proteins that have a thioredoxin fold and form homodimers. Circulating levels of thyroid hormone are mainly regulated by DI1 activity, whereas DI2 and DI3 have been implicated in fine-tuning local intracellular T3 concentrations in a tissue-specific manner, without changing overall serum levels of T3. All iodothyronine deiodinases contain Sec. The active-site Sec residue is located in the N-terminal part of the protein. In DI2, however, an additional Sec of unknown function is present in the C-terminal region. Other authors state that the primary function of this second UGA appears to be to serve as a stop codon. This second Sec does not participate in the catalytic mechanism and is dispensable for DI2 activity. The three DIs have low sequence identity (50%), but they share a similar overall topology and structural organization [12]. All three iodothyronine deiodinases (DI1, DI2 and DI3) are found in all vertebrates.

Figure. Chemical processes mediated by Sec-containing enzymes (DIO).





Glutathione peroxidases (GPx)

Glutathione peroxidases are the largest selenoprotein family in vertebrates, and the family is widespread in all three domains of life, including single-cell eukaryotes and prokaryotes. GPxs are involved in hydrogen peroxide (H2O2) signaling, detoxification of hydroperoxides and maintenance of cellular redox homeostasis. In mammals, there are eight GPx paralogs, five of which (GPx1, GPx2, GPx3, GPx4 and GPx6) contain a Sec residue in their active site. In the other three GPx homologs (GPx5, GPx7 and GPx8), the active-site Sec is replaced by Cys. Moreover, GPx6 homologs in some mammals are not selenoproteins and have a Cys in the active site [8]. They are all highly conserved, and this is even more evident in the Sec-containing GPxs (mammalian GPxs share approximately 80% identity) [12]. GPx1, GPx2, GPx3 and possibly GPx6 work as tetramers, whereas GPx4 is a monomer. This protein family includes the two types of selenoproteins mentioned in the introduction of this website. GPx1 is a stress-related selenoprotein that is highly regulated by selenium availability (its expression decreases drastically when selenium is lacking), whereas GPx4 is a housekeeping selenoprotein (it is less affected by dietary selenium status and often serves functions critical to cell survival) [8].

Figure. Chemical processes mediated by Sec-containing enzymes (GPx).

We present here an unambiguous phylogeny of the GPx family in which three evolutionary groups are observed: GPx1/GPx2, GPx3/GPx5/GPx6, and GPx4/GPx7/GPx8. It appears that the Cys-containing GPx7 and GPx8 evolved from a GPx4-like selenoprotein ancestor, and that this happened before the separation of mammals and fishes. GPx5 and GPx6 are the most recently evolved GPxs and appear to be the result of a tandem duplication of GPx3 at the root of placental mammals. Interestingly, no Sec-containing GPx5 form could be identified. As the phylogeny indicates that this protein evolved from a duplication of the selenoprotein GPx3, the Sec-to-Cys replacement must have happened very early in the evolution of GPx5.







Methionine sulfoxide reductase A (MSRA)

MsrA is a sulfoxide reductase present in humans and some other eukaryotic species. Because methionine is highly susceptible to oxidation, it is converted into a mixture of methionine-S-sulfoxide and methionine-R-sulfoxide. MsrA, whose function is complementary to that of MsrB1, reduces methionine-S-sulfoxide [8]. However, MsrA and MsrB have different structures and belong to different selenoprotein families.

Figure. Chemical processes mediated by Sec-containing enzymes (MSR).

MsrA is present in both human and lizard. The two queries gave similar results and hits when blasted against the Komodo genome: tblastn returned hits in two scaffolds, the same ones for both species. To choose the most relevant hits, we looked at the phylogenetic tree. In the human comparison, the hit most closely related to the original human protein was located in scaffold SJPD01000076.1, between positions 149906 and 162828; in the lizard comparison, it was located in the same scaffold, between positions 149907 and 164533.

Because the lizard has the protein under analysis and is phylogenetically closer to the Komodo dragon, we based our interpretation on the results obtained with this species. The prediction corresponds to a gene with 5 exons on the reverse strand. The alignment does not start with a methionine, possibly because exonerate did not predict the gene structure correctly and the first residues are therefore missing (including the initial Met). Moreover, no selenocysteine is aligned in the T-Coffee results, so our next step was a Seblastian/SECISearch3 analysis to detect SECIS elements associated with the gene and thereby assess whether it could be a Cys-containing homolog. However, we did not obtain any valid SECIS element in our scaffolds.
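A check of this kind can be automated. The sketch below, with hypothetical function names and a made-up toy sequence, reports whether an aligned predicted protein starts with methionine and how many gap columns precede and follow it. It assumes that gaps are written as `-`, as in common alignment formats.

```python
def terminal_gaps(aligned_seq, gap="-"):
    """Return (leading, trailing) gap counts of one aligned sequence."""
    leading = len(aligned_seq) - len(aligned_seq.lstrip(gap))
    trailing = len(aligned_seq) - len(aligned_seq.rstrip(gap))
    return leading, trailing


def starts_with_met(aligned_seq, gap="-"):
    """True if the first residue (ignoring gaps) is methionine."""
    residues = aligned_seq.replace(gap, "")
    return residues.startswith("M")


# Toy example (not real data): three residues missing at the start.
predicted = "---KLVUAG--"
print(terminal_gaps(predicted))    # (3, 2)
print(starts_with_met(predicted))  # False
```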

In conclusion, given the T-Coffee and Seblastian results, we cannot confirm that MsrA is present in the Komodo genome.



Selenoprotein 15 (Sel15)

This protein has a thioredoxin-like domain and contains an N-terminal signal peptide, consistent with its ER localization. Its function is the reduction or rearrangement of disulfide bonds in ER-localized or secretory proteins [8].

This protein is found in both lizard and human. When running the blast, hits are found in scaffold SJPD01000031.1, between positions 5149003-5148833 for human and 5149003-5115475 for lizard. The protein predicted from the human query has 2 exons on the reverse strand, whereas the one predicted from the lizard query has 4 exons, also on the reverse strand. Both T-Coffee alignments have very high scores, but the human one shows a gap at the beginning and at the end, meaning that part of the protein is missing. The lizard alignment, in contrast, shows only a few amino acid changes and no gaps. In addition, the selenocysteine in the predicted protein matches the selenocysteine in both queries. Regarding the Seblastian results, significant SECIS elements are found in the 3’ UTR (positions 5108382-5108309 for human and 5115024-5114951 for lizard).

Therefore, we can conclude that Sel15 is a selenoprotein in Varanus komodoensis.





Selenoprotein I (SelI)

Selenoprotein I catalyzes phosphatidylethanolamine biosynthesis and plays an important role in the formation and maintenance of vesicle membranes [7].

In this analysis, the same significant scaffold was found for human and lizard, SJPD01000126. For human, the hit spanned positions 951142-964258 on the reverse strand and contained 9 exons. For lizard, it spanned positions 960906-962118, also on the reverse strand, and this time contained 10 exons. Both T-Coffee scores were high (999 and 997), but the Sec (X) was aligned only in the protein predicted from the human query. In the lizard results, the first part of the predicted protein was missing, probably because it lies in another scaffold.

For this reason, and because this selenoprotein has multiple isoforms in lizard, we performed another analysis with a second isoform. For this isoform we chose scaffold SJPD01000126, positions 951124-951318. The resulting prediction had 3 exons on the reverse strand and a T-Coffee score of 995. This time the X residues were aligned, so we could validate the presence of this selenoprotein with both species.

Seblastian predicted Selenoprotein I both in the human comparison and in the chosen lizard isoform. Taken together, these results allow us to conclude that Varanus komodoensis has Selenoprotein I.



Selenoprotein K (SelK)

Selenoprotein K is involved in calcium flux in immune cells, T cell proliferation and neutrophil migration. It also takes part in the endoplasmic reticulum-associated degradation of glycosylated proteins and plays a key role in protecting cells from apoptosis induced by ER stress [11].

This protein is also found in both species. Although the blast returned many hits in many scaffolds, only one hit per species remained after applying the selection criteria. Both hits were in the same scaffold, SJPD01000007. The human hit was at positions 14408200-14408295 and the lizard hit at 14408200-14408295; both predictions were on the forward strand, contained 4 exons and started with methionine. The T-Coffee score was not especially high for human (989) but was quite high for lizard (994). In the human T-Coffee alignment the query contains an X, but it is not aligned with the predicted protein: three residues are missing at the position where the human protein has a Sec. In the lizard T-Coffee alignment, however, an X is aligned with another X. This suggests that Selenoprotein K exists in Varanus komodoensis.

Because we found a relevant X alignment only with the lizard query, we ran Seblastian on that prediction and found one SECIS element, whose position was consistent with our prediction. We therefore confirm that SelK is a selenoprotein in Varanus komodoensis, even though the Sec was not recovered in the human comparison, probably because of the greater phylogenetic distance between human and our species.

Selenoprotein S (SelS)

Like SelK, SelS is involved in the transfer of misfolded proteins from the endoplasmic reticulum to the cytosol. That is, it takes part in the recognition, ubiquitination and retrotranslocation of protein substrates from the ER to the cytosol, and in their subsequent degradation by the ubiquitin/proteasome system.

This protein is found in both species. However, the blast with the human query returns no output, meaning that no hit meets the established conditions (identity >50% and e-value <0.00001). With the lizard query, in contrast, two significant hits were found between positions 5580804-5582539 in scaffold SJPD01000057.1. The T-Coffee shows a perfect alignment (score = 1000) covering almost the entire protein sequence. The predicted protein has 5 exons on the reverse strand. Thus, we can confirm that Selenoprotein S exists in the Varanus komodoensis genome.
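The filter described above could be applied as in the following sketch. It assumes that the hits were exported in the 12-column tabular BLAST format (`-outfmt 6`), in which the third column is the percent identity and the eleventh the e-value; the format actually used in this project is not stated, and the function name and the example rows are hypothetical.

```python
def filter_hits(lines, min_identity=50.0, max_evalue=1e-5):
    """Keep tabular BLAST hits with identity > 50% and e-value < 0.00001."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.split("\t")
        identity = float(fields[2])   # column 3: percent identity
        evalue = float(fields[10])    # column 11: e-value
        if identity > min_identity and evalue < max_evalue:
            hits.append({
                "scaffold": fields[1],
                "identity": identity,
                "start": int(fields[8]),
                "end": int(fields[9]),
                "evalue": evalue,
            })
    return hits


# Hypothetical rows (not real results)
rows = [
    "query1\tscaffold_A\t72.5\t120\t30\t1\t1\t120\t5000\t5360\t1e-40\t150",
    "query1\tscaffold_B\t45.0\t110\t55\t2\t5\t115\t900\t1230\t1e-20\t90",
    "query1\tscaffold_C\t80.0\t30\t6\t0\t1\t30\t100\t190\t0.01\t25",
]
print([h["scaffold"] for h in filter_hits(rows)])  # ['scaffold_A']
```

An empty result, as obtained with the human query here, simply means that no row passed both thresholds.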

The T-Coffee results also show that the selenocysteine is conserved, and the Seblastian results indicate the presence of a SECIS element on the reverse strand in the 3’ UTR (positions 5575992-5575917). Therefore, we can conclude that SelS is a selenoprotein in Varanus komodoensis.



Selenoprotein M (SelM)

Selenoprotein M is a thiol-disulfide oxidoreductase that resides in the ER. It is highly expressed in the brain and has important neuroprotective properties. Its active site lies in a selenocysteine-containing thioredoxin-like domain that mediates thiol-disulfide exchange [8].

As in several previous cases, we found many hits in different scaffolds for both species, but only one scaffold had hits that passed the filtering criteria. This scaffold, SJPD01000053, contained one hit for human (3410972-3411133) and two hits for lizard (spanning 3410086-3411103), all on the forward strand. The predicted proteins had 3 exons and very high T-Coffee scores, 998 and 999 respectively. In both comparisons the queries contained an X in the first positions, but it was not aligned because the first and last amino acids of our predicted proteins were missing. This may be due to an incorrect prediction by exonerate.

However, because our queries contained Sec, we ran Seblastian. It predicted two SECIS elements (one on each strand) in both species, and we analyzed the ones on the forward strand. After checking that their positions were consistent with our exons, we can confirm the presence of SelM in Varanus komodoensis.



Selenoprotein N (SelN)

Selenoprotein N plays a key role in protecting the cell against oxidative stress and in the redox-related regulation of calcium homeostasis. It regulates calcium levels in the endoplasmic reticulum by maintaining the proper function of the calcium pump [12].

Many hits were found when performing the blast, but all of them except one from human were in the same scaffold, SJPD01000036. The chosen human region spanned positions 7844146 – 782210 and included 10 exons; the lizard region spanned 7844146 – 7862222 and contained 11 exons. Both were on the reverse strand. The T-Coffee scores were very high (992 and 1000), although in the human analysis many residues were missing at the beginning of our predicted protein, probably because of exonerate. An aligned X was found in both T-Coffee alignments.

We corroborated these results with Seblastian, which reported three SECIS elements for both species. We discarded those on the positive strand and kept those whose positions were close to the exons of our predicted protein.

For all these reasons, we accept our prediction and confirm that Selenoprotein N is present in Varanus komodoensis.



Selenoprotein O (SelO)

Selenoprotein O catalyzes the transfer of AMP to Ser, Thr and Tyr residues of target proteins (AMPylation). It is not well characterized, but it is thought to be a redox-active mitochondrial selenoprotein that interacts with redox target proteins [12].

Many hits were found in both comparisons. After filtering, one scaffold with many hits remained for human, and two significant scaffolds with different hits remained for lizard. The region predicted from the human query was on the reverse strand of scaffold SJPD01000016, spanned positions 1978997-1986356 and had 10 exons. Its T-Coffee score was 992, and it had one aligned X. In the comparison with lizard, we analyzed both filtered scaffolds: SJPD01000016 from 1984204 to 1986122, and SJPD01000020 from 22000032 to 22009673. The first region had 2 exons and the second had 8, and both had high T-Coffee scores. Unfortunately, no X was found in the alignments, no cysteines were aligned either, and some residues of the predicted proteins were missing. However, Selenoprotein O has a second isoform in lizard, so we analyzed this other query. Again we found many hits, but only one scaffold, with two hits, passed the filtering: SJPD01000020, with the region of interest at 22025489-22025904. The prediction had 2 exons and a T-Coffee score of 1000. Once more no X was present in the alignment, but this time we found aligned cysteines. Even so, we could not conclude that it was a Cys-containing homolog, because the initial query contained no X, so the aligned cysteines may simply be ordinary cysteines.

For the human comparison, Seblastian predicted Selenoprotein O. This is not what we expected, since the lizard should be more similar to our species; a possible explanation is that the alignment was found only with the human query because human is better annotated than lizard. With this evidence, we conclude that Selenoprotein O is found in Varanus komodoensis.



Selenoprotein P (SelP)

Selenoprotein P contains more than one selenocysteine residue. It is a secreted protein formed by two domains, and it accounts for the majority of the selenium in plasma. Its function is not well known, but it might play a role in extracellular oxidant defense, because its presence is correlated with selenium protection [8].

Many hits were found in different scaffolds in both comparisons. For human, only one scaffold with three different hits remained after filtering: SJPD01000005, with the studied sequence at 747511-749227. The prediction had three exons on the reverse strand. The T-Coffee score was very high (996), the predicted protein started with a methionine and there was an aligned X. Seblastian found a SECIS element and predicted a selenoprotein.

The lizard has two isoforms of Selenoprotein P, and many scaffolds passed the filtering criteria. We chose one scaffold for each isoform based on high T-Coffee scores and the presence of an aligned X. In both cases the chosen scaffold was SJPD01000005. The region for the first isoform has 3 exons and spans 747511-747633; the region for the second also has 3 exons and spans 747508-749227. Both had an aligned X, as just mentioned, and Seblastian predicted the selenoprotein. With the data from both species comparisons, we can confirm that Varanus komodoensis has Selenoprotein P.



Selenoprotein R or Methionine-sulfoxide reductase (SelR or MSRB)

Selenoprotein R is also known as methionine-sulfoxide reductase (MSRB), and its function is to reduce methionine sulfoxide back to methionine. It has three important isoforms, which we analyzed separately.

Figure. Chemical processes mediated by Sec-containing enzymes (MSR).



Selenoproteins W, T, H and V

Selenoproteins W (SelW), T (SelT), H (SelH) and V (SelV) belong to the Rdx family of selenoproteins. The members of this protein family possess a thioredoxin-like fold and are characterized by a conserved Cys-x-x-Sec motif. The Rdx family proteins are thought to be thiol-based oxidoreductases, but the exact function of any of these proteins remains unknown [8].

Selenoprotein W (SelW)

SelW was one of the first Sec-containing proteins to be identified and is one of the most abundant selenoproteins in mammals. It belongs to the stress-related group of selenoproteins, as its expression is highly regulated by the availability of Se in the diet. While SelW1 is found in human with the conserved selenocysteine residue, SelW2 was lost in all tetrapods except frog, and its homolog is found under the name Rdx12, with a cysteine residue instead. The hypothesis is that SelW2 duplicated before the split of amphibians and the copy was immediately converted to a Cys form, generating Rdx12; SelW2 was then lost prior to the split of reptiles [12].

Selenoprotein T (SelT)

SelT belongs to the Rdx family of selenoproteins, whose members are characterized by a conserved Cys-x-x-Sec motif [12]. This protein is predominantly localized to the ER and Golgi and is ubiquitously expressed both during embryonic development and in adult tissues [8]. It has been proposed that the Rdx family proteins are thiol-based oxidoreductases, but the exact function of any of these proteins remains unknown [12].

This protein is found in both species, human and lizard. All the hits were found in the same scaffold (SJPD01000032.1), between positions 13468766-13471916 in both cases. The protein predicted from the human query has 5 exons on the forward strand, whereas the one predicted from the lizard query has 4 exons. Both T-Coffee scores are very good, but only the lizard alignment is a perfect match (score = 1000). This is probably because Varanus komodoensis is more similar to the lizard than to the human.

In any case, the selenocysteine is conserved in both alignments, and Seblastian predicts two SECIS elements in both species. One of them is located on the reverse strand, so it is not valid. Therefore, we only take into account the SECIS element found in the 3’ UTR (positions 13472734-13472817 for both human and lizard).
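The two checks applied here, same strand as the gene and location downstream of it, can be sketched as follows. The function name and signature are hypothetical. The optional `max_distance` argument is an assumption inspired by the PSTK case discussed later, where a SECIS element more than 10,000 bp from the gene is rejected; the text does not define a formal cutoff.

```python
def secis_is_valid(gene_start, gene_end, gene_strand,
                   secis_start, secis_end, secis_strand,
                   max_distance=None):
    """Check that a SECIS element lies 3' of the gene on the same strand.

    Coordinates are scaffold positions and may be given in either order.
    Strands are "+" or "-".
    """
    if secis_strand != gene_strand:
        return False
    gene_lo, gene_hi = sorted((gene_start, gene_end))
    secis_lo, secis_hi = sorted((secis_start, secis_end))
    if gene_strand == "+":
        # Downstream means higher coordinates on the forward strand.
        distance = secis_lo - gene_hi
    else:
        # On the reverse strand the 3' end is at the lower coordinates.
        distance = gene_lo - secis_hi
    if distance <= 0:
        return False
    # max_distance is an assumed, optional cutoff in base pairs.
    return max_distance is None or distance <= max_distance


# SelT coordinates reported above (forward strand)
print(secis_is_valid(13468766, 13471916, "+", 13472734, 13472817, "+"))  # True
# The same element on the opposite strand would be rejected
print(secis_is_valid(13468766, 13471916, "+", 13472734, 13472817, "-"))  # False
```

Sorting the coordinates first lets the same function handle reverse-strand hits, which are reported in this discussion with the larger coordinate first.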

Considering all this evidence, we can conclude that SelT is a selenoprotein found in Varanus komodoensis.

Selenoprotein H (SelH)

Selenoprotein H is an oxidoreductase that plays an important role in protecting neurons against UVB-induced damage by inhibiting the pathways that lead to apoptosis, promoting mitochondrial biogenesis and function, and suppressing cellular senescence through genome maintenance and regulation of redox reactions [5].

This protein is found in both species, human and lizard, and the blasts returned many hits in many different scaffolds. Only one hit for human and two hits for lizard passed the cut-off criteria, all of them in the same scaffold, SJPD01000122. The human hit was located at positions 569175-569258, and the lizard sequence studied spanned positions 569172-572138. The T-Coffee scores were 980 for human and a perfect 1000 for lizard, and both predicted proteins were on the reverse strand. The human prediction had three exons and started with a methionine, whereas the lizard prediction had only 2 exons. Both T-Coffee alignments showed an aligned X, which supports the existence of Selenoprotein H in Varanus komodoensis.

In the Seblastian results we found two SECIS elements in both species. We discarded the ones on the positive strand, which left a SECIS element on the reverse strand in both species. We checked the positions of these elements and confirmed that they lay at an acceptable distance from the gene. Taking everything together, we can confirm that SelH is a selenoprotein.

Selenoprotein V (SelV)

SelV evolved recently, most likely by duplication of SelW [Mariotti et al., 2012], and was then modified by the addition of N-terminal sequences whose function is unclear. This protein is found only in placental mammals, and it has been specifically lost in some of them, including gorillas [12].

Among our reference proteins, this one is available only for human. When we run the blast, no scaffold that could contain SelV in Varanus komodoensis is found, probably because the hits are not significant and the identities are lower than 50%. This result agrees with the literature, because SelV is not expected to be present in our organism.



Selenoprotein U (SelU)

In higher mammals, such as humans and mice, all SelU proteins exist in the Cys form, owing to a Sec-to-Cys replacement that occurred early in mammalian history in the SelU lineage. Three subfamilies of the SelU family (SelU1, SelU2 and SelU3) are found in humans. The Prx-like2 domain present in these proteins implies that they belong to the thioredoxin-like superfamily [8].



Thioredoxin Reductases (TR)

Thioredoxin reductases (TRs) are oxidoreductases that, together with thioredoxin (Trx), comprise the major disulfide reduction system of the cell. In mammalian cells there are three TR isozymes, all of which are Sec-containing proteins. These proteins contain a Sec residue in the penultimate position of the C-terminus [12].

Figure. Chemical processes mediated by Sec-containing enzymes (TR).

We tried to build a phylogenetic tree to confirm that the scaffold chosen in each case was the correct one, but the tool did not find any relevant alignment site, which usually means that the alignment is not reliable. However, since these proteins belong to the same family, they should be homologous. We therefore repeated the phylogenetic workflow without GBlocks, the step that eliminates poorly aligned positions. Although this phylogenetic tree may not be fully reliable, it gives an idea of which scaffolds are most similar to the human query protein.

According to the tree, the scaffolds chosen are correct except for TR2: we chose scaffold SJPD01000007, but scaffold SJPD01000047 may be more homologous to the query protein.







Selenoprotein machinery

Sec-specific eukaryotic elongation factor (eEFSec)

This protein works as a Sec-specific eukaryotic elongation factor: it recruits tRNA[Ser]Sec and inserts selenocysteine at the UGA codon of the nascent protein. This specificity is thought to be directed by the presence of SECIS elements in the 3’ UTR of the selenoprotein mRNA [8].

All the eEFSec genes from both reference species, one from human and three from lizard, mapped to scaffold SJPD01000007.1.

When blasting, one of the lizard proteins aligned on the reverse strand (18630044-18649127), similarly to the human one (18589990-18701233). For the latter, 6 exons were predicted on the reverse strand, whereas 3 exons were found for the lizard protein. For each of the other two lizard proteins (18580056-1859700 and 18648900-18649127), only one exon was annotated. In the T-Coffee results, the lizard prediction gives a better global alignment, which is explained by evolutionary proximity. The human prediction shows less correspondence, with an initial gap of 74 bp and a final gap of 63 bp.

To resolve these inconsistent results, the chicken eEFSec was also considered, and it produced an alignment comprising all the previously annotated exons. Given that the alignments lie close together, almost overlapping, and that exons alternate between the different reference proteins and species, the annotation of the lizard protein in SelenoDB is probably incomplete. Nevertheless, we can confirm that eEFSec is conserved in the Varanus komodoensis proteome.

In this case, no characteristic selenoprotein features were predicted. This is expected, since eEFSec is part of the selenoprotein machinery and is not a selenoprotein itself.





Phosphoseryl-tRNA kinase (PSTK)

PSTK (phosphoseryl-tRNA kinase) phosphorylates Ser-tRNA[Ser]Sec to produce the phosphorylated intermediate PSer-tRNA[Ser]Sec, which serves as a substrate for SecS.

Among our reference proteins, this one is available only for lizard, not for human. When running the blast, 4 hits are found in scaffold SJPD01000060.1 between positions 674799-683674. The global alignment obtained with T-Coffee is also good. The predicted protein has 6 exons on the forward strand.

Regarding selenoprotein features, the prediction does not contain a selenocysteine, as expected, because this protein is part of the machinery and not a selenoprotein itself. Although Seblastian predicts one SECIS element, it is located more than 10,000 base pairs away from our gene, so we do not consider it valid.

Given these results, we can conclude that PSTK is present in Varanus komodoensis but is not a selenoprotein.





SECIS binding protein 2 (SBP2)

SECIS binding protein 2 (SBP2) is one of the two trans-acting factors required for efficient recoding of UGA as Sec in eukaryotes. SBP2 is stably associated with ribosomes and contains a domain that binds SECIS elements with high affinity and specificity [8].

This protein is present in both human and lizard. When running the blast, hits are found in two different scaffolds for both species. We chose SJPD01000041.1 because it gave the better T-Coffee alignment.

In the chosen scaffold, the hits were located between positions 7176579-7175318 for human and 7191828-7171816 for lizard. The protein predicted from the human query has 6 exons, whereas the one predicted from the lizard query has 16 exons, both on the reverse strand. Although both T-Coffee scores are the same (997), the protein predicted from the human query shows a large gap at the beginning, which means that part of the protein is missing. The protein predicted from the lizard query shows no gap, which makes sense given that the two species are closely related.
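Since an identical score can hide very different coverage, it may help to quantify how much of the query is actually matched. The sketch below computes the fraction of query residues that are aligned to a residue rather than to a gap. The function name and the toy sequences are hypothetical, and gaps are assumed to be written as `-`.

```python
def query_coverage(aligned_query, aligned_target, gap="-"):
    """Fraction of query residues that are paired with a target residue."""
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have the same length")
    query_residues = 0
    covered = 0
    for q, t in zip(aligned_query, aligned_target):
        if q == gap:
            continue  # column with no query residue
        query_residues += 1
        if t != gap:
            covered += 1
    if query_residues == 0:
        raise ValueError("query contains no residues")
    return covered / query_residues


# Toy example (not real data): the first four query residues are unmatched.
print(query_coverage("MSTAKLVCAG", "----KLVCAG"))  # 0.6
```

Reporting this fraction next to the alignment score would make cases like the human SBP2 prediction, with a high score but a missing N-terminal part, easier to spot.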

Regarding selenoprotein features, no two selenocysteines are aligned, no SECIS element is found in the 3’ UTR, and Seblastian does not predict any known selenoprotein. However, aligned cysteines can be seen. In lizard, SBP2 is either a cysteine-containing homolog or a selenium machinery protein (SelenoDB 2 lists it as both). Nevertheless, we cannot determine whether one of those cysteines was once a selenocysteine.

Therefore, taking all the results into account, we can say that SBP2 is present in Varanus komodoensis but is not a selenoprotein.





Sec synthase (SecS)

The conversion of the serine moiety on tRNA[Ser]Sec to selenocysteyl-tRNA[Ser]Sec is catalyzed by Sec synthase (SecS), which incorporates selenophosphate, the active form of Se, into the amino acid backbone and forms Sec-tRNA [8].

Among our reference proteins, this one is available only for lizard. When running the blast, 10 hits were obtained in scaffold SJPD01000012.1 between positions 13832197-13801608. The predicted protein has 11 exons on the reverse strand. The T-Coffee shows a perfect alignment that starts with a methionine, so the prediction is probably a real protein. As expected, the predicted protein does not contain a selenocysteine, because it is part of the machinery and not a selenoprotein itself. Moreover, Seblastian found no valid SECIS element and predicted no selenoprotein in the sequence.

Therefore, we can confirm that Varanus komodoensis has SecS and it is not a selenoprotein.





Selenophosphate synthase (SPS or SPHS)

This family of selenoproteins consists of two proteins involved in selenophosphate synthesis that are highly homologous to SelD from E. coli. Recently, new functions have been described that distinguish SPS2 as the de novo selenophosphate synthase, whereas SPS1 may play a role in Sec recycling through a selenium rescue system, interacting with Sec lyase [8].



A phylogenetic tree of the protein family was built to assign a scaffold to each selenophosphate synthase, since the two proteins show strong sequence similarity owing to their common origin. Based on proximity in the tree, scaffold SJPD01000016 is considered the region comprising SPS1 in the genome of our species, and scaffold SJPD01000051 the region comprising SPS2.





(SECp43)

SECp43 interacts with tRNA[Ser]Sec, forming a complex. It is localized in the nucleus and may work as a chaperone for SLA and Sec-tRNA[Ser]Sec; it has been linked to the regulation of selenoprotein synthesis through methylation of tRNA[Ser]Sec and to the intracellular distribution of SLA [8].

Among our reference proteins, this one is available only for lizard. When we run the blast, only one significant scaffold is found (SJPD01000036.1), with hits located between positions 4926809-4935614. The predicted protein has 4 exons on the forward strand. The T-Coffee shows good identity, although there is a gap at the beginning, perhaps because exonerate did not predict the gene structure correctly and the first amino acids (including the methionine) are therefore missing. As expected, no two selenocysteines were aligned, and Seblastian predicted neither a SECIS element nor a selenoprotein.

For all these reasons, we can confirm that Varanus komodoensis has SECp43 and it is not a selenoprotein.