Results


Protein Selenocysteine BLAST Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
GPx1
GPx2
GPx3
GPx4
GPx5
GPx6
GPx7
GPx8
TR1
TR2
TR3
DI1
DI2
DI3
SPS1
SPS2
Sel15
SelH
SelI
SelK
SelM
SelN
SelO
SelP
SelR1
SelR2
SelR3
SelS
SelT
SelU1
SelU2
SelU3
SelV
SelW1
SelW2
MsrA



Protein Selenocysteine BLAST Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
SecS
Pstk
SECp43
SBP2
eEFSec


tRNA scan

All organisms that encode selenoproteins must also carry the selenocysteine tRNA, which is required for their translation. To find the selenocysteine tRNA in the genome we used tRNAscan-SE, a tool that detects tRNAs in a DNA or RNA sequence supplied in FASTA format.

We downloaded the executable and ran it from the UNIX command line with the following arguments:

tRNAscan-SE genome.fa -G -o "tRNA_output.txt"

The output contains all the tRNAs found in the genome. We then extracted the selenocysteine (TGA) tRNA candidates:

egrep TGA tRNA_output.txt > SeCys_tRNA.txt

Results:

  • gi|511782576|gb|CM001958.1| 26 44926657 44926738 Ser TGA 0 0 79.60
  • gi|511782576|gb|CM001958.1| 74 45966708 45966627 Ser TGA 0 0 78.00
  • gi|511782576|gb|CM001958.1| 91 45055841 45055760 Ser TGA 0 0 77.69
  • gi|511782591|gb|CM001941.1| 14 64208421 64208353 Pseudo TGA 0 0 31.91
  • gi|511782572|gb|CM001962.1| 30 2396433 2396365 Pseudo TGA 0 0 28.27
  • gi|511782569|gb|CM001965.1| 2 54746528 54746600 Ser TGA 0 0 26.70
  • gi|511782567|gb|CM001967.1| 7 43691336 43691268 Ser TGA 0 0 36.21
  • gi|511782565|gb|CM001969.1| 5 6020579 6020511 Ser TGA 0 0 30.23
  • gi|511782589|gb|CM001944.1| 1 5123406 5123474 Ser TGA 0 0 27.84
  • gi|511782586|gb|CM001947.1| 3 29257177 29257109 Ser TGA 0 0 32.42
  • gi|511782584|gb|CM001949.1| 1 63620871 63620790 Ser TGA 0 0 78.58

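As we can see, several copies are predicted on different scaffolds. However, the ones with the highest Cove scores are located in scaffold gi|511782576|gb|CM001958.1| (3 copies) and in scaffold gi|511782584|gb|CM001949.1| (1 copy). Note that tRNAscan-SE reports the anticodon in this column and types these hits as Ser or Pseudo; since the selenocysteine tRNA reads the UGA codon with the anticodon TCA, these candidates should be confirmed before being treated as the selenocysteine tRNA.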
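The extraction step can also be sketched as a small Python parser that filters on the anticodon column instead of matching text anywhere on the line. This is only an illustration under stated assumptions: the names `TrnaHit`, `parse_trnascan` and `select_by_anticodon` are hypothetical, and the 9-column layout is assumed from the listing above (other tRNAscan-SE versions may print additional columns).

```python
from typing import List, NamedTuple


class TrnaHit(NamedTuple):
    seq_name: str
    number: int
    begin: int
    end: int
    trna_type: str
    anticodon: str
    score: float


def parse_trnascan(text: str) -> List[TrnaHit]:
    """Parse 9-column tRNAscan-SE rows; header and malformed lines are skipped."""
    hits = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 9:
            continue
        try:
            hits.append(TrnaHit(fields[0], int(fields[1]), int(fields[2]),
                                int(fields[3]), fields[4], fields[5],
                                float(fields[8])))
        except ValueError:
            continue  # e.g. the dashed separator line of the header
    return hits


def select_by_anticodon(hits, anticodon, exclude_pseudo=True):
    """Keep hits with the given anticodon, best Cove score first."""
    kept = [h for h in hits
            if h.anticodon == anticodon
            and not (exclude_pseudo and h.trna_type == "Pseudo")]
    return sorted(kept, key=lambda h: h.score, reverse=True)


# Three rows copied from the results above (bullets removed).
example = """\
gi|511782576|gb|CM001958.1| 26 44926657 44926738 Ser TGA 0 0 79.60
gi|511782591|gb|CM001941.1| 14 64208421 64208353 Pseudo TGA 0 0 31.91
gi|511782584|gb|CM001949.1| 1 63620871 63620790 Ser TGA 0 0 78.58
"""
for hit in select_by_anticodon(parse_trnascan(example), "TGA"):
    print(hit.seq_name, hit.trna_type, hit.score)
```

Filtering on a named column also makes it easy to rerun the selection with a different anticodon or to keep the pseudogene predictions by passing `exclude_pseudo=False`.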


Discussion

GPx1

The selenoprotein GPx1 is located in the scaffold gi|511782571|gb|CM001963.1| between positions 10756519 and 10755188, on the reverse strand. The gene contains one intron and two exons.
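The coordinate convention used throughout this discussion (a start position greater than the end position for a gene on the reverse strand, and one more exon than introns) can be illustrated with a small helper. This is only a sketch: the function name `describe_locus` is hypothetical, the coordinates are assumed to be inclusive, and the numbers are the GPx1 values quoted above.

```python
def describe_locus(start: int, end: int, n_introns: int) -> dict:
    """Derive strand, genomic span and exon count from reported coordinates."""
    strand = "forward" if start <= end else "reverse"
    return {
        "strand": strand,
        "span_bp": abs(end - start) + 1,  # assumes inclusive coordinates
        "exons": n_introns + 1,
    }


# GPx1 as reported: positions 10756519 to 10755188, one intron.
print(describe_locus(10756519, 10755188, 1))
```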

Despite some discrepancies in the isoform annotation between human and macaque, we have found the same hit using both human and macaque GPx1 as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the GPx1 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated GPx1. Click here to see the multiple sequence alignment.

We have found a single SeCys residue which is highly conserved among mammals. A single SECIS element (Grade A) has been predicted in the 3’UTR region (400 bp downstream), which exhibits high sequence similarity to the orthologous SECIS elements in macaque and orang-utan, but not to the human one. Click here to see the multiple sequence alignment.

Interestingly, no unassigned hits, duplications or pseudogenes have been predicted, although they have been reported in other primates.

However, additional hits for GPx1 have been found in other genomic regions. This is not surprising given the high similarity between the members of the GPx family. Click here to see the human GPx family alignment.

GPx2

The selenoprotein GPx2 is located in the scaffold gi|511782569|gb|CM001965.1| between positions 42154775 and 42151543, on the reverse strand. The gene contains 1 intron and 2 exons.
We have found the same hit using both human and macaque GPx2 as well as using Selenoprofiles. This can be explained by the high similarity between the two queries. Indeed, the GPx2 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated GPx2. Click here to see multiple sequence alignment.

We have found a single SeCys residue which is highly conserved among mammals. A single SECIS element (Grade A) has been predicted in the 3’UTR region (219 bp downstream), which surprisingly exhibits poor sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted. However, additional hits for GPx2 have been found in other genomic regions. This fact is not surprising due to the high similarity between the members of the GPx family. Click here to see the human GPx family alignment.

GPx3

The selenoprotein GPx3 is located in the scaffold gi|511782570|gb|CM001964.1 between positions 53581652 and 53584451, on the forward strand. The gene contains 4 introns and 5 exons.

Despite the fact that there are some discrepancies regarding the GPx3 annotation between human and macaque (mainly in the first region), we have found the same hit using both human and macaque GPx3 as well as using Selenoprofiles and Genewise. However, in this case the similarity is much better for the protein predicted by Selenoprofiles and the “manual search” using the macaque query.

The GPx3 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated GPx3. Click here to see multiple sequence alignment.

However, the first region of the human isoform is found neither in Chlorocebus sabaeus nor in the rest of mammals. It may be an alternative predicted isoform in human or an insertion that has taken place in the human lineage. We have found a single SeCys residue which is highly conserved among mammals. Consistent with this, a SECIS element is predicted in the 3’UTR (948 bp downstream), which exhibits high sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

Interestingly, Selenoprofiles predicts a GPx6-like protein instead of a GPx3. This can be explained by the high identity between GPx3 and GPx6. This alignment can be found here. This observation is consistent with the origin of GPx6. As described in (Mariotti et al, 2012), both GPx6 and GPx5 arose from a tandem duplication of GPx3 at the root of placental mammals. To confirm that the predicted protein was GPx3 and not GPx6, we aligned the predicted protein with GPx3 and with GPx6, respectively: tcoffeeGPx3 tcoffeeGPx6. The alignments clearly suggest that the predicted protein is GPx3.

No unassigned hits nor duplications nor pseudogenes have been predicted. However, additional hits for GPx3 have been found in other genomic regions. This fact is not surprising due to the high similarity between the members of the GPx family. Click here to see the alignment between them.
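The GPx3 versus GPx6 check described in this section amounts to comparing how well the predicted protein aligns to each candidate. A minimal sketch of that comparison is given below, assuming the pairwise alignments are already available as gapped strings (for example, exported from T-Coffee). The function names are hypothetical and the toy sequences are invented for illustration; they are not real GPx data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        matches += a == b
    return 100.0 * matches / compared if compared else 0.0


def closest_reference(alignments: dict) -> str:
    """Return the reference whose pairwise alignment has the highest identity."""
    return max(alignments, key=lambda name: percent_identity(*alignments[name]))


# Hypothetical toy alignments (predicted protein, reference), NOT real GPx data.
toy = {
    "GPx3": ("MARLLQAS-CL", "MARLLQASSCL"),
    "GPx6": ("MARLLQASCL-", "MAQLWGASCLF"),
}
print(closest_reference(toy))
```

Ignoring gapped columns keeps the measure simple; a stricter variant could penalise gaps, which matters when one reference is much longer than the prediction.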

GPx4

The selenoprotein GPx4 is located in scaffold gi|511782587|gb|CM001946.1 between positions 874762 and 876010, on the forward strand. The gene contains 6 introns and 7 exons.
Despite the fact that there are some discrepancies regarding the GPx4 annotation between human and macaque, we have found the same hit using both human and macaque GPx4 (annotated as a GPx_None in SelenoDB), as well as using Selenoprofiles. However, in this case the similarity is much better for the protein predicted by Selenoprofiles or Genewise and the one predicted from macaque using the “manual search”.

The conservation among mammals is strong in the core of the protein. Nevertheless, there are important differences in both C-terminus and N-terminus, probably due to the aforementioned annotation problem. Click here to see multiple sequence alignment.

We have found a single SeCys residue which is highly conserved among mammals. A single SECIS element (Grade A) has been predicted in the 3’UTR region (48 bp downstream), which surprisingly exhibits poor sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted. However, additional hits for GPx4 have been found in other genomic regions. This fact is not surprising due to the high similarity between the members of the GPx family. Click here to see the alignment between them.

GPx5

The protein GPx5 is located in scaffold gi|511782576|gb|CM001958.1 between positions 44075582 and 44080388, on the forward strand. The gene predicted by Selenoprofiles and by the macaque query contains 3 introns and 4 exons. However, the gene predicted using the human query contains 4 introns and 5 exons.
Despite the fact that there are some discrepancies regarding the isoform annotation between human and macaque, we have found the same hit using both human and macaque GPx5, as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the GPx5 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated GPx5. Click here to see multiple sequence alignment.

We have not observed any SeCys residue in Chlorocebus sabaeus or in any other mammal. Consequently, no SECIS element is predicted. As GPx5 evolved from GPx3 at the root of placental mammals, and GPx3 contains a SeCys, GPx5 is considered a Cys homolog. Since all placental mammals have a Cys homolog, the Sec to Cys conversion must have happened very early in the evolution of GPx5.
Interestingly, exonerate has also predicted GPx6 in the same subsequence as GPx5, on the reverse strand. Most surprisingly, they share part of their genomic position but in opposite orientations, so they partly overlap (see Ensembl).
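Whether two predictions share genomic positions can be checked with a simple interval comparison. The sketch below is illustrative only: `overlap_bp` is a hypothetical helper, coordinates are assumed to be inclusive, and strand is ignored. Applied to the final GPx5 and GPx6 coordinates reported in this document it returns 0, so the overlap described here presumably concerns the exonerate hit or the Ensembl annotation rather than the final gene models; that reading is an assumption.

```python
def overlap_bp(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of shared bases between two inclusive intervals, ignoring strand."""
    a_lo, a_hi = sorted((a_start, a_end))
    b_lo, b_hi = sorted((b_start, b_end))
    return max(0, min(a_hi, b_hi) - max(a_lo, b_lo) + 1)


# Coordinates as reported in the GPx5 and GPx6 sections of this document.
gpx5 = (44075582, 44080388)  # forward strand
gpx6 = (44061553, 44051798)  # reverse strand, so start > end
print(overlap_bp(*gpx5, *gpx6))
```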

No unassigned hits nor duplications nor pseudogenes have been predicted. However, additional hits for GPx5 have been found in other genomic regions. This fact is not surprising due to the high similarity between the members of the GPx family (see human GPx family alignment). Click here to see the alignment between them.

GPx6

The selenoprotein GPx6 is located in scaffold gi|511782576|gb|CM001958.1 between positions 44061553 and 44051798, on the reverse strand. The gene contains 4 introns and 5 exons.
Despite the fact that there are some discrepancies regarding the GPx6 annotation between human and macaque, we have found the same hit using both human and macaque GPx6, as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the GPx6 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated GPx6. Click here to see multiple sequence alignment.

We have found a single SeCys residue which is highly conserved among mammals. A single SECIS element (Grade A) has been predicted in the 3’UTR region (683 bp downstream), which surprisingly exhibits poor sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

As explained above, GPx6 and GPx5 share part of their genomic position but they are oriented in opposite directions (See GPx5 discussion).

No unassigned hits nor duplications nor pseudogenes have been predicted. However, additional hits for GPx6 have been found in other genomic regions. This fact is not surprising due to the high similarity between the members of the GPx family. Click here to see the alignment between them.

GPx7

The protein GPx7 is located between the positions 80342051 and 80335747, on scaffold gi|511782573|gb|CM001961.1|, on the reverse strand. The gene predicted by Selenoprofiles and by macaque query contains one intron. However, the gene predicted using the human query contains two introns.

Despite the fact that there are some discrepancies regarding the GPx7 annotation between human and macaque (mainly in the first region), we have found the same hit using both human and macaque GPx7. However, using Selenoprofiles and Genewise we have found the same hit as macaque, but with lower similarity. The GPx7 found in Chlorocebus sabaeus also shows a good identity with the majority of mammalian annotated GPx7. Click here to see multiple sequence alignment.

We have not observed any SeCys residue in Chlorocebus sabaeus or in any other mammal. Consequently, no SECIS element is predicted.

No unassigned hits nor duplications nor pseudogenes have been predicted. However, additional hits for GPx7 have been found in other genomic regions. This fact is not surprising due to the high similarity between the members of the GPx family. Click here to see the alignment between them.

GPx8

The protein GPx8 is located in the scaffold gi|511782589|gb|CM001944.1| between positions 51393495 and 51397530, on the forward strand. The gene predicted by Selenoprofiles and by macaque query contains one intron. However, the gene predicted using the human query contains two introns.

Despite the fact that there are some discrepancies regarding the GPx8 annotation between human and macaque (mainly in the first region), we have found the same hit using both human and macaque GPx8. However, using Selenoprofiles and Genewise we have found the same hit as macaque, but with lower similarity.

The conservation among mammals is strong in the core of the protein. Nevertheless, there are important differences in both C-terminus and N-terminus, probably due to the aforementioned annotation problem. Click here to see multiple sequence alignment.

We have not observed any SeCys residue in Chlorocebus sabaeus or in any other mammal. Consequently, no SECIS element is predicted.

No unassigned hits nor duplications nor pseudogenes have been predicted. However, additional hits for GPx8 have been found in other genomic regions. This fact is not surprising due to the high similarity between the members of the GPx family. Click here to see the alignment between them.

TR1

The selenoprotein TR1 is located in the scaffold gi|511782582|gb|CM001952.1| between positions 99422513 and 99558063, on the forward strand. It contains 16 introns.

We have found the same hit using both human and macaque TR1 as well as using Selenoprofiles and Genewise. However, there are some discrepancies between NCBI and SelenoDB regarding the annotation of the TR1 isoforms, mainly in the first exons. The majority of the TR1 sequence found in Chlorocebus sabaeus shows high identity with most annotated mammalian TR1 proteins. Click here to see multiple sequence alignment.

We found a single SeCys residue, which is strongly conserved among mammals. A SECIS element (Grade A) was identified in the 3'-UTR region, 217 bp downstream of the last exon. It is extremely conserved among primates. Click here to see multiple sequence alignment.

Other hits have also been found that overlap with TR2 and TR3, due to their high sequence similarity.
No duplications nor pseudogenes have been predicted.

TR2

The selenoprotein TR2 is located in the scaffold gi|511782574|gb|CM001960. between positions 5803213 and 5866677, on the forward strand. It contains 16 introns and 17 exons.

We have found the same hit using both human and macaque TR2 as well as using Selenoprofiles and Genewise. However, there are some discrepancies between NCBI and SelenoDB regarding the annotation of the TR2 isoforms, mainly in the first exons. The sequence found in Chlorocebus sabaeus also shows high identity with most annotated mammalian TR2 proteins. Click here to see multiple sequence alignment.

We found a single SeCys residue, which is strongly conserved among mammals.

A SECIS element (Grade A) was identified in the 3'-UTR region, 1441 bp downstream of the last exon. Although it is extremely conserved among the other primates, the predicted SECIS for Chlorocebus sabaeus shows notably poor conservation. Click here to see multiple sequence alignment. This could be due to SECIS degeneration. However, given the high conservation among the other primates, this option is unlikely. Thus we hypothesize that this might be due to a wrong prediction.

Other hits have also been found that overlap with TR1 and TR3, due to their high sequence similarity.

No pseudogenes nor duplications have been found.

TR3

The selenoprotein TR3 is located in the scaffold gi|511782571|gb|CM001963.1 between positions 54360735 and 54416711, on the forward strand.

We have found the same hit using both the human and macaque TR3 queries as well as using Selenoprofiles. The majority of the TR3 sequence found in Chlorocebus sabaeus shows high identity with most annotated mammalian TR3 proteins. Click here to see multiple sequence alignment.

We found a single SeCys residue, which is strongly conserved among mammals. A SECIS element (Grade A) was identified in the 3'-UTR region, 213 bp downstream of the last exon. It is extremely conserved among primates. Click here to see multiple sequence alignment.

Other hits have also been found that overlap with TR1 and TR2, due to their high sequence similarity.

No pseudogenes nor duplications have been found.

DI1

DI1 is located in the scaffold gi|511782573|gb|CM001961.1| between positions 79040808 and 79025340, on the reverse strand. The gene contains 3 introns and 4 exons.

We have found the same hit using both human and macaque DI1, as well as using Selenoprofiles. This can be explained by the high identity between both queries. In fact, DI1 also shows high identity among the majority of mammals. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals.

A single SECIS element (Grade A) was identified in the 3'-UTR region (945 bp downstream). It also exhibits high sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

Other hits have also been found that overlap with DI2 and DI3, due to their high sequence similarity. Click here to see family members alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted.

DI2

DI2 is located in scaffold gi|511782569|gb|CM001965.1| between positions 57530508 and 57522516, on the reverse strand. The gene contains 2 introns and 3 exons.

We have found the same hit using both human and macaque DI2 as well as using Selenoprofiles. This can be explained by the high similarity between both queries. However, some differences are worth mentioning. Human and Chlorocebus sabaeus show an insertion which is present neither in macaque nor in other mammals. On the other hand, the hit found by Selenoprofiles misses one exon.

A protein with homology to these two queries (with higher identity to the human query) is predicted in C. sabaeus, which also shows homology with the majority of mammalian annotated DI2. Click here to see multiple sequence alignment.

We have found three SeCys residues. One of them is strongly conserved among mammals. Consequently, we have identified a single SECIS element (Grade A) located in 3'-UTR (4833 bp downstream), which shares homology between mammals. Click here to see multiple sequence alignment.

Other hits have also been found that overlap with DI1 and DI3, due to their high sequence similarity.

No unassigned hits nor duplications nor pseudogenes have been predicted.

DI3

DI3 is located in scaffold gi|511782569|gb|CM001965.1| between positions 79538425 and 79539261, on the forward strand. It has 1 exon; no introns are predicted.

We have found the same hit using both human and macaque DI3. This can be explained by the high identity between both queries. In fact, the predicted DI3 in Chlorocebus sabaeus also shows homology with the majority of mammals. However, we can observe some differences in the length of the queries, which could come from wrong annotation. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals. Consequently, a single SECIS element (Grade A) was identified in the 3'-UTR (584 bp downstream). This SECIS element shares homology between mammals. Click here to see multiple sequence alignment.

Other hits have also been found that overlap with DI1 and DI2, due to their high sequence similarity.

Selenoprofiles did not predict this protein. No unassigned hits nor duplications nor pseudogenes have been predicted.

SPS1

The protein SPS1 is located in the scaffold gi|511782584|gb|CM001949.1| between positions 13375536 and 13349671, on the reverse strand.

SelenoDB has annotated several members for the human SPS1 protein, including duplicates and pseudogenes. On the other hand, Macaca mulatta has a single SPS1 annotated. Therefore, in order to identify the coding selenoprotein, assuming that there is only one copy per genome, we used the criteria explained in Materials and Methods.

The same hits have been found using both human and macaque queries as well as using Selenoprofiles and Genewise. However, the gene predicted by Selenoprofiles is slightly different from the ones predicted by the “manual search” and it contains one more exon. The SPS1 predicted by Selenoprofiles in Chlorocebus sabaeus conserves the exonic/intronic structure and exhibits high similarity with the rest of mammalian annotated SPS1. Click here to see multiple sequence alignment.

A SeCys residue is found in the query used by Selenoprofiles. However, it has been replaced by a Thr in Chlorocebus sabaeus. Moreover, no SeCys residue is found in the SPS1 proteins of the other mammals. Therefore, as expected, no SECIS element was identified.

Several additional hits were found. The majority of them coincided with the ones from SPS2 and were identified as pseudogenes due to the presence of frameshifts and in-frame stop codons (different from UGA). Interrupted exon duplications were also found. Moreover, a hit was found that corresponds to SPS2 coding selenoprotein, which is not strange because of the high similarity between the two members of the family. Click here to see SPS1-SPS2 alignment. Depending on the level of similarity between the predicted pseudogene and SPS1 or SPS2, respectively, the pseudogene was assigned to one or the other gene.
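The pseudogene criteria mentioned above (frameshifts and in-frame stop codons other than UGA) can be sketched as a simple scan of a predicted coding sequence. This is a minimal illustration, not the procedure used by exonerate, Genewise or Selenoprofiles: the function names are hypothetical, the sequence is assumed to start in frame 0, and a length that is not a multiple of three is used as a crude stand-in for a frameshift.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(cds: str, tolerated=frozenset({"TGA"})) -> list:
    """List (codon_index, codon) for in-frame stops before the final codon.

    TGA is tolerated by default because it may encode selenocysteine.
    """
    cds = cds.upper()
    n_codons = len(cds) // 3
    found = []
    for i in range(n_codons - 1):  # the last codon is the expected stop
        codon = cds[3 * i:3 * i + 3]
        if codon in STOP_CODONS and codon not in tolerated:
            found.append((i, codon))
    return found


def looks_like_pseudogene(cds: str) -> bool:
    """Flag a hit with a broken frame (crude proxy) or a premature stop."""
    return len(cds) % 3 != 0 or bool(premature_stops(cds))


# Hypothetical toy sequences, not taken from the genome.
print(looks_like_pseudogene("ATGTGACCCTAA"))  # in-frame TGA only
print(looks_like_pseudogene("ATGTAGCCCTAA"))  # premature TAG
```

Real frameshifts are better detected from the protein-to-genome alignment itself, so the length check here should be read only as a placeholder.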

Pseudogenes and partial exon duplications:

  • Partial Duplication_01: gi|511782570|gb|CM001964.1|: partial duplication with in-frame stop codon.
  • Partial Duplication_02: gi|511782572|gb|CM001962.1|: in-frame stop codons.
  • Partial Duplication_03: gi|511782562|gb|CM001951.1|: only found by Selenoprofiles. Abundant frameshifts.
  • Pseudogene_01: gi|511782562|gb|CM001951.1|: Loss of intron/exon structure but remarkably conserved.
  • Pseudogene_02: gi|511782589|gb|CM001944.1|: Loss of intron/exon structure and stop codon but remarkably conserved.
  • Partial Duplication_04: gi|511782589|gb|CM001944.1|: in-frame stop codons and frameshift.
SPS1 Selenocysteine Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Partial duplication_01
Partial duplication_02
Partial duplication_03
Pseudogene_01
Pseudogene_02
Partial Duplication_04

SPS2

The selenoprotein SPS2 is located in the scaffold gi|511782588|gb|CM001945.1| between positions 27132043 and 27130748, on the reverse strand. The gene predicted contains no introns and one long exon.

Although there are slight differences between the predicted proteins, the same hit was found using both Human and Macaque queries as well as using Selenoprofiles and Genewise. Nevertheless, the SPS2 found in Chlorocebus sabaeus shows high identity with the majority of mammalian SPS2 annotated. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals. Consequently, a single SECIS element (Grade A) was identified in the 3'-UTR region (560 bp downstream), which exhibits partial sequence similarity between some orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

Interestingly, the predicted SPS2 has no introns, which is unusual for a protein-coding gene. This observation can be explained by a retrotransposition and gene replacement.

As described in Mariotti et al, 2012, the SPS2 gene appeared initially in mammals with a multiple exon structure (SPS2a), but it was replaced by a copy with a single exon (SPS2b). While in monotremes and non-mammalian vertebrates only SPS2a is present, in placental mammals only SPS2b is present. Interestingly, in non-placental mammals such as marsupials both copies are present.

Several additional hits were found. The majority of them coincided with the ones found with SPS1. As aforementioned, depending on the level of similarity between the predicted pseudogene and SPS1 or SPS2, the pseudogene was assigned to one or the other gene.

Pseudogenes and partial exon duplications:

  • Partial duplication_01: gi|511782564|gb|CM001970.1|: The SeCys is changed for a Tyr. Not found by Selenoprofiles.
  • Partial duplication_02: gi|511782583|gb|CM001950.1|: Very long and conserved hit, but the SeCys is changed for a Tyr. Also found by Selenoprofiles.

SPS2 Selenocysteine Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Partial duplication_01
Partial duplication_02

Sel15

The selenoprotein Sel15 is located in the scaffold gi|511782573|gb|CM001961.1| between positions 46546924 and 46595413, on the forward strand. The gene contains 4 introns and 5 exons.

We have found the same single hit using both human and macaque Sel15 as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the Sel15 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated Sel15. Click here to see multiple sequence alignment. On the other hand, Seblastian skips the first two exons.

We have found a single SeCys residue, which is strongly conserved among mammals. Click here to see multiple sequence alignment.
Consequently, a SECIS element (Grade A) has been identified in the 3'-UTR region. Interestingly, the SECIS predicted in Chlorocebus sabaeus is identical to the one predicted in orangutan but different from the ones predicted in human or in macaque (see MSA above). This observation is not consistent with the described primate phylogeny.
No unassigned hits nor duplications nor pseudogenes have been predicted.

SelH

The selenoprotein SelH is located in the scaffold gi|511782591|gb|CM001941.1| between positions 15488727 and 15488348, on the reverse strand. The predicted coding gene contains 2 introns and 3 exons.

We have found the same single hit using both human and macaque SelH as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SelH found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelH. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals. Consequently, a single SECIS element (Grade A) was identified in the 3'-UTR region (665 bp downstream). The SECIS element also exhibits sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted.

SelI

The selenoprotein SelI is located in the scaffold gi|511782579|gb|CM001955.1| between positions 81303419 and 81258556, on the reverse strand. This gene contains 10 exons and 9 introns.

The same hit was found using both human and macaque queries as well as with Selenoprofiles and Genewise, because of the high similarity between the two queries. Indeed, the SelI found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelI. Click here to see multiple sequence alignment.

A single SECIS element (Grade A) was identified in the 3'-UTR region, 1300 nucleotides downstream of the last exon of the gene. This SECIS element also exhibits sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

This prediction coincides with the one obtained with Selenoprofiles.

Moreover, two additional hits were found:

  • Partial duplication_01 (gi|511782582|gb|CM001952.1|)
  • Partial duplication_02 (gi|511782573|gb|CM001961.1|)
SelI Selenocysteine Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Partial duplication_01
Partial duplication_02

SelK

The selenoprotein SelK is located in the scaffold gi|511782571|gb|CM001963.1| between positions 15274714 and 15268757, on the reverse strand. The gene contains 3 introns and 4 exons.

SelenoDB has annotated several members for the macaque SelK family, including duplicates and pseudogenes. In contrast, Homo sapiens has a single SelK annotated. Therefore, in order to identify the coding selenoprotein in macaque we have followed the aforementioned criteria (Materials and Methods/Queries).

In general we have found the same hits using both human and macaque SelK as well as using Selenoprofiles. This can be explained by the high similarity between both queries. However, there is a contradiction concerning the number of introns: Genewise predicts 2 introns, while the "manual search" and Selenoprofiles predict 3. Nevertheless, the SelK predicted in Chlorocebus sabaeus shows high identity with the majority of mammalian annotated SelK. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals. A single SECIS element (Grade A) was identified in the 3'-UTR region (473 bp downstream), which also exhibits sequence similarity between orthologous SECIS elements among primates. Interestingly, the SECIS predicted in Chlorocebus sabaeus is similar to the one from macaque and orangutan, while gorillas, chimpanzees and humans are slightly different. Click here to see multiple sequence alignment.

Several additional hits were found. The majority of them are cataloged as pseudogenes due to the presence of frameshifts and in-frame stop codons (different from UGA). Interrupted exon duplications were also found.

Interrupted exon duplications:

  • gi|511782572|gb|CM001962.1.
  • gi|511782587|gb|CM001946.1| (Hit 2): frameshift. Only found by Selenoprofiles.

Pseudogenes:

  • Pseudogene_01 (gi|511782563|gb|CM001943.1|): loss of intronic structure. It contains TGA. Sequence very similar to the predicted coding gene and to the pseudogene predicted in contig gi|511782579|gb|CM001955.1|. However, it does not contain any SECIS element in the 3'UTR.
  • Pseudogene_02 (gi|511782579|gb|CM001955.1|): loss of intronic structure. It contains TGA. A single Grade A SECIS is predicted in the 3'-UTR, which is identical to the one predicted for the protein-coding gene. The fact that the SECIS from this pseudogene is conserved, as well as the sequence, suggests that this retrotransposition is remarkably recent.
  • Pseudogene_03 (gi|511782580|gb|CM001954.1|): loss of intronic structure. It contains TGA. Sequence very similar to the predicted coding gene and to the pseudogene predicted in contig gi|511782579|gb|CM001955.1|. However, it does not contain any SECIS element in the 3'UTR.
  • Pseudogene_04 (gi|511782580|gb|CM001954.1|(Hit 2)): pseudogene with a visible frameshift only predicted by Selenoprofiles.
  • Pseudogene_05 (gi|511782587|gb|CM001946.1|): loss of intronic structure. It does not contain SeCys; instead it contains an Arg. A single Grade A SECIS is predicted in the 3'-UTR.
  • Pseudogene_06 (gi|511782590|gb|CM001942.1|): loss of intronic structure. In-frame STOP codons. It contains TGA.
SelK Selenocysteine Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Pseudogene_01
Pseudogene_02
Pseudogene_03
Pseudogene_04
Pseudogene_05
Pseudogene_06

As mentioned, the SECIS found in contigs gi|511782571|gb|CM001963.1| (protein-coding gene), gi|511782579|gb|CM001955.1| (pseudogene_02) and gi|511782587|gb|CM001946.1| (pseudogene_05) are identical. Click here to see this alignment.

This observation can be explained by the following hypothesis:

  1. The retrotranspositions gi|511782579|gb|CM001955.1| and gi|511782587|gb|CM001946.1| are very recent: we tested this hypothesis by comparing the sequence of each pseudogene with that of the predicted protein-coding gene. The alignment reveals that the pseudogene gi|511782579|gb|CM001955.1| has accumulated only 3 changes relative to the predicted coding gene, whereas the other pseudogene, gi|511782587|gb|CM001946.1|, has accumulated more than 15. These results suggest that one retrotransposition took place many generations before the other. Therefore, this hypothesis cannot by itself explain the high conservation of the SECIS in both pseudogenes.
  2. The SECIS element is under negative selection: this possibility should be analyzed further by comparing the substitution rate of the SECIS element with that of a neutral region. If confirmed, it might suggest that the pseudogenes are still translated.
  3. Gene conversion has homogenized the pseudogene (gi|511782579|gb|CM001955.1|): this is an unusual phenomenon, but it has been described in some pseudogenes (Martínez-Arias R et al, 2001).
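The comparison used in hypothesis 1 amounts to counting the positions at which two aligned sequences differ. The following Python function is a minimal sketch of such a count, not the procedure actually used in this work. It assumes the two sequences have already been aligned (same length, gaps written as "-"), and the example sequences are invented for illustration only.

```python
def count_changes(aligned_a: str, aligned_b: str) -> dict:
    """Count substitutions and gap positions between two pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    substitutions = 0
    gaps = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column that is a gap in both sequences carries no information
        if a == "-" or b == "-":
            gaps += 1  # insertion or deletion in one of the two sequences
        elif a != b:
            substitutions += 1
    return {"substitutions": substitutions, "gaps": gaps,
            "total": substitutions + gaps}


# Hypothetical aligned fragments (not taken from the Chlorocebus sabaeus genome)
gene = "ATGGTTTACATCTCG"
pseudo = "ATGGTCTACAT-TCG"
print(count_changes(gene, pseudo))
```

Applied to the gene/pseudogene alignments, a count of this kind would give the figures compared above (3 changes versus more than 15). Each gap column is counted as one change here, which is a simplification: a multi-base indel would be counted once per position.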

SelM

The selenoprotein SelM is located in the scaffold gi|511782574|gb|CM001960.1| between positions 13998866 and 13996656, on the reverse strand. The gene contains 3 introns and 4 exons.

We have found the same single hit using both Human and Macaque SelM as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SelM found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelM. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals. A SECIS element (Grade A) has been identified in the 3'-UTR region (42 bp downstream); it shows only partial sequence similarity to the orthologous SECIS elements of other primates. Interestingly, the Chlorocebus SECIS is less conserved than the SECIS elements of the other primates are among themselves. Click here to see multiple sequence alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted.

SelN

The selenoprotein SelN is located in the scaffold gi|511782573|gb|CM001961.1 between positions 106970753 and 106954956, on the reverse strand. It contains 11 exons and 10 introns.

We have found the same single hit using both Human and Macaque SelN as well as using Selenoprofiles and Genewise. The SelN found in Chlorocebus sabaeus shows high identity with the majority of mammalian annotated SelN. However, the first region of the isoform used as the human query is found neither in Chlorocebus sabaeus nor in the rest of mammals. It may be an alternative predicted isoform or an insertion that has taken place in the human lineage. This might be the reason why the "manual search" predicts one extra intron. Click here to see multiple sequence alignment.

We have found a single SeCys residue which is strongly conserved among mammals. A SECIS element (grade A) has been predicted in the 3'UTR region of the gene, which exhibits high sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted.

SelO

The selenoprotein SelO is located in the scaffold gi|511782574|gb|CM001960.1| between positions 32693020 and 32708443, on the forward strand. The predicted gene contains 8 introns and 9 exons.

We have found the same hit using both human and macaque SelO as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SelO found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelO. Click here to see multiple sequence alignment.

We have found a single SeCys residue in the last exon, whose position is highly conserved among mammals. Consistently, a SECIS element (grade A) is found in the 3'UTR (117 bp downstream), and it exhibits high sequence conservation among primates. Click here to see multiple sequence alignment.

No pseudogenes nor duplication nor unassigned hits have been found.

SelP

The selenoprotein SelP is located in the scaffold gi|511782589|gb|CM001944.1| between positions 41552376 and 41545553, on the forward strand. The predicted coding gene contains 3 introns and 4 exons.

We have found the same single hit using both human and macaque SelP as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SelP found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelP. Click here to see multiple sequence alignment.

We have found multiple SeCys residues. While some of them are strongly conserved among mammals, others exhibit variability. Our hypothesis is that not all the SeCys residues are important for the protein's function. Therefore, some of them have been replaced not only by Cys residues but also by Lys, Glu and other residues.

Interestingly, two SECIS elements (Grade A) have been identified in the 3'-UTR region. One lies 689 bp downstream of the last exon and the other 260 bp downstream. Surprisingly, both SECIS elements exhibit poor sequence similarity to the orthologous SECIS elements of other primates. Click here and here to see multiple sequence alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted.

SelR1

The selenoprotein SelR1 is located in the scaffold gi|511782588|gb|CM001945.1| between positions 1840696 and 1836947, on the reverse strand.

We have found the same single hit using both Human and Macaque SelR1 as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SelR1 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelR1. SelenoDB and NCBI disagree on the number and sequence of isoforms. However, all the queries give the same hits. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals. A SECIS element (Grade A) was identified in the 3'-UTR region. Surprisingly, it exhibits poor sequence conservation between orthologous SECIS elements, despite the high conservation of this SECIS among primates. Click here to see multiple sequence alignment.

In scaffold gi|511782587|gb|CM001946.1| we have found a pseudogene with an in-frame STOP codon, a loss of intronic structure, but conservation of the TGA. No SECIS is predicted for this pseudogene.
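Several pseudogenes in this report are identified by in-frame STOP codons, with the TGA codon considered separately because it may encode SeCys. The Python sketch below illustrates one way to list such codons. It assumes the input is a coding sequence already in the correct reading frame; the function name and the example sequence are hypothetical and do not come from our pipeline.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_stops(cds: str, frame: int = 0):
    """Return (nucleotide_offset, codon) for every stop codon in the given
    frame, ignoring the final codon (the expected terminator)."""
    seq = cds.upper()
    codons = [(i, seq[i:i + 3]) for i in range(frame, len(seq) - 2, 3)]
    return [(i, codon) for i, codon in codons[:-1] if codon in STOP_CODONS]


# Hypothetical sequence: ATG TGA CCC TAG GGA TAA
example = "ATGTGACCCTAGGGATAA"
hits = in_frame_stops(example)
possible_sec = [h for h in hits if h[1] == "TGA"]      # may encode SeCys
premature_stops = [h for h in hits if h[1] != "TGA"]   # TAA / TAG
print(possible_sec)
print(premature_stops)
```

A TGA reported this way still has to be checked against the alignment with the query: only a TGA aligned with the SeCys position of the query is a candidate SeCys codon, and any other in-frame TGA is an ordinary premature stop.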

SelR2

The protein SelR2 is located in the scaffold gi|511782584|gb|CM001949.1 between positions 22871762 and 22889641 on the forward strand.

We have found the same hits using both human and macaque SelR2 as well as using Selenoprofiles and Genewise, despite the fact that there are discrepancies regarding the isoform annotation between SelenoDB and NCBI. This lack of consensus might be attributed to the predicted existence of several isoforms for SelR2 and the absence of experimental data. As expected given this issue, the predicted proteins were similar to the queries except for the presence of some indels in the first exons. However, the rest of the protein shows high identity with the majority of mammalian annotated SelR2. Click here to see multiple sequence alignment.

No SeCys residue was found. Consequently, no SECIS element was predicted in the 3'UTR region.

An additional hit was found in contig gi|511782582|gb|CM001952.1|, but this hit turned out to be SelR3, which shares partial sequence similarity to SelR2.

SelR3

The selenoprotein SelR3 is located in the scaffold gi|511782582|gb|CM001952.1 between positions 60970307 and 61154356, on the forward strand. It has 6 exons and 5 introns.

We have found the same hits using both Human and Macaque SelR3 as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries.

The predicted protein always showed high similarity to the query used, either for Human or Macaque. Ignoring the aforementioned isoforms, the SelR3 found in Chlorocebus sabaeus shows high identity with the majority of mammalian annotated SelR3. Click here to see multiple sequence alignment.

No SeCys residue was identified in the coding sequence. Consequently, no SECIS element was predicted in the 3'UTR region.

An additional hit has been found in contig gi|511782584|gb|CM001949.1, but this hit is, in fact, SelR2, which shares partial sequence similarity to SelR3.

SelS

The selenoprotein SelS is located in the scaffold gi|511782564|gb|CM001970.1| between positions 19732248 and 19729592, on the reverse strand. It has 6 exons and 5 introns.

We have found the same single hit using both human and macaque SelS as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SelS found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelS. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals.

A SECIS element (Grade A) has been identified in the 3'-UTR region (430bp downstream), which exhibits sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

No unassigned hits nor duplications nor pseudogenes have been predicted.

SelT

The selenoprotein SelT is located in the scaffold gi|511782578|gb|CM001956.1| between positions 40094462 and 40064119, on the reverse strand. It has 5 exons and 4 introns.

We have found the same single hit using both Human and Macaque SelT as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SelT found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelT. Click here to see multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals.

A SECIS element (Grade A) has been identified in the 3'-UTR region (1037bp downstream), which exhibits high sequence similarity between orthologous SECIS elements among primates. Click here to see multiple sequence alignment.

An additional hit, gi|511782570|gb|CM001964.1|, has been found. It corresponds to a pseudogene with an in-frame stop codon and without introns.

SelT Selenocystein Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Pseudogene

SelU1

The selenoprotein SelU1 is located in the scaffold gi|511782584|gb|CM001949.1| between positions 51416183 and 51404921, on the reverse strand. The gene contains 4 introns and 5 exons.

We have found the same single hit using both human and macaque SelU1 as well as using Selenoprofiles. This can be explained by the high similarity between the two queries. Indeed, the SelU1 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelU1. Click here to see multiple sequence alignment. However, Genewise does not predict any region.

No SeCys residue is found in the coding sequence. Consequently, no SECIS element is predicted.

Pseudogenes have been predicted on the following scaffolds:

  • Pseudogene_01 (gi|511782585|gb|CM001948.1|): stop codon
  • Pseudogene_02 (gi|511782582|gb|CM001952.1|): stop codon
SelU1 Selenocystein Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Pseudogene_01
Pseudogene_02

SelU2

The protein SelU2 is located in the scaffold gi|511782581|gb|CM001953.1| between positions 106486811 and 106499615, on the forward strand. The predicted gene contains 5 introns and 6 exons.

We have found the same hit using both human and macaque SelU2 as well as using Selenoprofiles. This can be explained by the high similarity between the two queries. Indeed, the SelU2 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelU2. Click here to see multiple sequence alignment.

No SeCys residue is found in the exons. Consequently, no SECIS element is predicted.

An additional hit was found. No in-frame stop codons can be observed, and the sequence is relatively conserved. Nonetheless, due to the loss of the intron/exon structure we have cataloged it as a pseudogene. It is located in the scaffold gi|511782562|gb|CM001951.1|.

SelU2 Selenocystein Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Pseudogene_01

SelU3

The selenoprotein SelU3 is located in the scaffold gi|511782573|gb|CM001961.1| between positions 128967462 and 128965112, on the forward strand. The predicted gene contains 6 introns and 7 exons.

We have found the same single hit using both human and macaque SelU3 as well as using Selenoprofiles. This can be explained by the high similarity between the two queries. Genewise, however, predicts 5 introns. The SelU3 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SelU3. Click here to see multiple sequence alignment. Note the differences in isoform annotation between species. The gene predicted in Chlorocebus sabaeus is similar to the longest human isoform. However, gene expression data are required to determine the main isoform expressed in Chlorocebus sabaeus.

There is a significant difference between the last exon predicted by Selenoprofiles and the last exon predicted by the “manual search”. The last intron predicted by Selenoprofiles is 100,000 nt long, whereas the one predicted by the “manual search” is 359 nt long. Given that none of the other introns exceeds 600 nt, it is reasonable to conclude that the correct gene structure is the one predicted by the “manual search”.
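The argument above compares the length of one intron with the lengths of the others. As an illustration only, the following Python sketch flags introns that are much longer than the median intron of the same gene. The threshold (10 times the median) is an arbitrary assumption, and apart from the 100,000 nt and 359 nt values, the intron lengths in the example are invented placeholders below 600 nt.

```python
from statistics import median


def flag_long_introns(intron_lengths, factor=10.0):
    """Return (intron_number, length) for introns longer than
    factor * median intron length of the same gene model."""
    if not intron_lengths:
        return []
    typical = median(intron_lengths)
    return [(i, n) for i, n in enumerate(intron_lengths, start=1)
            if n > factor * typical]


# First five lengths are hypothetical; only the last value differs between models
selenoprofiles_model = [420, 310, 580, 250, 490, 100000]
manual_model = [420, 310, 580, 250, 490, 359]

print(flag_long_introns(selenoprofiles_model))  # flags the 6th intron
print(flag_long_introns(manual_model))          # flags nothing
```

A flagged intron is not proof of a wrong prediction, since long introns do exist, but it identifies the gene models that deserve a manual check against the multiple sequence alignment.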

No SeCys residue is found in the coding sequence. Consequently, no SECIS element is predicted. No unassigned hits nor duplications nor pseudogenes have been predicted.

SelV

The selenoprotein SelV is located in the scaffold gi|511782587|gb|CM001946.1| between positions 34122715 and 34125934, on the forward strand.

Macaca mulatta does not have any SelV annotated in SelenoDB. Previous studies have reported that SelV was lost by deletion specifically in gorilla, but this deletion is not expected in Macaca. We therefore ran a BLAST search against the Macaca mulatta genome using the human SelV as query, and a SelV was predicted. We used this predicted protein as the macaque query in our analysis.

We have found the same hit using both Human and Macaque SelV as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. However, there are slight differences between the gene predicted by Selenoprofiles and the gene predicted by the “manual search”. While Selenoprofiles predicts 4 introns and 5 exons, the “manual search” skips the 2nd exon and predicts only 3 introns. The “manual search” also predicts a few changes which are not conserved across the primate phylogeny. Moreover, the annotated human SelV contains 5 coding exons. Therefore, the structure predicted by Selenoprofiles is likely to be the correct one.

The SelV predicted in Chlorocebus sabaeus shows high identity with the majority of primate annotated SelV. SelV is not well conserved beyond primates, as can be seen in the following multiple sequence alignment.

We have found a single SeCys residue, which is strongly conserved among mammals. A single SECIS element (Grade A) has been identified in the 3'-UTR region (1053 bp downstream), which exhibits high sequence similarity to the orthologous SECIS elements of other primates. Click here to see multiple sequence alignment. Surprisingly, in the macaque genome, the SECIS element for SelV is neither annotated in SelenoDB nor found by BLAST using the human SECIS element as query.

Interestingly, we have found SelV in Selenoprofiles when searching for SelW. This can be explained by the high similarity between SelV and SelW, particularly in exons 2-5. The reason for this similarity has been established by previous studies, which reported that SelV arose from a duplication of SelW in the placental stem (Mariotti et al, 2012). The first exon is the only one that is remarkably different: while the one from SelV contains ~260 amino acids, the one from SelW contains only 9 amino acids. The rest of the exons exhibit partial similarity in this alignment.

SelW1

The selenoprotein SelW is located in the scaffold gi|511782587|gb|CM001946.1| between positions 41083343 and 41085943, on the forward strand.

We have found the same hit using both human and macaque SelW as well as using Selenoprofiles. This can be explained by the high similarity between the two queries. However, there are slight differences between the gene predicted by Selenoprofiles and the gene predicted by the “manual search”. While Selenoprofiles predicts 4 introns and 5 exons, the “manual search” skips the 2nd exon and predicts only 3 introns. The hit found by Selenoprofiles was selected as the correct one for two reasons. First, the annotated human SelW in SelenoDB also contains this 2nd intron, which is conserved among primates. Click here to see multiple sequence alignment with Selenoprofiles' hit. Second, the hit found by the “manual search” predicts 4 changes which are not conserved among primates. Click here to see multiple sequence alignment with "manual search" hit.

The SelW found in Chlorocebus sabaeus shows high identity with the majority of mammalian annotated SelW.

We have found a single SeCys residue, which is strongly conserved among mammals. Consistently, a single SECIS element (Grade A) was identified in the 3'-UTR region (2646 bp downstream), which exhibits high sequence similarity to the orthologous SECIS elements of other primates. Click here to see multiple sequence alignment.

Another hit was also found in contig gi|511782573|gb|CM001961.1| between positions 101740763 and 101741354. It shares high sequence similarity but it has lost the exonic/intronic structure as well as the SECIS element. Thus, it is likely to be a pseudogene. Indeed, the presence of several SelW pseudogenes in the mammalian selenoproteome has been described in recent studies (Mariotti et al, 2012).

SelW Selenocystein Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Pseudogene

Interestingly, a hit for the SelV is found in Selenoprofiles when searching for SelW. This observation can be explained by the origin of the selenoprotein SelV (See SelV discussion).

SelW2

The selenoprotein SelW2 is located in the scaffold gi|511782577|gb|CM001957.1 between positions 66433657 and 66434618, on the forward strand. The gene contains 3 introns and 4 exons.

SelW2 is a homolog of SelW1 found in non-mammalian vertebrates such as bony fishes or frogs. In mammals, only a remote homolog of SelW2 is present, called Rdx12.

We have found the same single hit using both Human and Macaque SelW2, which can be explained by the high similarity between the two queries. Selenoprofiles, however, does not find this hit. The SelW2 found in Chlorocebus sabaeus also shows high identity with the majority of primate annotated SelW2. Click here to see multiple sequence alignment.
Rdx12 is a Cys homolog and thus no SECIS is predicted in the 3'UTR of the gene.

No unassigned hits nor duplications nor pseudogenes have been predicted.

MsrA

The protein MsrA is located in the scaffold gi|511782585|gb|CM001948.1 between positions 8881277 and 8664777, on the reverse strand. The gene predicted by the “manual search” contains 5 introns and 6 exons, while Selenoprofiles does not predict the first exon.

Despite the fact that there are some discrepancies regarding the isoform annotation between human and macaque, we have found the same single hit using both human and macaque MsrA as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the MsrA found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated MsrA. Click here to see multiple sequence alignment.

Interestingly, the gorilla and the macaque lack the last three amino acids (IKK) whereas both human and Chlorocebus sabaeus have them. The most plausible explanation is an annotation error. Otherwise, this indel would have had to occur twice, since the split between apes and monkeys predates both the split between the macaque and Chlorocebus sabaeus and the divergence of the great apes. We have not found a SeCys residue. Consequently, no SECIS element has been predicted in the 3’UTR region.
No unassigned hits nor duplications nor pseudogenes have been predicted.

SecS

The protein SecS is located in the scaffold gi|511782566|gb|CM001968.1| between positions 25202683 and 25238838, on the forward strand. The gene contains 10 introns and 11 exons.

We have found the same single hit using both human and macaque SecS as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SecS found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SecS. Click here to see multiple sequence alignment.

This protein is part of the machinery and does not contain any SeCys. Therefore, no SECIS element has been predicted. No unassigned hits nor pseudogenes nor duplications have been found.

Pstk

The protein Pstk is located in the scaffold gi|511782584|gb|CM001949.1 between positions 115655213 and 115661977, on the forward strand. The gene contains 5 introns and 6 exons.

Despite the fact that there are some differences between the human and macaque query, we have found the same single hit using both human and macaque Pstk as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the Pstk found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated Pstk. Click here to see multiple sequence alignment. The only differences to be mentioned are the 14-aminoacid long fragment which is not found in macaque nor in Chlorocebus sabaeus, and the region in the N-terminus, which is not conserved.

Although Selenoprofiles finds the same hit, the 5th intron it predicts is much larger than the one predicted using the human and macaque queries (3214 bp vs 366 bp). Surprisingly, Selenoprofiles also predicts a completely different 6th exon.

We have not found a SeCys residue. Consequently, no SECIS element has been predicted in the 3’UTR region.
No unassigned hits nor duplications nor pseudogenes have been predicted.

SECp43

SECp43 does not have a SeCys residue; it has a cysteine residue instead. In Chlorocebus, SECp43 is located in the scaffold gi|511782573|gb|CM001961.1| between positions 104239063 and 104215074, on the reverse strand. It has 8 introns and 9 exons.

We have found the same hit using both Human and Macaque SECp43 as well as Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. However, the macaque query lacks the first three exons of the protein.

Indeed, the SECp43 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SECp43. SelenoDB does not correctly predict this protein in mammals, so we have used the queries from the NCBI database. Click here to see multiple sequence alignment.

As expected, no SECIS element has been predicted.

No unassigned hits nor duplications nor pseudogenes have been predicted.

SBP2

The protein SBP2 is located in the scaffold gi|511782581|gb|CM001953.1| between positions 99814066 and 99854849, on the forward strand. The gene contains 16 introns and 17 exons.

We have found the same single hit using both human and macaque SBP2 as well as using Selenoprofiles and Genewise. This can be explained by the high similarity between the two queries. Indeed, the SBP2 found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated SBP2. Click here to see multiple sequence alignment.
This protein is part of the machinery and does not contain any SeCys. Therefore, no SECIS element has been predicted.

We have found one more hit, with both the human and the macaque query, which is likely to be a pseudogene or an interrupted duplication. It shows poor sequence conservation and contains in-frame stop codons. It is located in the scaffold gi|511782575|gb|CM001959.1|.


SBP2 Selenocystein Exonerate Genewise T-Coffee SECIS Seblastian Selenoprofiles
Pseudogene

In addition, using the second macaque query and the human query as well as using Selenoprofiles, we have found an additional hit in the scaffold gi|511782567|gb|CM001967.1|. Interestingly, the two macaque queries are completely different from each other, and the protein predicted in Chlorocebus sabaeus is highly conserved. Click here to see the multiple sequence alignment of the second hit. This predicted protein is not annotated in humans in SelenoDB, but we confirmed its existence with a BLAST search.

eEFSec

The protein eEFSec is located in the scaffold gi|511782571|gb|CM001963.1| between positions 52887459 and 52634507, on the reverse strand. The gene contains 6 introns and 7 exons.

We have found the same hit using both human and macaque eEFSec as well as using Selenoprofiles. This can be explained by the high similarity between the two queries. Indeed, the eEFSec found in Chlorocebus sabaeus also shows high identity with the majority of mammalian annotated eEFSec. However, Genewise does not predict any region. Click here to see multiple sequence alignment.

This protein is part of the machinery and does not contain any SeCys. Therefore, no SECIS element has been predicted.

An additional hit has been found in contig gi|511782576|gb|CM001958.1|. It shows poor sequence conservation and is likely to be a partial duplication or a spurious hit found by chance.

Non-mammalian selenoproteins

We also searched for the following non-mammalian selenoproteins using Selenoprofiles: Fep15, SelL, SelJ, SelQ, Rhor, Sel1, Sel2, Sel3, Sel4, SelD, SelG, Dsba. No significant hits were found.