Discussion

Selenoprotein analysis

Sel15

Sel15 is a selenoprotein whose function is largely unknown. However, some studies suggest that it may have a redox function and may be involved in the quality control of protein folding.

Its gene was found in scaffold LZPO010955251.1 between positions 61771 and 82955 on the forward strand. This was the only tblastn hit obtained when comparing the Sel15 protein of Mus musculus with the Neotoma lepida genome.

Five exons and four introns were predicted, covering the query range from amino acid 0 to 162.

It is a selenoprotein, as it has a selenocysteine located in exon three and we could find a SECIS element between scaffold positions 83514 and 83588 on the same strand.
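The positional reasoning used here (and repeated for the other selenoproteins) can be sketched in a few lines. This is only an illustration, assuming inclusive scaffold coordinates with start <= end and assuming that a SECIS element is expected 3' of the coding region; the function name `secis_is_downstream` is hypothetical and does not belong to any tool used in this analysis.

```python
def secis_is_downstream(gene_start, gene_end, secis_start, secis_end, strand):
    """Return True if the SECIS lies 3' of the gene on the given strand.

    Coordinates are scaffold positions with start <= end (assumed).
    strand is '+' (forward) or '-' (reverse).
    """
    if strand == "+":
        # Forward strand: 3' means higher scaffold coordinates.
        return secis_start > gene_end
    if strand == "-":
        # Reverse strand: 3' means lower scaffold coordinates.
        return secis_end < gene_start
    raise ValueError("strand must be '+' or '-'")


# Sel15 coordinates reported above (forward strand).
print(secis_is_downstream(61771, 82955, 83514, 83588, "+"))  # True
```

On the reverse strand the comparison flips, which is why a SECIS at lower coordinates than the gene is still consistent with a 3' location.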

Glutathione Peroxidase Family

Glutathione peroxidase (GPx) is a family of peroxidase enzymes whose role is to protect the organism from oxidative stress by reducing organic hydroperoxides to their corresponding alcohols and free hydrogen peroxide to water, using glutathione as the reductant.

GPx1

GPx1 is the most abundant isozyme in the Glutathione Peroxidase family. It is ubiquitously expressed and localized in the cytoplasm, and its preferred substrate is hydrogen peroxide.

It is found in scaffold LZPO01058615.1 between positions 969259 and 970069 on the forward strand. This hit was found by comparing the GPx1 protein of Mus musculus with the Neotoma lepida genome.

Two exons and one intron were predicted, covering the query range from amino acid 0 to 197.

It is a selenoprotein, as it has a selenocysteine in its sequence, which was found in exon one, and we could find a SECIS element between scaffold positions 970086 and 970199 on the same strand.

GPx2

GPx2 is an isozyme of the Glutathione Peroxidase family that is mainly expressed in the gastrointestinal tract.

It is found in scaffold LZPO01097134.1 between positions 309872 and 313253 on the reverse strand. This hit was found by comparing the GPx2 protein of Mus musculus with the Neotoma lepida genome.

Two exons and one intron were predicted, covering the query range from amino acid 0 to 190.

It is a selenoprotein, as it has a selenocysteine in its sequence, which was found in exon one, and we could find a SECIS element between scaffold positions 309620 and 309691 on the same strand.

GPx3

GPx3 protein is an isozyme in the Glutathione Peroxidase family.

It is found in scaffold LZPO01027373.1 between positions 820525 and 820647 on the reverse strand. This hit was found by comparing the GPx3 protein of Homo sapiens with the Neotoma lepida genome, because the comparison with Mus musculus GPx3 did not align the selenocysteine.

Three exons and two introns were predicted, covering the query range from amino acid 0 to 100.

It is a selenoprotein, as it has a selenocysteine in the second exon of its sequence, and we could find a SECIS element between scaffold positions 818891 and 818964.

Iodothyronine deiodinase family

DIO1

DIO1 belongs to the Iodothyronine deiodinase family, which is involved in the activation and deactivation of thyroid hormones. Specifically, DIO1 converts T4 into T3. T4 has two rings, each carrying two iodine atoms, and DIO1 is able to deiodinate both rings.

It is found in scaffold LZPO01034880.1 between positions 300281 and 302279 on the forward strand, obtained by comparing the Homo sapiens protein with the Neotoma lepida genome. We used the Homo sapiens protein because comparisons with Mus musculus did not show a good alignment. Moreover, the T-Coffee alignment for LZPO01034880.1 was much better than the one for the scaffold with the best e-value.

Five exons and four introns were predicted, covering a query range from amino acid 34 to 185.

It is a selenoprotein, as it has a selenocysteine located in exon four and we could find a SECIS element between scaffold positions 305933 and 306001 on the same strand.

DIO2

DIO2 is another member of the Iodothyronine deiodinase family. It is able to deiodinate only the outer ring of the T4 hormone and thereby acts as the major activator of thyroid hormone (T4 to T3).

It is found in scaffold LZPO01056726.1 between positions 3219 and 3782 on the forward strand. This hit was found by comparing the DIO2 protein of Mus musculus with the Neotoma lepida genome.

One exon and no introns were predicted, covering the query range from amino acid 74 to 262.
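For a single-exon prediction such as this one, the scaffold span and the aligned query range can be cross-checked with simple arithmetic. The sketch below assumes inclusive scaffold coordinates, a query range counted as end minus start, and an ungapped alignment; `span_in_codons` is a hypothetical helper name, and the equality will not hold for hits whose alignment contains gaps.

```python
def span_in_codons(start, end):
    """Return (whole codons, leftover nucleotides) for an inclusive interval."""
    length_nt = end - start + 1
    return length_nt // 3, length_nt % 3


# DIO2 hit: scaffold positions 3219..3782, query amino acids 74..262.
codons, leftover = span_in_codons(3219, 3782)
query_residues = 262 - 74
print(codons, leftover, query_residues)  # 188 0 188
```

Here the 564-nucleotide span corresponds to 188 codons, which matches the 188 query residues covered by the hit.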

It is a selenoprotein, as it has a selenocysteine in its sequence and we could find a SECIS element between scaffold positions 8356 and 8434 on the same strand.

DIO3

DIO3 is the last member of the Iodothyronine deiodinase family. It is the major T4 inactivator, as it is able to deiodinate only the inner ring of T4.

It is found in scaffold LZPO01089558.1 between positions 2661737 and 2662297 on the reverse strand. This was the best tblastn hit when comparing the DIO3 protein of Mus musculus with the Neotoma lepida genome.

Only one exon was predicted, covering the query range from amino acid 0 to 275.

It is a selenoprotein, as it has a selenocysteine in its sequence and we could find a SECIS element between scaffold positions 2661083 and 2661164 on the same strand.

SELENOH

SelenoH is a selenoprotein involved in a redox-related process.

It is located in scaffold LZPO01027497.1 between positions 70059 and 70608 on the reverse strand. This was the best tblastn hit when comparing the SelenoH protein of Mus musculus with the Neotoma lepida genome.

Three exons and two introns were predicted, covering the query range from amino acid 0 to 116.

It is a selenoprotein, as it has a selenocysteine in the second exon of its sequence and we could find a SECIS element between scaffold positions 69367 and 69440 on the same strand.

SELENOI

SelenoI is a selenoprotein with intrinsic phosphotransferase activity, involved in phospholipid biosynthesis.

Its gene was found in scaffold LZPO01066265.1 between positions 643146 and 661778 on the reverse strand. This was the best tblastn hit when comparing the SelenoI protein of Mus musculus with the Neotoma lepida genome.

Ten exons and nine introns were predicted, covering the query range from amino acid 0 to 398.

It is a selenoprotein, as it has a selenocysteine located in exon ten and we could find a SECIS element between scaffold positions 641879 and 641962 on the same strand.

SELENOK

SelenoK is a selenoprotein highly expressed in the heart, where it is thought to act as an antioxidant.

It is found in scaffold LZPO01116999.1 between positions 10761 and 10829 on the reverse strand. This hit was obtained with tblastn by comparing the SelenoK protein of Mus musculus with the Neotoma lepida genome.

Four exons and three introns were predicted, covering the query range from amino acid 6 to 92.

It is a selenoprotein, as it has a selenocysteine in the first exon of its sequence and we could find a SECIS element between scaffold positions 8585 and 8669 on the same strand.

However, the best tblastn hit for the SelenoK protein in terms of e-value was found in scaffold LZPO01000002.1 between positions 2076907 and 2077032 on the reverse strand, but no SECIS could be predicted in this region and its identity percentage was lower than that of LZPO01116999.1. Therefore, scaffold LZPO01000002.1 could not be confirmed as a possible location of a selenoprotein. A possible explanation is that the genome is poorly assembled and the SECIS location was not sequenced in the same contig.
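The choice between the two candidate scaffolds can be expressed as a small selection rule. The sketch below is only an assumption about how such a rule could be written: the `Hit` class and `pick_hit` function are hypothetical, and the e-values and identity percentages are placeholders chosen to mirror the situation described above, not the values obtained in this analysis.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    scaffold: str
    evalue: float
    identity: float   # percent identity
    has_secis: bool   # whether a SECIS was predicted 3' of the hit


def pick_hit(hits):
    """Prefer hits with a SECIS, then higher identity, then lower e-value."""
    with_secis = [h for h in hits if h.has_secis]
    pool = with_secis or hits  # fall back to all hits if none has a SECIS
    return max(pool, key=lambda h: (h.identity, -h.evalue))


# Placeholder numbers for illustration only (NOT the real tblastn output).
hits = [
    Hit("LZPO01000002.1", evalue=1e-20, identity=60.0, has_secis=False),
    Hit("LZPO01116999.1", evalue=1e-10, identity=80.0, has_secis=True),
]
print(pick_hit(hits).scaffold)  # LZPO01116999.1
```

Filtering on the SECIS first means that a hit with a better e-value is still rejected when no SECIS can be predicted for it.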

SELENOM

SelenoM is an ER-resident thiol-disulfide oxidoreductase that is highly expressed in the brain and has neuroprotective properties.

SelenoM is found in scaffold LZPO01034772.1 between positions 184388 and 184704 on the reverse strand. This was the best tblastn hit when comparing the SelenoM protein of Mus musculus with the Neotoma lepida genome.

Six exons and five introns were predicted, covering a query range from amino acid 16 to 145.

It is a selenoprotein, as it has a selenocysteine in its sequence, which was found in the third exon, and we could find a SECIS element between scaffold positions 184275 and 184346 on the same strand.

SELENON

SelenoN is a selenoprotein whose mutation causes multiminicore disease and congenital muscular dystrophy with spinal rigidity and restrictive respiratory syndrome.

SelenoN is found in scaffold LZPO01099511.1 between positions 2192666 and 2201214 on the forward strand. This was the only tblastn hit when comparing the SelenoN protein of Mus musculus with the Neotoma lepida genome.

Thirteen exons and twelve introns were predicted, covering a query range from amino acid 2 to 498.

It is a selenoprotein, as it has a selenocysteine in its sequence, which was found in the eleventh exon, and we could find a SECIS element between scaffold positions 2202318 and 2202385 on the same strand.

SELENOO

SelenoO is associated with some diseases, such as commensal bacterial infectious disease.

SelenoO is found in scaffold LZPO01054713.1 between positions 723059 and 734720 on the forward strand. This was the only tblastn hit when comparing the SelenoO protein of Mus musculus with the Neotoma lepida genome.

Nine exons and eight introns were predicted, covering a query range from amino acid 0 to 667.

It is a selenoprotein, as it has a selenocysteine in its sequence, which is found in the ninth exon, and we could find a SECIS element between scaffold positions 734809 and 734889 on the same strand.

MSRB1

MSRB1 is an enzyme which catalyzes the reduction of methionine sulfoxide to methionine.

MSRB1 is found in scaffold LZPO01017331.1 between positions 41557 and 45551 on the reverse strand. This was the best tblastn hit when comparing the MSRB1 protein of Mus musculus with the Neotoma lepida genome.

Four exons and three introns were predicted, covering a query range from amino acid 0 to 116.

It is a selenoprotein, as it has a selenocysteine in its sequence, which is found in the third exon, and we could find a SECIS element between scaffold positions 39358 and 39426 on the same strand.

SELENOT

SelenoT has recently been identified as a member of the redoxin protein family, based on the occurrence in its primary structure of a “thioredoxin-like fold” containing a selenocysteine.

It is found in scaffold LZPO01064442.1 between positions 3 and 679 on the reverse strand. This was the best tblastn hit when comparing the SelenoT protein of Mus musculus, Rattus norvegicus and Homo sapiens with the Neotoma lepida genome.

Three exons and two introns were predicted in each case, covering the query range from amino acid 34 to 120.

It is not clear whether it is a selenoprotein: it seems to have a selenocysteine in the second exon of its sequence, but we could not find a SECIS element in any of the three comparisons. The scaffold is considerably short, so we hypothesize that a SECIS element probably exists but could not be found because the region containing it is not included in the scaffold.
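The decision logic applied throughout this section, including the ambiguous SelenoT case, can be summarized in a short sketch. It assumes that the only two pieces of evidence are the residue aligned to the query selenocysteine and whether a SECIS element was predicted; the function name `classify_candidate` and its labels are hypothetical.

```python
def classify_candidate(aligned_residue, secis_found):
    """Classify a predicted gene from the residue aligned to the query Sec.

    aligned_residue: 'U' (selenocysteine), 'C' (cysteine) or another letter.
    secis_found: whether a SECIS element was predicted for the gene.
    """
    if aligned_residue == "U" and secis_found:
        return "selenoprotein"
    if aligned_residue == "U":
        # Sec aligned but no SECIS, as described for SelenoT above.
        return "unclear (Sec aligned, no SECIS found)"
    if aligned_residue == "C":
        return "cysteine-containing homolog"
    return "other"


print(classify_candidate("U", False))  # unclear (Sec aligned, no SECIS found)
```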

SELENOW

SelenoW is a selenoprotein that acts as a glutathione-dependent antioxidant and may be involved in redox-related processes. Its importance may lie in its role in the myopathies of selenium deficiency.

It is located in scaffold LZPO01034912.1 between positions 366904 and 367131 on the reverse strand. This hit was found with tblastn by comparing the SelenoW protein of Mus musculus with the Neotoma lepida genome.

Five exons and four introns were predicted, covering the query range from amino acid 162 to 321.

It is a selenoprotein, as it has a selenocysteine in its sequence, which is located in the fifth exon. This was not the hit with the highest percentage of identity, but it was the best one with a SECIS element 3' of the coding region (at lower scaffold coordinates, since the gene lies on the reverse strand). This SECIS element was located between positions 366769 and 366850 on the same strand.

TXNRD1

TXNRD1 is a member of the pyridine nucleotide-disulfide oxidoreductase family, whose function is to maintain thioredoxins in a reduced state and to protect against oxidative stress.

It is located in scaffold LZPO01034884.1 between positions 517901 and 537436 on the reverse strand. This was the third best tblastn hit when comparing the TXNRD1 protein of Mus musculus with the Neotoma lepida genome, but in terms of identity and T-Coffee alignment scaffold LZPO01034884.1 showed a much better result.

Eighteen exons and seventeen introns were predicted, covering a query range from amino acid 5 to 616.

It is a selenoprotein, as it has a selenocysteine in its sequence, which is located in the eighteenth exon, and we could find a SECIS element between scaffold positions 511088 and 511162 on the same strand as the protein.

TXNRD2

This protein is a member of the pyridine nucleotide-disulfide oxidoreductase family. It is a selenocysteine-containing flavoenzyme that maintains thioredoxins in a reduced state, thereby playing a key role in regulating the cellular redox environment.

It is located in scaffold LZPO01099512.1 between positions 634082 and 653596 on the reverse strand. This was the best tblastn hit when comparing the TXNRD2 protein of Homo sapiens with the Neotoma lepida genome. In this case, even though the same hit can be obtained with the Mus musculus protein, we compared our genome with the human protein because it shows a better alignment.

Seventeen exons and sixteen introns were predicted, covering a query range from amino acid 0 to 494.

It is a selenoprotein, as it has a selenocysteine in its sequence, which is located in the seventeenth exon, and we could find a SECIS element between scaffold positions 630186 and 630256 on the same strand.

TXNRD3

TXNRD3 protein is a member of the pyridine nucleotide-disulfide oxidoreductase family. It catalyzes the reduction of thioredoxin, and is implicated in the defense against oxidative stress.

It is located in scaffold LZPO01017447.1 between positions 119853 and 153041 on the forward strand. This was the best tblastn hit when comparing the TXNRD3 protein of Mus musculus with the Neotoma lepida genome.

Seventeen exons and sixteen introns were predicted, covering a query range from amino acid 0 to 678.

It is a selenoprotein, as it has a selenocysteine in its sequence, which is located in the seventeenth exon, and we could find a SECIS element between scaffold positions 153857 and 153935 on the same strand as the protein.


Cysteine-containing homologs analysis

Glutathione Peroxidase Family

GPx6

GPx6 is an isozyme of the Glutathione Peroxidase family whose expression is restricted to embryos and adult olfactory epithelium.

It is found in scaffold LZPO01108088.1 between positions 66890 and 71567 on the reverse strand. This hit was found by comparing the GPx6 protein of Mus musculus with the Neotoma lepida genome.

Six exons and five introns were predicted, covering the query range from amino acid 0 to 233.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.
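The observation of aligned cysteines can be illustrated with a minimal sketch. It assumes a pairwise alignment given as two gapped strings of equal length; the function name `aligned_cysteine_columns` is hypothetical, and the two fragments are invented toy sequences, not real GPx6 sequence.

```python
def aligned_cysteine_columns(query, target):
    """Return 0-based alignment columns where both sequences have 'C'."""
    if len(query) != len(target):
        raise ValueError("aligned sequences must have equal length")
    return [
        i for i, (q, t) in enumerate(zip(query, target))
        if q == "C" and t == "C"
    ]


# Toy aligned fragments, invented for illustration only.
query = "MAC-KLCG"
target = "MSCQKLCG"
print(aligned_cysteine_columns(query, target))  # [2, 6]
```

In a real comparison, the column of interest is the one where a selenoprotein ortholog would carry selenocysteine (U); a cysteine in both sequences at that column is what supports the cysteine-containing homolog interpretation.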

GPx7

GPx7 protein is an isozyme in the Glutathione Peroxidase family.

It is found in scaffold LZPO01007972.1 between positions 546082 and 548263 on the reverse strand. This hit was found by comparing the GPx7 protein of Mus musculus with the Neotoma lepida genome.

Two exons and one intron were predicted, covering the query range from amino acid 0 to 131.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

GPx

GPx protein is an enzyme in the Glutathione Peroxidase family.

It is found in scaffold LZPO01008220.1 between positions 26077 and 26340 on the forward strand. This hit was found by comparing the GPx protein of Mus musculus with the Neotoma lepida genome.

One exon was predicted, covering the query range from amino acid 0 to 88.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

MsrA

MsrA is a protein highly conserved across species, whose function is the enzymatic reduction of methionine sulfoxide to methionine.

It is found in scaffold LZPO01087211.1 between positions 313217 and 313408 on the reverse strand. This was the best tblastn hit in terms of e-value when comparing the MsrA protein of Mus musculus with the Neotoma lepida genome.

Five exons and four introns were predicted, covering a query range from amino acid 121 to 233.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

MSRB2

MSRB2 is an enzyme which catalyzes the reduction of methionine sulfoxide to methionine.

MSRB2 is found in scaffold LZPO01097099.1 between positions 77845 and 95298 on the reverse strand. This was the best tblastn hit when comparing the MSRB2 protein of Mus musculus with the Neotoma lepida genome.

Four exons and three introns were predicted, including a query range from amino acid 0 to 145.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

MSRB3

MSRB3 is an enzyme which catalyzes the reduction of methionine sulfoxide to methionine.

MSRB3 is found in scaffold LZPO01075874.1 between positions 60417 and 143594 on the forward strand. This was the best tblastn hit when comparing the MSRB3 protein of Mus musculus with the Neotoma lepida genome.

Seven exons and six introns were predicted, covering a query range from amino acid 3 to 188.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

SELENOU1

SelenoU1 is expressed in bone, brain, liver, and kidney. Its function remains unknown, although it appears to be involved in redox processes.

The best tblastn hit between the Mus musculus SelenoU1 protein, which is a cysteine-containing homolog, and the Neotoma lepida genome is located in scaffold LZPO01055208.1 between positions 96128 and 103405 on the reverse strand.

Five exons and four introns were predicted in the gene, covering the query range from amino acid 53 to 227.

We hypothesize that this protein in Neotoma lepida is a cysteine-containing homolog as well. This would explain why we could not find any SECIS element and why we found several cysteines aligned in the sequence.

SELENOU2

The only tblastn hit for SelenoU2, comparing the Mus musculus protein with the Neotoma lepida genome, is located in scaffold LZPO01107979.1 between positions 213304 and 229685 on the forward strand.

Six exons and five introns were predicted, covering the query range from amino acid 0 to 224.

Mus musculus SelenoU2 is a cysteine-containing homolog, so our hypothesis is that this protein in Neotoma lepida is a cysteine-containing homolog as well. This would explain why we could not find any SECIS element and why we found several cysteines aligned in the sequence.

SELENOU3

The Mus musculus SelenoU3 protein was compared with the Neotoma lepida genome. With tblastn we found only one hit, located in scaffold LZPO01055099.1 between positions 244602 and 246266 on the forward strand.

Seven exons and six introns were predicted, including query range from amino acid 0 to 201.

SelenoU3 is a cysteine-containing homolog in Mus musculus, so our hypothesis is that this protein in Neotoma lepida is a cysteine-containing homolog as well. This would explain why we could not find any SECIS element and why we found several cysteines aligned in the sequence.


Machinery analysis

SecS

SecS, or selenocysteine synthase, catalyses the synthesis of selenocysteyl-tRNA(Sec) from seryl-tRNA(Sec) in a pyridoxal phosphate-dependent reaction mechanism. The enzyme specifically recognizes the tRNA(Sec) molecule; a cooperative interaction between the tRNA binding site and the catalytically active pyridoxal phosphate site is suggested.

SecS is found in scaffold LZPO01055257.1 between positions 400 and 23954 on the reverse strand. This was the best tblastn hit in terms of percentage of identity when comparing the SecS protein of Homo sapiens with the Neotoma lepida genome. We used Homo sapiens SecS because tblastn did not find any alignment with the Mus musculus protein, and we chose LZPO01055257.1 over the scaffold with the better e-value because its alignment was much better.

Twelve exons and eleven introns were predicted, covering a query range from amino acid 22 to 411.

It is a homologous protein, probably because the ancestors of Homo sapiens and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

eEFsec

The murine eEFSec elongation factor binds to selenocysteyl-tRNA[Ser]Sec. It forms the Sec decoding apparatus by associating with SECIS binding protein 2 (SBP2) in the presence of selenocysteyl-tRNA.

It is found in scaffold LZPO01055714.1 between positions 32412 and 54176 on the forward strand. This was not the best tblastn hit in terms of e-value when comparing the eEFsec protein of Mus musculus with the Neotoma lepida genome, but its alignment was better than that of the hit with the best e-value.

Six exons and five introns were predicted, covering a query range from amino acid 11 to 224.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

PSTK

PSTK, or Phosphoseryl-tRNA[Ser]Sec kinase, is characterized as a protein that phosphorylates the seryl moiety on seryl-tRNA[Ser]Sec in the presence of ATP and Mg2+. Moreover, the function and homology of this protein are conserved across archaea and eukaryotes that synthesize selenoproteins, which suggests that it plays an important role in selenoprotein biosynthesis and/or regulation.

PSTK is found in scaffold LZPO01108068.1 between positions 174117 and 176938 on the forward strand. This was the best tblastn hit when comparing the PSTK protein of Mus musculus with the Neotoma lepida genome.

Three exons and two introns were predicted, covering a query range from amino acid 0 to 225.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

SBP2

SBP2, or SECIS Binding Protein 2, is a protein that binds to the Sec insertion sequence (SECIS). It is therefore a machinery protein.

SBP2 is found in scaffold LZPO01044437.1 between positions 97563 and 97817 on the forward strand. This was not the best tblastn hit in terms of e-value and identity percentage when comparing the SBP2 protein of Mus musculus with the Neotoma lepida genome. However, T-Coffee showed a much better alignment between scaffold LZPO01044437.1 and the Mus musculus protein.

Nineteen exons and eighteen introns were predicted, making up a long sequence that covers a query range from amino acid 279 to 845.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.

SEPHS2

SEPHS2, or selenophosphate synthetase 2, is an enzyme that synthesizes selenophosphate (the selenium donor used to synthesize selenocysteine) from selenide and ATP. It is therefore a machinery protein.

It is found in scaffold LZPO01099884.1 between positions 1827908 and 1829143 on the forward strand. This was the best tblastn hit when comparing the SEPHS2 protein of Mus musculus with our Neotoma lepida genome.

One exon was predicted, covering the query range from amino acid 4 to 451.

It is a selenoprotein, as it has a selenocysteine in its sequence and we could find a SECIS element between scaffold positions 1829727 and 1829803 on the same strand.

Secp43

Selenocysteine tRNA 1 associated protein, also called Secp43, has a role as a co-factor in selenoprotein expression.

It is found in scaffold LZPO01110309.1 between positions 1810549 and 1823482 on the reverse strand. This was the best tblastn hit when comparing the Secp43 protein of Mus musculus with the Neotoma lepida genome.

Six exons and five introns were predicted, covering a query range from amino acid 0 to 245.

It is a homologous protein, probably because the ancestors of Mus musculus and Neotoma lepida had a selenocysteine in their sequence. However, over the course of evolution the ancestral selenocysteine changed to a cysteine in both species. This is why we see specific cysteine alignments between them.