Results

The following table summarizes the results of our analysis of the selenoproteins in the Anas zonorhyncha proteome. The files obtained for each protein are also provided: its sequence, the tBLASTn, Exonerate and T-Coffee outputs, the SECIS sequence, the SECIS image and the Seblastian prediction. The results are represented using the legend below.


Protein Species Residue Scaffold Gene location tBLASTn Exonerate T-Coffee SECIS info SECIS image Seblastian
Selenoproteins
15kDa selenoprotein family
Sel15 Sel SPP00001043_2.0 (-) 68456-50128
Glutathione peroxidase family
GPx3 Sel SPP00001033_2.0 (+) 50042-50921
GPx7 Cys SPP00001034_2.0 (+) 50000-51653
GPx8 Cys SPP00001035_2.0 (+) 49197-52251
Iodothyronine deiodinase family
DIO1 Sel SPP00001030_2.0 (+) 46828-50742
DIO2 Sel SPP00001031_2.0 (-) 61722-50569
DIO3 Sel SPP00000003_1.0 (+) 49958-50749
Methionine sulfoxide reductase family
MsrB1 Sel SPP00001056_2.1 (-) 50003-49898
MsrB3 Cys SPP00001057_2.0 (+) 1169-50164
Selenoprotein I family
SELENOI Sel SPP00001048_2.0 (+) 18216-25671
Selenoprotein K family
SELENOK Sel SPP00001044_2.0 (+) 49649-51609
Selenoprotein N family
SELENON Sel SPP00001049_2.0 (-) 27994-13434
Selenoprotein O family
SELENOO Sel SPP00001052_2.0 (+) 40505-54611
Selenoprotein P family
SELENOPa Sel SPP00000020_1.0 (-) 53957-47900
SELENOPb Sel SPP00001055_2.0 (-) 52064-46290
Selenoprotein S family
SELENOS Sel SPP00001058_2.0 (-) 52874-48043
Selenoprotein T family
SELENOT Sel SPP00001059_2.0 (-) 51204-47624
Selenoprotein U family
SELENOU1 Sel SPP00001060_2.0 (+) 49997-55771
Thioredoxin reductase family
TXNRD Sel SPP00001061_2.0 (-) 56382-31146
TXNRD2 Sel SPP00001062_2.0 (-) 63209-32444
TXNRD3 Sel SPP00001063_2.0 (-) 58148-39754
Selenoprotein machinery
Eukaryotic elongation factor family
eEFsec Cys SPP00001064_2.0 (+) 13930-63811
Phosphoseryl-tRNA kinase family
PSTK Cys SPP00001065_2.0 (-) 52044-50202
Selenophosphate synthetase family
SEPHS Sel SPP00001040_2.0 (-) 64042-45797
SECIS binding protein 2 family
SBP2 Cys SPP00001039_2.0 (+) 50000-50218
Sec synthase family
SECS Cys SPP00001042_2.0 (-) 53842-31904
SECp43 family
SECp43 Cys SPP00001066_2.0 (+) 44931-57399
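
The data rows of the summary table follow a regular pattern: protein name, residue, scaffold, strand in parentheses and a start-end location. A minimal Python sketch for reading them is given here; the regular expression and the name `parse_rows` are illustrative assumptions and were not part of the original analysis. Family headings do not match the pattern and are skipped.

```python
import re

# Assumed row layout: NAME RESIDUE SCAFFOLD (STRAND) START-END
ROW = re.compile(
    r"^(?P<protein>\S+)\s+(?P<residue>Sel|Cys)\s+(?P<scaffold>\S+)\s+"
    r"\((?P<strand>[+-])\)\s+(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_rows(lines):
    """Return one dict per data row; lines that do not match are ignored."""
    rows = []
    for line in lines:
        match = ROW.match(line.strip())
        if match:
            row = match.groupdict()
            row["start"] = int(row["start"])
            row["end"] = int(row["end"])
            rows.append(row)
    return rows


# A few rows copied from the table above.
sample = [
    "Glutathione peroxidase family",
    "GPx3 Sel SPP00001033_2.0 (+) 50042-50921",
    "GPx7 Cys SPP00001034_2.0 (+) 50000-51653",
    "GPx8 Cys SPP00001035_2.0 (+) 49197-52251",
]
rows = parse_rows(sample)
sel_proteins = [r["protein"] for r in rows if r["residue"] == "Sel"]
print(sel_proteins)
```

Once parsed, the rows can be filtered by residue or strand, for example to list only the proteins annotated with a selenocysteine.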


Selenoprotein analysis

15kDa selenoprotein

The query corresponding to Sel15 is located in scaffold SPP00001043_2.0, between positions 68456 and 50128 on the reverse strand. The gene contains four exons. Its structure, analyzed with the Exonerate file, is described below:

Sel15     E1           E2           E3           E4
          68456-68623  59270-59333  51882-51931  50000-50128

Seblastian was able to predict one known selenoprotein using the 15 kDa selenoprotein from Anas platyrhynchos. Regarding the SECIS elements, one grade A SECIS was found between positions 49394 and 49464. Since it lies downstream of the last exon of this reverse-strand gene, in the 3' UTR, we consider this SECIS the most suitable one.


Glutathione peroxidase family

  • Glutathione peroxidase 3

The query corresponding to GPx3 from chicken was aligned to scaffold SPP00001033_2.0 of the Anas zonorhyncha genome, from position 50042 to position 50921 on the forward strand. The gene contains four exons, with one selenocysteine in the second one. Its structure, analyzed with the Exonerate file, is described below:

GPx3      E1           E2           E3           E4
          50042-50198  50317-50434  50521-50620  50718-50921

Seblastian predicted one grade A SECIS between positions 51133 and 51212 on the forward strand. It is a suitable SECIS because it lies in the 3' UTR. Seblastian could not predict any selenoprotein.
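
The reasoning used throughout this section to accept or discard a SECIS (same strand as the gene, and downstream of its last exon) can be sketched in Python as follows. This is a simplified illustration, not the procedure used by Seblastian or SECISearch3: the `Feature` class and the `classify_secis` function are hypothetical names, and lying downstream on the same strand is only treated as a proxy for being in the 3' UTR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    strand: str  # "+" (forward) or "-" (reverse)
    low: int     # smaller scaffold coordinate
    high: int    # larger scaffold coordinate


def classify_secis(gene: Feature, secis: Feature) -> str:
    """Describe where a SECIS prediction lies relative to a gene span."""
    if secis.strand != gene.strand:
        return "opposite strand"
    # Any overlap with the span covered by the exons counts as intragenic.
    if secis.low <= gene.high and secis.high >= gene.low:
        return "within gene span"
    if gene.strand == "+":
        downstream = secis.low > gene.high
    else:
        # On the reverse strand, downstream means smaller coordinates.
        downstream = secis.high < gene.low
    return "downstream (3' side)" if downstream else "upstream (5' side)"


# GPx3: gene and SECIS coordinates as reported in the text above.
gpx3 = Feature("+", 50042, 50921)
gpx3_secis = Feature("+", 51133, 51212)
print(classify_secis(gpx3, gpx3_secis))
```

For reverse-strand genes such as DIO2, the same function treats a SECIS with smaller coordinates than the last exon as downstream, which matches the way the 3' UTR is identified in the descriptions of this section.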


  • Glutathione peroxidase 7

The query corresponding to GPx7 is located on the forward strand of scaffold SPP00001034_2.0, between positions 50000 and 51653. The gene is composed of two exons without any selenocysteine, distributed as shown below:

GPx7      E1           E2
          50000-50264  51529-51653

Seblastian did not predict any selenoprotein because GPx7 is a cysteine homologue. For the same reason, no SECIS elements were identified.


  • Glutathione peroxidase 8

The query corresponding to GPx8 is located on the forward strand of scaffold SPP00001035_2.0, between positions 49197 and 52251. The gene is composed of three exons without any selenocysteine, distributed as shown below:

GPx8      E1           E2           E3
          49197-49403  50003-50264  52091-52251

Seblastian did not predict any selenoprotein but predicted one grade B SECIS. As this SECIS was predicted on the reverse strand, it cannot be considered a valid SECIS.


Iodothyronine deiodinase family

  • Iodothyronine deiodinase 1

The query corresponding to DIO1 is located in scaffold SPP00001030_2.0, between positions 46828 and 50742 on the forward strand. The gene is formed by four exons, with a selenocysteine in the second exon. Its structure, analyzed with the Exonerate file, is described below:

DIO1      E1           E2           E3           E4
          46828-47158  48641-48784  50001-50200  50680-50742

The protein predicted by Seblastian matches our result exactly: it also has four exons, with the selenocysteine in the second one. Seblastian found one grade A SECIS between positions 51233 and 51300 on the forward strand. It is located in the 3' UTR, so we consider it a suitable SECIS.


  • Iodothyronine deiodinase 2

The query corresponding to DIO2 is located in scaffold SPP00001031_2.0, between positions 61722 and 50569 on the reverse strand. The gene has only two exons, with the selenocysteine in the second one. Its structure, analyzed with the Exonerate file, is described below:

DIO2      E1           E2
          61722-61943  50000-50569

Seblastian predicted one grade A SECIS between positions 45485 and 45559 on the reverse strand, which places it in the 3' UTR. The selenoprotein could not be predicted by Seblastian.


  • Iodothyronine deiodinase 3

The query corresponding to DIO3 was found in scaffold SPP00000003_1.0, between positions 49958 and 50749 on the forward strand. The gene contains a single exon with a selenocysteine. Its structure, analyzed with the Exonerate file, is described below:

DIO3      E1
          49958-50749

Seblastian predicted a grade A SECIS located between positions 51183 and 51263, in the 3' UTR of the sequence.

Seblastian also predicted a selenoprotein, the known type III iodothyronine deiodinase from Anas platyrhynchos.


Methionine sulfoxide reductase family

  • Methionine R-sulfoxide reductase 1

The query corresponding to MsrB1 was found in scaffold SPP00001056_2.1, between positions 50003 and 49898 on the reverse strand. The gene has two exons, with the selenocysteine located in the second one, distributed as shown below:

MsrB1     E1           E2
          50003-50149  49776-49898

The selenoprotein predicted by Seblastian agrees with our prediction, except that the Seblastian result contains one additional exon in the 3' UTR region. Seblastian also predicted a grade A SECIS on the reverse strand, between positions 48579 and 48650, which places it in the 3' UTR.


  • Methionine R-sulfoxide reductase 3

The query corresponding to MsrB3 was found in scaffold SPP00001057_2.0, between positions 1169 and 50164 on the forward strand. The gene is formed by five exons, but no selenocysteine was found. Its structure, analyzed with the Exonerate file, is described below:

MsrB3     E1           E2           E3           E4           E5
          1169-1281    3082-3159    13588-13616  45503-45600  50003-50164

No selenoproteins could be predicted by Seblastian, and no SECIS elements were identified by SECISearch3. This agrees with the expected results, since MsrB3 is a cysteine homologue.


Selenoprotein I

The query corresponding to Selenoprotein I was found in scaffold SPP00001048_2.0, on the forward strand. Its exons span positions 18216 to 25671. The gene is distributed in seven exons, as shown below, with a selenocysteine in exon 6:

SELENOI   E1           E2           E3           E4           E5           E6           E7
          18216-18297  18790-19052  20302-20410  21894-21942  22388-22568  23342-23524  25558-25671

Seblastian was unable to predict the protein or any SECIS.
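
The exon tables in this section can be summarized with a few lines of Python. The sketch below recomputes the exon count, the overall span and the total exonic length for Selenoprotein I from the coordinates listed above. It assumes that the coordinates are 1-based and inclusive (so an exon's length is end - start + 1); the function name `summarize_exons` is illustrative.

```python
def summarize_exons(exons):
    """Summarize a list of (start, end) exon coordinates.

    Assumption: coordinates are 1-based and inclusive.
    """
    lengths = [abs(end - start) + 1 for start, end in exons]
    coords = [c for pair in exons for c in pair]
    return {
        "n_exons": len(exons),
        "span": (min(coords), max(coords)),
        "exon_lengths": lengths,
        "total_exonic_bp": sum(lengths),
    }


# Selenoprotein I exons as listed in the table above.
selenoi = [
    (18216, 18297), (18790, 19052), (20302, 20410), (21894, 21942),
    (22388, 22568), (23342, 23524), (25558, 25671),
]
summary = summarize_exons(selenoi)
print(summary["n_exons"], summary["span"], summary["total_exonic_bp"])
```

Under these assumptions the seven exons span 18216-25671, in agreement with the description above, and add up to 981 bp.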


Selenoprotein K

The query corresponding to Selenoprotein K was found in scaffold SPP00001044_2.0, on the forward strand. The gene has four exons distributed between positions 49649 and 51609, as shown below, with the selenocysteine in exon 4:

SELENOK   E1           E2           E3           E4
          49649-49667  50019-50109  50850-50927  51516-51609

Seblastian was able to predict the protein. However, this sequence has one exon missing in comparison with the query from Ensembl. Seblastian also predicted a grade A SECIS at positions 52621-52707 on the forward strand. It is therefore located in the 3' UTR and is a valid SECIS.


Selenoprotein N

The query corresponding to Selenoprotein N was found in scaffold SPP00001049_2.0, on the reverse strand. The gene has 12 exons distributed between positions 27994 and 13434, as shown below, with a selenocysteine in exon 6:

SELENON   E1           E2           E3           E4           E5           E6
          27994-28101  23248-23365  22314-22444  21629-21841  20617-20741  19672-19809
          E7           E8           E9           E10          E11          E12
          18870-18951  17728-17916  16347-16452  15404-15516  14628-14729  13270-13434

Seblastian was able to predict the protein, although it predicted 9 exons instead of 12. It also predicted a grade A SECIS at positions 10322-10255 on the reverse strand. It is therefore located in the 3' UTR and is a valid SECIS.


Selenoprotein O

The query corresponding to Selenoprotein O was found in scaffold SPP00001052_2.0, on the forward strand. The gene has nine exons distributed between positions 40505 and 54611, as shown below, with a selenocysteine in exon 9:

SELENOO   E1           E2           E3           E4           E5
          40505-40648  44962-45104  46056-46236  49540-49670  50017-50297
          E6           E7           E8           E9
          50646-50796  52350-52535  53630-53786  54420-54611

Seblastian was able to predict the protein, although with some differences. The first exon of our prediction was split into two by Seblastian, with the first Seblastian exon starting at an earlier position and with a methionine (M) residue. Seblastian also predicted a grade A SECIS at positions 65771-65845 on the forward strand. It is therefore located in the 3' UTR and is a valid SECIS.


Selenoprotein P family

  • Selenoprotein Pa

The query corresponding to SelenoPa is located on the reverse strand of scaffold SPP00001054_2.0, between positions 53957 and 47900. The gene is composed of five exons and contains 13 selenocysteines: one in the first exon and the other twelve in the last one. The exons are distributed as shown below:

SELENOPa  E1           E2           E3           E4           E5
          53957-54147  51897-52109  50006-50123  47934-48183  47539-47900

Seblastian predicted two valid SECIS elements in the 3' UTR on the reverse strand: one grade A SECIS between positions 47151 and 47222, and one grade B SECIS between positions 46481 and 46551.

Seblastian predicted a known selenoprotein in the sequence, using the partial selenoprotein P from Anas platyrhynchos as the query. Nevertheless, it is important to highlight that the predicted protein is of low quality.


  • Selenoprotein Pb

The query corresponding to SelenoPb is located on the reverse strand of scaffold SPP00001055_2.0, between positions 52064 and 46290. The gene is composed of four exons and contains just one selenocysteine, located in the first exon. The exons are distributed as shown below:

SELENOPb  E1           E2           E3           E4
          52064-52254  50004-50216  48113-48230  46081-46290

Seblastian predicted two valid SECIS elements in the 3' UTR on the reverse strand: one grade A SECIS between positions 47151 and 47222, and one grade B SECIS between positions 46481 and 46551.

Seblastian also predicted a known selenoprotein, but it matches SelenoPa rather than SelenoPb, so we did not consider it a plausible selenoprotein prediction.


Selenoprotein S

The query corresponding to SelenoS was found in scaffold SPP00001058_2.0, from position 52874 to position 48043 on the reverse strand. The gene contains six exons, with a selenocysteine in the sixth one. Its structure, analyzed with the Exonerate file, is described below:

SELENOS   E1           E2           E3           E4           E5           E6
          52874-52903  51962-52136  50748-50854  50184-50273  50002-50080  47961-48043

Seblastian predicted a known selenoprotein, using Selenoprotein S from Gallus gallus as the query.

Regarding the SECIS, Seblastian predicted two of them, one grade A and one grade B. The grade B element is located on the forward strand, so it is not a plausible SECIS for our target sequence. The grade A element is valid, as it is located from position 47312 to position 47402 on the reverse strand, in the 3' UTR.


Selenoprotein T

The query corresponding to SelenoT is located on the reverse strand of scaffold SPP00001059_2.0, between positions 51204 and 47624. The gene is composed of four exons, with a selenocysteine in the first one. The four exons are distributed as shown below:

SELENOT   E1           E2           E3           E4
          51204-51313  50003-50120  48461-48548  47503-47624

Seblastian did not predict any known selenoprotein. However, it predicted a valid grade A SECIS for this sequence, located on the reverse strand from position 45790 to position 45871, in the 3' UTR.


Selenoprotein U

The query corresponding to SelenoU1 is located on the forward strand of scaffold SPP00001060_2.0. The gene is distributed in five exons between positions 49997 and 55771, as shown below, with a selenocysteine in the second exon.

SELENOU1  E1           E2           E3           E4           E5
          49997-50183  50685-50776  54251-54391  55368-55532  55676-55771

Seblastian predicted a known selenoprotein from this sequence; the query protein was the redox-regulatory protein FAM213A from Anas platyrhynchos.

Seblastian predicted a grade A SECIS element between positions 55977 and 56041 on the forward strand. As it is located in the 3' UTR of the sequence, we consider it a valid SECIS.


Thioredoxin reductase family

  • Thioredoxin reductase

The query corresponding to the TXNRD protein was found in scaffold SPP00001061_2.0, on the reverse strand, between positions 56382 and 31146. The gene is distributed in 13 exons, as shown below, with a selenocysteine in the last one:

TXNRD     E1           E2           E3           E4           E5           E6           E7
          56382-56510  53643-53715  53407-53526  52392-52534  52150-52265  50003-50228  47566-47658
          E8           E9           E10          E11          E12          E13
          42026-42102  40193-40349  38026-38133  36532-36627  34307-34441  31081-31146

Seblastian predicted a known selenoprotein in the sequence, the partial thioredoxin reductase 1 from Meleagris gallopavo.

Seblastian predicted two SECIS elements, one grade A and one grade B. The grade B element is located on the forward strand, so it cannot be a SECIS element of the target sequence. The grade A element is valid, as it is located in the 3' UTR on the reverse strand, between positions 30756 and 30847.


  • Thioredoxin reductase 2

The query corresponding to TXNRD2 was found in scaffold SPP00001062_2.0, on the reverse strand, between positions 63209 and 32444. The gene has 16 exons, with a selenocysteine in the last one, distributed as follows:

TXNRD2    E1           E2           E3           E4           E5
          63209-63284  62045-62110  61279-61335  60519-60663  60131-60205
          E6           E7           E8           E9           E10
          59239-59317  58136-58198  56938-57008  52700-52811  50008-50182
          E11          E12          E13          E14          E15          E16
          43255-43391  37849-37944  37549-37641  36646-36717  34709-34806  32318-32444

Seblastian predicted a known selenoprotein in this sequence, the partial thioredoxin reductase 2 from Meleagris gallopavo.

Seblastian predicted two different SECIS elements, one grade A and one grade B. Both are located on the reverse strand, but only the grade A element lies in the 3' UTR, between positions 29637 and 29714.


  • Thioredoxin reductase 3

The TXNRD3 gene is located in scaffold SPP00001063_2.0, between positions 58148 and 39754 on the reverse strand. This gene also has 16 exons, with a selenocysteine in the last one, distributed as shown below:

TXNRD3    E1           E2           E3           E4           E5
          58148-58267  51634-51694  50567-50676  49881-49994  48458-48530
          E6           E7           E8           E9           E10
          47531-47650  47027-47169  45516-45631  45186-45411  44867-44959
          E11          E12          E13          E14          E15          E16
          44328-44404  43541-43366  42904-43011  41957-42052  40045-40179  39689-39754

Seblastian predicted a known selenoprotein in the sequence, the thioredoxin reductase 3 from Gallus gallus.

Seblastian also predicted a grade A SECIS element on the reverse strand of the sequence, in the 3' UTR between positions 39377 and 39459.
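
Long exon tables such as the one for TXNRD3 are easy to mistype, so a small consistency check can be useful. The Python sketch below is an illustration only: the function name `check_exons` and the two rules it applies (each exon listed as smaller coordinate first, and exons in 5' to 3' order without overlap for the given strand) are assumptions based on how the other tables in this section are written.

```python
def check_exons(exons, strand):
    """Return a list of human-readable problems found in an exon list.

    Assumed conventions: each exon is (start, end) with start <= end, and
    exons are listed 5' to 3', i.e. ascending on "+" and descending on "-".
    """
    problems = []
    for i, (start, end) in enumerate(exons, start=1):
        if start > end:
            problems.append(f"E{i}: start {start} > end {end}")
    for i in range(1, len(exons)):
        prev_lo, prev_hi = sorted(exons[i - 1])
        cur_lo, cur_hi = sorted(exons[i])
        in_order = cur_lo > prev_hi if strand == "+" else cur_hi < prev_lo
        if not in_order:
            problems.append(f"E{i + 1}: out of order or overlapping with E{i}")
    return problems


# TXNRD3 exons exactly as listed in the table above (reverse strand).
txnrd3 = [
    (58148, 58267), (51634, 51694), (50567, 50676), (49881, 49994),
    (48458, 48530), (47531, 47650), (47027, 47169), (45516, 45631),
    (45186, 45411), (44867, 44959), (44328, 44404), (43541, 43366),
    (42904, 43011), (41957, 42052), (40045, 40179), (39689, 39754),
]
issues = check_exons(txnrd3, "-")
print(issues)
```

Applied to the coordinates as printed, the check reports only E12, whose listed start (43541) is greater than its end (43366); the order of the exons along the reverse strand is otherwise consistent.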



Selenoprotein machinery analysis


Eukaryotic elongation factor

The query corresponding to eEFsec is located on the forward strand of scaffold SPP00001064_2.0, between positions 13930 and 63811. The gene is composed of five exons without any selenocysteine. This agrees with the expected results, as it is a machinery protein. The exons are distributed as shown below:

eEFsec    E1           E2           E3           E4           E5
          13930-14138  17974-18070  20773-20937  50000-50641  63734-63811

Seblastian predicted neither a selenoprotein nor any SECIS element in this sequence.


Phosphoseryl-tRNA kinase

The query corresponding to PSTK was found in scaffold SPP00001065_2.0, between positions 52044 and 50202 on the reverse strand. The gene is formed by six exons, but no selenocysteine was found. Its structure, analyzed with the Exonerate file, is described below:

PSTK      E1           E2           E3           E4           E5           E6
          52044-52223  51857-51947  51445-51673  50990-51065  50415-50508  50000-50202

No selenoprotein was predicted by Seblastian and no SECIS elements were found either, which agrees with the expected results since it is a machinery protein.


Selenophosphate synthetase

The query corresponding to SEPHS was found on the reverse strand of scaffold SPP00001040_2.0. The gene contains a selenocysteine and is distributed in eight exons, the first one starting at position 64042 and the last one ending at position 45797. No selenocysteine has been found in the transcript.

SEPHS     E1           E2           E3           E4
          64042-64234  60928-61031  56613-56720  55670-55824
          E5           E6           E7           E8
          54807-54897  51449-51558  50002-50219  45586-45797

Even though Seblastian did not predict any known selenoprotein in the sequence, it predicted a grade B SECIS element on the reverse strand of the sequence, in the 3' UTR, between positions 43247 and 43325.


SECIS binding protein 2

This sequence is located in scaffold SPP00001038_2.0, between positions 32018 and 53085 on the forward strand. The gene is formed by 16 exons, but no selenocysteine was found. Its structure, analyzed with the Exonerate file, is described below:

SBP2      E1           E2           E3           E4           E5
          32018-32169  33270-33462  33949-34078  34515-34738  35428-35563
          E6           E7           E8           E9           E10
          37345-37493  38046-38204  41156-41260  43611-43776  46170-46339
          E11          E12          E13          E14          E15          E16
          47954-48089  48754-48895  49999-50219  50967-51121  52010-52208  53009-53085

No selenoprotein could be predicted in this sequence. Nevertheless, one grade B SECIS was found. Since this SECIS is located at positions 40543-40620, it falls within the gene rather than in the 3' UTR, so we discard it. This agrees with the expected result, because SBP2 is a machinery protein.


Sec synthase

This gene was found in scaffold SPP00001042_2.0, between positions 53842 and 31904 on the reverse strand. The gene has 11 exons, but no selenocysteines were found. Its structure, analyzed with the Exonerate file, is described below:

SECS      E1           E2           E3           E4           E5           E6
          53842-53955  52584-52738  50245-50363  50002-50160  48962-49115  44844-44946
          E7           E8           E9           E10          E11
          43210-43339  42991-43082  34327-34420  33585-33675  31715-31904

No selenoproteins were predicted, but one SECIS was found. This SECIS is grade C and is located between positions 5785 and 5870. As in SecS_1, we discard this SECIS because it is located in the 5' UTR.


SECp43

This sequence is located in scaffold SPP00001066_2.0, between positions 44931 and 57399 on the forward strand. The gene contains seven exons, but no selenocysteine was found. Its structure, analyzed with the Exonerate file, is described below:

SECp43    E1           E2           E3           E4           E5           E6           E7
          44931-45028  45813-45912  48709-45761  50002-50133  53245-53364  54756-54933  57256-57399

No selenoproteins were found in the sequence, but Seblastian found three different SECIS elements. Since two of them are located on the reverse strand, we discard them. The third SECIS is grade B and is found between positions 70361 and 70437. Its location places it in the 3' UTR, so we select it as the most suitable one.