The following table summarizes the results of our analysis of the selenoproteins in the Anas zonorhyncha proteome. The files obtained for each protein are also available: its sequence, tBLASTn, Exonerate, T-Coffee, SECIS sequence, SECIS image and Seblastian outputs. The results are represented using the legend below.
Group | Family | Protein | Residue | Scaffold | Gene location |
---|---|---|---|---|---|
Selenoproteins | 15kDa selenoprotein family | Sel15 | Sel | SPP00001043_2.0 | (-) 68456-50128 |
Selenoproteins | Glutathione peroxidase family | GPx3 | Sel | SPP00001033_2.0 | (+) 50042-50921 |
Selenoproteins | Glutathione peroxidase family | GPx7 | Cys | SPP00001034_2.0 | (+) 50000-51653 |
Selenoproteins | Glutathione peroxidase family | GPx8 | Cys | SPP00001035_2.0 | (+) 49197-52251 |
Selenoproteins | Iodothyronine deiodinase family | DIO1 | Sel | SPP00001030_2.0 | (+) 46828-50742 |
Selenoproteins | Iodothyronine deiodinase family | DIO2 | Sel | SPP00001031_2.0 | (-) 61722-50569 |
Selenoproteins | Iodothyronine deiodinase family | DIO3 | Sel | SPP00000003_1.0 | (+) 49958-50749 |
Selenoproteins | Methionine sulfoxide reductase family | MsrB1 | Sel | SPP00001056_2.1 | (-) 50003-49898 |
Selenoproteins | Methionine sulfoxide reductase family | MsrB3 | Cys | SPP00001057_2.0 | (+) 1169-50164 |
Selenoproteins | Selenoprotein I family | SELENOI | Sel | SPP00001048_2.0 | (+) 18216-25671 |
Selenoproteins | Selenoprotein K family | SELENOK | Sel | SPP00001044_2.0 | (+) 49649-49667 |
Selenoproteins | Selenoprotein N family | SELENON | Sel | SPP00001049_2.0 | (-) 27994-13434 |
Selenoproteins | Selenoprotein O family | SELENOO | Sel | SPP00001052_2.0 | (+) 40505-54611 |
Selenoproteins | Selenoprotein P family | SELENOPa | Sel | SPP00000020_1.0 | (-) 53987-47900 |
Selenoproteins | Selenoprotein P family | SELENOPb | Sel | SPP00001055_2.0 | (-) 52064-46290 |
Selenoproteins | Selenoprotein S family | SELENOS | Sel | SPP00001058_2.0 | (-) 52874-48043 |
Selenoproteins | Selenoprotein T family | SELENOT | Sel | SPP00001059_2.0 | (-) 51204-47624 |
Selenoproteins | Selenoprotein U family | SELENOU1 | Sel | SPP00001060_2.0 | (+) 49997-55771 |
Selenoproteins | Thioredoxin reductase family | TXNRD | Sel | SPP00001061_2.0 | (-) 56382-31146 |
Selenoproteins | Thioredoxin reductase family | TXNRD2 | Sel | SPP00001062_2.0 | (-) 63209-32444 |
Selenoproteins | Thioredoxin reductase family | TXNRD3 | Sel | SPP00001063_2.0 | (-) 58148-39754 |
Selenoprotein machinery | Eukaryotic elongation factor family | eEFsec | Cys | SPP00001064_2.0 | (+) 13930-63811 |
Selenoprotein machinery | Phosphoseryl-tRNA family | PSTK | Cys | SPP00001065_2.0 | (-) 52044-50202 |
Selenoprotein machinery | Selenophosphate synthetase family | SEPHS | Sel | SPP00001040_2.0 | (-) 64042-45797 |
Selenoprotein machinery | SECIS binding protein 2 family | SBP2 | Cys | SPP00001039_2.0 | (+) 50000-50218 |
Selenoprotein machinery | Sec synthase family | SECS | Cys | SPP00001042_2.0 | (-) 53842-31904 |
Selenoprotein machinery | SECp43 family | SECp43 | Cys | SPP00001066_2.0 | (+) 44931-57399 |
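The "Gene location" entries combine a strand sign with two coordinates, listed in transcription order (so the first number is the larger one on the reverse strand). A minimal Python sketch of how such an entry could be split into its parts is given here; the names `GeneLocation` and `parse_location` are illustrative assumptions and not part of any tool used in this analysis.

```python
import re
from typing import NamedTuple


class GeneLocation(NamedTuple):
    strand: str  # "+" (forward) or "-" (reverse)
    low: int     # smaller of the two coordinates
    high: int    # larger of the two coordinates


# Matches entries such as "(-) 68456-50128" or "(+) 50042-50921".
_LOCATION = re.compile(r"\(([+-])\)\s*(\d+)\s*-\s*(\d+)")


def parse_location(text: str) -> GeneLocation:
    """Split a 'Gene location' cell into strand and ordered coordinates."""
    match = _LOCATION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised gene location: {text!r}")
    strand = match.group(1)
    first, second = int(match.group(2)), int(match.group(3))
    return GeneLocation(strand, min(first, second), max(first, second))


sel15 = parse_location("(-) 68456-50128")
gpx3 = parse_location("(+) 50042-50921")
print(sel15)
print(gpx3)
```

Storing the coordinates as an ordered `low`/`high` pair, with the strand kept separately, makes later comparisons (for example against a SECIS position) independent of the order in which the table lists the two numbers.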
The query corresponding to Sel15 is located in the scaffold SPP00001043_2.0 between the position 68456 and the position 50128 in the reverse strand. This gene contains 4 exons. The structure of the gene, analyzed with the Exonerate file, is described below:
Sel15 | Start | End |
---|---|---|
E1 | 68456 | 68623 |
E2 | 59270 | 59333 |
E3 | 51882 | 51931 |
E4 | 50000 | 50128 |
Seblastian predicted one known selenoprotein, using the 15 kDa selenoprotein from Anas platyrhynchos. Regarding the SECIS elements, one grade A SECIS was found between positions 49394 and 49464. Because the gene lies on the reverse strand, this position is downstream of the last exon, in the 3' UTR, so we consider it the most suitable SECIS.
The query corresponding to GPx3 from chicken aligned to scaffold SPP00001033_2.0 of the Anas zonorhyncha genome. Specifically, it spans positions 50042 to 50921 in the forward strand. It contains four exons, with one selenocysteine in the second one. The structure of the gene, analyzed with the Exonerate file, is described below:
GPx3 | Start | End |
---|---|---|
E1 | 50042 | 50198 |
E2 | 50317 | 50434 |
E3 | 50521 | 50620 |
E4 | 50718 | 50921 |
Seblastian predicted one grade A SECIS between positions 51133 and 51212 in the forward strand. As it lies in the 3' UTR, it is a suitable SECIS. Seblastian could not predict any selenoprotein.
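Exon coordinates such as those listed for GPx3 can be converted into exon and intron lengths with a few lines of Python. The sketch below assumes that the reported positions are 1-based and inclusive, which is an assumption about the coordinate convention and not something stated in the Exonerate output; the function names are illustrative.

```python
# GPx3 exon coordinates as (start, end) pairs, copied from the exon table.
gpx3_exons = [(50042, 50198), (50317, 50434), (50521, 50620), (50718, 50921)]


def exon_lengths(exons):
    """Length of each exon, assuming 1-based inclusive coordinates."""
    return [end - start + 1 for start, end in exons]


def intron_lengths(exons):
    """Length of the gap between each pair of consecutive exons."""
    ordered = sorted(exons)  # sort by start so reverse-strand genes also work
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]


print(exon_lengths(gpx3_exons))    # [157, 118, 100, 204]
print(intron_lengths(gpx3_exons))  # [118, 86, 97]
```

Sorting the exons by their start coordinate before computing the gaps means the same helper can be applied to genes on the reverse strand, whose exons are listed in descending order in the tables.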
The query corresponding to GPx7 is located in the forward strand of scaffold SPP00001034_2.0, between positions 50000 and 51653. This gene is composed of two exons without any selenocysteine, distributed as shown below:
GPx7 | Start | End |
---|---|---|
E1 | 50000 | 50264 |
E2 | 51529 | 51653 |
Seblastian has not predicted any selenoprotein because it is a cysteine homologue. For the same reason, no SECIS elements have been identified.
The query corresponding to GPx8 is located in the forward strand of scaffold SPP00001035_2.0, between positions 49197 and 52251. This gene is composed of three exons without any selenocysteine, distributed as shown below:
GPx8 | Start | End |
---|---|---|
E1 | 49197 | 49403 |
E2 | 50003 | 50264 |
E3 | 52091 | 52251 |
Seblastian has not predicted any selenoprotein but has predicted one grade B SECIS. As this SECIS was predicted in the reverse strand, it cannot be considered valid.
The query corresponding to DIO1 is located in the scaffold SPP00001030_2.0 between the positions 46828 and 50742 in the forward strand. This gene is formed by 4 exons with a selenocysteine in the second exon. The structure of the gene, analyzed with the Exonerate file, is described below:
DIO1 | Start | End |
---|---|---|
E1 | 46828 | 47158 |
E2 | 48641 | 48784 |
E3 | 50001 | 50200 |
E4 | 50680 | 50742 |
The protein predicted by Seblastian matches our result exactly: it also has four exons, with the selenocysteine in the second one. Seblastian found one grade A SECIS between positions 51233 and 51300 in the forward strand. It lies in the 3' UTR, so we select it as a suitable SECIS.
The query corresponding to DIO2 is located between the position 61722 and 50569 in the reverse strand, in scaffold SPP00001031_2.0. This sequence has only two exons and the selenocysteine is in the second one. The structure of the gene, analyzed with the Exonerate file, is described below:
DIO2 | Start | End |
---|---|---|
E1 | 61722 | 61943 |
E2 | 50000 | 50569 |
Seblastian has predicted one SECIS of grade A between the positions 45485-45559 in the reverse strand which means that the SECIS is located in 3' UTR. The selenoprotein could not be predicted by Seblastian.
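The strand and position reasoning used to accept or discard a SECIS throughout this page can be sketched in Python. The function name `secis_region` and its return labels are illustrative assumptions, not part of Seblastian or SECISearch3. The check only compares the SECIS coordinates with the outermost exon positions, so "3' side" is a proxy for the 3' UTR and not a transcript annotation.

```python
def secis_region(gene_strand, gene_low, gene_high,
                 secis_strand, secis_low, secis_high):
    """Classify a SECIS relative to a gene by strand and coordinates.

    gene_low and gene_high are the smallest and largest exon coordinates.
    """
    if secis_strand != gene_strand:
        return "opposite strand"
    if secis_low > gene_high:
        # Higher coordinates are downstream on the forward strand only.
        return "3' side" if gene_strand == "+" else "5' side"
    if secis_high < gene_low:
        # Lower coordinates are downstream on the reverse strand.
        return "3' side" if gene_strand == "-" else "5' side"
    return "inside gene"


# GPx3: forward-strand gene, SECIS at 51133-51212 in the forward strand.
gpx3_call = secis_region("+", 50042, 50921, "+", 51133, 51212)
# DIO2: reverse-strand gene, SECIS at 45485-45559 in the reverse strand.
dio2_call = secis_region("-", 50000, 61943, "-", 45485, 45559)
print(gpx3_call, "|", dio2_call)
```

Both examples fall on the 3' side of their gene, which agrees with the calls made in the text. A SECIS predicted on the other strand is reported as "opposite strand" regardless of its position, mirroring the way such predictions are discarded here.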
The query corresponding to DIO3 has been found in scaffold SPP00000003_1.0 between the positions 49958 and 50749 in the forward strand. This gene contains only one exon with a selenocysteine. The structure of the gene, analyzed with the Exonerate file, is described below:
DIO3 | Start | End |
---|---|---|
E1 | 49958 | 50749 |
A grade A SECIS has been predicted by Seblastian. It is located between the position 51183 and the position 51263 in the 3'UTR of the sequence.
Seblastian also predicted a selenoprotein, the known selenoprotein type III iodothyronine deiodinase from Anas platyrhynchos.
The query corresponding to MsrB1 has been found in scaffold SPP00001056_2.1 between the positions 50003 and 49898 in the reverse strand. This gene has two exons, with the selenocysteine located in the second one, distributed as shown below:
MsrB1 | Start | End |
---|---|---|
E1 | 50003 | 50149 |
E2 | 49776 | 49898 |
The selenoprotein predicted by Seblastian agrees with our prediction, except that the Seblastian result contains one additional exon in the 3' UTR region. Seblastian also predicted a grade A SECIS in the reverse strand, between positions 48579 and 48650, which places it in the 3' UTR.
The query corresponding to MsrB3 was found in the scaffold SPP00001057_2.0 between the position 1169 and the position 50164 in the forward strand. This gene is formed by 5 exons but no selenocysteine was found. The structure of the gene, analyzed with the Exonerate file, is described below:
MsrB3 | Start | End |
---|---|---|
E1 | 1169 | 1281 |
E2 | 3082 | 3159 |
E3 | 13588 | 13616 |
E4 | 45503 | 45600 |
E5 | 50003 | 50164 |
No selenoproteins could be predicted by Seblastian, and no SECIS elements have been identified by SECISearch3. This agrees with the expected results, since it is a cysteine homologue.
The query corresponding to Selenoprotein I has been found in the forward strand of scaffold SPP00001048_2.0. The gene spans positions 18216 to 25671 and is organized in 7 exons, as shown below, with a selenocysteine in exon 6:
SELENOI | Start | End |
---|---|---|
E1 | 18216 | 18297 |
E2 | 18790 | 19052 |
E3 | 20302 | 20410 |
E4 | 21894 | 21942 |
E5 | 22388 | 22568 |
E6 | 23342 | 23524 |
E7 | 25558 | 25671 |
Seblastian was unable to predict the protein or any SECIS.
The query corresponding to Selenoprotein K has been found in the scaffold SPP00001044_2.0 in the forward strand. This sequence has 4 exons distributed between the positions 49649 and 51609 as shown below, with selenocysteine found in exon 4:
SELENOK | Start | End |
---|---|---|
E1 | 49649 | 49667 |
E2 | 50019 | 50109 |
E3 | 50850 | 50927 |
E4 | 51516 | 51609 |
Seblastian was able to predict the protein. However, this sequence has one exon missing in comparison with the query from Ensembl. Seblastian also predicted a grade A SECIS at positions 52621-52707 in the forward strand. It is therefore located in the 3' UTR and is a valid SECIS.
The query corresponding to Selenoprotein N has been found in the scaffold SPP00001049_2.0 in the reverse strand. This gene has 12 exons distributed between the positions 27994 and 13434 as shown below, with a selenocysteine in exon 6:
SELENON | Start | End |
---|---|---|
E1 | 27994 | 28101 |
E2 | 23248 | 23365 |
E3 | 22314 | 22444 |
E4 | 21629 | 21841 |
E5 | 20617 | 20741 |
E6 | 19672 | 19809 |
E7 | 18870 | 18951 |
E8 | 17728 | 17916 |
E9 | 16347 | 16452 |
E10 | 15404 | 15516 |
E11 | 14628 | 14729 |
E12 | 13270 | 13434 |
Seblastian was able to predict the protein, although it predicted 9 exons instead of 12. It also predicted a grade A SECIS at positions 10322-10255 in the reverse strand. It is therefore located in the 3' UTR and is a valid SECIS.
The query corresponding to Selenoprotein O has been found in the scaffold SPP00001052_2.0 in the forward strand. This gene has 9 exons distributed between the positions 40505 and 54611 as shown below, with a selenocysteine in exon 9:
SELENOO | Start | End |
---|---|---|
E1 | 40505 | 40648 |
E2 | 44962 | 45104 |
E3 | 46056 | 46236 |
E4 | 49540 | 49670 |
E5 | 50017 | 50297 |
E6 | 50646 | 50796 |
E7 | 52350 | 52535 |
E8 | 53630 | 53786 |
E9 | 54420 | 54611 |
Seblastian was able to predict the protein, although with some differences. The first exon of our prediction is split into two in the Seblastian prediction, whose first exon starts at an earlier position with a methionine (M) residue. Seblastian also predicted a grade A SECIS at positions 65771-65845 in the forward strand. It is therefore located in the 3' UTR and is a valid SECIS.
The query corresponding to SelenoPa is located in the reverse strand of scaffold SPP00001054_2.0, between positions 53957 and 47900. This gene is composed of five exons and contains 13 selenocysteines: one in the first exon and the other twelve in the last one. The exons are distributed as shown below:
SELENOPa | Start | End |
---|---|---|
E1 | 53957 | 54147 |
E2 | 51897 | 52109 |
E3 | 50006 | 50123 |
E4 | 47934 | 48183 |
E5 | 47539 | 47900 |
Seblastian predicted two valid SECIS elements in the 3' UTR, in the reverse strand: one grade A SECIS between positions 47151 and 47222, and one grade B SECIS between positions 46481 and 46551.
Seblastian also predicted a known selenoprotein in the sequence, using the partial selenoprotein P from Anas platyrhynchos as a query. Nevertheless, the predicted protein is of low quality.
The query corresponding to SelenoPb is located in the reverse strand of scaffold SPP00001055_2.0, between positions 52064 and 46290. This gene is composed of four exons and contains just one selenocysteine, located in the first exon. The exons are distributed as shown below:
SELENOPb | Start | End |
---|---|---|
E1 | 52064 | 52254 |
E2 | 50004 | 50216 |
E3 | 48113 | 48230 |
E4 | 46081 | 46290 |
Seblastian predicted two valid SECIS elements in the 3' UTR, in the reverse strand: one grade A SECIS between positions 47151 and 47222, and one grade B SECIS between positions 46481 and 46551.
Seblastian also predicted a known selenoprotein, but it matches SelenoPa rather than SelenoPb, so we did not consider it a plausible selenoprotein prediction.
The query corresponding to SelenoS has been found in the scaffold SPP00001058_2.0, from position 52874 to position 48043 in the reverse strand. It contains six exons, and a selenocysteine has been found in the sixth one. The structure of the gene, analyzed with the Exonerate file, is described below:
SELENOS | Start | End |
---|---|---|
E1 | 52874 | 52903 |
E2 | 51962 | 52136 |
E3 | 50748 | 50854 |
E4 | 50184 | 50273 |
E5 | 50002 | 50080 |
E6 | 47961 | 48043 |
Seblastian predicted a known selenoprotein, using Selenoprotein S from Gallus gallus as a query.
Regarding the SECIS elements, Seblastian predicted two of them, one grade A and one grade B, but the grade B one is located in the forward strand, so it is not a plausible SECIS for our target sequence. The grade A one is valid, as it is located between positions 47312 and 47402 in the reverse strand, in the 3' UTR.
The query corresponding to SelenoT is located in the reverse strand of scaffold SPP00001059_2.0, between positions 51204 and 47624. This gene is composed of four exons, with a selenocysteine in the first one. The four exons are distributed as shown below:
SELENOT | Start | End |
---|---|---|
E1 | 51204 | 51313 |
E2 | 50003 | 50120 |
E3 | 48461 | 48548 |
E4 | 47503 | 47624 |
Seblastian has not predicted any known selenoprotein. However, it predicted a valid grade A SECIS for this sequence, located in the reverse strand between positions 45790 and 45871, in the 3' UTR.
The query corresponding to SelenoU1 is located in the forward strand of scaffold SPP00001060_2.0. The gene is organized in 5 exons between positions 49997 and 55771, as shown in the following table, with a selenocysteine in the second exon.
SELENOU1 | Start | End |
---|---|---|
E1 | 49997 | 50183 |
E2 | 50685 | 50776 |
E3 | 54251 | 54391 |
E4 | 55368 | 55532 |
E5 | 55676 | 55771 |
Seblastian predicted a known selenoprotein from this sequence, using the redox-regulatory protein FAM213A from Anas platyrhynchos as the query.
Seblastian predicted a grade A SECIS element between positions 55977 and 56041 in the forward strand. As it is located in the 3' UTR of the sequence, we consider it a valid SECIS.
The query corresponding to TXNRD is found in the reverse strand of scaffold SPP00001061_2.0, between positions 56382 and 31146. The gene is organized in 13 exons, as shown below, with a selenocysteine in the last one:
TXNRD | Start | End |
---|---|---|
E1 | 56382 | 56510 |
E2 | 53643 | 53715 |
E3 | 53407 | 53526 |
E4 | 52392 | 52534 |
E5 | 52150 | 52265 |
E6 | 50003 | 50228 |
E7 | 47566 | 47658 |
E8 | 42026 | 42102 |
E9 | 40193 | 40349 |
E10 | 38026 | 38133 |
E11 | 36532 | 36627 |
E12 | 34307 | 34441 |
E13 | 31081 | 31146 |
Seblastian predicted a known selenoprotein in the sequence, the partial thioredoxin reductase 1 from Meleagris gallopavo.
Seblastian predicted two SECIS elements, one grade A and one grade B. The grade B one is located in the forward strand, so it cannot be a SECIS element of the target sequence. The grade A one is valid, as it is located in the 3' UTR in the reverse strand, between positions 30756 and 30847.
The query corresponding to TXNRD2 is found in the reverse strand of scaffold SPP00001062_2.0, between positions 63209 and 32444. The gene has 16 exons, with a selenocysteine in the last one, distributed as follows:
TXNRD2 | Start | End |
---|---|---|
E1 | 63209 | 63284 |
E2 | 62045 | 62110 |
E3 | 61279 | 61335 |
E4 | 60519 | 60663 |
E5 | 60131 | 60205 |
E6 | 59239 | 59317 |
E7 | 58136 | 58198 |
E8 | 56938 | 57008 |
E9 | 52700 | 52811 |
E10 | 50008 | 50182 |
E11 | 43255 | 43391 |
E12 | 37849 | 37944 |
E13 | 37549 | 37641 |
E14 | 36646 | 36717 |
E15 | 34709 | 34806 |
E16 | 32318 | 32444 |
Seblastian predicted a known selenoprotein in this sequence, the partial thioredoxin reductase 2 from Meleagris gallopavo.
Seblastian predicted two different SECIS elements, one grade A and one grade B. Both are located in the reverse strand, but only the grade A one lies in the 3' UTR, between positions 29637 and 29714.
The TXNRD3 gene is located in scaffold SPP00001063_2.0, between position 58148 and position 39754, in the reverse strand. This gene also has 16 exons, with a selenocysteine in the last one, distributed as shown below:
TXNRD3 | Start | End |
---|---|---|
E1 | 58148 | 58267 |
E2 | 51634 | 51694 |
E3 | 50567 | 50676 |
E4 | 49881 | 49994 |
E5 | 48458 | 48530 |
E6 | 47531 | 47650 |
E7 | 47027 | 47169 |
E8 | 45516 | 45631 |
E9 | 45186 | 45411 |
E10 | 44867 | 44959 |
E11 | 44328 | 44404 |
E12 | 43541 | 43366 |
E13 | 42904 | 43011 |
E14 | 41957 | 42052 |
E15 | 40045 | 40179 |
E16 | 39689 | 39754 |
Seblastian predicted a known selenoprotein in the sequence, the thioredoxin reductase 3 from Gallus gallus.
Seblastian also predicted a grade A SECIS element in the reverse strand, in the 3' UTR, between positions 39377 and 39459.
The query corresponding to eEFsec is located in the forward strand of scaffold SPP00001064_2.0, between positions 13930 and 63811. This gene is composed of five exons without any selenocysteine. This agrees with the expected results, as it is a machinery protein. The exons are distributed as shown below:
eEFsec | Start | End |
---|---|---|
E1 | 13930 | 14138 |
E2 | 17974 | 18070 |
E3 | 20773 | 20937 |
E4 | 50000 | 50641 |
E5 | 63734 | 63811 |
In this sequence, Seblastian predicted neither SECIS elements nor the protein.
The query corresponding to PSTK was found in the scaffold SPP00001065_2.0 between the position 52044 and the position 50202 in the reverse strand. This gene is formed by 6 exons but no selenocysteine was found. The structure of the gene, analyzed with the Exonerate file, is described below:
PSTK | Start | End |
---|---|---|
E1 | 52044 | 52223 |
E2 | 51857 | 51947 |
E3 | 51445 | 51673 |
E4 | 50990 | 51065 |
E5 | 50415 | 50508 |
E6 | 50000 | 50202 |
No selenoprotein was predicted by Seblastian and no SECIS were found either, which agrees with the expected results since it is a machinery protein.
The query corresponding to SEPHS has been found in the reverse strand of scaffold SPP00001040_2.0. The gene contains a selenocysteine and is distributed in 8 exons, the first one starting at position 64042 and the last one ending at position 45797. No selenocysteine has been found in the transcript.
SEPHS | Start | End |
---|---|---|
E1 | 64042 | 64234 |
E2 | 60928 | 61031 |
E3 | 56613 | 56720 |
E4 | 55670 | 55824 |
E5 | 54807 | 54897 |
E6 | 51449 | 51558 |
E7 | 50002 | 50219 |
E8 | 45586 | 45797 |
Even though Seblastian has not predicted any known selenoprotein in the sequence, it predicted a grade B SECIS element in the reverse strand, in the 3' UTR, between positions 43247 and 43325.
The query corresponding to SBP2 is located in the scaffold SPP00001038_2.0 between the position 32018 and the position 53085 in the forward strand. This gene is formed by 16 exons but no selenocysteine was found. The structure of the gene, analyzed with the Exonerate file, is described below:
SBP2 | Start | End |
---|---|---|
E1 | 32018 | 32169 |
E2 | 33270 | 33462 |
E3 | 33949 | 34078 |
E4 | 34515 | 34738 |
E5 | 35428 | 35563 |
E6 | 37345 | 37493 |
E7 | 38046 | 38204 |
E8 | 41156 | 41260 |
E9 | 43611 | 43776 |
E10 | 46170 | 46339 |
E11 | 47954 | 48089 |
E12 | 48754 | 48895 |
E13 | 49999 | 50219 |
E14 | 50967 | 51121 |
E15 | 52010 | 52208 |
E16 | 53009 | 53085 |
No selenoprotein could be predicted in this sequence. Nevertheless, one grade B SECIS was found. Since this SECIS lies at positions 40543-40620, inside the gene, we discard it. This agrees with the expected result, because SBP2 is a machinery protein.
The query corresponding to SECS was found in the scaffold SPP00001042_2.0 between the position 53842 and the position 31904 in the reverse strand. This gene has 11 exons but no selenocysteines were found. The structure of the gene, analyzed with the Exonerate file, is described below:
SECS | Start | End |
---|---|---|
E1 | 53842 | 53955 |
E2 | 52584 | 52738 |
E3 | 50245 | 50363 |
E4 | 50002 | 50160 |
E5 | 48962 | 49115 |
E6 | 44844 | 44946 |
E7 | 43210 | 43339 |
E8 | 42991 | 43082 |
E9 | 34327 | 34420 |
E10 | 33585 | 33675 |
E11 | 31715 | 31904 |
No selenoproteins were predicted, but one SECIS was found. This SECIS is grade C and lies between positions 5785 and 5870. So, as in SecS_1, we discard it because it is located in the 5' UTR.
The query corresponding to SECp43 is located in the scaffold SPP00001066_2.0 between the position 44931 and the position 57399 in the forward strand. This gene contains 7 exons, but no selenocysteine was found. The structure of the gene, analyzed with the Exonerate file, is described below:
SECp43 | Start | End |
---|---|---|
E1 | 44931 | 45028 |
E2 | 45813 | 45912 |
E3 | 48709 | 45761 |
E4 | 50002 | 50133 |
E5 | 53245 | 53364 |
E6 | 54756 | 54933 |
E7 | 57256 | 57399 |
No selenoproteins were found in the sequence, but Seblastian found 3 different SECIS elements. Since two of them are located in the reverse strand, we discard them. The third SECIS is grade B and lies between positions 70361 and 70437. Its location places it in the 3' UTR, so we select it as the most suitable one.