Selenoproteins in Nanorana parkeri




Conclusions:

The main purpose of our study was to identify and properly annotate all the selenoproteins and their associated machinery proteins in the genome under study, that of the frog Nanorana parkeri. Because these proteins are highly conserved even among distantly related species, we were able to predict most of them in our organism's genome. To streamline the analysis, we created an automated program that runs the tools needed for protein prediction: tblastn, exonerate, genewise and t-coffee.
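The four-step program mentioned above could be organized along the following lines. This is only a minimal sketch, not the program actually used in the project: the function name, the file names (hits.tsv, pair.fa), the default e-value and the particular command-line options are all assumptions chosen for illustration. The function only builds the command lines, so it can be inspected without any of the tools being installed.

```python
from typing import List


def build_pipeline_commands(query_faa: str, genome_db: str, region_fa: str,
                            evalue: float = 1e-3) -> List[List[str]]:
    """Return the command lines for the four steps, in order, without running them.

    All file names and option choices here are illustrative assumptions.
    """
    return [
        # 1. Locate candidate regions: protein query against the translated genome.
        ["tblastn", "-query", query_faa, "-db", genome_db,
         "-evalue", str(evalue), "-outfmt", "6", "-out", "hits.tsv"],
        # 2. Predict the gene structure on the extracted genomic region.
        ["exonerate", "--model", "protein2genome", "--query", query_faa,
         "--target", region_fa, "--showtargetgff", "yes"],
        # 3. Second, independent gene prediction on the same region.
        ["genewise", query_faa, region_fa, "-pretty", "-cdna", "-gff"],
        # 4. Align the predicted protein with the query (pair.fa is hypothetical).
        ["t_coffee", "pair.fa"],
    ]


if __name__ == "__main__":
    for command in build_pipeline_commands("SelK.fa", "nanorana_db", "region.fa"):
        print(" ".join(command))
```

Each returned list could then be passed to `subprocess.run(command, check=True)` once the corresponding tool and input files are available.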

As mentioned throughout the project, it was also necessary to analyze the conservation of the characteristic elements located in the 3'UTR of the gene, known as SECIS elements, which are essential for selenoprotein biosynthesis. This allowed us to compare the conservation of the selenoproteins and of their characteristic Sec residue.

This comparative analysis to predict the selenoproteins and associated machinery in the Nanorana parkeri genome was made possible by using the genomes of other phylogenetically related organisms, mainly Xenopus tropicalis and Homo sapiens. Although some selenoproteins, mostly members of different protein families, were not well defined in the databases, in general they were properly annotated. This reflects the broad and consistent body of research on this frog genus (Xenopus), which gave us a reliable starting point. In those cases where the database lacked a proper protein annotation or definition, we extended our search to other databases, such as ENSEMBL, UniProt, HomoloGene and UniGene. However, in a few cases no Xenopus tropicalis protein could be obtained from any database, which forced us to use Homo sapiens proteins as the model, relying on their conservation among vertebrates.

In previous studies, the ancestral selenoproteome was defined as the following 28 selenoproteins: GPx1-4, TR1, TR3, Dio1-3, SelH, SelI, SelJ, SelK, SelL, SelM, SelN, SelO, SelP, SelPb, MsrB1 (methionine-R-sulfoxide reductase 1), SelS, SelT1, SelU1, SelW1, SelW2, Sep15, Fep15 and SPS2a. Of these, the following 21 selenoproteins were found in all vertebrates: GPx1-4, TR1, TR3, Dio1, Dio2, Dio3, SelH, SelI, SelK, SelM, SelN, SelO, SelP, MsrB1, SelS, SelT1, SelW1 and Sep15.
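To make the relation between the two lists explicit, they can be compared as sets. The sketch below simply transcribes the names given above, expanding the ranges GPx1-4 and Dio1-3 into individual members (an assumption about how those ranges should be read). The variable names are our own.

```python
# Ancestral selenoproteome (28 members), transcribed from the list above.
ancestral = {
    "GPx1", "GPx2", "GPx3", "GPx4", "TR1", "TR3", "Dio1", "Dio2", "Dio3",
    "SelH", "SelI", "SelJ", "SelK", "SelL", "SelM", "SelN", "SelO", "SelP",
    "SelPb", "MsrB1", "SelS", "SelT1", "SelU1", "SelW1", "SelW2", "Sep15",
    "Fep15", "SPS2a",
}

# Selenoproteins reported in all vertebrates (21 members).
all_vertebrates = {
    "GPx1", "GPx2", "GPx3", "GPx4", "TR1", "TR3", "Dio1", "Dio2", "Dio3",
    "SelH", "SelI", "SelK", "SelM", "SelN", "SelO", "SelP", "MsrB1",
    "SelS", "SelT1", "SelW1", "Sep15",
}

# Ancestral selenoproteins that are not on the all-vertebrates list.
not_universal = sorted(ancestral - all_vertebrates)

if __name__ == "__main__":
    print(len(ancestral), len(all_vertebrates))
    print(not_universal)
```

The difference contains seven names: Fep15, SPS2a, SelJ, SelL, SelPb, SelU1 and SelW2. These are the ancestral selenoproteins that appear in the first list but not in the second.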

Our analysis allowed us to compare our results with these previous findings, and we classified them as follows.

On the one hand, the Sec-containing selenoproteins we predicted were: SelI, SelW1, Sel15, SelM, MsrB1, SelH, SelN, SelS, SelT, SelO, SelK, SelP, GPx1, GPx3, GPx4, DI1, DI2, DI3, TR1, TR2 and TR3. However, we could not predict the Sec residue in GPx2. In general, the prediction of SECIS elements confirmed the presence of a Sec residue in the selenoprotein.

On the other hand, the Cys-containing selenoproteins we predicted were: Fep15, SelPb, SelU1, SelW2, MsrA, SelU2, GPx7 and GPx8. Nevertheless, we could not predict the Cys residue in Rdx12 and SelU3. In most cases, proteins that had undergone a Sec-to-Cys switch did not present any SECIS element. Finally, the predicted machinery proteins were: eEFsec, PSTK, SBP2, SECp43, SecS and SPS (SPS1 and SPS2). From the SECIS element analysis, we can conclude that these elements play a key role in locating the selenocysteine residue in the protein.

Throughout the evolution of the different lineages, a progressive replacement of Sec by Cys residues has been reported. Specifically, in the Nanorana parkeri genome, it is important to highlight such switches in three selenoproteins: SelU1, Fep15 and SelPb.

Nevertheless, it is important to point out the limitations we came across while developing our project.

As can be seen in the literature, and especially in the phylogenetic tree of the vertebrate subphylum shown in the introduction, our model organism, Nanorana parkeri, does not belong to any of the main well-characterized vertebrate subgroups, such as bony fishes or mammals, within which closer comparisons between organisms of the same subgroup are possible. This suggests a lack of selenoprotein research in the amphibian lineage, which made our analysis considerably more difficult. Because of this lack of classification and research data, the only intraclass comparison we could make was with Xenopus tropicalis, which complicated the analysis whenever its selenoproteins were not well annotated. In some of those cases we were forced to take Homo sapiens selenoproteins as the reference. As a consequence, even when we predicted a new selenoprotein in Nanorana parkeri, we could not compare it with, or validate it against, closely related organisms.

Another limitation of this project was that some selenoproteins were misannotated or incorrectly defined in the databases, mainly in SelenoDB. We therefore had to widen our sources to the other databases already mentioned.

Another restriction concerns the methodology itself: to filter for genuine selenoproteins, we had to set an e-value threshold in our automated program. This was a risk we had to accept, aware that some correct hits, possibly corresponding to selenoproteins, could be rejected automatically. For this reason, we did not set a highly restrictive e-value threshold, so as to predict as many correct selenoproteins as possible.
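The trade-off described here can be illustrated with a small filter over tblastn results. This is a sketch under stated assumptions, not the filter used in the project: it assumes BLAST tabular output (`-outfmt 6`), in which the e-value is the eleventh tab-separated column. The two example hits, the scaffold names and both threshold values are invented purely for illustration.

```python
from typing import Iterable, List

EVALUE_COLUMN = 10  # zero-based index of the e-value in BLAST -outfmt 6


def filter_hits(lines: Iterable[str], max_evalue: float) -> List[List[str]]:
    """Keep the tabular BLAST hits whose e-value does not exceed max_evalue."""
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        if float(fields[EVALUE_COLUMN]) <= max_evalue:
            kept.append(fields)
    return kept


# Invented example hits: one strong match and one weak match.
example_hits = [
    "GPx1\tscaffold_12\t78.5\t200\t43\t0\t1\t200\t1000\t1599\t1e-40\t150",
    "SelK\tscaffold_7\t55.0\t60\t27\t0\t1\t60\t500\t679\t2e-05\t40",
]

if __name__ == "__main__":
    lenient = filter_hits(example_hits, max_evalue=1e-3)
    strict = filter_hits(example_hits, max_evalue=1e-10)
    print([hit[0] for hit in lenient])
    print([hit[0] for hit in strict])
```

With the lenient threshold both invented hits are kept, whereas the strict one discards the weaker hit. This is the kind of automatic rejection of a potentially correct hit that a permissive threshold is meant to avoid, at the cost of more candidates to review by hand.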

Last but not least, we would like to mention one more limitation we had to deal with. Given the scale of this project and the number of steps involved, a better knowledge of selenoproteins, for instance of SECIS elements, and clearer instructions would have been necessary to properly analyze and understand the results obtained.

Taking these limitations into account, further studies would need to identify organisms phylogenetically closer to Nanorana parkeri in order to make broader and more reliable selenoprotein predictions in this genome. This could allow us to confirm the association between selenoprotein function and lineage.

In addition, the information collected in our colleagues' projects could be used to update the different databases, improving and facilitating future selenoprotein studies.