Selenoproteins are proteins that contain the amino acid selenocysteine (Sec), which is encoded by the UGA codon. Under normal conditions UGA acts as a stop codon and terminates translation. For the ribosome to read it as Sec and continue synthesizing the protein, the mRNA must carry a SECIS element (selenocysteine insertion sequence). This dual role of the UGA codon makes selenoproteins difficult to predict and annotate, which is why they have long been overlooked or poorly cataloged. Selenoproteins carry out essential functions in many cellular processes: they act as antioxidants that protect against oxidative stress, and they take part in thyroid hormone metabolism and in protein folding, among other roles.
The main objective of this study is to identify and locate the selenoproteins in the genome of Callorhinus ursinus (the northern fur seal). Homo sapiens was chosen as the reference species for the comparison because, although it is not phylogenetically close, its genome is very well annotated.
To carry out this project, a Bash program was written that automates the analysis: it runs each selenoprotein through several sequence alignment programs and finally produces the predicted sequences for Callorhinus ursinus.
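As a small illustration of why the dual role of UGA complicates automated annotation, the following Bash sketch lists the in-frame TGA codons (UGA in the mRNA) of a coding sequence. It is not the program developed in this project: the function name `find_inframe_tga` and the example sequence are hypothetical, and the sketch assumes a DNA sequence read in the first frame. A conventional gene predictor would end the protein at the first of these positions, whereas in a selenoprotein an internal one may encode Sec.

```bash
#!/usr/bin/env bash
# Hypothetical helper (illustration only): print the 1-based nucleotide
# position of every in-frame TGA codon in a DNA coding sequence.
# Assumption: the sequence is read in frame 0, starting at its first base.
find_inframe_tga() {
    local seq i codon
    # Normalize to upper case so that "tga" and "TGA" are treated alike.
    seq=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    # Walk the sequence codon by codon; a trailing partial codon is ignored.
    for (( i = 0; i + 3 <= ${#seq}; i += 3 )); do
        codon="${seq:i:3}"
        if [[ "$codon" == "TGA" ]]; then
            echo "$(( i + 1 ))"
        fi
    done
}

# Made-up example: ATG GGC TGA GGT TGA
find_inframe_tga "ATGGGCTGAGGTTGA"
# prints:
# 7
# 13
```

In this made-up sequence the TGA at position 7 is internal and the one at position 13 is terminal. Sequence alone cannot tell whether an internal TGA is a Sec codon or a genuine stop, which is why a prediction also has to be supported by a SECIS element and by alignment against known selenoproteins.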