HOMO SAPIENS 3 BAC RP11-758L3 ANALYSIS


DISCUSSION

The results for the second prediction of Geneid and Genscan lead to the same kind of conclusion. Blastp indicates the presence of regions homologous to the polymerase domain of various reverse transcriptases, and the RepeatMasker results show that this region is full of LINE-1 (L1) elements. L1 family sequences are known to have characteristic features such as an A-rich stretch at the 3' end, a truncated 5' end, significantly long open reading frames (ORFs), and transcripts present in various types of cells. However, because of base substitutions or truncation, most elements appear incapable of producing mRNA that can be translated (1). Furthermore, L1 family sequences contain an ORF with significant homology to several RNA-dependent DNA polymerases (reverse transcriptases, RT) of viral origin (2), which may allow their own dispersion. In conclusion, we think that these predictions correspond only to LINE-1 elements, with no possibility of translation.

According to the results obtained for the third gene, we conclude that the predictions made by the gene prediction programs are not correct. Although Geneid and Genscan predict a gene that matches the end of the sequence of the FLJ22419 protein, they fail to predict its beginning. Not only do they miss the exon at nucleotide 4,000 inside the gene, but they also place the beginning of the gene around nucleotide 64,000, which is not correct. There is enough evidence to think that the gene is composed of eight exons, although only five of them are encoded in our BAC, RP11-758L3. Through further research, we have established the position of the other three exons, and we have evidence that the first two are encoded in BAC RP11-598P2. The length of the introns in this gene varies greatly: the last ones are quite short, whereas the first introns are longer and can easily reach 60,000 nucleotides. It is important to highlight that we have not found any BAC containing the sequence of the third exon, although we know its exact position in the chromosome. Taking all of this analysis together, we suggest that the FLJ22419 protein could exist as a product of this gene.

The fourth gene predicted by Geneid and the fifth one predicted by Genscan are not real genes. The results obtained with the different programs confirm that they are repetitive regions that encode non-functional envelope proteins. These proteins belong to the HERV-H family of human endogenous retroviruses, which are commonly found integrated in the genome (3). The three large envelope reading frames found in humans (HERV-H/env62, HERV-H/env60, and HERV-H/env59) are usually prematurely stopped in the majority of primates, that is, they are not translated (4). In this case, the absence of ESTs supports this hypothesis. To sum up, our results are consistent with the absence of strong selective pressure for the conservation of a functional envelope gene of possible benefit to the host.
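
As a minimal illustration of what "prematurely stopped" means for an envelope reading frame, the Python sketch below reports the position of the first in-frame stop codon in a nucleotide sequence. The function name and the toy sequence are hypothetical and are not taken from the HERV-H elements in this BAC; the sketch only assumes the three standard stop codons TAA, TAG and TGA.

```python
# Standard stop codons (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(seq, frame=0):
    """Return the 0-based nucleotide index of the first in-frame stop codon.

    `frame` is the offset (0, 1 or 2) at which codon reading starts.
    Returns None when no complete in-frame stop codon is found.
    """
    seq = seq.upper()
    # Step through the sequence one complete codon at a time.
    for i in range(frame, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i
    return None


# Toy sequence (hypothetical, for illustration only): ATG GCC TAA GGA
toy_orf = "ATGGCCTAAGGA"
stop_at = first_in_frame_stop(toy_orf)
```

A reading frame whose first in-frame stop falls well before the expected end of the coding region would be the kind of prematurely stopped envelope described above.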

After analysing the results we have obtained, we suggest that gene number six, predicted by Genscan, could be a pseudogene. The Blastp and InterPro results indicate that the protein product of this gene has a homeobox domain which appears to be the same as the homeobox domain of the VENTX2 protein. VENTX2 is a haemopoietic progenitor homeobox protein expressed in bone marrow, and its gene is located at chromosome 10q26.3 (5). Comparing the two sequences with Blastx (Blast 2 Sequences), we saw that part of the VENTX2 mRNA aligns with a 200 aa segment of the predicted protein (428 aa). However, the rest of the protein has no similarity to other known proteins. These results support the hypothesis that a piece of VENTX2 is duplicated in this region. Furthermore, the RepeatMasker analysis shows a high concentration of repetitive elements in this area. For these reasons, we think this gene could have arisen by duplication or by pseudogene formation (retroposition) and subsequently been inactivated by mutation and by mobile elements such as transposons.
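
The alignment lengths given in this paragraph can be put in proportion with a short calculation. The Python sketch below computes the fraction of the predicted protein covered by the VENTX2 alignment, using only the two lengths stated in the text (200 aa aligned, 428 aa in total); the function name is hypothetical.

```python
def aligned_fraction(aligned_len, total_len):
    """Return the fraction of a protein of total_len residues covered by an alignment."""
    if total_len <= 0:
        raise ValueError("total_len must be positive")
    if not 0 <= aligned_len <= total_len:
        raise ValueError("aligned_len must lie between 0 and total_len")
    return aligned_len / total_len


# Lengths taken from the text: 200 aa aligned out of a 428 aa predicted protein.
fraction = aligned_fraction(200, 428)
unaligned = 428 - 200
print(f"aligned: {fraction:.1%}, unaligned residues: {unaligned}")
```

Taking these figures at face value, slightly less than half of the predicted protein is similar to VENTX2, and the remaining 228 aa are the part with no similarity to known proteins.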



© Porta,M
     Ros,S
     Sancho,M
     Trujillo,E / March 2002