GENSCAN 1.0 Date run: 25-Feb-102 Time: 06:26:03

Sequence 06:25:22 : 149409 bp : 36.91% C+G : Isochore 1 ( 0 - 43 C+G%)

Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.01 Term +   3489   3656  168  2  0   92   47   144 0.922   7.60
 1.02 PlyA +   4402   4407    6                               1.05

 2.04 PlyA -   5001   4996    6                               1.05
 2.03 Term -   8995   8957   39  1  0   94   47    31 0.253  -4.19
 2.02 Intr -   9803   9714   90  2  0   31  100   110 0.402   5.77
 2.01 Init -  18666  18376  291  0  0   71   67   247 0.789  18.10
 2.00 Prom -  27863  27824   40                              -5.85

 3.02 PlyA -  28822  28817    6                               1.05
 3.01 Sngl -  30122  29823  300  2  0   94   42   193 0.926   9.65
 3.00 Prom -  30979  30940   40                              -2.85

 4.02 PlyA -  31796  31791    6                               1.05
 4.01 Sngl -  59519  58551  969  2  0   54   47   268 0.972  15.76
 4.00 Prom -  60902  60863   40                              -6.25

 5.00 Prom +  64229  64268   40                              -6.85
 5.01 Init +  64325  64369   45  1  0   89   94    32 0.804   4.69
 5.02 Intr +  69518  69548   31  1  1   62   93    21 0.398  -3.21
 5.03 Intr +  77280  77513  234  1  0   57   69   223 0.890  14.04
 5.04 Intr +  88813  88991  179  2  2   86   97   133 0.682  12.72
 5.05 Intr +  90419  90520  102  1  0   47   84    67 0.758   1.65
 5.06 Term +  93036  93269  234  2  0  122   32   248 0.988  17.94
 5.07 PlyA +  93908  93913    6                               1.05

 6.04 PlyA -  96025  96020    6                               1.05
 6.03 Term - 108460 107529  932  2  2   47   39   415 0.490  23.81
 6.02 Intr - 108710 108503  208  2  1   51   52    99 0.415   0.33
 6.01 Init - 110832 110686  147  0  0   94   77    98 0.853   9.44
 6.00 Prom - 113424 113385   40                              -5.55

 7.00 Prom + 123739 123778   40                              -6.15
 7.01 Init + 126733 127014  282  0  0   78   45   232 0.601  13.35
 7.02 Intr + 132019 132188  170  0  2   73   77    92 0.261   4.42
 7.03 Term + 135302 135695  394  2  1   78   36   211 0.248   8.42
 7.04 PlyA + 136519 136524    6                               1.05

Predicted peptide sequence(s):
>06:25:22|GENSCAN_predicted_peptide_1|55_aa
SQAAAHYKGTKHAKKLKALEAMKNKQKSVTAKDSAKTTFTSITTNTINTSSDKTG
>06:25:22|GENSCAN_predicted_peptide_2|139_aa
MIRNDKRDMTTDPTEIHKTLRDYYEYVYAHKLENLEEMDKFLETYNLPRLNKEEIESLKK
PITSSKIESVIKSLVTPKSPGPGKFTAEFYQMYKDEQNYSTRNNNGKTRCASSGKYCRKA
VEEKIGKQLNKHWTQDKQY
>06:25:22|GENSCAN_predicted_peptide_3|99_aa
MHCTSWASFEEAICTAGVWSVNIQDQEDRPLPPAMSLQCPLLTKVNIMPAYKEGYVKGLD
PLSQDGNGQECPALQFYQCPLEGGLPLLALVLSTNSRTL
>06:25:22|GENSCAN_predicted_peptide_4|322_aa
MSELPFTIASKRIKYPGIQLTRDVKDLFKENYKPLLNEIKEDTNKWKNIPRSWIGRINIV
KMAILPKVIYRFNAIPIKLQMTFFIELEKTIVKFIWNQKRTRIAKSILSQKNKAGGIMLP
DFKLYYKPTVTKTAWYWYQNRDIDQWNGTEPSEIIPHIYNHLIFDKPEKNKQWGKDSLFN
KWCWENWLAICRKLKLDPFLTPFTKINSRWSKDLHVRPKSIKTLEENLGNTIQDIGMGGK
DFLSKTPKAMATKAKIDKWDLIKLKSFCTAKETTIRVNRQPTEWEKIFAIYSSDKGLISR
IYNELKQIYKKKTTPSKSERRI
>06:25:22|GENSCAN_predicted_peptide_5|274_aa
MSTLLNKTLTALPKELNPQLPDPSQDGTAGTPAISTTTTVEIRKSSVMTTEITSKVEKSP
TTATGNSSCPSTETEEEKAKRLLYCSLCKVAVNSASQLEAHNSGTKHKTMLEARNGSGTI
KAFPRAGVKGKGPVNKGNTGLQNKTFHCEICDVHVNSETQLKQHISSRRHKDRAAGKPPK
PKYSPYNKLQKTAHPLGVKLVFSKEPSKPLAPRILPNPLAAAAAAAAVAVSSPFSLRTAP
AATLFQTSALPPALLRPAPGPIRTAHTPVLFAPY
>06:25:22|GENSCAN_predicted_peptide_6|428_aa
MEVLARVFRQEKEMKGIHIGKEEHKLFLFADNMTLCLEKTKDSVKKLVEQPSSFGSVDWL
CLSSCSGSTHTPRPADVSPGSLPGPGQTFGTREPPQAVSTKEASSSNLHAPERTVAGLTF
TTEQVRALEGVFRHHQYLGPLERNWLAREMQLSEVQIKTWFQNRRMKHKRQMQDSQLNGP
LSGSLHGPPAFHSPSSGLANGLQLLCPWAPLPGSPGCPLAPSGVSDKWIKRPWPLRGLPA
AGSLWHTTPHAQEVVRISWDQPCPRGPGACVLCQRQGMHLRKRLRLHASHTSGLCRSHLT
PTRRTQLFCLHRGGTSHPDPRKGSGDFWRIYLLKSPLHLKKKKKKKKGQKAIDTGEVWRK
GNSYTLLVGMYISTTIVENTVEIYQKTKNTATIFISNPTTEGLSKGIETSEYIKGIPALA
CLSQLYSR
>06:25:22|GENSCAN_predicted_peptide_7|281_aa
MLAALAALACSQRLFGLGAHSGRALQPATALWEPLSGLAKAEAGSLSLRGGVEGEAQTGT
GLHMALVGQHEFWVGVGLVDPTLRAAGQLHQPWAVAEAKVIVKVNAPLSLSDLSQNQLAF
RLFFIEYKNPAQFMVCLAATLTRFTALDPERSSTYMCLPANWTGTCTLVFLTPKIQIANR
TEELPVPLMTPTRQKRVIPLIPLLVGLGLSASTIALSTGIAGISTSVTTFRSLSNDFSAS
ITDISQTLSVLQAQVDSLAAVVLQNRRDLDLLLKKEDSVYS
Explanation
Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)
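
The column definitions above are enough to read a table row mechanically. The Python sketch below is one possible way to do that, not part of GENSCAN: the Feature record and the parse_row and is_consistent helpers are hypothetical names chosen for illustration. It assumes rows are whitespace-separated, with 13 fields for exon rows and 7 fields for Prom/PlyA rows, as in this report. It checks two relationships: Ph equals Len modulo 3, as stated in the legend, and Len equals |End - Begin| + 1, which is an assumption that holds for the rows of this report rather than something the legend states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Feature:
    """One row of the predicted genes/exons table (hypothetical record)."""
    gene: int
    exon: int
    kind: str            # Init, Intr, Term, Sngl, Prom or PlyA
    strand: str          # "+" or "-"
    begin: int
    end: int
    length: int
    tscr: float
    # The fields below are only present on exon rows.
    frame: Optional[int] = None
    phase: Optional[int] = None
    i_ac: Optional[int] = None
    do_t: Optional[int] = None
    cod_rg: Optional[int] = None
    prob: Optional[float] = None


def parse_row(line: str) -> Feature:
    """Parse one table row; assumes 13 fields (exon) or 7 fields (signal)."""
    fields = line.split()
    if len(fields) not in (7, 13):
        raise ValueError(f"unexpected number of fields: {len(fields)}")
    gene, exon = fields[0].split(".")
    feature = Feature(
        gene=int(gene), exon=int(exon), kind=fields[1], strand=fields[2],
        begin=int(fields[3]), end=int(fields[4]), length=int(fields[5]),
        tscr=float(fields[-1]),
    )
    if len(fields) == 13:
        feature.frame = int(fields[6])
        feature.phase = int(fields[7])
        feature.i_ac = int(fields[8])
        feature.do_t = int(fields[9])
        feature.cod_rg = int(fields[10])
        feature.prob = float(fields[11])
    return feature


def is_consistent(f: Feature) -> bool:
    """Check Len against the coordinates and Ph against Len modulo 3."""
    # Begin > End on the minus strand, so use the absolute difference.
    span_ok = abs(f.end - f.begin) + 1 == f.length
    phase_ok = f.phase is None or f.length % 3 == f.phase
    return span_ok and phase_ok


row = " 5.04 Intr +  88813  88991  179  2  2   86   97   133 0.682  12.72"
print(is_consistent(parse_row(row)))
```

Splitting on whitespace works here because empty columns occur only on Prom and PlyA rows, where every column between Len and Tscr is blank; a fixed-width parser would be safer if other row shapes can occur.
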
Comments
The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. For example, a predicted 5' splice site with
score > 100 is strong; 50-100 is moderate; 0-50 is weak; and
below 0 is poor (more than likely not a real donor site).
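
The score bands quoted above for 5' splice sites can be written as a small lookup. This is a minimal Python sketch; classify_donor_score is a hypothetical helper name, and the handling of the boundary values is an assumption, since the text gives overlapping ranges: 100 and 50 are treated as moderate and 0 as weak.

```python
def classify_donor_score(score: float) -> str:
    """Map a 5' splice site (donor) score to the bands quoted in the comments.

    Boundary handling is an assumption: the text says "> 100" is strong and
    "below 0" is poor, so 100 falls in "moderate" and 0 falls in "weak";
    50 is assigned to "moderate" by choice.
    """
    if score > 100:
        return "strong"
    if score >= 50:
        return "moderate"
    if score >= 0:
        return "weak"
    return "poor"


# Do/T values of two internal exons from the table above.
for label, do_t in [("2.02", 100), ("5.04", 97)]:
    print(label, classify_donor_score(do_t))
```

Note that the Do/T column holds a donor score only for Init and Intr exons; for Term and Sngl exons it is the termination signal score, to which these bands are not stated to apply.
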
The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties, e.g., it depends on how well the exon fits with neighboring
exons. It has been shown that predicted exons with higher probabilities
are more likely to be correct than those with lower probabilities.
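
Because higher-probability exons are more likely to be correct, one common way to use the P column is to rank exons and set aside those below a cutoff. The Python sketch below illustrates this under stated assumptions: confident_exons is a hypothetical helper, and the default cutoff of 0.5 is an arbitrary illustration, not a threshold recommended by GENSCAN.

```python
from typing import List, Tuple


def confident_exons(
    exons: List[Tuple[str, float]], min_prob: float = 0.5
) -> List[Tuple[str, float]]:
    """Keep exons with P >= min_prob, ordered from most to least probable.

    min_prob is a hypothetical cutoff chosen for illustration only.
    """
    kept = [(label, p) for label, p in exons if p >= min_prob]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# (Gn.Ex, P) pairs for the exons of predicted gene 5 in the table above.
gene5 = [
    ("5.01", 0.804), ("5.02", 0.398), ("5.03", 0.890),
    ("5.04", 0.682), ("5.05", 0.758), ("5.06", 0.988),
]
for label, p in confident_exons(gene5):
    print(label, p)
```

With these inputs only exon 5.02 (P = 0.398) falls below the cutoff. A low probability does not prove an exon is wrong; it marks the exon as a candidate for checking against other evidence.
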