GENSCAN 1.0	Date run: 14-Mar-103	Time: 07:35:42

Sequence hg13_dna : 242888 bp : 38.87% C+G : Isochore 1 ( 0 - 43 C+G%)

Parameter matrix: HumanIso.smat

Predicted genes/exons:

Gn.Ex Type S .Begin ...End .Len Fr Ph I/Ac Do/T CodRg P.... Tscr..
----- ---- - ------ ------ ---- -- -- ---- ---- ----- ----- ------

 1.07 PlyA -    990    985    6                               1.05
 1.06 Term -   8654   8583   72  2  0   66   41   132 0.371   3.23
 1.05 Intr -  16651  16628   24  1  0   81  105    42 0.666   2.50
 1.04 Intr -  18652  18508  145  0  1   91   44    43 0.483  -0.44
 1.03 Intr -  20407  20278  130  2  1   84   61    47 0.736   0.43
 1.02 Intr -  22532  22488   45  0  0   73  106    73 0.315   5.06
 1.01 Init -  29188  28792  397  1  1   65   64   252 0.884  16.14
 1.00 Prom -  34098  34059   40                              -1.95

 2.07 PlyA -  34143  34138    6                               1.05
 2.06 Term -  39999  39839  161  1  2  109   35    76 0.789   1.52
 2.05 Intr -  42468  42334  135  1  0   83   71    86 0.494   6.02
 2.04 Intr -  52404  52290  115  0  1   46   52    88 0.166   0.10
 2.03 Intr -  53055  53014   42  0  0   75   98    38 0.151   1.02
 2.02 Intr -  63941  63847   95  0  2   65   27   114 0.317   1.76
 2.01 Init -  65131  65083   49  1  1   67  108     9 0.647   1.96
 2.00 Prom -  66240  66201   40                              -4.85

 3.00 Prom +  75157  75196   40                              -3.45
 3.01 Init +  78144  78202   59  2  2   92   84    23 0.297   3.15
 3.02 Intr +  80695  80875  181  1  1   76   37   160 0.518   8.65
 3.03 Term +  89629  89739  111  0  0   89   41    77 0.440   0.68
 3.04 PlyA +  91017  91022    6                               1.05

 4.02 PlyA -  91318  91313    6                               1.05
 4.01 Sngl - 100662  98971 1692  0  0   83   38   600 0.968  49.32
 4.00 Prom - 108203 108164   40                              -3.65

 5.03 PlyA - 108486 108481    6                              -0.45
 5.02 Term - 109619 108964  656  0  2   35   38   270 0.651   9.87
 5.01 Init - 110473 110308  166  1  1   70   72    73 0.817   3.75
 5.00 Prom - 114244 114205   40                              -3.65

 6.02 PlyA - 115008 115003    6                               1.05
 6.01 Sngl - 115531 115052  480  1  0   60   48   164 0.503   5.43
 6.00 Prom - 116397 116358   40                              -6.15

 7.04 PlyA - 116986 116981    6                               1.05
 7.03 Term - 119698 119541  158  2  2   71   43   154 0.627   6.31
 7.02 Intr - 123937 123856   82  1  1   83   78    47 0.401   1.49
 7.01 Init - 137065 136823  243  1  0   57   87   114 0.442   5.98
 7.00 Prom - 140378 140339   40                              -5.75

 8.00 Prom + 141037 141076   40                              -5.95
 8.01 Init + 149470 149475    6  0  0   58   92     0 0.261  -1.67
 8.02 Intr + 153372 155858 2487  2  0  111   96  1876 0.868 177.08
 8.03 Intr + 156153 156240   88  2  1   85   32    36 0.385  -3.68
 8.04 Intr + 157438 157550  113  2  2   55   98    74 0.535   4.28
 8.05 Intr + 158406 158609  204  2  0   69  101   212 0.877  19.07
 8.06 Intr + 181200 181307  108  2  0   79   31    89 0.466   1.86
 8.07 Intr + 186572 186647   76  1  1  106  111    67 0.769   8.97
 8.08 Term + 193102 193259  158  2  2   73   42    94 0.266   0.41
 8.09 PlyA + 193287 193292    6                               1.05

 9.00 Prom + 193341 193380   40                              -6.95
 9.01 Init + 195978 196244  267  2  0   40   12   328 0.869  17.43
 9.02 Term + 196425 196490   66  2  0  111   39   119 0.934   6.26
 9.03 PlyA + 196822 196827    6                              -3.24

10.06 PlyA - 197195 197190    6                               1.05
10.05 Term - 198195 197629  567  0  0   28   36   287 0.876  10.93
10.04 Intr - 206724 206550  175  2  1   27   31   144 0.017   1.62
10.03 Intr - 211384 211236  149  1  2   75   29   106 0.087   1.51
10.02 Intr - 211863 211742  122  1  2   58  103     8 0.051  -1.41
10.01 Init - 217635 217269  367  0  1   68   12   211 0.026   8.73
10.00 Prom - 218114 218075   40                              -4.75

11.00 Prom + 220126 220165   40                              -4.65
11.01 Init + 223582 223833  252  0  0   53   67   119 0.124   3.69
11.02 Term + 233431 233610  180  0  0   85   55   213 0.993  14.33
11.03 PlyA + 233621 233626    6                               1.05

12.04 PlyA - 233974 233969    6                               1.05
12.03 Term - 234554 234427  128  0  2   88   45   112 0.694   4.36
12.02 Intr - 242026 241859  168  2  0   15   86   102 0.191   1.70
12.01 Init - 242713 242458  256  1  1   49   40   132 0.181   1.94
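For readers who want to work with the predicted genes/exons table programmatically, the following is a minimal Python sketch that turns one table row into a dictionary. It is not part of GENSCAN. The function name `parse_genscan_row` and the dictionary keys are hypothetical, and the sketch assumes that exon rows (Init, Intr, Term, Sngl) carry 13 whitespace-separated fields while signal rows (Prom, PlyA) carry 7, as in the table above.

```python
# Hypothetical helper, not part of GENSCAN: parse one row of the
# "Predicted genes/exons" table into a dictionary.

EXON_TYPES = {"Init", "Intr", "Term", "Sngl"}
SIGNAL_TYPES = {"Prom", "PlyA"}


def parse_genscan_row(line: str) -> dict:
    """Split a table row on whitespace and convert each field."""
    fields = line.split()
    if len(fields) < 7:
        raise ValueError(f"too few fields: {line!r}")

    # Gn.Ex is "gene.exon", e.g. "1.06" means gene 1, exon 6.
    gene, exon = fields[0].split(".")
    kind = fields[1]
    row = {
        "gene": int(gene),
        "exon": int(exon),
        "type": kind,
        "strand": fields[2],
        "begin": int(fields[3]),
        "end": int(fields[4]),
        "len": int(fields[5]),
    }

    if kind in SIGNAL_TYPES:
        # Promoters and poly-A signals only report a score (Tscr).
        if len(fields) != 7:
            raise ValueError(f"expected 7 fields for {kind}: {line!r}")
        row["tscr"] = float(fields[6])
    elif kind in EXON_TYPES:
        # Exons report frame, phase, signal scores, probability, and Tscr.
        if len(fields) != 13:
            raise ValueError(f"expected 13 fields for {kind}: {line!r}")
        row["fr"] = int(fields[6])
        row["ph"] = int(fields[7])
        row["i_ac"] = int(fields[8])
        row["do_t"] = int(fields[9])
        row["cod_rg"] = int(fields[10])
        row["p"] = float(fields[11])
        row["tscr"] = float(fields[12])
    else:
        raise ValueError(f"unknown feature type {kind!r}")
    return row


exon = parse_genscan_row("1.06 Term - 8654 8583 72 2 0 66 41 132 0.371 3.23")
signal = parse_genscan_row("1.00 Prom - 34098 34059 40 -1.95")
```

Splitting on whitespace rather than fixed column offsets keeps the sketch usable even when the column alignment of the report has been lost.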
Predicted peptide sequence(s):

>hg13_dna|GENSCAN_predicted_peptide_1|270_aa
MLQWFLLAVGFLLPPPLPQGENRIQGRLRVGDERPSPTYLDDAAPAGQAARLALPGEGTD
DRAGGRGERNAAGSPGRASPGFVPGHAEAAVSVGNPAAVIARPPVDEATSPDTVFPHSGF
QGALGTEGRARGASVMAVAAMSEIPGEKTYGIHMREKRQINLVTQKVAGQVSGGAPSHPL
LQKTLEGGLGKDKNGDLFPDLPPDLPARVDADQHLQTLTDNMNADWKESNWSSQIISPKA
KALVEPEVFSKNPEESEDEAVTQRFTEKTG

>hg13_dna|GENSCAN_predicted_peptide_2|198_aa
MDVRFTNSHFLKVSGAKAVPESGIHDQVTQVPIVPFTCNSVEDASEVKMRKLQEDMQSAQ
QTTANLCHSTKHAVLCIHGSRLMLFPVPEDYPPPPPSSGCGMTALRVAKLYLGTALSSPV
ILVVSSKPSSTLSYFTVWQPVMMRTGHQLVQGFSRSHHVQVSTAAEMFHMLIHRQQKTVK
GTALNSAGGSNQLVSEFD

>hg13_dna|GENSCAN_predicted_peptide_3|116_aa
MARLSFRSVSVGSLERRLARMFSNIRGFYPLADLYPLNASNTPISHDNKKVSPALPNVPS
GADYPWLRTTGIANAEEAGLSYVIRFTTFLFSTITSDPFVSQQMPVTKKNKNSGGP

>hg13_dna|GENSCAN_predicted_peptide_4|563_aa
MDKIDRLLARLIKKKREKNQIDAIKNDKGDITTDPTEIQTTIREYYKHLYANKLENLEEM
DKFLDTYTLPRPNQEEVESLNRPITGSEIAAIINSLPTKKSPGPDGFTAEFYQRYKEELV
PFLLKLFQSIEKEGILPNSFYEVSIILIPKPGRDTTKKENFRPISLMNIDAKILNKILAN
RIQQHIKKLIHHDQVGFIPGMQGWFNICKSINIIQHINRTKNKNHMIISIDAEKAFDKIQ
QPFMLKTLNKLGIDGTYLKIIRAIYDKPTANIILNGQKLEAFPLKTGTRQACPLSPLLFN
IVLEVLARAIRQEKEIKGIQLGKEEVKLSLFADDMIVYLENPIVSAQNLLKLISNFSKVS
GHKINVQKSQAFLYTNNRQTESQIMSELPFTIASKRIKYLGIQLTRDGKDLLKENYKPLL
NEIKEDTNKWKNIPCSWVGRTNMVKMAILPKVIYRFNAIPIKLPMTFFTELEKTTLKFIW
NQKRARIAKSILSQKNKAGGITLPDFKLYYKATVTKTAWYWYQNRDIDQWNRTEPSEIIP
HIYNHLIFDKPEKNKKWEERFPI

>hg13_dna|GENSCAN_predicted_peptide_5|273_aa
MDKFLDTYTLPRLNQEEVKSLNRPMTGSEIEAIINSILTKKKVQDQKDSRPNSTRAPNLL
KLISNFSKVSGYKINVQKSQAFLYTNNRQTESQIMSELPFTIASKRIKYLGIQLTRDGKD
LLKENYKPLLNEIKEDTNKWKNIPCSWIGRINIVKMAILPKVIYRFNAIPIKLPMTFFTE
LEKTTLKFIWNQKRARIAKRILSQKNKAGGITLPDFKLYYKATVTKTAWYWYQNRHIDQW
NRTEPSEIIPHIYNHLIFDKPDKNKKREKGFPI

>hg13_dna|GENSCAN_predicted_peptide_6|159_aa
MSELPFTIASKRIKYLGIQLTRDMKDLFKENYKPLLKEIKEDTNKWKNIPCSWVGRINIV
KMAISPKVIYRFSAIPFKLPMTFFTELEKTTLKFIWNQKRARIAKSILSQKNKAGGITLP
DFKLYYKATVTLVPKQRYRSMEQNRALRNNAAYLQISDL

>hg13_dna|GENSCAN_predicted_peptide_7|160_aa
MRSPAPANHRNGWNHQQLVNYITQGRMTSKVGQLKRVNVGINKHATSTGTALGKLKHMII
LPESDLINNVLSSKSQALEMQVRGASARPPLLGSEEPLCPASHSVQEGVLSKTGIKALQA
LCFSQPNTEYHQQPHHTRPEPKRATPPQILAEVSLLANGA

>hg13_dna|GENSCAN_predicted_peptide_8|1079_aa
MEDGTKQKRERKKTVSFSSMPTEKKISSASDCINSMVEGSELKKVRSNSRIYHRYFLLDA
DMQSLRWEPSKKDSEKAKIDIKSIKEVRTGKNTDIFRSNGISDQISEDCAFSVIYGENYE
SLDLVANSADVANIWVTGLRYLISYGKHTLDMLESSQDNMRTSWVSQMFSEIDVDNLGHI
TLCNAVQCIRNLNPGLKTSKIELKFKELHKSKDKAGTEVTKEEFIEVFHELCTRPEIYFL
LVQFSSNKEFLDTKDLMMFLEAEQGVAHINEEISLEIIHKYEPSKEGQEKGWLSIDGFTN
YLMSPDCYIFDPEHKKVCQDMKQPLSHYFINSSHNTYLIEDQFRGPSDITGYIRALKMGC
RSVELDVWDGPDNEPVIYTGHTMTSQIVFRSVIDIINKYAFFASEYPLILCLENHCSIKQ
QKVMVQHMKKLLGDKLYTTSPNVEESYLPSPDVLKGKILIKAKKLSSNCSGVEGDVTDED
EGAEMSQRMGKENMEQPNNVPVKRFQLCKELSELVSICKSVQFKEFQVSFQVQKYWEVCS
FNEVLASKYANENPGDFVNYNKRFLARVFPSPMRIDSSNMNPQDFWKCGCQIVAMNFQTP
GLMMDLNIGWFRQNGNCGYVLRPAIMREEVSFFSANTKDSVPGVSPQLLHIKIISGQNFP
KPKGSGAKGDVVDPYVYVEIHGIPADCAEQRTKTVHQNGDAPIFDESFEFQINLPELAMV
RFVVLDDDYIGDEFIGQYTIPFECLQTGYRHVPLQSLTGEVLAHASLFVHVAITNRRGGG
KPHKRGLSVRKGKKSREYASLRTLWIKTVDEVFKNAQPPIRDATDLRENMQYTWESLGFL
NIPYLEHKLMIYQNKKDIGQNQDPSVPEIAHLTLIMIIIGLLLLTVTQSMGLNISSFQNA
VVSFKELCGLSSVANLMQCMLAVSPRFLGPDNTPLVVLNLSEQYPTMELQGIVPEVLKKI
VTTYDMQLKDFSDFVTSLETATTEDAVATSVLSRTGKESSLEMIQSLKALIENADAVYEK
IVHCQKADNMILCLEKPKDSTKNLLELINKCSKAVGYKISMQKSVAFLYANSEQFEKEI

>hg13_dna|GENSCAN_predicted_peptide_9|110_aa
MDFGENQELMDTTPEELTEDNLTEMSASKPVPDNEEYVEEAVPPNKLISDNLEEGFLLFK
TAYDFFYDKDPPDDTGTETKAKSGRSTGTPTLLEDDEDEDLCDDPLPRDK

>hg13_dna|GENSCAN_predicted_peptide_10|459_aa
MVDLDGFMTTVGKVNADGMKITRELELEIEPEEVTKLLQSHHKTVMHEELLLMNEQRKWF
LEMESPPGEDAVNIAVITAKELEYYIHLIDKGMAGFEKTDFQRSSTVGKMLLNSIARYRE
ICVADHVQCRKCYTKRGKHAERGCKITSIRNNSIDTFKMVIFQRREAQQASEMGSCQGSR
KKGWDQHLDLDSQKADLSSTEDTVETRDRKNAKVWLLAYVARALPAGPMESHQADTGENK
QEGRFRTVSRLLLPPLLRDSDKNSHCENNLQKLILRSRALLEDVKAALELVMGKSWNSLE
SSEKDRKMRESLELPRDLLNCCDQNTDSDMDNEVQAQEVSDGNEELIGNWSRGHFCYALA
KSLAALYPCSRDLWNLELESDDLGYLAEEIPKQQSVQDVAWLLLTTYAHMHRKRNDLKLE
FIFNRETEHKSWKNLQPGHVVEKKSPFSEEEFKQAAEIL

>hg13_dna|GENSCAN_predicted_peptide_11|143_aa
MQPTIPSSSRHCLEPQKHLFGSSKNQGQELCLEGSPFRSCTEPFIPLWQLLCTLWGGGQG
GGKAVFLLWGDKLICSSCSMPIKRGQADLLKYAKNETLENLKQIHFAAVSCGLNKPGTEN
ADVQKPRRSLEVIPEKANDETGE

>hg13_dna|GENSCAN_predicted_peptide_12|183_aa
MNIDAKILNKILANRIQQHIKKLIHHDQVGFIPGMQGWFNIRKSINVIQHINRAKDKNHM
IISIDAEKAFDKIQQLFMLKALNKLENKIPRNLTYKGCEGPLQGELQTTAQGNKRGYKQT
EEHSMLMGRKNQYRENGHTAQDSEVHITTDLIQKHQEGSCGYLGSLYSQLLLLKAVETLP
KAK

Explanation

Gn.Ex : gene number, exon number (for reference)
Type  : Init = Initial exon (ATG to 5' splice site)
        Intr = Internal exon (3' splice site to 5' splice site)
        Term = Terminal exon (3' splice site to stop codon)
        Sngl = Single-exon gene (ATG to stop)
        Prom = Promoter (TATA box / initiation site)
        PlyA = poly-A signal (consensus: AATAAA)
S     : DNA strand (+ = input strand; - = opposite strand)
Begin : beginning of exon or signal (numbered on input strand)
End   : end point of exon or signal (numbered on input strand)
Len   : length of exon or signal (bp)
Fr    : reading frame (a forward strand codon ending at x has frame x mod 3)
Ph    : net phase of exon (exon length modulo 3)
I/Ac  : initiation signal or 3' splice site score (tenth bit units)
Do/T  : 5' splice site or termination signal score (tenth bit units)
CodRg : coding region score (tenth bit units)
P     : probability of exon (sum over all parses containing exon)
Tscr  : exon score (depends on length, I/Ac, Do/T and CodRg scores)

Comments

The SCORE of a predicted feature (e.g., exon or splice site) is a
log-odds measure of the quality of the feature based on local sequence
properties. For example, a predicted 5' splice site with
score > 100 is strong; 50-100 is moderate; 0-50 is weak; and
below 0 is poor (more than likely not a real donor site).

The PROBABILITY of a predicted exon is the estimated probability under
GENSCAN's model of genomic sequence structure that the exon is correct.
This probability depends in general on global as well as local sequence
properties, e.g., it depends on how well the exon fits with neighboring
exons. It has been shown that predicted exons with higher probabilities
are more likely to be correct than those with lower probabilities.
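
As an illustration of the Ph definition and the 5' splice site score ranges given above, here is a small Python sketch. It is not GENSCAN code. The function names `net_phase` and `donor_strength` are hypothetical, and the handling of the exact boundary values (a score of exactly 100, 50, or 0) is an assumption, since the comments give the ranges only loosely. The Do/T column is a donor (5' splice site) score only for Init and Intr exons; for Term and Sngl exons it is a termination signal score, so the classification should not be applied to those rows.

```python
# Hypothetical helpers illustrating the Explanation and Comments text;
# they are not taken from GENSCAN itself.

def net_phase(exon_length: int) -> int:
    """Ph column: net phase of an exon, i.e. exon length modulo 3."""
    return exon_length % 3


def donor_strength(score: int) -> str:
    """Label a 5' splice site score (tenth bit units).

    Ranges follow the Comments section: > 100 strong, 50-100 moderate,
    0-50 weak, below 0 poor. Assigning exactly 100 and exactly 50 to
    "moderate", and exactly 0 to "weak", is an assumption of this sketch.
    """
    if score > 100:
        return "strong"
    if score >= 50:
        return "moderate"
    if score >= 0:
        return "weak"
    return "poor"


# (Gn.Ex, Type, Len, Ph, Do/T) copied from three internal exons in the table.
examples = [
    ("1.05", "Intr", 24, 0, 105),
    ("1.04", "Intr", 145, 1, 44),
    ("8.03", "Intr", 88, 1, 32),
]

for label, kind, length, reported_ph, donor_score in examples:
    # The reported Ph should equal the exon length modulo 3.
    consistent = net_phase(length) == reported_ph
    print(label, kind, "Ph consistent:", consistent,
          "donor:", donor_strength(donor_score))
```

The same pattern can be extended to filter exons by the P column, but any probability cutoff would be a choice made by the reader; the report itself does not prescribe one.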