SHELL COMMANDS TO EXTRACT THE SEQUENCES BY AC NUMBER AND CONVERT THEM TO FASTA FORMAT

 

5' SEQUENCES (Hum_5utrnr.dat):

$ ~/bin/ExtractAllSeqs.sh < Hum_5utrnr.dat | sed 's/;/ /' | gawk '{if ($1=="AC") {id=$2; printf(">%s\n", id)} else if (substr($1,1,1) ~ /[actg]/) {$NF=""; printf("%s%s%s%s%s%s\n",$1,$2,$3,$4,$5,$6)}}' | FastaToTbl | sed -n '1,200p' | TblToFasta >| Hum_5utrnr.2RunPattern
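
The sed/awk step of this pipeline can be tried on its own with a small test record. The sketch below assumes that ExtractAllSeqs.sh emits EMBL-style lines (an "AC" line ending in ";" and sequence lines made of blocks of 10 bases followed by a position count); the sample record, its accession numbers and the function name to_fasta are made up for illustration. Plain awk is used so the example runs without gawk.

```shell
# Hypothetical EMBL-style sample; the real output of ExtractAllSeqs.sh may differ.
sample='AC   5HSA000001;
     acgtacgtac gtacgtacgt acgtacgtac gtacgtacgt acgtacgtac gtacgtacgt        60
     ttgacc                                                                    66
AC   5HSA000002;
     ggcatt                                                                     6'

# to_fasta: turn AC lines into FASTA headers and join the sequence blocks.
to_fasta() {
  sed 's/;/ /' |
  awk '{
    if ($1 == "AC") {
      # Accession number becomes the FASTA header.
      printf(">%s\n", $2)
    } else if (substr($1, 1, 1) ~ /[actg]/) {
      # Blank the trailing position count, then join up to six blocks.
      $NF = ""
      printf("%s%s%s%s%s%s\n", $1, $2, $3, $4, $5, $6)
    }
  }'
}

fasta=$(printf '%s\n' "$sample" | to_fasta)
printf '%s\n' "$fasta"
```

Note that a sequence line whose first base is not a, c, t or g (for example an ambiguous "n") is skipped by this test, so the character class may need widening for such data.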

Repeat the command for both the redundant and the non-redundant sets, and for the other species, whose sequences are in the following database files:

        In_5utrnr.dat

        Om_5utrnr.dat

        Ro_5utrnr.dat

        Ov_5utrnr.dat
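
Rather than retyping the command for each file, the repetition could be scripted. This is only a sketch for the non-redundant files listed above: it assumes ExtractAllSeqs.sh, FastaToTbl and TblToFasta are installed as in the command for Hum_5utrnr.dat, and the variable names sp, src and out are made up here. The redundant files are not named in these notes, so they would have to be added by hand.

```shell
# Sketch: run the 5' UTR extraction once per species prefix.
# ExtractAllSeqs.sh, FastaToTbl and TblToFasta are the local tools used in
# these notes; they are assumed to be installed.
for sp in Hum In Om Ro Ov; do
  src="${sp}_5utrnr.dat"
  out="${src%.dat}.2RunPattern"
  # Skip species whose database file is not in the current directory.
  [ -f "$src" ] || { echo "skipping $src (not found)" >&2; continue; }
  ~/bin/ExtractAllSeqs.sh < "$src" |
    sed 's/;/ /' |
    gawk '{ if ($1 == "AC") { printf(">%s\n", $2) }
            else if (substr($1, 1, 1) ~ /[actg]/) {
              $NF = ""; printf("%s%s%s%s%s%s\n", $1, $2, $3, $4, $5, $6) } }' |
    FastaToTbl | sed -n '1,200p' | TblToFasta >| "$out"
done
```

Replacing 5utrnr with 3utrnr in src gives the equivalent loop for the 3' files.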

 

$ cat *5*2RunPattern > 5UTR.2RunPattern

3' SEQUENCES (Hum_3utrnr.dat):

$ ~/bin/ExtractAllSeqs.sh < Hum_3utrnr.dat | sed 's/;/ /' | gawk '{if ($1=="AC") {id=$2; printf(">%s\n", id)} else if (substr($1,1,1) ~ /[actg]/) {$NF=""; printf("%s%s%s%s%s%s\n",$1,$2,$3,$4,$5,$6)}}' | FastaToTbl | sed -n '1,200p' | TblToFasta >| Hum_3utrnr.2RunPattern

Repeat the command for both the redundant and the non-redundant sets, and for the other species, whose sequences are in the following database files:

        In_3utrnr.dat

        Om_3utrnr.dat

        Ro_3utrnr.dat

        Ov_3utrnr.dat

 

$ cat *3*2RunPattern > 3UTR.2RunPattern
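
After concatenating, it may be worth checking that no sequences were lost or duplicated. The sketch below counts FASTA header lines in the per-species files and in the combined file, assuming every sequence has exactly one header line starting with ">". The helper count_seqs and the sample file contents are made up for illustration. One thing to watch: the glob *3*2RunPattern also matches 3UTR.2RunPattern itself once that file exists, so the narrower pattern *_3utrnr.2RunPattern is used here.

```shell
# count_seqs FILE...: print the number of FASTA header lines (one per sequence).
count_seqs() {
  cat "$@" | grep -c '^>'
}

# Demo on made-up files in a temporary directory.
tmp=$(mktemp -d)
printf '>a1\nacgt\n>a2\nttga\n' > "$tmp/Hum_3utrnr.2RunPattern"
printf '>b1\nggca\n'            > "$tmp/In_3utrnr.2RunPattern"

# Concatenate the per-species files, then compare the header counts.
cat "$tmp"/*_3utrnr.2RunPattern > "$tmp/3UTR.2RunPattern"
parts=$(count_seqs "$tmp"/*_3utrnr.2RunPattern)
total=$(count_seqs "$tmp/3UTR.2RunPattern")
echo "per-species files: $parts sequences, combined file: $total sequences"

rm -rf "$tmp"
```

On the real data, each per-species file should contain at most 200 sequences if FastaToTbl writes one sequence per line, since sed -n '1,200p' keeps only the first 200 lines.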