edited by Roderic Guigó
A Course on Sequence Analysis. Practical
Signal Searches in
by Roderic Guigó 6/6/97
In this practical we will run a genomic DNA sequence against
a database of position weight matrices representing promoter elements,
and will try to locate potential splice sites in the genomic sequence.
WWW DB Tools
We will use:
- SRS (at EBI or EMBL), to extract a query sequence.
- Transfac (at GBF), to search the sequence against a database of profiles for promoter elements.
- SPL (at the Sanger Center), to find potential splice sites.
Step 1. Extract a genomic DNA sequence from the database
- Load SRS (at EBI or EMBL) and Start the session.
- Select the EMBL database and standard query form.
- Type cds in the FtDescription(Feature) box and tata in the FtDescription(Feature) box, then run the query.
- Click on the AGCHYM12 entry.
- Look at the entry.
- Click on save.
- Sequence format is fasta and view FastSeqs. Then click on
- In Netscape, save the result as text to a file, e.g. genomic.fa
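Before pasting the saved sequence into the other tools, it can help to check that the file really is in FASTA format. The following is a minimal Python sketch and not part of the original practical: the function name read_fasta is hypothetical, and the record used here is invented (in the practical, the handle would come from opening genomic.fa).

```python
import io

def read_fasta(handle):
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Invented toy record; replace with open("genomic.fa") for the real file.
example = io.StringIO(">toy_entry example\nacgtac\nGGTATAAAAG\n")
records = read_fasta(example)
for name, seq in records:
    print(name, len(seq))
```

If the saved file still contains HTML markup or no line starting with ">", the returned list will be empty or the sequence will contain unexpected characters, which is a hint that the page was not saved as plain text.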
Step 2. Weight matrix based search for potential promoter elements
Analyze the results and compare with the EMBL annotation for the sequence
- Search Transfac with MatInspector (http://transfac.gbf.de/cgi-bin/matSearch/matsearch.pl)
- Select Weight-Matrix-Based search
- Cut and paste the genomic DNA sequence.
- Search Insect Matrix Group
- Matches on the sequence
- 1. Were the annotated promoters found? Why?
- 2. Were promoter elements other than the annotated ones found? Why?
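To make the idea of a weight matrix search concrete, the sketch below scores every window of a sequence against a log-odds matrix and reports windows above a threshold. It is only an illustration under stated assumptions: the count matrix is an invented TATA-like toy, not a Transfac matrix, the background frequency of 0.25 per base and the threshold are arbitrary choices, only the forward strand is scanned, and this is not the scoring scheme used by MatInspector.

```python
import math

# Hypothetical count matrix for a 6-bp TATA-like motif (one dict per position).
# The counts are invented for illustration and are NOT taken from Transfac.
COUNTS = [
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 1, "C": 1, "G": 1, "T": 17},
    {"A": 17, "C": 1, "G": 1, "T": 1},
    {"A": 12, "C": 1, "G": 1, "T": 6},
    {"A": 17, "C": 1, "G": 1, "T": 1},
]
BACKGROUND = 0.25  # assumed uniform base composition

def log_odds_matrix(counts, background=BACKGROUND):
    """Convert per-position counts into log2(frequency / background) weights."""
    matrix = []
    for column in counts:
        total = sum(column.values())
        matrix.append({base: math.log2((n / total) / background)
                       for base, n in column.items()})
    return matrix

def scan(sequence, matrix, threshold):
    """Return (start, window, score) for each forward-strand window scoring >= threshold."""
    width = len(matrix)
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows with ambiguous bases
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits

pwm = log_odds_matrix(COUNTS)
hits = scan("GGCGCTATAAAAGGCGC", pwm, threshold=6.0)
for hit in hits:
    print(hit)
```

Lowering the threshold in this toy example reports more windows, which mirrors question 2: short matrices match by chance in any long sequence, so hits outside the annotated promoter are expected.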
Step 3. Multivariate analysis based search for potential splice sites
- Load Genomic Analysis tools at the Sanger Center (http://genomic.sanger.ac.uk)
- Select Nucleotide Sequence Analysis
- Load the file with the genomic DNA sequence.
- Select Drosophila and spl
- Perform Search
- 1. How many potential splice sites were detected?
- 2. Were the real splice sites detected?
- 3. Did the real splice sites score high?
- 4. How many false sites were identified?
- Select Human and spl, and repeat the analysis
- Compare the results of the two analyses.
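The four questions above amount to comparing a list of predicted sites with the annotated ones. A minimal Python sketch of that bookkeeping follows. It assumes the predictions have already been copied by hand into a dictionary of position to score; the function name compare_sites is hypothetical, the positions and scores are invented, and nothing here parses real spl output.

```python
def compare_sites(predicted, annotated):
    """Compare predicted splice sites with annotated ones.

    predicted: dict mapping sequence position -> score
    annotated: set of positions of the real (annotated) splice sites
    """
    detected = sorted(p for p in annotated if p in predicted)
    missed = sorted(p for p in annotated if p not in predicted)
    false_sites = sorted(p for p in predicted if p not in annotated)
    # Rank all predictions by score, 1 = highest, to see whether real sites score high.
    ranking = sorted(predicted, key=predicted.get, reverse=True)
    ranks = {p: ranking.index(p) + 1 for p in detected}
    return {
        "total_predicted": len(predicted),
        "detected": detected,
        "missed": missed,
        "false_sites": false_sites,
        "ranks_of_detected": ranks,
    }

# Invented example values, for illustration only.
predicted = {120: 0.91, 340: 0.35, 512: 0.78, 800: 0.40}
annotated = {120, 512, 950}
result = compare_sites(predicted, annotated)
for key, value in result.items():
    print(key, value)
```

Running the same comparison on the Drosophila and the Human predictions gives two summaries that can be set side by side.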