Findorf 2003 version 1.0 is a program for finding open reading frames (ORFs) in genomic sequences and translating them into protein. It was written in Perl in March 2003 by Ramírez-Soriano A and Molina-Tomàs MC.
That's a simple thing to do. You only have to click the download link and save the file to disk. The program is compressed in zip format, so you will need a zip utility to decompress it before use (on Windows, WinZip can be used).
Findorf 2003 version 1.0 is designed to run on Linux. Once you have downloaded it, you have to give it execute permission. This can be done with the chmod u+x command followed by the file name (Findorf.pl). Afterwards, you only need to type ./Findorf.pl. You do not need to pass any arguments, as the program will ask you for all the parameters it needs.
Unfortunately, yes. Findorf 2003 version 1.0 only accepts sequences in FASTA format. If your sequence is not in this format, you can use Transfasta 2003 version 1.0 to convert it.
A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence in FASTA format is:
>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSYSENRT
QIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQKYNLRLRQAWC
HFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPCVQRTYVACHIRSVIIWLETISKK
TYAPPREGHLECTSTVTGMTVELNYIPKNRTNVTLSPQIESIWAAELDRYKLVEITPIGF
APTEVRRYTGGHERQKRVPFVXXXXXXXXXXXXXXXXXXXXXXVQSQHLLAGILQQQKNL
LAAVEAQQQMLKLTIWGVK
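The format described above can be read with very little code. The following is a minimal Python sketch, not Findorf's own Perl code: the function name `parse_fasta` is hypothetical, and it assumes that every record starts with a ">" line and that blank lines can be ignored.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) pairs from FASTA text.

    Hypothetical helper for illustration; not part of Findorf.
    """
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


if __name__ == "__main__":
    sample = ">example description\nACGTAC\nGTTA\n"
    for description, sequence in parse_fasta(sample):
        print(description, len(sequence))
```

The description is returned without the leading ">" so that it can be reused when writing output records.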
Sequences are expected to be represented in the standard IUB/IUPAC amino acid and nucleic acid codes, with this exception: lower-case letters are accepted. The nucleic acid codes supported by Findorf 2003 version 1.0 are:
A --> adenosine
C --> cytidine
G --> guanine
T --> thymidine
And the accepted amino acid codes are:
A alanine | P proline
B aspartate or asparagine | Q glutamine
C cysteine | R arginine
D aspartate | S serine
E glutamate | T threonine
F phenylalanine | U selenocysteine
G glycine | V valine
H histidine | W tryptophan
I isoleucine | Y tyrosine
K lysine | Z glutamate or glutamine
L leucine | X any
M methionine | * translation stop
N asparagine
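The code lists above can be turned into a simple pre-check of a sequence before it is given to the program. This is a Python sketch under stated assumptions: the names `NUCLEOTIDE_CODES`, `AMINO_ACID_CODES` and `invalid_positions` are hypothetical, and how Findorf itself reacts to unsupported characters is not described here.

```python
# Codes copied from the two lists above; lower-case letters are accepted,
# so every character is compared in upper case.
NUCLEOTIDE_CODES = set("ACGT")
AMINO_ACID_CODES = set("ABCDEFGHIKLMNPQRSTUVWXYZ*")


def invalid_positions(sequence, allowed=NUCLEOTIDE_CODES):
    """Return (1-based position, character) for every unsupported symbol."""
    return [
        (index + 1, char)
        for index, char in enumerate(sequence)
        if char.upper() not in allowed
    ]


if __name__ == "__main__":
    print(invalid_positions("acgtNacgt"))
    print(invalid_positions("MKJ*", AMINO_ACID_CODES))
```

An empty list means that every character belongs to the chosen alphabet.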
You can choose the name of the output file and its format. The program will ask you for this information when it starts running. A .txt extension is recommended for the output file, since you will then be able to open it with any text editor.
Findorf 2003 version 1.0 is a complete ORF-finding program that lets you choose the characteristics of the ORFs to report, displaying only those that suit your needs. These options are:
ORFs are translated and displayed in FASTA format.
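To illustrate what finding and translating ORFs involves, here is a short Python sketch. It is not Findorf's algorithm, whose criteria are not documented here. It assumes that an ORF runs from an ATG codon to the first in-frame stop codon (TAA, TAG or TGA), that the standard genetic code is used, and that both strands are scanned; the names `find_orfs`, `translate`, `reverse_complement` and `min_length` are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order (assumption: no alternative code).
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO)
)
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate complete codons; unknown codons become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def find_orfs(seq, min_length=0):
    """Return (start, end, strand, protein) tuples with 1-based positions
    on the input sequence; min_length is in nucleotides, stop included."""
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if end - start >= min_length:
                        protein = translate(s[start:i])
                        if strand == "+":
                            orfs.append((start + 1, end, strand, protein))
                        else:
                            # Map back to coordinates of the input strand.
                            orfs.append((n - end + 1, n - start, strand, protein))
                    start = None
    return orfs


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATGACC"):
        print(orf)
```

Positions for ORFs on the complementary strand are converted back to the coordinates of the input sequence, so both strands are reported on the same scale.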
Of course! You can choose the length of the lines without restriction. But remember that the maximum recommended length is 80 characters.
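Splitting a translated sequence into lines of a chosen length is straightforward. The sketch below is in Python and is only an illustration of the idea, not the program's own code; the function name `wrap_sequence` and its `width` parameter are hypothetical.

```python
def wrap_sequence(sequence, width):
    """Split a sequence into consecutive lines of at most `width` characters."""
    if width < 1:
        raise ValueError("width must be at least 1")
    return [sequence[i:i + width] for i in range(0, len(sequence), width)]


if __name__ == "__main__":
    for line in wrap_sequence("MKTAYIAKQRQISFVKSHFSRQ", 10):
        print(line)
```

Only the last line can be shorter than the requested width.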
The description line will consist of the description line of the input sequence plus the start and end positions of each ORF. If the ORF was found on the complementary strand, the word "complem" will precede the positions.
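A description line of this kind could be assembled as in the Python sketch below. The exact layout that Findorf writes (separators, spacing, how the two positions are joined) is not specified here, so the format used is an assumption, and the function name `orf_description` is hypothetical.

```python
def orf_description(input_description, start, end, complementary=False):
    """Build a FASTA description line for one ORF.

    Assumed layout: '>' + input description + optional 'complem' + 'start-end'.
    """
    positions = f"{start}-{end}"
    if complementary:
        # ORFs found on the complementary strand are flagged with "complem".
        positions = f"complem {positions}"
    return f">{input_description} {positions}"


if __name__ == "__main__":
    print(orf_description("my sequence", 3, 11))
    print(orf_description("my sequence", 3, 11, complementary=True))
```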
Findorf 2003 version 1.0 was written using GNU Emacs 21.2.1 (Linux) as the text editor. If you open it in another editor or a different Emacs version, errors may occur when it encounters accented characters or other letters not used in English (for example ç).
Catalan is our language, so the original program is written in Catalan. For the moment, the only translated part is the set of input questions. Future versions will also include a translation of the script itself.