Transfasta 2003 version 1.0 is a program that converts sequences (nucleotide or amino acid) from any format to FASTA format. It was written in Perl in March 2003 by Ramírez-Soriano A and Molina-Tomàs MC.
A sequence in FASTA format begins with a single-line description, followed by lines of sequence data. The description line is distinguished from the sequence data by a greater-than (">") symbol in the first column. It is recommended that all lines of text be shorter than 80 characters in length. An example sequence in FASTA format is:
>gi|532319|pir|TVFV2E|TVFV2E envelope protein
ELRLRYCAPAGFALLKCNDADYDGFKTNCSNVSVVHCTNLMNTTVTTGLLLNGSYSENRT
QIWQKHRTSNDSALILLNKHYNLTVTCKRPGNKTVLPVTIMAGLVFHSQKYNLRLRQAWC
HFPSNWKGAWKEVKEEIVNLPKERYRGTNDPKRIFFQRQWGDPETANLWFNCHGEFFYCK
MDWFLNYLNNLTVDADHNECKNTSGTKSGNKRAPGPCVQRTYVACHIRSVIIWLETISKK
TYAPPREGHLECTSTVTGMTVELNYIPKNRTNVTLSPQIESIWAAELDRYKLVEITPIGF
APTEVRRYTGGHERQKRVPFVXXXXXXXXXXXXXXXXXXXXXXVQSQHLLAGILQQQKNL
LAAVEAQQQMLKLTIWGVK
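The layout described above (one ">" description line followed by sequence lines) can be read back with a few lines of code. The following is a minimal Python sketch for illustration only; it is not part of Transfasta, and the function name parse_fasta is an assumption made here.

```python
def parse_fasta(text):
    """Return a list of (description, sequence) pairs found in FASTA text."""
    records = []
    description = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if description is not None:
                records.append((description, "".join(chunks)))
            description = line[1:]
            chunks = []
        elif description is not None:
            # Sequence data may span several lines; join them later.
            chunks.append(line)
    if description is not None:
        records.append((description, "".join(chunks)))
    return records


if __name__ == "__main__":
    sample = ">gi|532319|pir|TVFV2E|TVFV2E envelope protein\nELRLRYCAPAGF\nLAAVEAQQQMLK\n"
    for description, sequence in parse_fasta(sample):
        print(description, len(sequence))
```

Lines that appear before the first ">" are skipped in this sketch, since they belong to no record.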
Sequences are expected to be represented in the standard IUB/IUPAC amino acid and nucleic acid codes, with one exception: lower-case letters are accepted. The nucleic acid codes supported by Transfasta 2003 version 1.0 are:
A --> adenosine | M --> A C (amino)
C --> cytidine | S --> G C (strong)
G --> guanine | W --> A T (weak)
T --> thymidine | B --> G T C
U --> uridine | D --> G A T
R --> G A (purine) | H --> A C T
Y --> T C (pyrimidine) | V --> G C A
K --> G T (keto) | N --> A G C T (any)
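The ambiguity codes in the table above map naturally onto a lookup table. Below is a minimal Python sketch, assuming the table is copied as-is; the names NUCLEOTIDE_CODES, possible_bases and code_matches are hypothetical and do not come from Transfasta.

```python
# Each code maps to the bases it can stand for, as listed in the table above.
NUCLEOTIDE_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "GA", "Y": "TC", "K": "GT", "M": "AC",
    "S": "GC", "W": "AT", "B": "GTC", "D": "GAT",
    "H": "ACT", "V": "GCA", "N": "AGCT",
}


def possible_bases(code):
    """Return the bases a single code can stand for (case-insensitive)."""
    return NUCLEOTIDE_CODES[code.upper()]


def code_matches(code, base):
    """Return True if the concrete base is one of those the code stands for."""
    return base.upper() in possible_bases(code)


if __name__ == "__main__":
    print(possible_bases("r"))     # GA
    print(code_matches("N", "t"))  # True
```

Upper-casing the input before the lookup mirrors the rule that lower-case letters are accepted.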
And the accepted amino acid codes are:
A alanine | P proline
B aspartate or asparagine | Q glutamine
C cysteine | R arginine
D aspartate | S serine
E glutamate | T threonine
F phenylalanine | U selenocysteine
G glycine | V valine
H histidine | W tryptophan
I isoleucine | Y tyrosine
K lysine | Z glutamate or glutamine
L leucine | X any
M methionine | * translation stop
N asparagine
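A sequence can be checked against the amino acid codes listed above before it is converted. The following Python sketch is an illustration under that assumption, not a description of how Transfasta validates its input; the names AMINO_ACID_CODES and unaccepted_codes are hypothetical.

```python
# The 25 accepted codes from the table above (note that J and O are absent).
AMINO_ACID_CODES = set("ABCDEFGHIKLMNPQRSTUVWYZX*")


def unaccepted_codes(sequence):
    """Return the sorted list of characters that are not accepted codes.

    Lower-case letters are accepted, so the check is case-insensitive.
    """
    return sorted(set(sequence.upper()) - AMINO_ACID_CODES)


if __name__ == "__main__":
    print(unaccepted_codes("elrlrycapagf"))  # []
    print(unaccepted_codes("MKJO"))          # ['J', 'O']
```

Returning the offending characters rather than a plain yes/no makes it easier to see why a sequence was rejected.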
Downloading the program is simple: click the download link and save the file to disk. The program is compressed with zip, so you will need a zip utility to decompress it before use (on Windows, WinZip can be used).
Transfasta 2003 version 1.0 is designed to run on Linux. Once you have downloaded it, you have to give it execute permission. This can be done with the chmod u+x command followed by the file name (Transfasta.pl). Afterwards, you only need to type ./Transfasta.pl. No arguments are needed, as the program will ask you for all the parameters it requires.
You can choose the name of the output file and its format. The program will ask you for this information when it starts running. A .txt extension is recommended for the output file, since you will then be able to open it with any text editor.
You can choose the length of the lines without restriction, but remember that the recommended maximum length is 80 characters.
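Splitting a sequence into lines of a chosen length is a simple slicing operation. The Python sketch below shows one way to do it; the function name wrap_sequence and the default of 60 characters are assumptions made for the example, not Transfasta settings.

```python
def wrap_sequence(sequence, line_length=60):
    """Split a sequence into consecutive lines of at most line_length characters."""
    if line_length < 1:
        raise ValueError("line_length must be at least 1")
    return [
        sequence[start:start + line_length]
        for start in range(0, len(sequence), line_length)
    ]


if __name__ == "__main__":
    for line in wrap_sequence("ACGTACGTAC", 4):
        print(line)
```

With a length of 4, the ten-character sequence above is printed as ACGT, ACGT and AC on three lines.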
The description line starts with ">gi|" as usual. The information about the sequence is supplied by the user. Remember to put the "|" symbol between the different parts of the description yourself, as the program only inserts the one after "gi".
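To make the structure of the description line concrete, the Python sketch below assembles one from its parts. It is only an illustration: in Transfasta the user types the separators by hand, and the function name description_line is hypothetical.

```python
def description_line(parts):
    """Build a FASTA description line of the form '>gi|part1|part2|...'."""
    for part in parts:
        # A separator or line break inside a part would corrupt the layout.
        if "|" in part or "\n" in part:
            raise ValueError("parts must not contain '|' or line breaks")
    return ">gi|" + "|".join(parts)


if __name__ == "__main__":
    print(description_line(["532319", "pir", "TVFV2E", "TVFV2E envelope protein"]))
```

Run as a script, this prints the description line used in the FASTA example earlier in this document.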
Transfasta 2003 version 1.0 was created using GNU Emacs 21.2.1 (Linux) as the text editor. If you open it in another editor or another Emacs version, errors may occur when the editor encounters accented letters or other characters not used in English (for example ç).
Catalan is our language, so the original program is written in Catalan. For the moment, only the input prompts have been translated. Future versions will also include a translation of the script itself.