Assignment1_2
Separating the sequences in a FASTA file
1. Open Taverna and type “split” in the search box. Under “Local Services”, the service “Split string into string list by regular expression” is shown in red. The FASTA file at http://www.cs.manchester.ac.uk/~katy/taverna/fastaFile.txt contains several nucleotide sequences, and the split service will be used to separate them into individual sequences.
2. Right-click “Split string into string list by regular expression” and choose “Add to model”. Here each nucleotide sequence is treated as a “string”, and the regular expression is the pattern that separates one string from the next.
3. This service requires two workflow inputs: the string and the regex (regular expression). Right-click “Workflow inputs” in the lower left panel and choose “Create New Input…”
4. Enter “FASTA sequence” in the “Name for the new workflow input” box. This input will hold the nucleotide sequence file. Then add a second workflow input named “pattern”; it will later be given the pattern that separates the nucleotide sequences.
5. Then right-click “Workflow outputs” to add an output for the split FASTA sequences.
6. In the right panel, the workflow inputs and output are shown graphically as boxes. These boxes need to be connected. Right-click “FASTA sequence” and choose “Processors > Split_string… > string”. The box is then linked to the processor. The FASTA file will be supplied to this input later.
7. Connect “pattern” by right-clicking it and choosing “Processors > Split_string… > regex”.
8. Connect the split processor to the output by right-clicking “split” under the Processors category.
9. The workflow for this process is now complete and is displayed in the right panel.
10. The workflow can now be run by choosing File > Run workflow… in the left corner.
11. A pop-up window will appear in which values for the inputs can be entered. Right-click “FASTA_sequence” and choose “New input value”.
12. The FASTA file containing the nucleotide sequences is then added in the right panel.
13. For pattern, enter only “>”, since each nucleotide sequence begins with this symbol. The service finds each occurrence of the symbol and splits the text at that point.
14. Then click the “Run workflow” button. The result will appear in the main program window under the “Result” tab.
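To see what the split step does outside Taverna, the same idea can be sketched in a few lines of Python. This is only an illustration, not the code Taverna runs: the function name `split_fasta` and the example sequences are made up for this post, and the sketch assumes that “>” appears only at the start of each sequence header. Note that the “>” itself is removed from each piece, and that the empty piece before the first “>” is dropped here (the Taverna service may treat that empty piece differently).

```python
import re


def split_fasta(fasta_text, pattern=">"):
    """Split FASTA text into a list of records using a regular expression.

    Hypothetical helper for illustration; it mirrors the two workflow
    inputs used above (the string and the regex).
    """
    pieces = re.split(pattern, fasta_text)
    # Drop empty pieces, such as the one before the first ">".
    return [piece.strip() for piece in pieces if piece.strip()]


# Made-up example data, not the contents of fastaFile.txt.
example = (
    ">seq1 first example\n"
    "ACGTACGT\n"
    "GGCC\n"
    ">seq2 second example\n"
    "TTGACA\n"
)

records = split_fasta(example)
for record in records:
    # The first line of each record is the header; the rest is the sequence.
    header, _, sequence = record.partition("\n")
    print(header, "->", sequence.replace("\n", ""))
```

Each item in `records` corresponds to one entry of the string list that the workflow output would hold: a header line followed by the sequence lines that belong to it.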