Shim Services (or helper services)
These exercises highlight the services that do not perform biological functions, but are vital for running life science workflows.

Exploring Shims
A shim is a service that doesn’t perform an experimental function, but acts as a connector, or glue, when two experimental services have incompatible outputs and inputs. A shim can be any type of service (WSDL, Soaplab, etc.). Many are simple Beanshell scripts. We have already used many shims in these exercises. For the origin of the word, see http://en.wikipedia.org/wiki/Shim

Exploring Shims
- Find the workflow ‘BiomartandEmbossDisease’ on myExperiment (workflow ID: 2213).
- Run the workflow and work out what it does.
- Work out which services are shims.
- What do the shims do?

Shims for Data Input
One of the most common uses of shims is to transform your input data into a format the analysis services will understand. In this exercise, we will find a set of protein sequences from Uniprot and perform a multiple sequence alignment with them. We will use shims to process a list of Uniprot identifiers and to combine the retrieved sequences into a single input for the multiple alignment.

Shims for Data Input
- In the services panel, find ‘get Protein FASTA’ and import it into a new workflow.
- Connect an input and an output and run the workflow with the input ‘Q9RZ30’ (a Uniprot ID).
- Look at the results: you should see a single protein sequence.
- Go back to the design window, right-click on the workflow input and select ‘edit workflow input port’.
- Change the port depth from single to ‘List of depth 1’.
- Rerun the workflow. This time, enter ‘Q9RZ30’ and ‘Q5WJS9’ as a list.

Shims for Data Input
- The results now contain two protein sequences. Entering identifiers by hand is fine for one or two, but it is very inefficient for a long list of protein identifiers.
- Go back to the design window and change the input port back to a single value (you can use Taverna’s undo button to do this if you like).
- Find the service ‘Split string into string list by regular expression’ and import it into the workflow.
- Delete the link between the input and ‘get Protein FASTA’ and reconnect the workflow so that the input flows to the string input of ‘split_string...’ and then on to ‘get Protein FASTA’.

Shims for Data Input
- Right-click on the regex input to ‘split_string...’ and enter the constant value ‘\n’. This is the regular expression that will split an input string every time there is a new line.
- Run the workflow again, but this time, download and use the ‘Shims Protein Data File’ on myExperiment as input. How many sequences do you get this time?
- Now we will align those sequences using the ‘EMBL-EBI ClustalW2_SOAP’ workflow from myExperiment (ID: 1768). Load it and see what input it requires.
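To make the effect of the ‘\n’ regular expression concrete, here is a minimal Java sketch of what a split-by-regex shim does to a multi-line input. The class and method names are hypothetical, and dropping blank entries is an assumption of this sketch; the Taverna service may treat blank lines differently.

```java
import java.util.ArrayList;
import java.util.List;

class SplitShim {
    // Splits the input on the given regular expression.
    // Assumption: blank entries (e.g. from an empty line) are dropped.
    static List<String> splitByRegex(String input, String regex) {
        List<String> out = new ArrayList<>();
        for (String part : input.split(regex)) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                out.add(trimmed);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // One identifier per line, as in a plain-text input file.
        String ids = "Q9RZ30\nQ5WJS9\n";
        System.out.println(splitByRegex(ids, "\n")); // prints [Q9RZ30, Q5WJS9]
    }
}
```

Each element of the resulting list is then passed to the downstream service one at a time, which is why the input port can stay a single value.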

Shims for Data Input
- ClustalW requires a single file containing all the sequences. Currently, we have a list. Therefore, we need another shim.
- Go back to your ‘get Protein FASTA’ workflow and find and import the service ‘Merge string list to string’.
- Add this service after ‘get Protein FASTA’ and run the workflow again.
- This is another shim. It changes the format of the service output.
- Now import the ‘EMBL-EBI ClustalW2_SOAP’ workflow as a nested workflow and run the whole thing. This time you will get a protein sequence alignment.
(Note: if you can’t remember how to import workflows, refer to the intro tutorial from yesterday.)
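The merge step is the mirror image of the split step. The following Java sketch shows the idea under stated assumptions: the class and method names are hypothetical, the FASTA records are placeholders rather than real sequences, and the newline separator is a choice made for this example.

```java
import java.util.Arrays;
import java.util.List;

class MergeShim {
    // Joins a list of strings into one string with the given separator.
    static String mergeList(List<String> items, String separator) {
        return String.join(separator, items);
    }

    public static void main(String[] args) {
        // Placeholder FASTA records; not real sequence data.
        List<String> records = Arrays.asList(
            ">seq1\nAAAA",
            ">seq2\nCCCC");
        // A newline separator gives one multi-record FASTA text.
        System.out.println(mergeList(records, "\n"));
    }
}
```

The output is a single string holding both records, which is the shape an alignment service expecting one file of sequences can accept.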

Writing your own shim services
Many shims are actually Beanshell scripts. Beanshell scripts allow you to add simple data transformation steps to your workflow in an easy way. The next few exercises give a brief introduction to writing Beanshells and some examples of when they are commonly used.

Writing your Own Beanshell
- Create a new workflow by selecting ‘File’ and ‘New Workflow’.
- Add a new Beanshell from the ‘service template’ section of the service panel. A configuration window will pop up.
- Select the ‘Ports’ tab and create 2 input ports named myName and mySurname.
- Create 1 output port named myFullname.

Writing your Own Beanshell
- Select the script tab and paste the following script:
    myFullname = myName + "\t" + mySurname;
- Create 2 workflow inputs and 1 workflow output and connect them to the configured Beanshell service.
- Run the workflow.
- You should get your full name printed in the output.
This is a very simple example of using helper services to format results from your workflow.
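Beanshell uses Java syntax, so the same transformation can be tried outside Taverna as plain Java. In the workbench, the input ports myName and mySurname are available to the script as variables, and assigning to myFullname sets the output port. The sketch below models that with a method; the class and method names are hypothetical, and the name used in main is a placeholder.

```java
class FullNameShim {
    // Mirrors the Beanshell script: the two inputs joined by a tab character.
    static String fullName(String myName, String mySurname) {
        return myName + "\t" + mySurname;
    }

    public static void main(String[] args) {
        // Placeholder values standing in for the two workflow inputs.
        System.out.println(fullName("Ada", "Lovelace"));
    }
}
```

Replacing "\t" with " " would give a space-separated name instead; the tab is simply what the exercise script uses.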

Checkpoint Beanshell
Load the workflow ‘Disease Genes and GO’ from myExperiment (workflow ID: 944). This workflow produces a list of all genes associated with OMIM diseases on chromosome Y and finds their GO descriptions and IDs.
- Currently, this workflow produces separate outputs for gene IDs, gene ontology descriptions and gene ontology identifiers.
- Write a Beanshell to combine the gene IDs, gene ontology terms and descriptions into one output file.
HINT: when you run the workflow, you need to think about iteration and how your different IDs and terms combine.
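As a starting point only, here is a Java sketch of one way to combine three outputs into a single tab-separated text. It assumes the three inputs arrive as parallel lists of equal length, which may not match how the workflow's iteration actually pairs genes with GO terms; checking that is the point of the hint. All names and sample values are hypothetical placeholders.

```java
import java.util.Arrays;
import java.util.List;

class CombineColumns {
    // Builds one line per index: geneId <TAB> goId <TAB> goDescription.
    // Assumption: the three lists are parallel and of equal length.
    static String combine(List<String> geneIds, List<String> goIds,
                          List<String> goDescriptions) {
        if (geneIds.size() != goIds.size()
                || geneIds.size() != goDescriptions.size()) {
            throw new IllegalArgumentException("input lists must have the same length");
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < geneIds.size(); i++) {
            sb.append(geneIds.get(i)).append('\t')
              .append(goIds.get(i)).append('\t')
              .append(goDescriptions.get(i)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values; not real gene or GO data.
        System.out.print(combine(
            Arrays.asList("gene1", "gene2"),
            Arrays.asList("goId1", "goId2"),
            Arrays.asList("description 1", "description 2")));
    }
}
```

If one gene maps to several GO terms, the inputs are nested lists rather than parallel ones, and the loop would need an inner iteration over the terms for each gene.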

Managing Services that Produce or Consume XML
Services that produce or consume XML require XML documents as input or return them as output. These are called complex type services. Producing XML by hand can be difficult, so Taverna’s XML splitters help you construct the XML a service requires or navigate the XML it produces. An XML splitter can either assemble XML elements or separate elements out of the XML. Users only have to supply strings as inputs or read strings as outputs, and Taverna populates the XML or extracts from it behind the scenes.

Complex Types – XML Services
- Load the service ‘GetWeather’ into the workbench.
- Right-click and select ‘add XML input splitter’.
- Expand the ports. You will now see that the XML splitter box has divided the one input of GetWeather into 2 inputs: CityName and CountryName.
- Add two string constants: Paris (for CityName) and France (for CountryName).
- Add an XML splitter for the output.
- Save and run the workflow.
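Conceptually, an input splitter wraps plain strings in XML elements and an output splitter pulls a string back out of an element. The Java sketch below illustrates only that idea. The CityName and CountryName element names come from the exercise; the GetWeather wrapper element, the class and method names, and the absence of namespaces and of a SOAP envelope are assumptions of this sketch, not a description of what Taverna sends.

```java
class XmlSplitterSketch {
    // Escapes the characters that would break element content.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // "Input splitter": two plain strings in, one XML fragment out.
    static String wrap(String city, String country) {
        return "<GetWeather>"
             + "<CityName>" + escape(city) + "</CityName>"
             + "<CountryName>" + escape(country) + "</CountryName>"
             + "</GetWeather>";
    }

    // "Output splitter": returns the raw text of the first matching element,
    // or null if the element is absent. Escaped characters are not decoded.
    static String extract(String xml, String element) {
        String open = "<" + element + ">";
        String close = "</" + element + ">";
        int start = xml.indexOf(open);
        int end = xml.indexOf(close);
        if (start < 0 || end < 0) {
            return null;
        }
        return xml.substring(start + open.length(), end);
    }

    public static void main(String[] args) {
        String xml = wrap("Paris", "France");
        System.out.println(xml);
        System.out.println(extract(xml, "CityName")); // prints Paris
    }
}
```

A real splitter works from the service's WSDL, so it knows the element names and types in advance; the string search here is only a stand-in for proper XML parsing.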