High Performance Computing for genomic applications Snakemake example

Snakemake example with genomic data
Scientific IT Services
Michal Okoniewski, Samuel Fux, Manuel Kohler
ID | SIS Michal Okoniewski, Scientific IT ETH | 9/9/2020

Snakemake
• Workflow management system
• Designed by Johannes Köster
  • Now PI at Uni. Essen
• Python 3-based
• GNU Make philosophy
• conda installation
• conda support
• http://snakemake.readthedocs.io/

Snakefile – set of rules
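
The slide showed the Snakefile as an image, so the original rules are not recoverable here. As a minimal sketch of what "a set of rules" can look like, the Python script below writes a hypothetical two-rule Snakefile to disk. The rule names, file paths, and the bwa/samtools commands are assumptions chosen for illustration, not the authors' workflow.

```python
from pathlib import Path

# Hypothetical Snakefile text: rule names, paths and shell commands are
# illustrative assumptions, not taken from the original slides.
SNAKEFILE = '''\
rule bwa_map:
    input:
        ref="data/genome.fa",
        reads="data/samples/A.fastq"
    output:
        "mapped/A.bam"
    shell:
        "bwa mem {input.ref} {input.reads} | samtools view -Sb - > {output}"

rule sort_bam:
    input:
        "mapped/A.bam"
    output:
        "sorted/A.bam"
    shell:
        "samtools sort -o {output} {input}"
'''


def write_snakefile(directory="."):
    """Write the sketch to <directory>/Snakefile and return its path."""
    path = Path(directory) / "Snakefile"
    path.write_text(SNAKEFILE)
    return path


if __name__ == "__main__":
    print(write_snakefile())
```

Each rule declares its input files, its output files, and the command that turns one into the other. With Snakemake installed, asking for `sorted/A.bam` (for example `snakemake --cores 1 sorted/A.bam`) should lead Snakemake to work out that `bwa_map` has to run before `sort_bam`.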

Freeform Python, rule all, wildcards
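
In a Snakefile, `rule all` conventionally lists the final target files as its input, often built with Snakemake's `expand()` helper, and wildcards such as `{sample}` let one rule serve many files. The sketch below imitates, in plain Python, how a requested file name could be matched against an output pattern to recover the wildcard values. It is a simplified illustration rather than Snakemake's own code: the sample names and the pattern are hypothetical, and each wildcard is assumed to appear only once in the pattern.

```python
import re


def match_wildcards(pattern, filename):
    """Return the wildcard values if `filename` fits `pattern`, else None.

    Simplification: every {name} may appear only once in `pattern`.
    """
    regex = ""
    pos = 0
    for m in re.finditer(r"\{(\w+)\}", pattern):
        # Literal text between wildcards must match exactly.
        regex += re.escape(pattern[pos:m.start()])
        # A wildcard matches any non-empty string.
        regex += f"(?P<{m.group(1)}>.+)"
        pos = m.end()
    regex += re.escape(pattern[pos:])
    found = re.fullmatch(regex, filename)
    return found.groupdict() if found else None


# Freeform Python: a hypothetical sample list drives the target list,
# which is what `rule all` would name as its input.
SAMPLES = ["A", "B"]
targets = [f"sorted/{s}.bam" for s in SAMPLES]

if __name__ == "__main__":
    for t in targets:
        print(t, match_wildcards("sorted/{sample}.bam", t))
```

A target that does not fit the pattern, such as `mapped/A.bam` against `sorted/{sample}.bam`, yields `None`, which corresponds to a rule that cannot produce the requested file.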

Running snakemake on the cluster

Cluster settings: cluster.json
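
The content of the original cluster.json was shown only as an image. The sketch below illustrates the general idea under stated assumptions: a `__default__` entry holds settings for every rule, per-rule entries override them, and the merged values fill a submit command. The key names (`n`, `time`, `mem`), the rule names, and the use of LSF's `bsub` as the scheduler are all assumptions made for this example.

```python
import json

# Hypothetical cluster.json content. "__default__" is the Snakemake
# convention for fallback settings; the other keys are assumptions.
CLUSTER_CONFIG = {
    "__default__": {"n": 1, "time": "4:00", "mem": 2000},
    "bwa_map": {"n": 8, "mem": 4000},
}


def settings_for(rule, config=CLUSTER_CONFIG):
    """Merge the defaults with the overrides for one rule."""
    merged = dict(config["__default__"])
    merged.update(config.get(rule, {}))
    return merged


def submit_command(rule, config=CLUSTER_CONFIG):
    """Fill an LSF-style submit template with the settings of `rule`."""
    s = settings_for(rule, config)
    return 'bsub -n {n} -W {time} -R "rusage[mem={mem}]"'.format(**s)


if __name__ == "__main__":
    with open("cluster.json", "w") as fh:
        json.dump(CLUSTER_CONFIG, fh, indent=2)
    print(submit_command("bwa_map"))   # rule with overrides
    print(submit_command("sort_bam"))  # rule that falls back to defaults
```

In Snakemake versions that still support cluster configuration files (assumed here: releases before 8), the same placeholders would be written as `{cluster.n}`, `{cluster.time}`, and `{cluster.mem}` inside the string passed to `--cluster`, together with `--cluster-config cluster.json` and `--jobs` to cap the number of simultaneously submitted jobs.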

What do we actually do in snakemake?
• Directed acyclic graph of jobs
• Can be seen with
    snakemake --dag > graph.dag
    dot -Tpdf graph.dag > aaa.pdf
• Visualizes dependencies of rules
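
To make the idea of a dependency graph concrete, the sketch below derives rule-to-rule edges from declared inputs and outputs and prints them in Graphviz DOT syntax. It is a deliberately simplified illustration (no wildcards, hypothetical rules and file names) and not how Snakemake itself builds its job graph.

```python
# Hypothetical rules: name -> (input files, output files).
RULES = {
    "bwa_map": (["data/A.fastq"], ["mapped/A.bam"]),
    "sort_bam": (["mapped/A.bam"], ["sorted/A.bam"]),
    "all": (["sorted/A.bam"], []),
}


def dependencies(rules):
    """Return sorted (producer, consumer) pairs linked by a shared file."""
    # Map every output file to the rule that creates it.
    producer = {out: name for name, (_, outs) in rules.items() for out in outs}
    edges = set()
    for name, (ins, _) in rules.items():
        for f in ins:
            if f in producer:  # files nobody produces are raw inputs
                edges.add((producer[f], name))
    return sorted(edges)


def to_dot(rules):
    """Render the rule dependencies as Graphviz DOT text."""
    lines = ["digraph rules {"]
    lines += [f'    "{a}" -> "{b}";' for a, b in dependencies(rules)]
    lines.append("}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_dot(RULES))
```

The printed text can be saved to a file and rendered with `dot -Tpdf`, in the same way as the output of `snakemake --dag`.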

What do we actually do in snakemake?

Other examples of rule graphs

Snakemake happily finished

Advantages and difficulties of snakemake
• Reproducibility
• Control over workflow
• Re-running
• Encapsulation of typical tasks
• “One-click” starting of a large process
• You need to “speak python”
• Learning curve steep at the beginning

Snakemake