Introduction to Linux and R


Justification for Linux
• Linux is one of several variants of Unix; others include Solaris and macOS
• There are several “flavours” of Linux, e.g. Ubuntu, Debian, Fedora, Red Hat
• Many Bioinformatics tools are only available for Linux
• Most large computers used to analyse large data sets run Unix or Linux
• There are many commands and programs for managing large files
• You can run a Linux machine from anywhere: if you have a large dataset that you cannot analyse on your laptop, you can run the analysis on a Linux machine on another continent, controlled from your laptop

Learning Linux
• Graphical interfaces are available but we use the command line
• You can access remote Linux machines from a Windows machine with “putty”, available from http://www.putty.org/
• There is much to learn but a lot of help on the internet

Virtual Machines
• Windows: host operating system
• VirtualBox: virtual machine
• Xubuntu: guest operating system
• Docker container: Python workflow for fastStructure; other software, e.g. Beagle, Plink, vcftools

Docker provides a way to run applications securely isolated in a container, packaged with all their dependencies and libraries. Docker is free from https://docs.docker.com/

The H3Africa project is developing workflows for GWAS for distribution on the Cloud in Docker containers: http://h3abionet.org/17-h3abionetcourses/h3abionet-coursesupcoming/266-h3abionet-cloudcomputing-hackathon

Cut and Paste Between Windows and Linux
• The most recent set of Linux installation notes has instructions for enabling cut and paste between Windows and Linux
• If you have already done the installation but not enabled cut and paste between Windows and Linux, Google “virtualbox cut and paste between host and guest” to get instructions

Getting a terminal
To get the Application menu, right click on the desktop or press Alt-F1.

R
• R is a programming language popular with statisticians
• Many Bioinformatics packages are written in R
• As a programming language R is complex to learn
• Very little knowledge of R is sufficient to run many R packages
• We will mainly use R because it can generate good graphics
• You should have installed R on your computers
• There is a lot of help online

Plink is one program with many options; R is a programming language in which anyone can write a program.

Lists, Arrays and Vectors
Used for storing sets of similar information.

Array of integers (zero-based index)
  Index:  0    1    2    3    4    5    6    7
  Value:  672  242  530  1    501  972  417  180

Array of strings (one-based index, as in R)
  Index:  1      2       3      4      5       6          7      8
  Value:  Apple  orange  mango  melon  banana  pineapple  guava  lemon

• If I enter Fruit[2] the program will return orange
• If I enter Fruit[2] = "melon" the program will change the contents of cell 2 to melon

The R tutorial will teach you how to create vectors and extract data from them in R. Arrays can have many dimensions. In R, two-dimensional arrays are called matrices. Other languages use the same concept but different syntax.
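The same idea can be tried in the bash shell, which uses a zero-based index, unlike R. This is only a minimal sketch, assuming bash rather than plain sh; the variable names fruit, second and count are illustrative and not part of the course material.

```shell
#!/usr/bin/env bash
# Bash arrays are zero-based, so the second element has index 1.
fruit=(apple orange mango melon banana pineapple guava lemon)

second="${fruit[1]}"      # second element (R would write fruit[2])
echo "$second"

fruit[1]="melon"          # overwrite the contents of the second cell
echo "${fruit[1]}"

count="${#fruit[@]}"      # number of elements stored in the array
echo "$count"
```

Note that the same position is written as index 1 here and index 2 in R; mixing up the two conventions is a common source of off-by-one mistakes.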

Bash scripting
• The shell is an application that allows users to communicate with the computer
• The “bash” shell is the most widely used shell for Linux
• The shell can be used to write simple programs, called shell scripts
• We will use a couple of scripts to run the same command many times, each time with different parameters
• Scripts are very fussy about the exact use of capitals and punctuation
• Bad things can happen if you copy from Word documents into Linux commands: Word may replace plain quotes and dashes with typographic ones, so punctuation may need to be re-entered
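Running one command repeatedly with a different parameter each time might look like the sketch below. It assumes bash; the population codes and the _summary.txt file names are placeholders invented for illustration, and the loop only prints messages rather than running a real analysis.

```shell
#!/usr/bin/env bash
# Run the same command once for each parameter value in the list.
results=()
for pop in bh br cr
do
    # Build an output file name from the current loop value.
    results+=("${pop}_summary.txt")
    echo "Processing population: $pop"
done

# Capitals matter: pop and Pop are two different variables.
Pop="a different variable"
echo "pop is '$pop' but Pop is '$Pop'"
```

After the loop finishes, pop still holds the last value in the list, while Pop is unrelated to it.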

Example bash shell script

    eucalypt="eucalypt.clean3"
    # Beginning of loop
    for first in bh br cr dr kh lj lr qv rt
    do
        # Keep the lines matching this population code and write family ID, individual ID and cluster
        grep $first ${eucalypt}.ped | awk '{print $1, $2, $1}' > within.txt
        # Calculate Fst using the clusters defined in within.txt
        plink --file $eucalypt --fst --within within.txt --allow-no-sex --out temp
    done

To learn how to write loops in the bash shell, Google ‘bash loops’.
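In the loop above every pass writes to within.txt and to the same --out prefix, so only the output of the last pass survives. If the result for each population needs to be kept, one option is to put the loop variable into the file names. The sketch below is a dry run under that assumption: it only builds and prints the plink commands, so it runs without plink or the data. The names within_${first}.txt and fst_${first} are hypothetical, not taken from the course material.

```shell
#!/usr/bin/env bash
# Dry run: build and print one plink command per population
# instead of executing it.
eucalypt="eucalypt.clean3"
commands=()
for first in bh br cr dr kh lj lr qv rt
do
    # Assumption: separate per-population file names are wanted,
    # so that one pass does not overwrite the previous one.
    cmd="plink --file $eucalypt --fst --within within_${first}.txt --allow-no-sex --out fst_${first}"
    commands+=("$cmd")
    echo "$cmd"
done
```

Printing commands before running them is a cheap way to check that the loop variable expands as expected.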

Exercises
• Do the online Linux tutorials 1-6 at http://www.ee.surrey.ac.uk/Teaching/Unix/
• Complete the Rpractical.doc