Bioinformatics Programming
EE, NCKU
Tien-Hao Chang (Darby Chang)
In the last slides
- Terminology
  - Unix, Linux and UNIX
  - Linux vs. Windows
- Unix-like system
  - commands, permissions, shell
  - cd, ls, du, ln, sort, find, tar, wc, …
- Ubuntu
- Shell script
- gcc
Other features worth mentioning
http://brownblog.info/wp-content/uploads/2008/02/ignore.JPG
More about Shells
Shells: Shell Startup
- The file .profile (sh) or .login (csh) is used at login to:
  - set the path
  - define functions
  - set terminal parameters (stty)
  - set the terminal type
- Other shells read other files, such as .bash_profile and .bashrc. Because a shell is itself a program, you can read its man page ($ man bash) for more information.
Shells: Sample .profile File
    PATH=/usr/bin:/usr/ucb:/usr/local/bin:.
    export PATH
    PS1="{ `hostname` `whoami` }"
    ls() { /bin/ls -sbF "$@"; }
    ll() { ls -al "$@"; }
    stty erase ^H
    eval `tset -Q -s -m ':?xterm'`
    umask 077
Shells: .login and .cshrc
- .login runs only at login time
  - tell whether you have mail
  - tell who else is online
  - configure terminal settings
- .cshrc runs whenever the shell starts
  - set environment and shell variables
  - set aliases
- Other advanced shells inherit similar concepts. For example, the corresponding files in bash are .bash_profile and .bashrc, respectively.
Shells: Job Control
- &     append to the command line to put a job into the background
- ^Z    stop (suspend) the job that is running
- bg    continue a stopped job in the background
- fg    return the job to the foreground
- jobs  list background jobs
- kill  kill a background job (for example, $ kill %1)
Shells: History
- C shell, Korn shell and others retain information about commands previously executed within the shell
  - Use the history and savehist variables to set the number of commands retained
  - in .cshrc: set history=100 savehist=50
  - saved in ~/.history between logins
- Examples
  - $ history nn   # prints the last nn commands
  - $ !!           # repeats the last command
  - $ !nn          # repeats the command numbered nn
  - $ !string      # repeats the latest command starting with string
Shells: Changing your Shell
- $ chsh /bin/sh (on Linux systems such as Ubuntu: $ chsh -s /bin/sh)
- The new shell must be given as a full path name
- Commonly available standard shells:
  - Bourne  /bin/sh
  - Korn    /bin/ksh
  - C       /bin/csh
- Alternate shells should be listed in /etc/shells
- tcsh (/bin/tcsh) and bash (/bin/bash) are the most common alternatives
- To try some other shell, type its name at the system prompt
  - useful when you want to check compatibility
  - type exit to return to your normal shell
Special Unix Features
Special Unix Features: I/O Redirection and Piping
- Output redirection to a file
- Input redirection from a file
- Piping: the output of one command becomes the input of a subsequent command
- Standard file descriptors
  - stdin   standard input to the program
  - stdout  standard output from the program
  - stderr  standard error output
- >   redirect standard output to file
  - $ command > outfile
  - $ ls > foo
- >>  append standard output to file
  - $ command >> outfile
  - $ echo 'foo' >> foo
- <   input redirection from file
  - $ command < infile
  - $ sort < foo
  - less useful since most commands accept filenames as arguments
- |   pipe output to another command
  - $ command1 | command2
  - $ ls | sort
- csh syntax
  - >&    redirect stdout and stderr to file
  - >>&   append stdout and stderr to file
  - |&    pipe stdout and stderr to command
- sh syntax
  - 2> file          redirect stderr to file
  - > file 2>&1      redirect both stdout and stderr to file
  - >> file 2>&1     append both stdout and stderr to file
  - 2>&1 | command   pipe stdout and stderr to command
- To redirect stdout and stderr to two separate files:
  - $ (command > outfile) >& errfile   (csh)
  - $ command > outfile 2> errfile     (sh)
- To discard stderr:
  - $ command 2> /dev/null
  - /dev/null is a "black hole" for bits
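
The sh-style redirections above can be tried safely in a scratch directory. This is only a sketch: the helper function talk and the file names out.txt, err.txt and both.txt are made up for illustration and are not part of the original slides.

```sh
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)

# Hypothetical helper: writes one line to stdout and one line to stderr.
talk() {
    echo "to stdout"
    echo "to stderr" >&2
}

talk > "$workdir/out.txt" 2> "$workdir/err.txt"   # stdout and stderr to two separate files
talk > "$workdir/both.txt" 2>&1                   # both streams into one file
talk >> "$workdir/both.txt" 2>&1                  # append both streams to the same file
talk 2> /dev/null | wc -l                         # discard stderr, count the stdout lines
```

Note that the order matters in sh: 2>&1 must come after the > file part, because it means "send stderr to wherever stdout points right now".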
Special Unix Features: Other Symbols
- ;    command separator
- &    run the command in the background
- &&   run the following command only if the previous command completes successfully
- ||   run the following command only if the previous command did not complete successfully
- ()   grouping; commands within parentheses are executed in a subshell
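
A small sketch of how these symbols behave in sh. The directory and file names (results, empty.txt) and the variable names are assumptions chosen for the example.

```sh
workdir=$(mktemp -d)

# && runs the right-hand command only when the left-hand one succeeds.
mkdir "$workdir/results" && made_dir=yes

# || runs the right-hand command only when the left-hand one fails.
# grep -q exits with a non-zero status here because the file is empty.
: > "$workdir/empty.txt"
grep -q pattern "$workdir/empty.txt" || found=no

# Parentheses run in a subshell, so the cd does not affect the caller.
before=$(pwd)
(cd "$workdir" && ls > /dev/null)
after=$(pwd)

# ; simply separates two commands on one line.
echo "$made_dir"; echo "$found"
```

Because the cd happens inside the subshell, before and after hold the same directory.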
Special Unix Features: Quoting
- \    escape the following character (take it literally)
  - $ echo \"\"
- ''   do not allow any special meaning to characters within single quotes (except ! in csh)
  - $ echo 'shell is $SHELL'
- ""   allow variable and command substitution inside double quotes (does not disable $ and \ within the string)
  - $ echo "shell is $SHELL"
- `command`   backquotes take the output of command and substitute it into the command line (works inside double quotes)
  - $ echo `ls`
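
The four quoting rules can be compared side by side. This sketch assumes an sh-compatible shell; the variable names and the value cat are made up for the example.

```sh
animal=cat

single='value is $animal'      # single quotes: $ stays literal
double="value is $animal"      # double quotes: the variable is substituted
escaped="value is \$animal"    # backslash: the next character is taken literally
substituted="last part is `basename /usr/local/bin`"   # backquotes: command output is inserted

echo "$single"
echo "$double"
echo "$escaped"
echo "$substituted"
```

The first and third lines print the text $animal unchanged, while the second prints the value of the variable.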
Special Unix Features: Wild Cards
- ?            match any single character
- *            match any string of zero or more characters
- [abc]        match any one of the enclosed characters
- [a-z]        match any character in the range a through z
- [!def] (sh), [^def] (csh)   match any character not one of the enclosed characters
- {ab,bc,cd}   match any set of characters separated by comma
- ~            user's own home directory
- ~user        home directory of the specified user
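
A quick way to see what a wildcard matches is to hand it to echo. The sketch below uses only the patterns that plain sh understands (the {ab,bc,cd} form is not available in every shell); the file names are made up for the example.

```sh
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch a1.txt a2.txt b1.txt ab.log

any_txt=$(echo *.txt)        # * matches any string of characters
one_char=$(echo a?.txt)      # ? matches exactly one character
from_set=$(echo [ab]1.txt)   # one of the enclosed characters
not_a=$(echo [!a]*)          # sh syntax: names that do not start with a

echo "$any_txt"
echo "$one_char"
echo "$from_set"
echo "$not_a"
```

The shell expands the pattern before echo runs, so echo only ever sees the list of matching file names.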
Remember: these symbols may vary from shell to shell; see the man pages
Any Questions?
Screen
No Alt-Tab to switch between programs
http://www.mergersandinquisitions.com/wp-content/uploads/2008/05/alttab-key.jpg
Screen
- Screen is best described as a terminal multiplexer
- It saves you from opening multiple terminal emulators
- More importantly, it lets you live with sessions rather than terminals
- To start
  - $ screen
Screen: Commands
- In screen, all commands begin with ^a (Ctrl+a)
  - ?                help
  - c                create a new window; each created window is assigned a number
  - w                list current windows
  - [0-9]            switch window by number
  - -                switch to an empty window (boss coming)
  - n, [space]       switch to the next window
  - p, [backspace]   switch to the previous window
  - a, ^a            switch to the last window (recall button)
  - A                name the current window
  - ', "             switch window by name
  - x, ^x            lock the current window
Screen: Sessions
- Each screen (and all the associated windows) is a session
  - When you type 'screen', you start a session. Then you use '^a c' to create some windows. The status (screen, connections, …) of all these windows is kept by the session.
- Commands
  - ^a d          detach the current session
  - ^a DD         detach the current session and log out
  - ^a z, ^a ^z   put the current session in the background; you can use 'fg' to restore it
- Options
  - $ screen -ls        list sessions
  - $ screen -d [pid]   detach a remote session (which is attached, but not to the current terminal)
  - $ screen -D [pid]   detach and log out a remote session
  - $ screen -r [pid]   reattach a session
  - $ screen -R         reattach the youngest session, create a new one if necessary
  - $ screen -x         attach to a not detached screen session (multi display mode)
  - $ screen -d -r      reattach a session and if necessary detach it first
  - $ screen -D -R      attach here and now
    - In detail this means: if a session is running, then reattach. If necessary, detach and log out remotely first. If it was not running, create it and notify the user. This is the author's favorite.
  - $ screen -wipe      clean dead sessions
Any Questions?
What is the relation among sessions, terminals and windows in screen? One to many, many to one, or many to many? A drawing would help.
Text Processing
Text Processing: Editors
- vi: Visual Editor
  - No alternative is offered here (unless you choose emacs); this is the best way to force you to learn vi
  - Pronounce both letters: V-I, never "Vy"
  - Three modes
    - Command mode ("beep mode")
    - Insert mode ("no beep mode")
    - Command line mode ("colon mode")
  - Commands are generally case sensitive
vi: Cursor Movement
- h, j, k, l   alternates for arrows
- [n]h   left [n] character(s)
- [n]j   down [n] line(s)
- [n]k   up [n] line(s)
- [n]l   right [n] character(s)
- ^F     forward one screen
- ^B     back one screen
- ^D     down half screen
- ^U     up half screen
- G      go to last line of file
- $      end of current line
- ^      beginning of text on current line
- 0      beginning of current line
- [n]w   forward [n] word(s)
- [n]b   back [n] word(s)
vi: Inserting/Deleting Text
- i      insert text before the cursor
- a      append text after the cursor
- I      insert text at beginning of line
- A      append text at end of line
- o      open new line after current line
- O      open new line before current line
- dd     delete current line
- [n]dd  delete [n] line(s)
- [n]dw  delete [n] word(s)
- D      delete from cursor to end of line
- x      delete current character
- [n]x   delete [n] characters
- X      delete previous character (like backspace)
- Confused? Remember the range, command, unit philosophy: in 3dw, 3 is the range, d is the command and w is the unit
vi: Change Commands
- cw      change word
- [n]cw   change next [n] word(s)
- c$      change from cursor to end of line
- ~       change case of character
- J       join current line and next line
- u       undo the last command just done
- .       repeat last change
- [n]yy   yank [n] line(s) to buffer
- [n]yw   yank [n] word(s) to buffer
- p       put yanked or deleted text after cursor
- P       put yanked or deleted text before cursor
vi: File Manipulation
- :w    write changes to file
- :wq   write changes and quit
- :w!   force overwrite of file
- :q    quit if no changes made
- :q!   quit without saving changes
- :!    shell escape
- :r!   insert result of shell command at cursor position
Any Questions?
How do you change the next 10 lines?
What is dG?
Text Processing Commands
Text Processing Commands: grep
- grep: search the argument (a file) for all occurrences of the search string
- grep [option]... regexp file
  - search for the number 15
    - $ grep '15' file
  - count the number of lines matching the search criterion
    - $ grep -c '15' file
  - search for lines not matching the search criterion
    - $ grep -v '15' file
  - search for 11, 12 or 15
    - $ grep '1[125]' file
  - search for all lines that begin with a space
    - $ grep '^ ' file
  - search for lines that begin with the characters 1 through 9
    - $ grep '^[1-9]' file
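
The grep patterns above can be checked against a small test file. This is a sketch only: stock.txt and its contents are invented for the example.

```sh
workdir=$(mktemp -d)
printf '%s\n' '11 apples' '12 pears' '15 plums' ' 7 figs' '20 kiwis' > "$workdir/stock.txt"

matches=$(grep -c '1[125]' "$workdir/stock.txt")         # lines containing 11, 12 or 15
others=$(grep -vc '1[125]' "$workdir/stock.txt")         # lines that do not match
leading_space=$(grep '^ ' "$workdir/stock.txt")          # lines that begin with a space
starts_1_to_9=$(grep -c '^[1-9]' "$workdir/stock.txt")   # lines that begin with 1 through 9

echo "$matches $others $starts_1_to_9"
echo "$leading_space"
```

Keeping the pattern in single quotes stops the shell from treating [ ] and ^ as its own special characters before grep sees them.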
Text Processing Commands: Advanced grep Examples
- $ wget -q -O- http://url.to.web/ | grep 'a href' | head
  - list the first 10 lines containing links in a given web page
  - wget
  - power of pipe
- $ grep MemTotal /proc/meminfo
- $ grep 'model name' /proc/cpuinfo
  - show RAM and CPU info
  - everything is a file
- $ set | grep $USER
  - set
  - using environment variables
Text Processing Commands: sed
- sed: stream editor for editing files from a script or the command line
- sed [option]... edit_command file
  - change every occurrence of a comma into a comma followed by a space
    - $ sed 's/,/, /g' file
  - filter for lines containing 'Date: ' and 'From: ' and replace these without the colon (perform multiple operations)
    - $ sed -e 's/Date: /Date /' -e 's/From: /From /'
  - print only those lines of the file from the one beginning with "Date: " up to, and including, the one beginning with "Name: "
    - $ sed -n '/^Date: /,/^Name: /p'
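
The three sed commands above can be tried on a tiny mail-like file. The file mail.txt and its contents are assumptions made up for this sketch.

```sh
workdir=$(mktemp -d)
printf '%s\n' 'Subject: hi' 'Date: Tue' 'From: a,b,c' 'Name: Darby' 'Body' > "$workdir/mail.txt"

# Put a space after every comma, then show only the From line.
spaced=$(sed 's/,/, /g' "$workdir/mail.txt" | grep '^From')

# Two edits in one run; count the lines that still contain ": ".
no_colon=$(sed -e 's/Date: /Date /' -e 's/From: /From /' "$workdir/mail.txt" | grep -c ': ')

# Print only the range from the Date line through the Name line.
ranged=$(sed -n '/^Date: /,/^Name: /p' "$workdir/mail.txt" | wc -l | tr -d ' ')

echo "$spaced"
echo "$no_colon $ranged"
```

With -n, sed prints nothing by default, so only the lines selected by the p command appear.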
Text Processing Commands: awk
- awk: scan for patterns in a file and process the results
- awk program file
  - $ cat /etc/passwd | tr a-z A-Z | awk -F: '{printf ("user %-16s %c %5d\n", $1, $2, $3)}'
  - $ awk 'BEGIN { x=0 } /^$/ { x=x+1 } END { print "I found " x " blank lines. :)" }' file
- awk is good at column processing
- awk is almost a scripting language
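
Both awk ideas, column processing and pattern plus action, fit in a short sketch. The file users.txt is invented for the example and only imitates the colon-separated layout of /etc/passwd.

```sh
workdir=$(mktemp -d)
printf '%s\n' 'root:x:0' '' 'darby:x:1000' '' '' > "$workdir/users.txt"

# Column processing: split on ':' and print the first and third fields.
columns=$(awk -F: 'NF == 3 { printf("%s has uid %d\n", $1, $3) }' "$workdir/users.txt")

# Pattern plus action: count the blank lines, as in the second example above.
blank=$(awk 'BEGIN { x = 0 } /^$/ { x = x + 1 } END { print x }' "$workdir/users.txt")

echo "$columns"
echo "$blank"
```

The BEGIN block runs before any input is read and the END block after the last line, which is what makes awk feel like a small scripting language.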
Any Questions?
How much do you remember?
2 Things You Should Learn
- The way to interpret these magic commands
- Remember that there are such facilities
What to do first when you see this?
$ sed -n 1,10p
Google? Sure. But you should try man (or --help) first.
http://farm1.static.flickr.com/34/108805307_c43af20f59.jpg
$ sed -n 1,10p
- It is similar to head
- Then, why not use head?
- Suppose you want the ninth and tenth lines
- $ sed -n 9,10p
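
To see why sed is handy here, compare it with the head-and-tail pipeline that produces the same two lines. This sketch assumes the seq command is available to generate the numbers 1 through 20 as test input.

```sh
# Lines 9 and 10 with sed alone...
with_sed=$(seq 1 20 | sed -n '9,10p')

# ...and the same lines with head and tail chained by a pipe.
with_head_tail=$(seq 1 20 | head -n 10 | tail -n 2)

echo "$with_sed"
echo "$with_head_tail"
```

Both approaches work; the sed form states the line range directly, while the head and tail form needs a small calculation.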
UNIX: A Final Reminder
- How do you remember all of these?
- Just use them! Force yourself to use them!
- Briefly speaking
  - want to search some lines? try grep
  - want to edit some lines? try sed
  - want to handle columns? try awk
  - for something an awk script cannot do (I doubt there is one), try a perl one-liner
Any Questions?
Our UNIX tutorial is finished
Terminology: Programming
What is programming? What is a programming language? What is a language?
Programming vs. Language
- In general, both can be used to
  - describe how to accomplish a particular task
- However, to learn programming is
  - to learn the method (step by step)
- On the other hand, to learn a (programming) language is
  - to learn how to express the method in commonly readable symbols
Programming: Different Languages
- Express the same concept with different symbols
- Chinese and English
  - 想睡覺
  - feel sleepy
- English vs. C
  - increase i by 1
  - { ++i; }
- Programming languages are just languages that are commonly readable by computers
Programming
- Algorithm + Data Structure
- Some people think learning programming means learning methods such as sorting. That is not wrong.
  - An algorithm is a finite set of instructions that accomplishes a particular task
- However, choosing different data structures may result in extremely different algorithms
- Consider a program for a scoring system
  - it supports insert, delete, min, max, average
  - two data structures are considered
    - a) array
    - b) sorted array
For which of these operations would the algorithm change between a) and b)? Which operations are faster in a), faster in b), or equal? Which algorithm + data structure combination is better?
Dealer
- In
- Out: 13 cards
- Requirement
  - the cards are chosen by chance
  - but they are displayed in order
  - each card is represented by two characters
  - S1 means spade one, HJ means heart J, ...
  - no need to implement a sort algorithm
  - using C would be the best
- Bonus
  - output 52 cards randomly split into 4 sets
Deadline: 2010/3/16 23:59
Zip your code, a step-by-step README of how to execute the code, and anything worthy of extra credit. Email it to darby@ee.ncku.edu.tw.
qsort
    #include <stdio.h>
    #include <stdlib.h>

    /* Comparison callback for qsort: receives pointers to two array elements. */
    int comp(const void *a, const void *b)
    {
        int x = *(const int *)a;
        int y = *(const int *)b;
        if (x < y) return -1;
        if (x > y) return 1;
        return 0;
    }

    int main(int argc, char *argv[])
    {
        int numbers[10] = {1892, 45, 200, -98, 4087, 5, -12345, 1087, 88, -100000};
        int i;

        /* sort the array */
        qsort(numbers, 10, sizeof(int), comp);

        /* print all 10 elements */
        for (i = 0; i < 10; i++)
            printf("Number = %d\n", numbers[i]);
        return 0;
    }