Things to know before using microarray data • Organism • Experimental design • Sample list (sample distribution, sample size) • Platform • Important!
Data levels and data types • https://tcga-data.nci.nih.gov/tcga/tcgaDataType.jsp
Print • print(matrix(c(1, 2, 3, 4), 2, 2)) • print(list("a", "b", "c"))
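Basic functions • ls() # list the objects in the workspace • rm() # remove objects • c() # create a vector; c() is a function • mode() # storage mode of an object • class() # class of an object • mean(x) • median(x) • sd(x) • var(x) • cor(x, y) • cov(x, y)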
Creating Sequences • 1:5 • 5:1 • seq(from=0, to=20, by=5) • 1.1:10.1 • 1.1:10.3 • a <- rep(0, 3) • rep(c(1, 2, a), 2)
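The sequence expressions above can be checked with a short sketch like the following. The variable names (up, down, by5, s1, s2, r) are illustrative and do not come from the slides.

```r
# The colon operator steps by 1 from the left value and never goes past the right value.
up <- 1:5
down <- 5:1
by5 <- seq(from = 0, to = 20, by = 5)
print(up)     # 1 2 3 4 5
print(down)   # 5 4 3 2 1
print(by5)    # 0 5 10 15 20

# With a non-integer start, both expressions stop at 10.1,
# because the next step (11.1) would exceed either upper limit.
s1 <- 1.1:10.1
s2 <- 1.1:10.3
print(length(s1))   # 10
print(length(s2))   # 10

# rep() repeats a whole vector; c() flattens the nested vector a first.
a <- rep(0, 3)
r <- rep(c(1, 2, a), 2)
print(r)            # 1 2 0 0 0 1 2 0 0 0
```

Note that 1.1:10.3 does not end at 10.3: the step is always 1, so the upper limit only acts as a bound.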
Basic calculations • + - * / %% ^ • %*% # matrix multiplication • log(x) • sin(x) • exp(x) • Constants and special values: pi, exp(1) (Euler's number e), Inf, NA
Data mode: Physical Type
> mode(3.1415) # Mode of a number
[1] "numeric"
> mode(c(2.7182, 3.1415)) # Mode of a vector of numbers
[1] "numeric"
> mode("Moe") # Mode of a character string
[1] "character"
Data Class: Abstract type • scalar • array (vector) • matrix • From array to matrix • factor (looks like a vector but has special properties; used for categorical variables or grouping) • data.frame
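As a minimal sketch of the "from array to matrix" step and of what makes a factor special, the following may be useful. The stage labels are made-up example values, and the variable names are not from the slides.

```r
# A plain vector (the one-dimensional "array" of the list above).
v <- 1:6

# From vector to matrix: either set dim() or call matrix().
# Both fill the matrix column by column.
m1 <- v
dim(m1) <- c(2, 3)
m2 <- matrix(v, nrow = 2, ncol = 3)
print(all(m1 == m2))     # TRUE

# A factor stores categorical labels as integer codes plus a set of levels.
stage <- factor(c("II", "III", "II", "I"))   # hypothetical labels
print(levels(stage))     # "I" "II" "III"
print(table(stage))      # number of samples per level
print(mode(stage))       # "numeric": the physical storage
print(class(stage))      # "factor": the abstract type
```

The last two lines show why mode and class are worth distinguishing: the same object is stored as numbers but behaves as a categorical variable.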
data.frame • Same data mode in each column • Unique row/column names (rownames, colnames) • One row of a data.frame is a data.frame • as.data.frame(****)
matrix • Same data mode in the whole matrix • Can have repeated row/column names • One row of a matrix is an array (vector) • as.matrix(****)
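The differences listed above can be observed directly. This is a small sketch on toy data; the column names age and stage are hypothetical.

```r
# Toy data; stringsAsFactors = FALSE keeps the stage column as character
# on every R version.
df <- data.frame(age = c(61, 54), stage = c("II", "III"),
                 stringsAsFactors = FALSE)
m <- matrix(1:4, nrow = 2, ncol = 2)

# One row of a data.frame is still a data.frame.
print(class(df[1, ]))       # "data.frame"

# One row of a matrix drops to a plain vector.
print(is.vector(m[1, ]))    # TRUE

# Each data.frame column has its own mode; a matrix has a single mode.
print(sapply(df, mode))     # age "numeric", stage "character"
mixed <- as.matrix(df)      # every value is converted to character
print(mode(mixed))          # "character"
```

Converting a mixed data.frame with as.matrix() therefore turns the numeric columns into text, which is worth checking before doing arithmetic on the result.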
Data types handled in this course • Clinical data -> data.frame • Experimental data -> data.frame or matrix – Microarray data – RNA-seq data – Somatic mutation data – Protein array – DNA methylation data
Data combining • cbind – Combine data by column • rbind – Combine data by row • E.g.
a <- matrix(0, 2, 2)
b <- matrix(1, 2, 2)
cbind(a, b)
rbind(a, b)
length • a <- c(1:5) • length(a)
apply • Apply Functions Over Array Margins • apply(DATA, MARGIN, FUNCTION, ...) – MARGIN = 1 for rows; 2 for columns • E.g.
m <- matrix(c(1:10, 11:20), nrow = 10, ncol = 2)
apply(m, 1, mean)
apply(m, 2, mean)
Finding patterns • The which command • which(****): **** should be a logical expression • which(****) returns the indices of the TRUE elements of that expression • E.g.
x <- floor(10*runif(10))
x
which(x<5)
x[which(x<5)]
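The which example above uses random numbers, so its output changes on every run. A sketch with a fixed vector (values chosen here for illustration) makes the returned indices easier to follow:

```r
# Fixed example values so the result is reproducible.
x <- c(7, 2, 9, 4, 0, 5)

# which() returns the positions where the logical expression is TRUE.
idx <- which(x < 5)
print(idx)        # 2 4 5

# Indexing with those positions returns the matching values.
print(x[idx])     # 2 4 0

# Indexing directly with the logical vector gives the same values here.
print(identical(x[idx], x[x < 5]))   # TRUE
```

The two forms differ when x contains NA: which() skips NA positions, whereas logical indexing returns NA for them.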
For loop: http://en.wikipedia.org/wiki/For_loop • In computer science, a for loop is a programming language statement that allows code to be executed repeatedly. • Question: calculate the sum of all the values in the vector x <- floor(10*runif(10))
For loop • Real computer program! • E.g.
for(i in 1:100){
  print("Hello world!")
  print(i*i)
}
For loop • for(*** in ***){} • for(VARIABLE in TARGETSET){} • for(i in 1:100){}
x <- floor(10*runif(10))
total_x <- 0
for(i in 1:length(x))
{
  print(i)
  print(x[i])
  total_x <- total_x + x[i]
}
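The loop above can be checked against R's built-in sum(). The sketch below fixes the random seed (an addition made here so the run is reproducible) and uses seq_along(x) instead of 1:length(x), which also behaves correctly if x is empty.

```r
# Fix the seed so the random vector is the same on every run.
set.seed(1)
x <- floor(10 * runif(10))

# Accumulate the sum one element at a time.
total_x <- 0
for (i in seq_along(x)) {
  total_x <- total_x + x[i]
}

# The loop result should agree with the vectorised built-in.
print(total_x)
print(total_x == sum(x))
```

In everyday R code sum(x) is the usual choice; the explicit loop is mainly useful for seeing how the accumulation works.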
Working directory • getwd() • setwd("****") • list.files() • load("****") • save.image("****")
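A minimal sketch of a save-and-reload round trip with these functions follows. It works inside a temporary directory, the file name session.RData is hypothetical, and save() is swapped in for save.image() so that only one object is written; save.image() writes the whole workspace in the same file format.

```r
# Remember the current directory and switch to a temporary one,
# so that no real project files are touched.
old_dir <- getwd()
setwd(tempdir())

# Store one object in a file (save.image("session.RData") would store all objects).
result <- c(1, 2, 3)
save(result, file = "session.RData")   # hypothetical file name

# The file now appears in the working directory.
file_listed <- "session.RData" %in% list.files()
print(file_listed)

# Remove the object, then restore it from the file.
rm(result)
load("session.RData")
print(result)

# Clean up and return to the original directory.
unlink("session.RData")
setwd(old_dir)
```

Restoring the original directory at the end keeps later relative paths in the same session from pointing at the temporary folder.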
Example • Extract all stage II and stage III samples from the colon cancer clinical information
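One possible approach to this exercise is sketched below using which() and %in%. The table clinical, its column names (sample_id, stage), and the stage labels are all hypothetical; real colon cancer clinical files will likely use different names and label formats.

```r
# Hypothetical clinical table standing in for the real colon cancer data.
clinical <- data.frame(
  sample_id = c("S1", "S2", "S3", "S4", "S5"),
  stage = c("Stage I", "Stage II", "Stage III", "Stage IV", "Stage II"),
  stringsAsFactors = FALSE
)

# Find the rows whose stage is II or III.
keep <- which(clinical$stage %in% c("Stage II", "Stage III"))

# Keep those rows and all columns; the result is again a data.frame.
stage_2_3 <- clinical[keep, ]
print(stage_2_3)
```

With real data it is worth running table(clinical$stage) first to see which labels actually occur (for example substages such as IIA), since %in% only matches exact strings.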