Lab 5 handling missing georeferenced data 1 estimating
Lab #5: - handling missing georeferenced data: (1) estimating missing values (2) spatial interpolation Spatial statistics in practice Center for Tropical Ecology and Biodiversity, Tunghai University & Fushan Botanical Garden
Today’s lab • estimating missing values • spatial interpolation • some comments about geographically weighted regression
Conventional EM estimation FILENAME ATTRIBUT 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-DEM&QUAD-DATA. TXT'; OPTIONS LINESIZE=72; TITLE 'SPATIAL FILTER MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP 1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; Y=LOG(MELEV+17. 5); X=(SELEV-25)**0. 5; ********************************** * ARBITRATILY SUPPRESS A VALUE OF Y; * * FOR CROSSVALIDATION EACH VALUE OF Y WOULD BE SUPPRESSED, IN TURN * **********************************; YO=Y; IF IDDEM=1 THEN Y=0; IF IDDEM=1 THEN IM 1=-1; ELSE IM 1=0; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; ********************************** * THE MISSING VALUE IMPUTATON IS THE ESTIMATED COEFFICIENT OF IM 1 * **********************************; PROC REG; MODEL Y = X IM 1; RUN; DATA ORIGINAL; SET STEP 1; IF IM 1=0 THEN DELETE; RUN; PROC PRINT; VAR YO; RUN;
Spatial EM: eigenvector selection FILENAME ATTRIBUT 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-DEM&QUAD-DATA. TXT'; FILENAME MC 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-MC-EIG. TXT'; FILENAME CONN 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-CON. TXT'; FILENAME EVEC 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-EIGENVECTORS. TXT'; TITLE 'SPATIAL FILTER MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP 1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; * Y=LOG(MELEV+17. 5); Y=(SELEV-25)**0. 5; Y 0=Y; X 0=1; principal change IF IDDEM=1 THEN Y='. '; IF IDDEM=1 THEN X 0='. '; RUN; PROC SORT DATA=STEP 1 OUT=STEP 1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; PROC STANDARD MEAN=0 STD=1 OUT=STEP 1(REPLACE=YES); VAR Y 0; RUN; DATA STEP 1 (REPLACE=YES); SET STEP 1; INFILE EVEC LRECL=2048; INPUT IDE E 1 -E 73; RUN; ********************************* * THE SET OF CANDIDATE EIGENVECTORS IS DETERMINED BY MCADJ >= 0 * *********************************; PROC REG OUTEST=COEF; MODEL Y = E 1 -E 18/SELECTION=STEPWISE SLE=0. 10; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN;
PROC TRANSPOSE DATA=COEF PREFIX=B OUT=COEF 2; VAR E 1 -E 18; RUN; DATA COEF(REPLACE=YES); INFILE MC; INPUT ID LAM_MCM MC MCADJ; IF MCADJ<0. 25 THEN DELETE; RUN; DATA COEF(REPLACE=YES); SET COEF 2; IF B 1='. ' THEN DELETE; IF B 1=0 THEN DELETE; BSQ=B 1**2; EMC=BSQ*MC; P=1; RUN; PROC PRINT; RUN; PROC MEANS SUM NOPRINT; VAR LAM_MCM P EMC BSQ; OUTPUT OUT=EMC SUM=SUML P SPANUM SPADEN; RUN; PROC STANDARD DATA=TEMP MEAN=0 STD=1 OUT=TEMP(REPLACE=YES); VAR YRESID; RUN; DATA STEP 2(REPLACE=YES); INFILE CONN LRECL=1024; INPUT ID C 1 -C 73; RUN; DATA STEP 2(REPLACE=YES); SET STEP 2; SET TEMP(KEEP=YRESID); SET STEP 1(KEEP=Y 0 X 0); ARRAY CONN{73} C 1 -C 73; ARRAY ZC{73} ZC 1 -ZC 73; ARRAY Y 0 C{73} Y 0 C 1 -Y 0 C 73; CSUM=0; DO I=1 TO 73; CSUM = CSUM + CONN{I}; ZC{I} = YRESID*CONN{I}; Y 0 C{I} = Y 0*CONN{I}; END; Z=YRESID; RUN;
PROC PROC DATA MEANS SUM NOPRINT; VAR CSUM X 0; OUTPUT OUT=CSUM SUM=CSUM N; RUN; PRINT; VAR CSUM; RUN; MEANS DATA=STEP 2 NOPRINT; VAR Y 0 C 1 -Y 0 C 73; OUTPUT OUT=Y 0 COUT 1 SUM=Y 0 C 1 -Y 0 C 73; RUN; TRANSPOSE DATA=Y 0 COUT 1 PREFIX=Y 0 C OUT=Y 0 COUT 2; VAR Y 0 C 1 -Y 0 C 73; RUN; MEANS DATA=STEP 2 NOPRINT; VAR ZC 1 -ZC 73; OUTPUT OUT=ZCOUT 1 SUM=ZC 1 -ZC 73; RUN; TRANSPOSE DATA=ZCOUT 1 PREFIX=ZC OUT=ZCOUT 2; VAR ZC 1 -ZC 73; RUN; STEP 2(REPLACE=YES); SET STEP 2(KEEP=X 0 Z CSUM Y 0); SET ZCOUT 2(KEEP=ZC 1); SET Y 0 COUT 2(KEEP=Y 0 C 1); Y 0 C=Y 0 C 1; ZC=ZC 1; DROP ZC 1; RUN; PROC REG DATA=STEP 2 OUTEST=DEN NOPRINT; MODEL CSUM=X 0/NOINT; RUN; PROC REG DATA=STEP 2 OUTEST=NUM 0 NOPRINT; MODEL Y 0 C=Y 0/NOINT; RUN; PROC REG DATA=STEP 2 OUTEST=NUM NOPRINT; MODEL ZC=Z/NOINT; RUN; DATA STEP 3; SET DEN; SET NUM 0; SET NUM; SET CSUM(KEEP=CSUM N); SET EMC(KEEP=SUML SPANUM SPADEN P); MC 0=Y 0/X 0; MC=Z/X 0; EMC = -(1+(N/CSUM)*SUML)/(N-P-1); ZMC = (MC-EMC)/SQRT(2/CSUM); MCSPA=SPANUM/SPADEN; RUN; PROC PRINT; VAR MC 0 MC EMC ZMC MCSPA; RUN; PROC STANDARD DATA=TEMP OUT=TEMP(REPLACE=YES) MEAN=0; VAR YHAT; RUN;
Spatial EM: imputation FILENAME ATTRIBUT 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-DEM&QUAD-DATA. TXT'; FILENAME EVEC 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-EIGENVECTORS. TXT'; TITLE 'SPATIAL FILTER MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP 1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; * Y=LOG(MELEV+17. 5); Y=(SELEV-25)**0. 5; YO=Y; IF IDDEM=1 THEN Y=0; IF IDDEM=1 THEN IM 1=-1; ELSE IM 1=0; for calculating the imputation RUN; PROC SORT DATA=STEP 1 OUT=STEP 1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; DATA STEP 1 (REPLACE=YES); SET STEP 1; INFILE EVEC LRECL=2048; INPUT IDE E 1 -E 73; RUN; ************** *SPATIAL FILTER IMPUTATION * **************; PROC REG OUTEST=COEF; MODEL Y = E 3 E 4 E 5 E 7 E 12 E 13 E 14 E 16 E 17 E 18 IM 1; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN; DATA TEMP(REPLACE=YES); SET TEMP; IF _N_=1 THEN SET COEF; IF IDDEM>1 THEN DELETE; RUN; PROC PRINT; VAR IDDEM YO IM 1 YRESID; RUN;
Spatial filter GWR: eigenvector selection FILENAME ATTRIBUT 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-DEM&QUAD-DATA. TXT'; FILENAME EVEC 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-EIGENVECTORS. TXT'; TITLE 'SPATIAL FILTER GWR MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP 1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; Y=LOG(MELEV+17. 5); X=(SELEV-25)**0. 5; RUN; PROC SORT DATA=STEP 1 OUT=STEP 1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; DATA STEP 1 (REPLACE=YES); SET STEP 1; INFILE EVEC LRECL=2048; INPUT IDE E 1 -E 73; ARRAY EVECS{73} E 1 -E 73; ARRAY XEVAR{73} XE 1 -XE 73; DO I=1 TO 73; XEVAR{I} = X*EVECS{I}; END; computes the interaction terms keeps X in the equation RUN; *********** * SPATIAL FILTER GWR * ***********; PROC REG OUTEST=COEF; MODEL Y = X E 1 -E 18 XE 1 -XE 18/SELECTION=STEPWISE SLE=0. 10 INCLUDE=1; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN;
Spatial filter GWR: variable coefficient calculation FILENAME ATTRIBUT 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-DEM&QUAD-DATA. TXT'; FILENAME EVEC 'D: JYU-SUMMERSCHOOL 2006LAB#1PR-EIGENVECTORS. TXT'; FILENAME OUTFILE 'D: JYU-SUMMERSCHOOL 2006LAB#5SF-GWR. TXT'; OPTIONS LINESIZE=72; TITLE 'SPATIAL FILTER GWR MODEL FOR PUERTO RICO: GAUSSIAN RV'; DATA STEP 1; INFILE ATTRIBUT; INPUT IDDEM MELEV SELEV U V QUAD NAME$; Y=LOG(MELEV+17. 5); X=(SELEV-25)**0. 5; RUN; PROC SORT DATA=STEP 1 OUT=STEP 1(REPLACE=YES); BY NAME; RUN; PROC UNIVARIATE NORMAL; VAR Y; RUN; DATA STEP 1 (REPLACE=YES); SET STEP 1; INFILE EVEC LRECL=2048; INPUT IDE E 1 -E 73; ARRAY EVECS{73} E 1 -E 73; ARRAY XEVAR{73} XE 1 -XE 73; DO I=1 TO 73; XEVAR{I} = X*EVECS{I}; END; RUN;
*********** * SPATIAL FILTER GWR * ***********; PROC REG OUTEST=COEF; MODEL Y = X E 1 -E 18 XE 1 -XE 18/SELECTION=STEPWISE SLE=0. 10 INCLUDE=1; OUTPUT OUT=TEMP P=YHAT R=YRESID; RUN; PROC UNIVARIATE NORMAL; VAR YRESID; RUN; DATA COEF(REPLACE=YES); SET COEF (KEEP=INTERCEPT E 2 E 5 E 6 E 7 E 9 E 11 E 12 E 13 E 15 E 18 X XE 4 XE 6 XE 9 XE 10); A=INTERCEPT; BE 2=E 2; BE 5=E 5; BE 6=E 6; BE 7=E 7; BE 9=E 9; BE 11=E 11; BE 12=E 12; BE 13=E 13; BE 15=E 15; BE 18=E 18; BX=X; BXE 4=XE 4; BXE 6=XE 6; BXE 9=XE 9; BXE 10=XE 10; DROP INTERCEPT E 2 E 5 E 6 E 7 E 9 E 11 E 12 E 13 E 15 E 18 X XE 4 XE 6 XE 9 XE 10; RUN; DATA FINAL; SET STEP 1; IF _N_=1 THEN SET COEF; GWRA=A*1 + BE 2*E 2 + BE 9*E 9 + BE 15*E 15 GWRB=BX + BXE 4*E 4 + RUN; BE 5*E 5 + BE 6*E 6 + BE 7*E 7 + BE 11*E 11 + BE 12*E 12 + BE 13*E 13 + + BE 18*E 18; BXE 6*E 6 + BXE 9*E 9 + BXE 10*E 10; DATA _NULL_; SET FINAL; FILE OUTFILE; PUT IDDEM GWRA GWRB; RUN; GWR coefficient calculations
Kriging in Arc. GIS • Step #1: load a *. shp file • Step #2: access the Geostatistical Wizard • Step #3: fit a semivariogram model: – Select a variable – Select the kriging option • Step #4: select “Prediction Map” under the “Ordinary Kriging” heading • Step #5: select a semivariogram model
• Step #6: click FINISH and then OK • Step #7: right-click on the “Ordinary Kriging” Layer that appears & change the extent to “the current display extent” • Step #8: right-click on the “Ordinary Kriging” Layer that appears & select “Create Prediction Standard Error Map”
• Step #9: right-click on “Layers” select “Activate” right-click on “Layers” select “Properties” check “Enable” click “Specify Shape” select the map • Step #10: in the Display window, move the base map to the top, and remove its coloring
What you should achieve after this lab: 1. Impute missing spatial data 2. Create krigged maps (that are clipped) 3. Perform a spatial statistical cross-validation 4. Perform geographically weighted regression with spatial filters
- Slides: 14