Intel Compilers 9.x on the Intel Core

  • Slides: 41

Intel Compilers 9.x on the Intel® Core Duo™ Processor
Windows version
Intel Software College

Objectives
At the successful completion of this module, you will be able to:
  • Use key compiler optimization switches
  • Optimize software for the architecture
  • Enhance performance with vectorization and other techniques

Copyright © 2006, Intel Corporation. All rights reserved. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States or other countries. *Other brands and names are the property of their respective owners.

Agenda
Introduction
Compiler Switches
Dual Core
Vectorization

Key to optimizing: Intel® Core™ Duo
Exploiting architectural power requires sophisticated compilers.
Optimal use of:
  • Registers & functional units
  • Dual-core/multi-processor
  • SSE instructions
  • Cache architecture

C++ Compatibility with Microsoft
Source and binary compatible with VC 2003 under /Qvc71.
Source and binary compatible with VC 2005 under /Qvc8.
Microsoft* and Intel OpenMP binaries are not compatible.
  • Use one compiler for all modules compiled with OpenMP
For more information, refer to the User’s Guide.

Use Intel Compiler in Microsoft IDE C++

Agenda
Introduction
Compiler Switches
  • Intel® C++ compiler
Dual Core
Vectorization

General Optimizations
    Windows*   Linux*, Mac*   Description
    /Od        -O0            Disables optimizations
    /Zi        -g             Creates symbols
    /O1        -O1            Optimize for binary size: server code
    /O2        -O2            Optimizes for speed (default)
    /O3        -O3            Optimize for data cache: loopy floating-point code

Multi-pass Optimization: Interprocedural Optimizations (IPO)
ip: Enables interprocedural optimizations for single-file compilation
ipo: Enables interprocedural optimizations across files
    Windows*   Linux*, Mac*
    /Qip       -ip
    /Qipo      -ipo
  • Can inline functions in separate files
  • Enhances optimization when used in combination with other compiler features

Multi-pass Optimization - IPO Usage: Two-Step Process
Pass 1: Compiling (produces virtual .o files)
    Windows*   icl -c /Qipo main.c func1.c func2.c
    Linux*     icc -c -ipo main.c func1.c func2.c
    Mac*       icc -c -ipo main.c func1.c func2.c
Pass 2: Linking (produces the executable)
    Windows*   icl /Qipo main.o func1.o func2.o
    Linux*     icc -ipo main.o func1.o func2.o
    Mac*       icc -ipo main.o func1.o func2.o

Profile Guided Optimizations (PGO)
Use execution-time feedback to guide many other compiler optimizations.
Helps I-cache, paging, and branch prediction.
Enabled optimizations:
  • Basic block ordering
  • Better register allocation
  • Better decision of functions to inline
  • Function ordering
  • Switch-statement optimization
  • Better vectorization decisions

Multi-pass Optimization PGO: Three-Step Process
Step 1: Instrumented compilation (produces the instrumented executable)
    Mac*/Linux*   icc -prof_gen[x] prog.c
    Windows*      icl -Qprof_gen[x] prog.c
Step 2: Instrumented execution
  • Run the program on a typical dataset
  • DYN file containing dynamic info: .dyn
Step 3: Feedback compilation
    Mac*/Linux*   icc -prof_use prog.c
    Windows*      icl -Qprof_use prog.c
  • Merged DYN summary file: .dpi
Delete old .dyn files if you do not want their info included.

Agenda
Introduction
Compiler Switches
Dual Core
  • Auto Parallelization
  • OpenMP
  • Threading Diagnostics
Vectorization

Auto-parallelization
Automatic threading of loops without having to manually insert OpenMP* directives.
    Windows*          Linux*, Mac*
    /Qparallel        -parallel
    /Qpar_report[n]   -par_report[n]
  • The compiler can identify “easy” candidates for parallelization, but large applications are difficult to analyze.

OpenMP* Threading Technology
Pragma-based approach to parallelism.
Usage:
  • OpenMP switches: -openmp : /Qopenmp
  • OpenMP reports: -openmp-report : /Qopenmp-report
    #pragma omp parallel for
    for (i=0; i<MAX; i++)
        A[i] = c*A[i] + B[i];
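The loop on this slide can be wrapped in a complete function to make the data-sharing explicit. The following is a minimal sketch, not code from the deck: the function name scale_add and its parameters are hypothetical. It assumes each iteration touches only its own element, which is what makes the loop safe to split across threads.

```c
/* Hypothetical helper (not from the slides): a[i] = c * a[i] + b[i].
 * Each iteration reads and writes only element i, so iterations are
 * independent and the loop is a valid "parallel for" candidate.
 * If the compiler is run without an OpenMP switch, the pragma is
 * ignored and the loop simply runs serially with the same result. */
void scale_add(float *a, const float *b, float c, int n)
{
    int i; /* the loop variable is private to each thread under "parallel for" */
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        a[i] = c * a[i] + b[i];
    }
}
```

Calling scale_add(a, b, 2.0f, n) doubles every element of a and adds the matching element of b. The result should not depend on how many threads run the loop.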

OpenMP: Workqueuing Extension Example
Intel Compiler’s workqueuing extension
  • Creates a queue of tasks
  • Works on:
      • Recursive functions
      • Linked lists, etc.
    #pragma intel omp parallel taskq shared(p)
    {
        while (p != NULL) {
            #pragma intel omp task captureprivate(p)
            do_work1(p);
            p = p->next;
        }
    }

Parallel Diagnostics
Source instrumentation for Intel Thread Checker
  • Allows Thread Checker to diagnose threading correctness bugs
  • To use -tcheck or /Qtcheck you must have Intel Thread Checker installed
  • See the Thread Checker documentation: http://www.intel.com/support/performancetools/sb/CS-009681.htm
    Windows*   Linux*    Mac*
    /Qtcheck   -tcheck   No support

Agenda
Introduction
Compiler Switches
Dual Core
Vectorization
  • SSE & Vectorization
  • Reports
  • Explanations of a few specific vectorization inhibitors

SIMD – SSE, SSE2, SSE3 Support
[Figure: packed data types per instruction set; labels: SSE, SSE2, SSE3, MMX*]
  • 2 x doubles
  • 4 x floats
  • 1 x dqword
  • 16 x bytes
  • 8 x words
  • 4 x dwords
  • 2 x qwords
* MMX actually used the x87 floating-point registers. SSE, SSE2, and SSE3 use the new SSE registers.

SSE3 Instructions
    Instructions                                    Use
    FISTTP                                          FP to integer conversions
    ADDSUBPD, ADDSUBPS, MOVDDUP, MOVSHDUP, MOVSLDUP Complex arithmetic
    LDDQU                                           Video encoding
    HADDPD, HSUBPD, HADDPS, HSUBPS                  SIMD FP using AOS format*
    MONITOR, MWAIT                                  Thread synchronization
* Also benefits complex arithmetic and vectorization

Using SSE3 - Your Task: Convert This…
    for (i=0; i<=MAX; i++)
        c[i] = a[i] + b[i];
[Figure: 128-bit registers holding elements of A, B, and C for a scalar add; most of each register is marked “not used”]

… Into This …
    for (i=0; i<=MAX; i++)
        c[i] = a[i] + b[i];
[Figure: 128-bit registers holding A[3] to A[0], B[3] to B[0], and C[3] to C[0]; all four additions happen at once]
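The transformation pictured on these two slides can be approximated in plain C by processing the arrays four elements at a time. The sketch below only emulates the lane structure; it does not use SSE instructions or intrinsics. The name add_strip_mined and the LANES constant are assumptions for illustration, with LANES set to 4 to match four floats in a 128-bit register.

```c
/* Assumption: 4 lanes, matching 4 x float in one 128-bit register. */
#define LANES 4

/* Plain-C emulation of a 4-wide vectorized add (hypothetical helper). */
void add_strip_mined(const float *a, const float *b, float *c, int n)
{
    int i = 0;

    /* Main loop: each pass stands in for one vector instruction that
     * adds LANES consecutive elements. */
    for (; i + LANES <= n; i += LANES) {
        for (int lane = 0; lane < LANES; lane++) {
            c[i + lane] = a[i + lane] + b[i + lane];
        }
    }

    /* Remainder loop: leftover elements when n is not a multiple of LANES. */
    for (; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

With n = 6, the main loop handles elements 0 to 3 in one step and the remainder loop handles elements 4 and 5. A vectorizing compiler generates a similar main/remainder split automatically.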

Compiler Based Vectorization: Processor Specific
Use W
  • Description: Generate instructions and optimize for Intel® Pentium® 4 compatible processors, including MMX, SSE and SSE2.
  • Windows*: /QxW
  • Linux*: -xW
  • Mac*: Does not apply
Use P
  • Description: Generate instructions and optimize for Intel® processors with SSE3 capability, including Core Duo. These processors support SSE3 as well as MMX, SSE and SSE2.
  • Windows*: /QxP, /QaxP
  • Linux*: -xP, -axP
  • Mac*: Vectorization occurs by default

Compiler Based Vectorization: Automatic Processor Dispatch – ax[?]
Single executable
  • Optimized for Intel® Core Duo processors, plus generic code that runs on all IA-32 processors
For each target processor it uses:
  • Processor-specific instructions
  • Vectorization
Low overhead
  • Some increase in code size

Why Loops Don’t Vectorize
Independence
  • Loop iterations generally must be independent
Some relevant qualifiers:
  • Some dependent loops can be vectorized.
  • Most function calls cannot be vectorized.
  • Some conditional branches prevent vectorization.
  • Loops must be countable.
  • Outer loop of nest cannot be vectorized.
  • Mixed data types cannot be vectorized.

Why Didn’t My Loop Vectorize?
    Windows*        Linux*, Macintosh*
    -Qvec_reportn   -vec_reportn
Set diagnostic level dumped to stdout:
  • n=0: No diagnostic information
  • n=1: (Default) Loops successfully vectorized
  • n=2: Loops not vectorized, and the reason why not
  • n=3: Adds dependency information
  • n=4: Reports only non-vectorized loops
  • n=5: Reports only non-vectorized loops and adds dependency info

Why Loops Don’t Vectorize
  • “Existence of vector dependence”
  • “Nonunit stride used”
  • “Mixed Data Types”
  • “Unsupported Loop Structure”
  • “Contains unvectorizable statement at line XX”
  • There are more reasons loops don’t vectorize, but we will discuss only the reasons above

“Existence of Vector Dependency”
Usually indicates a real dependency between iterations of the loop, as shown here:
    for (i = 0; i < 100; i++)
        x[i] = A * x[i + 1];
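A loop-carried flow dependence, where each iteration reads a value that the previous iteration just wrote, is the clearest case of a dependency that blocks vectorization. The sketch below uses a hypothetical recurrence x[i] = a * x[i-1], which is a different loop from the one on the slide. The second function models what would happen if a group of iterations loaded their inputs before any of them stored a result; both function names are assumptions for illustration.

```c
/* Hypothetical recurrence: iteration i reads x[i-1], which iteration
 * i-1 has just written. The iterations must run in order. */
void recurrence(float *x, float a, int n)
{
    for (int i = 1; i < n; i++) {
        x[i] = a * x[i - 1];
    }
}

/* Models an incorrect "vectorized" execution: every iteration reads
 * from old, a snapshot taken before the loop, instead of seeing the
 * values written by earlier iterations. */
void recurrence_from_snapshot(float *x, const float *old, float a, int n)
{
    for (int i = 1; i < n; i++) {
        x[i] = a * old[i - 1];
    }
}
```

Starting from {1, 0, 0, 0} with a = 2, the in-order loop produces {1, 2, 4, 8}, while the snapshot version produces {1, 2, 0, 0}. The mismatch is why the compiler refuses to vectorize such a loop.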

Defining Loop Independence
Iteration Y of a loop is independent of when (or whether) iteration X occurs.
    int a[MAX], b[MAX];
    for (j=0; j<MAX; j++) {
        a[j] = b[j];
    }
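In the slide's loop, a and b are distinct arrays declared in the same scope, so independence is easy to see. When the same loop receives its arrays as pointer parameters, the compiler may be unable to rule out overlap. The sketch below shows one way to state the assumption, using the C99 restrict qualifier; this goes beyond what the slide covers, and the function name copy_ints is hypothetical.

```c
/* Hypothetical helper: copies n ints from src to dst.
 * restrict is the caller's promise that dst and src do not overlap,
 * so no iteration can affect what another iteration reads. Passing
 * overlapping arrays would break that promise (undefined behavior). */
void copy_ints(int *restrict dst, const int *restrict src, int n)
{
    for (int j = 0; j < n; j++) {
        dst[j] = src[j];
    }
}
```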

“Nonunit stride used”
    for (I=0; I<=MAX; I++)
        for (J=0; J<=MAX; J++) {
            c[I][J] += 1;                   // Unit stride
            c[J][I] += 1;                   // Non-unit
            A[J*J] += 1;                    // Non-unit
            A[B[J]] += 1;                   // Non-unit
            if (A[MAX-J] == 1) last1 = J;   // Non-unit
        }
End result: loading the vector from memory may take more cycles than executing the operation sequentially.
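For the c[J][I] case, swapping the two loops (loop interchange) is one common way to restore unit stride. The following sketch assumes a small square array of hypothetical dimension N; the function names are illustrative and not from the deck.

```c
#define N 4 /* hypothetical array dimension for illustration */

/* Non-unit stride: the inner loop varies the row index j, so
 * consecutive accesses are N ints apart in memory. */
void increment_non_unit(int c[N][N])
{
    for (int i = 0; i < N; i++) {
        for (int j = 0; j < N; j++) {
            c[j][i] += 1;
        }
    }
}

/* Unit stride after loop interchange: the inner loop now varies the
 * rightmost index i, so consecutive accesses are adjacent in memory. */
void increment_unit(int c[N][N])
{
    for (int j = 0; j < N; j++) {
        for (int i = 0; i < N; i++) {
            c[j][i] += 1;
        }
    }
}
```

Both functions add 1 to every element, so the interchange changes only the access order. That is legal here because no iteration depends on another.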

“Mixed Data Types”
An example:
    int howmany_close(double *x, double *y)
    {
        int withinborder = 0;
        double dist;
        for (int i=0; i<MAX; i++) {
            dist = sqrtf(x[i]*x[i] + y[i]*y[i]);
            if (dist<5) withinborder++;
        }
        return withinborder;
    }
Mixed data types are possible, but they complicate things
  • e.g., 2 doubles vs. 4 ints per SIMD register
Some operations with specific data types won’t work

“Unsupported Loop Structure”
Example:
    struct _xx { int data; int bound; };
    void doit1(int *a, struct _xx *x)
    {
        for (int i=0; i<x->bound; i++)
            a[i] = 0;
    }
An unsupported loop structure means the loop is not countable, or the compiler for whatever reason can’t construct a run-time expression for the trip count.
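One plausible cause in this example is that a store through a could, as far as the compiler knows, modify x->bound, so the trip count is not provably fixed. A common workaround is to copy the bound into a local variable before the loop. The sketch below assumes that explanation; the names struct bounds and zero_prefix are hypothetical stand-ins for the slide's _xx and doit1.

```c
/* Hypothetical mirror of the slide's struct _xx. */
struct bounds {
    int data;
    int bound;
};

/* Zeroes the first x->bound elements of a. */
void zero_prefix(int *a, const struct bounds *x)
{
    /* Read the bound once. Stores through a can no longer change the
     * trip count, so the loop is countable. */
    const int n = x->bound;

    for (int i = 0; i < n; i++) {
        a[i] = 0;
    }
}
```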

“Contains unvectorizable statement”
    for (i=1; i<nx; i++) {
        B[i] = func(A[i]);
    }
[Figure: 128-bit registers holding A[3] to A[0] and B[3] to B[0]; func is applied to each element separately]
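When the called function is small and its body is visible to the compiler, inlining can remove the call and leave plain arithmetic in the loop. The sketch below assumes a hypothetical func that computes v * v + 1; the slide does not say what func does, and whether a particular compiler then vectorizes the loop is not guaranteed.

```c
/* Hypothetical stand-in for the slide's func(): simple arithmetic
 * with no side effects. static inline keeps the body visible at the
 * call site so the compiler can substitute it into the loop. */
static inline float func(float v)
{
    return v * v + 1.0f;
}

/* Applies func to each element of a and stores the result in b. */
void apply_func(const float *a, float *b, int nx)
{
    for (int i = 0; i < nx; i++) {
        b[i] = func(a[i]); /* after inlining: b[i] = a[i] * a[i] + 1.0f */
    }
}
```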

Reference
Web-based and classroom training
  • www.intel.com/software/college
White papers and technical notes
  • www.intel.com/ids
  • www.intel.com/software/products
Product support resources
  • www.intel.com/software/products/support

Activity 1 - raytrace2: Initial Compilation
Set up the environment and compile with both Microsoft* Visual C++ .NET (MSVC*) and the Intel® C++ Compiler (icl).

Activity 2 - raytrace2: O3 Compilation
Use the Intel compiler’s High Level Optimizer (-O3) for loop-centric code.

Activity 3 - raytrace2: IPO Compilation
Use the Intel compiler’s Inter-procedural Optimization (-Qipo).

Activity 4 - raytrace2: PGO Compilation
Use the Intel compiler’s Profile-guided Optimization.

Activity 5 - raytrace2: Vectorization
Use the Intel compiler’s vectorization optimization (-QxP).

Activity 6 - raytrace2: Putting it all together
Use all previous optimizations in tandem (-O3, -QxP, IPO and PGO).