Processor Parallelism CPUs Multiple cores driving performance increases
- Slides: 22
Processor Parallelism
• CPUs: Multiple cores driving performance increases
  – Multiprocessor programming, e.g. OpenMP
• GPUs: Increasingly general purpose data-parallel computing
  – Graphics APIs and Shading Languages
• Emerging Intersection: Heterogeneous Computing
• OpenCL is a programming framework for heterogeneous compute resources
The BIG Idea behind OpenCL
• OpenCL execution model …
  – Define N-dimensional computation domain
  – Execute a kernel at each point in computation domain
• C Derivative to write kernels
  – Based on ISO C99
  – APIs to discover devices in a system and distribute work to them
• Targeting many types of device
  – GPUs, CPUs, DSPs, embedded systems, mobile phones ... even FPGAs
The BIG Idea behind OpenCL

Traditional loops:

    void trad_mul(int n, const float *a, const float *b, float *c)
    {
        int i;
        for (i = 0; i < n; i++)
            c[i] = a[i] * b[i];
    }

Data Parallel OpenCL:

    kernel void dp_mul(global const float *a, global const float *b, global float *c)
    {
        int id = get_global_id(0);
        c[id] = a[id] * b[id];
    } // execute over "n" work-items
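The "kernel at each point of an N-dimensional domain" idea can be modelled in plain C without an OpenCL runtime. The sketch below is an emulation only: `work_item`, `kernel_fn`, `dp_mul_2d` and `run_ndrange_2d` are hypothetical names invented for illustration, standing in for what a real runtime provides through `get_global_id()` and a kernel enqueue call. It extends the `dp_mul` body to a 2D domain stored row-major.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-work-item context that a real
   OpenCL runtime exposes through get_global_id() and friends. */
typedef struct {
    size_t global_id[2];   /* position of this work-item in the domain */
    size_t global_size[2]; /* extent of the domain in each dimension */
} work_item;

typedef void (*kernel_fn)(const work_item *wi,
                          const float *a, const float *b, float *c);

/* Same arithmetic as dp_mul, but indexed over a 2D row-major domain. */
void dp_mul_2d(const work_item *wi,
               const float *a, const float *b, float *c)
{
    size_t idx = wi->global_id[1] * wi->global_size[0] + wi->global_id[0];
    c[idx] = a[idx] * b[idx];
}

/* Emulates executing a kernel over a width x height domain: the kernel
   runs exactly once per point. This loop is sequential; a real device
   is free to run the work-items in any order and in parallel. */
void run_ndrange_2d(kernel_fn k, size_t width, size_t height,
                    const float *a, const float *b, float *c)
{
    work_item wi;

    assert(k != NULL);
    wi.global_size[0] = width;
    wi.global_size[1] = height;
    for (size_t y = 0; y < height; y++) {
        for (size_t x = 0; x < width; x++) {
            wi.global_id[0] = x;
            wi.global_id[1] = y;
            k(&wi, a, b, c);
        }
    }
}
```

Calling `run_ndrange_2d(dp_mul_2d, 3, 2, a, b, c)` on six-element arrays performs six kernel invocations, one per point. The kernel itself contains no loop, which is the point of the data-parallel form.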
OpenCL Platform Model
• One Host + one or more Compute Devices
  – Each Compute Device is composed of one or more Compute Units
    • Each Compute Unit is further divided into one or more Processing Elements
OpenCL – Heterogeneous Computing
• Framework for programming diverse parallel computing resources in a system
• Platform Layer API
  – Query, select and initialize compute devices
• Kernel Language Specification
  – Subset of ISO C99 with language extensions
• Runtime API
  – Execute compute kernels, gather results
• OpenCL has Embedded profile
  – No need for a separate "ES" spec
Copyright Khronos 2009
Announcing OpenCL 1.2!
• OpenCL 1.2 Specification publicly available today!
  – Significant updates: Khronos being responsive to developer requests
  – Updated OpenCL 1.2 conformance tests available
  – Multiple implementations underway
• Backward compatible upgrade to OpenCL 1.1
  – OpenCL 1.2 will run any OpenCL 1.0 and OpenCL 1.1 programs
  – OpenCL 1.2 platform can contain 1.0, 1.1 and 1.2 devices
  – Maintains embedded profile for mobile and embedded devices
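Because one platform can mix 1.0, 1.1 and 1.2 devices, an application that wants a 1.2 feature has to check each device's version first. The OpenCL specification gives the `CL_DEVICE_VERSION` string the form `OpenCL <major>.<minor> <vendor-specific information>`. The sketch below parses a string of that shape in plain C. The helper names `parse_cl_version` and `cl_version_at_least` are hypothetical, and obtaining the string from `clGetDeviceInfo` is left out.

```c
#include <assert.h>
#include <stdio.h>

/* Parses a version string of the form
   "OpenCL <major>.<minor> <vendor-specific information>".
   Returns 1 on success, 0 if the string does not match. */
int parse_cl_version(const char *s, int *major, int *minor)
{
    assert(s != NULL && major != NULL && minor != NULL);
    return sscanf(s, "OpenCL %d.%d", major, minor) == 2;
}

/* Returns 1 if the reported version is at least want_major.want_minor,
   0 otherwise (including when the string cannot be parsed). */
int cl_version_at_least(const char *s, int want_major, int want_minor)
{
    int major, minor;

    if (!parse_cl_version(s, &major, &minor))
        return 0;
    return major > want_major ||
           (major == want_major && minor >= want_minor);
}
```

A caller might guard a 1.2-only code path with `cl_version_at_least(version_string, 1, 2)` and fall back to a 1.0-compatible path otherwise.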
OpenCL Working Group Members
• Diverse industry participation: many industry experts
  – Processor vendors, system OEMs, middleware vendors, application developers
  – Academia and research labs, FPGA vendors
• NVIDIA is chair, Apple is specification editor
OpenCL Milestones
• Six months from proposal to released OpenCL 1.0 specification
  – Due to a strong initial proposal and a shared commercial incentive
• Multiple conformant implementations shipping
  – For CPUs and GPUs on multiple OS
• 18 month cadence between OpenCL 1.0, OpenCL 1.1 and now OpenCL 1.2
  – Backwards compatibility protects software investment
Timeline:
  – OpenCL working group formed
  – Dec 08: OpenCL 1.0 released
  – May 09: OpenCL 1.0 conformance tests released
  – Jun 10: OpenCL 1.1 Specification and conformance tests released
  – Nov 11: OpenCL 1.2 Specification and conformance tests released!
Looking Forward
• OpenCL-HLM (High Level Model)
  – Exploring high-level programming model, unifying host and device execution environments through language syntax for increased usability and broader optimization opportunities
• Long-term Core Roadmap
  – Exploring enhanced memory and execution model flexibility to catalyze and expose emerging hardware capabilities
• WebCL
  – Bring parallel computation to the Web through a JavaScript binding to OpenCL
• OpenCL-SPIR (Standard Parallel Intermediate Representation)
  – Exploring low-level Intermediate Representation for code obfuscation/security and to provide target back-end for alternative high-level languages
Major New Features in OpenCL 1.2
• Partitioning Devices
  – Applications can partition a device into sub-devices
  – Enables computation to be assigned to specific compute units
  – Reserve a part of the device for high priority/latency-sensitive tasks, or make effective use of shared hardware resources such as a cache
• Separate compilation and linking of objects
  – Provides the capabilities and flexibility of traditional compilers
  – Create a library of OpenCL programs that other programs can link to
• Enhanced Image Support
  – Added support for 1D images, 1D & 2D image arrays
  – OpenGL sharing extension now enables an OpenCL image to be created from an OpenGL 1D texture, 1D and 2D texture arrays
More Major New Features in OpenCL 1.2
• Custom devices and built-in kernels
  – Drive specialized custom devices from OpenCL, even if not programmable
  – Can enqueue built-in kernels to custom devices alongside OpenCL kernels
• DX9 Media Surface Sharing
  – Efficient sharing between OpenCL and DirectX 9 or DXVA media surfaces
• DX11 surface sharing
  – Efficient sharing between OpenCL and DirectX 11 surfaces
• Installable Client Drivers (optional)
  – Portably handling multiple installed implementations from multiple vendors
• And many other updates and additions ...
Partitioning Devices
• Devices can be partitioned into sub-devices
  – More control over how computation is assigned to compute units
• Sub-devices may be used just like a normal device
  – Creating contexts, building programs, further partitioning and creating command-queues
• Three ways to partition a device
  – Split into equal-size groups
  – Provide list of group sizes
  – Group devices sharing a part of a cache hierarchy
[Figure: a Host and a Compute Device whose Compute Units are split into Sub-device #1 (real-time processing tasks) and Sub-device #2 (mainline processing tasks)]
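In the OpenCL 1.2 API the three schemes correspond to the `CL_DEVICE_PARTITION_EQUALLY`, `CL_DEVICE_PARTITION_BY_COUNTS` and `CL_DEVICE_PARTITION_BY_AFFINITY_DOMAIN` properties passed to `clCreateSubDevices`. The plain-C sketch below models only the arithmetic of the first two schemes and makes no OpenCL calls. The function names are hypothetical, and the handling of left-over compute units (left unused) is an assumption about the intended behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Equal-size split: returns how many sub-devices of units_per_sub
   compute units fit into a device with total_units compute units.
   Assumption: units left over after the split stay unused. */
unsigned partition_equally(unsigned total_units, unsigned units_per_sub)
{
    assert(units_per_sub > 0);
    return total_units / units_per_sub;
}

/* List of group sizes: returns 1 if the n requested sub-device sizes
   fit together into total_units compute units, 0 otherwise.
   Zero-sized groups are rejected in this model. */
int partition_by_counts_fits(unsigned total_units,
                             const unsigned *counts, size_t n)
{
    unsigned used = 0;

    for (size_t i = 0; i < n; i++) {
        if (counts[i] == 0 || counts[i] > total_units - used)
            return 0;
        used += counts[i];
    }
    return 1;
}
```

For the split shown in the figure, an 8-unit device could reserve 2 units for real-time work and 6 for mainline work: `partition_by_counts_fits(8, (unsigned[]){2, 6}, 2)` accepts that request, while `{4, 5}` would be rejected.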
Custom Devices and Built-in Kernels
• Embedded platforms often contain specialized hardware and firmware
  – That cannot support OpenCL C
• Built-in kernels can represent these hardware and firmware capabilities
  – Such as video encode/decode
• Hardware can be integrated and controlled from the OpenCL framework
  – Can enqueue built-in kernels to custom devices alongside OpenCL kernels
• FPGAs are one example of a device that can expose built-in kernels
  – Latest FPGAs can support full OpenCL C as well
• OpenCL becomes a powerful coordinating framework for diverse resources
  – Programmable and non-programmable devices controlled by one run-time
Built-in kernels enable control of specialized processors and hardware from the OpenCL run-time
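A device advertises its built-in kernels as a semicolon-separated list of names (the `CL_DEVICE_BUILT_IN_KERNELS` query in OpenCL 1.2), and an application would normally check that list before requesting a kernel by name. The sketch below shows only that lookup in plain C. The helper `has_builtin_kernel` and the kernel names in the usage note are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if name appears as a complete entry in a
   semicolon-separated list of kernel names, 0 otherwise.
   Partial matches (a prefix of an entry) do not count. */
int has_builtin_kernel(const char *list, const char *name)
{
    size_t name_len;
    const char *p;

    assert(list != NULL && name != NULL);
    name_len = strlen(name);
    if (name_len == 0)
        return 0;

    p = list;
    while (*p != '\0') {
        const char *end = strchr(p, ';');
        size_t len = end ? (size_t)(end - p) : strlen(p);

        if (len == name_len && strncmp(p, name, len) == 0)
            return 1;
        if (end == NULL)
            break;      /* last entry examined */
        p = end + 1;    /* step past the separator */
    }
    return 0;
}
```

With a made-up list such as `"video_encode;video_decode"`, a lookup for `"video_decode"` succeeds and a lookup for `"video"` fails, since only whole entries match.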
Installable Client Driver
• Analogous to OpenGL ICDs in use for many years
  – Used to handle multiple OpenGL implementations installed on a system
• Optional extension
  – Platform vendor will choose whether to use ICD mechanisms
• Khronos OpenCL installable client driver loader
  – Exposes multiple separate vendor installable client drivers (Vendor ICDs)
• Application can access all vendor implementations
  – The ICD Loader acts as a de-multiplexor
Installable Client Driver
[Figure: an Application calls the ICD Loader, which routes to Vendor #1 OpenCL, Vendor #2 OpenCL and Vendor #3 OpenCL]
• ICD Loader enables application to use any of the installed implementations
• ICD Loader ensures multiple implementations are installed cleanly
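The de-multiplexing role of the loader can be pictured as a table of function pointers per vendor, with every handle carrying a pointer to its vendor's table. The plain-C sketch below is a simplified model of that idea, not the Khronos loader: all type and function names are hypothetical, and real dispatch tables cover the full API rather than two entry points.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-vendor function table. */
typedef struct {
    const char *(*get_platform_name)(void);
    int (*get_device_count)(void);
} vendor_dispatch;

/* Each platform handle carries a pointer to its vendor's table. */
typedef struct {
    const vendor_dispatch *dispatch;
} platform;

/* Two stand-in vendor implementations. */
const char *vendor1_name(void) { return "Vendor #1 OpenCL"; }
int vendor1_devices(void) { return 1; }
const char *vendor2_name(void) { return "Vendor #2 OpenCL"; }
int vendor2_devices(void) { return 2; }

const vendor_dispatch vendor1_table = { vendor1_name, vendor1_devices };
const vendor_dispatch vendor2_table = { vendor2_name, vendor2_devices };

/* Loader entry points: the application sees one API, and each call is
   routed to whichever vendor owns the handle it was given. */
const char *loader_get_platform_name(const platform *p)
{
    assert(p != NULL && p->dispatch != NULL);
    return p->dispatch->get_platform_name();
}

int loader_get_device_count(const platform *p)
{
    assert(p != NULL && p->dispatch != NULL);
    return p->dispatch->get_device_count();
}
```

An application holding `platform p2 = { &vendor2_table };` calls `loader_get_device_count(&p2)` and reaches Vendor #2's code without linking against that vendor's library directly.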
OpenCL Desktop Implementations
• http://developer.amd.com/zones/OpenCLZone/
• http://software.intel.com/en-us/articles/opencl-sdk/
• http://developer.nvidia.com/opencl
OpenCL Books – Available Now!
• OpenCL Programming Guide: The "Red Book" of OpenCL
  – http://www.amazon.com/OpenCL-Programming-Guide-Aaftab-Munshi/dp/0321749642
• OpenCL in Action
  – http://www.amazon.com/OpenCL-Action-Accelerate-Graphics-Computations/dp/1617290173/
• Heterogeneous Computing with OpenCL
  – http://www.amazon.com/Heterogeneous-Computing-with-OpenCL-ebook/dp/B005JRHYUS
• The OpenCL Programming Book
  – http://www.fixstars.com/en/opencl/book/
Spec Translations
• Japanese OpenCL 1.1 spec translation available today
  – http://www.cutt.co.jp/book/978-4-87783-256-8.html
  – Valued partnership between Khronos and CUTT in Japan
• Working on OpenCL 1.2 specification translations
  – Japanese, Korean and Chinese
Khronos OpenCL Resources
• OpenCL is 100% free for developers
  – Download drivers from your silicon vendor
• OpenCL Registry
  – www.khronos.org/registry/cl/
• OpenCL 1.2 Reference Card (PDF version)
  – http://www.khronos.org/files/opencl-1-2-quick-reference-card.pdf
• Online Man pages
  – http://www.khronos.org/registry/cl/sdk/1.2/docs/man/xhtml/
• OpenCL Developer Forums: give us your feedback!
  – www.khronos.org/message_boards/