High Performance Computing On Laptops With Multicores & GPUs

High Performance Computing On Laptops With Multicores & GPUs
Sushil K. Prasad
Computer Science
sprasad@gsu.edu

About me
• Research Area: Parallel and Distributed Algorithms and Systems over multicores, GPUs, clusters, sensors, handhelds, web services, …
• Lab: Distributed and Mobile Systems (DiMoS) at GaTech campus, 5 Ph.D. students, 2 M.S. students
• IEEE TCPP Chair (elected)
• 2 NSF grants; currently looking for Ph.D./MS/undergraduate students
  • Distributed Algorithms
  • High Performance Cloud Computing

Multicore & GPU Chips Inside a Laptop - 100s of processors

GPUs Vs Multicores • Combined power exceeds 180 GFLOPs

Intel Core-2 Duo Multicore
• Difficult to parallelize
• Memory hierarchy is a barrier:
  • 1 cycle core
  • 3 cycles L1 cache
  • 14 cycles L2
  • 250 cycles RAM
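To see why these latencies make the memory hierarchy a barrier, here is a minimal sketch that estimates the average cost of a memory access from the cycle counts above. The hit rates and the function name `average_access_cycles` are illustrative assumptions, not figures from the talk.

```python
# Latencies (in cycles) come from the slide; the hit rates used below are assumed.
L1_CYCLES = 3
L2_CYCLES = 14
RAM_CYCLES = 250


def average_access_cycles(l1_hit_rate: float, l2_hit_rate: float) -> float:
    """Expected cycles per memory access for two cache levels in front of RAM.

    l2_hit_rate is the fraction of L1 misses that hit in L2 (a local hit rate).
    Each latency is treated as the total cost of an access served at that level.
    """
    l1_miss = 1.0 - l1_hit_rate
    l2_miss = 1.0 - l2_hit_rate
    return (l1_hit_rate * L1_CYCLES
            + l1_miss * l2_hit_rate * L2_CYCLES
            + l1_miss * l2_miss * RAM_CYCLES)


if __name__ == "__main__":
    # Assumed rates: 95% of accesses hit L1, 80% of the rest hit L2.
    print(average_access_cycles(0.95, 0.80))
```

Under these assumed rates only 1% of accesses reach RAM, yet they contribute 2.5 of the roughly 5.9 cycles per access, which is why cache-friendly data layouts matter so much on multicores.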

GPU: Graphics Processing Unit
Nvidia 280 GTX
• 240 cores
• Extreme memory hierarchy
  • Registers
  • Local memory
  • Shared memory/8 cores
  • Off chip Global Memory
  • bottleneck bus to CPU

• Nvidia 8800 GTX
• Smith-Waterman Sequence Alignment, FASTA, and BLAST
• Database: SwissProt
• Manavski and Valle 2008
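For readers unfamiliar with Smith-Waterman, the following is a minimal sequential sketch of its scoring recurrence. It is not the GPU implementation referenced above: the scoring values (match 3, mismatch -3, linear gap -2) and the function name `smith_waterman_score` are assumptions chosen for illustration.

```python
def smith_waterman_score(a: str, b: str, match: int = 3,
                         mismatch: int = -3, gap: int = -2) -> int:
    """Best local-alignment score of a and b, using a linear gap penalty."""
    prev = [0] * (len(b) + 1)          # previous row of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor at 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TGTTACGG", "GGTTGACTA"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that independence, together with the fact that a database search runs many such alignments, is what makes the problem attractive for many-core hardware.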

Parallel Data Structures
Priority Queues
• Large Scale Event Simulation
• Immune System Simulation
• VLSI Logic simulation
• Branch and Bound
• Task Scheduling
• Challenge: Fine Grained Systems
• Students: Dinesh Agarwal, Nick Mancuso
[Figure: priority queue diagrams with numeric keys]
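As a reminder of how a priority queue drives event simulation, here is a minimal sequential sketch built on Python's standard `heapq` module. The event shape `(timestamp, label)`, the `run_simulation` function, and the toy `ping_pong` handler are all hypothetical; the parallel versions discussed in the talk are not shown.

```python
import heapq
from typing import Callable, Iterable, List, Tuple

Event = Tuple[float, str]  # (timestamp, label); hypothetical event shape


def run_simulation(initial: Iterable[Event],
                   handler: Callable[[Event], Iterable[Event]],
                   end_time: float) -> List[Event]:
    """Process events in timestamp order until the queue empties or end_time passes."""
    queue = list(initial)
    heapq.heapify(queue)
    processed = []
    while queue:
        event = heapq.heappop(queue)      # always the earliest pending event
        if event[0] > end_time:
            break
        processed.append(event)
        for new_event in handler(event):  # handling may schedule future events
            heapq.heappush(queue, new_event)
    return processed


def ping_pong(event: Event) -> List[Event]:
    """Toy handler: each 'ping' schedules a 'pong' 1.5 time units later."""
    time, label = event
    return [(time + 1.5, "pong")] if label == "ping" else []


if __name__ == "__main__":
    print(run_simulation([(2.0, "ping"), (0.0, "ping")], ping_pong, end_time=3.0))
```

Every iteration both removes the minimum and may insert new items, so the queue is touched constantly in small steps. That fine-grained access pattern is the challenge the slide names: several cores contending for one heap spend most of their time synchronizing.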

Parallel Priority Queues on Multicore

Legacy-Code to GPUs (Student: Chad Christopher)

Distributed Algorithms for Lifetime of Wireless Sensor Networks (Student: Akshaye Dhawan)

NP-Hard Distributed Problems in Networks
NSF Grant
• Minimum Vertex/Target Cover
• Minimum Triangle Packing
• Optimum mobile sensor network target tracking
• Minimum channel assignment in mobile ad hoc networks
Students: John Daigle, Thamer Sulaiman
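Because minimum vertex cover is NP-hard, practical approaches settle for approximations. The sketch below shows the classic centralized 2-approximation based on a greedy maximal matching; it is a textbook baseline, not the distributed algorithms developed under the grant, and the function name `vertex_cover_2approx` is hypothetical.

```python
from typing import Iterable, Set, Tuple


def vertex_cover_2approx(edges: Iterable[Tuple[int, int]]) -> Set[int]:
    """Return a vertex cover at most twice the size of a minimum one.

    Scans the edges once; whenever an edge is still uncovered, both of its
    endpoints join the cover. The chosen edges form a maximal matching.
    """
    cover: Set[int] = set()
    for u, v in edges:
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


if __name__ == "__main__":
    path = [(1, 2), (2, 3), (3, 4)]
    print(vertex_cover_2approx(path))  # a valid cover; the optimum here is {2, 3}
```

The bound holds because any cover must contain at least one endpoint of every matched edge, and the matched edges share no endpoints.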

Middleware for Mobile Ad-hoc Applications
[Figure: architecture diagram with components Mobile Support Station, Applications, Groupware, Deviceware, Listener (Process Requests), and Directory; numbered interactions: 1. Register, 2. Lookup, 3. p2p communication]
10 September 2020, UM-Morris
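The three numbered interactions (register, lookup, p2p communication) can be sketched as follows. This is a minimal in-memory illustration only: the `Directory` and `Device` classes and their methods are hypothetical and do not reflect the middleware's actual API.

```python
from typing import Dict, List, Optional, Tuple


class Device:
    """Hypothetical device that can receive messages from peers."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.inbox: List[Tuple[str, str]] = []  # (sender name, message)

    def receive(self, sender: str, message: str) -> None:
        self.inbox.append((sender, message))

    def send(self, directory: "Directory", peer_name: str, message: str) -> bool:
        peer = directory.lookup(peer_name)   # step 2: lookup
        if peer is None:
            return False
        peer.receive(self.name, message)     # step 3: direct p2p delivery
        return True


class Directory:
    """Hypothetical directory mapping device names to devices."""

    def __init__(self) -> None:
        self._entries: Dict[str, Device] = {}

    def register(self, device: Device) -> None:  # step 1: register
        self._entries[device.name] = device

    def lookup(self, name: str) -> Optional[Device]:
        return self._entries.get(name)


if __name__ == "__main__":
    directory = Directory()
    alice, bob = Device("alice"), Device("bob")
    directory.register(alice)
    directory.register(bob)
    alice.send(directory, "bob", "hello")
    print(bob.inbox)
```

Note that the directory is consulted only to find the peer; the message itself travels directly between the two devices.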

BondFlow: Distributed Workflow over Web Services (Student: Janaka Balasooriya)
• Web service interface module
• Proxy object generator module
• Workflow configuration module
• Execution module
• Mobile Web Services
[Figure: architecture diagram with labels Web Services Registry (UDDI), SOAP, Lookup for Web services, WSDL, Web Service Interface Module (WS Locator, WSDL Parser), Parsed WSDL, Proxy Object Generator Module, Workflow Configuration Module, Workflow Execution Module, Web Bond Runtime, SOAP/SyD, JVM]

P2P Search based on Bayesian Decision and Value of Information (VOI) (Student: Rasanjalee)
Peer Selection:
• Sending/forwarding query at each node along query path = series of decision making steps based on incomplete data
• A decision step: query the node that will reduce the uncertainty of current belief most.
The meaning of Uncertainty based Information:
• A Priori Uncertainty: U1
• A Posteriori Uncertainty: U2
• U1 - U2 = Information: the reduction in uncertainty at each decision step
[Figure: Current Belief updated across Decision step 1 .. Decision step n]
[Figure: Experimental Results: Success Ratio. Y axis: Success ratio % (0 to 100); X axis: # walkers (1 to 5); series: SR-APS, SR-UCBPS]
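The relation U1 - U2 = Information can be made concrete with a small sketch. The slide does not say which uncertainty measure is used; Shannon entropy over the current belief is assumed here purely for illustration, and the function names are hypothetical.

```python
import math
from typing import Sequence


def entropy(belief: Sequence[float]) -> float:
    """Shannon entropy, in bits, of a probability distribution (assumed measure)."""
    return -sum(p * math.log2(p) for p in belief if p > 0)


def information(prior: Sequence[float], posterior: Sequence[float]) -> float:
    """U1 - U2: the reduction in uncertainty from prior to posterior belief."""
    return entropy(prior) - entropy(posterior)


if __name__ == "__main__":
    # Belief over which of four peers holds the requested item (toy numbers).
    prior = [0.25, 0.25, 0.25, 0.25]   # U1 = 2 bits
    posterior = [0.5, 0.5, 0.0, 0.0]   # U2 = 1 bit after one query
    print(information(prior, posterior))
```

In this toy case a query that rules out two of four equally likely peers yields 1 bit of information; a decision step in the sense above would pick the peer whose answer is expected to yield the most.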

Middleware on Distributed Smart Cameras
• Middleware on DSC networks
  • provide a high-level programming interface for applications.
  • simplify the development of distributed applications on DSC networks.
  • provide networking functionality as part of the middleware
Student: Jayampathi Sampat
[Figure: cmucam3]
