
A Multithreading C# Data Synchronization Program and Its Realization
Course: ECE 1747H Parallel Programming
Professor: Christiana Amza
Student / Presenter: Bin Li
Dec. 12, 2006 @ University of Toronto

Agenda
- Background
- Problem & Solution
- Parallel Implementation
- Performance Measuring
- Other Approaches
- Future Work
- Q&A

Background (Company & Project)
- “Retail Value Canada Inc.”
  - Markham-based specialty retailer
  - 384 stores in Canada & USA, 30K types of items
  - Head office-side information maintained in Windows
  - Store-side information maintained in Unix
- Data synchronization is needed
  - Data type: product code, status, cost, price, promo, deal, subsidy, vendor, warehouse, etc. (by item, by store)
  - Current application: iSync (developed in 2000 in Visual C# 1.0)

Background (System Architecture)
[Figure: system architecture diagram]

Problem & Solution
- Scheduled data synchronization (the iSync process) starts at 10 pm and ends at 12 am
- iSync extracts and transforms data from Windows into an ASCII file (.dat) and sends it to Unix
- After mass data modifications, iSync takes 4-5 hours to run, which exceeds the 2-hour schedule limit
- The latest changes in head office (e.g., prices) cannot reach stores before the opening hour of the next business day

Problem & Solution (cont’d)
- The store-side information delay causes inaccurate sales information in retail stores
- Bottleneck: iSync (only 10% CPU usage on a 4-CPU database server)

Problem & Solution (cont’d)
- Sequential program
  - iSync generates the .dat file one store at a time, which is slow
- Parallel solution
  - Implementing qSync to replace iSync (using Microsoft C# multithreading)
  - Generating .dat files in parallel, by store group

Parallel Implementation
- Development environment
  - Design/UML tool: Microsoft Visio 2003
  - Development tool: Microsoft Visual Studio .NET 2003
  - Programming language: Visual C# 2.0 (multithreading similar to Linux PThreads)
- Parallelization steps
  - Store data segmentation
  - Parallel data processing
  - Result data consolidation
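The three parallelization steps can be sketched roughly as follows. This is a minimal illustration, not the qSync code: the names `StoreGroupWorker`, `SyncSketch`, `Segment`, and `Synchronize` are hypothetical, the round-robin grouping is an assumption (the slides do not say how stores are grouped), and a string placeholder stands in for the real extraction of one .dat record.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical worker: processes one store group and keeps its own results.
class StoreGroupWorker
{
    private readonly List<int> storeIds;
    public readonly List<string> Lines = new List<string>();

    public StoreGroupWorker(List<int> storeIds) { this.storeIds = storeIds; }

    // Step 2: each thread touches only its own group, so no lock is needed here.
    public void Run()
    {
        foreach (int id in storeIds)
            Lines.Add("store=" + id);   // placeholder for one .dat record
    }
}

static class SyncSketch
{
    // Step 1: split the store list into groupCount groups (round-robin is an assumption).
    public static List<List<int>> Segment(List<int> stores, int groupCount)
    {
        var groups = new List<List<int>>();
        for (int g = 0; g < groupCount; g++) groups.Add(new List<int>());
        for (int i = 0; i < stores.Count; i++) groups[i % groupCount].Add(stores[i]);
        return groups;
    }

    public static List<string> Synchronize(List<int> stores, int groupCount)
    {
        var workers = new List<StoreGroupWorker>();
        var threads = new List<Thread>();
        foreach (List<int> group in Segment(stores, groupCount))
        {
            var worker = new StoreGroupWorker(group);
            var thread = new Thread(worker.Run);
            workers.Add(worker);
            threads.Add(thread);
            thread.Start();
        }
        foreach (Thread t in threads) t.Join();   // wait for every group to finish

        // Step 3: consolidate the per-group results on the calling thread.
        var result = new List<string>();
        foreach (StoreGroupWorker w in workers) result.AddRange(w.Lines);
        return result;
    }

    static void Main()
    {
        var stores = new List<int>();
        for (int id = 1; id <= 384; id++) stores.Add(id);
        Console.WriteLine(Synchronize(stores, 4).Count);
    }
}
```

Because consolidation happens only after every `Join` has returned, the per-group result lists need no locking in this sketch.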

Parallel Implementation (cont’d)
[Figure: parallel implementation diagram]

User Interface (Screen 1)
[Figure: user interface screenshot 1]

User Interface (Screen 2)
[Figure: user interface screenshot 2]

Sample Code

    using System.Threading;

    private int getNbrOfInstance()
    {
        // ...
        string sqlStmt = "select cast(RBSValue as int) from RulesBasedSystem " +
                         "where RBSTxt = 'HISSPNbrOfInst' and RBSScopeKey = 'Retail Value'";
        // ...
    }

    HISSPCLPSyncComponent clpComponent = null;
    clpComponent = new HISSPCLPSyncComponent();
    ThreadStart threadDelegate = null;
    Thread threadObj = null;

Sample Code (cont’d)

    for (int i = 0; i < dtb.Rows.Count; i++)
    {
        // ...
        threadDelegate = new ThreadStart(clpComponent.ExtCLPPrice);
        threadObj = new Thread(threadDelegate);
        threadObj.Name = Convert.ToString(i);
        threadList.Add(threadObj);
        // Start the thread
        threadObj.Start();
    }

    // Join the threads
    for (int i = 0; i < dtb.Rows.Count; i++)
    {
        threadObj = (Thread) threadList[i];
        threadObj.Join();
    }

    while (j > 0) // Approach #3
    {
        lock (this)
        {
            consolidateCLPPrice(baseStoreId[j], itemBaseId[j], marketZoneId[j], itemPackId[j]);
        }
        j--;
    }

Performance Measuring
- Testing environment
  - Database server
    - Intel Xeon CPU 2.40 GHz, 4 CPUs, 3 GB RAM
    - Windows 2000 w/SP4, MS SQL Server 2000
    - Subset of real production data
  - Web/application server
    - Intel Pentium 4, 2 CPUs 3.40 GHz (HT), 2 GB RAM
    - Windows XP w/SP2, IIS 5.1
- Performance counters
  - CPU % usage
  - Execution time

Performance Comparison

    Number of Threads    CPU Usage    Execution Time (sec)
    1 (sequential)       10%          2552
    2                    25%          1545
    5                    75%          527
    10                   93%          461
    20                   100%         370
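The execution times in the comparison can be turned into speedups relative to the sequential run (sequential time divided by parallel time). The short sketch below does that arithmetic; the class and method names are illustrative only, and the numbers are copied from the comparison.

```csharp
using System;

static class SpeedupTable
{
    // Speedup relative to the sequential (1-thread) run.
    public static double Speedup(double sequentialSec, double parallelSec)
    {
        return sequentialSec / parallelSec;
    }

    static void Main()
    {
        int[] threads = { 1, 2, 5, 10, 20 };
        double[] seconds = { 2552, 1545, 527, 461, 370 };   // measured execution times (sec)
        for (int i = 0; i < threads.Length; i++)
        {
            Console.WriteLine("{0,2} threads: speedup {1:F2}x",
                              threads[i], Speedup(seconds[0], seconds[i]));
        }
    }
}
```

With these figures the speedup works out to about 1.65x at 2 threads, 4.84x at 5, 5.54x at 10, and 6.90x at 20, so most of the gain is reached by 5 threads and additional threads bring smaller improvements.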

Performance Comparison (cont’d)
[Figure: charts of CPU % usage and execution time]

Other Approaches (Approach #2): “Locking Temp Files”
[Figure: Approach #2 diagram]

Other Approaches (Approach #2 cont’d): “Locking Temp Files”
- All threads write to a single .dat file
- A lock guards each file append
- Result: as bad as sequential
- Explanation: the same disk file cannot be written simultaneously by different threads; it has to be closed and re-opened for each writer (unlike shared memory)
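A minimal sketch of the lock-and-append pattern behind Approach #2, assuming one shared lock object and a file that is opened and closed on every append. The names `SharedFileSketch`, `AppendRecord`, and `WriteFromThreads`, the temp-file name, and the record text are all hypothetical; this is not the presenter's code.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

static class SharedFileSketch
{
    private static readonly object fileLock = new object();

    // Opens, appends, and closes the file while holding the lock,
    // so only one thread writes at a time.
    public static void AppendRecord(string path, string record)
    {
        lock (fileLock)
        {
            using (StreamWriter writer = File.AppendText(path))
            {
                writer.WriteLine(record);
            }
        }
    }

    // Starts threadCount threads that all append to the same file,
    // then returns the number of lines that ended up in it.
    public static int WriteFromThreads(string path, int threadCount, int recordsPerThread)
    {
        File.Delete(path);   // start from an empty file; no error if it does not exist
        var threads = new List<Thread>();
        for (int t = 0; t < threadCount; t++)
        {
            int group = t;   // per-iteration copy captured by the lambda
            var thread = new Thread(() =>
            {
                for (int r = 0; r < recordsPerThread; r++)
                    AppendRecord(path, "group=" + group + ";record=" + r);
            });
            threads.Add(thread);
            thread.Start();
        }
        foreach (Thread th in threads) th.Join();
        return File.ReadAllLines(path).Length;
    }

    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "qsync_sketch.dat");
        Console.WriteLine(WriteFromThreads(path, 4, 100));
    }
}
```

Every append serializes on the same lock and pays for an open/close of the file, which is consistent with the observation that this approach performs no better than the sequential program.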

Other Approaches (Approach #3): “Locking Temp Tables”
[Figure: Approach #3 diagram]

Other Approaches (Approach #3 cont’d): “Locking Temp Tables”
- All threads share a single temporary database table
- A lock guards each table record insert
- Result: much better than sequential, but not as good as the Main Approach
- Explanation: the database server has enough memory; the lock adds only a slight delay

Future Work
- Database parallelism
  - Upgrading SQL Server 2000 to 2005
  - Migrating the C# data synchronization code to database stored procedures, optimizing SQL queries
  - Changing temporary table(s) to a permanent schema
  - Using SQL Server Integration Services (SSIS 2005) for parallel data load & transformation
  - Accessing the permanent table (which contains the final data to be synchronized) to generate the .dat file

Q&A
Thanks!

Additional Slide for Q&A (C#)
- C# (pronounced “C Sharp”)
  - Microsoft .NET Framework-compliant language
  - Simple, modern, object-oriented programming language derived from C and C++
  - Aims to combine the high productivity of Visual Basic and the raw power of C++
- C# vs Java
  - Similar but not the same in language specifications
  - Compilation: C# to Microsoft Intermediate Language (MSIL), Java to Java bytecode
  - Running: C# in the Common Language Runtime (CLR), Java in the Java Virtual Machine (JVM)

Additional Slide for Q&A (Main Approach vs Approach #3)
[Figure: Main Approach vs Approach #3 comparison]