EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS 05011004 Ayhan KARGIN

  • Slides: 25
Download presentation
EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS 05011004 Ayhan KARGIN 05011027 Ahmet MERAL

EXTERNAL SORTING ALGORITHMS AND IMPLEMENTATIONS 05011004 Ayhan KARGIN 05011027 Ahmet MERAL

Contents ¡ ¡ ¡ ¡ External Sorting Needs and Usage Areas External Sorting Algorithms

Contents ¡ ¡ ¡ ¡ External Sorting Needs and Usage Areas External Sorting Algorithms Environments for Implementing External Sorting Algorithms Used Technologies Phases of External Sorting K-Way Merge Sort Multi-Step K-Way Merge Sort Replacement Selection Sort Simple. DB Layered Components of Simple. DB Classes of Simple. DB Query Layer of Simple. DB Relational Algebra, that Simple. DB Supports Relational Algebra Preparatory Work External Sorting on Simple. DB

External Sorting Needs and Usage Areas ¡ DBMS l Group By, Join, Order By

External Sorting Needs and Usage Areas ¡ DBMS l Group By, Join, Order By Data Warehouse (ETL) ¡ Data Mining ¡ Data Processing ¡

External Sorting Algorithms K-Way Merge Sort ¡ Multi-Step K-Way Merge Sort ¡ Replacement Selection

External Sorting Algorithms K-Way Merge Sort ¡ Multi-Step K-Way Merge Sort ¡ Replacement Selection Sort ¡

Environments for Implementing External Sorting Algorithms Min. SQL ¡ Posgre. SQL ¡ Simple. DB

Environments for Implementing External Sorting Algorithms Min. SQL ¡ Posgre. SQL ¡ Simple. DB ¡

Used Technologies Java VM ¡ Java SE ¡ JDBC ¡ Java RMI ¡ Eclipse

Used Technologies Java VM ¡ Java SE ¡ JDBC ¡ Java RMI ¡ Eclipse ¡

Phases of External Sorting Run Generation Phase ¡ Merge Phase ¡

Phases of External Sorting Run Generation Phase ¡ Merge Phase ¡

K-Way Merge Sort

K-Way Merge Sort

Multi-Step K-Way Merge Sort

Multi-Step K-Way Merge Sort

Multi-Step K-Way Merge Sort

Multi-Step K-Way Merge Sort

Replacement Selection

Replacement Selection

Replacement Selection Stage Area Disc Access Count Run Time K (Run Count) 1 x

Replacement Selection Stage Area Disc Access Count Run Time K (Run Count) 1 x 3817 0. 690 17 2 x 420 0. 742 8 3 x 252 1. 049 6 4 x 258 1. 420 4 5 x 266 1. 952 4 6 x 279 2. 333 3 Test results depending on stage area size Note: Main memory is 8 x.

Simple. DB ¡ The Client Side l ¡ The Basic Server l ¡ That

Simple. DB ¡ The Client Side l ¡ The Basic Server l ¡ That contains the JDBC interfaces and implements the JDBC driver Which provides complete funcionality of DB but ignores efficiency issues Extensions l To the basic server that support efficient query processing.

Layered Components of Simple. DB ¡ Remote l ¡ Parse l ¡ Create an

Layered Components of Simple. DB ¡ Remote l ¡ Parse l ¡ Create an execution strategy for an SQL statement, and translate it to a relational algebra plan. Query l ¡ Extract the tables, fields, and predicate mentioned in an SQL statement. Planner l ¡ Perform JDBC requests received from clients. Implement queries expressed in relational algebra. Metadata l Maintain metadata about the tables in the database, so that its records and fields are accessible.

Layered Components of Simple. DB ¡ Record l ¡ Transaction l ¡ Maintain a

Layered Components of Simple. DB ¡ Record l ¡ Transaction l ¡ Maintain a cache of pages in memory to hold recently-accessed user data. Log l ¡ Support concurrency by restricting page access. Enable recovery by logging changes to pages. Buffer l ¡ Provide methods for storing data records in pages Append log records to the log file, and scan the records in the log file. File l Read and write between file blocks and memory pages.

Classes of Simple. DB

Classes of Simple. DB

Classes of Simple. DB

Classes of Simple. DB

Classes of Simple. DB

Classes of Simple. DB

Query Layer of Simple. DB ¡ Relational Algebra l l Select Project Product Sort

Query Layer of Simple. DB ¡ Relational Algebra l l Select Project Product Sort

Relational Algebra, that Simple. DB Supports Relational Algebra SQL Select Project Product Sort Where

Relational Algebra, that Simple. DB Supports Relational Algebra SQL Select Project Product Sort Where Select Join Order By

Relational Algebra ¡ ¡ Select SId, SName, DName from STUDENT, DEPT where Major. Id=DId

Relational Algebra ¡ ¡ Select SId, SName, DName from STUDENT, DEPT where Major. Id=DId and DName='math' Q 1: Product (STUDENT, DEPT) Q 2: Select (Q 2, Major. Id=DId ) Q 3: Select (Q 2, DName='math') Q 4: Project (Q 3, {SId, SName, DName })

Relational Algebra

Relational Algebra

Preparatory Work ¡ ¡ ¡ ¡ We have to introduce order by operator (in

Preparatory Work ¡ ¡ ¡ ¡ We have to introduce order by operator (in SQL) to parser layer of Simple. DB. The columns, which will be sort, must be presenced in Query. Data class in structural form. If sorting will be happened, sorting plans must be called by Basic. Query. Planner class. Calculating number of random accesses per file on File. Mgr layer. Calculating duration of transaction with adding start time and end time to Transaction class. Calculating number of unsorted records on each transaction. We may have to compare, copy and exchange congeneric records on different scans. So, Record. Exchange class is written for these purposes. Step. Calculator class is written. It splits any number to two close integer multipliers. For example; from 29, 6 and 5; from 49, 7 and 7; from 50, 8 and 7; etc.

External Sorting on Simple. DB Materialize Plan ¡ Edward Sciore’s Merge Sort Plan&Scan ¡

External Sorting on Simple. DB Materialize Plan ¡ Edward Sciore’s Merge Sort Plan&Scan ¡ K-Way Merge Sort Plan&Scan ¡ Multi-Step K-Way Merge Sort Plan&Scan ¡ Replacement Selection Sort Plan&Scan ¡ Multi-Step Replacement Selection Sort Plan&Scan ¡

Algorithms Choice Dependency

Algorithms Choice Dependency