Teradata Platform Introduction: Hardware and Software Components in Enterprise Data Warehouse
Derek Jones, March 2005

Teradata in the Enterprise
Teradata is a relational database management system
• Acts as central enterprise-wide database
  > Contains information extracted from operational systems
  > Central placement minimizes data duplication and provides a single view of the business
10/30/2020 Teradata Confidential

Key Teradata Differentiators
• Parallelism throughout platform
• Shared Nothing Architecture
• Proprietary intelligent system inter-connect

Teradata Scales Linearly
• Scaling achieved via ‘shared nothing’ architecture and unconditional parallelism
• Power is in linear scalability, where slope = 1
• Scales with data
• Scales with users
• Scales with work
[Figure: linear scaling chart; more nodes support more data, more users, and more work]
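
A minimal arithmetic sketch of what "slope = 1" implies, assuming an idealized system in which work divides evenly across nodes with no coordination overhead. The function name and the numbers are illustrative only; they are not measurements from the deck.

```python
def elapsed_seconds(work_units: float, nodes: int,
                    units_per_node_per_sec: float = 1.0) -> float:
    """Idealized elapsed time when work is spread evenly over all nodes."""
    return work_units / (nodes * units_per_node_per_sec)

# Baseline: 1000 units of work on 4 nodes.
base = elapsed_seconds(work_units=1000, nodes=4)

# Doubling the work and the nodes together keeps elapsed time flat.
same = elapsed_seconds(work_units=2000, nodes=8)

# Doubling only the nodes halves elapsed time for the same work.
half = elapsed_seconds(work_units=1000, nodes=8)

print(base, same, half)
```

Under this assumption, adding nodes in proportion to data, users, or work leaves response time unchanged, which is the behavior the slide describes.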

The Teradata Difference: “Multi-dimensional Scalability”
[Figure: radar chart with eight axes]
• Data Volume (Raw, User Data)
• Mixed Workload
• Query Concurrency
• Data Freshness
• Query Freedom
• Query Complexity
• Query Data Volume
• Schema Sophistication

The Teradata Difference: “Multi-dimensional Scalability”
[Figure: radar chart of the eight scalability dimensions, annotated]
• Competition can be tuned to meet a static environment
• Business needs change

The Teradata Difference: “Multi-dimensional Scalability”
[Figure: radar chart of the eight scalability dimensions, annotated]
• Competition can be tuned to meet a static environment
• Business needs change: desire to increase user/query concurrency, but at the expense of another dimension
• Competition scales one dimension at the expense of others

The Teradata Difference: “Multi-dimensional Scalability”
[Figure: radar chart of the eight scalability dimensions, annotated]
• Teradata can scale simultaneously across multiple dimensions: driven by business!
• Competition scales one dimension at the expense of others: limited by technology!

Node Architecture (‘Shared Nothing’)
Each Teradata node is made up of hardware and software
• Each node has CPUs, system disk, memory, and adapters
• Each node runs a copy of the OS and database software

Node Architecture (‘Shared Nothing’)
Each Teradata node is made up of hardware and software
• Each node runs a copy of the OS, database software, and virtual processes (above line)
• Each node has CPUs, system disk, memory, and adapters (below line)
[Figure: node diagram; V2 virtual processors (Vprocs: one PE vproc and AMP vprocs) above PDE and UNIX, with Vdisks attached]

NCR 5400 Server Value Prop
1. Better Price/Performance
  > 20% performance improvement
  > 12% price/performance improvement
2. Advanced Cabinet Design
  > Up to 10 nodes per cabinet
  > Up to a 40% reduction in floor space
3. Investment Protection
  > Multi-generation (5) coexistence
  > 32-bit/64-bit transition platform

NCR 5400 Server Key Messages
#2 – Advanced Cabinet Design
• Revolutionary cabinet increases reliability and provides greater configuration flexibility.
  > Up to 10 nodes per cabinet enable a 20%–40% smaller footprint than the 5380
  > 30% increase in system storage reliability with new advanced cooling mechanisms
  > Extends supported distance between cabinets for large systems (65+ nodes) to 300–600 meters with new BYNET V3
  > Doubles the number of configurable nodes to 1,024
[Figure: cabinet diagram labeled Ethernet Switches, BYNET V3 Switches, FC Switches, up to 10 nodes within each cabinet, Server Management Module (3GSM), and five UPS modules]

Parallelism via BYNET Interconnect
BYNET high-speed interconnect facilitates system communication
• All nodes connected via BYNET
  – Hardware network
  – Software runs on each node
Different communication paths facilitate system parallelism
  – 1 to 1 (Vproc to Vproc)
  – 1 to Many (Multicast)
  – 1 to All (Broadcast)
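
The three communication paths can be pictured with a toy message layer. This is a sketch for illustration only: the `Interconnect` class, its method names, and the message strings are hypothetical and say nothing about how BYNET is actually implemented.

```python
from typing import Dict, Iterable, List

class Interconnect:
    """Toy message layer showing the three delivery patterns (not the BYNET protocol)."""

    def __init__(self, vproc_ids: Iterable[int]) -> None:
        # One inbox per vproc.
        self.inboxes: Dict[int, List[str]] = {v: [] for v in vproc_ids}

    def point_to_point(self, target: int, msg: str) -> None:
        # 1 to 1: deliver to a single vproc.
        self.inboxes[target].append(msg)

    def multicast(self, targets: Iterable[int], msg: str) -> None:
        # 1 to Many: deliver to a chosen subset of vprocs.
        for t in targets:
            self.inboxes[t].append(msg)

    def broadcast(self, msg: str) -> None:
        # 1 to All: deliver to every vproc.
        self.multicast(list(self.inboxes), msg)

net = Interconnect(range(4))
net.point_to_point(2, "fetch row")
net.multicast([0, 1], "join step")
net.broadcast("full-table scan")
print(net.inboxes)
```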

MPP System Configuration
Nodes grouped to increase data availability and system uptime
  > Not shared storage, but access within group
  > Improves data availability
  > Improves system uptime
  > Allows for VPROC migration

Teradata Clique
A node clique is a group of nodes that access the same disk arrays
• VPROC is the smallest unit of parallelism
• Each VPROC has assigned storage within the clique
• VPROCs can migrate within the clique
• Improves system uptime, data availability, and ease of recovery
[Figure: nodes hosting VPROCs, connected to disk arrays]

Teradata Clique and VPROC
• VPROC is the smallest unit of parallelism or work
• Data is distributed by hash to all VPROCs
• Each VPROC has assigned storage within the clique
• VPROCs can migrate within the clique
• Improves system uptime, data availability, and ease of recovery
• Data remains fully available, at degraded performance, until the failed node returns
[Figure: clique of nodes and disk arrays with one node failed (marked X)]
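
VPROC migration within a clique can be sketched as below. The node and vproc names are made up, and the "give each orphaned vproc to the least-loaded survivor" rule is an assumption chosen for illustration; the slides do not say how the database picks the target node.

```python
from typing import Dict, List

def migrate_vprocs(clique: Dict[str, List[str]], failed_node: str) -> Dict[str, List[str]]:
    """Return a new node -> vprocs layout with the failed node's vprocs moved to survivors."""
    survivors = {n: list(v) for n, v in clique.items() if n != failed_node}
    if not survivors:
        raise RuntimeError("no surviving node in the clique")
    for vproc in clique[failed_node]:
        # Assumed policy: hand each orphaned vproc to the survivor hosting the fewest.
        target = min(survivors, key=lambda n: len(survivors[n]))
        survivors[target].append(vproc)
    return survivors

clique = {
    "node1": ["amp0", "amp1"],
    "node2": ["amp2", "amp3"],
    "node3": ["amp4", "amp5"],
}
after = migrate_vprocs(clique, "node2")
print(after)
```

Every vproc is still running after the failure, but each surviving node now hosts three instead of two, which matches the "fully available at degraded performance" bullet.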

Teradata Clique with Hot Standby Node
[Figure: clique with one node failed (marked X), the remaining nodes, and a hot standby node, all connected through Fibre Channel switches to the disk arrays]

Teradata Optimizer
• The Teradata Optimizer is the most robust in the industry
• Optimizer is parallel-aware and understands available system components
• Handles mixed workloads
  > Multiple complex queries
  > Joins per query
  > Unlimited ad-hoc processing
• Output is the least expensive plan (in resources) to answer the request

Teradata Request Cycle
• Each request parcel contains at least one SQL statement
• Six main component steps
  > Syntaxer
  > Resolver
  > Security
  > Optimizer
  > Generator
  > gncApply
• AMP steps are instructions sent to AMP VPROCs to complete the request
• Following completion, each request generates a success/fail parcel with any necessary records.
[Figure: request flow diagram. A REQUEST parcel meets a CACHED? decision (Yes/No) and flows through SYNTAXER, RESOLVER, SECURITY, OPTIMIZER, GENERATOR, and GNCAPPLY to AMP STEPS; also shown are a DATA parcel, STATISTICS, and the data dictionary (DD) tables DBase, AccessRights, TVM, TVFields, and Indexes]

Data Protection (Object Locks)
Locks protect data from simultaneous access
• Vary by type
  – Exclusive, Write, Read, and Access
• Vary by object locked
  – Database, Table, and Row Hash
• Locks enforced by hierarchy
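
The four lock types can be read as a compatibility table. The slide does not give the table itself; the matrix below follows the commonly documented Teradata rules (Access coexists with everything except Exclusive, Read coexists with Read and Access, Write coexists only with Access) and should be treated as an assumption. The `can_grant` helper is hypothetical.

```python
from enum import Enum
from typing import Iterable

class Lock(Enum):
    ACCESS = "access"
    READ = "read"
    WRITE = "write"
    EXCLUSIVE = "exclusive"

# Held lock -> requested locks that may be granted at the same time (assumed matrix).
COMPATIBLE = {
    Lock.ACCESS: {Lock.ACCESS, Lock.READ, Lock.WRITE},
    Lock.READ: {Lock.ACCESS, Lock.READ},
    Lock.WRITE: {Lock.ACCESS},
    Lock.EXCLUSIVE: set(),
}

def can_grant(requested: Lock, held: Iterable[Lock]) -> bool:
    """A request is granted only if it is compatible with every lock already held."""
    return all(requested in COMPATIBLE[h] for h in held)

print(can_grant(Lock.READ, [Lock.READ, Lock.ACCESS]))   # readers share
print(can_grant(Lock.WRITE, [Lock.READ]))               # writer waits for readers
print(can_grant(Lock.ACCESS, [Lock.WRITE]))             # access lock reads through a write
```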

Data Protection (RAID-1)
RAID data protection
• RAID-1 (disk mirroring)
• Disk pair increases read performance and data availability
• In a failure scenario, the mirrored drive is rebuilt by the array controller
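
A minimal sketch of the mirroring idea, assuming an in-memory model in which each disk is a dictionary of blocks. The `MirroredPair` class is hypothetical and leaves out everything a real array controller does (caching, read balancing, background rebuild).

```python
class MirroredPair:
    """Two disks holding identical blocks (RAID-1), modeled as dictionaries."""

    def __init__(self) -> None:
        self.disks = [{}, {}]          # block number -> data, one dict per disk
        self.online = [True, True]

    def write(self, block: int, data: str) -> None:
        # Every write goes to each disk that is still online.
        for i in (0, 1):
            if self.online[i]:
                self.disks[i][block] = data

    def read(self, block: int) -> str:
        # Either online disk can serve the read; this sketch takes the first one.
        for i in (0, 1):
            if self.online[i]:
                return self.disks[i][block]
        raise IOError("both disks in the pair are offline")

    def fail(self, i: int) -> None:
        self.online[i] = False
        self.disks[i] = {}

    def rebuild(self, i: int) -> None:
        # Copy every block from the surviving mirror onto the replacement disk.
        if not self.online[1 - i]:
            raise IOError("no surviving mirror to rebuild from")
        self.disks[i] = dict(self.disks[1 - i])
        self.online[i] = True

pair = MirroredPair()
pair.write(0, "row A")
pair.fail(0)                       # one drive fails
pair.write(1, "row B")             # writes continue on the survivor
still_readable = pair.read(0)      # data written before the failure is still there
pair.rebuild(0)                    # replacement drive is rebuilt from the mirror
print(still_readable, pair.disks[0] == pair.disks[1])
```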

Data Protection (Fallback)
• Fallback table data
• Copy of table rows maintained by the database on a second AMP VPROC
• Fallback copies are grouped logically in CLUSTERS, so data remains fully available when a physical CLIQUE is offline
• Fallback + RAID increase data availability
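
The relationship between clusters and cliques can be sketched as follows. The layout (six AMPs, three cliques, two clusters) is invented, and "the fallback copy lives on the next AMP in the cluster" is a deliberately simple rule chosen for illustration rather than the database's actual placement logic. The point is only that a cluster whose AMPs sit in different cliques keeps a readable copy when one clique goes offline.

```python
from typing import Dict, List

# Hypothetical layout: which clique each AMP physically belongs to.
AMP_CLIQUE: Dict[int, str] = {0: "A", 1: "B", 2: "C", 3: "A", 4: "B", 5: "C"}
# Logical clusters; each one takes a single AMP from every clique.
CLUSTERS: List[List[int]] = [[0, 1, 2], [3, 4, 5]]

def fallback_amp(primary: int) -> int:
    """Assumed rule: the fallback copy sits on the next AMP in the same cluster."""
    for cluster in CLUSTERS:
        if primary in cluster:
            i = cluster.index(primary)
            return cluster[(i + 1) % len(cluster)]
    raise ValueError(f"AMP {primary} is not in any cluster")

def readable_amp(primary: int, offline_clique: str) -> int:
    """Pick the AMP that can serve a row while one clique is offline."""
    if AMP_CLIQUE[primary] != offline_clique:
        return primary
    return fallback_amp(primary)

# With clique "A" offline, rows owned by AMP 0 are served from AMP 1 (clique "B").
print(readable_amp(0, "A"), readable_amp(1, "A"))
```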

Data Storage and Access
Data stored by hash
• Primary Index is chosen for data distribution; it is not the same as the primary key
• Primary Index value is hashed
• Hash value creates the bucket assignment
• Hash Map assigns buckets to AMP VPROCs
• AMP VPROCs reside on a specific node
• AMP VPROC writes the row to disk
• Data and algorithm exceptions (duplicate index values and hash collisions) require a Uniqueness value to guarantee a unique Row ID
[Figure: Primary Index value = 25 enters the Parsing Engine; Hashing Algorithm → Row Hash → Bucket # → Hash Map → Message Passing Layer → AMP]
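
The chain from Primary Index value to AMP can be sketched as below. Teradata's hashing algorithm is proprietary, so `zlib.crc32` stands in for it here; the bucket count of 65,536, the four AMPs, and the round-robin hash map are likewise assumptions made for the example.

```python
import zlib
from collections import Counter

NUM_BUCKETS = 65536   # assumption: 16-bit bucket number
NUM_AMPS = 4          # assumption: a small four-AMP system

def row_hash(pi_value) -> int:
    """Stand-in for the hashing algorithm: a 32-bit hash of the PI value."""
    return zlib.crc32(repr(pi_value).encode("utf-8"))

def bucket_of(h: int) -> int:
    """Use the high-order 16 bits of the row hash as the bucket number."""
    return h >> 16

# Hash map: bucket number -> AMP. Round-robin assignment is assumed here.
HASH_MAP = [b % NUM_AMPS for b in range(NUM_BUCKETS)]

def amp_for(pi_value) -> int:
    return HASH_MAP[bucket_of(row_hash(pi_value))]

# The same PI value always lands on the same AMP.
print(amp_for(25), amp_for(25))

# Many distinct PI values spread across the AMPs.
counts = Counter(amp_for(v) for v in range(1000))
print(sorted(counts.items()))
```

Because only the hash map ties buckets to AMPs, adding AMPs means reassigning some buckets in the map rather than rehashing every row, which is one reason the bucket indirection exists.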

Data Access by Primary Index
Data accessed by row hash value
Need 3 pieces of information to find a row
• Table ID
• Row Hash of PI value
  > Output of hash algorithm on PI value
• PI Value
• Operation involves only one AMP VPROC
[Figure: on AMP #3, Table-id and Row-hash are looked up in the Master Index to get a cylinder number, then in that cylinder's Cylinder Index (Cyl 1 to Cyl 7 shown) to find the DATA BLOCK; the PI value identifies the data row]
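
A sketch of why all three pieces of information are needed, assuming a toy in-memory AMP. The hash function is deliberately coarse so that two PI values collide; the table ID, names, and `Amp` class are hypothetical.

```python
from typing import Dict, List, Tuple

TABLE_ID = 100  # hypothetical table ID

def toy_row_hash(pi_value: int) -> int:
    """Deliberately coarse stand-in hash so that 25 and 1025 collide."""
    return pi_value % 1000

class Amp:
    def __init__(self) -> None:
        # (table ID, row hash) -> rows stored under that hash
        self.rows: Dict[Tuple[int, int], List[dict]] = {}

    def insert(self, table_id: int, row_hash: int, row: dict) -> None:
        self.rows.setdefault((table_id, row_hash), []).append(row)

    def find(self, table_id: int, row_hash: int, pi_value: int) -> List[dict]:
        # Table ID and row hash locate the candidates; the PI value picks the match.
        candidates = self.rows.get((table_id, row_hash), [])
        return [r for r in candidates if r["pi"] == pi_value]

amps = [Amp() for _ in range(4)]

def owner(row_hash: int) -> Amp:
    """Stand-in for the hash map: exactly one AMP owns each row hash."""
    return amps[row_hash % len(amps)]

for pi, name in [(25, "Jones"), (1025, "Smith"), (7, "Brown")]:
    h = toy_row_hash(pi)
    owner(h).insert(TABLE_ID, h, {"pi": pi, "name": name})

# Lookup touches a single AMP and filters the colliding row by PI value.
h = toy_row_hash(25)
found = owner(h).find(TABLE_ID, h, 25)
print(found)
```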

Data Access by Unique Secondary Index (USI)
Create USI
  CREATE UNIQUE INDEX (cust) ON customer;
Access via USI
  SELECT * FROM customer WHERE cust = 56;
USI data access
• Index is created on table
• SQL uses USI by value
• PE VPROC managing session uses same information as primary index access (Table ID, Row Hash, Index Value)
• This process involves two AMP VPROC operations
• After USI subtable lookup, process is similar to primary index access
[Figure: for the Customer table (Table ID = 100) and USI value = 56, the PE applies the hashing algorithm and sends (Table ID 100, Row Hash 602, Index Value 56) through the Message Passing Layer to the AMP holding the USI subtable row; that row yields (Table ID 100, Row Hash 778, Unique Val 7), which goes through the Message Passing Layer to the AMP holding the base-table row]

USI Subtable
  AMP    Row ID  Cust  Base Row ID
  AMP 1  244,1   74    884,1
  AMP 1  505,1   77    639,1
  AMP 1  744,4   51    915,9
  AMP 1  757,1   27    388,1
  AMP 2  135,1   98    555,6
  AMP 2  296,1   84    536,5
  AMP 2  602,1   56    778,7
  AMP 2  969,1   49    147,1
  AMP 3  288,1   31    638,1
  AMP 3  339,1   40    640,1
  AMP 3  372,2   45    471,1
  AMP 3  588,1   95    778,3
  AMP 4  175,1   37    107,1
  AMP 4  489,1   72    717,2
  AMP 4  838,4   12    147,2
  AMP 4  919,1   62    822,1

Base Table (Cust is the USI; Phone is the NUPI)
  AMP    Row ID  Cust  Name    Phone
  AMP 1  107,1   37    White   555-4444
  AMP 1  536,5   84    Rice    666-5555
  AMP 1  638,1   31    Adams   111-2222
  AMP 1  640,1   40    Smith   222-3333
  AMP 2  471,1   45    Adams   444-6666
  AMP 2  555,6   98    Brown   333-9999
  AMP 2  717,2   72    Adams   666-7777
  AMP 2  884,1   74    Smith   555-6666
  AMP 3  147,1   49    Smith   111-6666
  AMP 3  147,2   12    Young   777-4444
  AMP 3  388,1   27    Jones   222-8888
  AMP 3  822,1   62    Black   444-5555
  AMP 4  639,1   77    Jones
  AMP 4  778,3   95    Peters
  AMP 4  778,7   56    Smith
  AMP 4  915,9   51    Marsh
  Phone values shown for AMP 4: 777-6666, 555-7777, 888-2222 (three values listed for four rows)
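
The two-AMP USI lookup can be traced with a few of the rows from the slide. In this sketch the `amp_holding` helper scans for the key as a stand-in for hash-map routing, and only a subset of the slide's rows is loaded; both are simplifications for illustration.

```python
# USI subtable rows from the slide: cust value -> base-table Row ID (row hash, uniqueness value)
USI_SUBTABLE = {
    "AMP 1": {74: (884, 1), 77: (639, 1), 51: (915, 9)},
    "AMP 2": {98: (555, 6), 56: (778, 7)},
}

# Base-table rows from the slide: Row ID -> (cust, name)
BASE_TABLE = {
    "AMP 2": {(884, 1): (74, "Smith"), (555, 6): (98, "Brown")},
    "AMP 4": {(778, 7): (56, "Smith"), (639, 1): (77, "Jones"), (915, 9): (51, "Marsh")},
}

def amp_holding(table, key):
    """Stand-in for hash-map routing: find the AMP whose slice contains the key."""
    for amp, rows in table.items():
        if key in rows:
            return amp
    return None

def select_by_usi(cust):
    """Return the base row for a USI value plus the list of AMPs that were touched."""
    touched = []
    usi_amp = amp_holding(USI_SUBTABLE, cust)
    if usi_amp is None:
        return None, touched
    touched.append(usi_amp)                    # AMP operation 1: USI subtable lookup
    row_id = USI_SUBTABLE[usi_amp][cust]
    base_amp = amp_holding(BASE_TABLE, row_id)
    touched.append(base_amp)                   # AMP operation 2: base-table fetch by Row ID
    return BASE_TABLE[base_amp][row_id], touched

row, touched = select_by_usi(56)
print(row, touched)
```

For `cust = 56` the subtable row on AMP 2 points at Row ID 778,7, and the base row is then fetched from AMP 4, which is the two-operation path the bullets describe.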

Data Access via Non-unique Secondary Index (NUSI)
Create NUSI
  CREATE INDEX (name) ON customer;
Access via NUSI
  SELECT * FROM customer WHERE name = 'Adams';
NUSI data access
• Index is created on table
• SQL uses NUSI by value
• PE VPROC managing session uses same information as primary index access (Table ID, Row Hash, Index Value)
• This process involves all-AMP VPROC operations
[Figure: for the Customer table (Table ID = 100) and NUSI value = 'Adams', the PE applies the hashing algorithm and sends (Table ID 100, Row Hash 567, Index Value 'Adams') through the Message Passing Layer to every AMP; each AMP searches its own NUSI subtable and then its own base-table rows]

NUSI Subtable (each AMP indexes only its own base-table rows)
  AMP    Row ID  Name    Base Row ID(s)
  AMP 1  432,8   Smith   640,1
  AMP 1  448,1   White   107,1
  AMP 1  567,3   Adams   638,1
  AMP 1  656,1   Rice    536,5
  AMP 2  432,3   Smith   884,1
  AMP 2  567,2   Adams   471,1; 717,2
  AMP 2  852,1   Brown   555,6
  AMP 3  432,1   Smith   147,1
  AMP 3  448,4   Black   822,1
  AMP 3  567,6   Jones   388,1
  AMP 3  770,1   Young   147,2
  AMP 4  155,1   Marsh   915,9
  AMP 4  396,1   Peters  778,3
  AMP 4  432,5   Smith   778,7
  AMP 4  567,1   Jones   639,1

Base Table (Name is the NUSI; Phone is the NUPI)
  AMP    Row ID  Cust  Name    Phone
  AMP 1  107,1   37    White   555-4444
  AMP 1  536,5   84    Rice    666-5555
  AMP 1  638,1   31    Adams   111-2222
  AMP 1  640,1   40    Smith   222-3333
  AMP 2  471,1   45    Adams   444-6666
  AMP 2  555,6   98    Brown   333-9999
  AMP 2  717,2   72    Adams   666-7777
  AMP 2  884,1   74    Smith   555-6666
  AMP 3  147,1   49    Smith   111-6666
  AMP 3  147,2   12    Young   777-4444
  AMP 3  388,1   27    Jones   222-8888
  AMP 3  822,1   62    Black   444-5555
  AMP 4  639,1   77    Jones
  AMP 4  778,3   95    Peters
  AMP 4  778,7   56    Smith
  AMP 4  915,9   51    Marsh
  Phone values shown for AMP 4: 777-6666, 555-7777, 888-2222 (three values listed for four rows)
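
The all-AMP NUSI lookup can be traced with the slide's data. This is a sketch only: the dictionary layout and `select_by_nusi` are hypothetical, base rows are reduced to their Cust value, and the Jones entry on AMP 3 uses Row ID 388,1 as shown in the base table.

```python
# Per-AMP data from the slide: a local NUSI subtable (name -> local base Row IDs)
# and the local base rows (Row ID -> cust number).
AMPS = {
    "AMP 1": {
        "nusi": {"Smith": [(640, 1)], "White": [(107, 1)], "Adams": [(638, 1)], "Rice": [(536, 5)]},
        "base": {(640, 1): 40, (107, 1): 37, (638, 1): 31, (536, 5): 84},
    },
    "AMP 2": {
        "nusi": {"Smith": [(884, 1)], "Adams": [(471, 1), (717, 2)], "Brown": [(555, 6)]},
        "base": {(884, 1): 74, (471, 1): 45, (717, 2): 72, (555, 6): 98},
    },
    "AMP 3": {
        "nusi": {"Smith": [(147, 1)], "Black": [(822, 1)], "Jones": [(388, 1)], "Young": [(147, 2)]},
        "base": {(147, 1): 49, (822, 1): 62, (388, 1): 27, (147, 2): 12},
    },
    "AMP 4": {
        "nusi": {"Marsh": [(915, 9)], "Peters": [(778, 3)], "Smith": [(778, 7)], "Jones": [(639, 1)]},
        "base": {(915, 9): 51, (778, 3): 95, (778, 7): 56, (639, 1): 77},
    },
}

def select_by_nusi(name):
    """Ask every AMP; each one checks its own subtable and returns its own rows."""
    asked, results = [], []
    for amp, data in AMPS.items():
        asked.append(amp)
        for row_id in data["nusi"].get(name, []):
            results.append((amp, row_id, data["base"][row_id]))
    return results, asked

results, asked = select_by_nusi("Adams")
print(results)
print(asked)
```

Unlike the USI case, the index value says nothing about which AMP holds matching rows, so every AMP is asked even though only AMP 1 and AMP 2 return anything for 'Adams'.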

Teradata Structures
Database structures
• Users
• Databases
  > Tables
  > Views
  > Macros
  > Triggers
  > Stored Procedures
  > User Defined Functions

Teradata is an Open System
Virtually any application or middleware framework can be integrated with Teradata.
[Figure: integration diagram labeled Messages, Queues, Message Bus, JSM, Publish & Subscribe, Web, JSP, ASP, EJB, JAVA, CORBA, IIOP, .NET, JDBC, ODBC, OLE-DB, Teradata Utilities, Adapter(s), and Teradata Adapter(s)]

64-bit Teradata Solution
Teradata on SuSE Linux, 2H 2005
Teradata Database on Intel 32-bit and 64-bit will support both 32-bit and 64-bit applications and clients concurrently
[Figure: tier diagram. Labels include 32-bit and 64-bit Client-Tier and Application Server-Tier (Intel: DELL, HP, IBM; also IBM/PowerPC, SUN/SPARC, HP/PA-RISC), an Intel 64-bit Database Server-Tier (2H 2005), the operating system (Linux) on the Intel platform, and the software stack: 3rd Party Partners, Teradata Application, Teradata Tools & Utilities, Teradata System Mgmt, Teradata Database]

Teradata’s Real-Time Enterprise Reference Architecture
[Figure: reference architecture diagram. Enterprise users (Consumers, Suppliers, Internal, Partners) connect through browsers and/or portals, EDI, client/server (C/S), and legacy environments over WAN/VAN and Internet/Intranet. Transactional Services (TX 1 to TX 4 applications) sit over the Transactional Repositories (OLTP 1 to OLTP 4); Analytic & Decision Making Services (Strategic, Tactical, and BI applications) sit over the Analytic & Decision Making Repositories (EDW-A, EDW-B, RS, RDBMS Based Event Processing). The two sides are linked by ASP/JSP Service Brokers, the Enterprise Message Bus, Business Process Automation (Event Detection, Business Rules, Event Notification), and Data Acquisition & Integration (Streaming, Batch). Middleware labels: NW, MSG-MW, DA-MW, QD]

Transactional Services
• Application Scope: applications have narrow scope and are tuned for specific book-keeping or transactional services
• Transactional Application Services: applications that perform book-keeping or transactional services for the enterprise
• Data Access Middleware: occurs via standards such as ODBC, OLE-DB, and JDBC, as well as proprietary techniques
• OLTP Data Repositories
  > Data that reflects the current state of various business processes
  > Limited history
  > Tuned for transaction workload
[Figure: Transactional Services (NW, MSG-MW, TX 1 to TX 4 applications, DA-MW) over Transactional Repositories (OLTP 1 to OLTP 4)]

Transactional User Base
• Transactional User Base: Consumers, Suppliers, Internal, and Trading Partners
• Service Brokers: J2EE, CORBA, DCOM, Web Services
• User-level Integration: occurs via standard EAI services, such as JAVA, WebSphere, .NET, Tibco, and SeeBeyond
• Client/Server Styles: 2-tier and 3-tier RPC-style interfaces
[Figure: enterprise users (browsers and/or portal, EDI, C/S, legacy environment) connect over WAN/VAN and Internet/Intranet, through ASP/JSP Service Brokers and the Enterprise Message Bus, to Transactional Services (TX 1 to TX 4 applications) and Transactional Repositories (OLTP 1 to OLTP 4)]

Data Warehouse Services
• Application Scope: strategic and tactical decision-making applications, through BI tools or custom applications
• Analytic & Decision Making Services: applications that provide predictive analysis and assisted decision making
• Data Access Middleware: occurs via standards such as ODBC, OLE-DB, and JDBC, as well as proprietary techniques
• Enterprise Data Warehouse
  > Consolidated enterprise data
  > Crosses multiple business domains
  > Integrated data model
[Figure: Analytic & Decision Making Services (MSG-MW, NW, Strategic, Tactical, and BI applications, DA-MW, QD) over Analytic & Decision Making Repositories (EDW-A, EDW-B, RS, RDBMS Based Event Processing)]

Decision Support User Base
• DW User Base: Consumers, Suppliers, Internal, and Trading Partners
• Service Broker Styles: J2EE, CORBA, DCOM, Web Services
• User-level Integration: occurs via standard EAI services, such as Web Services, JAVA, .NET, Tibco, and SeeBeyond
• Client/Server Styles: 2-tier and 3-tier RPC-style interfaces
[Figure: enterprise users (browsers and/or portal, EDI, C/S, legacy environment) connect over WAN/VAN and Internet/Intranet, through ASP/JSP Service Brokers and the Enterprise Message Bus, to Analytic & Decision Making Services (Strategic, Tactical, and BI applications) and Analytic & Decision Making Repositories (EDW-A, EDW-B, RS, RDBMS Based Event Processing)]

Data Acquisition Services
Data Extraction
• Data is extracted from OLTP systems
• Partner ETL tools are frequently used here
Data Transformation Services
• Data cleansing
• Data transformation (normalization)
• Streaming data for frequent updates
• Batch data moves for bulk operations
• Partner ETL tools are typically used to perform these services
Data Acquisition Options
• Traditional load utilities (bulk or continuous loads)
• Loads through “in-flight” message passing
• Replication: table-level replication from source to target
Data Load
• Data is loaded into the EDW system using Teradata load tools
  > FastLoad
  > MultiLoad
  > TPump
[Figure: Data Acquisition & Integration (Streaming, Batch) moves data from Transactional Repositories (OLTP 1 to OLTP 4) to Analytic & Decision Making Repositories (EDW-A, EDW-B, RS, RDBMS Based Event Processing)]

Event-Driven Business Processes
Business Process Automation
• Event Detection
• Applied Business Rules
• Event Notification
• Messages are passed via P2P, Web Services, or Enterprise Message Bus
RDBMS Based Event Processing
• Real-time events are detected through a combination of Triggers, Stored Procedures, and UDFs
• Event engine performs query
• Messages are passed via P2P, Web Services, or Enterprise Message Bus
[Figure: Business Process Automation (Event Detection, Business Rules, Event Notification) between the Transactional Repositories (OLTP 1 to OLTP 4) and the Analytic & Decision Making Repositories (EDW-A, EDW-B, RS, RDBMS Based Event Processing), alongside Data Acquisition & Integration (Streaming, Batch)]

Application Integration
Decision-making applications interact with bookkeeping applications via standard enterprise services, such as Web Services, JAVA, and .NET, or through the use of traditional client/server technology.
[Figure: Transactional Services (TX 1 to TX 4 applications) and Analytic & Decision Making Services (Strategic, Tactical, and BI applications) connected through ASP/JSP Service Brokers and the Enterprise Message Bus, over the Transactional Repositories (OLTP 1 to OLTP 4) and the Analytic & Decision Making Repositories (EDW-A, EDW-B, RS, RDBMS Based Event Processing)]

Dual Active Solution
Replication Services
• Changed data capture in V2R6
• Update propagation via “GoldenGate”
Teradata Query Director
• Query routing control based on business rules
• Business continuity, workload sharing
Dual Data Load
• Input data stream is split into two independent load streams
• Input data is filtered so that only critical data is loaded on the Secondary “Active” system
Secondary “Active” system does not need to be as large as the primary system
[Figure: Analytic & Decision Making Services (Strategic, Tactical, and BI applications) routed by QD to EDW-A and EDW-B, fed by Data Acquisition & Integration (Streaming, Batch)]
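
The dual data load idea can be sketched as a simple stream split. The record layout and the rule that "orders" rows are the critical ones are invented for the example; in practice the split and filter would be done by the load tooling rather than by application code like this.

```python
from typing import Callable, Iterable, List, Tuple

def dual_load(records: Iterable[dict],
              is_critical: Callable[[dict], bool]) -> Tuple[List[dict], List[dict]]:
    """Split one input stream into a full primary load and a filtered secondary load."""
    primary: List[dict] = []
    secondary: List[dict] = []
    for rec in records:
        primary.append(rec)            # the primary system receives everything
        if is_critical(rec):
            secondary.append(rec)      # the secondary receives only critical data
    return primary, secondary

# Hypothetical input stream and criticality rule.
records = [
    {"table": "orders", "id": 1},
    {"table": "clickstream", "id": 2},
    {"table": "orders", "id": 3},
]
primary, secondary = dual_load(records, lambda r: r["table"] == "orders")
print(len(primary), [r["id"] for r in secondary])
```

Because the secondary stream carries only a subset of the data, the secondary system can be sized smaller than the primary, as the slide notes.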