STORAGE ARCHITECTURE / GETTING STARTED: SAN SCHOOL 101
Marc Farley, President of Building Storage, Inc.
Author, Building Storage Networks

Agenda
• Lesson 1: Basics of SANs
• Lesson 2: The I/O path
• Lesson 3: Storage subsystems
• Lesson 4: RAID, volume management and virtualization
• Lesson 5: SAN network technology
• Lesson 6: File systems

Lesson #1 Basics of storage networking

Connecting
• Networking or bus technology
• Cables + connectors
• System adapters + network device drivers
• Network devices such as hubs, switches, routers
• Virtual networking
• Flow control
• Network security

Storing
• Device (target) command control
  § Drives, subsystems, device emulation
• Block storage address space manipulation (partition management)
  § Mirroring
  § RAID
  § Striping
  § Virtualization
  § Concatenation

Filing
• Namespace presents data to end users and applications as files and directories (folders)
• Manages use of storage address spaces
• Metadata for identifying data
  § file name
  § owner
  § dates
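
To make the metadata idea concrete, here is a minimal sketch (Python, with invented field names, not from the original deck) of the kind of per-file record a filing function might keep: a namespace entry, an owner, dates, and the storage addresses the file occupies.

```python
# Hypothetical per-file metadata record; field names are illustrative only.
from dataclasses import dataclass, field
from datetime import datetime

@dataclass
class FileMetadata:
    name: str                  # namespace entry presented to users/applications
    owner: str                 # owning user
    created: datetime          # dates kept as identifying metadata
    modified: datetime
    blocks: list[int] = field(default_factory=list)  # storage addresses the file uses
```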

Connecting, storing and filing as a complete storage system (diagram: the three functions layered as one storage system)

NAS and SAN analysis
• NAS is filing over a network
• SAN is storing over a network
• NAS and SAN are independent technologies
  § They can be implemented independently
  § They can co-exist in the same environment
  § They can both operate and provide services to the same users/applications

Protocol analysis for NAS and SAN (diagram: the filing, storing and connecting protocol layers compared for NAS and SAN over a network)

Integrated SAN/NAS environment (diagram: a NAS server acting as a SAN initiator, the "NAS head", stacking the filing, storing and connecting layers)

Common wiring with NAS and SAN (diagram: a NAS head and SAN storage sharing the same connecting infrastructure)

Lesson #2 The I/O path

Host hardware path components
• Processor
• Memory
• Memory bus
• System I/O bus
• Storage adapter (HBA)

Host software path components
• Application
• Operating system
• Filing (file system)
• Cache
• Volume manager
• Multipathing
• Device driver

Network hardware path components
• Cabling: fiber optic, copper
• Switches, hubs, routers, bridges, gateways
• Port buffers, processors
• Backplane: bus, crossbar, mesh, memory

Network software path components
• Access and security
• Fabric services
• Routing
• Flow control
• Virtual networking

Subsystem path components
• Network ports
• Access and security
• Cache
• Resource manager
• Internal bus or network

Device and media path components
• Disk drives
• Tape media
• Solid state devices

The end-to-end I/O path picture (diagram assembling the components above: application, operating system, filing, cache, volume manager, multipathing and device driver over the host's processor, memory, memory bus, system I/O bus and storage adapter (HBA); cabling, routing, flow control, fabric services, access and security and virtual networking in the network systems; network ports, access and security, cache, resource manager and internal bus or network in the subsystem; ending at disk drives and tape drives)

Lesson #3 Storage subsystems

Generic storage subsystem model
• Controller (logic + processors)
• Access control
• Network ports
• Resource manager
• Cache memory
• Internal bus or network
• Storage resources
• Power

Redundancy for high availability
• Multiple hot-swappable power supplies
• Hot-swappable cooling fans
• Data redundancy via RAID
• Multi-path support
  § Network ports to storage resources

Physical and virtual storage (diagram: a subsystem controller's resource manager (RAID, mirroring, etc.) maps physical storage devices, including a hot spare device, to exported storage)

SCSI communications architectures determine SAN operations
• SCSI communications are independent of connectivity
• SCSI initiators (HBAs) generate I/O activity
• They communicate with targets
  § Targets have communications addresses
  § Targets can have many storage resources
  § Each resource is a single SCSI logical unit (LU) with a universally unique ID (UUID), sometimes referred to as a serial number
  § An LU can be represented by multiple logical unit numbers (LUNs)
  § Provisioning associates LUNs with LUs & subsystem ports
• A storage resource is not a LUN, it's an LU
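
One rough way to picture the LU/LUN relationship is as a lookup table: an initiator addresses a (port, LUN) pair, and provisioning resolves that pair to an LU's UUID. The Python sketch below is purely illustrative; the names and table layout are invented, not part of any SCSI standard.

```python
# Hypothetical provisioning table: (subsystem port, LUN) -> LU UUID.
lu_by_uuid = {
    "UUID-A": "storage resource A",
    "UUID-B": "storage resource B",
}

provisioning = {
    ("S1", 0): "UUID-A",
    ("S2", 1): "UUID-A",   # the same LU exported as a different LUN on another port
    ("S2", 2): "UUID-B",
}

def resolve(port: str, lun: int) -> str:
    """Resolve what an initiator addresses (port, LUN) to the underlying LU."""
    return lu_by_uuid[provisioning[(port, lun)]]

print(resolve("S1", 0))   # storage resource A
print(resolve("S2", 1))   # the same resource, reached by another (port, LUN) name
```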

Provisioning storage (diagram: SCSI LUs with UUIDs A through D on physical storage devices, exported by the controller functions as LUNs 0-3 on subsystem ports S1 through S4; one LU can appear under different LUNs on different ports)

Multipathing (diagram: multipathing software presents a single LUN X reachable over path 1 and path 2, both leading to the same SCSI LU, UUID A)

Caching
• Controller cache manager for the exported volume
• Read caches
  1. Recently used
  2. Read ahead
• Write caches
  1. Write through (to disk)
  2. Write back (from cache)
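
A minimal sketch of the two write policies, assuming plain dictionaries stand in for the cache and disk media (illustrative only, not any real controller's design):

```python
# Toy cache illustrating write-through vs. write-back policies.
cache: dict[int, bytes] = {}
disk: dict[int, bytes] = {}
dirty: set[int] = set()      # blocks written to cache but not yet to disk

def write_through(block: int, data: bytes) -> None:
    cache[block] = data
    disk[block] = data       # acknowledged only after the write reaches disk

def write_back(block: int, data: bytes) -> None:
    cache[block] = data      # acknowledged from cache; disk is updated later
    dirty.add(block)

def flush() -> None:
    """Destage dirty blocks from cache to disk."""
    for block in sorted(dirty):
        disk[block] = cache[block]
    dirty.clear()
```

Write back gives lower write latency but risks losing acknowledged writes if the cache is volatile, which is why subsystems typically protect it with battery backup or mirrored cache.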

Tape subsystems (diagram: a tape subsystem controller with multiple tape drives, tape slots and a robot)

Subsystem management, now with SMI-S
• Management station running browser-based network management software
• Out-of-band management via an Ethernet/TCP/IP management port
• In-band management through the exported storage resource

Data redundancy
• Duplication (2n)
• Parity (n+1)
• Difference: d(x) = f(x) - f(x-1)
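
As a quick illustration of the difference form (an assumption about what the slide's formula implies, not an expansion from the deck itself): store a base value plus deltas d(x) = f(x) - f(x-1), and regenerate the original sequence by summing them.

```python
# Difference (delta) redundancy: keep f(0) plus successive differences.
f = [10, 12, 15, 15, 20]
deltas = [f[0]] + [f[x] - f[x - 1] for x in range(1, len(f))]

restored = [deltas[0]]
for d in deltas[1:]:
    restored.append(restored[-1] + d)  # each value rebuilt from the previous one

assert restored == f
```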

Duplication redundancy with mirroring
• The mirroring operator can sit in the host-based I/O path or within a subsystem
• It terminates each I/O and regenerates new I/Os down I/O path A and I/O path B
• It also handles error recovery/notification

Duplication redundancy with remote copy (diagram: a host writes to local subsystem A, which uni-directionally forwards writes only to remote subsystem B)

Point-in-time snapshot (diagram: a subsystem maintaining a snapshot of host data A, B, C at a point in time)

Lesson #4 RAID, volume management and virtualization

RAID = parity redundancy
• Duplication (2n)
• Parity (n+1)
• Difference: d(x) = f(x) - f(x-1)

History of RAID
• Late-1980s R&D project at UC Berkeley
  § David Patterson
  § Garth Gibson (independent)
• Redundant array of inexpensive disks
  § Striping without redundancy was not defined (RAID 0)
• Original goals were to reduce the cost and increase the capacity of large disk storage

Benefits of RAID
• Capacity scaling
  § Combine multiple address spaces as a single virtual address space
• Performance through parallelism
  § Spread I/Os over multiple disk spindles
• Reliability/availability with redundancy
  § Disk mirroring (striping to 2 disks)
  § Parity RAID (striping to more than 2 disks)

Capacity scaling (diagram: a RAID controller acting as resource manager combines storage extents 1 through 12 into a single exported RAID disk volume with one address)
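
The address math behind combining extents is simple concatenation: walk the extent list until the virtual block falls inside one. A sketch, with invented extent names and sizes:

```python
# Concatenate extents into one flat exported address space.
extents = [("extent 1", 1000), ("extent 2", 1000), ("extent 3", 500)]

def resolve(virtual_block: int) -> tuple[str, int]:
    """Map a block of the exported volume to (extent, block within extent)."""
    offset = virtual_block
    for name, size in extents:
        if offset < size:
            return name, offset
        offset -= size
    raise IndexError("address beyond the exported volume")

print(resolve(0))      # ('extent 1', 0)
print(resolve(1700))   # ('extent 2', 700)
```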

Performance (diagram: a RAID controller operating at microsecond speeds spreads I/Os 1 through 6 across multiple disk drives, each limited to millisecond performance by rotational latency and seek time)

Parity redundancy
• RAID arrays use XOR for calculating parity

  Operand 1   Operand 2   XOR result
  False       False       False
  False       True        True
  True        False       True
  True        True        False

• XOR is the inverse of itself
  § Apply XOR in the table above from right to left
  § Apply XOR to any two columns to get the third
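
A short sketch of both properties in code: compute parity by XORing the members, then use the same operation to regenerate a lost member (the member data here is made up for illustration):

```python
# XOR parity: compute it, then use it to reconstruct a missing member.
from functools import reduce

def xor_blocks(*blocks: bytes) -> bytes:
    """XOR equal-length blocks byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

m1, m2, m3 = b"\x0f\xf0\xaa", b"\x33\x55\x11", b"\x01\x02\x03"
parity = xor_blocks(m1, m2, m3)          # stored on the parity member

# Member m2 is lost: XOR the survivors with parity to get it back.
rebuilt = xor_blocks(m1, m3, parity)
assert rebuilt == m2                     # XOR is its own inverse
```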

Reduced mode operations
• When a member is missing, data that is accessed must be reconstructed with XOR (diagram: XOR {M1 & M2 & M3 & P})
• An array that is reconstructing data is said to be operating in reduced mode
• System performance during reduced mode operations can be significantly reduced

Parity rebuild
• The process of recreating data on a replacement member is called a parity rebuild (diagram: XOR {M1 & M2 & M3 & P})
• Parity rebuilds are often scheduled for non-production hours because performance disruptions can be so severe

RAID 0+1, 10 (diagram: a RAID controller implementing hybrid RAID 0+1 across multiple disk drives arranged as mirrored pairs of striped members)
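
One way to see how striping and mirroring combine: stripe units rotate across pairs of drives, and every write lands on both drives of the chosen pair. A sketch with invented stripe size and pair count:

```python
# RAID 10 address mapping: stripe across mirrored pairs of drives.
STRIPE_BLOCKS = 64   # blocks per stripe unit (illustrative)
PAIRS = 3            # number of mirrored pairs (illustrative)

def raid10_write_targets(volume_block: int) -> list[tuple[str, int]]:
    """Map a volume block to both drives of the mirrored pair that holds it."""
    stripe = volume_block // STRIPE_BLOCKS
    pair = stripe % PAIRS                            # striping: rotate across pairs
    local = (stripe // PAIRS) * STRIPE_BLOCKS + volume_block % STRIPE_BLOCKS
    return [(f"pair{pair}-driveA", local),           # mirroring: write both copies
            (f"pair{pair}-driveB", local)]

print(raid10_write_targets(0))     # both drives of pair 0, block 0
print(raid10_write_targets(200))   # both drives of pair 0, block 72
```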

Volume management and virtualization
• Storing-level functions
• Provide RAID-like functionality in host systems and SAN network systems
• Aggregation of storage resources for:
  § scalability
  § availability
  § cost / efficiency
  § manageability

Volume management
• RAID & partition management
• Device driver layer between the kernel and storage I/O drivers
(diagram: OS kernel with the file system above the volume manager, which sits above the HBA drivers and HBAs)

Volume managers can use all available connections and resources and can span multiple SANs as well as SCSI and SAN resources
(diagram: a server system's volume manager exports virtual storage built from a SCSI disk resource on a SCSI bus and SAN disk resources behind a SAN switch, through SCSI and SAN HBA drivers)

SAN storage virtualization
• RAID and partition management in SAN systems
• Two architectures:
  § In-band virtualization (synchronous)
  § Out-of-band virtualization (asynchronous)

In-band virtualization (diagram: a SAN virtualization system, implemented in systems, switches or routers placed in the I/O path, exports virtual storage built from disk subsystems)

Out-of-band virtualization
• Distributed volume management
• Virtualization agents are managed from a central system in the SAN
(diagram: a virtualization management system coordinating virtualization agents in front of disk subsystems)

Lesson #5 SAN networks

Fibre Channel
• The first major SAN networking technology
• Very low latency
• High reliability
• Fiber optic cables
• Copper cables
• Extended distance
• 1, 2 or 4 Gb transmission speeds
• Strongly typed

Fibre Channel
A Fibre Channel fabric presents a consistent interface and set of services across all switches in a network. Hosts and subsystems all "see" the same resources.

Fibre Channel port definitions
• FC ports are defined by their network role
• N-ports: end node ports connecting to fabrics
• L-ports: end node ports connecting to loops
• NL-ports: end node ports connecting to fabrics or loops
• F-ports: switch ports connecting to N-ports
• FL-ports: switch ports connecting to N-ports or NL-ports in a loop
• E-ports: switch ports connecting to other switch ports
• G-ports: generic switch ports that can be F, FL or E ports

Ethernet / TCP/IP SAN technologies
• Leveraging the installed base of Ethernet and TCP/IP networks
• iSCSI: native SAN over IP
• FC/IP: FC SAN extensions over IP

iSCSI
• Native storage I/O over TCP/IP
  § New industry standard
  § Locally over Gigabit Ethernet
  § Remotely over ATM, SONET, 10 Gb Ethernet
(protocol stack: iSCSI over TCP over IP over MAC over PHY)

iSCSI equipment
• Storage NICs (HBAs)
  § SCSI drivers
• Cables
  § Copper and fiber
• Network systems
  § Switches/routers
  § Firewalls

FC/IP
• Extending FC SANs over TCP/IP networks
• FCIP gateways operate as virtual E-port connections
• FCIP creates a single fabric where all resources appear to be local
(diagram: one fabric spanning two FCIP gateways, each acting as an E-port, linked across a TCP/IP LAN, MAN or WAN)

SAN switching & fabrics
• High-end SAN switches have latencies of 1-3 µsec
• Transaction processing requires the lowest latency
  § Most other applications do not
• Transaction processing requires non-blocking switches
  § No internal delays preventing data transfers

Switches and directors
• Switches
  § 8-48 ports
  § Redundant power supplies
  § Single system supervisor
• Directors
  § 64+ ports
  § HA redundancy
  § Dual system supervisors
  § Live software upgrades

SAN topologies
• Star
  § Simplest
  § Single hop
• Dual star
  § Simple network + redundancy
  § Single hop
  § Independent or integrated fabric(s)

SAN topologies
• N-wide star
  § Scalable
  § Single hop
  § Independent or integrated fabric(s)
• Core-edge
  § Scalable
  § 1-3 hops
  § Integrated fabric

SAN topologies
• Ring
  § Scalable
  § Integrated fabric
  § 1 to N÷2 hops
• Ring + star
  § Scalable
  § Integrated fabric
  § 1 to 3 hops

Lesson #6 File systems

File system functions
• Name space
• Access control
• Metadata
• Locking
• Address space management

Filing and storing (diagram: the filing function layered over the storing function)

Think of the storage address space as a sequence of storage locations (a flat address space)

Superblocks
• Superblocks are known addresses used to find file system roots (and mount the file system)
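
A hedged sketch of what a "known address" means in practice: read a fixed offset on the device and check for a file system's magic number. The offsets below are ext2's (superblock at byte 1024, 16-bit magic 0xEF53 at offset 56 within it); other file systems place their superblocks differently.

```python
# Probe a device for an ext2-family superblock at its well-known address.
import struct

def looks_like_ext2(device_path: str) -> bool:
    with open(device_path, "rb") as dev:
        dev.seek(1024 + 56)                    # known address of s_magic
        (magic,) = struct.unpack("<H", dev.read(2))
    return magic == 0xEF53
```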

Filing and scaling
• File systems must have a known and dependable address space
  § The fine print in scalability: how does the filing function know about the new storing address space?