Top 10 tips for Oracle DB performance Gu

Top 10 tips for Oracle DB performance Gu y Harri son, CT O db. Ko da Oct ob er 20 1 7

Introductions Web: guyharrison. net Email: guy. harrison@outlook. com Twitter: @guyharrison

The top 10 Be methodical and empirical Optimize your database design Index Wisely Write efficient code Optimize the optimizer Tune SQL and PL/SQL Monitor and Manage contention Optimize memory to reduce IO Tune IO last, but TUNE it well

Be methodical and empirical

Hint 1: Methodical and Empirical tuning Methodical: Empirical • Have a plan • Focus on root causes, not symptoms • Hypothesis and experiment, not trial and error • Measure performance • Compare before and after • What is your control group? • Beware of “old DBA tales” and “silver bullets”

Hint 1: Methodical and Empirical tuning Performance Troubleshooting • Identify the element of your workload which is contributing most to service time • Workload could be SQL or logical transaction • Element is typically CPU time + Oracle wait time • Find a way to reduce that element Tuning by layers • As above, but prioritize higher layers of the stack (which have flow on effects into lower layers) • Distinguish between symptoms and causes

Basics of a database request 1. Application send an SQL request to the database server Application Result sets Return codes SQL 2. Server compiles the SQL and prepares for execution Database Server Request Blocks Return blocks 3. Server looks for data in memory first Server Memory Return blocks Request Blocks IO subsystem 4. If the data is not in memory, server reads from disk

The 4 layers in database performance 1. Tune SQL and application code to reduce the logical demand Application Result sets Return codes SQL 2. Eliminate contention points and bottlenecks RDBMS code Request Blocks Return blocks 3. Configure Memory to avoid as much IO as possible Server Memory Return blocks Request Blocks IO subsystem 4. Tune disk IO system to reduce waits for IO

What’s wrong with the donkey?

Measurement • Standard SQL views • AWR reports • DB control/Cloud control • Toad/Spotlight other 3 rd party tools • A single query can tell you a lot http: //guyharrison. net/OPSGSampl es/Ch 03/time. Model. Simple. sql

Database design

Database design Third Normal form (3 NF) 3 NF discretion Denormalization • The key, the whole key and nothing but the key • OR – Don’t repeat yourself (DRY) • Datatypes (VARCHARs vs LOBS, etc) • Subtypes • NULLS • Replicate column values to avoid joins • Summary tables and materialized views • Vertical partitioning

3 NF is mostly Don’t Repeat Yourself (DRY) • We don’t want to repeat the student name every time they take a new test • We don’t want to repeat the test_name for every student that takes it • We don’t want the “repeating group” of answers – Also, we might not know how many questions there will be in future tests

3 NF version • Students take tests • Tests have questions • Students taking tests provide answers

Don’t go to far! • In this example the designer pulled address into a separate table and city, country into another table. • It’s correct in theory, but it means that every time we want an employees address we have to join three tables. • It would have been better to have just one table

Index Wisely

Indexes exist mainly to improve performance • Choosing the best set of indexes is crucial • Create the best set of Concatenated (multi-column) indexes that will support your queries. • Indexes slow down transactions, so don’t create too many Think of indexes on a table like indexes in a book • You use the index when you want to look up one thing or a few things • You don’t use the index to read the whole book or read a whole chapter • You do use an index when you want to go to a specific page

Concatenated index effectiveness last, first, birthyear, id • The more columns in the index the less IO it takes to resolve a query using all the columns 3 • The order of the columns affects how the index can be used last, first, Birth. Year 4 • If the index has all the columns, don’t need to touch the table at all last+first name 6 last name 63 None 1459 0 200 400 600 Logical IO 800 1000 1200 1400 1600 SELECT cust_id FROM sh. customers c WHERE cust_first_name = 'Connor' AND cust_last_name = 'Bishop' AND cust_year_of_birth = 1976;

Index overhead 7 16, 316 Number of indexes 6 • This chart shows how inserts slow down as more and more indexes are added 14, 285 5 12, 727 4 • Deleting a row with lot’s of indexes is particularly expensive 10, 719 3 • Indexes speed up queries, but make DML (insert, update, Delete) slower 8, 691 2 6, 671 1 (PK only) 1, 191 0 2, 000 4, 000 6, 000 8, 000 10, 000 12, 000 14, 000 16, 000 18, 000 Logical reads required

1000 Index or FTS? Elasped Time (s) 100 10 Full Scan no caching Index sorted data, no caching Break even points Index unsorted, cached data Full Table scan, cached data 1 0 10 20 30 40 50 Pct of table accessed 60 70 80 90 100

Application code

Database coding guidelines SQL Execution • Don’t call the database unless you have to – cache data instead. • Reduce “hard” parsing by using bind variables Transaction design • Minimize lock duration using optimistic and Pessimistic locking strategies Network overhead • Array fetch and Insert • Stored procedures

Parse overhead • It’s easy enough in most programming languages to create a new SQL every time you execute the query:

Better to use the same SQL with different arguments: Prepare the statement once, then execute many times with different arguments Using bind variables

Reduction in parse (SQL Compile) time Bind Variables No Bind variables 0 200 400 600 Parse Time 800 Other 1, 000 executions of the code on preceding two slides in Oracle 1000 1200 1400

Designing transactions • There are two ways to design transaction locking • Pessimistic works best if you think someone else is going to “grab” your row before you’re finished • Optimistic works best if you thing no one else will “grab” the row

Reduce Network traffic • “Round trips” to the database can add a lot of overhead • Two ways to reduce round trips: – Use the “Array” interface in your program code – Use stored procedures for complex interactions with the database

Array fetch performance 40, 000 35, 000 30, 000 25, 000 20, 000 15, 000 10, 000 5, 000 0 0 20 40 60 80 Array fetch size Logical Reads 100 Network round trips 120 140

Network – stored procedures • A stored procedure is code stored in the database • If you have a transaction that goes “back and forth” to the database, consider a stored procedure • ESPECIALLY if you are working across a slow network

Optimize the optimizer

Table and index Structure Cardinality Estimates Object Statistics DB parameters And config Optimizer inputs IO and CPU Estimates • Remember: Garbage In : Garbage Out • The optimizer can only do a good job if statistics are up to date System Statistics Cost estimate

Histograms • Indexes are only good for getting small amounts of the table 20, 000 18, 000 • So it might be a good idea to use an index to get “New Zealand” customers, but not “United States” 14, 000 12, 000 10, 000 • A histogram allows the optimizer to understand how the data is distributed and make the best decision 8, 000 6, 000 4, 000 2, 000 0 Saudi Arabia South Africa Turkey New Zealand Denmark Argentina Singapore Japan Poland China Australia Brazil Canada Spain France United Kingdom Italy Germany United States of America Number of rows 16, 000 • Create histograms to correct bad plans on skewed tables

Without a histogram Number of rows the estimated Real number of rows Optimizer chooses not to use an index

With a histogram Optimizer estimate is correct Optimizer chooses to use an index

Tune SQL and PL/SQL

Find them and tune them Find SQL • • Mine V$SQL ASH, AWR tables Database control, grid/cloud control Toad, Spotlight Tune SQL • • • SQL Profiles SQL Baselines Rewrites Hints (be careful) SQL Optimizer

V$SQL is your friend • Also, V$SQL_PLAN_STATISTICS • Tkprof, session trace

Now let’s look at the donkey

Monitor and manage contention

Contention – the proverbial bottleneck Application Demand for DB services Contention for limited or serialized resources causes waits and or queuing Apparent demand at lower layers is reduced

Types of contention Locks Usually locking problems are due to application locks (remember optimistic locking)? Sometimes internal locks can cause problems. Latches and Mutex Buffer contention Latches are very light weight locks that protect memory instead of tables When sessions have to wait for a memory “buffer” to become available

Latches and mutex contention • Latches are like locks, but instead of protecting table rows, they protect memory (buffers) user • If two sessions try to access the same area of memory, then one will wait • Instead of “sleeping” (like a lock) they waiting session will “spin” on the CPU for a very short time Buffers • Latch problems may indicate “hot” blocks • They might cause CPU drain (because of spinning) Database files

Free buffer waits Write to buffers User Buffers Buffer Read. Waits from • Database buffers improve performance by caching data in memory • When buffers are modified they are called “dirty” – DBWR writes to disk buffers • When all the blocks are “dirty” then sessions have to wait for the buffers to be written before new data can be read Read from disk Database Writer Write dirty blocks to disk Database files • This might mean that your DBWR can’t write to disk fast enough

Optimize memory to reduce IO

Memory is primarily used to avoid IO • Buffer cache avoids IO to table/index tablespaces • PGA avoids IO to temporary tablespace

Temp tablespace IO can easily overwhelm table/index IO Time Multi-pass Disk Sort Single Pass Disk Sort Memory Sort More Memory Available Memory Table/Index IO CPU Time Temp Segment IO Less Memory

Automatic Memory Management • Introduced in 11 g, AMM manages allocations between and within PGA and SGA

AMM can (rarely? ) lead to thrashing or starvation • ALWAYS set minimum values for key components of the SGA

Tune IO last, but tune it well

IO Tuning • Disk IO is the slowest part of the database system, so it’s critical to performance • FIRST: – Tune SQL and application – Remove contention – Allocate memory • Only when you’ve done that will your IO demand be realistic. Then you can tune your IO

Basics of IO tuning (magnetic disks) • The amount of IO you can support depends on the number of disks you have • Provide enough disks to support the amount of IO you need • even if that means the disks are not filled with data • Magnetics disks can do between 75 -150 IO per second (IOPS) – do the math! SAS 15, 000 rpm SAS 10, 000 rpm SATA 7, 200 rpm 0 20 40 60 80 100 IO/ps 120 140 160 180 200

Disk Drive latency 100 • Latency is the time taken to perform a single IO 90 • Most disk drives can return an IO in 5 -10 ms 80 – Faster if the disk has a memory cache • Disk latency increases with: • To get best performance disks should be – Sparsely populated – Under only moderate load • RAID 0+1 is the preferred configuration Response Time (ms) – Throughput (best at 50 -75% of maximum) – Disk fill (best at 50% capacity) 70 60 50 40 30 20 10 0 0 100 200 300 IO/second 400 500

The more that things change. .

SSD - Cheaper by the IO SSD DDR-RAM SSD PCI flash 15 25 SSD SATA Flash 80 Magnetic Disk 4, 000 0 1, 000 2, 000 3, 000 Seek time (us) 4, 000 5, 000

10 10 $$/GB But not by the GB 10 12 2. 9 2. 3 2. 2 7. 4 8 1. 7 1. 3 6 1 2011 1 5. 3 2012 2014 4 2015 3. 2 0. 35 2. 9 0. 28 2. 2 2. 3 1. 7 0. 21 2 1. 3 0. 17 1 0. 35 0. 28 0. 21 0. 17 0. 13 0 0. 1 2012 2013 2014 2015 HDD MLC SDD SLC SSD

Tiered storage management Main Memory DDR SSD $/GB Fast Disk (SAS, RAID 0+1) Slow Disk (SATA, RAID 5) Tape, Flat Files, Hadoop $/IOP Flash SSD

Random reads – Fusion. IO DB File IO Table on SSD Other 121 CPU Total SAS disk, flash cache 583 SAS disk, no flash cache 2, 211 0 500 1000 1500 Elapsed time (s) 2000 2500

Full table scans Table on SSD 72 Flash cache doesn’t accelerate Full table scans b/c scans use direct path reads and flash cache only accelerates buffered reads DB File IO 398 SAS disk, flash cache Other CPU Total SAS disk, no flash cache 418 0 50 100 150 200 250 300 Elasped time (s) 350 400 450

Disk Sorts – temp tablespace SSD vs HDD 4000 3500 Multi-pass Disk Sort 2500 2000 1500 Single Pass Disk Sort Elapsed time (s) 3000 1000 500 0 300 250 200 150 100 Sort Area Size SAS based TTS SSD based TTS 50 0

Concurrent redo workload (x 10) Flash based redo log 1, 637 331 1, 681 CPU Other Log File IO SAS based redo log 1, 605 0 500 1, 000 397 1, 500 1, 944 2, 000 2, 500 3, 000 Elapsed time (s) 3, 500 4, 000 4, 500

Tune RAC

RAC will scale well, providing that…. Load is reasonably well balanced across the instances in the cluster The overhead of maintaining cluster consistency does not dominate overall response time a) The time taken to transfer a block across the interconnect is much less than the time taken to read from disk

Thank you guyharrison. net @guyharrison Guy. harrison@outlook. com