Developers Do the Darndest Things Ken Shaffer Aerodata

  • Slides: 53
Download presentation
Developers Do the Darndest Things… Ken Shaffer, Aerodata Consulting Session Code: F 05 Tuesday,

Developers Do the Darndest Things… Ken Shaffer, Aerodata Consulting Session Code: F 05 Tuesday, 15 November 2016 Platform: DB 2 for LUW 10: 20 – 11: 20

Objectives: • Describe several of the “Darndest” ways that have been tried to thwart

Objectives: • Describe several of the “Darndest” ways that have been tried to thwart my best database designs • Share some humorous “stories from the trenches“ • Provide some tools and methods to mitigate, undo or at least, clean up the damage from these attempts to ruin my day (or my weekend) 2

Developers Do the Darndest Things…

Developers Do the Darndest Things…

Developers Do the Darndest Things… Not Just Developers: • • • System Architects SAN

Developers Do the Darndest Things… Not Just Developers: • • • System Architects SAN Admins AIX / UNIX (Windows) admins ETL / Analytics users Data Architects Other DBAs

Subtitled: You can’t fix STOOPID! (But you can look like a hero… (or heroine)

Subtitled: You can’t fix STOOPID! (But you can look like a hero… (or heroine) while you clean up after it!) 5

Subtitled: Things that have… “Ticked me off” 6

Subtitled: Things that have… “Ticked me off” 6

Subtitled: Things that have… Ruined my day. Or my night… Or my weekend… 7

Subtitled: Things that have… Ruined my day. Or my night… Or my weekend… 7

#10 “Darndest” Thing: • “Can’t Commit” • • Developer has heard that commits are

#10 “Darndest” Thing: • “Can’t Commit” • • Developer has heard that commits are “SLOW” No Convenient way to do commits Doesn’t scale! Log space is FINITE! 8

#10 “Darndest” Thing: • Not always easy to do commits • Classic Example: •

#10 “Darndest” Thing: • Not always easy to do commits • Classic Example: • Delete from mytable where date < (CURRENT_TIMESTAMP – 7 days) • What if it doesn’t run for a while? • No “built in” way to do commits in sql 9

#10 “Darndest” Thing: • Not always easy to do commits • Better: While [

#10 “Darndest” Thing: • Not always easy to do commits • Better: While [ not done ] do db 2 “Delete from (select * from mytable where date < (CURRENT_TIMESTAMP – 7 days) fetch first $n rows only)” db 2 commit done 10

#10 “Darndest” Thing: • Not always easy to do commits • Better still: •

#10 “Darndest” Thing: • Not always easy to do commits • Better still: • Wrap in High level language • Error checking (java try / catch) • Can do alternate methods with, for example perl/dbi • Arrays • Harsh • references 11

#10 “Darndest” Thing: • Not always easy to do commits • Best: • Architect

#10 “Darndest” Thing: • Not always easy to do commits • Best: • Architect it up front • Usually “purge” is an afterthought • Use DB 2 tools to assist • MDC • Range Partitioned Table 12

#10 “Darndest” Thing: • MDC • Rollout deletion (Fast) • Asynchronous index cleanup •

#10 “Darndest” Thing: • MDC • Rollout deletion (Fast) • Asynchronous index cleanup • One log record (ish) per page vs one per row • My nightmare example: • New business requirement 13

#10 “Darndest” Thing: • Range Partitioned Table • Detach whole partition • Can put

#10 “Darndest” Thing: • Range Partitioned Table • Detach whole partition • Can put into a table and “work on it there” • Then ATTACH • Or DETACH, then archive • No longer a Z lock (9. 7. 1) • The ALTER TABLE. . . DETACH PARTITION statement acquires a IX lock on the partitioned table and an X lock on the data partition being detached. 14

OLTP Main table MDC (10 min) ETL Hourly One week “Data Mart” All tables

OLTP Main table MDC (10 min) ETL Hourly One week “Data Mart” All tables MDC (Hour) Normalized ETL Hourly “Forever” Warehouse Range Partition (Month) Star Schema 15

#10 “Darndest” Thing: • What about Load? • No COMMITCOUNT (Like on import) •

#10 “Darndest” Thing: • What about Load? • No COMMITCOUNT (Like on import) • Makes sense as it must do index rebuilds • Can run CDI (INGEST) • in parallel 16

#10 “Darndest” Thing: • BLU considerations • Want to use LOAD if at ALL

#10 “Darndest” Thing: • BLU considerations • Want to use LOAD if at ALL possible • Use LOAD REPLACE to repopulate (if compression not good, etc) • Analyze phase • needs TONS of memory • Builds synopsis table and compression dictionary • LOAD phase • Data compressed using the dictionary • Written to data pages • Synopsis table maintained • Keys built for page map index and any unique indexes • BUILD phase • build page map index and any unique indexes • DELETE phase • Rejected rows and temp tables 17

#10 “Darndest” Thing: • BLU considerations • Want to load as big chunks as

#10 “Darndest” Thing: • BLU considerations • Want to load as big chunks as possible • Default 1 row in synopsis table for 1024 data rows • Commit no fewer than 1000 rows at a time 18

#10 “Darndest” Thing: • Log space may not HAVE to be finite • Could

#10 “Darndest” Thing: • Log space may not HAVE to be finite • Could set infinite logging • logsecond to -1 • If not HADR • Not usually recommended • Rollback slow • Crash Recovery Slow • Must retrieve log files from the archive 19

#9 “Darndest” Thing: • Over Use LOBs • Maybe even: USE LOBs at all?

#9 “Darndest” Thing: • Over Use LOBs • Maybe even: USE LOBs at all? • S-L-O-W • Synchronous IO 20

#9 “Darndest” Thing: • Why are LOBs SO slow: • Latches in 9. x

#9 “Darndest” Thing: • Why are LOBs SO slow: • Latches in 9. x improved in 10. 1(? ) • Finding a hole • Use NOT Compact (Default) • Inserts: (Similar issue) • Use APPEND option • DB 2 MAXFSCRSEARCH • balance insert speed versus space reuse 21

#9 “Darndest” Thing: • INLINE where possible • Obviously limited by pagesize • They

#9 “Darndest” Thing: • INLINE where possible • Obviously limited by pagesize • They ARE in a 32 K pagesize, right? • Consider making them Asynch in the app • Using MQ • Do LOB operations off-line • Put them in a separate table • Pull data out of LOB or XML (Shred it) back into relational data • Careful – Only a subset!! • XML • can at least index • make the data more readily useable 22

#8 “Darndest” Thing: • Over Normalize • Data Archititects… • Rule: • Normalize to

#8 “Darndest” Thing: • Over Normalize • Data Archititects… • Rule: • Normalize to reduce duplication • Denormalize to make it F-A-S-T 23

#8 “Darndest” Thing: • Deduplication • Save space • Therefore I/O • BUT: •

#8 “Darndest” Thing: • Deduplication • Save space • Therefore I/O • BUT: • Causes Joins • Expensive 24

#8 “Darndest” Thing: 25

#8 “Darndest” Thing: 25

#8 “Darndest” Thing: • Warehouse: • Flatten data structures • Not normalized (as much)

#8 “Darndest” Thing: • Warehouse: • Flatten data structures • Not normalized (as much) • Store detail but… • (“Be Reasonable”…) 26

#8 “Darndest” Thing: • Flip side of this issue: • Storing same data –

#8 “Darndest” Thing: • Flip side of this issue: • Storing same data – many places • “A person with one watch always knows what time it is. A person with two is never sure!” • Wasted space • Therefore I/O 27

#8 “Darndest” Thing: • Attempts to solve with federation • Example: • • •

#8 “Darndest” Thing: • Attempts to solve with federation • Example: • • • Split one database into three Easier to tune Large DPF system Easier to handle Don’t want three copies of • LUT • Reference tables 28

#8 “Darndest” Thing: • Federation Downsides: • Performance • Complexity • Might be better

#8 “Darndest” Thing: • Federation Downsides: • Performance • Complexity • Might be better to connect to both datasources • If we need to join, then use federation 29

#7 “Darndest” Thing: • Use a OLTP Database like a Warehouse and / or

#7 “Darndest” Thing: • Use a OLTP Database like a Warehouse and / or • Use a Warehouse like an OLTP Database 30

#7 “Darndest” Thing: • OLTP: • Store WAY too much “stuff” in the OLTP

#7 “Darndest” Thing: • OLTP: • Store WAY too much “stuff” in the OLTP system • Database should be “System of Record” • Not: “system of stash everything everywhere all the time” • Not: “Scratchpad of record” • LOB example from before • Consider MQ • Consider Websphere e. Xtreme Scale • Remember you’ll likely have to purge it, later 31

Message Queuing 32

Message Queuing 32

Elastic Data Grid 33

Elastic Data Grid 33

#7 “Darndest” Thing: • Warehouse: • Attempt to “Boil the Ocean” • Keeping WH

#7 “Darndest” Thing: • Warehouse: • Attempt to “Boil the Ocean” • Keeping WH current up to the second • Inserts updates deletes (shovel) • Indexes (slow inserts) • HUGE with lots of data • Move data with: • Forklift (import, CDI) • Truck (LOAD) • Cargo ship (partitions) • BLU: even more important 34

#7 “Darndest” Thing: • If you must: • Architect for it! • Try to

#7 “Darndest” Thing: • If you must: • Architect for it! • Try to only work on PART of the warehouse • MDC • Your work only on one extent • This month • Range Partition • Do all your work on small table • THEN attach it 35

#7 “Darndest” Thing: • Use a OLTP Database like a Warehouse and • Use

#7 “Darndest” Thing: • Use a OLTP Database like a Warehouse and • Use a Warehouse like an OLTP Database • With Shadow tables you CAN have the best of both worlds 36

#6 “Darndest” Thing: • Doing expensive “stuff” - A LOT! • Connect for every

#6 “Darndest” Thing: • Doing expensive “stuff” - A LOT! • Connect for every access • Very expensive • Several roundtrips • Pass userid and pwd and verify • Multiple queries to do the work of one • Example • Spend your time tuning the SLOW stuff • Or the expensive stuff • DBI philosophy • Get the BEER out of your backpack 37

#6 “Darndest” Thing: • Doing expensive “stuff” - A LOT! • Spend your time

#6 “Darndest” Thing: • Doing expensive “stuff” - A LOT! • Spend your time tuning the SLOW stuff • Or the expensive stuff • Try to train your developers to think this way • Not just tuning SQL • But, spend their time on the stuff that matters • One Database access equals how many thousands of lines of compiled java code • CPUs are fast… 38

#6 “Darndest” Thing: • GIANT queries • Usually created by ETL Tools • Explain

#6 “Darndest” Thing: • GIANT queries • Usually created by ETL Tools • Explain plans tens of pages long • Not possible (for humans) to tune • • • (or even understand) VERY plan dependent (stats) FAST to HORRIBLE (with one tiny change) • Split into chunks and use code to do some of the work • Consider Global Temp Table • DB 2 is great. Not the answer to ALL questions 39

#6 “Darndest” Thing: • Expensive Things: • Network latency • “I can help you

#6 “Darndest” Thing: • Expensive Things: • Network latency • “I can help you a lot, but I can’t make the west coast NOT be 3000 miles away” • 50+ ms roundtrip time 40

Message Queuing 41

Message Queuing 41

#6 “Darndest” Thing: • THEN! To make matters worse… • Use HADR to keep

#6 “Darndest” Thing: • THEN! To make matters worse… • Use HADR to keep the sites in synch • More and more data • Then Data Mart • Then Warehouse • Then HADR them both • Them remove the MDC 42

#5 “Darndest” Thing: • No Reference or Look Up Tables (LUT) • Documentation!! •

#5 “Darndest” Thing: • No Reference or Look Up Tables (LUT) • Documentation!! • Save space • Also I/O • Data Validation 43

#5 “Darndest” Thing: • No Reference or Look Up Tables (LUT) • Store “cryptic”

#5 “Darndest” Thing: • No Reference or Look Up Tables (LUT) • Store “cryptic” codes • Make a LUT so that everyone knows • Compression can achieve similar space savings 44

#4 “Darndest” Thing: • Concurrency Vs. Consistency • THINK about it!! • See this

#4 “Darndest” Thing: • Concurrency Vs. Consistency • THINK about it!! • See this a lot now re: • Optimistic vs. Pessimistic locking • JDBC, Hibernate, etc. • Use the APPROPRIATE isolation level • Lowest that you need • Mention this to an Oracle DBA – for fun! • In Production OLTP – Ad hoc querying: • ALWAYS use UR!!! (yes, always) 45

#4 “Darndest” Thing: • Concurrency Vs. Consistency • Transaction • Think about interactions in

#4 “Darndest” Thing: • Concurrency Vs. Consistency • Transaction • Think about interactions in a transaction • Intertwined locks 46

#3 “Darndest” Thing: • Bad SAN Layout • Disks are SLOW! • Moore’s law

#3 “Darndest” Thing: • Bad SAN Layout • Disks are SLOW! • Moore’s law exacerbates this… 47

#3 “Darndest” Thing: • Bad SAN Layout • Your SAN admin has priorities VERY

#3 “Darndest” Thing: • Bad SAN Layout • Your SAN admin has priorities VERY different from yours • You will get blamed, not them • Get to know them • EARN their respect • Get their help • Evangelize! (To management, architects) • You are NOT wasting space • You are trading capacity for performance • Just like having 32 CPUS when you usually only need 8 • “What are these PINDLES and where can I buy some? ” 48

#2 “Darndest” Thing: • Not thinking “Globally” • Building things that don’t scale •

#2 “Darndest” Thing: • Not thinking “Globally” • Building things that don’t scale • Don’t “work and play well with others” • Examples: • Each instance of UI doing it’s own “totals” • Moved to one Stored proc that did totals • Each instance just selected them • Over using DB objects • Sequences • Each app using several per second • Actually used to track DB performance (!) • Consider using cache 49

#1 “Darndest” Thing: • Using Oracle!! • Or least Wanting too… 50

#1 “Darndest” Thing: • Using Oracle!! • Or least Wanting too… 50

#1 “Darndest” Thing: • Not giving jdbc apps a name 51

#1 “Darndest” Thing: • Not giving jdbc apps a name 51

#1 “Darndest” Thing: • It is SO easy to fix this! com. ibm. db

#1 “Darndest” Thing: • It is SO easy to fix this! com. ibm. db 2. jcc. DB 2 Data. Source ds = new com. ibm. db 2. jcc. DB 2 Data. Source(); ds. set. Driver. Type(4); ds. set. Server. Name("localhost"); ds. set. Port. Number(50000); ds. set. Database. Name("sample"); ds. set. User("username"); ds. set. Password("password"); ds. set. Client. Program. Name("My application"); conn = ds. get. Connection(); 52

Developers Do the Darndest Things… • • • Thank you Ken Shaffer Aerodata Inc.

Developers Do the Darndest Things… • • • Thank you Ken Shaffer Aerodata Inc. Twitter: @aerodata Email: Ken@Comancheguy. com Linked In: https: //www. linkedin. com/in/kenshaffer Session Code: F 05 Tuesday, 15 November 2016 10: 20 – 11: 20 Platform: DB 2 for LUW 53