Turning Practice into Perfect: Implementing Fathom 2.0
Adam Backman
White Star Software
adam@wss.com

Notes
• All of the information covered in this presentation is covered in a new portion of the Progress documentation called: OpenEdge Revealed: Mastering the Progress Database with Fathom
• A system works as a whole and not as a sum of its parts, so the presentation is written the same way. With this in mind, please hold your questions until the end.

Presentation Goal
• Explain how to use Fathom
• Implementing best practices
• Not here to teach in-depth System Administration
• Point you to other presentations for more in-depth information

The Operation’s Challenge
• Constant reactive mode
• Manual processes
• Poor reporting
• Unplanned downtime
• Cannot plan for growth
• Poor system performance
Resulting in:
• Unpredictable operations
• Exposure to errors
• Incomplete information
• Frustrated end users
• Frustrated administrators

Goals of a Well Maintained System
• Resiliency – The ability to recover
• Availability – Provide maximum uptime
• Performance – Consistency despite system load
Fathom can help achieve these goals

Roadmap
• What are best practices?
• What is Fathom?
• Providing a resilient system
• Making your system highly available
• Providing consistent performance

What are Best Practices?
• Defined processes to follow
• Consistent verifiable outcome
• End result – well maintained system

Defined Process to Follow
• Must have clear goals
    • Functional
    • Business
• Document where you are now and how you are going to achieve your goals
[Figure labels: London, Munich, Prague, Paris, Amsterdam]

Defined Verifiable Outcome
• Know what you expect
• Know what you are getting
• Test completely prior to implementation
    • Unit testing
    • End-to-end testing

Well Maintained System
• Ability to support 24 hour operations with only scheduled outages for upgrades and maintenance
• Ability to recover from disaster with little or no data loss and minimal interruption to operations
• Ability to support the changing needs of the business with little or no performance degradation during times of heavy processing

Roadmap
• What are best practices?
• What is Fathom?
• Providing a resilient system
• Making your system highly available
• Providing consistent performance

What is Fathom?
• Java-based management console and agent
• Management console
    • Provides an interface to the agent
    • Provides an interface to the Fathom Database
    • Allows for definition of alerts
• Fathom agent
    • Collector of operating system resource information
    • Collector of Progress database management information

Dictionary
• Resource – Anything Fathom can monitor or trend
• Schedule – A defined timeframe when a resource is available for monitoring, alerting, and trending
• Poll – The process of gathering information about a resource
• Rule – A performance requirement that can be evaluated
• Alert – A response to a rule being broken
• Action – A process to be performed in response to an alert
• Trending – The process of storing performance and audit data in the Fathom Trend Database
• Monitoring – Performing polling, evaluating rules, generating alerts, executing actions, and trending of a resource within a scheduled timeframe
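
The relationship between these terms can be sketched in a few lines of Python. This is only an illustration of the vocabulary, not Fathom's implementation or API: the `Resource` and `Rule` classes, the `monitor_once` function, and the 90 percent threshold are all hypothetical, and the schedule is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A performance requirement that can be evaluated against a polled value."""
    name: str
    is_met: Callable[[float], bool]


@dataclass
class Resource:
    """Anything that can be monitored or trended (hypothetical model)."""
    name: str
    poll: Callable[[], float]  # gathers one sample for this resource
    rules: List[Rule]


def monitor_once(resource, trend_store, on_alert):
    """Poll the resource, store the sample, and alert on each broken rule."""
    value = resource.poll()
    trend_store.append((resource.name, value))  # trending
    broken = [rule.name for rule in resource.rules if not rule.is_met(value)]
    for rule_name in broken:
        on_alert(resource.name, rule_name, value)  # the action runs here
    return broken


# Example: a disk that is 93% full, with a rule requiring it to stay under 90%.
disk = Resource(
    "data_disk",
    poll=lambda: 93.0,
    rules=[Rule("under_90_percent", lambda v: v < 90.0)],
)
trend, actions = [], []
alerts = monitor_once(disk, trend, lambda res, rule, v: actions.append((res, rule, v)))
print(alerts)
```

Running the monitor inside a scheduled timeframe, and only then, is what the Schedule entry adds on top of this loop.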

Progress Fathom Architecture
[Diagram labels: Fathom, File System, Net*, CPU, Production DB, Disk, Log, Memory, Fathom DB]

Fathom Architecture: Multiple Sites
[Diagram labels: Fathom DB]

Fathom Architecture: Monitor Locally, Trend Remotely
[Diagram labels: Fathom DB]

Fathom Architecture 2.0: Monitor/Trend Database Remotely
[Diagram labels: DB Agent, Fathom DB, Fathom DB]

Fathom Architecture 2.0: Monitor & Trend Anywhere
[Diagram labels: DB Agent, Fathom DB]

Fathom Architecture: Manage from One Browser
[Diagram labels: DB Agent, Fathom DB, Fathom DB]

Roadmap
• What are best practices?
• What is Fathom and how does it work?
• Providing a resilient system
• Making your system highly available
• Providing consistent performance

Resiliency
• Redundancy
• Developing an effective recovery plan
• Monitoring for problem avoidance

Redundancy
• Disk
    • RAID
        • RAID levels
        • Dos and Don’ts
• After imaging
• Memory

RAID
• Redundant Array of Inexpensive Disks
    • Patterson, Gibson and Katz at the University of California, Berkeley (1987)
• Common RAID levels
    • RAID 0 – Striping
    • RAID 1 – Mirroring
    • RAID 10 or 0+1 – Striped with mirrors
    • RAID 5 – Striped with calculated parity
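
To make "calculated parity" concrete, here is a minimal Python sketch of the XOR parity idea behind RAID 5. It is a simplification: real arrays rotate parity across the disks and work on fixed-size stripes, and the block contents and the `xor_blocks` helper below are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, one per disk, plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk: rebuild its block from the survivors and parity.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because every write must also update the parity block, small random writes cost extra I/O, which is the usual argument for keeping write-heavy workloads off RAID 5.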

RAID: Dos and Don’ts
• Do:
    • Use RAID 10 for randomized storage
    • Use RAID 1 for sequential storage
    • Use RAID 5 for READ-ONLY data
• Don’t:
    • Use RAID 5 for OLTP
    • Use RAID 0 for data storage

Memory Interleaving
Memory interleaving works like RAID 0 for memory. While there are significant potential performance gains from interleaving memory, you run the risk of having one faulty memory chip bring down your application.

Resiliency: Recovery Planning
• Who is involved in the process?
• What gets backed up?
• Where do we back up our data?
• Where do we store the physical backup?
• When do we do a backup?
• Why do a backup at all?
• How can Fathom help?

Who is Involved in Recovery Planning?
• Technical people
    • They understand what is possible
• Business people
    • They understand what is needed and the cost of downtime
• Management
    • They understand where the business is headed and what can be afforded

What is Included on the Backup?
• More than just a database backup
    • Database
    • Application
    • Other files
• Physical backup
    • Secondary machine room
    • Additional hardware
    • Infrastructure

Where Do We Backup To?
• Capacity – How much do you need to store?
• Removable – To allow off-site archival
• Reliable – It must work every time
• Compatible – Keeps your options open

Where to Store your Backup?
• Formal service
    • 24 hour access
    • Secure
    • Highly disaster resistant
• Separate location (different building)
    • Inexpensive
    • Greater need for planning (access, security, disaster, etc.)

When to do a Backup?
• As often as practical
    • A once-a-day backup will cause you to lose up to 24 hours of processing in the worst case
• Fill in with after imaging
    • Store AI on a different disk
    • Archive AI files throughout the day
• Keep a warm standby to reduce downtime
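
The worst-case arithmetic on this slide can be written down directly. The sketch below assumes that archived after-image (AI) files survive the failure and can be rolled forward onto the last backup; the function name and the intervals are hypothetical.

```python
def worst_case_loss_hours(backup_interval_hours, ai_archive_interval_hours=None):
    """Hours of processing lost if the failure hits just before the next copy is taken."""
    if ai_archive_interval_hours is None:
        # Backups only: everything since the last backup is gone.
        return backup_interval_hours
    # With AI archiving, only work since the last archived AI file is lost.
    return min(backup_interval_hours, ai_archive_interval_hours)


print(worst_case_loss_hours(24))     # nightly backup only
print(worst_case_loss_hours(24, 1))  # nightly backup plus hourly AI archive
```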

Why do a Backup?
• Reduce data loss
• Build user confidence
• Keep your job

How Can Fathom Help? Scheduling
• Consistent schedule that is not forgotten
• Pro-active notification if there is a problem
• Fathom 2.0 Job Templates

How Can Fathom Help? Reporting
• Processing time is captured
• Historical trend report of backup
• Audit trail

Resiliency: Problem Avoidance
• Common problem areas:
    • Disk full problems
    • Database extents filling fast

Fathom: Disk Monitoring
• Disk view
• Monitoring disks other than database
• Graphical view of what disks look like
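
Outside Fathom, the same kind of disk-full check can be sketched with the Python standard library. This is not how Fathom collects its data; the `check_disk` helper and the 90 percent threshold are assumptions chosen for illustration.

```python
import shutil


def percent_used(total_bytes, used_bytes):
    """Share of the file system in use, as a percentage."""
    return 100.0 * used_bytes / total_bytes


def check_disk(path, threshold_percent=90.0):
    """Return (percent used, whether the threshold is reached) for the disk holding path."""
    usage = shutil.disk_usage(path)
    pct = percent_used(usage.total, usage.used)
    return pct, pct >= threshold_percent


pct, alert = check_disk(".")
print(f"{pct:.1f}% used, alert={alert}")
```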

Roadmap
• What are best practices?
• What is Fathom and how does it work?
• Providing a resilient system
• Making your system highly available
• Providing consistent performance

Availability
• Reducing the impact of unplanned events
• Planning for system growth
• Reducing impact of change to the user
• Scheduling online utilities

Planning for System Growth
• Trending allows for patterns to be viewed and acted upon
• Trending allows for operational thresholds to be established
• Trending allows for advanced planning so maintenance can be scheduled when convenient for the business

Fathom: Disk Trending
• Correlating database and disk trends
• Month by month, week by week, or day by day: it is your choice
• Fill rates and activity of each disk
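
A fill rate becomes useful for planning once it is projected forward. The sketch below assumes linear growth between the first and last trend samples; the sample values, the 500 GB capacity, and the `days_until_full` function are invented for illustration and are not taken from Fathom.

```python
def days_until_full(samples, capacity_gb):
    """Project days until the disk is full from (day_number, used_gb) samples.

    Uses the average growth rate between the first and last sample.
    Returns None when usage is flat or shrinking.
    """
    (first_day, first_used), (last_day, last_used) = samples[0], samples[-1]
    rate_gb_per_day = (last_used - first_used) / (last_day - first_day)
    if rate_gb_per_day <= 0:
        return None
    return (capacity_gb - last_used) / rate_gb_per_day


weekly_samples = [(0, 400.0), (7, 414.0), (14, 428.0)]
print(days_until_full(weekly_samples, capacity_gb=500.0))  # 36.0
```

At 2 GB per day with 72 GB free, the disk fills in about 36 days, which leaves time to schedule the maintenance when it suits the business.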

Fathom: Storage Area Trending
• Fill rate
• Activity by area
• This information can show a need to spread data even further

Fathom: Table and Index Trending – database analysis
• Predicting table growth
• Predicting index growth
• Index compaction rates can be monitored and actions can be taken if the compaction drops below a certain level
• Utilization of each table and index can also be tracked and viewed in other areas of Fathom

Fathom: Memory Trending
• Focus on paging and swapping rather than utilization
• This is currently a weak area within the Fathom product

Fathom: CPU Trending
• Look at idle
• Look at the ratio between user and system
• High system time can indicate an incorrect value for -spin, or high paging or swapping
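
The user-to-system comparison can be reduced to a single number for trending. In this sketch the 30 percent cut-off for "high system time" is an arbitrary assumption, not a figure from the presentation or from Fathom, and `cpu_summary` is a hypothetical helper.

```python
def cpu_summary(user_pct, system_pct, idle_pct, max_system_share=0.3):
    """Summarize one CPU sample: idle time and the system share of busy time."""
    busy = user_pct + system_pct
    system_share = system_pct / busy if busy else 0.0
    return {
        "idle": idle_pct,
        "system_share": system_share,
        # A high share suggests checking -spin, paging, and swapping.
        "investigate": system_share > max_system_share,
    }


print(cpu_summary(user_pct=70, system_pct=10, idle_pct=20))
print(cpu_summary(user_pct=40, system_pct=40, idle_pct=20))
```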

Roadmap
• What are best practices?
• What is Fathom and how does it work?
• Providing a resilient system
• Making your system highly available
• Providing consistent performance

Performance
• Performance is relative
• Fast is overrated
• Fathom can help find tough problems

Performance is Relative
• What is a baseline?
• Determining your baselines
• How Fathom can help
• Important indicators
• Who is your canary?

Determining your baseline
• Good baseline guidelines
    • Often accessed portions of the application
    • High customer impact
    • End to end (time to enter an order)
• Bad baseline
    • Year-end process
    • Management reporting (in most cases)
    • Little used portions of the application
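
Once a good baseline operation is chosen, such as the time to enter an order, later measurements can be compared against it. The sketch below uses the median of earlier timings as the baseline and a 25 percent tolerance; both choices, the sample timings, and the function names are assumptions for illustration only.

```python
from statistics import median


def baseline_seconds(samples):
    """Use the median of earlier timings so one slow run does not skew the baseline."""
    return median(samples)


def is_degraded(current_seconds, baseline, tolerance=0.25):
    """True when the current timing exceeds the baseline by more than the tolerance."""
    return current_seconds > baseline * (1 + tolerance)


order_entry_times = [2.0, 2.2, 1.9, 2.1, 2.0]  # seconds, hypothetical
baseline = baseline_seconds(order_entry_times)
print(baseline, is_degraded(2.4, baseline), is_degraded(3.0, baseline))
```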

Components of Performance
• Network
• Disk
• Memory
• CPU

Issues: Network
• Check your network capacity BEFORE adding any additional applications
• Baseline response times with Fathom
• Routed vs. switched networks
• Location of Progress files
• Program libraries

Issues: Disk
• Storage capacity vs. throughput capacity
• Remember your RAID levels
• Location of data

Issues: Memory
• Memory acts as a buffer between the user processes and disk
• Use memory for the common good
    • Increase broker memory first
    • Increase client memory (-Bt, …)
    • Then get creative

Issues: CPU
• Good CPU usage vs. bad CPU usage
• The -spin parameter
• Have a CPU problem? Look at your disks

Monitoring Performance
• Spot checks
• My Fathom Views
• Trend reporting
• Getting out of the forest

Conclusion
• Start slow
• Remember your goals
    • Resiliency
    • Availability
    • Performance
• Consider the cost/benefit before adding monitoring or trending to a resource

Questions