Microsoft Exchange Server Best Practices Analyzer Tool
Paul Bowden
Program Manager, Exchange Server Development
Microsoft Corporation
What is it?
• The Exchange Server Best Practices Analyzer 'encodes' the top product support issues into a tool which can be run against a live deployment.
  – Step-by-step documentation tells you how to resolve each problem
• The tool can be run as part of a proactive 'health check' which can expose availability or scalability problems. It can also be run as a reactive troubleshooting step for problem diagnosis and identification.
  – The tool reports issues currently causing problems within the topology, and discrepancies which may cause future outages.
• The tool can be used to actively document the design and configuration of the Exchange topology. This data can be used to track the history of a deployment, or provide a 'quick-start' to administrators and product support staff who need to analyze the history and configuration of an unfamiliar deployment.
Why we developed it
• Administrators are finding it difficult to keep up with the documentation that we produce
  – Urgency
  – Relevance
• Customers find it difficult to keep track of whether they are conforming to all the best practices
• Exchange has many options and finding the root cause of a problem can be a long process
  – ~60% of Exchange problems are misconfigurations
• We have many tools for collecting information, but not many provide auto-analysis
Design Principles
• Concentrate on Performance, Scalability and Availability of Exchange Servers
  – ExBPA does not check security configuration
• Make it easy to run
  – No complex configuration settings
  – Auto-detect everything
  – Allow multiple credentials to be entered
  – No server-side components to install
  – No impact on Exchange performance, even at peak periods
• Don’t leave me hanging
  – Every Error | Warning | NonDefault rule has a specific article which tells you more about the problem and how we detected it
• Keep it up-to-date
  – Provide best practice updates every month
  – Make the tool auto-download the updates
• Work in all environments
  – From single server SBS implementations through to the largest enterprise
  – Make the tool work seamlessly in both open and closed networks
Similar Tools
• MBSA: Microsoft Baseline Security Analyzer
• SQLBPA: Microsoft SQL Server Best Practices Analyzer
• The ExBPA engine has now been mandated as part of the WSS 2006 Common Engineering Criteria
  – BPAs for other Microsoft products are forthcoming
Architecture
• One tool runs against all versions of Exchange
  – No support for pure Exchange 5.5 topologies
• You generally install the tool on a Windows XP workstation, and it remotely collects the data
  – Don’t need to install any components on the server
• ExBPA is written in managed code (C#)
• Input/output data model is XML based
• Analysis engine is based on XPath
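To make the XML-plus-XPath idea concrete, here is a minimal sketch in Python (not the tool's own C# code). The XML layout, the attribute names and the `analyze` helper are all invented for illustration; only the two rule messages come from the slides. It uses the XPath subset supported by the standard library.

```python
import xml.etree.ElementTree as ET

# Hypothetical collected data; all element and attribute names are invented.
COLLECTED = """
<Organization name="Example">
  <Setting key="MaxMessageSizeKB" value="unlimited" />
  <ConnectionAgreement name="CA1" schedule="Never" />
  <ConnectionAgreement name="CA2" schedule="Always" />
</Organization>
"""

# Each rule is (rule type, XPath query, message); a rule fires when its query matches.
RULES = [
    ("Error", "./Setting[@key='MaxMessageSizeKB'][@value='unlimited']",
     "No maximum message size set for the organization"),
    ("Warning", "./ConnectionAgreement[@schedule='Never']",
     "An ADC connection agreement is scheduled to 'Never'"),
]

def analyze(xml_text):
    """Return (rule type, message) for every rule whose XPath query matches."""
    root = ET.fromstring(xml_text)
    return [(rule_type, message)
            for rule_type, xpath, message in RULES
            if root.findall(xpath)]
```

Keeping rules as data (a query plus a message) rather than code is one plausible reason an XPath-based engine can take frequent rule updates without shipping a new binary.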
Where do we look?
• We look for data in…
  – Active Directory
  – DNS
  – WMI
  – Registry
  – Metabase
  – Performance Monitor
  – Files on disk
  – TCP/IP ports
• First pass of execution: collection
  – ExBPA collects the data and places it in the same namespace
• Second pass of execution: analysis
  – Individual settings are analysed against the defined rules. Cross-checking between data sources is possible as the data is in the same hierarchy
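The two passes can be sketched as follows, assuming a hypothetical layout in which every data source becomes a `Source` element in one shared tree. The function names, the setting name and the sample values are illustrative only and are not taken from ExBPA.

```python
import xml.etree.ElementTree as ET

def collect(sources):
    """First pass: place every source's settings in one XML tree."""
    root = ET.Element("Data")
    for source_name, settings in sources.items():
        node = ET.SubElement(root, "Source", name=source_name)
        for key, value in settings.items():
            ET.SubElement(node, "Setting", key=key, value=str(value))
    return root

def cross_check(root, key, source_a, source_b):
    """Second pass: compare one setting across two sources in the same tree."""
    def lookup(source):
        node = root.find(f"./Source[@name='{source}']/Setting[@key='{key}']")
        return None if node is None else node.get("value")
    a, b = lookup(source_a), lookup(source_b)
    return a == b, a, b

# Hypothetical values standing in for real collector output.
SAMPLE = {
    "ActiveDirectory": {"SmtpMaxMessageSize": 10240},
    "Metabase": {"SmtpMaxMessageSize": 2048},
}
```

Because both sources live in the same hierarchy, the comparison is just two lookups against one tree, with no per-source access code in the analysis pass.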
How it works
[Diagram. Labels: Active Directory, Exchange Server (two), ExBPA Dispatcher, collectors, Output Data, ExBPA Analyzer, ExBPA Interface, XML Rules Import, XML Export]
Demonstration…
What does ExBPA check today?
The following is not an exhaustive list of the checks that the tool performs, but it should give you a general idea!
Exchange Roles
• ExBPA detects and understands the difference between…
  – Small mailbox servers
  – Large mailbox servers
  – Clustered Exchange servers
  – Front-end servers
  – Bridgehead servers
• Rules are conditioned for their roles (e.g. circular logging needs to be disabled on mailbox servers, but should be enabled on bridgehead servers)
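A role-conditioned rule might look like the sketch below. The lookup table encodes only the circular logging example from the slide; the role keys and the function name are assumptions, and roles the slide does not mention get no verdict.

```python
# Expected circular-logging state per role, taken from the slide's example.
# Role names are hypothetical keys; unlisted roles get no verdict.
EXPECTED_CIRCULAR_LOGGING = {
    "mailbox": False,     # needs to be disabled on mailbox servers
    "bridgehead": True,   # should be enabled on bridgehead servers
}

def check_circular_logging(role, enabled):
    """Return True if compliant, False if not, None if the role has no rule."""
    expected = EXPECTED_CIRCULAR_LOGGING.get(role)
    if expected is None:
        return None
    return enabled == expected
```

For example, `check_circular_logging("mailbox", True)` returns `False`, which is the kind of misconfiguration the slide describes.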
Rule Types
• Error
  – We found something that is causing, or will cause, a problem
  – Example: No maximum message size set for the organization
• Warning
  – We found something that looks suspicious
  – Example: An ADC connection agreement is scheduled to ‘Never’
• NonDefault
  – We found a setting which has been changed
  – Example: One of the many store parameters has been tuned/tweaked
• Time
  – We found something that was changed during the past 5 days
  – Example: The cost on an SMTP connector was changed
• BestPractice
  – We found that a best practice is not being followed
  – Example: Dr. Watson crashes are not being uploaded to Microsoft for analysis
• Info
  – We found something of interest
  – Example: Your server has 8 processors installed
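One way to model the six rule types is an ordered enumeration, so a report can list the most serious findings first. The numeric ordering below simply follows the slide order and is an assumption, as are the `Finding` and `prioritize` names.

```python
from dataclasses import dataclass
from enum import IntEnum

class RuleType(IntEnum):
    # Values follow the order on the slide; treating that order as a
    # priority ranking is an assumption made for this sketch.
    ERROR = 1
    WARNING = 2
    NON_DEFAULT = 3
    TIME = 4
    BEST_PRACTICE = 5
    INFO = 6

@dataclass
class Finding:
    rule_type: RuleType
    message: str

def prioritize(findings):
    """Sort findings so lower-numbered rule types come first (stable sort)."""
    return sorted(findings, key=lambda f: f.rule_type)
```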
Active Directory
• Forest-wide
  – Forest functionality level
  – Exchange schema extensions
  – Default policy changes
• Per-domain
  – Domain functionality level
  – Domains which have been renamed
  – Check availability of FSMO servers
  – EDS/EES group renamed/deleted/moved
  – MESO container renamed/deleted/moved
Active Directory Connector
• ADC Server
  – Server is overloaded
  – Server is idle (i.e. no connection agreements)
  – There’s a newer version of the ADC available
  – Server is running the latest OS Service Pack
• Connection Agreements
  – Orphaned agreements
  – Schedule set to never
  – Nominated server is missing
  – One-way agreements
  – Out-of-date agreements
Exchange Organization
• Check
  – Global message size limits are enforced
  – Stray Exchange objects in LostAndFound container
  – More than 10 administrators defined
  – ForestPrep version
  – Mixed/native mode
  – OMA/EAS options
  – UCE thresholds
  – Recipient Update Service definitions
  – Address List and OAB definitions
Admin Groups
• Check
  – Validity of legacyExchangeDN
  – Policy containers intact
• Routing Groups
  – Check for valid routing master
  – Enumerate all connectors
  – Check for connectors that have recently changed
Exchange Server object
• Check
  – Validity of server name
  – FQDN/NetBIOS name resolution
  – Latest Exchange Service Pack / Roll-up
  – Time synchronization with the Active Directory
Cluster Configuration
• Checks both Active and Passive nodes
• Cluster-specific checks
  – Number of nodes in the cluster
  – Configuration discrepancies between nodes
  – Cluster account TEMP/TMP path
  – Quorum configuration
  – Heartbeat configuration
  – DNS/WINS configuration
  – Enumerates all resources and parameters
  – Kerberos configuration
Directory Access
• Check
  – DSAccess cache configuration and non-default parameters, e.g.
    • MaxMemoryUser | MaxMemoryConfig
    • LdapKeepAliveSecs, DisableNetLogonCheck
    • MinUserDC
  – DSAccess cache efficiency
• DSAccess topology
  – Round-trip times between Exchange and each DC/GC in the topology
  – Hardware/OS configuration of each DC/GC
  – Calculates the GC to Exchange processor ratio
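The processor ratio calculation can be sketched as a simple division over per-server processor counts. The slides do not state a target ratio, so the default threshold of four Exchange processors per GC processor is an assumption based on commonly cited sizing guidance; the function names are hypothetical.

```python
def exchange_cpus_per_gc_cpu(exchange_cpus, gc_cpus):
    """Return Exchange processors per GC processor.

    Both arguments are lists of per-server processor counts.
    """
    total_gc = sum(gc_cpus)
    if total_gc <= 0:
        raise ValueError("at least one GC processor is required")
    return sum(exchange_cpus) / total_gc

def ratio_is_acceptable(exchange_cpus, gc_cpus, max_ratio=4.0):
    # max_ratio=4.0 is an assumed default, not a value taken from the slides.
    return exchange_cpus_per_gc_cpu(exchange_cpus, gc_cpus) <= max_ratio
```

With three Exchange servers of 4, 4 and 8 processors and two GCs of 2 processors each, the ratio is 16 / 4 = 4.0, which passes under the assumed threshold.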
Information Store
• Check
  – ESE cache configuration
  – Current state of virtual memory
  – Online maintenance window
  – Checkpoint depth
  – Circular logging state
  – Log buffer configuration
  – Log generation level
  – File system characteristics (NTFS/Compression/Encryption)
  – Validity of legacyExchangeDN
  – Database and logs on the same LUN
  – Content Indexing state
  – Non-default parameters in Private|Public-GUID registry
  – Database size
  – E-mail address on Public Folder stores
  – RPC Compression / Buffer Packing settings
  – Hard-coded TCP/IP ports, and clashes with other Exchange ports
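The port clash check in the last item can be sketched by grouping hard-coded port assignments by port number. The function name is hypothetical and the port values in the usage note are made up.

```python
from collections import defaultdict

def find_port_clashes(assignments):
    """Return {port: [services]} for every port claimed by more than one service.

    assignments is an iterable of (service name, port) pairs.
    """
    by_port = defaultdict(list)
    for service, port in assignments:
        by_port[port].append(service)
    return {port: services
            for port, services in by_port.items()
            if len(services) > 1}
```

Given `[("Information Store", 5000), ("System Attendant", 5001), ("RFR", 5000)]`, the function reports that port 5000 is claimed by both the Information Store and RFR.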
Store Process Parameters
• Check for non-default settings and bad values
• Examples
  – Disable MAPI Clients
  – Enable Tracing
  – Initial Memory Percentage
  – Initial Reserve Size KB
  – Ignore Zombie Users
  – Logon Only As
  – Mailbox Cache Age Limit
  – Mailbox Cache Idle Limit
  – Mailbox Cache Size
  – MaxOpenMessagesPerLogon
  – Reserve Increment KB
  – SuppressOOFsToDistributionLists
  – Trace User LegacyDN
  – VM Warning|Error Level
  – objtAttachment
  – objtFolderView
  – objtMessage
  – ProrateFactor
  – ProrateStart
  – ProrateMax
  – IMAIL settings
  – ExIFS drive
Transport
• Check
  – Main configuration parameters in the AD
  – Cross-check AD and metabase for consistency
  – Non-default settings
  – File system characteristics for ‘mailroot’ folders (NTFS/Compression/Encryption)
  – SMTP stack verb validation (e.g. X-LINK2STATE)
  – SMTP mail submission test
  – Enumeration of transport event sinks
  – Enumeration of MTA settings, calling out any non-defaults
  – Detection of Archive Sink and configuration
  – Non-default routing parameters (e.g. SuppressStateChanges)
System Attendant
• Check
  – Service state
  – File system characteristics for message tracking folder (NTFS/Compression/Encryption)
  – RFR service
  – RFR / NSPI Target Server configuration
  – Hard-coded TCP/IP ports
Anti-virus Support
• CA eTrust 6/7 file-level AV configuration and exclusions
• Trend Micro ScanMail
  – Patch level
  – Performance tuning configuration (threads/thresholds/debug settings)
• Product detection and configuration settings for
  – McAfee GroupShield
  – Symantec Mail Security for Exchange
  – Sybari Antigen
• VS API configuration settings
  – Warn if number of threads is not appropriate for underlying hardware
Other Installed Applications
• Check
  – RPC Client|Server binding order configuration
  – Presence of LeakDiag
  – Old versions of Simpler-Webb ERM
  – ISA 2000 Service Pack level
  – Presence of MOM Agent
Hardware Configuration
• Check
  – System BIOS is not over a year old
  – Specific support for HP, Dell and IBM servers
  – Processor configuration
  – Physical memory installed
Disk Storage System
• Check
  – Performance counters are enabled
  – Enumeration of physical and logical disks
  – Enumeration and identification of mount points
  – Enumeration of disk controllers and driver levels
  – Configuration of Host Bus Adaptors
  – Version of multi-pathing software (e.g. SecurePath, PowerPath)
File Versions
• Verify 29 key Exchange binaries
  – Physical presence
  – Make sure that they’re not too old
  – Identify binaries which are hotfixes
• Check
  – Server MAPI subsystem
  – Presence of old Roll-ups
  – Presence of ESE API virus scanners
Hotfixes
• Detect all hotfixes and Service Packs installed for
  – Windows 2000
  – Windows 2003
  – Exchange 5.5
  – Exchange 2000
  – Exchange 2003
• Call out any updates that were installed during the past 5 days, and the logon name of the user that performed the installation
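The five-day call-out can be sketched as a date filter over the detected updates. The record layout (`name`, `installed`, `installed_by`) and the function name are assumptions made for this example, not the tool's actual data model.

```python
from datetime import datetime, timedelta

def recent_updates(updates, now, window_days=5):
    """Return (name, installed_by) for updates installed within the window.

    Each update is a dict with hypothetical keys:
    'name', 'installed' (datetime) and 'installed_by' (logon name).
    """
    cutoff = now - timedelta(days=window_days)
    return [(u["name"], u["installed_by"])
            for u in updates
            if u["installed"] >= cutoff]
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the check repeatable when the same collected data is analyzed again later.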
Network Subsystem
• Enumerate all network cards
• Check
  – NIC connection status
  – DNS/WINS configuration
  – IP Gateway settings
  – Primary DNS is alive
  – Domain suffix
Operating System
• Check
  – Page Table Entry (PTE) levels
  – Paged|NonPaged pool configuration
  – CrashOnAuditFail configuration
  – HeapDeCommitFreeBlockThreshold
  – TEMP/TMP paths
  – SystemPages configuration
  – /3GB and /USERVA configuration
  – Physical Address Extensions (PAE) detection
  – OS Version and SKU (e.g. Standard, Enterprise, etc.)
  – Dr. Watson configuration
  – Debug settings (including GlobalFlag, PageHeapFlags)
  – Virtual PC / Virtual Server / VMware detection
Success Stories
• Identified that circular logging was enabled on a 12,000-user Exchange cluster
  – Was a potential time-bomb
• Identified incorrect memory configuration that required the Exchange server to be restarted every two weeks
• Identified a case where database files were being stored on a compressed volume
  – Root cause of the performance problems
ExBPA Timeline
• V1.0: September 21st
  – 1200 point collection / 800 rules
• V1.1: December 6th
  – Usability improvements
  – 1300 point collection / 900 rules
• V2.0: Early March
  – Localized in all Exchange Server languages
  – Performance sampling and root cause analysis infrastructure
  – Admin API support (e.g. find out time of last backup)
  – Optional integration with MOM 2005
  – Export to XML / HTM / CSV
  – New baseline logic
• V3.0: Later on in the year
  – More rules and refinements
  – MAPI.NET collector
Appendix: Screen Shots