Felix Ehm MONITORING AND DIAGNOSTIC OF MIDDLEWARE SERVICES

- Slides: 23

Felix Ehm MONITORING AND DIAGNOSTIC OF MIDDLEWARE SERVICES 28th June 2012

Introduction
Problem: how can I detect a failure in my systems?
• What is the reason? Host, network? → Add machine monitoring
• Is my program running correctly? → ?

Introduction
Problem: how can I detect a failure in my systems?
Gain control by exposing process-internal information to enable constant monitoring for pre-failure recognition.
• JMX for Java processes
• CMWAdmin for CMW servers
• CMX for general C/C++ services
• Tracing/Central Logging System
Gina Gorgogianni, CMX Feedback

The Java Management Extensions (JMX)
• Java standard to expose process-internal information
• Inspect data (remotely) via JConsole/JVisualVM
• Many monitoring systems support this
Example: JMS broker

The CMWAdmin GUI
• Java GUI to inspect CMW-enabled processes
• Browse and watch information from one server
• Uses CMW middleware to access data
• List of CMW servers from the Directory Server
[Figure: CMWAdmin]

The CMX Library
A general solution for exposing internal metrics of C/C++ programs.
• Idea originates from JMX: why can’t we have something like this for C/C++?
• Requirements
  ○ Small memory footprint
  ○ Non-blocking calls
  ○ Metrics: floats & strings
• Project started in 2012

Architecture
High level
• 2 lightweight APIs with non-blocking operations to
  ○ Update: registers, exposes & updates metrics
  ○ Read: retrieves information for metrics / process
• No dependencies
Low level (shared memory)
• Main segment: table containing the registered processes
• Process segments: structures containing information on metrics
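The Update/Read split over a fixed-size metric table can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the slot layout, the names `ProcessSegment`, `SLOT` and `MAX_METRICS`, and the use of a plain `bytearray` as a stand-in for a shared-memory mapping. It is not the real CMX format or API, and it only covers float metrics, not strings.

```python
import struct

# Hypothetical slot layout (NOT the real CMX format):
# 32-byte zero-padded metric name followed by one 8-byte float value.
SLOT = struct.Struct("<32sd")
MAX_METRICS = 16


class ProcessSegment:
    """Fixed-size metric table stored in a flat byte buffer."""

    def __init__(self, buf=None):
        # A bytearray stands in for a shared-memory mapping in this sketch.
        self.buf = buf if buf is not None else bytearray(SLOT.size * MAX_METRICS)

    def _slots(self):
        for i in range(MAX_METRICS):
            name, value = SLOT.unpack_from(self.buf, i * SLOT.size)
            yield i, name.rstrip(b"\0").decode(), value

    def update(self, name, value):
        """Register the metric on first use, then overwrite its value in place."""
        target = None
        for i, slot_name, _ in self._slots():
            if slot_name == name:
                target = i          # existing metric wins over any free slot
                break
            if slot_name == "" and target is None:
                target = i          # remember the first free slot
        if target is None:
            raise MemoryError("no free metric slot")
        SLOT.pack_into(self.buf, target * SLOT.size, name.encode(), float(value))

    def read(self):
        """Return all registered metrics as a name -> value mapping."""
        return {name: value for _, name, value in self._slots() if name}
```

A reader only needs the buffer, not the writer object: `ProcessSegment(writer.buf).read()` returns the same metrics. The table is preallocated and updates overwrite a slot in place without taking a lock, which is one plausible way to keep the calls non-blocking with a small footprint.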

CMX Library Characteristics
• Very small footprint: 140 KB of memory
• Easy non-blocking API: 10 core functions in total
• Supports float and string data types
• Incorporated input from real-time experts
• CMX library is ready for preproduction
• No dependencies on external libraries
• Future: deployment for all CMW servers
  ○ But also applicable to other C/C++ projects

Constant Monitoring
• Host health
• Process up/down
• Process service endpoint OK?
  ○ E.g. HTTP server: is wget successful?
• Process does what it is supposed to do

Constant Monitoring
• DIAMON as CO in-house solution
  ○ Reads metrics and applies rules
  ○ Easy to extend through pluggable architecture
  ○ Provides history of metrics
  ○ Provides replay functionality
• In case of problem detection
  ○ Displays it to operators
  ○ Sends notification via SMS/mail
[Diagram labels: Controls config, DIAMON, DAQs, JMX, CMW, CMX]
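"Reads metrics and applies rules" can be pictured with a small sketch. This is not DIAMON's rule engine; the `Rule` class, the `evaluate` function and the metric names in the usage note are hypothetical and only show the general idea of checking the latest metric values against per-metric conditions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    """Hypothetical rule: a metric name plus a health check on its value."""
    metric: str
    is_healthy: Callable[[float], bool]
    message: str


def evaluate(rules: List[Rule], metrics: Dict[str, float]) -> List[str]:
    """Return one problem description per violated rule or missing metric."""
    problems = []
    for rule in rules:
        if rule.metric not in metrics:
            # A metric that stopped arriving is itself a symptom.
            problems.append(f"{rule.metric}: no data")
        elif not rule.is_healthy(metrics[rule.metric]):
            value = metrics[rule.metric]
            problems.append(f"{rule.metric}: {rule.message} (value={value})")
    return problems
```

For example, with rules `Rule("queue_depth", lambda v: v < 1000, "queue too long")` and `Rule("heartbeat_age_s", lambda v: v < 30, "stale heartbeat")`, a sample containing only `queue_depth = 1500.0` yields two problems: the queue is too long, and the heartbeat metric is missing. The resulting list is what would be shown to operators or sent as a notification.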

The DIAMON Synoptic Viewer

The DIAMON Console

DIAMON
• View history data on metrics

The Central Tracing/Logging System
I need more information than just numbers to diagnose a problem!
• Log events are helpful
  ○ Find the point where the program crashes/fails
  ○ Access to (past) events is required
• Problems
  ○ Frontends are diskless
  ○ Multi-layer systems imply watching many sources at the same time
  ○ You quickly drown in the amount of information
• CMW project was initiated in June 2011
  ○ Target: collect log events from CMW servers for better diagnostics (n.b. log events = info, debug, error, warning, etc.)
  ○ Replace previous system

The Central Tracing/Logging System
• Finding/debugging a problem becomes cumbersome!
• Collecting and unifying tracing messages in one central place
• Easy correlation of events among many services
[Diagram labels: Tracing Server, ?, DB, Equipment Specialist / Developer]

The Tracing/Log GUI
[Screenshot labels: Record to File, Filter, Avail. Log Instances, Incoming log events, Message Panel]

The CMW Tracing Package
• C++ client library
  ○ Very lightweight
  ○ Supports TCP + UDP
  ○ File + syslog + STOMP appender
  ○ Integrated with CMW components
  ○ Log level can be changed during runtime
• Java client library
  ○ Based on log4j
  ○ Very easy to integrate with existing Java services
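The two client-side ideas above, a network appender and a log level that can be changed at runtime, can be sketched with Python's standard `logging` module. This is an illustration only, not the CMW tracing library: `SendHandler`, `udp_sender`, `make_logger` and the message format are hypothetical, and the real wire format is not described on the slides.

```python
import logging
import socket


class SendHandler(logging.Handler):
    """Hypothetical appender: passes each formatted event to a send(bytes) callable."""

    def __init__(self, send):
        super().__init__()
        self._send = send

    def emit(self, record):
        try:
            self._send(self.format(record).encode("utf-8"))
        except OSError:
            # Report the failure through logging's own hook instead of
            # letting a network error propagate into the service.
            self.handleError(record)


def udp_sender(host, port):
    """Return a send(bytes) function that fires one UDP datagram per event."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    return lambda payload: sock.sendto(payload, (host, port))


def make_logger(name, send, level=logging.WARNING):
    """Create a logger whose events go only to the given send function."""
    logger = logging.getLogger(name)
    logger.propagate = False  # keep events out of the root logger
    handler = SendHandler(send)
    handler.setFormatter(logging.Formatter("%(levelname)s %(name)s %(message)s"))
    logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

A service would call something like `make_logger("rda.server", udp_sender("tracing-host.example", 5140))`, where host and port are placeholders. Calling `logger.setLevel(logging.DEBUG)` later raises the verbosity without a restart, which matters on diskless frontends where local log files are not an option.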

The CMW Tracing Package
The server
• Modules
  ○ Converters to accept messages
  ○ Broker to distribute data
  ○ File Writer and Database Writer
  ○ Registry keeping discovered sources
• Can be deployed as an all-in-one process or as separate processes
  ○ Scales horizontally and vertically

The CMW Tracing Package
• C/C++ & Java libraries for log events
• C/C++ library for config messages
• Server
  ○ Accepts events coming via UDP or TCP
  ○ Stores events in database and files
  ○ Sends events to multiple receivers
• User interface(s)
  ○ “online”: Java GUI, Linux console (web console)
  ○ “offline”: database viewer
[Diagram label: UDP based]

The CMW Tracing Service
• Nearly all CMW services send log events
  ○ Proxies, RDA servers, JMS, …
• Great help for identifying problems
• Easy to extend to other protocols
• Performance
  ○ ~100 M messages/day
  ○ 6% stored in the DB
  ○ 100% additionally stored in files
  ○ System does very well
• Low network and CPU load
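To put the performance figures into per-second terms, here is a back-of-the-envelope calculation. It uses only the two numbers from the slide and assumes an even spread over the day, so it gives averages, not peak rates.

```python
SECONDS_PER_DAY = 24 * 60 * 60          # 86 400

messages_per_day = 100_000_000          # "~100 M messages/day" from the slide
db_share_percent = 6                    # "6% stored in the DB" from the slide

# Average rates, assuming messages are spread evenly over the day.
avg_rate = messages_per_day / SECONDS_PER_DAY
db_per_day = messages_per_day * db_share_percent // 100
db_rate = db_per_day / SECONDS_PER_DAY

print(f"average ingest rate: {avg_rate:.0f} msg/s")
print(f"DB writes: {db_per_day:,} msg/day, about {db_rate:.0f} msg/s")
```

That is roughly 1,160 messages per second on average arriving at the server, of which about 6 million per day (around 70 per second) reach the database.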

The CMW Tracing Service
• Also collects information other than log events
• What is done, where, when, and by whom?
  ○ Software upgrades / installations
  ○ Process restart events
  ○ Configuration changes

Summary
Gain control by exposing process-internal information to enable constant monitoring for pre-failure recognition.
• JMX for Java processes
• CMWAdmin for CMW servers
• CMX for general C/C++ services
• Tracing/Central Logging System
• DIAMON
But: also try to monitor the system as the user sees it
  ○ JMS: send a test message and measure the speed; > 100 ms = WARNING
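The user-view check from the last bullet can be sketched as a timed probe. Only the 100 ms threshold comes from the slide; the `probe` function, its `round_trip` argument (any callable that sends a test message and waits for it to come back) and the injectable `clock` are assumptions made for illustration, with no real JMS client involved.

```python
import time

WARNING_THRESHOLD_MS = 100.0  # "> 100 ms = WARNING", taken from the slide


def probe(round_trip, clock=time.perf_counter):
    """Time one end-to-end round trip and classify the result.

    round_trip: hypothetical callable that sends a test message and
                blocks until the message has been received again.
    clock:      time source in seconds; injectable so it can be faked.
    """
    start = clock()
    round_trip()
    elapsed_ms = (clock() - start) * 1000.0
    status = "WARNING" if elapsed_ms > WARNING_THRESHOLD_MS else "OK"
    return status, elapsed_ms
```

Run periodically, the returned status and latency can be published as a metric like any other, so the same rules and notifications apply to what the user actually experiences.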
