JINR Tier-1 service monitoring system: Ideas and Design
JINR Tier-1 service monitoring system: Ideas and Design LIT Igor Pelevanyuk, Ivan Kadochnikov @GRID 2016
1 Introduction Why it is important and complicated
WLCG: Worldwide LHC Computing Grid. Tier-0: 20% of compute capacity. Tier-1: highly reliable; serve T2 centers. Tier-2: 160 centers; serve users' tasks. Tier-3: centers serving specific groups.
Purpose: store, process, and deliver data coming from the Large Hadron Collider, for and to other T1s, T2s, and T3s.
Services: Process, Store, Transfer, Other
Hierarchy: Services, OS/Software, Hardware, Network, Infrastructure. Icons are designed by Freepik.
Task: Deploy a system that shows the status of services on a single page, with the ability to investigate the causes of problems.
2 Work done What we have already achieved
Sources Local Parsing Fast Slow Controllable Security Parsing Security Uncontrollable Security Parsing Important data
Web security: the monitoring host sends an HTTPS request to ProxyAgent; ProxyAgent issues an HTTP request inside the local network and returns the <HTML> response to the monitoring host.
SSH security: the monitoring host runs "ssh monitor@monitored test1" against the monitored host and receives stdout and stderr. The monitored host is configured to run ProxyCommand.sh for a particular SSH key. ProxyCommand.sh contains a list of allowed commands. Passing test1 as a parameter could lead to executing /opt/adm/qsub -q
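The forced-command idea on this slide can be sketched in a few lines. This is a minimal illustration in Python, not the ProxyCommand.sh script itself: the names ALLOWED_COMMANDS, resolve, and main are hypothetical, and the single table entry simply repeats the test1 example from the slide. It assumes the wrapper is installed as a forced command for the monitoring key, in which case OpenSSH passes the client's requested command in the SSH_ORIGINAL_COMMAND environment variable.

```python
import os
import subprocess
import sys

# Hypothetical allowlist: test name sent by the monitoring host -> fixed argv.
# The single entry repeats the example from the slide.
ALLOWED_COMMANDS = {
    "test1": ["/opt/adm/qsub", "-q"],
}


def resolve(requested):
    """Return the argv for an allowed test name, or None if it is not allowed."""
    argv = ALLOWED_COMMANDS.get(requested.strip())
    return list(argv) if argv is not None else None


def main():
    """Entry point when used as an SSH forced command."""
    # OpenSSH stores the command the client asked for in this variable
    # when a forced command is configured for the key.
    requested = os.environ.get("SSH_ORIGINAL_COMMAND", "")
    argv = resolve(requested)
    if argv is None:
        print("command not allowed", file=sys.stderr)
        return 1
    # The argv list is run without a shell, so the client-supplied string
    # is only ever used as a lookup key and never interpreted.
    return subprocess.run(argv).returncode
```

Because the client string is only a dictionary key, a request such as "test1; rm -rf /" matches nothing and is rejected rather than executed.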
First iteration: Visualization, Collection, Executors, JSONs, Retrievers, HTTP requests, HTML response, HTTP request
Dashboard
Transfer Service: PhEDEx
PhEDEx: Quality
HappyFace: Simple, Aggregation, Alarms, Detailed view
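The aggregation point can be illustrated with a "worst status wins" roll-up from modules to services. This is only a sketch of one common approach, not HappyFace's actual algorithm: the status levels, the function names aggregate and summarize, and the module names in the usage example are all assumptions made for illustration.

```python
# Hypothetical status levels, ordered from best to worst.
SEVERITY = {"ok": 0, "warning": 1, "critical": 2}


def aggregate(statuses):
    """Return the worst status in the collection, or "unknown" if it is empty."""
    statuses = list(statuses)
    if not statuses:
        return "unknown"
    return max(statuses, key=lambda status: SEVERITY[status])


def summarize(services):
    """Map each service name to the aggregated status of its modules."""
    return {
        name: aggregate(modules.values())
        for name, modules in services.items()
    }


# Usage with made-up module names:
overview = summarize({
    "transfer": {"quality": "warning", "errors": "ok"},
    "storage": {"disk": "critical"},
})
```

With this rule a single-page overview shows one status per service, and the per-module statuses remain available for investigating the cause.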
Task 2.0: Deploy a system that shows the status of services at different scales, with the ability to react automatically to occurring events and to allow forecasts based on past data.
3 Next Steps To the Future
Architecture: Module (Collect, Analyze, Forecast, React); DB; REST; HTML, CSS, JS
JavaScript
Want big impact?
New Techs: https://github.com/tier-one-monitoring Landscape.io: coding style check. Travis CI: continuous integration (installation, regression, compatibility, functionality, …). Coveralls.io: code coverage.
4 Conclusion Wrap up!
Conclusion: Alarm system required. Model of service interaction. Complex Event Processing techniques. More, more, and even more modules.
THANKS! Any questions? You can write to me at Pelevanyuk (at) jinr.ru