VNF Event Streaming: Onboarding Telemetry Policies Alok Gupta, AT&T Bryan Sullivan, AT&T June 15, 2017
VES (VNF Event Stream) Objective
§ Enable a significant reduction in the effort needed to integrate VNF telemetry into automated VNF management systems
§ Converge on a common event stream format to simplify closed-loop automation
§ Enable self-service onboarding through a VES artifact that defines telemetry policy
https://wiki.opnfv.org/display/VES
VNF Event Streaming: Onboarding Telemetry Policies
Copyright 2017 AT&T Intellectual Property. All rights reserved.
Why Needed
§ Telemetry data formats, semantics, and the actions expected of management systems vary widely
  § For fault events, vendors use SNMP, 3GPP CORBA, MTOSI, OSSJ, etc., and semantics can differ (e.g. Critical severity encoded as "1" or "5")
  § For measurement events (KPI/KCI), vendors deliver CSV- or XML-based files with varying internal data formats
  § AIDs, MIBs, and NMTPs are required for each VNF and each VNF release
§ This variance results in substantial development and maintenance costs for integrating VNFs into management systems
  § 3-6 months of development is typically needed
Project Scope & Deliverables
§ VNF Event Stream Common Event Data Model
  § Now on the 3rd major revision in OPNFV
  § Further updates to be developed in ONAP
§ VNF On-boarding Artifact
  § YAML-based artifact defining VNF telemetry capabilities and needs
§ Integration into OPNFV reference platforms
  § Agent code from the Barometer project (collectd)
  § ONAP VES library for integration with C-based agents (e.g. VES demo agent)
  § ONAP Collector and telemetry backend systems (e.g. database, message bus, closed-loop policy framework, dashboards)
VES – Common Event Data Model
§ Common Event Data Model
  § Common Header and Domain Specific Event
  § Extensible for additional fields or domains
§ Collector connection and data profile established at VM creation
  § Connection/authentication/profile parameters injected into VM
§ Data profile is fully controllable, to optimize telemetry overhead
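The header-plus-domain layout can be sketched in a few lines of Python. This is a minimal illustration, not the ONAP VES library: the helper name `make_event` is hypothetical, the header field names are borrowed from the sample events later in this deck, and the rule that the domain block is keyed `<domain>Fields` is an assumption drawn from those samples.

```python
import json
import time
import uuid


def make_event(domain, event_name, source_name, reporting_entity_name,
               sequence=0, priority="Normal", domain_fields=None):
    """Build an event as a common header plus an optional domain-specific block."""
    now_usec = int(time.time() * 1_000_000)
    event = {
        "commonEventHeader": {
            "domain": domain,
            "eventName": event_name,
            "eventId": str(uuid.uuid4()),
            "priority": priority,
            "reportingEntityName": reporting_entity_name,
            "sequence": sequence,
            "sourceName": source_name,
            "startEpochMicrosec": now_usec,
            "lastEpochMicrosec": now_usec,
            "version": 3.0,
        }
    }
    if domain_fields is not None:
        # Assumed convention: the domain block is keyed "<domain>Fields",
        # e.g. "faultFields" for the fault domain.
        event[f"{domain}Fields"] = domain_fields
    return {"event": event}


if __name__ == "__main__":
    heartbeat = make_event("heartbeat", "Heartbeat_vMRF", "MegaMRF", "MegaMRFVf")
    print(json.dumps(heartbeat, indent=2))
```

Because every domain shares the same header, a collector can route on `commonEventHeader.domain` before it looks at any domain-specific field.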
VES Demo
• Leveraging ONAP demo VES Agent and Collector
• Demonstrating InfluxDB/Grafana backend
• Covering host, VM, and VNF status/stats
• Showing fault/stats correlation
[Diagram: demo architecture. Labeled elements: OpenStack Client and API; Jumphost (TripleO); Terminal; Web Server traffic; Tacker / Docker; JSON/REST; SSH etc.; VES C Agent and ONAP C Agents (traffic, status, CPU, vNIC stats); NGINX / Docker, vLB (iptables), vFW (iptables); VDU 1/2 VM, VDU 3 VM, VDU 4 VM, VDU 5 VM; VES Collector, InfluxDB, Grafana; Barometer collectd agent (host and VM (libvirt) stats: CPU, NIC, memory, ...); OpenStack Compute Host (Nova-compute, Neutron-gw, etc.); OpenStack Controller.]
VES On-Boarding Artifact
§ To enable self-service, VNF vendors can provide an on-boarding artifact covering:
  § Which VES event domains are supported by the VNF's VES agents
  § Optional fields supported, both in the body and as name/value extensions
  § Enumeration of fault events, each with a recommended action for resolution
  § Ranges and related Threshold Crossing Alerts (TCAs)/actions for VNF measurement fields
  § Complex (multi-field correlated) Threshold Crossing Alerts/actions
  § Scale in/out recommendations based on single or correlated fields
  § Syslog tag data with recommended actions
§ All artifacts go through one onboarding process: no need for AIDs, MIBs, or NMTPs, which enables automation
VES On-Boarding Artifact Use
SDC VES-Onboarding Artifact: all domains supported, including optional fields
• Heartbeat – default heartbeat interval
• Fault – all possible faults with recommended action
• Measurement (KPI/KCI) – with recommended TCAs
• Syslog tags with recommended actions
Policy Creation Framework Design, Closed Loop Design Creation: info to DCAE components
• Event validation, e.g. domains/fields supported
• Rough policies based on vendor-recommended actions
• Mapping info or logic needed by micro-services
• CLAMP flows for closed loop and open loop
VNFC Instantiation/VFC Life Cycle
• User name/password
• FQDN
• Configurable parameters
  • Heartbeat interval
  • Measurement interval
  • Configurable domain data
Other Portals
• Ops Portal – electronic M&Ps
• P&E Portal – capacity planning data
Registration Spec

    ---
    # registration for Heartbeat_vMRF
    event: {presence: required, structure: {
      commonEventHeader: {presence: required, structure: {
        domain: {presence: required, value: heartbeat},
        eventName: {presence: required, value: Heartbeat_vMRF},
        eventId: {presence: required},
        nfNamingCode: {presence: required, value: mrfx},
        priority: {presence: required, value: High},
        reportingEntityName: {presence: required},
        sequence: {presence: required},
        sourceName: {presence: required},
        startEpochMicrosec: {presence: required},
        lastEpochMicrosec: {presence: required},
        version: {presence: required, value: 3.0}
      }},
      heartbeatFields: {presence: optional, structure: {
        heartbeatFieldsVersion: {presence: required, value: 1.0},
        heartbeatInterval: {presence: required, range: [ 0, 600 ], default: 60 }
      }}
    }}
    ...

Sample Event

    {
      "event": {
        "commonEventHeader": {
          "domain": "heartbeat",
          "eventName": "Heartbeat_vMRF",
          "eventId": "ab305d54-85b4-a31b-7db2-fb6b9e546015",
          "nfNamingCode": "mrfx",
          "priority": "Normal",
          "reportingEntityId": "cc305d54-75b4-431b-adb2-eb6b9e541234",
          "reportingEntityName": "MegaMRFVf",
          "sequence": 0,
          "sourceId": "de305d54-75b4-431b-adb2-eb6b9e546014",
          "sourceName": "MegaMRF",
          "startEpochMicrosec": 1413378172000000,
          "lastEpochMicrosec": 1413378172000000,
          "version": 3.0
        }
      }
    }
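A collector could use a registration spec like this one for event validation. The sketch below is one possible reading, not the ONAP implementation: the function name `validate` is hypothetical, the spec and sample are abridged and hand-transcribed into Python dicts, and the semantics are assumptions (`presence: required` means the field must exist, `value` means an exact match, `range` means inclusive bounds).

```python
def validate(spec, data, path="event"):
    """Return a list of violations of `spec` found in `data` (assumed semantics)."""
    errors = []
    if data is None:
        if spec.get("presence") == "required":
            errors.append(f"{path}: required but missing")
        return errors
    if "value" in spec and data != spec["value"]:
        errors.append(f"{path}: expected {spec['value']!r}, got {data!r}")
    if "range" in spec:
        low, high = spec["range"]
        if not low <= data <= high:
            errors.append(f"{path}: {data!r} outside [{low}, {high}]")
    if "structure" in spec:
        if not isinstance(data, dict):
            errors.append(f"{path}: expected an object")
        else:
            for name, child in spec["structure"].items():
                errors.extend(validate(child, data.get(name), f"{path}.{name}"))
    return errors


# Abridged transcription of the Heartbeat_vMRF registration spec.
HEARTBEAT_SPEC = {"presence": "required", "structure": {
    "commonEventHeader": {"presence": "required", "structure": {
        "domain": {"presence": "required", "value": "heartbeat"},
        "eventName": {"presence": "required", "value": "Heartbeat_vMRF"},
        "priority": {"presence": "required", "value": "High"},
        "sequence": {"presence": "required"},
    }},
    "heartbeatFields": {"presence": "optional", "structure": {
        "heartbeatInterval": {"presence": "required", "range": [0, 600]},
    }},
}}

# Abridged transcription of the sample event's "event" object.
SAMPLE_EVENT = {"commonEventHeader": {
    "domain": "heartbeat",
    "eventName": "Heartbeat_vMRF",
    "priority": "Normal",
    "sequence": 0,
}}

if __name__ == "__main__":
    for problem in validate(HEARTBEAT_SPEC, SAMPLE_EVENT):
        print(problem)
```

Run against the sample event from this slide, the sketch reports a single violation: the sample carries priority "Normal" while the spec fixes the value at High. The absent `heartbeatFields` block is accepted because the spec marks it optional.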
Registering EventType: Fault_vMRF_InvalidLicense

Registration Spec

    ---
    # registration for Fault_vMRF_InvalidLicense
    event: {
      presence: required,
      action: [any, invalidLicense, RECO-renewLicence],
      structure: {
        commonEventHeader: {presence: required, structure: {
          domain: {presence: required, value: fault},
          eventName: {presence: required, value: Fault_vMRF_InvalidLicense},
          eventId: {presence: required},
          nfNamingCode: {presence: required, value: mrfx},
          priority: {presence: required, value: High},
          reportingEntityName: {presence: required},
          sequence: {presence: required},
          sourceName: {presence: required},
          startEpochMicrosec: {presence: required},
          lastEpochMicrosec: {presence: required},
          version: {presence: required, value: 3.0}
        }},
        faultFields: {presence: required, structure: {
          faultFieldsVersion: {presence: required, value: 1.2},
          alarmCondition: {presence: required, value: "Invalid license key"},
          eventSourceType: {presence: required, value: virtualNetworkFunction},
          specificProblem: {presence: required, value: "The node license key is invalid"},
          eventSeverity: {presence: required, value: CRITICAL},
          vfStatus: {presence: required, value: Active},
          alarmAdditionalInformation: {presence: required, array: [
            field: {presence: required, structure: {
              name: {presence: required, value: license_key},
              value: {presence: required}
            }}
          ]}
        }}
      }
    }
    ...

Sample Event

    {
      "event": {
        "commonEventHeader": {
          "domain": "fault",
          "eventName": "Fault_vMRF_InvalidLicense",
          "eventId": "ab305d54-85b4-a31b-7db2-fb6b9e546015",
          "nfNamingCode": "mrfx",
          "priority": "High",
          "reportingEntityId": "cc305d54-75b4-431b-adb2-eb6b9e541234",
          "reportingEntityName": "MegaMRFVf",
          "sequence": 0,
          "sourceId": "de305d54-75b4-431b-adb2-eb6b9e546014",
          "sourceName": "MegaMRF",
          "startEpochMicrosec": 1413378172000000,
          "lastEpochMicrosec": 1413378172000000,
          "version": 3.0
        },
        "faultFields": {
          "faultFieldsVersion": 1.2,
          "alarmCondition": "Invalid license key",
          "eventSourceType": "virtualNetworkFunction",
          "specificProblem": "The node license key is invalid",
          "eventSeverity": "CRITICAL",
          "vfStatus": "Active",
          "alarmAdditionalInformation": [
            { "name": "license_key", "value": "1000" }
          ]
        }
      }
    }
Registering EventType: MFVS vMRF

    ---
    # registration for Mfvs_vMRF
    event: {presence: required, structure: {
      commonEventHeader: {presence: required, structure: {
        domain: {presence: required, value: measurementsForVfScaling},
        eventName: {presence: required, value: Mfvs_vMRF},
        eventId: {presence: required},
        nfType: {presence: required, value: mrfx},
        priority: {presence: required, value: Normal},
        reportingEntityName: {presence: required},
        sequence: {presence: required},
        sourceName: {presence: required},
        startEpochMicrosec: {presence: required},
        lastEpochMicrosec: {presence: required},
        version: {presence: required, value: 3.0}
      }},
      measurementsForVfScalingFields: {presence: required, structure: {
        measurementsForVfScalingFieldsVersion: {presence: required, value: 2.0},
        measurementInterval: {presence: required, range: [ 60, 1200 ], default: 180 },
        concurrentSessions: {presence: required},
        cpuUsageArray: {presence: required, array: {
          cpuUsage: {presence: required, structure: {
            cpuIdentifier: {presence: required},
            percentUsage: {presence: required, range: [ 0, 100 ],
              action: [ 90, up, CpuUsageHigh, RECO-scaleOut, Tca_vMRF_HighCpuUsage ],
              action: [ 25, down, CpuUsageLow, RECO-scaleIn, Tca_vMRF_LowCpuUsage ]}
          }}
        }},
        memoryUsageArray: {presence: required, array: {
          memoryUsage: {presence: required, structure: {
            vmIdentifier: {presence: required},
            memoryFree: {presence: required, range: [ 0, 100 ],
              action: [ 100, down, FreeMemLow, RECO-scaleOut, Tca_vMRF_LowFreeMemory ],
              action: [ 1000, up, FreeMemHigh, RECO-scaleIn, Tca_vMRF_HighFreeMemory ]},
            memoryUsed: {presence: required}
          }}
        }},
        numberOfMediaPortsInUse: {presence: required, range: [ 1, 300 ] },
        additionalMeasurements: {presence: required, array: [
          measurementGroup: {presence: required, structure: {
            name: {presence: required, value: licenseUsage},
            measurements: {presence: required, array: [
              field: {presence: required, structure: {
                name: {presence: required, value: [
                  G711AudioPort, G729AudioPort, G722AudioPort, AMRAudioPort,
                  AMRWBAudioPort, OpusAudioPort, H263VideoPort, H264NonHCVideoPort,
                  H264HCVideoPort, MPEG4VideoPort, VP8NonHCVideoPort, VP8HCVideoPort,
                  PLC, NR, NG, NLD, G711FaxPort, T38FaxPort, RFactor, T140TextPort ] },
                value: {presence: required}
              }}
            ]}
          }}
        ]}
      }}
    }}
    ...
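The `action` annotations on `percentUsage` can be read as threshold crossing rules. The Python sketch below shows one way a policy component might evaluate them. It rests on assumptions: the five positions are interpreted as threshold, direction, tag, recommended action, and alert name, and a crossing is tested statelessly (`up` fires at or above the threshold, `down` at or below). The names `Action` and `crossed` are hypothetical.

```python
from typing import NamedTuple


class Action(NamedTuple):
    threshold: float
    direction: str       # "up" or "down"
    tag: str             # name that complex rules can refer to
    recommendation: str  # vendor-recommended action
    alert: str           # TCA event to raise


# Transcribed from the percentUsage entry of the Mfvs_vMRF registration.
CPU_ACTIONS = [
    Action(90, "up", "CpuUsageHigh", "RECO-scaleOut", "Tca_vMRF_HighCpuUsage"),
    Action(25, "down", "CpuUsageLow", "RECO-scaleIn", "Tca_vMRF_LowCpuUsage"),
]


def crossed(value, actions):
    """Return the actions whose threshold `value` has crossed (assumed semantics)."""
    hits = []
    for action in actions:
        if action.direction == "up" and value >= action.threshold:
            hits.append(action)
        elif action.direction == "down" and value <= action.threshold:
            hits.append(action)
    return hits


if __name__ == "__main__":
    for usage in (95, 50, 10):
        tags = [a.tag for a in crossed(usage, CPU_ACTIONS)]
        print(usage, tags)
```

A production evaluator would likely track the previous sample so that an alert fires once per crossing rather than on every reading beyond the threshold; that state is left out here for brevity.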
Registering EventType: Complex TCAs

    ---
    Rules: [
      rule: {
        trigger: CpuUsageHigh && FreeMemLow,
        microservices: [scaleOut],          # Note: this presumes there is a scaleOut microservice
        alerts: [Tca_vMRF_OutOfResources]   # Note: this TCA should be defined in the YAML
      },
      rule: {
        trigger: CpuUsageLow && FreeMemHigh,
        microservices: [scaleIn]            # Note: this presumes there is a scaleIn microservice
      }
    ]
    ...
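Complex TCAs combine the tags raised by single-field thresholds. A minimal Python sketch of rule evaluation follows; it handles only the `&&` operator that appears on this slide, and the function names `trigger_fires` and `evaluate` are hypothetical rather than part of any ONAP component.

```python
# Transcribed from the Rules block on this slide.
RULES = [
    {"trigger": "CpuUsageHigh && FreeMemLow",
     "microservices": ["scaleOut"],
     "alerts": ["Tca_vMRF_OutOfResources"]},
    {"trigger": "CpuUsageLow && FreeMemHigh",
     "microservices": ["scaleIn"]},
]


def trigger_fires(trigger, active_tags):
    """True when every tag joined by '&&' in `trigger` is currently active."""
    terms = [term.strip() for term in trigger.split("&&")]
    return all(term in active_tags for term in terms)


def evaluate(rules, active_tags):
    """Collect the microservices and alerts of every rule whose trigger fires."""
    microservices, alerts = [], []
    for rule in rules:
        if trigger_fires(rule["trigger"], active_tags):
            microservices.extend(rule["microservices"])
            alerts.extend(rule.get("alerts", []))
    return microservices, alerts


if __name__ == "__main__":
    print(evaluate(RULES, {"CpuUsageHigh", "FreeMemLow"}))
    print(evaluate(RULES, {"CpuUsageHigh"}))
```

With both `CpuUsageHigh` and `FreeMemLow` active, the first rule selects the `scaleOut` microservice and the `Tca_vMRF_OutOfResources` alert; with only one of the two tags active, no rule fires.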
Registering EventType: syslogs vMRF

    # registration for Syslog_vMRF
    # log all, restart if tag = Out_of_Memory
    event: {presence: required, action: [any, null, RECO-log],
      structure: {
        commonEventHeader: {presence: required, structure: {
          domain: {presence: required, value: syslog},
          eventName: {presence: required, value: Syslog_vMRF},
          eventId: {presence: required},
          nfNamingCode: {presence: required, value: mrfx},
          priority: {presence: required, value: Normal},
          reportingEntityName: {presence: required},
          sequence: {presence: required},
          sourceName: {presence: required},
          startEpochMicrosec: {presence: required},
          lastEpochMicrosec: {presence: required},
          version: {presence: required, value: 3.0}
        }},
        syslogFields: {presence: required, structure: {
          eventSourceHost: {presence: required},
          eventSourceType: {presence: required, value: virtualNetworkFunction},
          syslogFacility: {presence: required, range: [0, 23]},
          syslogFieldsVersion: {presence: required, value: 3.0},
          syslogMsg: {presence: required},
          syslogPri: {presence: required, range: [0, 192]},
          syslogProc: {presence: required, range: [0, 65536]},
          syslogSData: {},
          syslogSdId: {},
          syslogSev: {presence: required, range: [0, 7]},
          syslogTag: {presence: required, action: ["Out_of_Memory", at, null, reco-restart]},
          syslogVer: {presence: required, value: 0}
        }}
      }}
    ...
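The two `action` entries in the syslog registration ("log all, restart if tag = Out_of_Memory") can be sketched as a small lookup in Python. The interpretation is an assumption: `any` is taken to match every event, and `at` is taken to mean an exact match on the field value. The function name `syslog_recommendations` is hypothetical.

```python
# Transcribed from the syslogTag entry: (value, direction, tag, recommendation).
SYSLOG_TAG_ACTIONS = [("Out_of_Memory", "at", None, "reco-restart")]

# Transcribed from the event-level entry: action: [any, null, RECO-log].
EVENT_LEVEL_RECOMMENDATION = "RECO-log"


def syslog_recommendations(syslog_fields):
    """Return the recommended actions for one syslog event (assumed semantics)."""
    recommendations = [EVENT_LEVEL_RECOMMENDATION]  # "any": applies to every event
    tag = syslog_fields.get("syslogTag")
    for value, direction, _tag, recommendation in SYSLOG_TAG_ACTIONS:
        if direction == "at" and tag == value:      # "at": exact match on the tag
            recommendations.append(recommendation)
    return recommendations


if __name__ == "__main__":
    print(syslog_recommendations({"syslogTag": "Out_of_Memory", "syslogMsg": "oom"}))
    print(syslog_recommendations({"syslogTag": "sshd", "syslogMsg": "login"}))
```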
What's next for upstream development
• Create Policies from YAML Artifact
• Policy Hardening (from intent to detailed specification)
• Explore Data Storage
• Build Micro-Services that can be used by Policies
• Build Flows to Analyze Data and Take Action
[Diagram: target architecture. Labeled elements: Service Design and Creation (SDC); Policy Apps; Analytics Apps; Data Store (InfluxDB); Grafana; DMaaP; DCAE Collector; DCAE/App-C Controllers; OpenStack Controller; JSON/REST; VES C Agent, ONAP vLB C Agent, ONAP vFW C Agent (traffic, status, VPP stats); NGINX / Docker, ONAP VPP vLB/vDNS, ONAP VPP vFW; VDU 1/2 VM, VDU 3 VM, VDU 4 VM; Barometer collectd agent (host and VM (libvirt) stats: CPU, NIC, memory, ...); OpenStack Compute Host (Nova-compute, Neutron-gw, etc.).]
Takeaways and Next Steps
§ Common telemetry data models and collection frameworks can improve ROI in VNF onboarding/automation
§ OPNFV will work with ONAP as upstream home for
  § Alignment with SDOs (TM Forum, OASIS, ETSI) on VES Event Data Model and On-Boarding Artifact standards
  § Expanding VES to control plane monitoring/integration, in addition to VNFs
§ OPNFV VES will integrate/test ONAP telemetry-driven policy management across OPNFV reference platforms
Thank You!