ONAP-Auto Test Framework
[Diagram: framework elements and their relationships]
1 Test Overall Objectives
   (DependsOn)
2 Test Strategy
   (DependsOn)
3 Test Documentation
   (Implements)
4 Test Script Implementation: script language (Python, bash, …); comments in script capture the test documentation
   (Executes)
5 Test Runner (calling script)
Test content:
• Test Specific Objectives (DependsOn)
• Use Case Description: "section" or "chapter" of a Test; could be just 1 UC
• Assertion (Pass/Fail Criteria): feature/capability being tested, success/fail determination (binary pass/fail condition, quantitative thresholds/ranges, …)
• Environment Description (* Environment Variant, IsVariantOf)
  • Physical Resources (unless TestaaS): physical servers (CPU, RAM), disks (HD, SSD), OS; pod organization
  • Cloud & Virtual Resources: Manager: OpenStack, K8S, AWS, …; Resources: compute (OS), storage, network, …; VMs or Containers
• VNF Manager: ONAP, Cloudify, Tacker, …
• VNFs: vXYZ (CPE, FW, Switch, HSS, I/S/P-CSCF, …); Source (OPNFV, Clearwater, ETSI, …); Format (JSON, TOSCA, …)
• VNF Lifecycle Events: Onboard, Remove, Deploy/Activate, Terminate, Scale-In/Out, Monitor, …
Test steps, Follows (sequence):
• Pre-test State: description of required configuration, environment variables, …
• Action (Test Step): e.g. setup, start, use, stop (functions in a script language); execute planned VNF LC events
• Assertion Step: evaluation of current state, application of assertion criteria, determination of Pass/Fail result
• Post-test State: capture/log of significant state data (environment variables, …)
• Cleanup Step: function to return system to its Pre-test State
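The step sequence in the framework (pre-test state, action, assertion step, post-test state, cleanup step) could be sketched roughly as follows. This is only an illustration, not the Auto project's actual runner: `UseCaseTest`, `run_use_case`, the `vnf_host` state key and the host names are hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class UseCaseTest:
    """One use-case 'section' of a test (hypothetical structure)."""
    name: str
    action: Callable[[Dict[str, Any]], None]     # test step, e.g. a VNF LC event
    assertion: Callable[[Dict[str, Any]], bool]  # binary pass/fail criterion


def run_use_case(test: UseCaseTest, state: Dict[str, Any]) -> Dict[str, Any]:
    """Run the steps in sequence and always restore the pre-test state."""
    pre_state = dict(state)              # pre-test state: snapshot of configuration
    post_state: Dict[str, Any] = {}
    passed = False
    try:
        test.action(state)                       # action (test step)
        passed = bool(test.assertion(state))     # assertion step
        post_state = dict(state)                 # post-test state: capture/log
    finally:
        state.clear()                            # cleanup step: back to pre-test state
        state.update(pre_state)
    return {"name": test.name, "passed": passed,
            "pre_state": pre_state, "post_state": post_state}


def migrate_vnf(state: Dict[str, Any]) -> None:
    # Stand-in for a real action: pretend the VNF moved to another host.
    state["vnf_host"] = "host-b"


example = UseCaseTest(
    name="auto-resiliency-pif-001",
    action=migrate_vnf,
    assertion=lambda s: s["vnf_host"] != "host-a",
)
system_state = {"vnf_host": "host-a"}
result = run_use_case(example, system_state)
```

Keeping the cleanup in a `finally` clause means the system returns to its pre-test state even when the action or the assertion raises.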
Resilience Improvements Use Case: Overall Objectives
• https://wiki.opnfv.org/display/AUTO/Auto+Use+Cases#AutoUseCases-UseCase2
• As an NFV edge service provider, I need to assess what degree of added VIM+NFVI platform resilience I obtain by leveraging ONAP closed-loop control, vs. VIM+NFVI self-managed resilience, so I can determine the ROI for integrating ONAP with my VIM+NFVI platforms, either locally on the VIM+NFVI platform or remotely.
• assess VIM+NFVI resilience under specific stresses without ONAP support
• assess VIM+NFVI resilience under specific stresses with ONAP support, i.e. VES+DCAE+CLAMP+specific policies to address the type of stress
• stress the system via:
  • traffic: different types as needed per the resilience measurements
  • instability:
    • physical infra failure (host, NIC, disk, ...)
    • virtual infra failure (cloud control plane components, SDNC components, NFVI components)
    • security threats
• measure (and compare) resilience in terms of:
  • link failure, e.g. loss, ping failures
  • transaction failure, e.g. failed web/API requests
  • service degradation/failure duration or recovery time
  • scope of disruption
  • GD: failure detection duration (time to notice/notify failure)
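The measurements listed above could be derived from a few timestamps and counters, then compared between a run without ONAP and a run with ONAP. The sketch below is illustrative only: the function names, the dictionary keys, and the choice to measure both durations from the moment of fault injection are assumptions, and the numbers are made-up inputs, not results.

```python
from datetime import datetime, timedelta
from typing import Dict


def resilience_metrics(fault_injected: datetime,
                       failure_detected: datetime,
                       service_restored: datetime,
                       requests_sent: int,
                       requests_failed: int) -> Dict[str, float]:
    """Compute example resilience measures for one assessment run."""
    return {
        # time to notice/notify the failure, counted from fault injection
        "detection_duration_s": (failure_detected - fault_injected).total_seconds(),
        # time it took to recover after the fault was injected
        "recovery_time_s": (service_restored - fault_injected).total_seconds(),
        # share of failed web/API requests during the run
        "failed_transaction_ratio": (requests_failed / requests_sent
                                     if requests_sent else 0.0),
    }


def compare_runs(without_onap: Dict[str, float],
                 with_onap: Dict[str, float]) -> Dict[str, float]:
    """Positive values mean the run with ONAP did better on that measure."""
    return {key: without_onap[key] - with_onap[key] for key in without_onap}


t0 = datetime(2018, 1, 1, 12, 0, 0)  # illustrative fault-injection time
baseline = resilience_metrics(t0, t0 + timedelta(seconds=20),
                              t0 + timedelta(seconds=120), 200, 40)
closed_loop = resilience_metrics(t0, t0 + timedelta(seconds=5),
                                 t0 + timedelta(seconds=65), 200, 10)
gain = compare_runs(baseline, closed_loop)
```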
Resilience Improvements Use Case: Strategy
Resilience Improvements Through ONAP [ID: Auto-UC-02]
• Physical Infrastructure Failure
  • Server Failure, Migration [ID: auto-resiliency-pif-001] [JIRA: AUTO-9]
  • Disk Failure, Migration [ID: auto-resiliency-pif-002] [JIRA: AUTO-10]
  • Link Failure, Migration [ID: auto-resiliency-pif-003] [JIRA: AUTO-11]
  • NIC Failure, Migration [ID: auto-resiliency-pif-004] [JIRA: AUTO-12]
• Virtual Infrastructure Failure
  • Compute Service Failure, (Auto)-Restoration or Migration [ID: auto-resiliency-vif-001] [JIRA: AUTO-13] (In progress)
  • SDN-C Service Failure, (Auto)-Restoration or Migration [ID: auto-resiliency-vif-002] [JIRA: AUTO-14]
  • OVS Failure, (Auto)-Restoration or Migration [ID: auto-resiliency-vif-003] [JIRA: AUTO-15]
  • Storage Service Failure, (Auto)-Restoration or Migration [ID: auto-resiliency-vif-xyz] [JIRA: xyz]
  • Networking Service Failure, (Auto)-Restoration [ID: auto-resiliency-vif-xyz] [JIRA: xyz]
• Security Failure
  • Host Tampering, Fencing, Migration [ID: auto-resiliency-sec-001] [JIRA: AUTO-16]
  • Host Intrusion, Fencing, Migration [ID: auto-resiliency-sec-002] [JIRA: AUTO-17]
  • Network Intrusion, Fencing [ID: auto-resiliency-sec-003] [JIRA: AUTO-18]
Resilience Improvements: Strategy
• Environment description:
  • management stack: ONAP (on OPNFV) on OpenStack
  • physical servers: ARM servers
  • VNFs:
• Events:
  • Infrastructure-level: failures of physical/virtual resources, security failures
  • VNF-level: migration, restoration, isolation
Resilience Improvements: Documentation
Resilience Improvements: Script Implementation
• Python, using the Django framework; IDE: PyCharm
• Scripts are installed and executed on the ARM jump server 10.50.12:
  • Directory: ~/auto-env
  • executables in ~/auto-env/bin
  • scripts could also run from a VM or a Docker container
  • virtual environment managed from ~/auto-env/bin/activate
• scripts interact by HTTP APIs with:
  • configuration and result (data records) storage tables (see next page)
    • test data records can be used for analysis
  • VIM: to simulate failures, etc. (e.g. OpenStack)
  • VNF manager/orchestrator (MANO) (e.g. ONAP)
• SQLite is used to browse the tables from the server CLI
  • e.g. to consult the test results
• curl is used for HTTP or FTP transfers of these tables
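A script talking to the storage tables over HTTP might prepare its GET and POST requests along these lines. This is a sketch under assumptions: the `assessments/` resource path and the JSON payload shape are hypothetical (the slides only give the base URL and the GET/POST verbs), and the requests are built but deliberately not sent.

```python
import json
from typing import Any, Dict
from urllib.parse import urljoin
from urllib.request import Request

# Base URL as given in the slides for the REST API.
BASE_URL = "http://10.50.12:8000/"


def build_get(table: str, base_url: str = BASE_URL) -> Request:
    """Prepare a GET request to retrieve configuration or result records."""
    return Request(urljoin(base_url, f"{table}/"), method="GET")


def build_post(table: str, record: Dict[str, Any],
               base_url: str = BASE_URL) -> Request:
    """Prepare a POST request that stores one data record as JSON."""
    body = json.dumps(record).encode("utf-8")
    return Request(
        urljoin(base_url, f"{table}/"),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "assessments" is a hypothetical resource name for the Assessments table.
get_req = build_get("assessments")
post_req = build_post("assessments", {"RecordLabel": "vCPE Assessment",
                                      "Status": "Init"})
# Sending would be done with urllib.request.urlopen(post_req) on a host
# that can reach the server; it is left out so the sketch runs offline.
```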
Resilience Improvements: Script Implementation
• reference tables to manage test data, for dynamic test execution (retrieve config, store results)
• use tables via HTTP (REST API, http://10.50.12:8000/); GET and POST
Data Table: Assessments
• AssessmentId: Unique ID for assessment
• RecordLabel: Descriptive text to help group and compare similar assessment tests, e.g. vCPE Assessment with ONAP closed loop control, vCPE Assessment with non-ONAP MANO such as Tacker, Cloudify, OpenStack VIM, etc.
• Status: Assessment status, e.g. Init, Started, Passed, PartiallyFailed, Error, etc.
• StartTime: Assessment start timestamp
• CompletionTime: Assessment completion timestamp
• VnfDetailsId: Refers to an entry in table VnfDetails
• StressOrFaultId: Refers to an entry in table StressOrFaultData
• RecoveryTime: Time it took for VNF to recover after fault was injected
• vnfOrchestrator: NFV MANO component for VNF management and orchestration, e.g. Tacker, Cloudify, ONAP, OpenStack VIM, etc.
• orchestratorInfo: Credentials to use APIs of the NFV MANO (e.g. Tacker, ONAP) for VNF data collection
• ResiliencyMeasureParam: e.g. failedTransaction, serviceDegradation, packetLoss, etc.
• ResiliencyMeasureValue:
• RequestorId: User Id or Application Id initiating the assessment
• Comments: Text comments added for the assessment record
Data Table: StressOrFaultData
• stressFaultId: Unique ID
• stressFaultType: computeHostServiceFailure, sdnServicefailure, ovsBridgeFailure, computeHostFailure, diskFailure, linkFailure, nicFailure, hostTampering, hostIntrusion, networkIntrusion
• StartTime: Timestamp when the fault is injected
• EndTime: Timestamp when the fault is removed/recovered
• stressOrFaultStatus: Started, Failed, Error, Completed/Removed/Restored
• hostIp: Compute host where the fault is being injected
• command: Stress or fault command to execute, e.g. service nova-compute restart/stop
• commandOutcome: Outcome from the stress/fault command execution
Data Table: VNFDetails
• VNFId: Target VNF Id
• VNFName: Target VNF name
• HostsList: Compute hosts where the VNF VMs are launched
• NetworksList: Networks being used by VNF
• VnfInfo: e.g. flavor, image, etc.
• VMsList: List of all VMs of the target VNF
• impactedVMsList: List of VMs impacted due to the fault injection, e.g. all VMs hosted by a failing compute host
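To make the table relationships concrete, here is a minimal in-memory SQLite sketch of two of the tables and a query that compares recovery times per orchestrator. Column names follow the slides, but the column types, the subset of columns, the helper name `average_recovery_by_orchestrator`, the sample rows and the host IP (taken from the documentation address range) are all assumptions for illustration, not the project's actual schema or results.

```python
import sqlite3
from typing import List, Tuple


def build_demo_db() -> sqlite3.Connection:
    """Create a small in-memory database with illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("""
        CREATE TABLE StressOrFaultData (
            stressFaultId INTEGER PRIMARY KEY,
            stressFaultType TEXT NOT NULL,
            stressOrFaultStatus TEXT,
            hostIp TEXT,
            command TEXT,
            commandOutcome TEXT
        )""")
    conn.execute("""
        CREATE TABLE Assessments (
            AssessmentId INTEGER PRIMARY KEY,
            RecordLabel TEXT,
            Status TEXT,
            StressOrFaultId INTEGER REFERENCES StressOrFaultData(stressFaultId),
            RecoveryTime REAL,
            vnfOrchestrator TEXT
        )""")
    conn.execute(
        "INSERT INTO StressOrFaultData VALUES (?, ?, ?, ?, ?, ?)",
        (1, "computeHostServiceFailure", "Completed", "192.0.2.10",
         "service nova-compute stop", "ok"),
    )
    # Made-up assessment rows: (id, label, status, fault id, recovery s, MANO)
    conn.executemany(
        "INSERT INTO Assessments VALUES (?, ?, ?, ?, ?, ?)",
        [
            (1, "vCPE with ONAP", "Passed", 1, 30.0, "ONAP"),
            (2, "vCPE with ONAP", "Passed", 1, 50.0, "ONAP"),
            (3, "vCPE with Tacker", "Passed", 1, 90.0, "Tacker"),
        ],
    )
    return conn


def average_recovery_by_orchestrator(conn: sqlite3.Connection,
                                     fault_type: str) -> List[Tuple[str, float]]:
    """Average RecoveryTime per orchestrator for one stress/fault type."""
    rows = conn.execute(
        """
        SELECT a.vnfOrchestrator, AVG(a.RecoveryTime)
        FROM Assessments AS a
        JOIN StressOrFaultData AS f ON a.StressOrFaultId = f.stressFaultId
        WHERE f.stressFaultType = ?
        GROUP BY a.vnfOrchestrator
        ORDER BY a.vnfOrchestrator
        """,
        (fault_type,),
    )
    return rows.fetchall()


demo = build_demo_db()
summary = average_recovery_by_orchestrator(demo, "computeHostServiceFailure")
```

Grouping by `RecordLabel` instead of `vnfOrchestrator` would serve the same comparison purpose that the slides describe for that field.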
Resilience Improvements: Script Implementation
[Diagram: data model relating the three data tables to the entities they reference]
• Data Table: Assessments: Assessment ID, Label (tag), VNF Orchestrator, VNF Orchestrator Info, Status, Start Time, Completion Time, VNF Details ID, StressOrFault ID, Recovery Time, Resiliency Measure Param, Resiliency Measure Value, Requestor ID, Comments
• Data Table: VNFDetails: VNF ID, VNF Name, VNF Info, VM List, Network List, Host List, impacted VM List
• Data Table: StressOrFaultData: stressFault ID, stressFault Type, Start Time, End Time, stressOrFault Status, host IP, command, Outcome, impacted VM List ?
• Other diagram elements:
  • VNF Orchestrator Credentials; VNF Orchestrator (ONAP, Tacker, Cloudify, ETSI OSM, ...)
  • Resilience Metric (Name, Definition, Formula); Resilience Metric Value (quantitative measurement)
  • VIM Credentials; VIM (OpenStack, K8S, AWS, Azure, GCP, ...); App-C; SDN-C; VNF
  • Requestor (User, Application, Service)
  • Container; VM (compute); Virtual Storage (Block, Object, FS, DB, ...); Network (subnet, VPN, ...)
  • Physical Host (with CPU, RAM, OS); Physical Disk (HD, SSD, ...); Physical NIC (Eth, WiFi, Cellular, ...); Physical Link (wire, radio); Logical Link (tunnel)
Resilience Improvements: Test Runner