RCU Integration with Libraries (Resource Reclamation Framework in DPDK)
Honnappa Nagarahalli, Arm
Agenda
• Recap
• Resource reclamation
  • General process
  • With lib-urcu [1]
  • With rte_rcu_qsbr
• Resource Reclamation Framework for DPDK
• Performance
[1] https://liburcu.org/
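Recap
[Figure: timeline (Time axis) of Reader Thread 1, T2 and T3 accessing data structures D1 and D2, with critical sections and quiescent states (QS) marked, and the writer's Delete, Grace Period (GP) and Free phases.]
• Delete: the writer removes the reference to the entry (entry 1, then entry 2, from D1).
• Grace Period (GP): the interval during which readers may still hold a reference to a deleted entry.
• Free: resources for entries 1 and 2 are freed after every reader has gone through at least 1 quiescent state.
• RCU helps the writer determine the end of the Grace Period.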
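The way a writer can detect the end of a grace period can be illustrated with a minimal, single-threaded sketch based on tokens. All names here (toy_qsbr, toy_qsbr_start, toy_qsbr_quiescent, toy_qsbr_check, TOY_MAX_READERS) are hypothetical and exist only for this illustration. This is not the rte_rcu_qsbr implementation; a real version would need atomics and memory ordering.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_READERS 4 /* hypothetical limit for this sketch */

/* Toy single-threaded model of a QSBR variable (not the DPDK structure). */
struct toy_qsbr {
    uint64_t token;                 /* latest grace-period token handed out */
    uint64_t seen[TOY_MAX_READERS]; /* last token each reader acknowledged */
    unsigned int num_readers;
};

static void toy_qsbr_init(struct toy_qsbr *v, unsigned int num_readers)
{
    assert(num_readers <= TOY_MAX_READERS);
    v->token = 0;
    v->num_readers = num_readers;
    for (unsigned int i = 0; i < num_readers; i++)
        v->seen[i] = 0;
}

/* Writer: start a grace period and return the token that identifies it. */
static uint64_t toy_qsbr_start(struct toy_qsbr *v)
{
    return ++v->token;
}

/* Reader: report a quiescent state by acknowledging the latest token. */
static void toy_qsbr_quiescent(struct toy_qsbr *v, unsigned int reader)
{
    assert(reader < v->num_readers);
    v->seen[reader] = v->token;
}

/* Writer: non-blocking check. The grace period for 'token' is over once
 * every reader has acknowledged a token at least that large. */
static bool toy_qsbr_check(const struct toy_qsbr *v, uint64_t token)
{
    for (unsigned int i = 0; i < v->num_readers; i++)
        if (v->seen[i] < token)
            return false;
    return true;
}
```

With two readers, a token returned by toy_qsbr_start is reported complete by toy_qsbr_check only after both readers have called toy_qsbr_quiescent.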
Resource Reclamation – General process
• The general process can be divided into the following parts:
  • Initialization
  • Quiescent State reporting
  • Resource Reclamation (the focus of this discussion)
  • Shutdown
Resource Reclamation – Trivial Process
[Figure: sequence diagram with the writer thread, the lock-free data structure, reader threads 1..N and the RCU state. The writer performs Delete, starts the GP, then polls QS status repeatedly while the readers report QS; it frees the entry once the GP ends.]

rte_data_structure_delete()
{
    data_structure_delete_entry();
    rte_rcu_qsbr_check(wait == true);
    data_structure_free_entry();
}

Pros
• Writer is unaware of the lock-free-ness of the algorithm
• No code changes required in writer to switch to lock-free algorithm
Cons
• Writer polls between ‘Delete’ and ‘Free’, which reduces writer’s performance
• Writer and readers access RCU state concurrently
Resource Reclamation – With lib-urcu, call_rcu
[Figure: sequence diagram with the writer thread, the lock-free data structure, a global defer queue, a reclamation thread, reader threads 1..N and the RCU state. For each deletion the writer performs Delete and Enqueue Resource (call_rcu). The reclamation thread dequeues the resources, starts the GP (synchronize_rcu), polls QS status while the readers report QS, then frees the resources.]

rte_data_structure_delete()
{
    data_structure_delete_entry();
    call_rcu(deleted resource);
}

Pros
• Writer does not poll, performance not affected
• Cost of reclamation is amortized
Cons
• Another thread added to the application
• If writer runs out of resources…
• Polling still exists – in reclamation thread
• Time to reclaim increases
• Contention on global defer queue
Resource Reclamation – With lib-urcu, rcu_defer
[Figure: sequence diagram with the writer thread, the lock-free data structure, a per-thread defer queue (in TLS), a reclamation thread, reader threads 1..N and the RCU state. The writer performs Delete and Enqueue Resource (rcu_defer). The resource is later dequeued, the GP is started (synchronize_rcu), QS status is polled while the readers report QS, then the resource is freed.]

rte_data_structure_delete()
{
    data_structure_delete_entry();
    rcu_defer(deleted resource);
}

• Less contention on defer queue
• If writer finds defer queue is full
  • It reclaims resources
  • Writer thread has to poll again for 1 GP
Resource Reclamation – With rte_rcu_qsbr
[Figure: sequence diagram with the writer thread, the lock-free data structure + defer queue, reader threads 1..N and the RCU state. Deletion: Delete, Start GP (rte_rcu_qsbr_start), Enqueue Resource. Reclamation: Peek Queue, Check GP (rte_rcu_qsbr_check), Dequeue, Free. Readers Report QS.]
• No reclamation thread. Reclamation runs in the context of writer thread.
• Defer queue per data structure.
  • No contention on defer queue
  • Less contention on RCU State
• Batching benefits enabled by new patch in RCU library [1]

Deletion:
rte_data_structure_delete()
{
    data_structure_delete_entry();
    rte_rcu_qsbr_start();       /* Start the GP ASAP */
    if (defer_queue_full)
        reclaim_resource();     /* Mostly no waiting for GP */
    enqueue_resource();
}

Reclamation:
__rte_reclaim_resource()
{
    peek_queue();
    if (rte_rcu_qsbr_check(wait = FALSE) == SUCCESS) {  /* No continuous polling */
        dequeue_resource();
        free_resource();
    }
}

Addition:
rte_data_structure_add()
{
    if (no_free_resources)
        reclaim_resource();     /* Reclaim the exact resources needed */
    data_structure_add_entry();
}

[1] https://patchwork.dpdk.org/patch/58960/
Resource Reclamation Proposal for DPDK
• Initialization
  • Responsibility - Application/main thread
  • Allocating RCU variable
  • Registering reader threads
  • Provide the RCU variable to the data structure library
• Quiescent State reporting
  • Responsibility - Application/reader threads
  • Provides flexibility to the application
Resource Reclamation Proposal for DPDK
• Resource Reclamation
  • Responsibility – Data structure library
    Ø This removes a significant burden from the application
    Ø No code changes to application’s writer thread
  • Provide an API to register the RCU variable to use
    Ø Create a defer queue to store the deleted resource and token
  • Augment data structure delete entry API
    Ø Start the grace period after deleting the resource by calling rte_rcu_qsbr_start
    Ø If the defer queue is full – Reclaim resources first
    Ø Then enqueue the deleted resource and token to the defer queue
Resource Reclamation Proposal for DPDK
• Resource Reclamation (continued)
  • Augment data structure add entry API
    Ø If there are no free resources – Reclaim resources
    Ø Add the entry to the data structure
  • Reclaim resources
    Ø Peek the token at the head of the defer queue
    Ø Use non-blocking rte_rcu_qsbr_check API to query the quiescent state
    Ø If success, dequeue the resource/token from defer queue and free the resource
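The delete, add and reclaim steps above can be sketched as a small single-threaded model. Everything prefixed toy_ or TOY_ is hypothetical and invented for this illustration, including the stand-in for the RCU variable; it is not the DPDK API or the proposed library code. One assumption is worth flagging: when the defer queue is full and nothing can be reclaimed yet, this sketch simply returns false, whereas a real library would have to choose how to handle that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_READERS 2 /* hypothetical reader count */
#define TOY_DQ_SIZE 2 /* hypothetical defer queue capacity */

/* Toy single-threaded stand-in for the RCU variable (not DPDK code). */
struct toy_qsbr {
    uint64_t token;             /* latest grace-period token */
    uint64_t seen[TOY_READERS]; /* last token each reader acknowledged */
};

static uint64_t toy_start(struct toy_qsbr *v) { return ++v->token; }

static void toy_quiescent(struct toy_qsbr *v, unsigned int reader)
{
    assert(reader < TOY_READERS);
    v->seen[reader] = v->token;
}

static bool toy_check(const struct toy_qsbr *v, uint64_t token)
{
    for (unsigned int i = 0; i < TOY_READERS; i++)
        if (v->seen[i] < token)
            return false;
    return true;
}

/* Data structure with its own defer queue of (slot, token) pairs. */
struct toy_ds {
    struct toy_qsbr *v;      /* RCU variable registered by the application */
    unsigned int free_slots; /* resources available for new entries */
    struct { int slot; uint64_t token; } dq[TOY_DQ_SIZE];
    unsigned int head, count; /* ring buffer state of the defer queue */
};

/* Reclaim: peek the head token, do one non-blocking check, and only
 * dequeue and free if that grace period has already ended. */
static bool toy_reclaim_one(struct toy_ds *ds)
{
    if (ds->count == 0 || !toy_check(ds->v, ds->dq[ds->head].token))
        return false;
    ds->head = (ds->head + 1) % TOY_DQ_SIZE;
    ds->count--;
    ds->free_slots++; /* stands in for freeing the resource */
    return true;
}

/* Delete: the entry is assumed to be unlinked already; start the grace
 * period, make room in the defer queue if it is full, then enqueue. */
static bool toy_delete(struct toy_ds *ds, int slot)
{
    uint64_t token = toy_start(ds->v);
    if (ds->count == TOY_DQ_SIZE && !toy_reclaim_one(ds))
        return false; /* sketch-only policy: report failure, do not wait */
    unsigned int tail = (ds->head + ds->count) % TOY_DQ_SIZE;
    ds->dq[tail].slot = slot;
    ds->dq[tail].token = token;
    ds->count++;
    return true;
}

/* Add: reclaim only when there is no free resource left. */
static bool toy_add(struct toy_ds *ds)
{
    if (ds->free_slots == 0 && !toy_reclaim_one(ds))
        return false;
    ds->free_slots--; /* stands in for adding the entry */
    return true;
}
```

In this model the writer never spins: each call performs at most one non-blocking check, and reclamation happens only when the defer queue is full or an add needs a resource.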
Resource Reclamation Proposal for DPDK
• Shutdown
  • Responsibility – Application and Data structure library
  • Application
    Ø Ensure reader threads are not using the data structure
    Ø Unregister the reader threads
  • Data structure library
    Ø Reclaim all the resources on defer queue
Performance
• Test setup
  • LPM library integrated with DPDK RCU library
  • 1 writer thread, 42 M adds/deletes of routes with prefix length > 24
  • 11 reader threads report the quiescent state status every 1024 lookups
• Numbers
  • Without RCU integration: 2484.4 cycles
  • With RCU integration: 2517.25 cycles (about 1.3% higher)
Next Steps
• New APIs
  • Provide APIs in rte_rcu for common functionality
    Ø Create defer queue (rte_rcu_qsbr_dq_create, rte_rcu_qsbr_delete)
    Ø Push resources to defer queue (rte_rcu_qsbr_dq_enqueue)
    Ø Reclaim resources (rte_rcu_qsbr_dq_reclaim)
Thanks to
• Ruifeng Wang – Integrating RCU with LPM
• Dharmik Thakkar – Integrating RCU with Hash

Thank you
Questions?