BE-CO review: Looking back at LS1
Delphine Jacquet (BE/OP/LHC), Denis Cotte (BE/OP/PS)
CERN, 1/12/2015, 774-2-058
Outline
• OP particularity
• What worked well during LS1
• What didn't work well
• Possible improvements for LS2
OP particularity
• OP is involved with software modifications in two ways:
  • As operational application programmers: operational applications need to be maintained and are directly affected by changes in all the CO layers (FESA, LSA, CMW, RDA, etc.).
  • As users of the control system: at machine start-up, OP directly suffers the consequences if software upgrades are not properly handled:
    • Debugging by OP during operation
    • Missing functionalities
    • Some applications not working at all
    • Delays in the start-up
What worked well during LS1?
Smooth upgrades: good examples
• Upgrade of the timing system in CPS/SPS:
  • New functionalities well discussed with OP, good collaboration.
  • New management of the coast
  • New management of the economy
  • New hardware
  • New version ready well before the SPS start-up: enough time to perform tests with OP, debug and correct if necessary.
• Migration to SL4J: tools (ant task) and documentation provided for an easy update of the code.
Machine Control Coordinators (MCC)
• One coordinator per machine to follow the control system renovation
• EDMS document with a description of the controls upgrade for each system.
• Regular follow-up meetings
• Very useful from the PS/PSB point of view; collaboration worked very well.
• No real interest from the SPS and LHC point of view.
Dry runs
• In PS/PSB:
  • Expert presence to test the new software very much appreciated.
  • Very efficient: faster debugging, problems solved quickly.
  • OP very involved with CO, good collaboration.
• In LHC:
  • Tests of the operational scenario involving different equipment and systems
  • Requires an operational control system to be in place:
    • Settings management (LSA)
    • Logging
    • Sequencer
    • CMW, DIP, etc.
  • Control system already tested and debugged well before the start-up
What didn't work well (or not as well) during LS1?
FESA
• FESA3:
  • Should have been ready, debugged and stable at the beginning of LS1.
• fesa-code generation:
  • Functionality for application developers promoted by CO, and widely used for FESA2.
  • For FESA3, it was not ready before the end of LS1, was difficult to get stable, and was seen as low priority.
  • The mechanism was changed: new jars now have to be generated manually by the FESA developer, and this is not done systematically.
  • Delays in operational application development.
• New tool to access FESA3, but no easy way to know whether a device was migrated or not (painful for OP).
• Migration from FESA2 to FESA3: no common rules or procedures (careful with changes of device names, property names, value types, etc.).
• Avoid too many version numbers for a specific class, or try to unify them (e.g. LTIM).
Lack of global and coherent planning
• Applications had to be adapted several times:
  • After the LSA API change: adapt the code, test.
  • Migration of a class to FESA3: adapt the code, test.
  • RDA3 change: adapt the code, test.
  • Change of a device name, etc.
• Problems with jar compatibility: pro jars provided by different products were not compatible, preventing the application from starting (e.g. directory service).
• The jar order depended on which console the program was running on: an application could work on some consoles and not on others (solved now).
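The jar-order problem above arises when the same class is shipped by more than one jar, so the class that gets loaded depends on classpath order. A minimal sketch of a check for this follows, in plain Java and independent of any CO tooling. It assumes the list of classes in each jar has already been extracted; the class name `ClasspathConflicts` and the jar and class names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ClasspathConflicts {

    /**
     * Returns every class name provided by more than one jar, mapped to the
     * jars that provide it. The input map must iterate in classpath order, so
     * the first jar listed for a class is the one found first on the classpath.
     */
    public static Map<String, List<String>> findDuplicates(Map<String, List<String>> jarContents) {
        Map<String, List<String>> providers = new TreeMap<>();
        for (Map.Entry<String, List<String>> jar : jarContents.entrySet()) {
            for (String className : jar.getValue()) {
                providers.computeIfAbsent(className, k -> new ArrayList<>()).add(jar.getKey());
            }
        }
        // Keep only the classes that appear in at least two jars.
        providers.values().removeIf(jars -> jars.size() < 2);
        return providers;
    }

    public static void main(String[] args) {
        // Hypothetical classpath: two versions of the same library plus the application jar.
        Map<String, List<String>> classpath = new LinkedHashMap<>();
        classpath.put("lib-1.0.jar", List.of("demo.Directory", "demo.Lookup"));
        classpath.put("app.jar", List.of("demo.App"));
        classpath.put("lib-2.0.jar", List.of("demo.Directory"));
        System.out.println(findDuplicates(classpath));
    }
}
```

Running such a check at build time on each console's effective classpath would flag the conflict before the application fails to start.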
Changes of API cost a lot!
• LSA change of API:
  • Means huge work for OP to update and test all the applications (orphan applications need to be taken over).
  • Was done early enough for all the applications to be updated for the start-up.
  • A wiki page was created to help adapt applications to the new API, but it was not always straightforward to find equivalent methods.
  • Nevertheless, the LSA team gave good support.
• Logging changed API twice in a year! (and no reason was clearly explained to the users)
• JAPC: selector "xx.USER.ALL" not allowed, to be replaced by null (every SPS application impacted).
• These changes of API should really be avoided, or better tools put in place to help with the client software update.
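For the selector change, one way to limit the impact on client code is a single helper that every application calls before passing a selector on. The sketch below is plain Java and does not use the JAPC API. The class `SelectorCompat` and the selector "SPS.USER.DEMO" are hypothetical, and treating any selector ending in ".USER.ALL" as the legacy wildcard form is an assumption based on the "xx.USER.ALL" pattern quoted above.

```java
public class SelectorCompat {

    /** Suffix of the legacy wildcard selector form ("xx.USER.ALL"). */
    private static final String LEGACY_WILDCARD_SUFFIX = ".USER.ALL";

    /**
     * Maps a legacy wildcard selector to null and returns any other selector
     * unchanged (apart from trimming surrounding whitespace).
     */
    public static String normalize(String selector) {
        if (selector == null) {
            return null;
        }
        String trimmed = selector.trim();
        if (trimmed.endsWith(LEGACY_WILDCARD_SUFFIX)) {
            return null; // assumption: null now stands for "no specific user"
        }
        return trimmed;
    }

    public static void main(String[] args) {
        System.out.println(normalize("SPS.USER.ALL"));   // legacy wildcard form
        System.out.println(normalize("SPS.USER.DEMO"));  // hypothetical ordinary selector
    }
}
```

With the rule in one place, a later change to selector handling touches one method instead of every application.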
CCDB, LSA DB, working sets
• A lot of renaming of CCDB devices and properties, plus cleaning of the LSA database: a lot needed to be re-imported into LSA, but there is no automatic tool anymore!
  • We had to call support for each modification: heavy!
• Working sets and knobs: following the migration to LSA, the configuration tool was available much too late.
  • Need to use a non-InCA WorkingSet to drive EIS devices in the CPS complex.
  • Good training once available.
  • Still waiting for Array2D compatibility in WorkingSet.
  • Virtual devices work well to overcome this systemic problem.
Tools, training
• Lack of training on how to use CO tools and software:
  • e.g. how to use the LSA API to make a trim
  • e.g. how to find your way in the jungle of the JAPC Values family (MapParameterValue, ImmutableDiscreteFunction, Scalar, etc.)
  • How to use the timber API, etc.
• New applications sometimes given without any explanation of how to use them (e.g. the new RBAC roles app)
• Poor documentation and lack of information on the web:
  • No proper search tools on the CO site to find anything
  • Wiki pages not easy to work with: difficult to find the right page or to know whether the information is obsolete.
  • Sometimes the information exists but nobody knows about it, or we don't have the right to read it.
  • Have you ever tried to find out how to use the dataviewer API?
Changes in CO organization
• Change of software responsibles:
  • Not easy at the start-up to know whom to contact
  • Less efficiency at the beginning, especially when problems needed to be solved urgently.
• The CPS complex lost its CO piquet during LS1, so it is necessary to analyze a bit more where a problem comes from before calling the CO expert.
• Generalization of the use of:
  • Op-issues: a very good tool to follow issues (once the spam was removed).
  • Generic mailing lists for support: we had to change our good old habits, but they proved to be efficient (a clear list of the available support mailing lists should be provided).
Possible improvements for LS2
API changes, software upgrade
Only backward compatible please!
Provide a proper testing environment.
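As an illustration of what "backward compatible" can mean for a Java API, the sketch below keeps the old method signature as a deprecated overload that only delegates to the new one, so existing client code keeps compiling and behaves identically while it is migrated. All names here (`SettingsService`, `TrimRequest`, the parameter name) are hypothetical and do not correspond to the LSA API or any other CO product.

```java
import java.util.Locale;

public class SettingsService {

    /** New-style request object introduced by the (hypothetical) API change. */
    public static final class TrimRequest {
        final String parameter;
        final double value;
        final String comment;

        public TrimRequest(String parameter, double value, String comment) {
            this.parameter = parameter;
            this.value = value;
            this.comment = comment;
        }
    }

    /** New entry point: all behaviour lives here. */
    public String trim(TrimRequest request) {
        return String.format(Locale.ROOT, "%s=%.3f [%s]",
                request.parameter, request.value, request.comment);
    }

    /**
     * Old entry point, kept so that existing clients still compile and run.
     *
     * @deprecated use {@link #trim(TrimRequest)}; this overload only delegates to it.
     */
    @Deprecated
    public String trim(String parameter, double value) {
        return trim(new TrimRequest(parameter, value, ""));
    }

    public static void main(String[] args) {
        SettingsService service = new SettingsService();
        // Old and new call styles produce the same result.
        System.out.println(service.trim("DEMO/PARAM", 1.5));
        System.out.println(service.trim(new TrimRequest("DEMO/PARAM", 1.5, "")));
    }
}
```

The compiler's deprecation warnings then act as the to-do list for client teams, and the old overload can be removed in a later release once no application uses it.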
Planning
• Plan the releases in a coherent way across systems. Have a clear strategy so that we know what will change and when.
• OP is responsible for high-level operational applications that depend on all the CO layers:
  • We need the control system to be STABLE well before the start-up to have time to adapt and test our code.
  • This also applies to equipment groups.
• OP is involved in HW tests: the control system must be ready for devices to be controlled from the control room.
• Be aware of the accelerators that are still in operation (CTF3, LINAC4 commissioning…)
Engineering change request (ECR)
The ECR could be useful to document the major controls upgrades:
• CO3 decides which upgrades require documenting (ECR)
• Form small teams to work out the specification and milestone schedule:
  • Technical leader (person from the equipment group concerned)
  • BE-OP machine responsible/representative
  • BE-CO machine controls coordinator
• Take into account "the big picture" and include all dependencies and steps:
  • CCDB, LSA/InCA, applications, naming conventions, specialist and OP needs, …
• Write this up in an ECR-type document for approval
  • Quite similar to the HW baseline ECRs, which should not be too heavy!
Push the OP/CO collaboration further
• Continue the good collaboration that was in place in LS1
• More training and guidelines from CO:
  • Training on CO products (LSA, JAPC, timber, etc.)
  • Basic training on generic tools like Sonar, Bamboo, Crucible… (improvement of code quality)
• Work on common projects with OP programmers:
  • Learn from each other
  • Code review
  • Improve mutual trust
Conclusion
• From the OP point of view, most of the software upgrades were handled properly and were no stopper for the machines' start-up.
  • Thanks to a good follow-up by CO
  • Thanks to an excellent collaboration between the teams
• The OP developers of operational applications had some difficulties, mainly due to:
  • Non-backward-compatible API changes (LSA, timber, changes of property and device names in FESA)
  • Lack of global planning, which multiplied the work to adapt the software at each new upgrade.
  • Lack of information and training
• Nevertheless, CO gave enough support and the feeling is positive.
Conclusion
• For LS2, CO should follow the same recipe, with some improvements:
  • Avoid non-backward-compatible changes
  • Deliver stable versions of the low-level software layers much earlier (FESA3!)
  • Possibility to formalise and document the software upgrades thanks to the ECR.
  • Improve information availability and training of development teams
  • Enhance OP/CO collaboration with common development projects.
Thank you for your attention! Questions?