Irfu: Interpreting radiations from the Universe.
Double Chooz, ALICE, Edelweiss, HESS, Herschel, CMS
Site report 2016 IRFU
JOËL SURGET / PIERRE-FRANCOIS HONORÉ
NEW ORGANISATION (1/1/2016)
CEA (French Alternative Energies and Atomic Energy Commission)
• Basic Research Division (Materials Sciences Division, Life Sciences Division): 6200 people, 15 institutes/labs
  - IRFU: 800 people
• Nuclear Energy Division
• Technologies Division
• Defense Division
IRFU FULL MEMBER OF « UNIVERSITY PARIS SACLAY »
SUMMARY
• Unix
• GRID
• Infrastructure
• Windows
• Team
UNIX EVOLUTIONS
• Cobbler for Linux installation
  - supported: CentOS, Ubuntu
  - laptop encryption with LUKS
• Puppet / Foreman for configuration management
  - git + r10k
  - security updates, software deployment
  - evaluating Mac OS X integration with Munki
• OpenLDAP & Active Directory
• File server: GPFS to Ceph?
  - testing a mid-range multisite Ceph cluster
OPENLDAP: MECHANISM
• OpenLDAP runs on a Hyper-V cluster
• 4 NIS = 1500 accounts => conflicts: login and uid
• Home directory migration
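The login and uid conflicts mentioned on this slide are the kind of thing a small script can list before any account is migrated. The following is a minimal sketch, not the procedure actually used at IRFU. It assumes that "4 NIS" means four separate NIS domains and that each passwd map has been dumped to passwd-format text (for instance with `ypcat passwd`). The domain names and sample accounts are hypothetical.

```python
from collections import defaultdict


def parse_passwd(text):
    """Yield (login, uid) pairs from passwd-format lines."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) < 3:
            continue  # skip malformed entries
        yield fields[0], int(fields[2])


def find_conflicts(domains):
    """domains maps a domain name to its passwd-format dump.

    Returns (login_conflicts, uid_conflicts):
      login_conflicts: login -> [(domain, uid), ...] when one login has several uids
      uid_conflicts:   uid -> [(domain, login), ...] when one uid has several logins
    """
    uids_by_login = defaultdict(set)
    logins_by_uid = defaultdict(set)
    for domain, text in domains.items():
        for login, uid in parse_passwd(text):
            uids_by_login[login].add((domain, uid))
            logins_by_uid[uid].add((domain, login))
    login_conflicts = {
        login: sorted(entries)
        for login, entries in uids_by_login.items()
        if len({uid for _, uid in entries}) > 1
    }
    uid_conflicts = {
        uid: sorted(entries)
        for uid, entries in logins_by_uid.items()
        if len({login for _, login in entries}) > 1
    }
    return login_conflicts, uid_conflicts


# Hypothetical sample data: two of the NIS domains, two accounts each.
domains = {
    "nis_a": "alice:x:1001:100::/home/alice:/bin/bash\n"
             "bob:x:1002:100::/home/bob:/bin/bash",
    "nis_b": "alice:x:2001:100::/home/alice:/bin/bash\n"
             "carol:x:1002:100::/home/carol:/bin/bash",
}
login_conflicts, uid_conflicts = find_conflicts(domains)
print(login_conflicts)
print(uid_conflicts)
```

Any account whose uid changes during such a merge also needs the files in its home directory re-owned, which is presumably why home directory migration appears on the same slide.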
CEPH INFRASTRUCTURE
7 servers: Dell PowerEdge R730xd
- 2 x Intel E5-2620 v3 (12 cores), 64 GB
- 16 x 4 TB HDD
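As a rough worked example, the raw capacity of this cluster follows directly from the slide: 7 servers x 16 disks x 4 TB. The sketch below also estimates usable capacity under two of the pool layouts named in the benchmark legend later in this report. It assumes that `EC_ISA_4_1` means erasure coding with k=4 data and m=1 coding chunks and that `2 REPL` means two replicas. It ignores filesystem overhead and Ceph's full-ratio safety margins, so the results are upper bounds.

```python
def raw_capacity_tb(servers, disks_per_server, disk_tb):
    """Raw capacity in TB: every disk counted once."""
    return servers * disks_per_server * disk_tb


def usable_replicated_tb(raw_tb, replicas):
    """Each object is stored `replicas` times."""
    return raw_tb / replicas


def usable_erasure_coded_tb(raw_tb, k, m):
    """k data chunks plus m coding chunks per object: k/(k+m) of raw is usable."""
    return raw_tb * k / (k + m)


raw = raw_capacity_tb(servers=7, disks_per_server=16, disk_tb=4)
print("raw TB:", raw)
print("2 replicas TB:", usable_replicated_tb(raw, 2))
print("EC 4+1 TB (assumed reading of EC_ISA_4_1):", usable_erasure_coded_tb(raw, 4, 1))
```

The trade-off is the usual one: replication costs more space but tolerates failures with simpler recovery, while erasure coding keeps more of the raw capacity usable at the price of extra computation on writes and rebuilds.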
GRID
• IPv6
  - Finally “un-white-listed” on 12/04/2016.
  - Suffering major performance issues: 200 Mbit/s max.
  - Anyone with IPv6 experience on Cisco Catalyst 6500 routing?
  - [Graphs: IPv4 / IPv6]
• New computing resources
  - 8 Dell C6320: Intel Xeon E5-2650 v3 with 128 GB memory
  - Still not in production since 11/2015: the 10G cards cannot be plugged in. It took 3 months to get the cards, and 1 metal piece for the chassis is still missing.
• New storage facilities
  - 11 Dell PowerEdge R730xd, 16 x 4 TB HDD per server
  - Trying Ceph (without SSD). Doesn’t scale well: read 8.5 GB/s, write 3.3 GB/s
• Network
  - Deployed an 80G (2 x 40G LACP) « backbone » between computing rooms
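To put "doesn't scale well" into numbers, the aggregate figures can be divided by the number of spindles. This is a back-of-the-envelope sketch under loud assumptions: that the 8.5 GB/s and 3.3 GB/s figures were measured with all 11 x 16 = 176 disks in the cluster, that 1 GB = 1000 MB, and that 150 MB/s is a fair nominal sequential rate for one HDD (an order-of-magnitude guess, not a measurement from this site).

```python
SERVERS = 11            # from the slide
DISKS_PER_SERVER = 16   # from the slide
NOMINAL_DISK_MB_S = 150.0  # assumed nominal HDD sequential rate, not measured


def per_disk_mb_s(aggregate_gb_s, n_disks):
    """Client-visible throughput per disk, in MB/s (1 GB = 1000 MB)."""
    return aggregate_gb_s * 1000.0 / n_disks


n_disks = SERVERS * DISKS_PER_SERVER
read_per_disk = per_disk_mb_s(8.5, n_disks)
write_per_disk = per_disk_mb_s(3.3, n_disks)

print(f"disks: {n_disks}")
print(f"read : {read_per_disk:.1f} MB/s per disk "
      f"({read_per_disk / NOMINAL_DISK_MB_S:.0%} of assumed nominal)")
print(f"write: {write_per_disk:.1f} MB/s per disk "
      f"({write_per_disk / NOMINAL_DISK_MB_S:.0%} of assumed nominal)")
```

The write figure understates the work each disk really does, since replication or erasure coding writes more bytes to disk than the client sends. Even so, the gap to the nominal rate gives a feel for how much headroom is left unused.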
CEPH SCALING: 64 THREADS / 4M BLOCKS / JEMALLOC / 15 CLIENTS / WRITE PERF
[Figure: write bandwidth (MB, y axis, 0 to 3500) versus number of discs (x axis, 0 to 304; discs added 3 by 3 on all servers simultaneously). Series: WR[EC_ISA_4_1]64_thr_4096K, WR[EC_ISA_5_1]64_thr_4096K, WR[RBD_2 REPL]64_thr_4096K, WR[RBD_TIER_EC]64_thr_4096K. Annotations: "10 R730xd", "Added 6 R510".]
ceph.com: « petabyte scaling » ....?
GRID
• Suffered many power cuts since last time
• And we now think we understand why: failsafe automation (0 log, 0 doc, 0 problem)
GRID - OPS
• Puppet
  - reinstalled on CentOS 7: huge performance boost (Ruby 2)
  - started fixing all modules for Puppet 4: again, huge boost foreseen
• OS
  - moved all SL6 to SL6.7 because of the EPEL/RHEL policy of moving RPMs to « latest only »
• Monitoring
  - Installed collectd with the graphite exporter. And this destroyed our graphite (10k+ IOPS just for a few collectd).
  - Installed Prometheus as a « collectd replacement » (https://prometheus.io/)
  - keeping graphite for now (long-term, low-IOPS graphs)
  - 10000+ => 300 IOPS (VM + Ceph + SSD pool) (~350 servers, 24K metrics/s)
  - NOT InfluxDB: clustering just went closed source.
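The monitoring figures on this slide can be turned into a few ratios. This is a small worked example, assuming that the "24K metrics/s" and "300 IOPS" figures both describe the same Prometheus setup and that "10000+" is taken at its lower bound. The variable names are only illustrative.

```python
servers = 350            # "~350 servers"
samples_per_s = 24_000   # "24K metrics/s"
iops_before = 10_000     # "10000+" with collectd writing to graphite (lower bound)
iops_after = 300         # with Prometheus

# Average ingestion rate per monitored server.
samples_per_server = samples_per_s / servers

# How much less storage I/O the new setup needs.
iops_reduction = iops_before / iops_after

# Samples absorbed per storage operation after the change.
samples_per_io = samples_per_s / iops_after

print(f"{samples_per_server:.1f} samples/s per server")
print(f"at least {iops_reduction:.0f}x fewer IOPS")
print(f"{samples_per_io:.0f} samples per I/O operation")
```

A plausible reading of the difference: Graphite's default whisper backend keeps one file per metric, so each update tends to become a small random write, whereas a store that batches samples before writing needs far fewer operations for the same ingest rate.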
MICROSOFT
• Windows 10 under study: create a secure official CEA image
• SCCM 2016 (System Center Configuration Manager): this summer
• SharePoint 2013: web site and collaborative site
• CEA (IRFU) Education-Research Identity Federation (Renater): this summer
INFRASTRUCTURE
• 3 rooms mostly finished and up to date
• 126 square meters
• 31 water-cooled racks (20 and 40 kW)
• Cooling installation: 500 kW (4 groups)
• New 600 A low-voltage panel installed
• 7 years of work
IT TEAM
• 2014/2015 retired personnel: Pierrick Micout, Joseph Le Foll
• 2 new engineers in 2015 (one Windows, one Linux)
• 2 job vacancies
  - one permanent position for support activities
  - one 1-year position (from June) for the INDIGO-DataCloud project
Contact: http://moorea.cea.fr or joel.surget@cea.fr
Commissariat à l’énergie atomique et aux énergies alternatives
Centre de Saclay | 91191 Gif-sur-Yvette Cedex
Etablissement public à caractère industriel et commercial | RCS Paris B 775 685 019
Direction de la Recherche Fondamentale
Institut de Recherche sur les lois Fondamentales de l’Univers