The ROOT System: A Data Access & Analysis Framework








































- Slides: 40

The ROOT System: A Data Access & Analysis Framework
Trees
4-5-6 February 2003
René Brun/EP
http://root.cern.ch
ROOT courses

R. Brun LCG ROOT courses

Memory <--> Tree
Each node is a branch in the Tree.
[Figure: objects in memory are written to Tree T with T.Fill(); the entries are numbered 0 to 18, and T.GetEntry(6) reads entry 6 back into memory.]
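The Fill/GetEntry round trip in the figure can be illustrated with a toy model. This is a minimal sketch in plain C++, not the ROOT implementation: `ToyTree`, `ToyBranch` and `readBackEntrySix` are hypothetical names, and every branch is assumed to hold a single `double`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for a tree (hypothetical, not a ROOT class): each branch is
// bound to the address of a user variable and owns a buffer of past values.
class ToyTree {
public:
    // Bind a branch name to the address of the variable it mirrors.
    void Branch(const std::string& name, double* address) {
        branches_.push_back({name, address, {}});
    }
    // Append the current value of every bound variable as one new entry.
    void Fill() {
        for (auto& b : branches_) b.buffer.push_back(*b.address);
        ++entries_;
    }
    // Copy entry i of every branch back into the bound variables.
    void GetEntry(std::size_t i) {
        assert(i < entries_);
        for (auto& b : branches_) *b.address = b.buffer[i];
    }
    std::size_t GetEntries() const { return entries_; }

private:
    struct ToyBranch {
        std::string name;
        double* address;
        std::vector<double> buffer;
    };
    std::vector<ToyBranch> branches_;
    std::size_t entries_ = 0;
};

// Fill entries 0..18 as in the figure, then read entry 6 back into memory.
inline double readBackEntrySix() {
    double px = 0.0, py = 0.0;
    ToyTree t;
    t.Branch("px", &px);
    t.Branch("py", &py);
    for (int i = 0; i <= 18; ++i) {
        px = 1.0 * i;
        py = 10.0 * i;
        t.Fill();
    }
    px = py = -1.0;   // overwrite memory so the read-back is visible
    t.GetEntry(6);
    return px + py;   // 6 + 60
}
```

The point of the sketch is the binding by address: the user variables are the "memory" side, and the per-branch buffers are the "tree" side.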

Tree Creation Example
A few lines of code are enough to create a Tree, even for structures that may be very complex.

[Figure: browser view of Tree T showing the 8 branches of T and the 8 leaves of branch Electrons. A double-click on a leaf histograms it.]

The Tree Viewer & Analyzer
A very powerful class supporting complex cuts, event lists, 1-d, 2-d and 3-d views, and parallelism.

Tree Friends
[Figure: a tree and its friend trees, each with entries numbered 0 to 18 and aligned entry by entry; entry #8 is highlighted. Labels: public read, user write.]

Tree Friends
[Figure: three levels of trees: collaboration-wide (public read), analysis group (protected), user (private).]
Processing time is independent of the number of friends, unlike table joins in an RDBMS.

    root > TFile f1("tree1.root");
    root > tree.AddFriend("tree2", "tree2.root");
    root > tree.AddFriend("tree3", "tree3.root");
    root > tree.Draw("x:a", "k<c");
    root > tree.Draw("x:tree2.x", "sqrt(p)<b");
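One way to see why friends cost nothing like a table join: if friend trees share the entry numbering of the main tree, reading a friend value is a single indexed access. The following is a minimal sketch under that assumption; `ToyColumn`, `ToyTreeWithFriends` and `entryEightAcrossFriends` are hypothetical names, not ROOT classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model (hypothetical): a "tree" is one column of values, and friends
// are additional columns that share the same entry numbering.
struct ToyColumn {
    std::vector<double> values;
};

struct ToyTreeWithFriends {
    ToyColumn x;                            // branch of the main tree
    std::vector<const ToyColumn*> friends;  // columns living in other files

    void AddFriend(const ToyColumn* f) {
        // Friends are assumed to have exactly as many entries as the main tree.
        assert(f->values.size() == x.values.size());
        friends.push_back(f);
    }
    // One indexed access per friend: no key lookup and no join.
    double FriendValue(std::size_t whichFriend, std::size_t entry) const {
        return friends[whichFriend]->values[entry];
    }
};

// Build a main tree and two friends with entries 0..18, then read entry 8.
inline double entryEightAcrossFriends() {
    ToyTreeWithFriends tree;
    ToyColumn a, b;
    for (int i = 0; i <= 18; ++i) {
        tree.x.values.push_back(i);
        a.values.push_back(100 + i);
        b.values.push_back(200 + i);
    }
    tree.AddFriend(&a);
    tree.AddFriend(&b);
    return tree.x.values[8] + tree.FriendValue(0, 8) + tree.FriendValue(1, 8);
}
```

In this model the cost per entry grows only with the number of friend columns actually read, which is the contrast with an RDBMS join that the slide draws.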

Chains of Trees
- A TChain is a collection of Trees.
- Same semantics for TChains and TTrees:

    root > .x h1chain.C
    root > chain.Process("h1analysis.C")

    {
       //creates a TChain to be used by the h1analysis.C class
       //the symbol H1 must point to a directory where the H1 data sets
       //have been installed
       TChain chain("h42");
       chain.Add("$H1/dstarmb.root");
       chain.Add("$H1/dstarp1a.root");
       chain.Add("$H1/dstarp1b.root");
       chain.Add("$H1/dstarp2.root");
    }
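For a chain to behave like a single tree, a global entry number has to be mapped to a tree in the chain and a local entry inside it. A minimal sketch of that bookkeeping follows; `ToyChain` and `Locate` are hypothetical names, and the real TChain may organize this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Toy chain (hypothetical): stores only the cumulative entry counts of the
// trees that were added, in the order they were added.
class ToyChain {
public:
    // Add a tree holding nEntries entries to the end of the chain.
    void Add(std::size_t nEntries) {
        offsets_.push_back(offsets_.back() + nEntries);
    }
    // Total number of entries over all trees.
    std::size_t GetEntries() const { return offsets_.back(); }

    // Map a global entry number to (tree index, local entry in that tree).
    std::pair<std::size_t, std::size_t> Locate(std::size_t global) const {
        assert(global < GetEntries());
        std::size_t tree = 0;
        while (global >= offsets_[tree + 1]) ++tree;
        return {tree, global - offsets_[tree]};
    }

private:
    std::vector<std::size_t> offsets_{0};  // offsets_[k] = first entry of tree k
};
```

With trees of 100, 50 and 25 entries, global entry 120 falls in the second tree at local entry 20, and the caller never needs to know where the file boundaries are.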

Ntuples and Trees
- Ntuples
  - support PAW-like ntuples and functions
  - PAW ntuples/histograms can be imported
- Trees
  - Extension of Ntuples for Objects
  - Collection of branches (each branch has its own buffer)
  - Can input partial Event
  - Can have several Trees in parallel
- Chains = collections of Trees

Why Trees?
- Any object deriving from TObject can be written to a file, with an associated key, using object.Write().
- However, each key has an overhead in the directory structure in memory (about 60 bytes).
- Object.Write() is very convenient for objects like histograms, detector objects and calibrations, but not for event objects.

Why Trees?
- Trees have been designed to support very large collections of objects. The overhead in memory is in general less than 4 bytes per entry.
- Trees allow direct and random access to any entry (sequential access is the most efficient).
- Trees have branches and leaves. One can read a subset of all branches, which can considerably speed up data analysis.
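The two overhead figures quoted in the slides (about 60 bytes per key, less than 4 bytes per Tree entry) can be compared with simple arithmetic. This is only a back-of-the-envelope sketch: the constants are the approximate values from the slides, the second one is an upper bound, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Approximate figures taken from the slides, not measured values.
constexpr std::uint64_t kBytesPerKey = 60;       // "about 60 bytes" per key
constexpr std::uint64_t kMaxBytesPerEntry = 4;   // "less than 4 bytes" per entry

// In-memory overhead if every event were written with its own key.
constexpr std::uint64_t keyOverheadBytes(std::uint64_t nEvents) {
    return nEvents * kBytesPerKey;
}

// Upper bound on the in-memory overhead if the events are Tree entries.
constexpr std::uint64_t treeOverheadBoundBytes(std::uint64_t nEvents) {
    return nEvents * kMaxBytesPerEntry;
}
```

For one million events this gives roughly 60 MB of key overhead against at most 4 MB for a Tree, which is the motivation for not writing event objects one key at a time.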

Adding a Branch
There are many Branch constructors; only a few are shown here. The arguments are:
- Branch name
- Class name
- Address of the pointer to the Object (descendant of TObject)
- Buffer size (default = 32,000)
- Split level (default = 1)

    Event *event = new Event();
    myTree->Branch("eBranch", "Event", &event, 64000, 1);

Splitting a Branch
Setting the split level (default = 1)
[Figure: branch layout with split level = 0 and with split level = 1.]
Example:

    tree->Branch("EvBr", "Event", &ev, 64000, 0);
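The effect of the split level can be pictured as the difference between one buffer holding whole objects and one buffer per data member. The sketch below is a simplified model with hypothetical types (`ToyEvent`, `UnsplitBranch`, `SplitBranch`); it ignores compression and baskets and only counts the bytes that must be scanned to histogram a single member.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical two-member event used only for this illustration.
struct ToyEvent {
    int ntrack;
    float temperature;
};

// Split level 0 in this model: whole objects go into a single branch buffer.
struct UnsplitBranch {
    std::vector<ToyEvent> buffer;
    void Fill(const ToyEvent& e) { buffer.push_back(e); }
};

// Split level 1 in this model: one branch (one buffer) per data member.
struct SplitBranch {
    std::vector<int> ntrack;
    std::vector<float> temperature;
    void Fill(const ToyEvent& e) {
        ntrack.push_back(e.ntrack);
        temperature.push_back(e.temperature);
    }
};

// Bytes scanned to read ntrack for every entry, in each layout.
inline std::size_t bytesToScanUnsplit(const UnsplitBranch& b) {
    return b.buffer.size() * sizeof(ToyEvent);
}
inline std::size_t bytesToScanSplit(const SplitBranch& b) {
    return b.ntrack.size() * sizeof(int);
}
```

In the split layout, reading one member touches only that member's buffer, which is why reading a subset of branches can be much faster than rebuilding every full object.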

Adding Branches with a List of Variables
- Branch name
- Address: the address of the first item of a structure
- Leaflist: all variable names and types
- Order the variables according to their size

Example:

    TBranch *b = tree->Branch("Ev_Branch", &event, "ntrack/I:nseg:nvtex:flag/i:temp/F");
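The leaflist in the example describes the memory layout of the structure whose address is passed. A plausible matching structure is sketched below in plain C++. The struct names are hypothetical, fixed-width standard types stand in for Int_t, UInt_t and Float_t, and a 4-byte `float` is assumed. In a leaflist, `/I` is a 32-bit signed integer, `/i` a 32-bit unsigned integer and `/F` a 32-bit float, and a name without a suffix takes the type of the previous variable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical struct matching "ntrack/I:nseg:nvtex:flag/i:temp/F".
struct EventHeaderBlock {
    std::int32_t  ntrack;  // /I : 32-bit signed integer
    std::int32_t  nseg;    // no suffix: same type as the previous variable
    std::int32_t  nvtex;   // no suffix: same type as the previous variable
    std::uint32_t flag;    // /i : 32-bit unsigned integer
    float         temp;    // /F : 32-bit float (assumed 4 bytes here)
};

// Five 4-byte members: no padding is expected between or after them.
static_assert(sizeof(EventHeaderBlock) == 20, "unexpected padding");
static_assert(offsetof(EventHeaderBlock, temp) == 16, "unexpected offset");

// Why ordering by size matters: mixing sizes can make the compiler insert
// padding that a flat list of variables does not describe.
struct Padded  { char c; double d; char e; };   // padding typically after c and e
struct Ordered { double d; char c; char e; };   // largest member first
```

The last two structs illustrate the ordering advice: on common 64-bit platforms `Padded` is larger than `Ordered` because of alignment padding, so members sorted by decreasing size are the safer layout to describe with a leaflist.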

Why Trees?
- PAW ntuples are a special case of Trees.
- Trees are designed to work with complex event objects.
- High-level functions like TTree::Draw loop over all entries with selection expressions.
- Trees can be browsed via TBrowser.
- Trees can be analyzed via TTreeViewer.
- The PROOF system is designed to process chains of Trees in parallel in a GRID environment.

Create a TTree Object
A tree is a list of branches. The TTree constructor takes:
- Tree name (e.g. "myTree")
- Tree title

    TTree *tree = new TTree("T", "A ROOT tree");

ROOT I/O - Split - multifile
[Figure: objects in memory pass through the Streamer and are written to File 1, File 2 and File 3; further labels: TAGs, Tapes.]

[Figure: serial mode versus split mode.]

Structure designed to support very large DBs

The Event

    class Event : public TObject {
    private:
       char           fType[20];          //event type
       Int_t          fNtrack;            //Number of tracks
       Int_t          fNseg;              //Number of track segments
       Int_t          fNvertex;
       UInt_t         fFlag;
       Float_t        fTemperature;
       Int_t          fMeasures[10];
       Float_t        fMatrix[4][4];
       Float_t       *fClosestDistance;   //[fNvertex]
       EventHeader    fEvtHdr;
       TClonesArray  *fTracks;            //->array with all tracks
       TRefArray     *fHighPt;            //array of High Pt tracks only
       TRefArray     *fMuons;             //array of Muon tracks only
       TRef           fLastTrack;         //reference pointer to last track
       TH1F          *fH;                 //->
       //... (rest of the class not shown on the slide)
    };

    class EventHeader {
    private:
       Int_t   fEvtNum;
       Int_t   fRun;
       Int_t   fDate;
       //... (rest of the class not shown on the slide)
    };

See $ROOTSYS/test/Event.h

The Track

    class Track : public TObject {
    private:
       Float_t   fPx;           //X component of the momentum
       Float_t   fPy;           //Y component of the momentum
       Float_t   fPz;           //Z component of the momentum
       Float_t   fRandom;       //A random track quantity
       Float_t   fMass2;        //The mass square of this particle
       Float_t   fBx;           //X intercept at the vertex
       Float_t   fBy;           //Y intercept at the vertex
       Float_t   fMeanCharge;   //Mean charge deposition of all hits
       Float_t   fXfirst;       //X coordinate of the first point
       Float_t   fXlast;        //X coordinate of the last point
       Float_t   fYfirst;       //Y coordinate of the first point
       Float_t   fYlast;        //Y coordinate of the last point
       Float_t   fZfirst;       //Z coordinate of the first point
       Float_t   fZlast;        //Z coordinate of the last point
       Float_t   fCharge;       //Charge of this track
       Float_t   fVertex[3];    //Track vertex position
       Int_t     fNpoint;       //Number of points for this track
       Short_t   fValid;        //Validity criterion
       //... (rest of the class not shown on the slide)
    };

Event Builder

    void Event::Build(Int_t ev, Int_t ntrack, Float_t ptmin) {
       Clear();
       ...
       for (Int_t t = 0; t < ntrack; t++) AddTrack(random, ptmin);
    }

    Track *Event::AddTrack(Float_t random, Float_t ptmin)
    {
       // Add a new track to the list of tracks for this event.
       // To avoid calling the very time consuming operator new for each track,
       // the standard but not well known C++ operator "new with placement"
       // is called. If tracks[i] is 0, a new Track object will be created,
       // otherwise the previous Track[i] will be overwritten.

       TClonesArray &tracks = *fTracks;
       Track *track = new(tracks[fNtrack++]) Track(random);
       //Save reference to last Track in the collection of Tracks
       fLastTrack = track;
       //Save reference in fHighPt if track is a high Pt track
       if (track->GetPt() > ptmin) fHighPt->Add(track);
       //Save reference in fMuons if track is a muon candidate
       if (track->GetMass2() < 0.11) fMuons->Add(track);
       return track;
    }
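The "new with placement" idiom mentioned in the comment can be shown without ROOT. The sketch below uses a fixed pool of raw slots as a stand-in for the TClonesArray; `ToyTrack`, `ToyTrackPool` and `secondEventReusesFirstSlot` are hypothetical names, and the track type is assumed to be trivially destructible so that clearing the pool needs no destructor calls.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical, trivially destructible track type.
struct ToyTrack {
    float pt;
    explicit ToyTrack(float p) : pt(p) {}
};

// Fixed pool of raw, correctly aligned slots that are reused event after event.
class ToyTrackPool {
public:
    explicit ToyTrackPool(std::size_t capacity) : slots_(capacity) {}

    // Construct a track in the next free slot: no heap allocation per track.
    ToyTrack* Add(float pt) {
        assert(used_ < slots_.size());
        return new (&slots_[used_++]) ToyTrack(pt);   // placement new
    }
    // Start a new event: the slots stay allocated and will be overwritten.
    void Clear() { used_ = 0; }
    std::size_t Size() const { return used_; }

private:
    struct alignas(ToyTrack) Slot {
        unsigned char bytes[sizeof(ToyTrack)];
    };
    std::vector<Slot> slots_;
    std::size_t used_ = 0;
};

// The first track of a second event lands in the same memory as before.
inline bool secondEventReusesFirstSlot() {
    ToyTrackPool pool(4);
    const void* firstAddress = pool.Add(1.5f);
    pool.Clear();
    ToyTrack* again = pool.Add(2.5f);
    return firstAddress == static_cast<const void*>(again) && again->pt == 2.5f;
}
```

The memory is allocated once when the pool is created; each event only runs constructors in place, which is the saving the slide attributes to avoiding operator new for each track.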

Tree example Event (write)
All the examples can be executed with CINT or the compiler:

    root > .x demoe.C++

    void demoe(int nevents) {
       //load shared lib with the Event class
       gSystem->Load("$ROOTSYS/test/libEvent");
       //create a new ROOT file
       TFile f("demoe.root", "new");
       //Create a ROOT Tree with one single top level branch
       int split = 99;       //try also split=1 and split=0
       int bufsize = 16000;
       Event *event = new Event;
       TTree T("T", "Event demo tree");
       T.Branch("event", "Event", &event, bufsize, split);
       //Build Event in a loop and fill the Tree
       for (int i=0; i<nevents; i++) {
          event->Build(i);
          T.Fill();
       }
       T.Print();   //Print Tree statistics
       T.Write();   //Write Tree header to the file
    }

Tree example Event (read 1)

    void demoer() {
       //load shared lib with the Event class
       gSystem->Load("$ROOTSYS/test/libEvent");
       //connect ROOT file
       TFile *f = new TFile("demoe.root");
       //Read Tree header and set top branch address
       Event *event = 0;
       TTree *T = (TTree*)f->Get("T");
       T->SetBranchAddress("event", &event);
       //Loop on events and fill an histogram
       TH1F *h = new TH1F("hntrack", "Number of tracks", 100, 580, 620);
       int nevents = (int)T->GetEntries();
       for (int i=0; i<nevents; i++) {
          T->GetEntry(i);                 //Rebuild the full event in memory
          h->Fill(event->GetNtrack());
       }
       h->Draw();
    }

Tree example Event (read 2)

    void demoer2() {
       //load shared lib with the Event class
       gSystem->Load("$ROOTSYS/test/libEvent");
       //connect ROOT file
       TFile *f = new TFile("demoe.root");
       //Read Tree header and set top branch address
       Event *event = 0;
       TTree *T = (TTree*)f->Get("T");
       T->SetBranchAddress("event", &event);
       TBranch *bntrack = T->GetBranch("fNtrack");
       //Loop on events and fill an histogram
       TH1F *h = new TH1F("hntrack", "Number of tracks", 100, 580, 620);
       int nevents = (int)T->GetEntries();
       for (int i=0; i<nevents; i++) {
          bntrack->GetEntry(i);           //Read only one branch: much faster!
          h->Fill(event->GetNtrack());
       }
       h->Draw();
    }

Tree example Event (read 3)

    void demoer3() {
       //load shared lib with the Event class
       gSystem->Load("$ROOTSYS/test/libEvent");
       //connect ROOT file
       TFile *f = new TFile("demoe.root");
       //Read Tree header
       TTree *T = (TTree*)f->Get("T");
       //Histogram number of tracks via the TreePlayer
       T->Draw("event->GetNtrack()");
    }

Writing CMS PSimHit in a Tree

    void demo3() {
       //create a new ROOT file
       TFile f("demo3.root", "recreate");
       //Create a ROOT Tree with one single top level branch
       int split = 99;       //you can try split=1 and split=0
       int bufsize = 16000;
       PSimHit *hit = 0;
       TTree T("T", "CMS demo tree");
       T.Branch("hit", "PSimHit", &hit, bufsize, split);
       //Create hits in a loop and fill the Tree
       TRandom r;
       for (int i=0; i<50000; i++) {
          delete hit;
          Local3DPoint pentry(r.Gaus(0,1), r.Gaus(0,10));
          Local3DPoint pexit (r.Gaus(0,3), r.Gaus(50,20));
          float pabs    = 100*r.Rndm();
          float tof     = r.Gaus(1e-6, 1e-8);
          float eloss   = r.Landau(1e-3, 1e-7);
          int   ptype   = i%2;
          int   detId   = i%20;
          int   trackId = i%100;
          hit = new PSimHit(pentry, pexit, pabs, tof, eloss, ptype, detId, trackId);
          T.Fill();
       }
       T.Print();   //Print Tree statistics
       T.Write();   //Write Tree header to the file
    }

Browsing the PSimHit Tree (split = 0)
1 branch only

    *Tree    : T     : CMS demo tree
    *Entries : 50000 : Total = 4703775 bytes  File Size = 2207143
    *        :       : Tree compression factor = 2.13
    ***********************************************************************
    *Br 0    : hit   :
    *Entries : 50000 : Total Size = 4703775 bytes  File Size = 2207143
    *Baskets : 295   : Basket Size = 16000 bytes   Compression = 2.13
    *.....................................................................
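The compression figures in this printout are consistent with a plain ratio of total (uncompressed) bytes to bytes on file: 4703775 / 2207143 is about 2.13. The helper below is a minimal sketch of that arithmetic, assuming the printed factor is exactly this ratio; `compressionFactor` is a hypothetical name, not a ROOT function.

```cpp
#include <cassert>

// Ratio of uncompressed size to size on file, as it appears to be printed
// in the Tree summary (assumption based on the numbers in the printout).
inline double compressionFactor(double totalBytes, double fileBytes) {
    assert(fileBytes > 0.0);
    return totalBytes / fileBytes;
}
```

The same ratio can be checked branch by branch in split mode, where nearly constant members such as a particle type compress far better than floating-point coordinates.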

Browsing the PSimHit Tree (split = 1)
9 branches

    ***********************************************************************
    *Tree    : T     : CMS demo tree
    *Entries : 50000 : Total = 5258415 bytes  File Size = 2021907
    *        :       : Tree compression factor = 2.60
    ***********************************************************************
    *Branch  : hit
    *Entries : 50000 : BranchElement (see below)
    *.....................................................................
    *Br 0    : TObject :
    *Entries : 50000 : Total Size = 697816 bytes   File Size = 79579
    *Baskets : 56    : Basket Size = 16000 bytes   Compression = 8.77
    *.....................................................................
    *Br 1    : theEntryPoint :
    *Entries : 50000 : Total Size = 1704437 bytes  File Size = 750090
    *Baskets : 119   : Basket Size = 16000 bytes   Compression = 2.27
    *.....................................................................
    *Br 2    : theExitPoint :
    *Entries : 50000 : Total Size = 1704318 bytes  File Size = 744721
    *Baskets : 119   : Basket Size = 16000 bytes   Compression = 2.29
    *.....................................................................
    *Br 3    : thePabs :
    *Entries : 50000 : Total Size = 191988 bytes   File Size = 170871
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.12
    *.....................................................................
    *Br 4    : theTof :
    *Entries : 50000 : Total Size = 191976 bytes   File Size = 145548
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.32
    *.....................................................................
    *Br 5    : theEnergyLoss :
    *Entries : 50000 : Total Size = 191964 bytes   File Size = 122761
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.56
    *.....................................................................
    *Br 6    : theParticleType :
    *Entries : 50000 : Total Size = 191988 bytes   File Size = 1860
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 103.22
    *.....................................................................
    *Br 7    : theDetUnitId :
    *Entries : 50000 : Total Size = 191952 bytes   File Size = 2298
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 83.53
    *.....................................................................
    *Br 8    : theTrackId :
    *Entries : 50000 : Total Size = 191976 bytes   File Size = 4179
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 45.94
    *.....................................................................

Browsing the PSimHit Tree (split = 99)
16 branches
[Figure: a double-click on a branch produces the histogram shown on the slide.]

    ***********************************************************************
    *Tree    : T     : CMS demo tree
    *Entries : 50000 : Total = 2687592 bytes  File Size = 1509041
    *        :       : Tree compression factor = 1.78
    ***********************************************************************
    *Branch  : hit
    *Entries : 50000 : BranchElement (see below)
    *.....................................................................
    *Br 0    : fUniqueID :
    *Entries : 50000 : Total Size = 191964 bytes   File Size = 1272
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 150.92
    *.....................................................................
    *Br 1    : fBits :
    *Entries : 50000 : Total Size = 191964 bytes   File Size = 1260
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 152.35
    *.....................................................................
    *Br 2    : theEntryPoint :
    *Entries : 50000 : Total Size = 0 bytes        File Size = 0
    *Baskets : 0     : Basket Size = 16000 bytes   Compression = 1.00
    *.....................................................................
    *Br 3    : theEntryPoint.theVector.theX :
    *Entries : 50000 : Total Size = 191952 bytes   File Size = 177959
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.08
    *.....................................................................
    *Br 4    : theEntryPoint.theVector.theY :
    *Entries : 50000 : Total Size = 191952 bytes   File Size = 177934
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.08
    *.....................................................................
    *Br 5    : theEntryPoint.theVector.theZ :
    *Entries : 50000 : Total Size = 191952 bytes   File Size = 178312
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.08
    *.....................................................................
    *Br 6    : theExitPoint :
    *Entries : 50000 : Total Size = 0 bytes        File Size = 0
    *Baskets : 0     : Basket Size = 16000 bytes   Compression = 1.00
    *.....................................................................
    *Br 7    : theExitPoint.theVector.theX :
    *Entries : 50000 : Total Size = 191988 bytes   File Size = 178060
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.08
    *.....................................................................
    *Br 8    : theExitPoint.theVector.theY :
    *Entries : 50000 : Total Size = 191988 bytes   File Size = 178072
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.08
    *.....................................................................
    *Br 9    : theExitPoint.theVector.theZ :
    *Entries : 50000 : Total Size = 191988 bytes   File Size = 168655
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.14
    *.....................................................................
    *Br 10   : thePabs :
    *Entries : 50000 : Total Size = 191988 bytes   File Size = 170871
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.12
    *.....................................................................
    *Br 11   : theTof :
    *Entries : 50000 : Total Size = 191976 bytes   File Size = 145548
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.32
    *.....................................................................
    *Br 12   : theEnergyLoss :
    *Entries : 50000 : Total Size = 191964 bytes   File Size = 122761
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 1.56
    *.....................................................................
    *Br 13   : theParticleType :
    *Entries : 50000 : Total Size = 191988 bytes   File Size = 1860
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 103.22
    *.....................................................................
    *Br 14   : theDetUnitId :
    *Entries : 50000 : Total Size = 191952 bytes   File Size = 2298
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 83.53
    *.....................................................................
    *Br 15   : theTrackId :
    *Entries : 50000 : Total Size = 191976 bytes   File Size = 4179
    *Baskets : 12    : Basket Size = 16000 bytes   Compression = 45.94
    *.....................................................................

Collections of Hits
- A more realistic Tree will have
  - a collection of Detectors
  - for each detector, one or more collections of hits

[Figure: browser view of Tree T showing 36 branches in Tree T and 19 leaves in branch fDele.]

[Figure: browser view of Tree T showing the 8 branches of T and the 8 leaves of branch Electrons. A double-click on a leaf histograms it.]

The Tree Viewer & Analyzer
A very powerful class supporting complex cuts, event lists, 1-d, 2-d and 3-d views, and parallelism.

Chains
Scenario: perform an analysis using multiple ROOT files. All files have the same structure and contain the same tree.
- Chains are collections of chains or files.
- Chains can be built automatically by querying the run/file catalog.

The "No Shared Library" case
- There are many applications for which it does not make sense to read data without the code of the corresponding classes. In true OO, you want to exploit data hiding and rely on the functional interface.
- However, there are also cases where the functional interface is not necessary (PAW ntuples).
- It is nice to be able to browse any type of file without any code. You may not be able to do much, but it gives some confidence that you can always read your data sets.
- We have seen a religious debate on this subject. Our conclusion was that we had to support both modes of operation.
- Support for the "No Shared Lib" case is non-trivial.

Read/query Trees without the classes

TFile::MakeProject
- Generate the class header files
- Compile them
- Make a shared lib
- Link the shared lib

TFile::MakeProject
- All necessary header files are included
- Comments preserved
- Can do I/O, Inspect, Browse, etc.