Clang Tutorial, CS 453 Automated Software Testing

How to build a program analysis tool using Clang
• Initialization of Clang
• Useful functions to print AST
• Line number information of Stmt
• Code modification using Rewriter
• Converting Stmt into String
• Obtaining SourceLocation

Initialization of Clang
• Initialization of Clang is complicated
  • To use Clang, many classes must be created and many functions must be called to initialize the Clang environment
  • Ex) CompilerInstance, TargetOptions, FileManager, etc.
• It is recommended to use the initialization part of the sample source code from the course homepage as is, and to implement your own ASTConsumer and RecursiveASTVisitor classes

Useful functions to print AST
• dump() and dumpColor() in Stmt and FunctionDecl print the AST
  • dump() shows the AST rooted at a Stmt or FunctionDecl object
  • dumpColor() is similar to dump() but shows the AST with syntax highlighting
• Example: dumpColor() of myPrint

    FunctionDecl 0x368a1e0 <line:6:1> myPrint 'void (int)'
    |-ParmVarDecl 0x368a120 <line:3:14, col:18> param 'int'
    `-CompoundStmt 0x36a1828 <col:25, line:6:1>
      `-IfStmt 0x36a17f8 <line:4:3, line:5:24>
        |-<<<NULL>>>
        |-BinaryOperator 0x368a2e8 <line:4:7, col:16> 'int' '=='
        | |-ImplicitCastExpr 0x368a2d0 <col:7> 'int' <LValueToRValue>
        | | `-DeclRefExpr 0x368a288 <col:7> 'int' lvalue ParmVar 0x368a120 'param' 'int'
        | `-IntegerLiteral 0x368a2b0 <col:16> 'int' 1
        |-CallExpr 0x368a4e0 <line:5:5, col:24> 'int'
        | |-ImplicitCastExpr 0x368a4c8 <col:5> 'int (*)()' <FunctionToPointerDecay>
        | | `-DeclRefExpr 0x368a400 <col:5> 'int ()' Function 0x368a360 'printf' 'int ()'
        | `-ImplicitCastExpr 0x36a17e0 <col:12> 'char *' <ArrayToPointerDecay>
        |   `-StringLiteral 0x368a468 <col:12> 'char [11]' lvalue "param is 1"
        `-<<<NULL>>>
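
To read a dump() listing, it helps to see how the "|-" and "`-" prefixes encode parent/child structure. The following is a minimal standalone sketch of that layout only. It does not use Clang: Node, dumpNode and dumpTree are hypothetical names introduced for illustration, and the real dump() prints much more per node (addresses, source ranges, types).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an AST node; this is not a Clang class.
struct Node {
    std::string label;
    std::vector<Node> children;
};

// Appends one line per node. Every child except the last is introduced by
// "|-", the last child by "`-", and deeper levels are indented by two columns.
void dumpNode(const Node &n, const std::string &indent, bool isLast,
              bool isRoot, std::string &out) {
    if (isRoot) {
        out += n.label + "\n";
    } else {
        out += indent + (isLast ? "`-" : "|-") + n.label + "\n";
    }
    // Below a non-last child a vertical bar continues; below a last child it does not.
    std::string childIndent = isRoot ? "" : indent + (isLast ? "  " : "| ");
    for (std::size_t i = 0; i < n.children.size(); ++i) {
        dumpNode(n.children[i], childIndent, i + 1 == n.children.size(),
                 false, out);
    }
}

// Returns the whole tree as text, rooted at root.
std::string dumpTree(const Node &root) {
    std::string out;
    dumpNode(root, "", true, true, out);
    return out;
}
```

Calling dumpTree on a three-level tree yields the same indentation pattern as the listing above, which makes it easy to tell which node is the parent of a given line.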

Line number information of Stmt
• A SourceLocation object from getLocStart() of Stmt carries line information
• SourceManager is used to get line and column information from a SourceLocation
  • In the initialization step, a SourceManager object is created
  • getExpansionLineNumber() and getExpansionColumnNumber() in SourceManager give the line and column information, respectively

    bool VisitStmt(Stmt *s) {
        SourceLocation startLocation = s->getLocStart();
        SourceManager &srcmgr = m_srcmgr; // you can get SourceManager from the initialization part
        unsigned int lineNum = srcmgr.getExpansionLineNumber(startLocation);
        unsigned int colNum = srcmgr.getExpansionColumnNumber(startLocation);
        …
    }
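
The line and column values can be pictured as a mapping from a position in the file to a 1-based (line, column) pair. The sketch below is a simplified standalone model of that mapping, assuming a single file held in a string and no macro expansion. lineAndColumn is a hypothetical helper and is not part of Clang.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Returns the 1-based (line, column) of a 0-based character offset in src.
// Hypothetical helper: a simplified model of the information SourceManager
// reports, ignoring macro expansion and multiple files.
std::pair<unsigned, unsigned> lineAndColumn(const std::string &src,
                                            std::size_t offset) {
    assert(offset <= src.size());
    unsigned line = 1, col = 1;
    for (std::size_t i = 0; i < offset; ++i) {
        if (src[i] == '\n') {
            ++line;   // a newline starts the next line
            col = 1;  // and resets the column
        } else {
            ++col;
        }
    }
    return {line, col};
}
```

For the text "void f() {\n  if (x) g();\n}\n", the offset of "if" is 13, and the helper reports line 2, column 3.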

Code Modification using Rewriter
• You can modify code using the Rewriter class
• Rewriter has functions to insert, remove and replace code
  • InsertTextAfter(loc, str), InsertTextBefore(loc, str), RemoveText(loc, size), ReplaceText(…), etc., where loc, str and size are a location (SourceLocation), a string, and the size of the statement to remove, respectively
• Example: inserting text before the condition of an IfStmt using InsertTextAfter()

    bool MyASTVisitor::VisitStmt(Stmt *s) {
        if (isa<IfStmt>(s)) {
            IfStmt *ifStmt = cast<IfStmt>(s);
            Expr *condition = ifStmt->getCond();
            // Insert the comment at the start of the condition
            m_rewriter.InsertTextAfter(condition->getLocStart(), "/*start of cond*/");
        }
        return true; // continue the traversal
    }

    Before: if( param == 1 )
    After:  if( /*start of cond*/param == 1 )
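
One property worth keeping in mind is that the locations passed to the rewriting functions refer to the original source, so an earlier insertion does not shift the positions used by later ones. The sketch below models only that idea with plain string offsets. ToyRewriter is a hypothetical class written for illustration, not Clang's Rewriter, and it supports insertion only.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical miniature of a rewriter; not Clang's Rewriter class.
// All offsets refer to the ORIGINAL text, so earlier insertions never
// shift the positions used by later ones.
class ToyRewriter {
public:
    explicit ToyRewriter(std::string original)
        : original_(std::move(original)) {}

    // Inserts str at offset. Text inserted at the same offset later is
    // placed after text inserted there earlier.
    void insertTextAfter(std::size_t offset, const std::string &str) {
        assert(offset <= original_.size());
        insertions_[offset] += str;
    }

    // Builds the modified text by interleaving insertions with the original.
    std::string result() const {
        std::string out;
        std::size_t pos = 0;
        for (const auto &entry : insertions_) {
            out.append(original_, pos, entry.first - pos); // untouched part
            out += entry.second;                           // inserted text
            pos = entry.first;
        }
        out.append(original_, pos, std::string::npos);     // remaining tail
        return out;
    }

private:
    std::string original_;
    std::map<std::size_t, std::string> insertions_;
};
```

With the original text "if( param == 1 )", inserting "/*start of cond*/" at offset 4 (the position of "param") produces the modified line shown in the example above.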

Output of Rewriter
• The modified code is obtained from a RewriteBuffer of Rewriter through getRewriteBufferFor()
• Example code which writes the modified code to output.txt
  • ParseAST() modifies the target code as explained in the previous slides
  • TheConsumer contains a Rewriter instance TheRewriter

    int main(int argc, char *argv[]) {
        …
        ParseAST(TheCompInst.getPreprocessor(), &TheConsumer, TheCompInst.getASTContext());
        const RewriteBuffer *RewriteBuf = TheRewriter.getRewriteBufferFor(SourceMgr.getMainFileID());
        ofstream output("output.txt");
        // getRewriteBufferFor() returns NULL if the file was not modified
        if (RewriteBuf)
            output << string(RewriteBuf->begin(), RewriteBuf->end());
        output.close();
    }

Converting Stmt into String
• ConvertToString(stmt) of Rewriter returns a string corresponding to Stmt
• The returned string may not be exactly the same as the original statement, since ConvertToString() prints the string using the Clang pretty printer
  • For example, ConvertToString() will insert a space between an operand and an operator

    a<100  ->  ParseAST  ->  ConvertToString  ->  a < 100

SourceLocation
• To change code, you need to specify where to change it
  • The Rewriter class requires a SourceLocation instance, which contains location information
• You can get a SourceLocation instance by:
  • getLocStart() and getLocEnd() of Stmt, which return the start and end locations of a Stmt instance, respectively
  • findLocationAfterToken(loc, tok, …) of Lexer, which returns the location of the first token tok occurring right after loc
    • Lexer tokenizes the target code
  • getLocWithOffset(offset, …) of SourceLocation, which returns a location adjusted by the given offset

getLocStart() and getLocEnd()
• getLocStart() returns the exact starting location of Stmt
• getLocEnd() does not return the position just past Stmt: it returns the location where the last token of Stmt begins, which is where the second-to-last token (and any whitespace after it) ends
  • To get the correct end location, you need to use the Lexer class in addition
• Example: getLocStart() and getLocEnd() results for the condition of an IfStmt

    if (param == 1)

    getLocStart() points to the start of "param"
    getLocEnd() points to the start of the last token "1" (just after "=="), not to the end of "1"
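
The practical consequence is that the text between the two locations misses the last token unless the length of that token is added. The sketch below illustrates this with plain string offsets. It is a standalone model, not Clang code: endOfToken and textOfRange are hypothetical helpers, and only identifiers, numbers and single-character punctuation are recognized as tokens.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns the offset just past the token that starts at
// tokStart. Only identifiers/numbers and single-character punctuation are
// modeled; a real lexer handles far more.
std::size_t endOfToken(const std::string &src, std::size_t tokStart) {
    assert(tokStart < src.size());
    auto isWord = [](unsigned char c) { return std::isalnum(c) || c == '_'; };
    if (!isWord(src[tokStart])) return tokStart + 1;
    std::size_t end = tokStart;
    while (end < src.size() && isWord(src[end])) ++end;
    return end;
}

// Text covered by a token range [beginTok, lastTok], where both offsets point
// at the START of a token, mirroring getLocStart() and getLocEnd().
std::string textOfRange(const std::string &src, std::size_t beginTok,
                        std::size_t lastTok) {
    assert(beginTok <= lastTok);
    return src.substr(beginTok, endOfToken(src, lastTok) - beginTok);
}
```

For "if (param == 1)", the condition starts at offset 4 and its last token starts at offset 13. Taking the characters between the two offsets alone gives "param == ", while textOfRange returns "param == 1".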

findLocationAfterToken (1/2)
• The static function findLocationAfterToken(loc, TKind, …) of Lexer returns the ending location of the first token of type TKind after loc

    static SourceLocation findLocationAfterToken(SourceLocation loc,
                                                 tok::TokenKind TKind,
                                                 const SourceManager &SM,
                                                 const LangOptions &LangOpts,
                                                 bool SkipTrailingWhitespaceAndNewLine)

• Use findLocationAfterToken() to get the correct end location of Stmt
• Example: finding the location of ')' (tok::r_paren) using findLocationAfterToken() to find the end of an if condition

    bool MyASTVisitor::VisitStmt(Stmt *s) {
        if (isa<IfStmt>(s)) {
            IfStmt *ifStmt = cast<IfStmt>(s);
            Expr *condition = ifStmt->getCond();
            SourceLocation endOfCond = clang::Lexer::findLocationAfterToken(
                condition->getLocEnd(), tok::r_paren, m_sourceManager, m_langOptions, false);
            // endOfCond points to the end of ')'
        }
        return true; // continue the traversal
    }

    if ( a + x > 3 )

    findLocationAfterToken(ifStmt->getCond()->getLocEnd(), tok::r_paren, …) returns the end of ")"
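
The behavior described above can be modeled on a plain string: start at the beginning of a token, step over that token and any whitespace, and succeed only if the very next token is the expected one. The sketch below is a standalone approximation under these assumptions, not Clang's implementation. findOffsetAfterToken and kInvalid are hypothetical names, only single-character expected tokens are supported, and comments are not skipped.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Models an invalid SourceLocation (hypothetical constant).
const std::size_t kInvalid = std::string::npos;

// Hypothetical model of findLocationAfterToken for single-character tokens.
// loc is the START of a token. The function skips that token and any
// whitespace, and succeeds only if the very next token is `expected`;
// it then returns the offset just past that token.
std::size_t findOffsetAfterToken(const std::string &src, std::size_t loc,
                                 char expected) {
    assert(loc < src.size());
    auto isWord = [](unsigned char c) { return std::isalnum(c) || c == '_'; };
    auto isSpace = [](unsigned char c) { return std::isspace(c) != 0; };
    std::size_t pos = loc;
    if (isWord(src[pos])) {
        while (pos < src.size() && isWord(src[pos])) ++pos; // identifier or number
    } else {
        ++pos;                                              // punctuation
    }
    while (pos < src.size() && isSpace(src[pos])) ++pos;    // skip whitespace
    if (pos < src.size() && src[pos] == expected) return pos + 1;
    return kInvalid;
}
```

For "if ( a + x > 3 )", starting from the offset of "3" (13) and expecting ')' returns 16, the position just past the closing parenthesis. Starting from "x" fails, because the next token there is ">" rather than ")".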

findLocationAfterToken (2/2)
• You may find the location of other tokens by changing the TKind parameter
• List of useful enums for HW #3

    Enum name        Token character
    tok::semi        ;
    tok::r_paren     )
    tok::question    ?
    tok::r_brace     }

• The fourth parameter, a LangOptions instance, is obtained from getLangOpts() of CompilerInstance (see line 99 and line 106 of the appendix)
  • You can find CompilerInstance in the initialization part of Clang

References
• Clang, http://clang.llvm.org/
• Clang API Documentation, http://clang.llvm.org/doxygen/
• How to parse C programs with clang: A tutorial in 9 parts, http://amnoid.de/tmp/clangtut/tut.html

Appendix: Example Source Code (1/4)
• This program prints the names of declared functions and the class name of each Stmt in function bodies

PrintFunctions.c

      1  #include <cstdio>
      2  #include <string>
      3  #include <iostream>
      4  #include <sstream>
      5  #include <map>
      6  #include <utility>
      7
      8  #include "clang/AST/ASTConsumer.h"
      9  #include "clang/AST/RecursiveASTVisitor.h"
     10  #include "clang/Basic/Diagnostic.h"
     11  #include "clang/Basic/FileManager.h"
     12  #include "clang/Basic/SourceManager.h"
     13  #include "clang/Basic/TargetOptions.h"
     14  #include "clang/Basic/TargetInfo.h"
     15  #include "clang/Frontend/CompilerInstance.h"
     16  #include "clang/Lex/Preprocessor.h"
     17  #include "clang/Parse/ParseAST.h"
     18  #include "clang/Rewrite/Core/Rewriter.h"
     19  #include "clang/Rewrite/Frontend/Rewriters.h"
     20  #include "llvm/Support/Host.h"
     21  #include "llvm/Support/raw_ostream.h"
     22
     23  using namespace clang;
     24  using namespace std;
     25
     26  class MyASTVisitor : public RecursiveASTVisitor<MyASTVisitor>
     27  {
     28  public:

Appendix: Example Source Code (2/4)

     29      bool VisitStmt(Stmt *s) {
     30          // Print name of sub-class of s
     31          printf("\t%s\n", s->getStmtClassName());
     32          return true;
     33      }
     34
     35      bool VisitFunctionDecl(FunctionDecl *f) {
     36          // Print function name (getNameAsString() yields a std::string usable with %s)
     37          printf("%s\n", f->getNameAsString().c_str());
     38          return true;
     39      }
     40  };
     41
     42  class MyASTConsumer : public ASTConsumer
     43  {
     44  public:
     45      MyASTConsumer() : Visitor() // initialize MyASTVisitor
     46      {}
     47
     48      virtual bool HandleTopLevelDecl(DeclGroupRef DR) {
     49          for (DeclGroupRef::iterator b = DR.begin(), e = DR.end();
     50               b != e; ++b) {
     51              // Traverse each top-level declaration using MyASTVisitor
     52              Visitor.TraverseDecl(*b);
     53          }
     54          return true;
     55      }
     56
     57  private:
     58      MyASTVisitor Visitor;
     59  };
     60
     61
     62  int main(int argc, char *argv[])
     63  {

Appendix: Example Source Code (3/4)

     64      if (argc != 2) {
     65          llvm::errs() << "Usage: PrintFunctions <filename>\n";
     66          return 1;
     67      }
     68
     69      // CompilerInstance will hold the instance of the Clang compiler for us,
     70      // managing the various objects needed to run the compiler.
     71      CompilerInstance TheCompInst;
     72
     73      // Diagnostics manages problems and issues found during compilation
     74      TheCompInst.createDiagnostics(NULL, false);
     75
     76      // Set target platform options
     77      // Initialize target info with the default triple for our platform.
     78      TargetOptions *TO = new TargetOptions();
     79      TO->Triple = llvm::sys::getDefaultTargetTriple();
     80      TargetInfo *TI = TargetInfo::CreateTargetInfo(TheCompInst.getDiagnostics(), TO);
     81      TheCompInst.setTarget(TI);
     82
     83      // FileManager supports file system lookup, file system caching, and directory search management.
     84      TheCompInst.createFileManager();
     85      FileManager &FileMgr = TheCompInst.getFileManager();
     86
     87      // SourceManager handles loading and caching of source files into memory.
     88      TheCompInst.createSourceManager(FileMgr);
     89      SourceManager &SourceMgr = TheCompInst.getSourceManager();
     90
     91      // Preprocessor runs within a single source file
     92      TheCompInst.createPreprocessor();
     93
     94      // ASTContext holds long-lived AST nodes (such as types and decls).
     95      TheCompInst.createASTContext();
     96
     97      // A Rewriter helps us manage the code rewriting task.
     98      Rewriter TheRewriter;

Appendix: Example Source Code (4/4)

     99      TheRewriter.setSourceMgr(SourceMgr, TheCompInst.getLangOpts());
    100
    101      // Set the main file handled by the source manager to the input file.
    102      const FileEntry *FileIn = FileMgr.getFile(argv[1]);
    103      SourceMgr.createMainFileID(FileIn);
    104
    105      // Inform Diagnostics that processing of a source file is beginning.
    106      TheCompInst.getDiagnosticClient().BeginSourceFile(TheCompInst.getLangOpts(), &TheCompInst.getPreprocessor());
    107
    108      // Create an AST consumer instance which is going to get called by ParseAST.
    109      MyASTConsumer TheConsumer;
    110
    111      // Parse the file to AST, registering our consumer as the AST consumer.
    112      ParseAST(TheCompInst.getPreprocessor(), &TheConsumer, TheCompInst.getASTContext());
    113
    114      return 0;
    115  }
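
The appendix declares MyASTVisitor as a class derived from RecursiveASTVisitor<MyASTVisitor>, that is, the base class is parameterized by the derived class. The sketch below shows, in miniature and without Clang, how such a base class can walk a tree and call the derived class's visit function without virtual functions. ToyStmt, ToyRecursiveVisitor and NameCollector are hypothetical names, and the real RecursiveASTVisitor has many more Traverse and Visit hooks than this model.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature statement node; not a Clang class.
struct ToyStmt {
    std::string className;
    std::vector<ToyStmt> children;
};

// Base traverser parameterized by the derived class: it walks the tree and
// calls the derived class's visitStmt() through a static_cast.
template <typename Derived>
class ToyRecursiveVisitor {
public:
    // Pre-order traversal; stops early if a visit returns false.
    bool traverseStmt(const ToyStmt &s) {
        if (!static_cast<Derived *>(this)->visitStmt(s)) return false;
        for (const ToyStmt &child : s.children) {
            if (!traverseStmt(child)) return false;
        }
        return true;
    }

    // Default hook used when the derived class does not define its own.
    bool visitStmt(const ToyStmt &) { return true; }
};

// Collects the class name of every visited node, in the spirit of the
// VisitStmt() in the appendix, which prints it instead.
class NameCollector : public ToyRecursiveVisitor<NameCollector> {
public:
    bool visitStmt(const ToyStmt &s) {
        names.push_back(s.className);
        return true;  // continue the traversal
    }
    std::vector<std::string> names;
};
```

Calling traverseStmt() on the root of a small tree fills names in pre-order, parent before children, which is also the order in which the appendix program prints statement class names.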