Lori Pollock Professor CIS Program Analysis Software Development

  • Slides: 24

Lori Pollock
Professor, CIS
Program Analysis, Software Development & Maintenance Tools, Optimizing Compilers
• ’81: B.S. in CS and Econ, Allegheny
• ’81-’86: Ph.D. in CS, U of Pittsburgh
• ’86-’90: Assistant Prof, Rice U
• ’91: Assistant, Associate, Full Prof, UD CIS
• Family: married Mark; Lauren ’88; Lindsay ’90; Matt ’95
• Today: Lauren & Lindsay here at UD, Matt 14, 3 Ph.D. students and a few undergraduate researchers

What I do here at UD
• Research
  – Software Engineering and Compilation Lab (Hiperspace)
    • 213 Smith Hall
  – Collaborations
    • Vijay Shanker (UD CIS), Terry Harvey (UD CIS), Lisa Marvel (Army Research Lab), Martin Swany (UD CIS), Guang Gao (UD ECE)
  – Funding
    • Primarily NSF grants; previously some Army funding
• Graduate Teaching
  – CISC 672 Compilers
  – CISC 673 Program Analysis and Transformations
  – CISC 879 Software Testing and Maintenance
  – CISC 879 Software Tools and Environments
• Undergraduates
  – Programming XO laptops for local middle school teachers
  – Study abroad programs

What I do outside UD
• Associate Editor, Transactions on Software Engineering and Methodology (TOSEM)
• Computing Research Association (CRA)’s Committee on the Status of Women in Computing Research (CRA-W)
• Mentoring
  – Speaker at mentoring workshops for undergrads, assistant and associate profs, and industry lab researchers
• Program committees, conference organization, NSF panels, paper reviews, … (typical of university researchers)

Ph.D. Students in Training
• Antony Danalis, Ph.D.
• Emily Gibson Hill, Ph.D.
• Giri Sridhara, Ph.D.
• Masters: Divya Muppaneni
• Current Undergraduates: Eric Enslen, Sana Malik, Katie Baldwin

Recently Completed Ph.D. 2007-08
• David Shepherd: Postdoc, Startup
• Ben Breech: Postdoc, NASA
• Sara Sprenkle: Assistant Prof, Washington & Lee U
• Mike Jochen: Assistant Prof, East Stroudsburg U

Overview: Research Projects
Foundation: Program Analysis, Compiler Technology
• Natural Language Analysis of Programs (Emily, Giri)
• Testing Web Applications (collaboration with Sara Sprenkle)
• Optimization of Cluster Parallel Programs (Antony)
• Runtime Test Generation via Dynamic Compilers (Ben)
Spectrum: Software Tools … Testing … Compilers … Parallel Computing

Optimizing Cluster Parallel Programs
Research Problem: How can scientific codes be scaled to a cluster of many CPUs?
Major Challenge: Communication costs
Approach and Contributions: An integrated system to hide communication latency
– Surveyor: Collect “knowledge” of the cluster
– Compiler: Analyze dependencies and transform the code to create maximal communication/computation overlap
– Communication Library: Use a companion library to MPI
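The idea of communication/computation overlap can be sketched in a few lines. This is only an illustration in Java, not the ASPhALT system or its library: `startExchange` is a hypothetical stand-in for a nonblocking send/receive (in real MPI code this role is played by calls such as `MPI_Isend` and `MPI_Wait`), and the 50 ms delay is a made-up latency.

```java
import java.util.concurrent.CompletableFuture;

class OverlapSketch {
    // Hypothetical stand-in for a nonblocking exchange with a neighbor node.
    // It returns immediately; the "received" value arrives later.
    static CompletableFuture<Integer> startExchange(int boundaryValue) {
        return CompletableFuture.supplyAsync(() -> {
            pause(50);                 // simulated network latency (assumed value)
            return boundaryValue * 2;  // pretend this is the neighbor's reply
        });
    }

    // Work that does not depend on the data being communicated.
    static long computeInterior(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += i;
        }
        return sum;
    }

    // Post the communication first, compute independent work, wait last.
    static long step(int boundaryValue, int n) {
        CompletableFuture<Integer> pending = startExchange(boundaryValue);
        long interior = computeInterior(n);  // overlaps with the exchange
        int received = pending.join();       // block only when the data is needed
        return interior + received;
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println(step(21, 100)); // prints 5092
    }
}
```

The transformation a compiler would have to justify is the reordering inside `step`: the wait can only move below `computeInterior` if dependence analysis shows that the interior work never reads the incoming data.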

ASPhALT: Automatic System for Parallel AppLication Transformations
Contribution: FIRST to cluster-optimize MPI codes

Testing Web Applications
Web Application Structure
• Browser: Client (HTML)
• Front End: Server (Java)
• Back End: Database (MySql)
Challenges
• Combination of
  – Stand-alone applications
  – GUIs and Database applications
  – Distributed applications
• Numerous technologies and components

Traditional Software Testing Process
[Diagram: Application Specification, Application Representation, Test Case Generator, Test Cases, Application Implementation, Replay Tool, Expected Results, Actual Results, Oracle, Pass/Fail]
• Callout on the diagram: Hard to obtain when testing web applications!!
• Alternative: User-session-based Testing

User-session-based Testing Process
Example user session (User 1):
– register.jsp?name=ss&pass=tst
– login.jsp?name=ss&pass=tst
– logout.jsp
[Diagram: Users, Beta Web Application (v.0.9) Deployment, User Log, Create/Reduce User Sessions, Create test cases (Requests), Test Cases, Replay Tool, Web Application Implementation (v.1.0), Expected Results, Actual Results, Oracle, Pass/Fail]
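As a rough sketch of the "user log to user sessions" step, the Java below groups logged requests by user, keeping request order. The log format (`<user> <request>` per line) and the class and method names are assumptions made for illustration; a real web server log would need its own parser.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SessionLogSketch {
    // Groups "<user> <request>" log lines into one ordered request list per user.
    // The line format is a simplifying assumption, not a real server log format.
    static Map<String, List<String>> toSessions(List<String> logLines) {
        Map<String, List<String>> sessions = new LinkedHashMap<>();
        for (String line : logLines) {
            String[] parts = line.trim().split("\\s+", 2);
            if (parts.length < 2) {
                continue; // skip blank or malformed entries
            }
            sessions.computeIfAbsent(parts[0], user -> new ArrayList<>()).add(parts[1]);
        }
        return sessions;
    }

    public static void main(String[] args) {
        List<String> log = List.of(
            "user1 register.jsp?name=ss&pass=tst",
            "user2 login.jsp?name=bb&pass=x",
            "user1 login.jsp?name=ss&pass=tst",
            "user1 logout.jsp");
        // Each session is a candidate test case: replay its requests in order.
        toSessions(log).forEach((user, requests) -> System.out.println(user + " -> " + requests));
    }
}
```

Each resulting request list can then be handed to a replay tool as one test case.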

Maintenance Testing for Web Applications
Research Problem: How can we exploit user session logging for testing of web applications after initial deployment, with minimal tester effort?
Contributions: Scalable, practical, automated structural testing framework for web applications
• Test case generation
• Test suite reduction
• Test oracles
• Test coverage criteria in terms of URLs, parameters, values
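To make "test suite reduction" against a URL coverage criterion concrete, here is a minimal sketch using a generic greedy set-cover heuristic. It is not necessarily the reduction technique used in the framework above; the class name, the session names, and the choice to ignore parameters and values are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class SuiteReductionSketch {
    // Coverage is measured on the URL only; parameters and values are ignored here.
    static String baseUrl(String request) {
        int q = request.indexOf('?');
        return q < 0 ? request : request.substring(0, q);
    }

    static Set<String> covered(List<String> session) {
        Set<String> urls = new TreeSet<>();
        for (String request : session) {
            urls.add(baseUrl(request));
        }
        return urls;
    }

    // Greedy heuristic: repeatedly keep the session that covers the most
    // still-uncovered URLs until every URL seen in the log is covered.
    static List<String> reduce(Map<String, List<String>> sessions) {
        Set<String> uncovered = new TreeSet<>();
        for (List<String> session : sessions.values()) {
            uncovered.addAll(covered(session));
        }
        List<String> kept = new ArrayList<>();
        while (!uncovered.isEmpty()) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, List<String>> entry : sessions.entrySet()) {
                Set<String> gain = covered(entry.getValue());
                gain.retainAll(uncovered);
                if (gain.size() > bestGain) {
                    best = entry.getKey();
                    bestGain = gain.size();
                }
            }
            kept.add(best);
            uncovered.removeAll(covered(sessions.get(best)));
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, List<String>> sessions = new java.util.LinkedHashMap<>();
        sessions.put("s1", List.of("register.jsp?name=ss", "login.jsp?name=ss", "logout.jsp"));
        sessions.put("s2", List.of("login.jsp?name=bb", "logout.jsp"));
        sessions.put("s3", List.of("login.jsp?name=cc", "search.jsp?q=x"));
        System.out.println(reduce(sessions)); // prints [s1, s3]
    }
}
```

Swapping `baseUrl` for a function that also keeps parameter names or values would give the stricter criteria listed on the slide, at the cost of a larger reduced suite.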

Analyzing the Names in Software
Research Problem
– 60-90% of software costs are in reading and navigating large software systems to fix bugs and add new features. Can we help by automating search, navigation, and location of relevant code?
– Key: Programmers leave clues of their intent as they choose names.
Focus on actions
– Actions correspond to verbs
– Verbs need a direct object
– Phrases are more useful
Proposed Approach: Develop, extend, and apply natural language-based analysis to identifier names and comments
Contribution: Aid understanding, debugging, maintenance, development

Our Research Focus and Impact
• Software Maintenance Tools: Search, Exploration, Understanding, …
• NLPA
• Natural Language Analysis
  – Word splitting
  – Word relations (synonyms, antonyms, …)
  – Part of speech tagging
  – Abbreviations
  – …

How do SE tool users search code now?
[Diagram: User, (Re)formulate Query, Search Method, Source Code, Search Results, Determine Relevance of Results]
1. User formulates query
2. Query executed by search method
3. User views search results
4. Repeat as necessary
Our focus: Natural language (NL) queries
User faces 2 challenges:
• Decide what query words to search for
• Determine whether the results are relevant
Example: “compile report” vs. “compil*report”

Our Contextual Search Process
[Diagram: User, (Re)formulate Query, Search Method, Source Code, Information Extraction Process, NL Phrase Mapping, Partial Phrase Matching, Hierarchical Search Results, Determine Relevance of Results]

Another of our Tools: Dora the Program Explorer*
Inputs
• Query: Natural Language Query
  – Maintenance request
  – Expert knowledge
  – Query expansion
• Program Structure
  – Representation (current: call graph)
  – Seed starting point
Output of Dora
• Relevant Neighborhood
  – Subgraph relevant to query
* Dora comes from exploradora, the Spanish word for a female explorer.
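The inputs and output above (query, call graph, seed, relevant subgraph) can be illustrated with a deliberately simple exploration. This is not Dora's actual relevance scoring: the sketch assumes a method is "relevant" when its name contains a query word, and it only follows caller-to-callee edges. All method names in the example graph are made up.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class NeighborhoodSketch {
    // Assumed relevance test: the method name contains some query word.
    static boolean relevant(String method, Set<String> queryWords) {
        String lower = method.toLowerCase();
        for (String word : queryWords) {
            if (lower.contains(word.toLowerCase())) {
                return true;
            }
        }
        return false;
    }

    // Breadth-first exploration from the seed, expanding only through relevant callees.
    static Set<String> neighborhood(Map<String, List<String>> callGraph,
                                    String seed, Set<String> queryWords) {
        Set<String> result = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        result.add(seed);
        worklist.add(seed);
        while (!worklist.isEmpty()) {
            String method = worklist.remove();
            for (String callee : callGraph.getOrDefault(method, List.of())) {
                if (relevant(callee, queryWords) && result.add(callee)) {
                    worklist.add(callee);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> callGraph = Map.of(
            "addAuction", List.of("saveAuction", "logMessage"),
            "saveAuction", List.of("writeAuctionFile"),
            "logMessage", List.of("formatAuctionDate"));
        // formatAuctionDate is never reached because logMessage is pruned.
        System.out.println(neighborhood(callGraph, "addAuction", Set.of("auction")));
    }
}
```

Pruning at irrelevant methods is what keeps the neighborhood small; the price is that relevant code behind an irrelevant caller is missed, which is one reason a graded relevance score is more useful than a yes/no test.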

Illustrating some issues: Extracting Clues from Signatures
1. Split name into words
2. Part-of-speech tag method name
3. Chunk method name
4. Identify Verb and Direct-Object (DO)

    public UserList getUserListFromFile( String path ) throws IOException {
        try {
            File tmpFile = new File( path );
            return parseFile(tmpFile);
        } catch( java.io.IOException e ) {
            throw new IOException( "UserList format issue" + path + " file " + e );
        }
    }

Split:    get | User | List | From | File
POS tag:  <verb> <adj> <noun> <prep> <noun>
Chunk:    get <verb phrase> | User List <noun phrase> | From File <prep phrase>
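Steps 1 and 4 can be approximated with a short sketch. It replaces real part-of-speech tagging and chunking with a positional heuristic (first word is the verb, the words up to the first preposition form the direct object), so it is only an illustration; the preposition list and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class NameClueSketch {
    // Small, assumed preposition list; a real tagger would not need it.
    static final Set<String> PREPOSITIONS =
        Set.of("from", "to", "in", "on", "with", "for", "by", "of", "at");

    // Step 1: split a camelCase identifier at lowercase-to-uppercase boundaries.
    static List<String> splitCamelCase(String identifier) {
        List<String> words = new ArrayList<>();
        for (String word : identifier.split("(?<=[a-z0-9])(?=[A-Z])")) {
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    // Step 4 (heuristic): first word is the verb, following words up to the
    // first preposition are the direct object. Returns {verb, directObject}.
    static String[] verbAndDirectObject(String methodName) {
        List<String> words = splitCamelCase(methodName);
        String verb = words.isEmpty() ? "" : words.get(0).toLowerCase();
        StringBuilder directObject = new StringBuilder();
        for (int i = 1; i < words.size(); i++) {
            if (PREPOSITIONS.contains(words.get(i).toLowerCase())) {
                break;
            }
            if (directObject.length() > 0) {
                directObject.append(' ');
            }
            directObject.append(words.get(i));
        }
        return new String[] { verb, directObject.toString() };
    }

    public static void main(String[] args) {
        System.out.println(splitCamelCase("getUserListFromFile")); // prints [get, User, List, From, File]
        String[] clue = verbAndDirectObject("getUserListFromFile");
        System.out.println(clue[0] + " / " + clue[1]);             // prints get / User List
    }
}
```

The heuristic breaks on names that do not start with a verb (for example `userListSize`), which is exactly where tagging and chunking earn their keep.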

Developing Basic NL Analyses
• Software Maintenance Tools: Search, Exploration, Understanding, …
• NLPA
• Natural Language Analysis
  – Word splitting
  – Word relations (synonyms, antonyms, …)
  – Part of speech tagging
  – Abbreviations
  – …

Automatic Abbreviation Expansion
• Don’t want to miss relevant code with abbreviations
• Given a code segment, identify character sequences that are short forms and determine the long form
• Approach: Mine expansions from code [MSR 08]
  1. Split identifiers
  2. Identify non-dictionary words
  3. Determine long form
[Callouts on the slide's example: non-dictionary word; no boundary]
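The three steps can be sketched as follows. This is a simplified illustration, not the algorithm from [MSR 08]: the dictionary and the context words (which in practice would be mined from nearby comments and identifiers) are passed in by hand, and a long form is accepted when it starts with the same letter and contains the abbreviation's letters in order.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class AbbreviationSketch {
    // True when the letters of abbr appear in word in the same order.
    static boolean isSubsequence(String abbr, String word) {
        int matched = 0;
        for (int j = 0; j < word.length() && matched < abbr.length(); j++) {
            if (abbr.charAt(matched) == word.charAt(j)) {
                matched++;
            }
        }
        return matched == abbr.length();
    }

    static Map<String, String> expand(String identifier, Set<String> dictionary,
                                      List<String> contextWords) {
        Map<String, String> expansions = new LinkedHashMap<>();
        // Step 1: split the identifier at camelCase boundaries and underscores.
        for (String token : identifier.split("(?<=[a-z0-9])(?=[A-Z])|_")) {
            String shortForm = token.toLowerCase();
            // Step 2: dictionary words need no expansion.
            if (shortForm.isEmpty() || dictionary.contains(shortForm)) {
                continue;
            }
            // Step 3: take the first context word that plausibly expands the token.
            for (String candidate : contextWords) {
                String longForm = candidate.toLowerCase();
                if (longForm.length() > shortForm.length()
                        && longForm.charAt(0) == shortForm.charAt(0)
                        && isSubsequence(shortForm, longForm)) {
                    expansions.put(shortForm, longForm);
                    break;
                }
            }
        }
        return expansions;
    }

    public static void main(String[] args) {
        System.out.println(expand("cfgFile", Set.of("file"),
                                  List.of("read", "configuration", "file"))); // prints {cfg=configuration}
    }
}
```

Identifiers with no camelCase or underscore boundary pass through step 1 as a single token here, so they would need a smarter splitter before this matching could help.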

Issues with a Simple Dictionary Approach
• Manually create a lookup table of common abbreviations in code
  – Vocabulary evolves over time, must maintain table
  – Same abbreviation can have different expansions depending on domain AND context
• Example: cfg?
  – Control Flow Graph
  – Context-Free Grammar
  – configuration
  – configure
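One way to see why context matters for `cfg` is to score each candidate expansion by how many of its words already occur near the abbreviation. The scoring rule below is a made-up heuristic for illustration, and the context words are supplied by hand rather than mined from code.

```java
import java.util.List;
import java.util.Set;

class ExpansionChooserSketch {
    // Picks the candidate whose words overlap most with the surrounding context.
    // Ties go to the earlier candidate; returns null when there are no candidates.
    static String choose(List<String> candidates, Set<String> contextWords) {
        String best = null;
        int bestScore = -1;
        for (String candidate : candidates) {
            int score = 0;
            for (String word : candidate.toLowerCase().split("\\s+")) {
                if (contextWords.contains(word)) {
                    score++;
                }
            }
            if (score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<String> cfg = List.of("control flow graph", "context free grammar", "configuration");
        System.out.println(choose(cfg, Set.of("parser", "grammar", "token"))); // prints context free grammar
        System.out.println(choose(cfg, Set.of("flow", "graph", "node")));      // prints control flow graph
    }
}
```

A fixed lookup table would return the same expansion in both calls, which is the maintenance and ambiguity problem the slide points at.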

What have we learned overall?
• Evaluation studies indicate that natural language analysis has far more potential to improve software maintenance tools than we initially believed
• Existing technology falls short
  – Synonyms, collocations, morphology, word frequencies, part-of-speech tagging, AOIG
• Keys to further success
  – Improve recall
  – Extract additional NL clues

What are we doing now?
• Emily
  – Developing a general software word usage model
  – Implementing SWUM
  – Evaluating for search
• Giri
  – Developing techniques for automatic comment generation
    • Which statements to include in summary content?
    • How to generate phrases for that content?
  – Providing feedback on SWUM refinement
• Divya
  – Analyzing specific Java statement structures
  – Developing templates for comment phrase generation

Overview: Research Projects
Foundation: Program Analysis, Compiler Technology
• Natural Language Analysis of Programs (Emily, Giri)
• Testing Web Applications (collaboration with Sara Sprenkle)
• Optimization of Cluster Parallel Programs (Antony)
• NEW… Natural Language Analysis of Parallel Programs
Spectrum: Software Tools … Testing … Compilers … Parallel Computing