Automated Source Code Changes Classification for Effective Code Review and Analysis
Evgeny G. Knyazev
Senior developer, «Transas Technologies»
Post-graduate student, SPb State University of Information Technologies, Fine Mechanics and Optics
Source Code Review
An informal read-through of source code aimed at finding different kinds of problems in it.
Source Code Review
Helps to:
  ◦ increase code quality
  ◦ find errors at early stages
  ◦ know all the code of a system
  ◦ keep an eye on novices' work
Source Control System and Code Changes Review
The source control system keeps the development history, which makes it possible to review only the changed code.
[Diagram: Developer, Change Request (Revision X), Source Control System, Review]
Code Change Review Example
Changes Review Task Complexity
In a large project, a lot of changes need to be reviewed.

Project        Size, LOC        Observation Period (~1 month)   Changes Count
TortoiseSVN    ~200 thousand    22.09.2007 - 22.10.2007         215
Navi-Manager   ~250 thousand    22.09.2007 - 22.10.2007         72
KDE            ~4.7 million     17.09.2007 - 14.10.2007         11841 (!)
The Solution
1. Split changes into classes:
  ◦ New Functionality Implementation
  ◦ Code Deletion
  ◦ Cosmetics
  ◦ Refactoring
  ◦ Bugfix
2. Choose a class for review
The Solution (2)
3. Automate changes classification
[Diagram: Source Control System, Automated Changes Classifier, Change Class, "Is this class interesting?", Yes, Review, Developer]
Known Code Changes Classification Methods
Classification by change comments
  ◦ “bug”, “fixed”: a bug fix
  ◦ “implement”, “feature”: new feature implementation
Refactoring search using change metrics
  ◦ Extract parent class (DIT > 0 and NOM < 0, …)
  ◦ Move to other class (DIT = 0 and NOM < 0, …)
  ◦ Split method (NOM < T, …)
Difference search in semantic graphs
  ◦ Build the code graph before and after the change
  ◦ Generate a transition script
  ◦ Search for refactoring templates
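The comment-based method can be illustrated with a minimal sketch. Only the four keywords come from the slide; the function name `classify_comment`, the class labels, the prefix matching and the "unknown" fallback are assumptions made for illustration.

```python
import re

# Keywords taken from the slide; the class labels are illustrative.
KEYWORDS = {
    "bugfix": ("bug", "fixed"),
    "new feature": ("implement", "feature"),
}


def classify_comment(comment: str) -> str:
    """Return the first class with a keyword that prefixes a word of the comment."""
    words = re.findall(r"[a-z]+", comment.lower())
    for change_class, keywords in KEYWORDS.items():
        if any(word.startswith(key) for word in words for key in keywords):
            return change_class
    # Assumption: comments without any keyword are left unclassified.
    return "unknown"
```

A comment such as "Deleted an extra commit command." contains none of the keywords and stays unclassified, which is one reason a comment-only approach may not be enough on its own.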
Changes Metrics Clustering Method: Learning Phase
1. Learning Set Preparation
2. Expert Classification of Learning Set
3. Change Metrics Calculation
4. Change Metrics Vectors Clustering
5. Mapping Clusters to Expert Classes
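Step 5 of the learning phase could look roughly like the sketch below. The slides do not say how clusters are mapped to expert classes, so the majority vote used here is an assumption, and `map_clusters_to_classes` is a hypothetical name.

```python
from collections import Counter, defaultdict


def map_clusters_to_classes(cluster_ids, expert_classes):
    """Map each cluster to the expert class most common among its changes.

    Assumption: a simple majority vote; the original method may differ.
    """
    votes = defaultdict(Counter)
    for cluster_id, expert_class in zip(cluster_ids, expert_classes):
        votes[cluster_id][expert_class] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}
```

Here `cluster_ids[i]` is the nearest cluster of the i-th change of the learning set and `expert_classes[i]` is the class the expert assigned to it.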
Fuzzy Change Metrics Clustering Algorithm
Changes Metrics Clustering Method: Changes Classification
1. Changes Metrics Calculation
2. Map Changes to Nearest Clusters of Learning Set
3. Computation of Change Class by Cluster-Class Mapping, Built During Learning
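Steps 2 and 3 can be sketched as a nearest-center lookup followed by a dictionary lookup. The Euclidean distance, the function names and the cluster centers in the usage example are assumptions; only the cluster-to-class mapping mirrors the one reported for the Navi-Manager learning set.

```python
import math


def nearest_cluster(delta, centers):
    """Return the id of the cluster whose center is closest to the change vector."""
    # Assumption: plain Euclidean distance over the metric deltas.
    return min(centers, key=lambda cid: math.dist(delta, centers[cid]))


def classify_change(delta, centers, cluster_to_class):
    """Classify a change through the cluster-class mapping built during learning."""
    return cluster_to_class[nearest_cluster(delta, centers)]


# Hypothetical cluster centers, given as (CC, CS, eLOC) deltas.
CENTERS = {1: (-2, 1, -5), 2: (0, 0, -1), 3: (4, 1, 40), 4: (0, 0, 1)}
CLUSTER_TO_CLASS = {
    1: "Refactoring",
    2: "Code Deletion",
    3: "New Functionality Implementation",
    4: "Bugfix",
}
```

With these made-up centers, `classify_change((0, 0, -2), CENTERS, CLUSTER_TO_CLASS)` returns "Code Deletion".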
Changes Metrics
Calculated as the difference between the metrics of two consecutive revisions:
  ◦ ∆M = M_r − M_{r−1}
CC: Cyclomatic Complexity (number of linearly independent paths in the execution graph)
CS: number of Classes/Structures
eLOC: Effective Lines of Code (without empty and comment lines)
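A minimal sketch of the change-metric calculation follows. The eLOC counter is deliberately simplified (it only recognises whole-line comments with a single prefix, assumed to be `//`), and both function names are hypothetical.

```python
def effective_loc(source: str, comment_prefix: str = "//") -> int:
    """Count lines that are neither blank nor whole-line comments (simplified eLOC)."""
    return sum(
        1
        for line in source.splitlines()
        if line.strip() and not line.strip().startswith(comment_prefix)
    )


def metrics_delta(current: dict, previous: dict) -> dict:
    """Compute delta M = M_r - M_(r-1) for every metric of the current revision."""
    # Assumption: a metric missing from the previous revision counts as 0.
    return {name: current[name] - previous.get(name, 0) for name in current}
```

For example, a revision that removes a single statement and changes nothing else yields a delta of 0 for CC and CS and -1 for eLOC.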
Metrics Calculation and Clustering of Changes from Navi-Manager Project

Revision  Nearest Cluster  Change Comment
16820     1                Vessel objects now merged in one transaction.
16833     2                Deleted an extra commit command.
17026     3                Full format of lat and lon during polling report.
17029     4                Set MessageSourceUpdateTime after processing of each change.
17038     2                Revert changes from r17029. There’s no need to update time after each message processed.
17107     3                Implementation of first version of vessel tracks loading from MonServer.

Metric changes (columns CC, IC, eLOC), in extraction order with the row alignment lost: -2 0 +4 0 0 +1 0 +4 0 0 +4 +12 -5 -1 +18 +1 -1 +89
Fuzzy Clusters of Revisions Table

Revision  Cluster 1  Cluster 2  Cluster 3  Cluster 4
16820     0.78       0.14       0.00       0.08
16833     0.02       0.79       0.00       0.21
17026     0.32       0.11       0.36       ?
17029     0.03       0.30       0.00       0.67
17038     0.02       0.79       0.00       0.20
17107     0.11, 0.68, 0.11 (one value and the column alignment are missing)

"?" marks a value that is missing from the source.
Method Learning Example
Project: Navi-Manager
Size of Learning Set: 29 changes
Number of Clusters: 4

Cluster  Expert Class
1        Refactoring
2        Code Deletion
3        New Functionality Implementation
4        Bugfix
Classification Example

Revision  Nearest Class  Nearest Cluster  Change Comment
16820     Refactor.      1                Vessel objects now merged in one transaction.
16833     Delete Func.   2                Deleted an extra commit command.
17026     New Feature    3                Full format of lat and lon during polling report.
17029     Bugfix         4                Set MessageSourceUpdateTime after processing of each change.
17038     Del. Func.     2                Revert changes from r17029. There’s no need to update time after each message processed.
17107     New Feature    3                Implementation of first version of vessel tracks loading from MonServer.
Classification Fuzziness
Change r16833 «Deleted an extra commit command» is classified as:
  ◦ 2% refactoring
  ◦ 79% code deletion
  ◦ 0% new functionality implementation
  ◦ 20% bugfix
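Membership degrees of this kind can be produced by fuzzy clustering. The slide with the actual algorithm did not survive extraction, so the sketch below assumes the standard fuzzy c-means membership formula with fuzzifier m = 2; the function name and parameters are hypothetical.

```python
import math


def fuzzy_memberships(point, centers, m=2.0):
    """Fuzzy c-means membership degrees of a point in each cluster (they sum to 1)."""
    distances = [math.dist(point, center) for center in centers]
    # A point that coincides with a center belongs fully to that cluster.
    if any(d == 0 for d in distances):
        hits = [1.0 if d == 0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1))
    return [
        1.0 / sum((d_i / d_j) ** exponent for d_j in distances)
        for d_i in distances
    ]
```

The nearest cluster is then the one with the largest membership degree, while the remaining degrees express how strongly the change also resembles the other classes.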
Code Changes Classification in Software Development Process
[Diagram: Project Manager, Dev Team Leader, Developer, Testing Team Leader, Source Code]
Changes Control During Important Development Phases
Deny potentially destabilizing classes of changes.

Change Class / Dev Phase            Main Dev Cycle   Stop Code   Code Freeze
New Functionality Implementation    +                –           –

Remaining rows: Code Deletion, Refactoring, Small Bugfixes, Critical Bugfixes. Their surviving marks, in extraction order with the cell alignment lost: + + – – – +
Request List of Changes by Class
For example: request the list of refactorings done in a specific version X.
[Diagram: Dev Team Leader, Request Refactorings in Version X, Automated Source Code Changes Classifier, List of Changes in Version X, Source Control System, List of Refactorings in Version X]
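Once every change carries a class, such a request reduces to a filter. The sketch below assumes a hypothetical `ClassifiedChange` record and a hypothetical `changes_of_class` helper; how the classifier stores its results is not described in the slides.

```python
from dataclasses import dataclass


@dataclass
class ClassifiedChange:
    revision: int
    change_class: str
    comment: str


def changes_of_class(changes, wanted_class):
    """Return the changes assigned to wanted_class, keeping their original order."""
    return [change for change in changes if change.change_class == wanted_class]


# Sample records in the spirit of the classification example (class labels illustrative).
CHANGES = [
    ClassifiedChange(16820, "refactoring", "Vessel objects now merged in one transaction."),
    ClassifiedChange(16833, "code deletion", "Deleted an extra commit command."),
    ClassifiedChange(17107, "new feature", "Implementation of first version of vessel tracks loading."),
]
```

Calling `changes_of_class(CHANGES, "refactoring")` returns only the record for revision 16820.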
Project Statistics Analysis
Navi-Manager Change Statistics
  ◦ Small bugfixes: 70%
  ◦ Small new features + refactoring: 22%
  ◦ Big new features: 6%
  ◦ Unlabeled slice: 2%
TortoiseSVN Change Statistics
  ◦ Bugfixes: 38%
  ◦ New functionality: 34%
  ◦ Refactoring: 28%
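Statistics like these can be derived from the list of assigned classes. A minimal sketch, assuming one class label per change; `class_shares` is a hypothetical helper, and the input in the usage note is made up rather than taken from either project.

```python
from collections import Counter


def class_shares(change_classes):
    """Return the share of each class in percent, rounded to whole numbers."""
    counts = Counter(change_classes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: round(100 * n / total) for cls, n in counts.items()}
```

For ten changes, seven of them bugfixes, two new features and one refactoring, the result is 70%, 20% and 10% respectively.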
Achieved Results on Navi-Manager Project
Effectiveness
  ◦ More than 50% of code review time saved
Development problems discovered
  ◦ Too many bugfixes compared to new feature implementations
Automated Changes Classification Tool
Works with Subversion
Largely independent of the programming language
Calculates CC, CS, eLOC metrics
Discovers change classes:
  ◦ New feature implementation
  ◦ Code deletion
  ◦ Refactoring
  ◦ Cosmetic changes
  ◦ Bugfixes*
Future Research
Method improvements
  ◦ Gustafson-Kessel clustering
  ◦ Use of object and coupling metrics
Refactorings classification
Broader application
  ◦ Continuous use in the development process
  ◦ Adaptability analysis for different types of projects
Thank you! Any questions?
evgeny.knyazev@gmail.com