P JF How Do Developers Utilize Source Code from Stack Overflow? (Empirical Software Engineering)
Yuhao Wu, Shaowei Wang, Cor-Paul Bezemer, Katsuro Inoue

Stack Overflow (S.O.) is a question and answer (Q&A) platform for developers
[Diagram labels: Ask, Answer, Ideas, Code snippets]
What are the barriers to code reuse and how can we improve?

Developers reuse source code from S.O. posts
[Figure labels: An answer on Stack Overflow; Source code in a GitHub project]
function generateID() { return "avalon" + Math.random().toString(36).substring(2, 15) }
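To make the reused snippet concrete, here is a minimal runnable sketch of it. The function body follows the slide; the two checks after it are added purely for illustration and are not part of the original answer or the GitHub project.

```javascript
// Snippet as shown on the slide: "avalon" plus a random base-36 suffix.
function generateID() {
  return "avalon" + Math.random().toString(36).substring(2, 15);
}

// Illustration only (not from the slide): inspect the shape of a generated ID.
const id = generateID();
console.log(id.startsWith("avalon")); // the fixed prefix is always present
console.log(id.length <= "avalon".length + 13); // substring(2, 15) keeps at most 13 characters
```

Because the suffix comes from Math.random(), the IDs are not guaranteed to be unique, which is the kind of detail a developer may need to check before reusing such a snippet.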

Structure of this study
  • Part I: Exploratory Study. How do developers reuse source code?
  • Part II: Survey. How do you think S.O. can be improved?
[Diagram label: Responses]

Part I: Exploratory study
  • Search searchcode.com for “stack overflow”: 4,878 source files
  • Filter: 289 files
  • Analyze code reuse
  • RQ 1: modification frequency and reason
  • RQ 2: origin of code reuse
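The collection step can be pictured as scanning source files for references to Stack Overflow. The sketch below is an assumption about how such a scan might look, not the study's actual tooling: the helper name `findStackOverflowLinks`, the URL pattern, and the sample post ID are all hypothetical.

```javascript
// Hypothetical pattern: matches question and answer links on stackoverflow.com.
// This is an illustrative assumption, not the filter used in the study.
const SO_LINK = /https?:\/\/(?:www\.)?stackoverflow\.com\/(?:questions|q|a)\/(\d+)[^\s"'<>)]*/gi;

// Return every Stack Overflow link found in a source file's text,
// together with the first numeric ID that appears in the URL.
function findStackOverflowLinks(sourceText) {
  const links = [];
  for (const match of sourceText.matchAll(SO_LINK)) {
    links.push({ url: match[0], postId: Number(match[1]) });
  }
  return links;
}

// Sample file content; the post ID 12345 is made up for the example.
const sample = [
  "// Adapted from https://stackoverflow.com/a/12345 (hypothetical post ID)",
  "function generateID() {",
  '  return "avalon" + Math.random().toString(36).substring(2, 15)',
  "}",
].join("\n");

console.log(findStackOverflowLinks(sample));
```

A file with no such link yields an empty list, so a scan like this only finds reuse that developers chose to document with a URL.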

53% of the files that contain a reference to S.O. reuse source code from S.O.
32% of the reused source code snippets require additional modification (C2+C3)
A summary of each answer might be helpful
[Chart values: 35%, 22%, 21%, 13%, 10%]

26% of the source code was reused from non-accepted answers
Developers adopt answers based on different needs: simplicity, correctness, efficiency, etc.
An advanced tagging system might be helpful
[Chart values: 43%, 26%, 30%, 1%]

Part II: Large-scale survey of 400+ participants
Survey questions:
  • Demography (7): experience, type of project participated in, ...
  • Barriers (10): difficulties in reusing source code, opinions on OSS licenses, ...
  • Suggestions (2): usefulness of proposed advanced tagging system, any other suggestions, ...

Three authors open-coded the responses
[Diagram: Answers -> Draft schema A and Draft schema B -> Discussion -> Unified schema -> Review -> Final schema]

Most participants of our study are experienced developers
[Charts: Distribution of programming experience; Distribution of type of projects the participants worked on. Values: 78%, 70%, 74%, 73%, 20%, 12%, 8%, 18%]

Developers reuse source code slightly more frequently than they reimplement it
[Chart values: 53%, 35%, 33%, 29%, 25%, 1%, 1%, 5%, 2%]

Improving code comprehension and code quality may have positive effects on code reuse
[Chart values: 65%, 44%, 32%, 17%, 7%]

75% of the participants do not have a good understanding of the license terms of the Q&A platform
[Chart values: 31%, 33%, 21%, 12%, 4%]
* S.O. adopts the CC BY-SA 3.0 license, which is viral and requires attribution
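Since the license requires attribution, one way a developer might record it is a short comment block placed above the reused code. The sketch below is only an illustration: the helper `attributionComment`, the chosen fields, the URL, and the author name are hypothetical, and the result is not a statement of what the license legally requires.

```javascript
// Hypothetical helper: build a comment block that records where reused code came from.
function attributionComment({ url, author, license }) {
  return [
    `// Source: ${url}`,
    `// Author: ${author}`,
    `// License: ${license}`,
  ].join("\n");
}

// All values below are placeholders made up for the example.
const header = attributionComment({
  url: "https://stackoverflow.com/a/12345",
  author: "example-user",
  license: "CC BY-SA 3.0",
});

console.log(header);
```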

Most suggestions are on code quality
[Chart values: 35%, 24%, 13%, 12%, 10%, 7%]

Suggestions: improving code quality (35% of all categories)
  • Integrated validator (42.2%): “An inbuilt REPL environment for as many languages/environments as possible.”
  • Outdated code (29.7%): “Make date important in marking outdated code, deprecate those snippets via the community”
  • Answer quality (17.2%): “Better support for answers that are [...] and good, but out of date.”
  • Code review (10.9%): “In-browser code review and commenting similar to that provided by commercial code review tools.”

Suggestions: information enhancement & management (24%)
  • Answer tagging (37.2%): “Provide/require tagging of the version number(s) of the language [...]”
  • Code evolution (14.0%): “where does the code come from and copied to, and also the revisions inside the platform”
  • Resource linking (11.6%): “Books suggestions based on questions.”
  • Answer writing (9.3%): “[...] it would be nice if it would be easier to ask a good question [...]”

Suggestions: clearer and more permissive license (13%)
  • Clearer license (69.6%): “By far the most important requirement is clear licensing. Much of the code provided on such platforms is not currently usable because the license is unclear.”
  • Permissive license (30.4%): “Let the user choose a more re-user-friendly license (e.g. copy without reference).”

Suggestions: better data organization (12%)
  • Code searching/indexing (47.6%): “Source Code indexing for easier retrieval. It could also give the possibility to find example of usage functions.”
  • Duplicate posts (38.2%): “Auto-suggest similar questions, particularly for questions that don't have answers.”
  • Comments (14.3%): “Code in *comments* must be expressed better, than on Stack Overflow.”

Suggestions: human factor (10%)
  • Better curator (63.2%): “Definitely curators for specific languages to rate in specific areas.”
  • Gamification related (36.8%): “Base reputation on number of answers up-voted by others, not on personal activity.”
