Cost Estimation
CIS 375
Bruce R. Maxim
UM-Dearborn

Types of Cost Models
• Experiential
  – derived from past experience
• Static
  – derived using “regression” techniques
  – doesn’t change with time
• Dynamic
  – derived using “regression” techniques
  – often includes the effects of time

Expert Guessing
• A = the most pessimistic estimate
• B = the most likely estimate
• C = the most optimistic estimate
• $\hat{E} = (A + 4B + C) / 6$ (weighted average, where $\hat{E}$ is the estimate)
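The weighted average above can be sketched in a few lines of Python. The function name and the sample figures (in person-months) are hypothetical and chosen only for illustration.

```python
def three_point_estimate(pessimistic: float, most_likely: float, optimistic: float) -> float:
    """Weighted average (A + 4B + C) / 6 of three expert guesses."""
    return (pessimistic + 4.0 * most_likely + optimistic) / 6.0


# Hypothetical guesses in person-months (not from the slides).
estimate = three_point_estimate(pessimistic=20.0, most_likely=12.0, optimistic=8.0)
print(f"Estimated effort: {estimate:.2f} person-months")
```

The most likely guess carries four times the weight of either extreme, so a single wild pessimistic or optimistic guess moves the estimate only modestly.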

Delphi Technique
1. A group of experts make "secret" guesses.
2. The "secret" guesses are used to compute a group average.
3. The group average is presented to the group.
4. The group once again makes "secret" guesses.
5. The individual guesses are again averaged.
6. If the new average differs from the previous one, go to step 4.
7. Otherwise, $\hat{E}$ = the new average.
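The Delphi loop can be sketched as a small simulation. Everything here beyond the seven steps is an assumption: the `revise` callback stands in for a human expert, `move_halfway` is a hypothetical behavior, the exact "is different" test is replaced by a tolerance, and `max_rounds` is a safety limit the slides do not mention.

```python
from statistics import mean
from typing import Callable, List


def delphi(first_round: List[float],
           revise: Callable[[float, float], float],
           tolerance: float = 0.5,
           max_rounds: int = 20) -> float:
    """Repeat secret guessing until the group average stops changing."""
    guesses = list(first_round)
    average = mean(guesses)                               # steps 1-3
    for _ in range(max_rounds):
        guesses = [revise(g, average) for g in guesses]   # step 4
        new_average = mean(guesses)                       # step 5
        if abs(new_average - average) <= tolerance:       # steps 6-7
            return new_average
        average = new_average
    return average  # assumption: give up after max_rounds


def move_halfway(guess: float, group_average: float) -> float:
    """Hypothetical expert who shifts halfway toward the announced average."""
    return (guess + group_average) / 2.0


print(delphi([10.0, 14.0, 30.0], move_halfway))
```

A tolerance is used because real averages rarely repeat exactly; how small it should be is a judgment call for the group running the exercise.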

Wolverton Model - 1
Uses a software type matrix whose column headings come from the cross product {old, new} × {easy, moderate, hard}. For example:

Type      OE   OM   OH   NE   NM   NH
Control   21   27   30   33   40   49
I/O       ...

Wolverton Model - 2
• Estimate modules in terms of LOC:
  $C(k) = S_s(k) \cdot C_{i,j}(k)$
  cost of module $k$ = (size of module $k$) × (matrix cost entry for modules like $k$)
• System cost $= \sum_{k=1}^{n} C(k)$, where $k$ = 1 to $n$
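As a rough sketch of the calculation, the code below multiplies each module's size by its matrix entry and sums the results. Only the Control row comes from the slides; the module sizes and the function names are hypothetical, and the cost unit is whatever the matrix entries are expressed in.

```python
# Cost-matrix row copied from the slide (Control software only).
COST_MATRIX = {
    "Control": {"OE": 21, "OM": 27, "OH": 30, "NE": 33, "NM": 40, "NH": 49},
}


def module_cost(size_loc: int, software_type: str, category: str) -> int:
    """C(k) = size of module k * matrix entry for modules like k."""
    return size_loc * COST_MATRIX[software_type][category]


def system_cost(modules) -> int:
    """System cost = sum of C(k) over all modules."""
    return sum(module_cost(size, kind, cat) for size, kind, cat in modules)


# Hypothetical modules: (size in LOC, software type, old/new + difficulty).
modules = [(500, "Control", "OE"), (1200, "Control", "NH")]
print(system_cost(modules))
```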

Problems with Expert Judgement
• It is subjective (consensus is difficult to achieve).
• Extrapolating from one project to another may be difficult.
• Users and project managers tend not to estimate costs very well.
• Cost matrices require periodic updates.

Function Points

Parameter                Simple     Average    Complex     F_i
Distinct input items     3( )   +   4( )   +   6( )    =   ?
Output screens/reports   4( )   +   5( )   +   7( )    =   ?
Types of user queries    3( )   +   4( )   +   6( )    =   ?
Number of files          7( )   +   10( )  +   15( )   =   ?
External interfaces      5( )   +   7( )   +   10( )   =   ?
Total                                                  =   ?

Function Point Equation
$FP = T \cdot (0.65 + 0.01 \cdot Q)$
• $T$ = unadjusted table total
• $Q$ = score from questionnaire (14 items with values 0 to 5)
• Cost of producing one function point? May be organization specific.

Function Point Questionnaire
1. Backup.
2. Data communication.
3. Distributed processes.
4. Optimal performance.
5. Heavily used operating system.
6. On-line data security.
7. Multiple screens.
8. On-line master file update.
9. Complex inputs, queries, outputs.
10. Complex internal processing.
11. Reusable code.
12. Conversion or installation.
13. Multiple user sites.
14. Ease of use.
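The weight table, the adjustment equation, and the 14-item questionnaire can be combined in a short sketch. The weights come from the table; the dictionary keys, function names, and sample counts and answers are hypothetical.

```python
# (simple, average, complex) weights from the function point table.
WEIGHTS = {
    "inputs": (3, 4, 6),
    "outputs": (4, 5, 7),
    "queries": (3, 4, 6),
    "files": (7, 10, 15),
    "interfaces": (5, 7, 10),
}


def unadjusted_total(counts) -> int:
    """T: weighted sum of the simple/average/complex counts per parameter."""
    total = 0
    for parameter, item_counts in counts.items():
        total += sum(w * n for w, n in zip(WEIGHTS[parameter], item_counts))
    return total


def function_points(counts, answers) -> float:
    """FP = T * (0.65 + 0.01 * Q), with Q the sum of 14 answers in 0..5."""
    if len(answers) != 14 or any(not 0 <= a <= 5 for a in answers):
        raise ValueError("expected 14 questionnaire answers, each from 0 to 5")
    return unadjusted_total(counts) * (0.65 + 0.01 * sum(answers))


# Hypothetical project: counts are (simple, average, complex) per parameter.
counts = {
    "inputs": (5, 3, 1),
    "outputs": (2, 4, 0),
    "queries": (3, 0, 1),
    "files": (1, 1, 0),
    "interfaces": (0, 2, 0),
}
print(function_points(counts, [3] * 14))
```

Since Q ranges from 0 to 70, the adjustment factor ranges from 0.65 to 1.35, so the questionnaire can move the unadjusted total by at most 35% in either direction.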

Static Linear Models
• Often built using regression analysis
$\mathrm{Effort} = c_0 + \sum_i c_i x_i$
• $c_i$ = regression coefficient
• $x_i$ = product or process attribute

Static Non-Linear Models
Examples:
$\mathrm{Effort} = c_0 + \sum_i c_i x_i^{d_i}$, where $c_i$ and $d_i$ are non-linear regression constants
or
$\mathrm{Effort} = (a + b S^c) \cdot m(X)$, where $S$ is size in KLOC and $a$, $b$, and $c$ are regression constants
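Both non-linear forms are easy to evaluate once the regression constants are known. The sketch below assumes the constants have already been fitted; every number in the usage lines is hypothetical. Setting every exponent to 1 in the first function gives the linear model as a special case.

```python
def attribute_model(c0, coefficients, attributes, exponents):
    """Effort = c0 + sum of c_i * x_i ** d_i."""
    return c0 + sum(c * x ** d for c, x, d in zip(coefficients, attributes, exponents))


def size_model(a, b, c, size_kloc, multiplier=1.0):
    """Effort = (a + b * S ** c) * m(X), with S in KLOC."""
    return (a + b * size_kloc ** c) * multiplier


# Hypothetical constants, for illustration only.
print(attribute_model(1.0, [2.0], [3.0], [2.0]))
print(size_model(1.0, 2.0, 0.5, size_kloc=4.0, multiplier=1.5))
```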

Halstead’s Software Science
Assumptions
• Complete algorithm specification exists
• Programmer works alone
• Programmer knows what to do
Based on
• $N$ = program length (total # of operators and operands)
• $n$ = vocabulary (# of unique operators and operands)

Halstead Equations
• Effort: $E = N^2 \log_2(n) / 4$
• To compute $N$: $N = k \cdot S_s$
  – $k$ = average # operators per LOC
  – $k$ is language specific
• To compute $n$: solve $N = n \log_2(n / 2)$
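The equation for n has no closed-form solution, so it has to be solved numerically. The sketch below uses bisection, which is one reasonable choice and not something the slides prescribe. It reads N as the program length and n as the vocabulary size, which is an assumption about the notation; the function names and sample inputs are hypothetical.

```python
import math


def program_length(size_loc: float, k: float) -> float:
    """N = k * S_s, where k is a language-specific constant."""
    return k * size_loc


def vocabulary(length: float, tol: float = 1e-9) -> float:
    """Solve N = n * log2(n / 2) for n by bisection."""
    if length <= 0:
        raise ValueError("length must be positive")
    lo, hi = 2.0, max(4.0, float(length))  # f(lo) < 0 <= f(hi)
    while hi - lo > tol * hi:
        mid = (lo + hi) / 2.0
        if mid * math.log2(mid / 2.0) < length:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def halstead_effort(length: float, n: float) -> float:
    """E = N**2 * log2(n) / 4."""
    return length ** 2 * math.log2(n) / 4.0


# Hypothetical module: 200 LOC and k = 7.5 (an assumed value).
N = program_length(200, 7.5)
n = vocabulary(N)
print(N, n, halstead_effort(N, n))
```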

Walston and Felix Model
• $E = 5.25 \cdot S^{0.91}$
• Composite productivity factor: $p = \sum_i w_i x_i$
• $L$ = LOC per person-month $= f(p)$
• $E = S / L$
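A minimal sketch of the effort equation above, using the constants as they appear on the slide. It assumes S is in thousands of lines of code and E is in person-months; the slide does not state units, and the function names are hypothetical.

```python
def effort_from_size(size_kloc: float, a: float = 5.25, b: float = 0.91) -> float:
    """E = a * S ** b, with the slide's constants as defaults."""
    return a * size_kloc ** b


def implied_productivity(size_kloc: float) -> float:
    """L = S / E, rearranged from E = S / L (LOC per person-month)."""
    return size_kloc * 1000.0 / effort_from_size(size_kloc)


for kloc in (1.0, 10.0, 100.0):
    print(kloc, round(effort_from_size(kloc), 1), round(implied_productivity(kloc), 1))
```

Because the exponent is below 1, this model predicts that productivity rises slightly as projects grow, which is the opposite of what the COCOMO exponents imply.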

Bailey and Basili Model
$E' = 5.5 + 0.73 \cdot S^{1.6}$
$R = E / E'$ = actual effort / predicted effort
Adjusted effort is
$ER_{adj} = R - 1$ if $R \ge 1$
$ER_{adj} = 1 - 1/R$ if $R < 1$
$E_{adj} = (1 + ER_{adj}) \cdot E$ if $R \ge 1$
$E_{adj} = E / (1 + ER_{adj})$ if $R < 1$
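The baseline prediction and the error ratio can be sketched as follows. The constants are used exactly as printed on the slide, S is assumed to be in KLOC, and the function names and the sample project are hypothetical. The final adjustment step is left out of this sketch.

```python
def baseline_effort(size_kloc: float) -> float:
    """E' = 5.5 + 0.73 * S ** 1.6 (constants as printed on the slide)."""
    return 5.5 + 0.73 * size_kloc ** 1.6


def effort_ratio(actual: float, predicted: float) -> float:
    """R = actual effort / predicted effort."""
    return actual / predicted


def adjusted_error(r: float) -> float:
    """ER_adj = R - 1 if R >= 1, else 1 - 1/R."""
    return r - 1.0 if r >= 1.0 else 1.0 - 1.0 / r


# Hypothetical project: 10 KLOC that actually took 40 units of effort.
predicted = baseline_effort(10.0)
r = effort_ratio(40.0, predicted)
print(predicted, r, adjusted_error(r))
```

The two branches make the error measure symmetric: a project that took twice the predicted effort scores +1, and one that took half scores -1.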

COCOMO - I
• Model: $E = a S^b \cdot m(x)$

                Basic           Intermediate
Mode            a       b       a       b
Organic         2.4     1.05    3.2     1.05
Semidetached    3.0     1.12    3.0     1.12
Embedded        3.6     1.20    2.8     1.20

Basic COCOMO
• Computes software development effort (and cost) as a function of program size, expressed in estimated lines of code.
• $m(x) = 1$
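With m(x) = 1 the basic model reduces to a single power law per mode. The sketch below uses the basic-mode constants from the COCOMO - I table and assumes size in KLOC and effort in person-months; the function name and the 32 KLOC example are hypothetical.

```python
# (a, b) pairs for the basic model, from the COCOMO - I table.
BASIC = {
    "organic": (2.4, 1.05),
    "semidetached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo(size_kloc: float, mode: str) -> float:
    """E = a * S ** b, with m(x) = 1."""
    a, b = BASIC[mode]
    return a * size_kloc ** b


for mode in BASIC:
    print(mode, round(basic_cocomo(32.0, mode), 1))
```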

Intermediate COCOMO
• Computes software development effort as a function of program size and a set of "cost drivers" that include subjective assessments of product, hardware, personnel, and project attributes.
• $m(x) = \prod_i m(x_i)$
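In the intermediate model the nominal estimate is scaled by the product of the cost-driver multipliers. This sketch uses the intermediate constants from the COCOMO - I table; the semidetached row lists a single pair on the slide, so that pair is reused here. The multiplier values in the usage line are hypothetical and are not taken from any published rating table.

```python
import math

# (a, b) pairs for the intermediate model.
INTERMEDIATE = {
    "organic": (3.2, 1.05),
    "semidetached": (3.0, 1.12),  # assumption: same pair as the basic model
    "embedded": (2.8, 1.20),
}


def intermediate_cocomo(size_kloc: float, mode: str, multipliers) -> float:
    """E = a * S ** b * product of the cost-driver multipliers m(x_i)."""
    a, b = INTERMEDIATE[mode]
    return a * size_kloc ** b * math.prod(multipliers)


# Hypothetical ratings: one driver raises effort, another lowers it.
print(round(intermediate_cocomo(32.0, "organic", [1.15, 0.91]), 1))
```

A multiplier of 1.0 means a nominal rating, so an empty list of multipliers leaves the nominal estimate unchanged.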

Detailed COCOMO
• Includes all characteristics of the intermediate version, with an assessment of each cost driver’s impact on each step (analysis, design, etc.) of the software engineering process
• $m(x)$ based on a similar questionnaire

Static Model Problems
• Existing models rely at least in part on expert judgment
• Most static estimates require an estimate of product size in lines of code (LOC)
• Not clear which cost factors are significant in all development environments

Dynamic Models
• It is helpful to know when effort will be required on a project as well as how much total effort is required
• Most models are time or phase sensitive in their effort computations

Putnam Model
Based on the Rayleigh curve, which is skewed, so its median and mean are offset from one another

Putnam Model Details
• Volume of work
• Difficulty gradient for measuring complexity
• Project technology factor measuring staff experience
• Delivery time constraints
• Staffing model based on
  – Total cumulative staff
  – How quickly new staff can be absorbed
  – # days in project month

Putnam Equations
• $E = y(T) = 0.3945 \cdot K$
• $K$ = area under curve over $[0, \infty)$, measured in programmer-years
• $T$ = optimal development time in years
• $D = K / T^2$ (difficulty)
• $P = c_i \cdot D^{-2/3}$ (productivity)
• $S = c \cdot K^{1/3} \cdot T^{4/3}$ (lines of code)
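Given a size estimate and a target schedule, the equations can be inverted to find the total effort K. The sketch assumes the size equation has the form S = c * K^(1/3) * T^(4/3); the constant c and all sample figures below are hypothetical.

```python
def total_effort(size_loc: float, c: float, t_years: float) -> float:
    """Solve S = c * K**(1/3) * T**(4/3) for K (programmer-years)."""
    return (size_loc / (c * t_years ** (4.0 / 3.0))) ** 3


def difficulty(k: float, t_years: float) -> float:
    """D = K / T**2."""
    return k / t_years ** 2


def development_effort(k: float) -> float:
    """E = y(T) = 0.3945 * K, the effort spent up to time T."""
    return 0.3945 * k


# Hypothetical project: 100,000 LOC, c = 5000, two-year schedule.
K = total_effort(100_000, 5000.0, 2.0)
print(K, difficulty(K, 2.0), development_effort(K))
```

Because K varies with the inverse fourth power of T in this form, even a small cut in the schedule raises the required effort sharply.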

Parr Model

Parr Equation
• Putnam variation
• Staff may already be familiar with project tools, methods, and requirements
$\mathrm{Staff}(t) = \frac{1}{4}\,\mathrm{sech}^2\!\left(\frac{a t + c}{2}\right)$
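The staffing curve can be tabulated directly, reading the equation as one quarter of sech squared of (a*t + c)/2. That grouping is an assumption about the slide's notation, and the values of a and c below are hypothetical.

```python
import math


def parr_staff(t: float, a: float, c: float) -> float:
    """Staff(t) = sech((a*t + c) / 2) ** 2 / 4, where sech(x) = 1 / cosh(x)."""
    return 0.25 / math.cosh((a * t + c) / 2.0) ** 2


# Hypothetical shape parameters; the peak falls where a*t + c = 0, i.e. t = 2.
a, c = 1.0, -2.0
for t in range(0, 7):
    print(t, round(parr_staff(t, a, c), 4))
```

Unlike a Rayleigh curve, this curve is already above zero at t = 0, which matches the idea that some staff are familiar with the project before it formally starts.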

Jensen Model
• Putnam variation
• Less sensitive to schedule compression than Putnam
$S = C_{te} \cdot T \cdot K^{1/2}$
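The claim about schedule compression can be checked by solving each size equation for K with S held fixed. The derivation below assumes the Putnam size equation has the form S = c K^{1/3} T^{4/3}.

```latex
\begin{align*}
\text{Putnam:}\quad S = c\,K^{1/3}\,T^{4/3}
  &\;\Rightarrow\; K = \left(\frac{S}{c}\right)^{3} T^{-4} \\
\text{Jensen:}\quad S = C_{te}\,T\,K^{1/2}
  &\;\Rightarrow\; K = \left(\frac{S}{C_{te}}\right)^{2} T^{-2}
\end{align*}
% Shortening the schedule by 10% (T -> 0.9 T):
%   Putnam: K grows by a factor of 0.9^{-4}, about 1.52
%   Jensen: K grows by a factor of 0.9^{-2}, about 1.23
```

Under these assumptions the same 10% compression costs roughly 52% more effort in the Putnam model but only about 23% more in the Jensen model.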

Cooperative Programming Model
• Includes size of project team in estimate as well as code size
$E = E_1(S) + E_2(M)$
• $S$ = code size
• $M$ = average # of team members
• $E_1(S) = a + b \cdot S$ (effort of a single team member)
• $E_2(M) = c \cdot M^d$ (effort required for coordination with other members)
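A minimal sketch of the two-term model, assuming the constants a, b, c, and d have already been fitted to past projects. All values in the usage lines are hypothetical.

```python
def cooperative_effort(size: float, team_size: float,
                       a: float, b: float, c: float, d: float) -> float:
    """E = E1(S) + E2(M) = (a + b * S) + c * M ** d."""
    individual = a + b * size           # E1(S): effort of a single member
    coordination = c * team_size ** d   # E2(M): cost of working together
    return individual + coordination


# Hypothetical constants: coordination cost grows with the square of team size.
for members in (2, 4, 8):
    print(members, cooperative_effort(10.0, members, a=1.0, b=2.0, c=0.5, d=2.0))
```

When d is greater than 1, each added team member costs more coordination effort than the one before.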

Dynamic Model Problems
• Still rely on expert judgment
• Not clear that all project costs have been accommodated here either