Software Testing Metrics

on 20120109

- Based on Open Tech Talk on 29/12/2011.

Agenda
• What and Why
• Metrics classification
• Metrics life cycle
• Some crucial Software Testing metrics
• Do’s & Don’ts

Can You Answer these Questions about Your Software Product?
• How large is the product?
• What is the test coverage?
• What is the automation reliability?
• When can you hit ZBB?
• How many bugs were found before it was released?
• How many bugs did the customers find in UAT?
• How fast are fixes to customer reported problems made?
• What is the quality of fixes delivered?

What is a Metric?
• A metric is a quantitative measure of the degree to which a system, component, or process possesses a given attribute.
Why measure software?
Computing metrics:
• Improves project planning.
• Helps us understand whether we have achieved the desired quality.
• Helps improve the process being followed.
• Helps in analyzing the associated risks.
• Analyzing metrics in every phase of testing improves defect removal efficiency.

Metrics classification
• Product Metrics
Product metric – a metric used to measure a characteristic of any product of the software development process. Product metrics are used to:
Assess the state of the project
Track potential risks
Uncover problem areas
Adjust workflow or tasks
Evaluate the team's ability to control quality

• Process Metrics
Process metric – a metric used to measure characteristics of the methods, techniques, and tools employed in developing, implementing, and maintaining a software system. Process metrics:
Give insight into the process paradigm, software engineering tasks, work products, and milestones
Lead to long-term process improvement

Few Product Metrics
• Number and type of defects found during requirements, design, code, and test reviews
• Number of pages of documentation delivered
• Number of source lines of code delivered
• Total number of bugs found as a result of system testing
• No. of test cases passed, failed and blocked
• Defects by Action Taken/Resolution
• Defects by Type
• Automation coverage
• Code coverage

Few Process Metrics
• Average find-fix cycle time
• Number of person-hours per review
• Average number of defects found per review
• Average amount of rework time
• Average number of bugs per tester per day
• The ratio of Severity 1 bugs to Severity 4 bugs
• No. of test cases planned vs. ready for execution
• Total time spent on test design vs. estimated time
• No. of test cases executed vs. test cases planned
• Defect effectiveness, defect quality
• Avg. time taken for automation suite execution

Metrics Life cycle



Crucial Metrics used in Software Testing

• Requirement Volatility
Number of requirements agreed vs. number of requirements changed.
(Number of Requirements Added + Deleted + Modified) * 100 / Number of Original Requirements

Example: the VSS 1.3 release initially had 67 requirements; later, 7 new requirements were added, 3 were removed from the initial set, and 11 were modified.

So, requirement volatility is
(7 + 3 + 11) * 100 / 67 = 31.34%
That is, almost one third of the requirements changed after they were initially identified.
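As a quick illustration (not from the original talk), the same calculation as a small Python sketch, using the figures from the example above:

def requirement_volatility(added, deleted, modified, original):
    # Percentage of the original requirements that were added to, deleted, or modified
    return (added + deleted + modified) * 100 / original

# VSS 1.3 example: 7 added, 3 deleted, 11 modified against 67 original requirements
print(round(requirement_volatility(7, 3, 11, 67), 2))  # 31.34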

• Test coverage on Functionality
Total number of requirements vs. number of requirements covered by test scripts.
(No. of requirements covered / total number of requirements) * 100

Example: in total, 46 requirements were estimated; 39 requirements were tested and 7 were blocked. What is the coverage?

So, test coverage on functionality is
39 / 46 * 100 = 84.78%
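A minimal Python sketch of the same ratio, with the example's numbers:

def functional_coverage(requirements_covered, requirements_total):
    # Percentage of requirements that are covered by executed test scripts
    return requirements_covered * 100 / requirements_total

print(round(functional_coverage(39, 46), 2))  # 84.78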

• Code Coverage
No. of code blocks or functions covered by testing vs. total no. of code blocks or functions developed

We can use tools like Magellan to instrument the code and get code coverage as part of a test pass or automation suite execution.

This is a good indication of the stability of the product, and it is also useful for identifying gaps in the testing done.
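Tooling aside, the ratio itself is straightforward; here is a sketch with invented block counts, purely for illustration:

def code_coverage(blocks_covered, blocks_total):
    # Percentage of code blocks (or functions) exercised during the test pass
    return blocks_covered * 100 / blocks_total

print(round(code_coverage(8420, 10150), 2))  # 82.96 for these made-up counts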

• Automation Reliability Index

The reliability index indicates how reliable or stable the automation scripts are.
(No. of scripts passed / [No. of scripts passed + No. failed due to automation code issues]) * 100
Automation pass % is the percentage of scripts passed vs. the total number of scripts executed.
Failed cases could be due to product issues, environmental issues, or automation code issues.
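A small sketch of both numbers, assuming the failures have already been classified (the counts are illustrative only):

def automation_reliability_index(passed, failed_code_issues):
    # Failures caused by product or environment issues are excluded from the denominator
    return passed * 100 / (passed + failed_code_issues)

def automation_pass_rate(passed, executed):
    # Plain pass percentage over everything that was executed
    return passed * 100 / executed

# e.g. 950 passed, 30 failed due to automation code, 20 due to product or environment issues
print(round(automation_reliability_index(950, 30), 2))  # 96.94
print(round(automation_pass_rate(950, 1000), 2))        # 95.0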

• Test Case defect density
Number of test scripts that uncovered defects (failed) vs. the total number of test scripts executed.
(Defective Test Scripts / Total Test Scripts) * 100

Example: total test scripts developed 1360, executed 1280, passed 1065, failed 215
So, test case defect density is
215 * 100 / 1280 = 16.8%

This 16.8% value is also called test case efficiency, which depends on the number of test cases that uncovered defects.
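The same computation as a one-line Python function, using the figures above:

def test_case_defect_density(failed_scripts, executed_scripts):
    # Percentage of executed test scripts that uncovered a defect
    return failed_scripts * 100 / executed_scripts

print(round(test_case_defect_density(215, 1280), 1))  # 16.8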

• Defect Density
Total number of valid defects vs. size of the product
(No. of defects / KLOC or estimation in function points) * 100

Example: 38 defects were logged in total, and the product has 200 KLOC of code
So, defect density is
38 * 100 / 200 = 19%
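And the corresponding sketch for defect density, matching the example:

def defect_density(valid_defects, size_kloc):
    # Defects per KLOC, expressed as a percentage as in the formula above
    return valid_defects * 100 / size_kloc

print(defect_density(38, 200))  # 19.0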


• Bug Convergence
Bug convergence is the point at which the number of bugs fixed (per reporting period) exceeds the number of bugs reported; from that point on, the count of active bugs trends downward.
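One way to spot the convergence point from per-period counts; the weekly numbers below are invented for illustration:

# bugs reported and bugs fixed per week (invented data)
reported = [40, 35, 30, 22, 15, 9]
fixed    = [20, 25, 28, 26, 24, 18]

convergence_week = next(
    (week for week, (rep, fix) in enumerate(zip(reported, fixed), start=1) if fix > rep),
    None,
)
print(convergence_week)  # 4: the first week in which fixes outpace new reports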


• Defect Find Rate
This is one of the most useful derived metrics, both for measuring the cost of testing and for assessing the stability of the system; it is typically expressed as the number of defects found per unit of testing time or effort.
It can give a good indication of the stability of the system being tested: a falling find rate over successive test passes suggests the system is stabilizing.
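Assuming the rate is measured as defects found per unit of testing effort (a common convention, not spelled out in the original), a trivial sketch:

def defect_find_rate(defects_found, testing_hours):
    # Defects found per hour of testing effort
    return defects_found / testing_hours

print(round(defect_find_rate(48, 160), 2))  # 0.3 defects per tester-hour, for made-up figures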


• Review Efficiency
Review efficiency is a metric that offers insight into the quality of reviews relative to testing.

Review efficiency = 100 * (Total number of defects found by reviews) / (Total number of project defects)

Example: a project found 269 defects in its various reviews, which were fixed, and the test team reported 476 valid defects.

So, review efficiency is [269 / (269 + 476)] * 100 = 36.1%

This is also called static testing efficiency.
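The review efficiency calculation in Python, with the example's numbers:

def review_efficiency(review_defects, test_defects):
    # Share of all project defects that were caught by (static) reviews
    return review_defects * 100 / (review_defects + test_defects)

print(round(review_efficiency(269, 476), 1))  # 36.1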

• Defect Slippage Ratio
Number of defects slipped (reported from production) vs. number of valid defects reported during test execution.
Number of Defects Slipped / (Number of Defects Raised - Number of Defects Withdrawn)

Example: defects filed by the customer or found in UAT are 21, total defects found during testing are 267, and 17 of the defects were invalid.

So, slippage ratio is
[21 / (267 - 17)] * 100 = 8.4%
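The same slippage calculation as a small function, using the example's figures:

def defect_slippage_ratio(slipped, raised, withdrawn):
    # Defects that escaped to UAT/production as a share of valid defects raised in test
    return slipped * 100 / (raised - withdrawn)

print(round(defect_slippage_ratio(21, 267, 17), 1))  # 8.4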


• Defect Removal Effectiveness
Number of defects removed during the development phase vs. the total number of defects latent in the product.

DRE = (Defects removed during the development phase / Defects latent in the product) * 100%

Defects latent in the product = Defects removed during the development phase + Defects found later by the user

Example: defects filed by the customer or found in UAT are 21, total defects found during testing are 267, and 17 of the defects were invalid.

So, defect removal effectiveness is
[(267 - 17) / (267 - 17 + 21)] * 100 = 92.25%
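And a sketch of DRE built directly from the two formulas above:

def defect_removal_effectiveness(removed_in_development, found_later_by_user):
    # Latent defects = those removed during development + those that escaped to the user
    latent = removed_in_development + found_later_by_user
    return removed_in_development * 100 / latent

print(round(defect_removal_effectiveness(267 - 17, 21), 2))  # 92.25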

Using Metrics to Support Investment in Software Testing

on 20090722

In the current economic environment, one would think that we would be using every means available to secure investment in software testing. However, at a recent industry forum it became apparent that the level of maturity in using metrics is somewhat limited.

Of the participants, many admitted to using defect metrics, but they rarely used other metrics such as time and budget allocation, effort, coverage, and test outputs. It is somewhat disappointing to have such a poor measure of the work that we undertake. It leaves us vulnerable to cutbacks, as other managers who may be more verbally gifted may take away our budget. Metrics provide a useful way of demonstrating our benefit.

The meeting went on to review the kinds of metrics that testers use. The following list provides some suggested metrics:

• Defects
Type
Cost
Age
Status
proportion found in each stage

• Coverage
automated vs manual
planned vs executed
requirements
code

• Effort, time, cost, schedule

• Test outputs
tests designed
tests executed

• Test inputs
size
complexity
developer hours
project budget

• Risk

Some comments were made that metrics can be broadly grouped into two categories:
• Efficiency - can we do more testing using less effort?
• Effectiveness - can we achieve a better outcome from our testing effort?
While the discussion focused on supporting the business case for software testing in tighter economic times, it is important to note the different uses of metrics. Metrics are used by test managers for the following:
• Managing Progress - producing estimates, how complete is testing, how much more to go, ...
• Quality Decisions - is the product good, bad or indifferent, are we ready for release, ...
• Process Improvement - what are the areas that future improvement should target, how do we know our process has improved, ...
• Business Value - what benefit has testing produced for the business in terms of reduced costs, increased revenue, reduced risk, ...

The choice of which metrics to use can be daunting. It is not a good idea to collect everything, as you will be overwhelmed when deciding which data to use to make a management recommendation.

There was a lot of discussion that metrics assessing productivity can be dangerous. My personal view is that we should use metrics to influence productivity, but the following points need to be kept in mind:

• Productivity or performance measures do change behaviour. People will start aligning their behaviour towards meeting the metric target, so a poorly chosen metric or target could create behaviour that you never intended.
• One metric won't tell the whole story. Measuring productivity in terms of tests completed per hour may mean that people run poor tests that are simply quick to run. You may need several metrics to get a better perspective, for instance collecting information on defect yields, defect removal ratios from earlier phases, and so on.
• Given that metrics will change behaviour, you may change your metrics from time to time to place emphasis on improving or changing other parts of your process or performance.
• Metrics should be used by the manager to ask more questions, not as a hard and fast rule for making a decision. A metric may lead you to make deeper enquiries with individual testers or developers.
• Managers need to build trust with the team in how they use metrics. If the team doesn't trust how you will use the metrics, they will likely subvert the metrics process.

When we were discussing using metrics to justify the business case for testing, it was clear that it is very easy to get caught up in the technical metrics that matter to the test manager. However, when talking to other business stakeholders, you need to speak their language. You may need to explain what the metric means in terms of cost savings or increased revenue for it to have impact. Don't explain it in terms of how many more test cases are executed per hour; instead, explain it as dollars saved per hour. Other businesses may need it explained in other terms, such as risk reduction or satisfying compliance.


* Thanks to Kelvin Ross for the original version of this blog