Introduction

The OWASP Benchmark is a test suite designed to evaluate the speed, coverage, and accuracy of automated vulnerability detection tools. Without a way to measure these tools, it is difficult to understand their strengths and weaknesses or to compare them with one another. The Benchmark contains thousands of test cases that are fully runnable and exploitable.

The chart below presents the overall results for this set of tools scored against version 1.2beta of the Benchmark. The score for each tool is the overall true positive rate (TPR) across all the test categories, minus the overall false positive rate (FPR). To see the detailed results for any particular tool, select the tool from the menus above. For an explanation of all the metrics calculated for each tool, see the Guide page.

For more information, please visit the OWASP Benchmark Project Site.

Summary of Results by Tool

Tool                           TPR*     FPR*     Score*
FBwFindSecBugs v1.5.0          96.84%   57.74%   39.10%
FindBugs v3.0.1                 5.12%    5.19%   -0.07%
OWASP ZAP vD-2016-09-05        19.95%    0.12%   19.84%
PMD v5.2.3                      0.00%    0.00%    0.00%
SonarQube Java Plugin v3.14    50.36%   17.02%   33.34%

* Please refer to each tool's scorecard for the data used to calculate these values.

Key

True Positive (TP): Tests with real vulnerabilities that the tool correctly reported as vulnerable.
False Negative (FN): Tests with real vulnerabilities that the tool failed to report as vulnerable.
True Negative (TN): Tests with fake vulnerabilities that the tool correctly did not report as vulnerable.
False Positive (FP): Tests with fake vulnerabilities that the tool incorrectly reported as vulnerable.
True Positive Rate (TPR) = TP / ( TP + FN ): The rate at which the tool correctly reports real vulnerabilities. Also referred to as Recall, as defined at Wikipedia.
False Positive Rate (FPR) = FP / ( FP + TN ): The rate at which the tool incorrectly reports fake vulnerabilities as real.
Score = TPR - FPR: Normalized distance from the random guess line.
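
The formulas in the key can be sketched in a few lines of Python. This is only an illustration, not the Benchmark's own scoring code: the names Counts, rate, and scorecard are hypothetical, the sample counts are invented, and returning 0.0 when a denominator is zero is an assumption made here so that the example never divides by zero.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Raw outcome counts for one tool (hypothetical container)."""
    tp: int  # real vulnerabilities the tool reported
    fn: int  # real vulnerabilities the tool missed
    tn: int  # fake vulnerabilities the tool correctly ignored
    fp: int  # fake vulnerabilities the tool reported


def rate(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 for an empty denominator.

    The 0.0 fallback is an assumption of this sketch.
    """
    return numerator / denominator if denominator else 0.0


def scorecard(c: Counts) -> tuple[float, float, float]:
    """Compute (TPR, FPR, Score) exactly as defined in the key."""
    tpr = rate(c.tp, c.tp + c.fn)  # TPR = TP / (TP + FN)
    fpr = rate(c.fp, c.fp + c.tn)  # FPR = FP / (FP + TN)
    return tpr, fpr, tpr - fpr     # Score = TPR - FPR


# Invented counts, used only to exercise the formulas.
example = Counts(tp=80, fn=20, tn=60, fp=40)
tpr, fpr, score = scorecard(example)
print(f"TPR={tpr:.2%} FPR={fpr:.2%} Score={score:.2%}")
```

A tool that flags test cases at random reports real and fake vulnerabilities at the same rate, so its TPR equals its FPR and its Score is close to zero. The table rows follow the same rule: for FBwFindSecBugs v1.5.0, a TPR of 96.84% minus an FPR of 57.74% gives the listed Score of 39.10%.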