Benchmarking CVD
This page is not normative
This page is not considered a core part of the Vultron Protocol as proposed in the main documentation. Although this page may offer guidance using terms such as SHOULD and MUST, its content is not normative.
Our observational analysis supports the conclusion that vulnerability disclosure as currently practiced demonstrates skill. In both data sets examined, our estimated skill scores were positive for most of the desiderata.
If, as seems plausible from the evidence, further observations of CVD in practice continue to outperform blind luck, the model could be used to establish benchmarks:
- CVD Benchmarks discusses this topic, which should be viewed as an examination of what "reasonable" should mean in the context of a "reasonable baseline expectation."
- MPCVD suggests how the model might be applied to establish benchmarks for CVD processes involving any number of participants.
CVD Benchmarks
As described above, in an ideal CVD situation, each observed history would
achieve all 12 desiderata.
Per the Event Frequency table in Reasoning Over Possible Histories
(reproduced below for convenience), even in a world without skill we would
expect some desiderata to be met some of the time by chance alone.
Expected frequency of $\sigma_i \prec \sigma_j$ (row event preceding column event) in a world without skill, for the events **V** (vendor awareness), **F** (fix ready), **D** (fix deployed), **P** (public awareness), **X** (exploit public), and **A** (attacks observed):

| $\sigma_i \prec \sigma_j$ | **V** | **F** | **D** | **P** | **X** | **A** |
|---------------------------|-------|-------|-------|-------|-------|-------|
| **V**                     | 0     | 1     | 1     | 0.333 | 0.667 | 0.750 |
| **F**                     | 0     | 0     | 1     | 0.111 | 0.333 | 0.375 |
| **D**                     | 0     | 0     | 0     | 0.037 | 0.167 | 0.187 |
| **P**                     | 0.667 | 0.889 | 0.963 | 0     | 0.500 | 0.667 |
| **X**                     | 0.333 | 0.667 | 0.833 | 0.500 | 0     | 0.500 |
| **A**                     | 0.250 | 0.625 | 0.812 | 0.333 | 0.500 | 0     |
This means that even without skill, each desideratum would be met at its baseline frequency by chance alone; a skillful CVD process should therefore meet each desideratum more often than the corresponding baseline.
In fact, we propose to generalize this for any desideratum $d \in \mathbb{D}$: a skillful process should achieve

$$\alpha_d \geq c_d$$

where $c_d \in [0,1)$ is a benchmark constant for desideratum $d$.
We propose as a starting point a naïve benchmark of $\alpha_d > 0$ for all $d \in \mathbb{D}$, i.e., that the process should at minimum outperform blind luck.
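A minimal numeric sketch of this check, assuming the skill measure $\alpha_d = (f_d^{obs} - f_d^{luck}) / (1 - f_d^{luck})$ from Discriminating Skill and Luck (function names here are illustrative, not part of the protocol):

```python
def skill(f_obs: float, f_luck: float) -> float:
    """Skill measure for one desideratum: how far the observed frequency
    f_obs exceeds the luck-only baseline f_luck, scaled so that 1.0 means
    the desideratum is always met and 0.0 means no better than luck."""
    if not 0.0 <= f_luck < 1.0:
        raise ValueError("f_luck must be in [0, 1)")
    return (f_obs - f_luck) / (1.0 - f_luck)


def meets_naive_benchmark(f_obs: float, f_luck: float) -> bool:
    """Naive benchmark: the observed process outperforms blind luck."""
    return skill(f_obs, f_luck) > 0.0


# Illustrative: a desideratum with a luck-only baseline of 0.375 that is
# observed to hold in 80% of cases yields a skill score of 0.68.
alpha = skill(0.80, 0.375)
assert meets_naive_benchmark(0.80, 0.375)
```

A per-desideratum constant $c_d$ higher than zero would simply replace the `> 0.0` comparison.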
The i.i.d. assumption may not be warranted, however.
We anticipate that event
ordering probabilities might be conditional on history: for example,
exploit publication may be more likely when the vulnerability is already public
than when it is not.
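Whether ordering probabilities are history-conditional can be checked empirically once full histories are recorded. A small sketch (the sample histories are invented for illustration) comparing a conditional ordering frequency against its unconditional value:

```python
# Each history records the order in which the six events occurred for one
# vulnerability: V, F, D, P, X, A. These samples are invented for illustration.
histories = ["VFPADX", "VPFXDA", "XPVAFD", "VFPDXA", "PAVFXD", "VFDPXA"]


def freq(orderings: list[str], first: str, second: str) -> float:
    """Fraction of histories in which `first` occurs before `second`."""
    return sum(h.index(first) < h.index(second) for h in orderings) / len(orderings)


# Unconditional frequency of exploit publication before attacks (X before A)...
f_xa = freq(histories, "X", "A")

# ...versus the same frequency restricted to histories where the vulnerability
# was public before the exploit appeared (P before X).
public_first = [h for h in histories if h.index("P") < h.index("X")]
f_xa_given_public = freq(public_first, "X", "A")

# A large gap between f_xa and f_xa_given_public would indicate that event
# ordering is history-conditional, i.e., the i.i.d. assumption fails.
```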
Supporting Observations
Some example suggestive observations are:
- There is reason to suspect that only a fraction of vulnerabilities ever reach the exploit public event, and fewer still reach the attack event. Recent work by the Cyentia Institute found that "5% of all CVEs are both observed within organizations AND known to be exploited", which suggests that attacks are observed for only a small fraction of vulnerabilities.
- Likewise, the corresponding ordering holds in 28 of 70 (0.4) of the possible histories. However, Cyentia also found that "15.6% of all open vulnerabilities observed across organizational assets in our sample have known exploits", which suggests that public exploits are likewise observed for only a minority of vulnerabilities.
We might therefore expect to find many vulnerabilities remaining
indefinitely in states where neither an exploit has been published nor attacks have been observed.
On their own these observations can equally well support the idea that we are broadly observing skill in vulnerability response, rather than that the world is biased from some other cause. However, we could choose a slightly different goal than differentiating skill and "blind luck" as represented by the i.i.d. assumption. One could aim to measure "more skillful than the average for some set of teams" rather than more skillful than blind luck.
If this were the "reasonable" baseline expectation, the
primary limitation would be the availability of observations. This model helps overcome
that limitation because it provides a clear path toward collecting
relevant observations: for example, by collecting dates for the six
events across a large sample of vulnerabilities, the observed frequency of each desideratum could be computed directly.
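As a sketch of how such date collections could be turned into observed frequencies. The field names and dates are hypothetical, and counting a never-observed second event as satisfying the ordering is a modeling choice made here for illustration:

```python
from datetime import date

# Hypothetical per-vulnerability event dates; None = never (yet) observed.
cases = [
    {"V": date(2023, 1, 3), "F": date(2023, 1, 20), "D": date(2023, 2, 10),
     "P": date(2023, 2, 1), "X": None, "A": None},
    {"V": date(2023, 3, 5), "F": date(2023, 4, 2), "D": date(2023, 5, 9),
     "P": date(2023, 3, 30), "X": date(2023, 4, 15), "A": None},
]


def observed_frequency(cases: list[dict], first: str, second: str) -> float:
    """Fraction of cases in which `first` is known to precede `second`.

    Modeling choice (illustrative): if `second` never occurred, the ordering
    is counted as satisfied; cases missing `first` are skipped as
    inconclusive."""
    hits = total = 0
    for c in cases:
        if c[first] is None:
            continue
        total += 1
        if c[second] is None or c[first] < c[second]:
            hits += 1
    return hits / total if total else 0.0


# e.g., how often the vendor was aware before the vulnerability went public:
f_v_before_p = observed_frequency(cases, "V", "P")
```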
Interpreting Frequency Observations as Skill Benchmarks
As an applied example, if we take the first item in the above list as a
broad observation of how infrequently attacks are observed, we can use the skill equation
from Discriminating Skill and Luck
to derive a potential benchmark value for the corresponding skill score.
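To make the arithmetic concrete, here is one illustrative reading: if attacks are observed for only ~5% of vulnerabilities, an ordering such as fix deployed preceding attacks would hold in roughly 95% of observed cases; taking 0.187 as its assumed luck-only baseline per the Event Frequency table yields a skill score near 0.94. Both inputs are illustrative assumptions, not findings of this document:

```python
f_obs = 0.95    # illustrative: ordering observed in ~95% of cases
f_luck = 0.187  # assumed luck-only baseline for the same ordering

# Skill equation from "Discriminating Skill and Luck":
alpha = (f_obs - f_luck) / (1.0 - f_luck)  # ≈ 0.938
```

Any real benchmark derived this way would of course need to be re-estimated from actual observed frequencies.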