In my vulnerability research and analysis for the Security Intelligence Report (SIR) and other vulnerability reports, I frequently get methodology questions from readers who try to duplicate the results using the raw NVD data and get different numbers.
It isn’t possible to duplicate the results of this analysis by simply going to the NVD site and querying the raw data. I put in quite a bit of time and work scrubbing the raw data, confirming and validating it against several vendor advisory sites, and using the NVD references to research when each vulnerability was first publicly disclosed. Finally, I maintain my own supplementary database of the oldest disclosure dates, built up over years, and generate the results from it.
That said, although implementing the methodology takes time and tools, describing it is not that hard, so I want to share it here for those who might want to duplicate it themselves or spot-check any results.
1. Source Vulnerability Index. Start with the raw National Vulnerability Database (NVD) data. I download the raw NVD/CVE XML 2.0 data files from http://nvd.nist.gov/download.cfm and have developed my own C# tools to process and query the XML files on my local system. Note that the 2.0 schema and changelog are also available on that page, if you are interested in understanding the XML layout.
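For readers without their own tooling, here is a minimal Python sketch of reading such a feed. It is not the C# code mentioned above. It assumes each vulnerability is an `entry` element with an `id` attribute, a `published-datetime` child, and `reference` elements carrying an `href` attribute; check those names against the published 2.0 schema before relying on them. The sample feed, its namespace URIs, and the `parse_feed` and `local_name` helpers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the general shape of a feed file. The namespace URIs,
# identifier, date, and URLs are placeholders, not real NVD data.
SAMPLE_FEED = """<nvd xmlns="urn:example:feed" xmlns:vuln="urn:example:vuln">
  <entry id="CVE-0000-0001">
    <vuln:cve-id>CVE-0000-0001</vuln:cve-id>
    <vuln:published-datetime>2013-12-09T13:55:00</vuln:published-datetime>
    <vuln:references>
      <vuln:reference href="https://example.com/advisory/1">Advisory 1</vuln:reference>
    </vuln:references>
    <vuln:references>
      <vuln:reference href="https://example.com/advisory/2">Advisory 2</vuln:reference>
    </vuln:references>
  </entry>
</nvd>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on qualified tags."""
    return tag.rsplit("}", 1)[-1]


def parse_feed(xml_text):
    """Return {cve_id: {"published": str or None, "references": [url, ...]}}."""
    root = ET.fromstring(xml_text)
    entries = {}
    for entry in root.iter():
        if local_name(entry.tag) != "entry":
            continue
        record = {"published": None, "references": []}
        for node in entry.iter():
            name = local_name(node.tag)
            if name == "published-datetime":
                record["published"] = (node.text or "").strip()
            elif name == "reference" and node.get("href"):
                record["references"].append(node.get("href"))
        entries[entry.get("id")] = record
    return entries


feed = parse_feed(SAMPLE_FEED)
for cve_id, record in feed.items():
    print(cve_id, record["published"], len(record["references"]))
```

Matching on local element names rather than full namespace URIs keeps the sketch independent of the exact namespaces, at the cost of being less strict than a schema-aware parser.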
I use the NVD because it is the most comprehensive, cross-industry list of vulnerabilities and related data that I have found. Built upon the MITRE Common Vulnerabilities and Exposures (CVE) dictionary of publicly known vulnerabilities and exposures, it provides a common naming and classification system that is widely adopted across the industry.
2. Scrub Data with Vendor Advisories. When a new vulnerability is published in the NVD, the entry is populated with the relevant references that are available at that time. For example, if I look at the CVE-2013-7026 entry today, it has references to a couple of kernel information sites and an Ubuntu Security Notice describing a fix for Ubuntu 13.10. It doesn’t yet have references to other Linux distributions. With a web search, I can find that Debian has issued a fix for CVE-2013-7026, whereas Red Hat documents that, although the code may be present in its source, its products as compiled and shipped in RHEL 5, RHEL 6, and Red Hat Enterprise MRG 2 were not vulnerable.
One can easily find examples where other vendors release fixes days, weeks, or even months after the entry was added to the NVD, and unless NIST is specifically notified, its process won’t automatically add these references. To provide more comprehensive data for analysis, I scrub the NVD references and affected-product lists for software commonly used in the enterprise against the vendors’ own advisory sites:
For each of these sites, I examine each advisory, and where it references a CVE identifier, I add that reference to my database. Where the vendor acknowledges that the vulnerability affected its product, I add the product to the affected product list and note that the vendor issued a fix for the vulnerability.
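The bookkeeping in this step can be sketched roughly as follows. This is only an illustration in Python, not the actual database: the record layout, the `merge_advisory` helper, and the vendor, product, and URL values are all hypothetical. Whether a vendor acknowledges that a product was affected is a judgment made by reading the advisory, so it is passed in as a flag rather than inferred from the text.

```python
import re

# Matches identifiers of the form CVE-YYYY-NNNN (four or more trailing digits).
CVE_PATTERN = re.compile(r"CVE-\d{4}-\d{4,}")


def merge_advisory(database, vendor, product, advisory_url, advisory_text, affected):
    """Record every CVE identifier cited in one vendor advisory.

    `affected` is the reader's judgment of whether the vendor acknowledges
    that the product was vulnerable (and therefore shipped a fix).
    """
    for cve_id in sorted(set(CVE_PATTERN.findall(advisory_text))):
        record = database.setdefault(
            cve_id, {"references": set(), "affected": set(), "fixed_by": set()}
        )
        # The advisory is always kept as a reference, even for "not vulnerable".
        record["references"].add(advisory_url)
        if affected:
            record["affected"].add(product)
            record["fixed_by"].add(vendor)
    return database


# Hypothetical advisories; identifiers and URLs are placeholders.
db = {}
merge_advisory(db, "ExampleVendor", "ExampleOS 1.0",
               "https://example.com/advisories/EX-1",
               "This update fixes CVE-0000-0001 and CVE-0000-0002.",
               affected=True)
merge_advisory(db, "OtherVendor", "OtherOS 2.0",
               "https://example.org/notes/7",
               "Our builds are not vulnerable to CVE-0000-0001.",
               affected=False)
print(sorted(db["CVE-0000-0001"]["references"]))
```

A "not vulnerable" statement still adds the advisory as a reference, but leaves the affected product list and the fix record unchanged.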
3. Determine oldest public reference to vulnerability details (the “first public disclosure”). For each vulnerability identifier in the NVD, I visit the web references cited in the NVD, as well as the applicable vendor advisories (see previous step), and determine the first public disclosure date for the vulnerability. That date does not always align with the year in the CVE identifier or with the NVD publication date.
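Once each reference has been visited and dated by hand, picking the first public disclosure amounts to taking the oldest date. A minimal Python sketch, assuming the dates have already been collected into a mapping from URL to date; the `first_public_disclosure` helper, the identifier, the URLs, and the dates are hypothetical:

```python
from datetime import date


def first_public_disclosure(dated_references):
    """Return (date, url) for the oldest dated reference, or None if none are dated."""
    dated = [(d, url) for url, d in dated_references.items() if d is not None]
    return min(dated) if dated else None


# Hypothetical reference dates gathered manually for one identifier.
cve_id = "CVE-2013-nnnn"
references = {
    "https://example.com/mailing-list/post": date(2012, 11, 30),
    "https://example.com/vendor/advisory": date(2013, 1, 15),
    "https://example.com/undated-page": None,  # no date could be determined
}

disclosed, source = first_public_disclosure(references)
identifier_year = int(cve_id.split("-")[1])
print(cve_id, "first disclosed", disclosed.isoformat(), "via", source)
print("identifier year matches disclosure year:", identifier_year == disclosed.year)
```

In this made-up case the identifier carries the year 2013, but the oldest public reference dates from 2012, so the vulnerability would be counted as a 2012 disclosure.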
Let’s look at a few examples to clarify how this plays out:
So, as you can see, NVD web site queries for “CVE-2013-nnnn” or for vulnerabilities published during 2013 will not produce a list comparable to the disclosed vulnerabilities reported in my analysis.
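The difference between the two selections can be made concrete with a small Python sketch. The records below are invented purely for illustration; they are not taken from the NVD or from my database.

```python
from datetime import date

# Hypothetical records: (CVE identifier, first public disclosure date).
records = [
    ("CVE-2012-aaaa", date(2013, 1, 8)),
    ("CVE-2013-bbbb", date(2013, 6, 2)),
    ("CVE-2013-cccc", date(2012, 12, 20)),
    ("CVE-2013-dddd", date(2013, 3, 14)),
]

# Selection 1: what a query on the identifier prefix would return.
by_identifier = {cve for cve, _ in records if cve.startswith("CVE-2013-")}

# Selection 2: vulnerabilities whose first public disclosure fell in 2013.
by_disclosure = {cve for cve, d in records if d.year == 2013}

print("only in identifier query:", sorted(by_identifier - by_disclosure))
print("only in disclosure list: ", sorted(by_disclosure - by_identifier))
```

Both selections contain three entries here, yet they are not the same three, which is why matching totals alone would not confirm that a result has been duplicated.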