Finding bugs in the software security industry is much like prospecting for natural resources. An engineer takes a high-level view of an unknown piece of territory to determine the lay of the land, then narrows the geography down to a few key locations of interest using intuition, experience, and macro-scale information. Next, they use analysis instruments to collect data that eventually tells them where to apply human resources to dig for GOLD! Or oil. Or bugs!

Now, back before there were things such as accurate soil analysis instruments, there was still prospecting. It just wasn’t all that efficient. Security is in much the same place: we’ve been finding bugs through laborious and often gut-instinct-driven processes that are simply inefficient. Lately, I’ve been thinking hard about this problem and pooling resources with professionals in the security field as well as academics in the visualization field. I recently had the honor of participating on a panel at VizSec, which is held at MIT, where presenters and attendees from around the globe put their heads together to see where visualization is going.

The observation I made was that the network security folks are thinking a lot harder about visualization than those specializing in software security, and I think I understand why: network data has many concrete properties that are readily consumed by existing data analysis functions and data visualization software. That is to say, they know some of the trends they’re looking for, and the values involved are typically very literal, such as who is sending packets where, and of what type. I found myself trying to create parallel scenarios that would fit software security and program analysis, and there certainly are some to be found (such as reachability on a graph). For the most part, however, software security analysis faces a different problem: a lack of metrics that can be plugged into functions that find trends.

So, what does this mean to software security prospectors like us in the context of visualization? It means that we need to define metrics that can be measured and then presented visually, of course! There are some good articles you can dig up from ACM on the topic, but one basis for my work has been Matt Miller’s paper titled “Improving Software Security Analysis using Exploitation Properties”. Matt’s paper introduces the concept of exploitation properties, which are specific metrics that can be interpreted together to get a sense of whether an area of code is at higher risk of exploitation if a bug is present. After all, you’re going to exploit the stack overflow that isn’t protected by GS before you try to work around the one that is, right? Expand this concept a bit and you can begin to collect metrics beyond exploitability that may help improve software security or maintainability.
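To make the idea of interpreting properties together a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Finding` record, its two fields, the function names, and the weights are hypothetical and are not taken from Matt’s paper. It only shows the GS example above, where an unprotected, reachable stack overflow gets looked at first.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """A suspected stack overflow plus two illustrative properties."""
    name: str                   # hypothetical function name
    gs_protected: bool          # was the function built with a GS stack cookie?
    reachable_from_input: bool  # can untrusted input reach it?


def risk_score(finding: Finding) -> int:
    """Combine the properties into one number (weights are arbitrary)."""
    score = 0
    if finding.reachable_from_input:
        score += 2  # assumed weight: reachability matters most here
    if not finding.gs_protected:
        score += 1  # assumed weight: no GS cookie to work around
    return score


findings = [
    Finding("copy_name", gs_protected=True, reachable_from_input=True),
    Finding("parse_header", gs_protected=False, reachable_from_input=True),
    Finding("debug_dump", gs_protected=False, reachable_from_input=False),
]

# Highest score first: these are the places to dig first.
ranked = sorted(findings, key=risk_score, reverse=True)

for finding in ranked:
    print(finding.name, risk_score(finding))
```

A single number like this is also the kind of value that can be mapped onto a color or a size in a visualization, which is the reason for wanting metrics in the first place.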

So my upcoming talk at BlueHat will focus on two areas of concern: what these metrics look like, and what proper visualizations of multi-dimensional data look like. We have some interesting challenges because not all of our metrics are equally reliable. In fact, off-the-shelf bug finding tools are often heuristic-based and can be highly unreliable, but that just means we have a unique opportunity to take large data sets and correlate them to find out where the real gold is buried.
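One way to picture correlating unreliable sources is sketched below in Python. This is an illustration under stated assumptions, not a description of any real tool: the tool names, the precision figures, and the report data are all made up, and the combining rule (a "noisy-OR", which treats the tools as independent) is simply one plausible choice that I am swapping in for the sake of the example.

```python
from collections import defaultdict
from math import prod

# Hypothetical estimate of how often each tool's report is a real bug.
TOOL_PRECISION = {"tool_a": 0.5, "tool_b": 0.25}

# Hypothetical (tool, code location) reports from two heuristic scanners.
reports = [
    ("tool_a", "parse_header"),
    ("tool_b", "parse_header"),
    ("tool_b", "debug_dump"),
]


def combine(reports, precision):
    """Noisy-OR: confidence that at least one report at a location is real.

    Assumes the tools err independently, which real tools may not.
    """
    tools_by_location = defaultdict(set)
    for tool, location in reports:
        tools_by_location[location].add(tool)
    return {
        location: 1 - prod(1 - precision[tool] for tool in tools)
        for location, tools in tools_by_location.items()
    }


scores = combine(reports, TOOL_PRECISION)

# Locations flagged by several weak tools rise above single weak reports.
for location, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(location, score)
```

The point of the sketch is only that agreement between individually weak signals can be turned into a stronger one, which then becomes another dimension to visualize.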

-Richard Johnson, Security Software Development Engineer, Trustworthy Computing, Microsoft