7 August 2022

Cybersecurity Research: A DARPA Retrospective

Rich Heimann

Cyber threats are real and constantly evolving, and the expectation of responsible cybersecurity grows with them. This confluence of factors makes cybersecurity more important than ever. However, this article is not a detailed account of cyber threats or the necessity of cybersecurity. It is 2022, and I will assume you already know these realities. Instead, this article is about research, specifically how to pursue research properly, based on my experience with two different research programs at the Defense Advanced Research Projects Agency (DARPA).

The DARPA Network Defense (ND) program focused on threat detection using machine learning to discover behavioral patterns in network traffic. The program analyzed actual security events using network traffic acquired through a partnership program. In effect, industry partners shared data with the government for the program’s security analyses.

ND was spun out of another DARPA program. By 2010, there was a growing sentiment that intelligence analysis in Afghanistan was “only marginally relevant.” However, there was a general acknowledgment of a vast, underappreciated trove of data. Nexus 7 (N7) helped military leaders in Afghanistan understand aspects of the war by using nontraditional methods and unconventional data sources.

These two programs are more similar than they first appear. Both originated in the same DARPA office and shared members of the leadership team and research staff. Both had applied research goals, turning the possible into the actual, with one distinction: ND had nebulous goals beyond the customers and their problems. An ancillary goal of the program was to use and, in some cases, advance state-of-the-art elements of unsupervised machine learning. However, managing basic research and applied research goals concurrently, as Network Defense did, proved to be a double bind.

ND was organized into various teams. These talented researchers often found exciting results, including detections of network infiltration, covert command and control, reconnaissance, and botnets coordinating DDoS attacks. However, each team represented an analytical family rather than the underlying problem. To be sure, cyber is distributed, but not by analytical families. Cyber has no time-series, clustering, sequence, or network analysis problems. That is not to say these teams found nothing. Instead, it suggests that the program did not align personnel to the problem because it didn’t know what problem it was solving.

The existential difficulty of organizing research around analytical families and a specific learning paradigm (i.e., unsupervised machine learning) is that it ignores the whole problem for some version of the problem that is good for some solution. Cybersecurity does not require one solution, a family of solutions, or a learning paradigm. Instead, it requires many solutions to work together. You can look at the MITRE ATT&CK framework to see how many threat vectors exist and how they are related.
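The breadth of the framework is easy to see from the data itself. Below is a minimal sketch, assuming the publicly available ATT&CK Enterprise STIX bundle from MITRE’s attack-stix-data repository, that counts techniques per tactic; no single detector, analytical family, or learning paradigm covers this spread.

```python
# Minimal sketch: count ATT&CK Enterprise techniques per tactic from the
# public STIX bundle. Assumes network access to MITRE's attack-stix-data
# repository; a local copy of the bundle works the same way.
import json
import urllib.request
from collections import Counter

ATTACK_URL = (
    "https://raw.githubusercontent.com/mitre-attack/attack-stix-data/"
    "master/enterprise-attack/enterprise-attack.json"
)

with urllib.request.urlopen(ATTACK_URL) as resp:
    bundle = json.load(resp)

# Techniques are STIX "attack-pattern" objects; each lists the tactics
# (kill chain phases) it supports.
tactic_counts = Counter()
for obj in bundle["objects"]:
    if obj.get("type") != "attack-pattern" or obj.get("revoked"):
        continue
    for phase in obj.get("kill_chain_phases", []):
        if phase.get("kill_chain_name") == "mitre-attack":
            tactic_counts[phase["phase_name"]] += 1

for tactic, count in tactic_counts.most_common():
    print(f"{tactic:25s} {count}")
```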

In particular, threat detection requires distributed computation across the whole problem. Yet Network Defense had no unified, distributed computation, only fragmented, underspecified partial solutions organized by purposeless analytical teams. Developing a meta-algorithm that learned from the other solutions would have reduced false positives (FP), improved usability, and aligned and unified the Network Defense research. However, it was never built for the program because no team was responsible for the whole problem, leading to the so-called separation of concerns. (A meta-algorithm was constructed only years later, when Cybraics spun out of DARPA. All the performers on Network Defense shared the intellectual property, but to my knowledge no one aside from Cybraics did anything with it. The truth is that Cybraics did not directly use any ND research for commercialization, which is telling.)
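To make the idea concrete, here is a minimal sketch of what such a meta-algorithm could look like, assuming a handful of base detectors that each emit a per-event score and a set of analyst-triaged alerts to learn from. The detector families, scores, and labels below are hypothetical stand-ins, not ND artifacts; the point is that one component is accountable for the combined output and its false-positive rate.

```python
# Minimal sketch of a meta-algorithm that learns from other detectors.
# The base detector scores and analyst labels are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in scores from four independent detector families (rows = events).
scores = np.column_stack([
    rng.random(1000),   # time-series anomaly score
    rng.random(1000),   # clustering outlier score
    rng.random(1000),   # sequence-model score
    rng.random(1000),   # graph/network score
])

# Analyst triage verdicts on past alerts: 1 = true positive, 0 = false positive.
labels = rng.integers(0, 2, size=1000)

# The meta-learner weights the base detectors against analyst feedback,
# discounting families that mostly generate false positives.
meta = LogisticRegression().fit(scores, labels)
alert_probability = meta.predict_proba(scores)[:, 1]

# Only escalate events the combined model considers likely true positives.
escalate = alert_probability > 0.5
print(f"{escalate.sum()} of {len(escalate)} events escalated to analysts")
```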

Moreover, security events are time-oriented. Some move fast, and others move slowly. Figuring out what is moving and how quickly it is moving is part of problem comprehension because fast and slow are problem constraints. Slow dynamics in a system may dominate faster components, though sometimes the quick can change the slow, neither of which will be evident if you isolate a problem from its context. A familiar sneer on ND among the data scientists was that the FP rate was the time it took to build a PowerPoint slide. That process sometimes took weeks of sifting through stochastic outputs from opaque algorithms. It does not represent how cyber analysts work, and it ignores false positives. Cyber analysts are not looking for one novel static result at the cost of everything else. Ultimately, research environments cannot make a problem more accessible to the point of making it fake. Understanding how security analysts work is an integral part of applied research.

N7 was not interested in general knowledge that lacked application, which is the foundation of basic research. For example, researchers on N7 often used machine learning, including unsupervised machine learning. However, researchers weren’t required to solve a given problem with machine learning, much less solve some of the tricky aspects of unsupervised machine learning. N7 was applied and spent most of its time acquiring the correct problem at a scale and speed that roughly matched the scale and speed of the mission. Of course, alignment and tempo are not easy to accomplish when sitting in an office building in Arlington, VA, and your customer is 7,000 miles away in a combat zone. This explains why the leadership did extreme things like making everyone sit in a hallway outside the office of the DARPA director at the time.

You may question the scientific literacy of a leadership team that considers sitting in the hallway akin to Afghanistan’s austere conditions. I certainly did. I thought I was suddenly working for Elton Mayo in his notorious Hawthorne experiment. However, everyone understood that the point was the mission and the so-called battle rhythm, not some cooked-up industrial psychology project. Some team members even deployed to Afghanistan to meet the customer and better understand their constraints. Stateside team members would regularly interact with these deployed elements.

Meanwhile, ND never tried to replicate the problem as it exists for security analysts in any enterprise, much less customers. The program never understood how customers used the program’s security analyses, nor did it have cybersecurity experts until the final year or so of performance. While problem framing is always tricky, even for those with the problem, applied research must understand how those affected by a problem work. You must walk a distance greater than zero in their shoes if your research is applied.

Why ND lacked cybersecurity experts and ignored the problem is self-explanatory: basic research does not want the responsibility of a problem or customer. Without a problem and customer, everyone is safe from responsibility. This is a kind of Gresham’s law for research. Instead of an economic principle stating that “bad money drives out good,” it is a research principle that deems research contaminated if prematurely exposed to any discussion of real-world customers or applications. There was one incident on N7 where the team thought it had discovered an al-Qaeda group operating near a military base in Afghanistan. The team was disappointed when one of the program’s subject matter experts informed everyone that what it had found was the Jordanian Special Forces calling home to check on their kids. Subject matter experts are the kind of people who will quickly tell you that your results are trivial or wrong, or that your research is going in the wrong direction, based on the problem and how they do their jobs. Ignoring them does not improve research.

The implicit assumption in programs like ND is that research can start basic, move to applied, and effortlessly transition to deployment. In this case, we can do some unsupervised machine learning in years one and two, add some cyber experts to make everything applied, and deploy to the enterprise in year five. Not only is five years a long time to field anything, but these two programs also tell us that the linear research model is built on a flawed assumption. Technology transfer does not move along a single dimension from basic to applied to deployment, nor is there some magical midpoint equidistant from basic and applied research that delivers both. ND experienced schisms between these phases because each phase failed to match the target domain. N7 did not because it was always deployed.

N7 won’t be remembered in the annals of DARPA history even though it won the DARPA Program of the Year. The reason is simple. There are no remnants today. There is nothing to point to and call success, aside from this article. ;) The war is over, and when applied research loses its connection to a problem (and data), the solutions quietly disappear. There was nothing as enduring as stealth technology, GPS, or the Internet. There was no fundamental knowledge developed that transcended the program. That is not the goal of applied research. Nonetheless, during the performance period, the work mattered, and many involved knew it. Conversely, ND didn’t matter, and most involved knew it.

Cybersecurity research is vital because the problem is not going away. The cyber problem is complex, distributed, and continuously evolving. Ideally, organizations would develop their security strategy, tools, and methods iteratively with the support of cybersecurity research. For this reason, I am surprised by the lack of applied cybersecurity research in many enterprises. However, this anecdotal evidence shows that even applied research is challenging. Even getting most things right is often not enough. Hopefully, this article shows that leadership matters and that how you structure your research also matters. The lesson for technical leaders is that you want to make everything as easy as possible but never easier. You cannot lose connection to the problem, which is nearly impossible to understand without connection to an actual user and the context of both. You cannot reduce research to something fake to make it easier.
