Source: fifthdomain.com
This past summer, the Internal Revenue Service issued a request for information to learn more about how artificial intelligence can improve cyber security.
The request went beyond just using machine-learning technologies to improve cyber operations. The agency wanted to know how to create a system that continuously learns its environment, triages alerts, identifies previously unknown trends and analyzes data to provide actionable context for officials.
Artificial intelligence has been one of the most prominent buzzwords in the federal government over the past year. The federal government has made strides to bring artificial intelligence into agencies, but it has only begun to scratch the surface of its capabilities and use cases.
One of the most important potential use cases for artificial intelligence in government is cyber security. Most cyber security solutions use a rules-based or signature-based methodology that requires too much human intervention and institutional knowledge. These systems require constant updates to their rules, which takes up employee time, and they typically force analysts to look at only a single part of the enterprise, so they never get a holistic picture of the environment. Artificial intelligence can augment that human element to make the time spent on cyber security more productive.
At its core, artificial intelligence is the science of training systems to emulate human intelligence through continuous learning. Although humans will always be an important component of cyber security, the ability of a system to learn about the environment it must protect, automatically handle tasks and search for anomalies in user behavior is critical. Artificial intelligence can analyze large volumes of data, recognize complex patterns of malicious behavior, and drive rapid detection of incidents and automated response.
Artificial intelligence can also help eliminate visibility gaps within an enterprise. To date, the federal government has largely pieced together its cyber security systems, resulting in a fragmented approach to protecting systems. Analytics help close the gaps that result from this approach by analyzing the data generated in a system to identify malicious activity in areas that human analysts might miss.
Artificial intelligence relies on the security analytics lifecycle, which is made up of three pillars: data, discovery and deployment. For artificial intelligence to be successful, it must be able to flow through these three pillars quickly and successfully. This lifecycle provides the ability for agencies to gain insight into their security ecosystem to quickly identify incidents and gain an understanding of their posture. Let’s look at each area:
Data – For artificial intelligence to work, it first needs data to analyze, either stored or streaming. Both types of data sources can be valuable in analyzing a cyber environment. The federal government has long produced large amounts of data, so the key will be to identify the right streams and the right pieces of data to get the best results. Additionally, better information sharing between the private sector and federal government can enhance this data inventory, increasing the data available for a more comprehensive understanding of the threat landscape, as well as of best practices for mitigating those threats.
Discovery – This is the process of taking data and using technology to provide insights into security networks. With machine learning and artificial intelligence, agency personnel will build models for supervised and unsupervised purposes. Supervised models take advantage of datasets with known outcomes and build a model to predict or classify the behavior that drove that outcome. Unsupervised models work with data where there is no known outcome. They look for outliers in the data that can indicate security incidents, and they find areas of concern that human analysts would have a difficult time finding. That said, there is not a lot of labeled data in the cyber domain, so a combination of these approaches – or a semi-supervised learning approach – is often used to bridge the gap.
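To make the distinction between supervised and unsupervised models concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the single feature (failed logins per hour per account), the sample numbers, and the function names are hypothetical, and the two techniques (a nearest-centroid classifier and a median-absolute-deviation outlier test) are simple stand-ins chosen for brevity, not methods the article or any agency prescribes.

```python
from statistics import mean, median

# Hypothetical feature: failed logins per hour for an account.

def fit_centroids(samples, labels):
    """Supervised: learn one average value per known outcome label."""
    groups = {}
    for value, label in zip(samples, labels):
        groups.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in groups.items()}

def classify(centroids, value):
    """Predict the label whose learned average is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

def find_outliers(samples, threshold=3.5):
    """Unsupervised: flag values far from the median, with no labels needed.

    Uses the modified z-score based on the median absolute deviation (MAD).
    The 3.5 cutoff is a common rule of thumb, assumed here for illustration.
    """
    center = median(samples)
    mad = median(abs(value - center) for value in samples)
    if mad == 0:
        return []  # no spread in the data, so nothing can be ranked as unusual
    return [v for v in samples if 0.6745 * abs(v - center) / mad > threshold]

# Supervised: historical data where the outcome is already known (made up).
labeled_values = [1, 2, 3, 2, 40, 55, 60]
labeled_outcomes = ["benign"] * 4 + ["malicious"] * 3
centroids = fit_centroids(labeled_values, labeled_outcomes)
print(classify(centroids, 45))   # malicious
print(classify(centroids, 4))    # benign

# Unsupervised: new data with no known outcomes (made up).
unlabeled_values = [2, 3, 1, 2, 4, 3, 2, 58, 3, 2]
print(find_outliers(unlabeled_values))  # [58]
```

The supervised half can only recognize behavior that resembles outcomes it has already seen, while the unsupervised half needs no labels but can only say that something is unusual, not that it is malicious. That trade-off is one reason the two approaches are often combined.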