N251-019 TITLE: Neuro-Symbolic Artificial Intelligence (AI) Agents for Cybersecurity Authority To Operate (ATO) Development
OUSD (R&E) CRITICAL TECHNOLOGY AREA(S): Integrated Network Systems-of-Systems; Integrated Sensing and Cyber; Sustainment
The technology within this topic is restricted under the International Traffic in Arms Regulation (ITAR), 22 CFR Parts 120-130, which controls the export and import of defense-related material and services, including export of sensitive technical data, or the Export Administration Regulation (EAR), 15 CFR Parts 730-774, which controls dual use items. Offerors must disclose any proposed use of foreign nationals (FNs), their country(ies) of origin, the type of visa or work permit possessed, and the statement of work (SOW) tasks intended for accomplishment by the FN(s) in accordance with the Announcement. Offerors are advised foreign nationals proposed to perform on this topic may be restricted due to the technical data under US Export Control Laws.
OBJECTIVE: Research, design, and develop an innovative automated software toolset to assist the Cybersecurity workforce personnel in developing and maintaining Authority to Operate (ATO) packages under the Risk Management Framework (RMF) process.
DESCRIPTION: The DoD leverages the RMF to guide cybersecurity processes and requirements [Refs 3, 5, 7, and 9]. Further, program offices have experienced a dramatic increase in the man-hours required to produce a cybersecurity ATO package and maintain that package throughout the lifecycle of the system or systems supported. This increase has put a strain on budgets and increased schedules.
One outcome of Deep Neural Network (DNN) research into what are called large language models (LLMs; e.g., GPT-3 or LaMDA) is the ability to analyze large data sets and to compose documents automatically in response to user requests [Refs 1 and 2]. For example, one might say, "Write me a paper about 'Logistics issues in Africa,'" and the system can automatically produce a document that sounds reasonable. It has classified or categorized information about both logistics and Africa, and may even have found areas of overlap. The system is trained to understand, identify, and replicate patterns of what a paper should look like, how it might be organized, and how paragraphs and sentences are structured. There is a chance, therefore, that the paper actually conveys real information. There is, however, a significant chance that the paper is utter nonsense (i.e., a pattern born of mimicry rather than substance). The LLM approach becomes less viable as one moves toward novelty. That is, if a paper describing a new concept, device, method, process, or strategy is desired, LLMs are unable to provide much help. In one sense they are merely sophisticated search algorithms that can find existing patterns and sometimes combine those patterns to useful effect.
An ATO is by its very nature a novel problem, so one might argue that LLMs will not add much value. That is only true, however, if they are used in isolation. This SBIR topic seeks a technical approach that leverages one or more technology types, such as LLMs, and the capabilities offered by Artificial Intelligence (AI) and/or Machine Learning (ML) [Refs 4 and 8]. For example, approaching this challenge as an applied engineering discipline, applying a range of AI techniques such as DNNs to identified AI reasoning tasks with the understanding that most expertise resides in the heads of subject matter experts (SMEs) rather than in large data repositories, is expected to maximize the efficiency and effectiveness of the capability and the return on investment. The desired outcome of this SBIR topic is technology and a methodology for working with SMEs to capture their expertise and mental models of the RMF and ATO process. The technical approach should then leverage these mental models to bias DNN classifiers and to provide a way to represent an organization's specific expertise and content.
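To make the neuro-symbolic idea above concrete, one way a hybrid could work is statistical generation constrained by symbolic rules elicited from SMEs. The sketch below is purely illustrative: the rule set, regex patterns, and the placeholder string standing in for LLM output are hypothetical assumptions, not drawn from any fielded RMF tooling.

```python
# Illustrative neuro-symbolic sketch: a generated ATO section (a placeholder
# string stands in for LLM output) is validated against symbolic rules
# captured from SMEs. All rules and patterns here are hypothetical examples.
import re
from dataclasses import dataclass

@dataclass
class SmeRule:
    """A symbolic rule elicited from a subject matter expert."""
    name: str
    pattern: str      # regex the draft section must satisfy
    rationale: str    # the SME's reason the rule matters

SME_RULES = [
    SmeRule("cites_rmf_step", r"RMF step [1-6]",
            "Every section must be traceable to an RMF step."),
    SmeRule("names_control", r"[A-Z]{2}-\d+",
            "Security controls must be cited by NIST SP 800-53 identifier."),
]

def validate_draft(draft: str, rules=SME_RULES):
    """Return (passed, failed_rules) for a generated ATO section."""
    failures = [r for r in rules if not re.search(r.pattern, draft)]
    return (not failures, failures)

draft = ("Under RMF step 2, control AC-2 governs account management "
         "for the trainer's desktop software.")
passed, failed = validate_draft(draft)
print(passed)  # True: both hypothetical SME rules are satisfied
```

In a full system, failed rules and their rationales could be returned to the generator for revision or surfaced to a human reviewer, keeping a human in or on the loop.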
Expected outcomes include:
- Efficiency Gains: Significant reduction in time and manpower required for ATO drafting.
- Consistency and Compliance: Standardized ATOs that adhere to Department of Defense (DoD) Cybersecurity regulations and policies.
- Scalability: Potential application across various DoD acquisition entities, enhancing overall efficiency of the Cybersecurity workforce.
Work produced in Phase II may become classified. Note: The prospective contractor(s) must be U.S. owned and operated with no foreign influence as defined by 32 C.F.R. § 2004.20 et seq., National Industrial Security Program Executive Agent and Operating Manual, unless acceptable mitigating procedures can be and have been implemented and approved by the Defense Counterintelligence and Security Agency (DCSA), formerly the Defense Security Service (DSS). The selected contractor must be able to acquire and maintain a Secret-level facility clearance and personnel security clearances. This will allow contractor personnel to perform on advanced phases of this project as set forth by DCSA and NAVAIR in order to gain access to classified information pertaining to the national defense of the United States and its allies; this will be an inherent requirement. The selected company will be required to safeguard classified material during the advanced phases of this contract IAW the National Industrial Security Program Operating Manual (NISPOM), which can be found at Title 32, Part 2004.20 of the Code of Federal Regulations.
PHASE I: Design and develop a system that captures and organizes cybersecurity hardware/software configuration information and can automatically write an ATO for a given system leveraging that information. Develop a hybrid solution that integrates the capabilities of large language models, leveraging the mental models of experts and past ATOs into the ATO creation process. The Phase I effort will include prototype plans to be developed under Phase II.
PHASE II: Develop, test, and validate a prototype software toolset proof of concept. Recognizing that initially generated ATOs will lack quality, develop and engage in an iterative cycle of testing, design, and software refinement, and document a proposed concept of operations for employing the technology. The goal is to capture and adapt knowledge over time, incrementally improving the process through feedback from experts.
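One hypothetical way to realize the iterative feedback cycle just described is to let expert accept/reject verdicts reweight which generation templates the toolset prefers. The class name, neutral starting score, and update rule below are illustrative assumptions, not a prescribed design.

```python
# Hypothetical sketch of a Phase II feedback loop: expert verdicts on generated
# ATO sections nudge per-template quality estimates, so templates that experts
# approve are preferred over time. Names and the update rule are illustrative.
from collections import defaultdict

class FeedbackLoop:
    def __init__(self, learning_rate: float = 0.1):
        # Each template starts at a neutral 0.5 quality estimate.
        self.scores = defaultdict(lambda: 0.5)
        self.lr = learning_rate

    def record(self, template_id: str, expert_approved: bool) -> None:
        """Move the template's estimate toward 1.0 on approval, 0.0 on rejection."""
        target = 1.0 if expert_approved else 0.0
        self.scores[template_id] += self.lr * (target - self.scores[template_id])

    def best(self) -> str:
        """Template the generator should currently prefer."""
        return max(self.scores, key=self.scores.get)

loop = FeedbackLoop()
for template_id, approved in [("tmpl_a", True), ("tmpl_b", False), ("tmpl_a", True)]:
    loop.record(template_id, approved)
print(loop.best())  # prints: tmpl_a
```

The exponential-moving-average update is one simple choice; any scheme that accumulates SME judgments into the knowledge model would serve the same Phase II goal.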
Work in Phase II may become classified. Please see note in the Description section.
PHASE III DUAL USE APPLICATIONS: Mature the technology and seek approvals for deployment on DoD systems. Extend the capability to support higher-level security documentation. Investigate solutions for automated sustainment of the underlying knowledge models. Consider additional modular capabilities to extend utility and use throughout the ATO process.
Cybersecurity is an issue for commercial-sector organizations beyond the DoD, such as banking, medical, and civil infrastructure (e.g., power, water, GPS, and internet). As technology use continues to increase, so do the protections each organization must consider. The commercial sector will likely benefit from similar technology within these industries, as well as from means for commercial products used within households to offer stronger certifications/guarantees to consumers.
REFERENCES:
1. Kitchin, R. "Big Data, new epistemologies and paradigm shifts." Big Data & Society, 1(1), 2014. https://doi.org/10.1177/2053951714528481
2. Fan, J.; Han, F. and Liu, H. "Challenges of big data analysis." National Science Review, 1(2), February 5, 2014, pp. 293-314. https://doi.org/10.1093/nsr/nwt032
3. "Risk Management Framework." https://rmf.org/
4. Snoek, J.; Larochelle, H. and Adams, R. P. "Practical Bayesian optimization of machine learning algorithms." Advances in Neural Information Processing Systems, 25, 2012. https://www.cs.princeton.edu/~rpa/pubs/snoek2012practical.pdf
5. Takai, T. M. "DoDI 8510.01 Risk management framework (RMF) for DoD Information Technology (IT)." Department of Defense, March 12, 2014. https://www.esd.whs.mil/Portals/54/Documents/DD/issuances/dodi/851001p.pdf?ver=2019-02-26-101520-300
6. Ellett, J. M. and Khalfan, S. "The transition begins: DoD risk management framework." CHIPS, April-June 2014. https://www.doncio.navy.mil/chips/ArticleDetails.aspx?ID=5015
7. AcqNotes. (n. d.). "Risk management framework (RMF)." AcqNotes: Defense Acquisitions Made Easy. http://acqnotes.com/acqnote/careerfields/risk-management-framework-rmf-dod-information-technology
8. Defense Innovation Board. (n. d.). "AI principles: Recommendations on the ethical use of artificial intelligence by the Department of Defense." Department of Defense. https://media.defense.gov/2019/Oct/31/2002204458/-1/-1/0/DIB_AI_PRINCIPLES_PRIMARY_DOCUMENT.PDF
9. "Risk Management Framework (RMF)." National Institute of Standards and Technology. https://csrc.nist.gov/projects/risk-management/rmf-overview
10. "National Industrial Security Program Executive Agent and Operating Manual (NISP)." 32 C.F.R. § 2004.20 et seq. https://www.ecfr.gov/current/title-32/subtitle-B/chapter-XX/part-2004
KEYWORDS: Automated Software; Cybersecurity; Authority to Operate (ATO); Risk Management Framework (RMF); Deep Neural Network (DNN); Generative Artificial Intelligence
** TOPIC NOTICE **
The Navy Topic above is an "unofficial" copy from the Navy Topics in the DoD 25.1 SBIR BAA. Please see the official DoD Topic website at www.dodsbirsttr.mil/submissions/solicitation-documents/active-solicitations for any updates.
The DoD issued its Navy 25.1 SBIR Topics pre-release on December 4, 2024, which opens to receive proposals on January 8, 2025, and closes February 5, 2025 (12:00 p.m. ET).
Direct Contact with Topic Authors: During the pre-release period (December 4, 2024, through January 7, 2025), proposing firms have an opportunity to directly contact the Technical Point of Contact (TPOC) to ask technical questions about the specific BAA topic. Once DoD begins accepting proposals on January 8, 2025, no further direct contact between proposers and topic authors is allowed unless the Topic Author is responding to a question submitted during the pre-release period.
DoD On-line Q&A System: After the pre-release period, until January 22, at 12:00 PM ET, proposers may submit written questions through the DoD On-line Topic Q&A at https://www.dodsbirsttr.mil/submissions/login/ by logging in and following instructions. In the Topic Q&A system, the questioner and respondent remain anonymous but all questions and answers are posted for general viewing.
DoD Topics Search Tool: Visit the DoD Topic Search Tool at www.dodsbirsttr.mil/topics-app/ to find topics by keyword across all DoD Components participating in this BAA.
Help: If you have general questions about the DoD SBIR program, please contact the DoD SBIR Help Desk via email at [email protected]
Topic Q & A
1/11/25
Q. Is there a target operating system (e.g., Linux, Windows, OSX)?
A. There is not a specific target operating system; however, proposals should clearly articulate any expected limitations of a technical approach that is restricted to a specific OS, or the benefits of the selected OS to their technical approach and transition.
1/7/25
Q. Frequently Asked Questions provided by Technical Points of Contact for topic N251-019:
- What is the main problem the Navy is trying to solve with this SBIR?
- What are the key technical areas of focus in this project?
- What specific challenges have been identified in the manual ATO generation/review process that AI could address?
- Is there a particular use case / system expected to be processed by the system?
- What level of engagement with Navy subject matter experts (SMEs) will be provided?
- What types of data should the system expect to process?
- Will example data be provided?
- What resources are available to provide more information regarding DoD cybersecurity requirements?
- Who are the expected users of this system?
- Is there a preference between utilizing open-source/cloud-based or local/government-sponsored LLMs?
- What aspects of the proposal will be weighted most heavily?
- How many awards does the Navy intend to make for this topic?
A.
- The Navy seeks to reduce the extensive time, manpower, and costs required to create and maintain Authority to Operate (ATO) packages under the Risk Management Framework (RMF). The goal is to leverage artificial intelligence (AI) and machine learning (ML) to automate and optimize this process while maintaining compliance and accuracy.
- This topic seeks innovative solutions that address technical considerations such as human-in-the-loop and/or human-on-the-loop approaches, reliability, and transparency of the system. While increased utilization of AI to minimize human engagement would be a desired end state for the future, part of the effort should also explore the reliability and validity of the AI outputs to help inform how much human involvement would be needed to ensure quality and accurate outputs.
- The solicitation outlines the following desired outcomes: 1) decrease timeline to develop/review/approve packages (efficiency gains); 2) increase consistency/standardization of review process (increased package accuracy, comprehensiveness, and compliance with standards); and 3) promote scalability. To expand on outcome 3, the technical approach could demonstrate applicability to a variety of Navy training systems (e.g., computer-based training software, part task trainers, integrated complex training solutions, LVC training architectures), which addresses the range of our immediate training use cases. However, technical approaches that focus on use cases other than training systems would be within scope. Additionally, contractors should consider a long-term sustainment plan that would be needed to refine and maintain the model(s) over time. There is no preference or priority as to how these outcomes should be addressed. Proposals can choose to focus on one or two outcomes, address all outcomes, or suggest other solutions believed to positively impact the ATO generation/review process.
- No specific systems of interest, but technical solutions should address adaptability to various system types (scalability). As a training systems example, system types will range from desktop software to hardware/software simulators to LVC platforms that integrate with real/operational systems.
- During Phase I, our ability to provide access to government SMEs will be limited at best. Companies interested in pursuing are encouraged to consider internal expertise in the ATO/RMF process or partnering with groups who have appropriate expertise (e.g., academic institutions, consultants, companies). During Phase II, the government will attempt to provide periodic and iterative engagement with Navy SMEs throughout the development process. SME collaboration is critical to refining the technology and ensuring alignment with RMF and ATO requirements.
- This is dependent on the technical approach and specific models to be developed, but we expect relevant data sources might include policy / regulations, example documentation and packages, and expertise from SMEs and/or relevant training. While there are no restrictions or limitations to the type of data to be considered by a technical approach, contractors should consider methods and techniques to train models effectively and validate outputs without requiring government furnished information. During Phase II, the government can explore the potential to share relevant example data and SMEs to provide input and validate technology, but access will not be available during the initial Phase I period of performance.
- Contractors should consider methods and techniques to develop and validate the system without requiring government furnished information. Part of Phase I effort could include defining what would be necessary to be successful in Phase II (documentation / format) so the government can explore the potential to share relevant example data, if available, to continue model validation.
- In addition to the references listed in the solicitation and resources available to companies through internal expertise or external partnerships, the following links may be useful resources; however, technical approaches do not need to be limited to the content provided in these resources. https://www.navair.navy.mil/nawctsd/Cybersecurity-Information and https://www.doncio.navy.mil/contentview.aspx?id=16503
- Targeted user groups will depend on individual technical approaches, but could include small businesses, prime contractors, and government individuals that are part of the ATO process (e.g., system owners, authorizing officials, security officers).
- Open-source capabilities are not necessarily out of bounds; however, vendors should be mindful of the need to protect information being brought into the system to build an ATO package. Depending on the technical approach and ability to ensure data is protected, leveraging open-source capabilities may or may not be a feasible approach.
- Evaluators look at technical approach, personnel qualifications, and commercialization process. Technical approach involves innovation, feasibility of technology/solution/approach, and ability to meet the requirements as outlined in the topic.
- The number of awards made is dependent on the quality of proposals received and funding available; typically, 2-4 Phase I contracts are awarded.
1/5/25
Q.
- To what extent should the proposed solution rely on large language models (LLMs) versus other AI/ML techniques, such as neuro-symbolic AI or Bayesian optimization? Are there preferred methods or technologies?
- What specific mechanisms or tools are expected to capture the expertise and mental models of subject matter experts (SMEs)? Should the solution include features for ongoing feedback and iterative learning?
- What criteria will be used to assess the effectiveness and quality of the ATO generation process during Phases I and II? Are there baseline expectations for efficiency gains or accuracy?
- Will existing ATO documentation, RMF guidelines, or other data be provided to aid in training and development? Should synthetic datasets or other external resources be created independently?
- What specific export control or ITAR restrictions must be considered when developing and deploying the solution? Are there additional requirements for handling classified or sensitive information during Phase II?
- Should the design explicitly consider adaptability for commercial sector applications (e.g., banking, healthcare)? If so, are there prioritized industries or features to include?
A.
- There are no preferred methodologies or technologies. The topic is broadly written to encourage exploration of AI/ML technologies across a range of documents/artifacts that are part of the ATO process. Proposals should provide context on the feasibility of the proposed technology to produce the specific outputs being pursued as part of their technical approach, the return on investment for leveraging AI/ML technology for those outputs, and other relevant challenges/strengths of their approach (e.g., human-in-the-loop vs. human-on-the-loop approaches, reliability, transparency). Neuro-symbolic AI was identified to emphasize the desire for a solution that supports transparency and produces understandable results.
- This is dependent on the technical approach, and there are no specific requirements on a method, tool, or periodicity for addressing this technical objective. Proposals should clearly identify how they intend to address it in their specific technical approach and whether/how refinement over time is addressed.
- The evaluation team will be looking for how contractors define and expect to measure performance metrics / benchmarks. At a minimum, the government considers there to be a need for well-defined performance metrics to quantify efficiency gains, accuracy, and compliance improvements. As a starting point, contractors will determine metrics to address these 3 criteria (efficiency gains, accuracy, and compliance improvements) for testing; during Phase II, the government will seek additional input from ISSO/ISSM SMEs to help refine contractor derived metrics to increase the likelihood of successful transition.
- Contractors should consider methods and techniques to develop and validate the system without requiring government furnished information. Companies interested in pursuing are encouraged to consider internal expertise in the ATO/RMF process or partnering with groups who have appropriate expertise (e.g., academic institutions, consultants, companies). Part of Phase I effort could include defining what would be necessary to be successful in Phase II (documentation / format) so the government can explore the potential to share relevant example data, if available, to continue model validation.
- During Phase I, there is an expectation that efforts will remain UNCLASS to increase flexibility to the early development. However, to increase the likelihood of successful transition, efforts will need to be able to address a range of export control/ITAR systems and classified technology. How this is addressed and what level(s) can be addressed will be dependent on individual technical approach.
- A goal of the SBIR program is to increase private sector commercialization of innovations derived from Federal R/R&D. As a result, it would be acceptable to consider adaptability for commercial sector applications, but there are no prioritized industries or features to include.
1/3/25
Q.
- The topic description seems to be split into two main focus areas: reduction of hallucinations in utilized LLM solutions and serializing the mental models of SMEs into something that improves DNN outputs. Is one of these a higher priority than the other? Or both equal factors for evaluating a proposed solution?
- Are there any specific types of DoD IT in focus for this effort, referring to the categorization utilized in Ref 6? Information Systems, PIT, IT Services, and/or IT Products?
- Is there an expectation for integration into the RMF Knowledge Service (KS)?
- Is it safe to assume the “capture” phase of the hardware/software configuration induction step of the model will be highly error intolerant? i.e., mistakes here, or at the regulatory crosscheck phase, would be disqualifying to the solution, even with SMEs in the loop? How important is generation of possible operational contingencies in case of exploit or system downtime?
- “The desired outcome of this SBIR topic is to develop technology and a methodology to work with SMEs to capture their expertise and mental models on the RMF and ATO process.” Will winning proposals have access to NAVAIR human SME annotators to facilitate this solutioning? Can we safely assume that a SME facing data collection and annotation UI should be included in some form even in the Phase 1 solution?
- Are your “Expected outcomes” in ranked order? Certain AI modeling solutions would be appropriate for addressing the “more efficient ATO drafting” task, while others would be more appropriate for fault intolerant “high consistency” regulatory cross checking use cases.
- Are “mental models of experts” normalized in any way? Are they collected at all currently? How many experts do you anticipate participating? Will these experts be able to offer "toy" or "notional" data for initial prototyping that won't require lengthy security reviews/clearances? Would we be correct in reading this as a qualitative data collection phase necessitating surveys/interviews and relatively frequent user testing?
- How will the Phase I prototype be evaluated? Are there specific criteria for performance, accuracy, user satisfaction, or system integration that should be met? What success criteria will indicate that the Phase I effort has achieved its goals?
A.
- The topic is broadly written to encourage exploration of AI / ML technologies to a range of documents / artifacts or process improvement capabilities that are part of achieving an ATO. Proposals should provide context on the feasibility of proposed technology to provide the specific outputs being pursued as part of their technical approach, which aspect(s) of the problem are being addressed, the return on investment for leveraging AI/ML technology in specific areas (e.g., expected metrics), and other relevant challenges/strengths of their approach (e.g., human-in-the-loop vs human-on-the-loop approaches, reliability, transparency).
- No specific systems of interest, but technical solutions should address adaptability to various system types (scalability) focusing initially on training systems, and should clearly articulate what aspect of the IT process is being supported by the proposed technology. As a training systems example, system types will range from desktop software (e.g., courseware) to hardware/software simulators (e.g., virtual cockpit simulators) to LVC training solutions (i.e., environments that integrate virtual and/or constructive training technology with real/operational systems).
- Not at this time.
- This is dependent on technical approach, and benchmarks and expected target performance metrics should be defined based on that technical approach.
- During Phase I, our ability to provide access to government SMEs will be limited at best. Companies interested in pursuing are encouraged to consider internal expertise in the ATO/RMF process or partnering with groups who have appropriate expertise (e.g., academic institutions, consultants, companies). During Phase II, the government will attempt to provide periodic and iterative engagement with Navy SMEs throughout the development process. Regarding the second question, this would be dependent on technical approach, but within scope of the topic.
- There is no preference or priority as to how these outcomes should be addressed. Proposals can choose to focus on one or two outcomes early, address all outcomes, or suggest other solutions believed to positively impact the ATO generation/review process as part of their phased approach. However, the proposal should make clear what is being addressed in Phase I, what the expected Phase II objectives are, and what is out of scope of the technical approach.
- While there are no restrictions or limitations to the type of data to be considered by a technical approach, contractors should consider methods and techniques to train models effectively and validate outputs without requiring government furnished information. Additionally, companies are encouraged to consider internal expertise in the ATO/RMF process or partnering with groups who have appropriate expertise (e.g., academic institutions, consultants, companies) as our ability to provide SMEs during early stages of this topic will be limited.
- The evaluation team will be looking for how contractors define and expect to measure performance metrics / benchmarks. At a minimum, the government considers there to be a need for well-defined performance metrics to quantify efficiency gains, accuracy, and compliance improvements. As a starting point, contractors will determine metrics to address these 3 criteria (efficiency gains, accuracy, and compliance improvements) for testing; during Phase II, the government will seek additional input from ISSO/ISSM SMEs to help refine contractor derived metrics to increase the likelihood of successful transition.
12/31/24
Q.
ATO and Data Requirements:
- Should the tool address all six RMF lifecycle steps, or are there specific phases (e.g., categorization, control selection) that require more focus in Phase I?
SME Collaboration:
- What is the specific role of SMEs in the current ATO process (e.g., decision-making, validation)?
- Will SMEs assist in defining symbolic rules, refining tool outputs, or identifying knowledge gaps?
Tool Expectations:
- Should the tool prioritize creating explainable outputs, such as confidence scores or reasoning trails?
- Should the tool automate the generation of documents like SARs or POA&Ms to streamline the ATO approval process?
A.
ATO and Data Requirements:
- This is dependent on technical approach. There is no preference or priority as to how these aspects of the ATO process should be addressed. Proposals can choose to focus on one part of the process or to address the entire process.
SME Collaboration:
- Companies interested in pursuing are encouraged to consider internal expertise in the ATO/RMF process or partnering with groups who have appropriate expertise (e.g., academic institutions, consultants, companies) to better understand SME roles in the current ATO process.
- During Phase I, our ability to provide access to government SMEs will be limited at best. Again, companies interested in pursuing are encouraged to consider internal expertise in the ATO/RMF process or partnering with groups who have appropriate expertise. During Phase II, the government will attempt to provide periodic and iterative engagement with Navy SMEs throughout the development process. SME collaboration is critical to refining the technology and ensuring alignment with RMF and ATO requirements.
Tool Expectations:
- This is dependent on technical approach and within scope of this topic.
- This is dependent on technical approach and within scope of this topic.
12/10/24
Q. Is it intended for the desired system to use AI/ML to create an ATO letter/Authorization Determination Decision (ADD), or to create the RMF/eMASS artifacts that contribute to a system receiving an ATO?
A. The topic is broadly written to encourage exploration of AI/ML technology across a range of documents/artifacts that are part of the ATO process. Proposals should provide context on the feasibility of the proposed technology to produce the specific outputs being pursued as part of their technical approach, the return on investment for leveraging AI/ML technology for those outputs, and other relevant challenges/strengths of their approach (e.g., human-in-the-loop vs. human-on-the-loop approaches, reliability, transparency).