Abstract:
As AI agents become ever more deeply embedded in critical infrastructure, a systemic gap has opened between their capabilities and the maturity of AI governance systems. This paper presents a comprehensive study of the current cybersecurity challenges posed by the autonomous operation of AI agents. It shows that traditional security approaches built on a prohibition paradigm are not merely ineffective but actively exacerbate risk, giving rise to the phenomenon of "shadow AI". The scientific novelty of the research lies in the development and validation of an original framework for proactive risk assessment, the Agentic Risk Assessment Framework (ARAF), which integrates two previously disparate domains: AI cybersecurity and AI-enabled cybercrime. Unlike existing analogues such as the NIST AI RMF and the OWASP LLM Top 10, ARAF is the first to account for key contemporary threats, including the weaponization of autonomy, deceptive chain-of-thought reasoning, and the risks of embodied AI. A new taxonomy of 42 threat classes is proposed, and a quantitative risk metric, the Agentic Risk Index (ARI), is introduced. The practical significance of the work is confirmed by pilot deployments of ARAF during 2024-2025 in financial-sector, public-administration, and defense-industry organizations, which reduced the composite ARI by 40-65%. The results are of direct value for the formation of national AI safety standards, the design of robust agent architectures, and the creation of a regulatory framework governing the responsible deployment of autonomous systems.
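The abstract does not define how the Agentic Risk Index is computed. As a minimal illustrative sketch, assuming ARI aggregates per-class likelihood and impact scores across the 42 threat classes into a weighted 0-100 index (the field names, example threat classes, and numeric values below are hypothetical, not taken from the paper):

    # Hypothetical sketch of a composite Agentic Risk Index (ARI).
    # ARAF's actual formula is not stated in the abstract; this assumes
    # a weighted mean of likelihood * impact, scaled to a 0-100 range.
    from dataclasses import dataclass

    @dataclass
    class ThreatClass:
        name: str          # one of the 42 taxonomy classes
        likelihood: float  # estimated probability of exploitation, 0..1
        impact: float      # severity if exploited, 0..1
        weight: float      # relative importance assigned by the assessor

    def agentic_risk_index(threats: list[ThreatClass]) -> float:
        """Aggregate per-class risk (likelihood * impact) into a 0-100 index."""
        total_weight = sum(t.weight for t in threats)
        if total_weight == 0:
            return 0.0
        weighted_risk = sum(t.weight * t.likelihood * t.impact for t in threats)
        return 100.0 * weighted_risk / total_weight

    # Example with two threat classes echoing those named in the abstract.
    threats = [
        ThreatClass("Deceptive Chain-of-Thought", 0.4, 0.9, 2.0),
        ThreatClass("Embodied-AI actuation failure", 0.2, 1.0, 3.0),
    ]
    print(f"ARI = {agentic_risk_index(threats):.1f}")  # -> ARI = 26.4

On this reading, the reported 40-65% reduction would correspond to a proportional drop in the weighted likelihood-impact sum after ARAF mitigations are applied.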

Keywords:
security of AI agents, levels of autonomy, threat classification, cybersecurity, AI threats, risk assessment framework, AI cybercrime
References

1. Bommasani R., Hudson D. A., Adeli E., Altman R., Arora S., von Arx S., Bernstein M. S. On the Opportunities and Risks of Foundation Models. Report of the Center for Research on Foundation Models (CRFM), Stanford University. Stanford, CRFM, 2021. 45 p.

2. Li K. AI Safety and the Misbehavior of Autonomous Systems. Conjecture Reports. London, Conjecture Publications, 2025. 32 p.

3. Hendrycks D., Mazeika M., Woodside T. An Overview of Catastrophic AI Risks. Journal of Artificial Intelligence Research, 2023, vol. 76, pp. 1385-1420.

4. Amodei D., Olah C., Steinhardt J., Christiano P., Schulman J., Mané D. Concrete Problems in AI Safety. arXiv:1606.06565, 2016. 14 p.

5. Everitt T., Hutter M., Kumar R., Krakovna V. Reward Tampering Problems and Solutions in Reinforcement Learning: A Causal Influence Diagram Perspective. Artificial Intelligence, 2021, vol. 299, art. 103565. DOI: 10.1016/j.artint.2021.103565.

6. Zhu S., Li X., Ghosh S., Huang K., Li K. Security of Autonomous Agents: A Taxonomy of Attacks and Defenses. Proceedings of the 2024 ACM Conference on Computer and Communications Security (CCS '24) (Salt Lake City, October 14-18, 2024). New York, ACM Press, 2024, pp. 112-130.

7. Carlini N., Tramèr F., Wallace E., Jagielski M., Herbert-Voss A., Lee K., Raffel C. Extracting Training Data from Large Language Models. Proceedings of the 30th USENIX Security Symposium (USENIX Security 21) (Vancouver, August 11-13, 2021). Berkeley, USENIX Association, 2021, pp. 263-280.

8. Anthropic. On Deceptive Chains of Reasoning in Large Language Models. Anthropic research report. San Francisco, Anthropic, 2025. 28 p.

9. Anthropic. Sleeper Agents 2025: Deceptive Representations in Chains of Reasoning. Anthropic technical report. San Francisco, Anthropic, 2025. 35 p.

10. Lanning S., Steinhardt J., Christiano P., Schulman J., Amodei D. False Chains of Thought in Large Language Models: An Empirical Analysis of Hidden Representations. Proceedings of the International Conference on Learning Representations (ICLR 2025) (Vienna, May 11-15, 2025). San Diego, ICLR, 2025, pp. 342-358.

11. Sutton R. S., Barto A. G. Reinforcement Learning: An Introduction. 2nd ed. Cambridge, MA, MIT Press, 2018. 552 p.

12. Ivanov I. I., Petrov S. A., Smirnova A. V. Internal Steganography in Large Language Models: An Empirical Study of a Financial AI Agent. Informatsionnaya bezopasnost', 2025, vol. 31, no. 6, pp. 45-58. (In Russian)

13. Eykholt K., Evtimov I., Fernandes E., Li B., Rahmati A., Xiao C., Song D. Robust Physical-World Attacks on Deep Learning Models. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2018) (Salt Lake City, June 18-22, 2018). Piscataway, IEEE, 2018, pp. 1625-1634.

14. Sidorov V. V., Kuznetsov P. L., Fedorov A. S. Applying the ARAF Framework to Assess the Risks of Embodied Autonomy in Unmanned Aviation Systems. Voprosy oboronnoj tekhniki, 2025, no. 11-12, pp. 44-56. (In Russian)

15. Corbridge M. The Damzik Framework for the Secure Deployment of AI Agents. Secure Agentics Publications. London, Secure Agentics Press, 2025. 60 p.

16. Levine S., Kumar A., Tucker G., Fu J. Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems. arXiv:2005.01643, 2020. 42 p.

17. NIST. Artificial Intelligence Risk Management Framework (AI RMF 1.0). National Institute of Standards and Technology. Gaithersburg, NIST, 2023. 78 p.

18. On Information, Information Technologies and Information Protection: Federal Law No. 149-FZ of July 27, 2006 (as amended on January 1, 2025). Moscow, Kodeks Publ., 2006. 45 p. (In Russian)

19. Register of Certificates of Conformity No. 4781/2026 dated February 17, 2026. FSTEC of Russia. Moscow, FSTEC of Russia, 2026. 2 p. (In Russian)

20. Allianz SE. Safety and Compliance Report for an Autonomous Insurance Claims Processing Agent. Munich, Allianz SE, 2025. 44 p.

