Modelling direct messaging networks with multiple recipients for cyber deception

Authors

Moore, Kristen
Christopher, Cody James
Liebowitz, David
Nepal, Surya
Selvey, Renee

Publisher

Institute of Electrical and Electronics Engineers Inc.

Abstract

Cyber deception is the practice of deliberately introducing fake or misleading artefacts into cyber systems. It is emerging as a promising approach to defending networks and systems against attackers and data thieves. However, although cyber deception is relatively cheap to deploy [1], generating realistic content at scale is very costly when it is hand-crafted. With recent improvements in machine learning, we now have the opportunity to bring scale and automation to the creation of realistic and enticing simulated content. In this work, we propose a framework to automate the generation of email and instant messaging-style group communications at scale. Such messaging platforms within organisations hold a great deal of valuable information in private communications and document attachments, making them an enticing target for an adversary. The presence of an active messaging platform also enhances the realism of a deceptive network simulation, contributing both traffic and message artefacts. We address two key aspects of simulating this type of system: modelling when and with whom participants communicate, and generating topical, multi-party text to populate simulated conversation threads. We present the LogNormMix-Net Temporal Point Process (TPP) as an approach to the first of these, building upon the intensity-free modelling approach of Shchur et al. [2] to create a generative model for unicast and multicast communications. We demonstrate the use of fine-tuned, pretrained language models to generate convincing multi-party conversation threads. A live email server is simulated by uniting our LogNormMix-Net TPP, which generates the communication timestamp, sender and recipients, with the language model, which generates the contents of the multi-party email threads. We evaluate the generated content with respect to a number of realism-based properties that encourage a model to learn to generate content that will engage the attention of an adversary to achieve a deception outcome. Our simulations run in real time, making them suitable for deployment in cyber deception as a honeypot in its own right, or as part of a larger deception environment.
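
As a rough illustration of the first of the two aspects described in the abstract (when and with whom participants communicate), the sketch below draws inter-event times from a mixture of log-normal distributions, the distribution family used in intensity-free TPP modelling, and attaches a sender and one or more recipients to each event. It is not the LogNormMix-Net model: in a learned model the mixture parameters and the sender and recipient probabilities would be conditioned on the event history, whereas here they are fixed constants. The function names, the user list, and all parameter values are assumptions chosen for illustration only.

```python
import random


def sample_interevent_time(weights, mus, sigmas, rng):
    """Draw one inter-event time from a mixture of log-normal components."""
    # Pick a mixture component according to its weight, then sample from it.
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return rng.lognormvariate(mus[k], sigmas[k])


def simulate_messages(n_events, users, weights, mus, sigmas, recipient_prob, seed=0):
    """Generate (timestamp, sender, recipients) tuples for a toy message log.

    Assumption: sender is uniform over users and each other user is included
    as a recipient independently with probability recipient_prob. A trained
    model would instead predict these from the communication history.
    """
    rng = random.Random(seed)
    t = 0.0
    events = []
    for _ in range(n_events):
        t += sample_interevent_time(weights, mus, sigmas, rng)
        sender = rng.choice(users)
        others = [u for u in users if u != sender]
        recipients = [u for u in others if rng.random() < recipient_prob]
        if not recipients:
            # Every message needs at least one recipient (the unicast case).
            recipients = [rng.choice(others)]
        events.append((t, sender, recipients))
    return events


events = simulate_messages(
    n_events=5,
    users=["alice", "bob", "carol", "dave"],  # hypothetical participants
    weights=[0.7, 0.3],   # hypothetical mixture weights
    mus=[0.0, 3.0],       # short gaps (replies) and long gaps (new threads)
    sigmas=[0.5, 1.0],
    recipient_prob=0.4,
)
for timestamp, sender, recipients in events:
    print(f"{timestamp:8.2f}  {sender} -> {', '.join(recipients)}")
```

A mixture with one short-gap and one long-gap component is a simple way to mimic bursts of replies separated by quieter periods; the text of each message would be produced separately by a language model.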

Book Title

Proceedings - 7th IEEE European Symposium on Security and Privacy: EUROS&P

Entity type

Publication
