Real Time Prediction of Drive by Download Attacks on Twitter

Amir Javed, Pete Burnap, Omer Rana

The popularity of Twitter for information discovery, coupled with the automatic shortening of URLs to save space, given the 140 character limit, provides cyber criminals with an opportunity to obfuscate the URL of a malicious Web page within a tweet. Once the URL is obfuscated the cyber criminal can lure a user to click on it with enticing text and images before carrying out a cyber attack using a malicious Web server. This is known as a drive-by- download. In a drive-by-download a user's computer system is infected while interacting with the malicious endpoint, often without them being made aware, the attack has taken place. An attacker can gain control of the system by exploiting unpatched system vulnerabilities and this form of attack currently represents one of the most common methods employed. In this paper, we build a machine learning model using machine activity data and tweet meta data to move beyond post-execution classification of such URLs as malicious, to predict a URL will be malicious with 99.2% F-measure (using 10-fold cross validation) and 83.98% (using an unseen test set) at 1 second into the interaction with the URL. Thus providing a basis from which to kill the connection to the server before an attack has completed and proactively blocking and preventing an attack, rather than reacting and repairing at a later date.

Knowledge Graph

arrow_drop_up

Comments

Sign up or login to leave a comment