Database memory forensics: A machine learning approach to reverse-engineer query activity

Authors: Mahfuzul I. Nissan, James Wagner, Sharmin Aktar

DFRWS EU 2023

Abstract

Memory analysis allows forensic investigators to establish a more complete timeline of system activity using a snapshot of main memory (i.e., RAM). Investigators may rely on such analysis to detect malicious activity and understand the scope of what data was exfiltrated. This is of particular interest in the presence of incomplete or untrusted logs, where a privileged user (or an attacker with such capabilities) can altogether bypass or disable logging. In such instances, a forensic investigator can still rely on the fact that data must ultimately be processed in memory, regardless of the information that is recorded in audit logs.
In this work, we propose methods to reverse-engineer query activity from a database management system (DBMS) process snapshot. Since DBMSes manage and store an organization’s most sensitive data, they are of particular concern for data exfiltration. A DBMS processes queries using a series of operations, such as index sort, file sort, or joins, each of which produces its own set of distinct forensic artifacts in memory. Our methods use these artifacts to draw conclusions about recent query activity even in the presence of untrusted or incomplete logs. Specifically, we use a supervised learning model based on support vector machines (SVM) to approximate recently executed queries from these memory artifacts. We extract feature vectors from the byte frequencies in a special area of the DBMS process called the sort area fragment, and use the SVM to predict the type of query operation. We demonstrate the capabilities and accuracy of our methods for two representative DBMSes, PostgreSQL and MySQL. Experimental results show that our model achieved accuracies of 92% and 90% on the MySQL and PostgreSQL datasets, respectively.
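To make the byte-frequency features mentioned above concrete, the following is a minimal sketch, not the authors' implementation. It assumes a memory fragment is available as a raw byte string and maps it to a 256-dimensional vector of relative byte frequencies. The function name `byte_frequency_vector` and the sample fragment are hypothetical and chosen only for illustration.

```python
from collections import Counter


def byte_frequency_vector(fragment: bytes) -> list[float]:
    """Map a raw memory fragment to 256 relative byte frequencies.

    Hypothetical helper for illustration; the paper's exact feature
    extraction may differ.
    """
    if not fragment:
        # An empty fragment carries no information: return all zeros.
        return [0.0] * 256
    counts = Counter(fragment)  # iterating over bytes yields ints 0..255
    total = len(fragment)
    return [counts.get(value, 0) / total for value in range(256)]


# Made-up fragment standing in for bytes carved from a sort area.
sample = b"alice\x00\x00bob\x00\x00carol\x00\x00"
features = byte_frequency_vector(sample)
```

Vectors of this shape, each labeled with the operation that produced the fragment, could then serve as training data for any standard SVM classifier.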

Downloads

Database memory forensics: A machine learning approach to reverse-engineer query activity (Paper)