Authors: Cezar Serhal (University College Dublin) and Nhien-An Le-Khac (University College Dublin)



With the rapid increase in mobile phone storage capacity and penetration, digital forensic investigators face a significant challenge in quickly identifying relevant examinable files among the plethora of uninteresting OS and application files extracted by forensic tools. This challenge can have serious adverse effects in time-critical cases and can increase case backlogs. A possible solution is to prioritize digital artifacts, a process referred to as triage. Several digital forensic triage methodologies based on classical automation techniques, such as block hash and regular expression matching, have been proposed. However, such techniques suffer from the significant limitation of requiring users to know and hardcode the data templates and relations of interest. In the literature, more flexible machine learning based approaches have been proposed to classify whether a mobile device, rather than a mobile device artifact, is of interest, based on its usage metrics and file-system metadata. More recently, an approach was proposed and tested for triaging data generated by and extracted from a computer-based operating system. However, this approach did not cover smart mobile operating systems, and it did not consider key steps such as feature engineering, feature selection, and hyper-parameter tuning. Hence, in this paper, we propose a comprehensive machine learning based solution that uses features extracted from file metadata to identify smartphone files of potential interest that should be examined. A range of classification algorithms is tested and their performance compared. Our classification models were trained and tested on a dataset consisting of the metadata of nearly 2 million files extracted from devices running Android OS and linked to real terrorism cases. The use of real case data yields realistic results, and restricting the operating system and case type narrows the experimental scope enough to provide a proof of concept. Through our experiments, the best-performing classifier is also identified.
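
To make the idea of features extracted from file metadata concrete, the following is a minimal illustrative sketch in Python. It is not the feature set used in this work: the function name `metadata_features`, the chosen fields, and the storage path prefixes are all hypothetical and serve only to show how raw metadata (path, size, timestamps) might be turned into inputs for a classifier.

```python
import math
from pathlib import PurePosixPath


def metadata_features(path: str, size_bytes: int,
                      created: float, modified: float) -> dict:
    """Derive simple features from one file's metadata.

    Hypothetical feature set for illustration only; timestamps are
    assumed to be POSIX seconds.
    """
    p = PurePosixPath(path)
    # Path components excluding the root "/", file name included.
    components = [part for part in p.parts if part != "/"]
    return {
        # Lower-cased extension without the dot, or "none" if absent.
        "extension": p.suffix.lower().lstrip(".") or "none",
        "depth": len(components),
        # Log-scaled size compresses the very wide range of file sizes.
        "log_size": math.log1p(size_bytes),
        "name_length": len(p.name),
        "modified_after_creation": int(modified > created),
        # Assumed prefixes for user-accessible storage on Android.
        "in_user_storage": int(path.startswith(("/sdcard/", "/storage/"))),
    }


example = metadata_features(
    "/sdcard/DCIM/Camera/IMG_0001.JPG",
    size_bytes=2_400_000,
    created=1_600_000_000.0,
    modified=1_600_000_050.0,
)
```

Feature dictionaries of this kind could then be encoded numerically and passed to any of the classification algorithms being compared, which avoids hardcoding templates in the way that hash or regular expression matching requires.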