Authors: Gregory Conti, Sergey Bratus, Benjamin Sangster, Roy Ragsdale, Matthew Supan, Andrew Lichtenberg, Robert Perez-Alemany, Anna Shubina
DFRWS USA 2010
Abstract
Security analysts, reverse engineers, and forensic analysts are regularly faced with large binary objects, such as executable and data files, process memory dumps, disk images and hibernation files, often Gigabytes or larger in size and frequently of unknown, suspect, or poorly documented structure. Binary objects of this magnitude far exceed the capabilities of traditional hex editors and textual command line tools, frustrating analysis. This paper studies automated means to map these large binary objects by classifying regions using a multi-dimensional, information-theoretic approach. We make several contributions including the introduction of the binary mapping metaphor and its associated applications, as well as techniques for type classification of low-level binary fragments. We validate the efficacy of our approach through a series of classification experiments and an analytic case study. Our results indicate that automated mapping can help speed manual and automated analysis activities and can be generalized to incorporate many low-level fragment classification techniques.