Authors: Mahfuzul I. Nissan, James Wagner, Alexander Rasin
DFRWS USA 2025 — “History in the Making” — Jubilee 25th Anniversary
Abstract
The increased use of NoSQL databases to store and manage data has led to a demand to include them in forensic investigations. NoSQL databases use storage formats that vary widely from one system to another and differ from those targeted by file carving and relational database forensics. For example, some NoSQL databases manage key-value pairs using B-Trees, while others rely on hash tables or even binary serialization protocols. Current research on NoSQL carving focuses on single-database solutions, yet developing an individual carver for every NoSQL system is impractical. This necessitates a generalized approach to forensic recovery, enabling the creation of a unified carver that can operate effectively across various NoSQL platforms.
In this research, we introduce the Automated NoSQL Carver (ANOC), a novel tool designed to reconstruct database contents from raw database images without relying on the database API or logs. ANOC adapts to the unique storage characteristics of various NoSQL systems, utilizing byte-level reverse engineering to identify and parse data structures. The proposed framework addresses challenges such as the lack of explicit paging, dynamic segment headers, and metadata obfuscation common in NoSQL databases. By analyzing storage layouts algorithmically, ANOC identifies and reconstructs key-value pairs, hierarchical storage structures, and associated metadata across multiple NoSQL platforms.
Through extensive experimentation, we demonstrate ANOC’s ability to recover data from four representative key-value NoSQL databases: Berkeley DB, ZODB, etcd, and LMDB. We also explore ANOC’s limitations when operating on corrupted data and on RAM snapshots. Our findings establish the feasibility of a generalized carver capable of addressing the challenges posed by the diverse and evolving NoSQL ecosystem.