Digital forensics involves analyzing digital devices to recover evidence related to cybercrimes, data breaches, or malicious activities. Python is widely used in forensics due to its powerful libraries, automation capabilities, and ability to analyze various types of digital artifacts.
What You’ll Learn
✔ Acquiring and processing forensic data
✔ File system and memory analysis
✔ Log file and network forensics
✔ Email and metadata extraction
✔ Automating forensic investigations
1. Setting Up a Forensic Environment
Required Tools & Libraries
- Forensic Libraries
- pytsk3 (File system forensics using Sleuth Kit)
- dfvfs (Digital Forensics Virtual File System)
- volatility (Memory forensics)
- scapy (Network forensics)
- pymisp (Malware Information Sharing)
- exiftool (Metadata extraction)
- hashlib (Hash verification)
- yara-python (Malware pattern detection)
- Forensic Tools
- Autopsy (GUI-based forensic suite)
- FTK Imager (Disk imaging)
- Wireshark (Network packet analysis)
2. File System Forensics with Python
Forensic investigators often analyze file systems to detect tampered or hidden files.
Extracting Metadata from Files
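import os
import time
file_path = "evidence.txt"
file_stat = os.stat(file_path)
print(f"File Size: {file_stat.st_size} bytes")
# st_ctime is the creation time on Windows, but the last metadata (inode) change on Unix
print(f"Creation/Change Time: {time.ctime(file_stat.st_ctime)}")
print(f"Last Access Time: {time.ctime(file_stat.st_atime)}")
print(f"Last Modification Time: {time.ctime(file_stat.st_mtime)}")
Useful for spotting recently modified or accessed files. Note that os.stat() only works on files that still exist; examining deleted files requires file-system-level tools such as pytsk3.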
Searching for Hidden Files
import os
for root, dirs, files in os.walk("/suspect_folder"):
    for file in files:
        if file.startswith("."):  # Dot-prefixed names are hidden on Unix-like systems
            print(f"Hidden file found: {os.path.join(root, file)}")
Detects hidden files that may contain illicit data.
3. Hashing and File Integrity Checks
Investigators use hashing to verify file integrity and detect tampering.
Calculating SHA-256 Hashes of Files
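import hashlib
def get_hash(file_path, chunk_size=65536):
    # Read in chunks so large evidence files do not have to fit in memory
    sha256 = hashlib.sha256()
    with open(file_path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            sha256.update(chunk)
    return sha256.hexdigest()
print(get_hash("evidence.txt"))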
Compare against known file hashes in forensic databases.
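As a minimal sketch of that comparison, the helper below checks a batch of files against a set of known SHA-256 digests. The function names (sha256_of, match_known_hashes) and the idea of passing the known digests as a plain Python collection are assumptions made for illustration; a real investigation would load the digests from whatever reference database is in use.

```python
import hashlib


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def match_known_hashes(paths, known_hashes):
    """Return {path: digest} for every file whose digest is in known_hashes.

    known_hashes is a hypothetical in-memory collection of hex digests;
    comparison is case-insensitive.
    """
    known = {h.lower() for h in known_hashes}
    matches = {}
    for path in paths:
        digest = sha256_of(path)
        if digest in known:
            matches[str(path)] = digest
    return matches
```

Calling match_known_hashes(["evidence.txt"], known_bad_digests) returns an empty dictionary when nothing matches, so the result can be used directly in an if statement.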
4. Memory Forensics with Volatility
Memory forensics helps recover volatile evidence like running processes, open network connections, and encryption keys.
Listing Running Processes from Memory Dump
volatility -f memory.dmp --profile=Win10x64 pslist
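Identify suspicious processes running in memory. The commands in this section use Volatility 2 syntax; Volatility 3 drops the --profile option and uses plugin names such as windows.pslist.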
Extracting Registry Hives from Memory
volatility -f memory.dmp --profile=Win10x64 hivelist
Lists the registry hives loaded in memory. Their keys can then be examined (for example with the printkey plugin) for entries related to malware persistence.
5. Log File Analysis with Python
Analyzing system and application logs can provide insights into unauthorized access and anomalies.
Parsing Linux Authentication Logs
log_file = "/var/log/auth.log"
# errors="replace" keeps the loop running if the log contains undecodable bytes
with open(log_file, "r", errors="replace") as f:
    for line in f:
        if "Failed password" in line:
            print(line.strip())  # Potential brute-force attack
Detect unauthorized login attempts.
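Printing every failed attempt gets noisy quickly, so it often helps to count failures per source address. The sketch below assumes the common OpenSSH message format ("Failed password for [invalid user] NAME from ADDRESS ..."); the regular expression, the function names, and the default threshold of 5 are illustrative assumptions and may need adjusting for other log formats.

```python
import re
from collections import Counter

# Assumed OpenSSH format: "Failed password for [invalid user] NAME from ADDRESS port ..."
FAILED_RE = re.compile(r"Failed password for (?:invalid user )?(\S+) from (\S+)")


def count_failed_logins(lines):
    """Count failed password attempts per source address."""
    counts = Counter()
    for line in lines:
        match = FAILED_RE.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts


def suspected_brute_force(lines, threshold=5):
    """Return {address: attempts} for sources at or above the threshold."""
    counts = count_failed_logins(lines)
    return {ip: n for ip, n in counts.items() if n >= threshold}
```

Both functions accept any iterable of lines, so an open file object for /var/log/auth.log can be passed in directly.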
Analyzing Windows Event Logs
import win32evtlog
server = "localhost"
logtype = "Security"
hand = win32evtlog.OpenEventLog(server, logtype)
total = win32evtlog.GetNumberOfEventLogRecords(hand)
print(f"Total security logs: {total}")
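This only counts the records. Reading the individual events with win32evtlog.ReadEventLog() lets you identify login failures (such as event ID 4625), privilege escalations, or policy violations.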
6. Network Forensics with Python
Network logs and packet captures (PCAPs) are crucial in investigating attacks.
Capturing Network Traffic with Scapy
from scapy.all import sniff
def packet_callback(packet):
    print(packet.summary())
# Live capture usually requires root/administrator privileges
sniff(prn=packet_callback, count=10)
Detect unusual network activity.
Extracting URLs from PCAP Files
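import pyshark
# The display filter restricts parsing to HTTP requests, which carry the Host header
cap = pyshark.FileCapture("network.pcap", display_filter="http.request")
for packet in cap:
    host = getattr(packet.http, "host", None)
    uri = getattr(packet.http, "request_uri", "")
    if host:
        print(f"http://{host}{uri}")
cap.close()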
Identify communication with malicious domains.
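Once hostnames have been collected, they can be checked against a list of known-bad domains. The following is a rough sketch that matches a host if it equals a blocklisted domain or is a subdomain of one. The function names and the idea of keeping the blocklist as a simple list of strings are assumptions for illustration; real investigations typically draw on a threat-intelligence feed.

```python
def normalize_host(host):
    """Lower-case a host, drop an optional :port suffix and a trailing dot."""
    host = host.strip().lower()
    if not host.startswith("["):       # leave bracketed IPv6 literals alone
        host = host.split(":", 1)[0]   # drop an optional :port suffix
    return host.rstrip(".")


def is_blocked(host, blocklist):
    """True if host equals a blocklisted domain or is a subdomain of one."""
    host = normalize_host(host)
    return any(host == bad or host.endswith("." + bad) for bad in blocklist)


def flag_hosts(hosts, blocklist):
    """Return the sorted, de-duplicated hosts that match the blocklist."""
    bad = {normalize_host(b) for b in blocklist}
    return sorted({normalize_host(h) for h in hosts if is_blocked(h, bad)})
```

Matching on a leading dot (".evil.example") avoids false positives on unrelated names that merely end with the same characters, such as "notevil.example".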
7. Email Forensics with Python
Extracting Headers from Emails
from email import policy
from email.parser import BytesParser
with open("suspicious_email.eml", "rb") as f:
    # Named msg rather than email to avoid shadowing the email package
    msg = BytesParser(policy=policy.default).parse(f)
print(f"From: {msg['from']}")
print(f"To: {msg['to']}")
print(f"Subject: {msg['subject']}")
print(f"Date: {msg['date']}")
Find phishing indicators and spoofed senders.
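One simple indicator is a mismatch between the domain in the From header and the domains in Reply-To or Return-Path. The sketch below illustrates that check using only the standard library; the function name header_indicators and the structure of the returned dictionary are assumptions for this example, and a mismatch is only a hint, since legitimate mailing services also produce one.

```python
from email import policy
from email.parser import BytesParser
from email.utils import parseaddr


def _domain(address):
    """Return the part after the last '@', or '' if there is none."""
    return address.rpartition("@")[2] if "@" in address else ""


def header_indicators(raw_bytes):
    """Parse a raw message and report simple sender-mismatch indicators."""
    msg = BytesParser(policy=policy.default).parsebytes(raw_bytes)
    from_addr = parseaddr(str(msg.get("From", "")))[1].lower()
    findings = []
    for header in ("Reply-To", "Return-Path"):
        other = parseaddr(str(msg.get(header, "")))[1].lower()
        if other and _domain(other) != _domain(from_addr):
            findings.append(f"{header} domain differs from From: {other}")
    return {
        "from": from_addr,
        # Each relay normally adds one Received header
        "received_hops": len(msg.get_all("Received", [])),
        "findings": findings,
    }
```

To use it on a saved message, read the file in binary mode and pass the bytes, for example header_indicators(open("suspicious_email.eml", "rb").read()).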
8. Metadata Analysis in Forensics
Metadata can reveal the history of a document, including edits and GPS locations.
Extracting EXIF Metadata from Images
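import exiftool
# Uses the PyExifTool 0.5+ API; the exiftool command-line tool must be installed
with exiftool.ExifToolHelper() as et:
    metadata = et.get_metadata("suspect.jpg")[0]  # one dict per input file
print(metadata.get("EXIF:GPSLatitude"), metadata.get("EXIF:GPSLongitude"))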
Determine the location where a photo was taken.
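The raw EXIF latitude and longitude are stored as unsigned values, with the hemisphere held in separate reference tags. As a hedged sketch, the helper below turns them into signed decimal coordinates. It assumes the metadata dictionary contains numeric values plus EXIF:GPSLatitudeRef and EXIF:GPSLongitudeRef entries ("N"/"S" and "E"/"W"); the function names are hypothetical, and the exact keys and value formats depend on how ExifTool was invoked.

```python
def signed_coordinate(value, ref):
    """Apply the hemisphere reference (N/S/E/W) to an unsigned coordinate."""
    value = abs(float(value))
    # South and West are negative in signed decimal degrees
    return -value if str(ref).strip().upper()[:1] in ("S", "W") else value


def gps_from_metadata(metadata):
    """Return (latitude, longitude) in signed decimal degrees, or None.

    Assumes numeric EXIF:GPSLatitude / EXIF:GPSLongitude values and the
    matching ...Ref tags; missing Ref tags default to North/East.
    """
    lat = metadata.get("EXIF:GPSLatitude")
    lon = metadata.get("EXIF:GPSLongitude")
    if lat is None or lon is None:
        return None
    return (
        signed_coordinate(lat, metadata.get("EXIF:GPSLatitudeRef", "N")),
        signed_coordinate(lon, metadata.get("EXIF:GPSLongitudeRef", "E")),
    )
```

Returning None when either coordinate is missing makes it easy to skip images that carry no location data.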
9. Timeline Analysis in Forensics
Creating a timeline helps reconstruct events leading up to an incident.
Generating a Timeline from File Metadata
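import os
import time
file_path = "evidence.txt"
file_stat = os.stat(file_path)  # stat once so all values come from the same snapshot
timestamps = {
    # st_ctime: creation time on Windows, last metadata change on Unix
    "Created/Changed": time.ctime(file_stat.st_ctime),
    "Modified": time.ctime(file_stat.st_mtime),
    "Accessed": time.ctime(file_stat.st_atime),
}
print(timestamps)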
Understand when files were accessed or modified.
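A timeline becomes more useful when it covers many files and is sorted chronologically. The sketch below walks a directory and returns one event per timestamp per file, ordered by time. The function name build_timeline and the (timestamp, label, path) tuple layout are assumptions chosen for illustration, and "changed" here means st_ctime, whose meaning differs between Windows and Unix.

```python
import os
from datetime import datetime, timezone


def build_timeline(directory):
    """Return a chronologically sorted list of (timestamp, label, path) events."""
    events = []
    for root, _, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # skip files that vanish or cannot be read
            for label, ts in (
                ("modified", st.st_mtime),
                ("accessed", st.st_atime),
                ("changed", st.st_ctime),
            ):
                # Use UTC so events from different machines line up
                events.append((datetime.fromtimestamp(ts, tz=timezone.utc), label, path))
    return sorted(events)
```

Each event can then be printed with something like f"{when.isoformat()} {label:<8} {path}" to produce a readable report.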
10. Automating Forensic Investigations
Python can automate repetitive forensic tasks, improving efficiency.
Automating File Hashing for an Entire Directory
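import os
import hashlib
# Relies on get_hash() from section 3 (Hashing and File Integrity Checks)
def hash_files(directory):
    for root, _, files in os.walk(directory):
        for file in files:
            file_path = os.path.join(root, file)
            # Print the full path so same-named files in different folders stay distinguishable
            print(f"{file_path}: {get_hash(file_path)}")
hash_files("/forensic_data")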
Generate hashes for forensic integrity checks.
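For integrity checks, the hashes need to be stored and compared again later. As a minimal sketch, the functions below write a CSV manifest of relative paths and SHA-256 digests, then re-check a directory against it. The names sha256_file, write_manifest, and verify_manifest, and the two-column CSV layout, are assumptions for this example; the manifest should be kept outside the directory being hashed.

```python
import csv
import hashlib
import os


def sha256_file(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(directory, manifest_path):
    """Hash every file under directory and write (path, sha256) rows to a CSV."""
    rows = []
    for root, _, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            rows.append((os.path.relpath(path, directory), sha256_file(path)))
    with open(manifest_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["path", "sha256"])
        writer.writerows(rows)
    return len(rows)


def verify_manifest(directory, manifest_path):
    """Return a list of (path, problem) for missing or modified files."""
    problems = []
    with open(manifest_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            path = os.path.join(directory, row["path"])
            if not os.path.exists(path):
                problems.append((row["path"], "missing"))
            elif sha256_file(path) != row["sha256"]:
                problems.append((row["path"], "modified"))
    return problems
```

An empty list from verify_manifest means every file listed in the manifest is still present and unchanged; files added after the manifest was written are not reported by this sketch.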