Python for Malware Analysis

Malware analysis is the process of examining malicious software to understand its behavior, capabilities, and impact. Python is widely used in malware analysis due to its powerful libraries, automation capabilities, and ease of scripting.

This guide covers:
✔ Static and dynamic malware analysis
✔ Extracting indicators of compromise (IOCs)
✔ Reverse engineering techniques
✔ Sandboxing malware in a controlled environment


1. Setting Up a Malware Analysis Environment

Tools Required

  • Python 3
  • Virtual Machines (VMware/VirtualBox)
  • Sandboxing tools (Cuckoo Sandbox, Any.Run)
  • Python Libraries:
    • pefile (for PE file analysis)
    • pydeep (for fuzzy hashing)
    • yara-python (for malware pattern detection)
    • scapy (for network traffic analysis)
    • pyshark (to analyze PCAP files)

Isolate Your Environment

Never analyze malware on your main machine!

  • Use a dedicated VM (Windows/Linux)
  • Disable network access, or restrict the VM to an isolated host-only network
  • Take VM snapshots before executing malware

2. Static Malware Analysis with Python

Static analysis involves examining the malware file without executing it.

Extracting Metadata from PE Files

Portable Executable (PE) files are common in Windows malware.

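import pefile

pe = pefile.PE("malware.exe")

print(f"Entry Point: {hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint)}")

# Section names are NUL-padded to 8 bytes, so strip the padding before decoding
sections = [s.Name.rstrip(b"\x00").decode(errors="replace") for s in pe.sections]
print(f"Sections: {sections}")

# DIRECTORY_ENTRY_IMPORT is absent when the file has no import table
imports = [entry.dll.decode() for entry in getattr(pe, "DIRECTORY_ENTRY_IMPORT", [])]
print(f"Imported DLLs: {imports}")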

Use this output to spot unusual section names (a common sign of packing), a very short import list, and suspicious DLLs.
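Packed or encrypted sections tend to have high byte entropy, so an entropy check is a common companion to the metadata above. The following is a minimal sketch using only the standard library; the function names and the 7.0 threshold are assumptions chosen for illustration, not fixed rules.

```python
import math
from collections import Counter

# Hypothetical threshold: values close to 8 bits per byte suggest
# compressed or encrypted data. Tune it for your own samples.
HIGH_ENTROPY = 7.0


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(data: bytes, threshold: float = HIGH_ENTROPY) -> bool:
    """Return True when the entropy of data reaches the threshold."""
    return shannon_entropy(data) >= threshold
```

With pefile, the bytes of a section can be passed in via section.get_data(). High entropy alone is only a hint, since legitimate compressed resources also score high.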

Computing File Hashes for Malware Detection

import hashlib

def get_hash(file_path):
    # Read the file as bytes and return its SHA-256 digest in hex
    with open(file_path, "rb") as f:
        data = f.read()
    return hashlib.sha256(data).hexdigest()

print(get_hash("malware.exe"))

Compare against known malware hashes from VirusTotal.
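Threat feeds often list MD5, SHA-1, and SHA-256 side by side, and samples can be large. As a rough sketch, the helper below computes all three digests in a single chunked pass and checks the SHA-256 against a local set of known-bad hashes. The names file_hashes, is_known_bad, and known_bad_sha256 are hypothetical, and the set is assumed to be something you have exported yourself.

```python
import hashlib


def file_hashes(file_path, chunk_size=65536):
    """Compute MD5, SHA-1 and SHA-256 in one pass without loading the whole file."""
    hashers = {name: hashlib.new(name) for name in ("md5", "sha1", "sha256")}
    with open(file_path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: f.read(chunk_size), b""):
            for hasher in hashers.values():
                hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


def is_known_bad(file_path, known_bad_sha256):
    """Return True if the file's SHA-256 appears in the given set of hex digests."""
    return file_hashes(file_path)["sha256"] in known_bad_sha256
```

A changed byte anywhere in the file produces a different hash, so an exact-hash lookup only catches samples that have been seen before.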


3. Dynamic Malware Analysis with Python

Dynamic analysis involves running the malware in a controlled environment and monitoring its behavior.

Monitoring Running Processes with Python

import psutil

# List every running process with its PID, name, and owner
for proc in psutil.process_iter(['pid', 'name', 'username']):
    print(proc.info)

Detect suspicious processes spawned by malware.
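A single process listing is hard to read on its own. One simple approach, sketched below, is to take a snapshot before and after running the sample and compare the two. The helper names snapshot_processes and new_processes are hypothetical; psutil is imported inside the snapshot function so the comparison helper can be used without it.

```python
def snapshot_processes():
    """Return {pid: name} for the processes running right now (requires psutil)."""
    import psutil  # imported lazily so new_processes() works without it

    return {p.info["pid"]: p.info["name"] for p in psutil.process_iter(["pid", "name"])}


def new_processes(before, after):
    """Return the entries in `after` whose PID was not present in `before`."""
    return {pid: name for pid, name in after.items() if pid not in before}


# Typical use inside the analysis VM:
#   before = snapshot_processes()
#   ... run the sample ...
#   after = snapshot_processes()
#   print(new_processes(before, after))
```

Short-lived processes that start and exit between the two snapshots will be missed, so this is a coarse view rather than a full trace.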

Capturing Network Traffic with Scapy

from scapy.all import sniff

def packet_callback(packet):
    print(packet.summary())

# Capturing packets usually requires root/administrator privileges
sniff(prn=packet_callback, count=10)

Analyze if malware is contacting C2 servers.
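To judge whether traffic is leaving the lab network, it helps to separate public destinations from private or loopback ones. The sketch below uses only the standard ipaddress module; the function name external_destinations is an assumption, and the input is assumed to be a list of destination addresses you have already collected (with scapy, for example, from packet[IP].dst).

```python
import ipaddress


def external_destinations(dest_ips):
    """Return the sorted unique destinations that are publicly routable."""
    external = set()
    for ip in dest_ips:
        try:
            addr = ipaddress.ip_address(ip)
        except ValueError:
            continue  # skip anything that is not a valid IP literal
        if addr.is_global:
            external.add(str(addr))
    return sorted(external)
```

A public destination is not proof of C2 traffic, but it narrows down which addresses are worth looking up.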


4. Detecting Malware with YARA Rules

YARA is a rule-based tool to detect malware patterns.

Example YARA Rule

rule Trojan_Dropper {
    strings:
        $a = "malicious_code_here"
        $b = { E8 83 EC 18 68 }
    condition:
        any of them
}

Using Python to Scan Files with YARA

import yara

rules = yara.compile(filepath="rules.yara")
matches = rules.match("malware.exe")

if matches:
    print("Malware detected:", matches)

Identify malware families and signatures.


5. Extracting Malware Configuration Data

Some malware hides its configuration in encrypted files, registry keys, or encoded scripts.

Extracting Strings from Malware

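import re

with open("malware.exe", "rb") as f:
    data = f.read()

# Runs of four or more printable ASCII bytes, similar to the Unix strings tool
for match in re.finditer(rb"[\x20-\x7e]{4,}", data):
    print(match.group().decode("ascii"))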

Extract possible IPs, URLs, or suspicious commands.
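Windows binaries often store text as UTF-16LE ("wide" strings), which an ASCII-only scan can miss. As a small sketch, the helper below looks for printable characters that are each followed by a NUL byte. The name extract_wide_strings and the minimum length of four characters are assumptions for illustration.

```python
import re

# Printable ASCII characters each followed by a NUL byte, i.e. UTF-16LE text.
WIDE_STRING = re.compile(rb"(?:[\x20-\x7e]\x00){4,}")


def extract_wide_strings(data: bytes):
    """Return UTF-16LE strings of at least four printable characters."""
    return [m.group().decode("utf-16-le") for m in WIDE_STRING.finditer(data)]
```

This only covers characters in the ASCII range; strings in other scripts need a broader pattern.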


6. Automating Malware Analysis with Python

Writing a Basic Malware Sandbox

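import subprocess

malware_path = "malware.exe"

try:
    # Pass the path as a list instead of going through a shell
    output = subprocess.check_output([malware_path], timeout=10)
    print("Malware executed:", output)
except subprocess.TimeoutExpired:
    print("Execution timed out - sample still running after 10 seconds (possible stalling or sandbox evasion)")
except subprocess.CalledProcessError as e:
    print("Sample exited with code", e.returncode)

This script only launches the sample and captures its output. It provides no isolation by itself, so run it only inside the isolated VM described in section 1.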
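Launching a sample is only useful if you record what it changed. One simple observation, sketched here with the standard library, is to hash every file in a directory before and after the run and compare the results. The helper names snapshot_dir and diff_snapshots are hypothetical, and the directory to watch is your choice.

```python
import hashlib
import os


def snapshot_dir(root):
    """Map each file path under root to the SHA-256 of its contents."""
    snapshot = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    snapshot[path] = hashlib.sha256(f.read()).hexdigest()
            except OSError:
                continue  # unreadable or vanished file; skip it
    return snapshot


def diff_snapshots(before, after):
    """Return (created, deleted, modified) as sorted lists of paths."""
    created = sorted(set(after) - set(before))
    deleted = sorted(set(before) - set(after))
    modified = sorted(p for p in before.keys() & after.keys() if before[p] != after[p])
    return created, deleted, modified
```

Hashing whole directories is slow on large trees, so in practice you would point this at a few folders of interest, such as a temp or startup directory.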


7. Reverse Engineering Malware

Use tools like Ghidra, IDA Pro, or Radare2 for deep analysis. Python can automate parts of the reverse engineering process.

Extracting Opcode Sequences

from capstone import Cs, CS_ARCH_X86, CS_MODE_64

code = b"\x55\x48\x8b\x05\xb8\x13\x00\x00"
md = Cs(CS_ARCH_X86, CS_MODE_64)

# Disassemble the bytes as if they were loaded at address 0x1000
for i in md.disasm(code, 0x1000):
    print("0x%x:\t%s\t%s" % (i.address, i.mnemonic, i.op_str))

Useful for analyzing shellcode and obfuscated binaries.
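Once you have a list of mnemonics (for example, collected from i.mnemonic in the loop above), a common next step is to count short runs of consecutive opcodes, known as n-grams, and compare them across samples. This is a minimal sketch; the function name opcode_ngrams and the default n=2 are assumptions.

```python
from collections import Counter


def opcode_ngrams(mnemonics, n=2):
    """Count each run of n consecutive mnemonics."""
    return Counter(tuple(mnemonics[i:i + n]) for i in range(len(mnemonics) - n + 1))
```

For example, opcode_ngrams(["push", "mov", "push", "mov", "ret"]) counts the pair ("push", "mov") twice. The resulting counts can serve as numeric features for the classifier discussed later.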


8. Detecting Malware Persistence Mechanisms

Malware often creates registry keys or startup entries to persist after reboot.

Checking Windows Startup Entries

import winreg

key = winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, r"SOFTWARE\Microsoft\Windows\CurrentVersion\Run")
i = 0
while True:
    try:
        # EnumValue raises OSError once there are no more values
        value = winreg.EnumValue(key, i)
        print(value)
        i += 1
    except OSError:
        break

Lists the programs registered to run at startup, so you can review them for unknown entries.
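Reading a long list of startup entries by eye is error-prone. As a rough sketch, the helper below flags entries whose command points into locations where autostart programs are less common. The path fragments and the name flag_startup_entries are hypothetical heuristics, not an authoritative list; the input is assumed to be (name, command) pairs, such as the first two fields of each winreg.EnumValue result.

```python
# Hypothetical heuristics: path fragments that deserve a closer look.
SUSPICIOUS_FRAGMENTS = ("\\temp\\", "\\users\\public\\", "\\downloads\\")


def flag_startup_entries(entries, fragments=SUSPICIOUS_FRAGMENTS):
    """Return (name, command) pairs whose command contains a suspicious fragment."""
    flagged = []
    for name, command in entries:
        lowered = str(command).lower()  # Windows paths are case-insensitive
        if any(fragment in lowered for fragment in fragments):
            flagged.append((name, command))
    return flagged
```

A flagged entry is a lead to investigate, not a verdict; some legitimate software also starts from user-writable folders.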


9. Analyzing Malware Communication

Extracting C2 Server Information from Memory Dumps

import re

with open("memory.dmp", "rb") as f:
    data = f.read()

ips = re.findall(rb"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b", data)
print(set(ips))

Find C2 IPs embedded in malware memory dumps.
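The regular expression above also accepts strings that are not valid addresses, such as 999.10.10.10. A small refinement, sketched below under the assumption that frequently repeated addresses are more interesting, is to validate each candidate with the standard ipaddress module and rank the results by count. The name rank_ip_candidates is hypothetical.

```python
import ipaddress
import re
from collections import Counter

IPV4_PATTERN = re.compile(rb"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")


def rank_ip_candidates(data: bytes):
    """Return (ip, count) pairs for valid IPv4 strings, most frequent first."""
    counts = Counter()
    for raw in IPV4_PATTERN.findall(data):
        text = raw.decode("ascii")
        try:
            ipaddress.IPv4Address(text)
        except ValueError:
            continue  # e.g. 999.10.10.10, which the regex alone would accept
        counts[text] += 1
    return counts.most_common()
```

Version numbers and other dotted values can still slip through, so treat the output as a list of candidates to verify.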


10. Detecting Malware Using Machine Learning

Python can train models to detect malware based on static and dynamic features.

Example: Machine Learning for Malware Classification

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Toy feature vectors (three numeric features per file, e.g. file size and entropy)
X = [[1.2, 500, 7.8], [0.9, 200, 3.5], [1.5, 1000, 6.4]]
y = [1, 0, 1] # 1 = Malware, 0 = Benign

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
clf = RandomForestClassifier()
clf.fit(X_train, y_train)

print("Malware detection accuracy:", clf.score(X_test, y_test))

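Classifies files based on their characteristics. With only three samples the reported accuracy is meaningless; a real model needs a large labeled dataset.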
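Accuracy alone can mislead when malware is rare in the dataset: a model that labels everything benign still scores high. Precision and recall for the malware class give a clearer picture. The sketch below computes both in plain Python so the arithmetic is visible; the function name precision_recall is an assumption, and scikit-learn offers precision_score and recall_score in sklearn.metrics for real use.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

For instance, with nine benign files and one malicious file, predicting "benign" for all ten gives 90% accuracy but a recall of 0.0 for malware.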
