AI-Powered Agentic Penetration Testing — Drake University Capstone 2026
Nick Guyette • Jordan Martin • Khalid Mohammed • Coleman Pagac
VenomX is a locally-hosted, multi-agent AI penetration testing assistant. Penetration testing is slow, expensive, and dependent on hard-to-find expertise — VenomX is our attempt to change that.
The system orchestrates 11 specialist AI agents across every phase of an engagement, reasons over a knowledge base of 487,327 CVE records from 13 sources, and keeps a human in the loop before any exploit attempt. The reasoning core is NVIDIA Nemotron 30B served via vLLM with a 32K token context window, with BAAI/bge-m3 embeddings stored in pgvector for RAG dispatch at query time. Everything runs on local hardware — no cloud API required.
Out of 24 Posters
Presented at the Consortium for Computing Sciences in Colleges: Central Plains 2026 poster conference. Also presenting at DUCURS (April 16) and Drake's AI Capstone presentations (May 8).
Submitted to CCSC Central Plains 2026.
Large language models encode behaviors as directions in the residual stream — high-dimensional vectors that persist and accumulate across layers. Safety-trained models develop a refusal direction: a consistent geometric feature that, when activated by a harmful prompt, steers the model toward declining to respond.
Compute r = b − g by contrasting mean residual vectors of harmful vs. harmless prompts at the layer of peak cluster separation (Layer 31, Silhouette = 0.3688).
Apply ΔW = −λ(vv⊤)W to the model's weight matrices, so the refusal direction can no longer be read from or written to the residual stream. This is a weight-space edit — no retraining required.
Unlike fine-tuning, abliteration only modifies the subspace responsible for refusal behavior. The model retains all other capabilities while losing the geometric "hook" that triggers refusal. Done wrong, it breaks general reasoning alongside safety filters — which is why contributing Mamba2 SSM support to Heretic mattered.
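The two steps above can be sketched numerically. This is a minimal illustration with synthetic activations and a random weight matrix, not the actual Heretic implementation; the cluster shift and dimensions are invented for demonstration.

```python
import numpy as np

# Synthetic stand-ins for residual-stream activations collected at the
# layer of peak cluster separation (Layer 31 in VenomX).
rng = np.random.default_rng(0)
d = 16
harmless_acts = rng.normal(size=(100, d))        # g cluster
harmful_acts = rng.normal(size=(100, d)) + 2.0   # b cluster, shifted

# Step 1: refusal direction r = b - g (mean harmful minus mean harmless)
r = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
v = r / np.linalg.norm(r)  # unit refusal direction

# Step 2: weight-space edit dW = -lam * (v v^T) W, applied directly to a
# weight matrix -- no retraining involved.
W = rng.normal(size=(d, d))
lam = 1.0
W_abliterated = W - lam * np.outer(v, v) @ W

# With lam = 1, the refusal direction can no longer be written through W:
# v^T W' is numerically zero.
print(np.allclose(v @ W_abliterated, 0.0))  # True
```

With λ = 1 the edit is a full orthogonal projection; smaller λ values only attenuate the direction rather than removing it.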
PaCMAP projection of residual stream vectors across depth (blue = harmless, orange = harmful; panels read left to right, top to bottom): Layer 1 · Layer 14 · Layer 31 — Peak (Silhouette = 0.3688) · Layer 52.
A 63 GB model hosted on RunPod on 2× A40 GPUs (96 GB VRAM combined). The refusal signal first appears at Layer 6, with sustained negative alignment from Layer 14 through Layer 38. Despite L2 norm growth from |g| = 0.86 to 1180, cosine similarity S(g, b) stays above 0.97 throughout.
VenomX augments Nemotron 30B with a hybrid RAG pipeline, injecting security knowledge at query time across 487,327 records from 13 sources (NVD CVEs, Exploit-DB, MITRE ATT&CK, HackTricks, GTFOBins, and others).
The hybrid pipeline keeps exact identifiers like CVE-2021-44228 from being diluted by version-mismatched semantic neighbors. Tool output is capped at 6,000 chars per turn and RAG threat intel at 3,000 chars per turn. Each specialist starts fresh — findings from one phase never pollute the next.
Unknown-gap attack paths auto-generate a targeted RAG query (e.g. "CVE vulnerabilities for ProFTPD 1.3.2c on port 21") that resolves before LLM context is assembled, potentially upgrading the path's state before the model sees it.
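The auto-generated query can be sketched as a simple template over the gap's service data. The field names here are illustrative, not VenomX's actual path schema; only the query shape mirrors the example in the text.

```python
# Hypothetical path record for an unknown-gap attack path (service found,
# no CVE yet); field names are illustrative.
def gap_query(path: dict) -> str:
    """Build the targeted RAG query that resolves before LLM context assembly."""
    svc = path["service"]
    return (f"CVE vulnerabilities for {svc['product']} {svc['version']} "
            f"on port {svc['port']}")

path = {"service": {"product": "ProFTPD", "version": "1.3.2c", "port": 21}}
print(gap_query(path))
# -> CVE vulnerabilities for ProFTPD 1.3.2c on port 21
```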
VenomX uses a master-specialist multi-agent architecture. A MasterAgent orchestrates eleven specialists, each confined to a distinct attack phase. Rather than talking to each other, they share state through a central FindingGraph — so every specialist begins with full knowledge of everything discovered before it.
Each dispatch cycle begins with MasterAgent feeding the LLM a snapshot of the FindingGraph — known hosts, open ports, services, credentials, and attack paths. The LLM decides what gaps are worth filling next.
The LLM returns a structured JSON TaskSpec naming the target specialist, objective, and relevant graph context. Specialists are conditional: SMB only activates on ports 139/445; SQL only runs on confirmed injectable endpoints; AD fires when ports 88 and 389 are both present.
Each specialist runs its own Plan → Tool → Observe → Reason → Act loop. Tool output is parsed into typed objects before touching the context window. Every successful call writes findings into the shared FindingGraph immediately.
When a specialist finishes, it returns a plain-text summary to the master. The graph is already updated with all new hosts, ports, services, credentials, and vulnerabilities. The master appends the summary and loops back to step 1.
When the engagement is complete, the report specialist builds sections directly from live graph data — covering discovered services, scored vulnerabilities, captured credentials, and attack paths. One LLM call for the executive summary, delivered as a .docx.
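A TaskSpec from step 2 of the cycle above might look like the following. The exact schema is an assumption for illustration; only the three named parts (target specialist, objective, graph context) come from the text.

```python
import json

# Hypothetical shape of the structured JSON TaskSpec the LLM returns;
# field names are illustrative, not VenomX's exact schema.
task_spec_json = """{
  "specialist": "smb",
  "objective": "Enumerate shares and users on 10.0.0.5",
  "context": {"host": "10.0.0.5", "open_ports": [139, 445], "credentials": []}
}"""

task = json.loads(task_spec_json)

# The named specialist must be one of the eleven the master instantiates.
assert task["specialist"] in {"osint", "recon", "web", "auth", "vuln", "sql",
                              "smb", "ad", "exploit", "post", "report"}
print(task["specialist"], "->", task["objective"])
```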
All specialists share a single WAL-backed graph where every add_node and add_edge call appends one JSON line to graph.wal (O(1), append-only), and the materialized graph.json snapshot is written only on session close — the same write-ahead pattern used by PostgreSQL and SQLite. On crash recovery the WAL is replayed from the last checkpoint with no data loss.
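The write-ahead pattern can be sketched in a few lines. This is a minimal illustration of the append-and-replay idea, not VenomX's actual FindingGraph class; the method names mirror the `add_node`/`add_edge` calls named above.

```python
import json
import os
import tempfile

class WalGraph:
    """Minimal WAL-backed graph sketch: every mutation appends one JSON
    line (O(1)); recovery replays the log from the start."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.nodes, self.edges = {}, []
        if os.path.exists(wal_path):      # crash recovery: replay the WAL
            with open(wal_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, op):
        if op["op"] == "add_node":
            self.nodes[op["id"]] = op["data"]
        elif op["op"] == "add_edge":
            self.edges.append((op["src"], op["dst"], op["type"]))

    def _log(self, op):
        with open(self.wal_path, "a") as f:  # append-only, one line per op
            f.write(json.dumps(op) + "\n")

    def add_node(self, node_id, data):
        op = {"op": "add_node", "id": node_id, "data": data}
        self._apply(op)
        self._log(op)

    def add_edge(self, src, dst, edge_type):
        op = {"op": "add_edge", "src": src, "dst": dst, "type": edge_type}
        self._apply(op)
        self._log(op)

# Simulated crash: a second instance replays the same WAL with no data loss.
wal = os.path.join(tempfile.mkdtemp(), "graph.wal")
g = WalGraph(wal)
g.add_node("host:10.0.0.5", {"os": "Linux"})
g.add_edge("host:10.0.0.5", "svc:ftp", "has_service")
recovered = WalGraph(wal)
print(recovered.nodes, recovered.edges)
```

The materialized snapshot (graph.json in VenomX) would simply serialize `nodes` and `edges` on session close, letting the next session checkpoint instead of replaying from line one.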
After each specialist run, AttackPathFinder performs a DFS over typed edges (has_service, has_vulnerability, has_exploit) and classifies every reachable chain by completeness, sorting all paths by a CVSS-weighted priority score:
P = 100 + CVSS → complete path, full chain, actionable
P = 50 + CVSS → known gap: CVE confirmed, no exploit yet
P = 10 + CVSS → unknown gap: service found, no CVE yet
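The three tiers above can be expressed as a small scoring function. The path names and CVSS values here are invented for illustration; only the 100/50/10 base scores come from the text.

```python
def path_priority(cvss: float, has_cve: bool, has_exploit: bool) -> float:
    """CVSS-weighted priority per the three completeness classes."""
    if has_cve and has_exploit:
        return 100 + cvss   # complete path: full chain, actionable
    if has_cve:
        return 50 + cvss    # known gap: CVE confirmed, no exploit yet
    return 10 + cvss        # unknown gap: service found, no CVE yet

# Illustrative paths -- a complete chain always outranks any gap.
paths = [
    ("proftpd known gap", path_priority(7.5, True, False)),    # 57.5
    ("log4shell complete", path_priority(10.0, True, True)),   # 110.0
    ("ssh unknown gap", path_priority(0.0, False, False)),     # 10.0
]
print(sorted(paths, key=lambda p: p[1], reverse=True))
```

Because the bases are spaced wider than the 0–10 CVSS range, completeness always dominates severity: a complete path with CVSS 0 still outranks a known gap with CVSS 10.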
MasterAgent owns shared state and drives the engagement, instantiating osint, recon, web, auth, vuln, sql, smb, ad, exploit, post, and report specialists — each receiving shared references to the FindingGraph and credential store.
Specialists never talk directly. They all read from and write to the WAL-backed FindingGraph. The master passes graph.summary_for_llm() on every dispatch so each specialist arrives with full situational awareness.
Post-exploitation fires only when a shell is gained or credentials are confirmed. SMB activates on 139/445. AD on 88+389. Idle specialists never touch the loop or consume context.
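The conditional activation rules above can be sketched as a single eligibility check. This is an illustrative reading of the triggers named in the text (SMB on 139/445, AD on both 88 and 389, post-exploitation on shell or credentials), not VenomX's actual dispatch code.

```python
def eligible(specialist: str, open_ports: set[int],
             shell: bool = False, creds: bool = False) -> bool:
    """Return True if the specialist's trigger condition holds."""
    rules = {
        "smb": lambda: bool({139, 445} & open_ports),  # either SMB port
        "ad": lambda: {88, 389} <= open_ports,         # needs both 88 and 389
        "post": lambda: shell or creds,                # shell or confirmed creds
    }
    check = rules.get(specialist)
    return check() if check else True  # ungated specialists always eligible

print(eligible("smb", {139}))               # True
print(eligible("ad", {88}))                 # False -- 389 missing
print(eligible("post", set(), shell=True))  # True
```

Idle specialists simply never appear in a TaskSpec, so they consume no dispatch turns and no context.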
DFS-based path classification across the graph after every dispatch cycle. CVSS-weighted edge scoring surfaces highest-impact chains without manual correlation.
# Shared state — owned by master, passed by reference to all 11 specialists
self.graph = FindingGraph(session_id, wal_path, json_path)
self.credential_store = CredentialStore(session_id, persist_path)
self._specialists = {
    "osint":   OsintSpecialist(**shared),    # subfinder
    "recon":   ReconSpecialist(**shared),    # masscan + nmap + netcat
    "web":     WebSpecialist(**shared),      # httpx + nikto + gobuster + nuclei + wpscan
    "auth":    AuthSpecialist(**shared),     # kerbrute + hydra
    "vuln":    VulnSpecialist(**shared),     # searchsploit + metasploit
    "sql":     SqlSpecialist(**shared),      # sqlmap
    "smb":     SmbSpecialist(**shared),      # enum4linux + netexec
    "ad":      ADSpecialist(**shared),       # getuserspns + getnpusers
    "exploit": ExploitSpecialist(**shared),  # metasploit (Phase 2 only)
    "post":    PostSpecialist(**shared),     # netexec (post-exploitation)
    "report":  ReportSpecialist(**shared),   # reads graph, writes report
}
for _ in range(self.MAX_DISPATCHES):
    task = self._decide_next_task(user_input)  # LLM reads graph, returns TaskSpec
    if task is DONE:
        break
    result = self._specialists[task.specialist].run(task)
    self._dispatch_log.append(result.summary)  # graph already updated by specialist
A deterministic safety gate inserted between OpenWebUI and the inference stack, intercepting every chat request before it reaches the model. Approved requests are forwarded to the thinking-proxy (:8001) and onward to vLLM (:8000); blocked requests return a valid OpenAI-shaped refusal rendered as a normal assistant message.
A Llama-Guard-3-1B model is loaded for future LLM-based classification but is currently bypassed in favor of fast deterministic pattern checks, applied in sequence with the first match winning.
Sixteen security tools, each wrapped in a Python interface that sanitizes input, manages execution, and structures output for the agent's observe step.
Tool outputs are parsed into typed objects before touching the context window. Nmap XML becomes host/port dicts. SQLMap output becomes injection and database dicts.
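The Nmap case can be sketched with the standard library. The XML fragment below is a hand-written sample in Nmap's output format, and the dict shape is an assumption for illustration, not VenomX's actual wrapper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in Nmap's XML output format (illustrative).
NMAP_XML = """<nmaprun><host>
  <address addr="10.0.0.5" addrtype="ipv4"/>
  <ports>
    <port protocol="tcp" portid="21"><state state="open"/>
      <service name="ftp" product="ProFTPD" version="1.3.2c"/></port>
    <port protocol="tcp" portid="445"><state state="open"/>
      <service name="microsoft-ds"/></port>
  </ports>
</host></nmaprun>"""

def parse_nmap(xml_text: str) -> list[dict]:
    """Turn raw Nmap XML into typed host/port dicts for the observe step."""
    hosts = []
    for host in ET.fromstring(xml_text).iter("host"):
        addr = host.find("address").get("addr")
        ports = []
        for port in host.iter("port"):
            if port.find("state").get("state") != "open":
                continue  # only open ports reach the context window
            svc = port.find("service")
            ports.append({
                "port": int(port.get("portid")),
                "service": svc.get("name") if svc is not None else None,
                "product": svc.get("product") if svc is not None else None,
                "version": svc.get("version") if svc is not None else None,
            })
        hosts.append({"addr": addr, "ports": ports})
    return hosts

print(parse_nmap(NMAP_XML))
```

Handing the LLM these dicts instead of raw XML keeps the 6,000-char tool-output cap spent on signal rather than markup.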
32K token window. Per-specialist caps: 6,000 chars tool output, 3,000 chars RAG per turn. Each specialist starts fresh — no cross-phase contamination.
Safety controls are wired into the tool layer, not a system prompt. Target scope enforced at wrapper level. Max iteration counts and subprocess timeouts are non-negotiable.
Multi-agent systems need explicit state contracts. Without a shared FindingGraph, eleven agents produce eleven disconnected reports — not one coherent attack picture.
Retrieval quality matters more than corpus size. A smaller, well-chunked security knowledge base outperforms a raw data dump every time.
Model abliteration is a precise surgical operation, not a blanket jailbreak. Done wrong, it breaks the model's general reasoning alongside the safety filters.
CVSS scores are inputs, not outputs. The attack path that matters is the one connecting low-severity findings into a critical chain — and that requires graph traversal, not a sorted list.
Guardrails at the prompt level are worthless. Scope enforcement, iteration caps, and subprocess timeouts wired into the tool layer are the only controls that actually hold under a determined loop.