AI-Powered Agentic Penetration Testing — Drake University Capstone 2026
Nick Guyette • Jordan Martin • Khalid Mohammed • Coleman Pagac
VenomX is a locally-hosted, multi-agent AI penetration testing assistant. Penetration testing is slow, expensive, and dependent on hard-to-find expertise — VenomX is our attempt to change that.
The system orchestrates 11 specialist AI agents across every phase of an engagement, reasons over a knowledge base of 487,327 CVE records from 13 sources, and keeps a human in the loop before any exploit attempt. The reasoning core is NVIDIA Nemotron 30B served via vLLM with a 32K token context window, with BAAI/bge-m3 embeddings stored in pgvector for RAG dispatch at query time. Everything runs on local hardware — no cloud API required.
Out of 24 Posters
Presented at the Consortium for Computing Sciences in Colleges: Central Plains 2026 poster conference. Also presenting at DUCURS (April 16) and Drake's AI Capstone presentations (May 8).
Submitted to CCSC Central Plains 2026.
Large language models encode behaviors as directions in the residual stream — high-dimensional vectors that persist and accumulate across layers. Safety-trained models develop a refusal direction: a consistent geometric feature that, when activated by a harmful prompt, steers the model toward declining to respond.
Compute r = b − g by contrasting mean residual vectors of harmful vs. harmless prompts at the layer of peak cluster separation (Layer 31, Silhouette = 0.3688).
Apply ΔW = −λ(vv⊤)W to the model's weight matrices, so the refusal direction can no longer be read from or written to the residual stream. This is a weight-space edit — no retraining required.
Unlike fine-tuning, abliteration only modifies the subspace responsible for refusal behavior. The model retains all other capabilities while losing the geometric "hook" that triggers refusal. Done wrong, it breaks general reasoning alongside safety filters — which is why contributing Mamba2 SSM support to Heretic mattered.
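The two steps above can be sketched numerically. This is a minimal illustration with synthetic activations and a random weight matrix, not the actual Heretic implementation; the cluster shift and dimensions are invented for demonstration.

```python
import numpy as np

# Synthetic stand-ins for residual-stream activations collected at the
# layer of peak cluster separation (Layer 31 in VenomX).
rng = np.random.default_rng(0)
d = 16
harmless_acts = rng.normal(size=(100, d))        # g cluster
harmful_acts = rng.normal(size=(100, d)) + 2.0   # b cluster, shifted

# Step 1: refusal direction r = b - g (mean harmful minus mean harmless)
r = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
v = r / np.linalg.norm(r)  # unit refusal direction

# Step 2: weight-space edit dW = -lam * (v v^T) W, applied directly to a
# weight matrix -- no retraining involved.
W = rng.normal(size=(d, d))
lam = 1.0
W_abliterated = W - lam * np.outer(v, v) @ W

# With lam = 1, the refusal direction can no longer be written through W:
# v^T W' is numerically zero.
print(np.allclose(v @ W_abliterated, 0.0))  # True
```

With λ = 1 the edit is a full orthogonal projection; smaller λ values only attenuate the direction rather than removing it.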
PaCMAP projection of residual stream vectors across depth (blue = harmless, orange = harmful; panels read left to right, top to bottom): Layer 1 · Layer 14 · Layer 31 — Peak (Silhouette = 0.3688) · Layer 52.
A 63 GB model hosted on RunPod on 2× A40 GPUs (96 GB VRAM combined). The refusal signal first appears at Layer 6, with sustained negative alignment from Layer 14 through Layer 38. Despite L2 norm growth from |g| = 0.86 to 1180, cosine similarity S(g, b) stays above 0.97 throughout.
VenomX augments Nemotron 30B with a hybrid RAG pipeline, injecting security knowledge at query time across 487,327 records from 13 sources (NVD CVEs, Exploit-DB, MITRE ATT&CK, HackTricks, GTFOBins, and others).
The hybrid pipeline keeps exact identifiers like CVE-2021-44228 from being diluted by version-mismatched semantic neighbors. Tool output is capped at 6,000 chars per turn and RAG threat intel at 3,000 chars per turn. Each specialist starts fresh — findings from one phase never pollute the next.
Unknown-gap attack paths auto-generate a targeted RAG query (e.g. "CVE vulnerabilities for ProFTPD 1.3.2c on port 21") that resolves before LLM context is assembled, potentially upgrading the path's state before the model sees it.
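The auto-generated query can be sketched as a simple template over the gap's service data. The field names here are illustrative, not VenomX's actual path schema; only the query shape mirrors the example in the text.

```python
# Hypothetical path record for an unknown-gap attack path (service found,
# no CVE yet); field names are illustrative.
def gap_query(path: dict) -> str:
    """Build the targeted RAG query that resolves before LLM context assembly."""
    svc = path["service"]
    return (f"CVE vulnerabilities for {svc['product']} {svc['version']} "
            f"on port {svc['port']}")

path = {"service": {"product": "ProFTPD", "version": "1.3.2c", "port": 21}}
print(gap_query(path))
# -> CVE vulnerabilities for ProFTPD 1.3.2c on port 21
```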
VenomX uses a master-specialist multi-agent architecture. A MasterAgent orchestrates eleven specialists, each confined to a distinct attack phase. Rather than talking to each other, they share state through a central FindingGraph — so every specialist begins with full knowledge of everything discovered before it.
Each dispatch cycle begins with MasterAgent feeding the LLM a snapshot of the FindingGraph — known hosts, open ports, services, credentials, and attack paths. The LLM decides what gaps are worth filling next.
The LLM returns a structured JSON TaskSpec naming the target specialist, objective, and relevant graph context. Specialists are conditional: SMB only activates on ports 139/445; SQL only runs on confirmed injectable endpoints; AD fires when ports 88 and 389 are both present.
Each specialist runs its own Plan → Tool → Observe → Reason → Act loop. Tool output is parsed into typed objects before touching the context window. Every successful call writes findings into the shared FindingGraph immediately.
When a specialist finishes, it returns a plain-text summary to the master. The graph is already updated with all new hosts, ports, services, credentials, and vulnerabilities. The master appends the summary and loops back to step 1.
When the engagement is complete, the report specialist builds sections directly from live graph data — covering discovered services, scored vulnerabilities, captured credentials, and attack paths. One LLM call for the executive summary, delivered as a .docx.
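A TaskSpec from step 2 of the cycle above might look like the following. The exact schema is an assumption for illustration; only the three named parts (target specialist, objective, graph context) come from the text.

```python
import json

# Hypothetical shape of the structured JSON TaskSpec the LLM returns;
# field names are illustrative, not VenomX's exact schema.
task_spec_json = """{
  "specialist": "smb",
  "objective": "Enumerate shares and users on 10.0.0.5",
  "context": {"host": "10.0.0.5", "open_ports": [139, 445], "credentials": []}
}"""

task = json.loads(task_spec_json)

# The named specialist must be one of the eleven the master instantiates.
assert task["specialist"] in {"osint", "recon", "web", "auth", "vuln", "sql",
                              "smb", "ad", "exploit", "post", "report"}
print(task["specialist"], "->", task["objective"])
```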
All specialists share a single WAL-backed graph where every add_node and add_edge call appends one JSON line to graph.wal (O(1), append-only), and the materialized graph.json snapshot is written only on session close — the same write-ahead pattern used by PostgreSQL and SQLite. On crash recovery the WAL is replayed from the last checkpoint with no data loss.
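The write-ahead pattern can be sketched in a few lines. This is a minimal illustration of the append-and-replay idea, not VenomX's actual FindingGraph class; the method names mirror the `add_node`/`add_edge` calls named above.

```python
import json
import os
import tempfile

class WalGraph:
    """Minimal WAL-backed graph sketch: every mutation appends one JSON
    line (O(1)); recovery replays the log from the start."""

    def __init__(self, wal_path):
        self.wal_path = wal_path
        self.nodes, self.edges = {}, []
        if os.path.exists(wal_path):      # crash recovery: replay the WAL
            with open(wal_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, op):
        if op["op"] == "add_node":
            self.nodes[op["id"]] = op["data"]
        elif op["op"] == "add_edge":
            self.edges.append((op["src"], op["dst"], op["type"]))

    def _log(self, op):
        with open(self.wal_path, "a") as f:  # append-only, one line per op
            f.write(json.dumps(op) + "\n")

    def add_node(self, node_id, data):
        op = {"op": "add_node", "id": node_id, "data": data}
        self._apply(op)
        self._log(op)

    def add_edge(self, src, dst, edge_type):
        op = {"op": "add_edge", "src": src, "dst": dst, "type": edge_type}
        self._apply(op)
        self._log(op)

# Simulated crash: a second instance replays the same WAL with no data loss.
wal = os.path.join(tempfile.mkdtemp(), "graph.wal")
g = WalGraph(wal)
g.add_node("host:10.0.0.5", {"os": "Linux"})
g.add_edge("host:10.0.0.5", "svc:ftp", "has_service")
recovered = WalGraph(wal)
print(recovered.nodes, recovered.edges)
```

The materialized snapshot (graph.json in VenomX) would simply serialize `nodes` and `edges` on session close, letting the next session checkpoint instead of replaying from line one.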
After each specialist run, AttackPathFinder performs a DFS over typed edges (has_service, has_vulnerability, has_exploit) and classifies every reachable chain by completeness, sorting all paths by a CVSS-weighted priority score:
P = 100 + CVSS → complete path, full chain, actionable
P = 50 + CVSS → known gap: CVE confirmed, no exploit yet
P = 10 + CVSS → unknown gap: service found, no CVE yet
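The three tiers above can be expressed as a small scoring function. The path names and CVSS values here are invented for illustration; only the 100/50/10 base scores come from the text.

```python
def path_priority(cvss: float, has_cve: bool, has_exploit: bool) -> float:
    """CVSS-weighted priority per the three completeness classes."""
    if has_cve and has_exploit:
        return 100 + cvss   # complete path: full chain, actionable
    if has_cve:
        return 50 + cvss    # known gap: CVE confirmed, no exploit yet
    return 10 + cvss        # unknown gap: service found, no CVE yet

# Illustrative paths -- a complete chain always outranks any gap.
paths = [
    ("proftpd known gap", path_priority(7.5, True, False)),    # 57.5
    ("log4shell complete", path_priority(10.0, True, True)),   # 110.0
    ("ssh unknown gap", path_priority(0.0, False, False)),     # 10.0
]
print(sorted(paths, key=lambda p: p[1], reverse=True))
```

Because the bases are spaced wider than the 0–10 CVSS range, completeness always dominates severity: a complete path with CVSS 0 still outranks a known gap with CVSS 10.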
MasterAgent owns shared state and drives the engagement, instantiating osint, recon, web, auth, vuln, sql, smb, ad, exploit, post, and report specialists — each receiving shared references to the FindingGraph and credential store.
Specialists never talk directly. They all read from and write to the WAL-backed FindingGraph. The master passes graph.summary_for_llm() on every dispatch so each specialist arrives with full situational awareness.
Post-exploitation fires only when a shell is gained or credentials are confirmed. SMB activates on 139/445. AD on 88+389. Idle specialists never touch the loop or consume context.
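The conditional activation rules above can be sketched as a single eligibility check. This is an illustrative reading of the triggers named in the text (SMB on 139/445, AD on both 88 and 389, post-exploitation on shell or credentials), not VenomX's actual dispatch code.

```python
def eligible(specialist: str, open_ports: set[int],
             shell: bool = False, creds: bool = False) -> bool:
    """Return True if the specialist's trigger condition holds."""
    rules = {
        "smb": lambda: bool({139, 445} & open_ports),  # either SMB port
        "ad": lambda: {88, 389} <= open_ports,         # needs both 88 and 389
        "post": lambda: shell or creds,                # shell or confirmed creds
    }
    check = rules.get(specialist)
    return check() if check else True  # ungated specialists always eligible

print(eligible("smb", {139}))               # True
print(eligible("ad", {88}))                 # False -- 389 missing
print(eligible("post", set(), shell=True))  # True
```

Idle specialists simply never appear in a TaskSpec, so they consume no dispatch turns and no context.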
DFS-based path classification across the graph after every dispatch cycle. CVSS-weighted edge scoring surfaces highest-impact chains without manual correlation.
# Shared state — owned by master, passed by reference to all 11 specialists
self.graph = FindingGraph(session_id, wal_path, json_path)
self.credential_store = CredentialStore(session_id, persist_path)
self._specialists = {
    "osint":   OsintSpecialist(**shared),    # subfinder
    "recon":   ReconSpecialist(**shared),    # masscan + nmap + netcat
    "web":     WebSpecialist(**shared),      # httpx + nikto + gobuster + nuclei + wpscan
    "auth":    AuthSpecialist(**shared),     # kerbrute + hydra
    "vuln":    VulnSpecialist(**shared),     # searchsploit + metasploit
    "sql":     SqlSpecialist(**shared),      # sqlmap
    "smb":     SmbSpecialist(**shared),      # enum4linux + netexec
    "ad":      ADSpecialist(**shared),       # getuserspns + getnpusers
    "exploit": ExploitSpecialist(**shared),  # metasploit (Phase 2 only)
    "post":    PostSpecialist(**shared),     # netexec (post-exploitation)
    "report":  ReportSpecialist(**shared),   # reads graph, writes report
}
for _ in range(self.MAX_DISPATCHES):
    task = self._decide_next_task(user_input)  # LLM reads graph, returns TaskSpec
    if task is DONE:
        break
    result = self._specialists[task.specialist].run(task)
    self._dispatch_log.append(result.summary)  # graph already updated by specialist
A deterministic safety gate inserted between OpenWebUI and the inference stack, intercepting every chat request before it reaches the model. Approved requests are forwarded to the thinking-proxy (:8001) and onward to vLLM (:8000); blocked requests return a valid OpenAI-shaped refusal rendered as a normal assistant message.
A Llama-Guard-3-1B model is loaded for future LLM-based classification but is currently bypassed in favor of fast deterministic pattern checks, applied in sequence with the first match winning.
Sixteen security tools, each wrapped in a Python interface that sanitizes input, manages execution, and structures output for the agent's observe step.
Tool outputs are parsed into typed objects before touching the context window. Nmap XML becomes host/port dicts. SQLMap output becomes injection and database dicts.
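The Nmap case can be sketched with the standard library. The XML fragment below is a hand-written sample in Nmap's output format, and the dict shape is an assumption for illustration, not VenomX's actual wrapper.

```python
import xml.etree.ElementTree as ET

# Hand-written sample in Nmap's XML output format (illustrative).
NMAP_XML = """<nmaprun><host>
  <address addr="10.0.0.5" addrtype="ipv4"/>
  <ports>
    <port protocol="tcp" portid="21"><state state="open"/>
      <service name="ftp" product="ProFTPD" version="1.3.2c"/></port>
    <port protocol="tcp" portid="445"><state state="open"/>
      <service name="microsoft-ds"/></port>
  </ports>
</host></nmaprun>"""

def parse_nmap(xml_text: str) -> list[dict]:
    """Turn raw Nmap XML into typed host/port dicts for the observe step."""
    hosts = []
    for host in ET.fromstring(xml_text).iter("host"):
        addr = host.find("address").get("addr")
        ports = []
        for port in host.iter("port"):
            if port.find("state").get("state") != "open":
                continue  # only open ports reach the context window
            svc = port.find("service")
            ports.append({
                "port": int(port.get("portid")),
                "service": svc.get("name") if svc is not None else None,
                "product": svc.get("product") if svc is not None else None,
                "version": svc.get("version") if svc is not None else None,
            })
        hosts.append({"addr": addr, "ports": ports})
    return hosts

print(parse_nmap(NMAP_XML))
```

Handing the LLM these dicts instead of raw XML keeps the 6,000-char tool-output cap spent on signal rather than markup.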
32K token window. Per-specialist caps: 6,000 chars tool output, 3,000 chars RAG per turn. Each specialist starts fresh — no cross-phase contamination.
Safety controls are wired into the tool layer, not a system prompt. Target scope enforced at wrapper level. Max iteration counts and subprocess timeouts are non-negotiable.
Multi-agent systems need explicit state contracts. Without a shared FindingGraph, eleven agents produce eleven disconnected reports — not one coherent attack picture.
Retrieval quality matters more than corpus size. A smaller, well-chunked security knowledge base outperforms a raw data dump every time.
Model abliteration is a precise surgical operation, not a blanket jailbreak. Done wrong, it breaks the model's general reasoning alongside the safety filters.
CVSS scores are inputs, not outputs. The attack path that matters is the one connecting low-severity findings into a critical chain — and that requires graph traversal, not a sorted list.
Guardrails at the prompt level are worthless. Scope enforcement, iteration caps, and subprocess timeouts wired into the tool layer are the only controls that actually hold under a determined loop.