
AI Agents Are Scanning Your Network. Here's What Stops Them.

10 min read | Piotr Duszynski

In November 2025, Anthropic publicly disclosed the first documented AI-orchestrated cyber campaign, detected months earlier. A Chinese state-sponsored group used an autonomous AI agent to execute around 90% of a multi-stage operation targeting critical infrastructure and private sector companies, with human operators intervening at only a handful of decision points. At peak, the agent made thousands of requests, often multiple per second.

The Agentic AI Cybersecurity Threat Is Operational, Not Theoretical

Palisade Research runs an LLM Agent Honeypot that has logged over 20 million access attempts since October 2024. As of early 2026, three confirmed autonomous AI agents (with 14 flagged as probable) have been caught probing from Hong Kong, Singapore, and Poland. They weren't directed by a human operator. They found the honeypot, started enumerating it, and kept going on their own.

XBOW, an autonomous pentesting system, reached #1 on HackerOne's U.S. leaderboard in June 2025 after completing a benchmark in 28 minutes that takes human pentesters 40 hours. The broader trend matches: CrowdStrike's 2026 report logged an 89% year-over-year increase in AI-enabled attacks. XBOW is an outlier, though. It uses deterministic validation rather than pure LLM inference. Most open-source agent frameworks that threat actors actually modify and deploy don't have that discipline, which is exactly why they're vulnerable to what comes next.

The Shared Pipeline Behind Every AI Pentesting Agent

We reviewed the source code and papers behind PentestGPT (USENIX Security 2024), hackingBuddyGPT (TU Wien), AUTOATTACKER (UC Irvine/Microsoft), and several LangChain/CrewAI-based agents. They all share the same pipeline:

  1. Run a scanning tool (nmap, masscan, custom scripts)
  2. Feed the raw output into an LLM context window
  3. Let the LLM decide what to investigate next
  4. Repeat
[Figure: AI Agent Reconnaissance Loop. Four stages: (1) Scan, (2) Ingest, (3) Decide, (4) Repeat. Active deception poisons steps 2 and 3: fabricated scan data overwhelms the agent's context and degrades every decision it makes.]
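The four steps above can be sketched in a few lines. This is a deliberately toy version with a stubbed scanner and a stubbed model (every name here is ours, not from any of the frameworks cited); the point is the shape of the loop: raw tool output flows straight into the model's context with no validation step in between.

```python
def fake_scan(target: str) -> str:
    """Stand-in for step 1: pretend every target exposes one service banner."""
    return f"{target} 22/tcp open ssh OpenSSH_8.9"

def fake_llm(context: str) -> str:
    """Stand-in for steps 2-3: 'decide' by picking the host from the newest
    banner. Real agents delegate this choice entirely to model inference."""
    last_line = context.strip().splitlines()[-1]
    return last_line.split()[0]

def recon_loop(seed: str, rounds: int = 3) -> list[str]:
    context = ""
    history = []
    target = seed
    for _ in range(rounds):
        banner = fake_scan(target)   # 1. scan
        context += banner + "\n"     # 2. ingest raw output, unvalidated
        target = fake_llm(context)   # 3. let the model decide what's next
        history.append(target)       # 4. repeat
    return history

print(recon_loop("10.0.0.5"))
```

Nothing in this loop ever asks whether a banner is real. That omission is the entire attack surface the rest of this article exploits.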

The vast majority of these frameworks delegate target prioritization entirely to LLM inference. Enterprise-grade tools like XBOW layer in deterministic validation, but the open-source frameworks that threat actors actually modify and deploy rely on the model to pick what looks interesting:

  • PentestGPT maintains a task tree, but ordering within that tree is pure LLM judgment.
  • hackingBuddyGPT's core agent is roughly 50 lines of code with a round-based loop that asks the model to "give your command."
  • AUTOATTACKER adds a RAG-based experience manager but still delegates all targeting decisions to GPT-4.
  • LangChain/CrewAI agents typically chain a scanner tool directly to an LLM planner with no validation step between them.

Even the hybrid systems face a structural problem. Every framework in this list assumes the data it scans is real. The LLM has no mechanism to verify whether a service response is authentic or fabricated. If the environment itself is hostile to reconnaissance, the pipeline has no fallback.

How Active Deception Breaks the Agent Architecture

An active deception grid occupies unused IP space across your network. Firewall or router rules divert traffic destined for these ranges to the deception engine. A single sensor emulates thousands of hosts with polymorphic service signatures, each running protocol-accurate conversations. These aren't static banner strings. An agent scanning port 3389 triggers a full X.224 RDP negotiation, port 445 returns a three-step NTLM handshake, port 22 exchanges real SSH KEXINIT, and so on across the entire address range.

What happens when an AI agent hits a defended subnet?

Millions of Tokens of Noise

A typical nmap service banner runs about 20 tokens. A /16 deception grid with tens of thousands of emulated hosts, each running dozens of unique services, generates over 26 million tokens of fake service data. GPT-4o's context window tops out at 128K tokens. The raw data blows past it immediately.
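The arithmetic behind that claim, using the per-banner estimate above (all constants are the article's working assumptions, not measurements of any specific network):

```python
# Back-of-envelope token math for a /16 deception grid.
TOKENS_PER_BANNER = 20    # typical nmap -sV service banner
HOSTS = 65_536            # one /16 of emulated address space
SERVICES_PER_HOST = 20    # dozens of unique services per decoy
CONTEXT_WINDOW = 128_000  # GPT-4o

total_tokens = TOKENS_PER_BANNER * HOSTS * SERVICES_PER_HOST
print(total_tokens)                    # 26214400 tokens of fabricated banners
print(total_tokens / CONTEXT_WINDOW)   # 204.8 full context windows
```

Even under conservative assumptions, the agent would need to page through two hundred full context windows just to read the scan once.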

The agent still tries to process everything, chunk by chunk, burning through API credits and hours of compute. But every chunking strategy heavily degrades the agent's memory. hackingBuddyGPT uses a sliding window that silently drops the oldest entries; LangChain's ConversationSummaryBufferMemory progressively compresses and loses details. CrewAI truncates tool outputs to a configurable token limit, discarding everything past the cutoff, and AutoGPT's early versions simply crashed with InvalidRequestError.
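The sliding-window failure mode is easy to see in miniature. This toy uses Python's `collections.deque` to mimic the pattern (hackingBuddyGPT's actual round history works differently in detail; the eviction behavior is what matters):

```python
from collections import deque

# A bounded window: once full, the oldest entry is evicted silently.
window = deque(maxlen=3)
for entry in ["host-A: real database server", "decoy-1", "decoy-2", "decoy-3"]:
    window.append(entry)

print(list(window))  # the real host was dropped before the agent could act on it
```

Three decoys were enough to push the one real finding out of memory. Scale that to thousands of polymorphic decoys per real host and the window retains essentially nothing of value.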

There's a well-documented reason this gets worse with scale. LLMs follow a U-shaped performance curve (Liu et al., TACL 2024): they attend to the beginning and end of their context while accuracy on middle-positioned information drops dramatically. The more data you feed in, the more the model loses track of what matters.

[Figure: Multi-Document QA Accuracy by Position of Relevant Information. Adapted from Liu et al., "Lost in the Middle" (TACL 2024), Figure 1. For GPT-3.5-Turbo, Claude-1.3, and MPT-30B-Instruct, accuracy (y-axis, 30-80%) is highest when the relevant document sits at the start or end of the input context (positions 1-20) and lowest for middle positions, the zone where deception buries real hosts.]
When the agent stuffs its context to retain scan history, real hosts buried among thousands of decoys fall in the degraded middle zone, where accuracy drops 20-30 percentage points.

For a reconnaissance agent processing thousands of polymorphic service banners, there's no clean way out. Delete the data and lose real targets. Keep it and the model buries them. But context overflow is only one of the problems.

The Scanner Hangs. The Agent Waits.

Most major AI agent frameworks default to waiting indefinitely when a tool hangs. They invoke tools through synchronous blocking calls (like Python's subprocess.run() with no timeout set). If nmap hangs, the LLM never gets called back. It can't reason about the delay because it never sees it.
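The fix looks trivial, which makes its absence from framework defaults notable. A sketch of the difference (the helper name and the timeout message format are ours):

```python
import subprocess

# Typical framework default -- blocks forever if the tool tarpits:
#   subprocess.run(cmd, capture_output=True)
#
# A bounded wait at least surfaces the stall to the planner:
def run_tool(cmd, timeout_s):
    try:
        done = subprocess.run(cmd, capture_output=True, text=True,
                              timeout=timeout_s)
        return done.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child on timeout before raising
        return f"[tool timed out after {timeout_s}s: possible tarpit]"

print(run_tool(["sleep", "5"], timeout_s=0.2))  # returns instead of hanging
```

But as the next sections show, bounding the wait only trades one failure mode for another: the timeout that escapes a tarpit also drops slow-but-real hosts.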

A deception grid weaponizes this with scale and deliberate delay, both working at once.

The deception engine doesn't just open ports; it forces the scanner to interact with thousands of protocol-accurate handshakes concurrently. Parsing real RDP negotiations, SMB NTLM exchanges, SSH key algorithms, and everything else in the probe queue across thousands of IPs requires massive computational overhead. Even without intentional delay tactics, this protocol-accurate service emulation alone caused a 64x scan slowdown in our testing compared to scanning closed ports. Then socket tarpitting compounds the problem: instead of completing a TCP handshake in milliseconds, the engine slow-drips response data, intentionally holding connections open.
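The tarpitting half is mechanically simple. Here is a minimal single-connection sketch: serve a plausible SSH banner one byte at a time, holding the scanner's probe socket open for the whole drip. This is illustrative only; a real deception engine multiplexes thousands of these concurrently, and the banner string is invented.

```python
import socket
import time

def tarpit_once(port: int, drip_delay: float = 5.0) -> None:
    """Accept one connection and drip a banner byte by byte."""
    srv = socket.socket()
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", port))
    srv.listen(1)
    conn, _ = srv.accept()
    for b in b"SSH-2.0-OpenSSH_8.9\r\n":
        conn.send(bytes([b]))   # one byte per interval...
        time.sleep(drip_delay)  # ...so a 21-byte banner costs 21 intervals
    conn.close()
    srv.close()
```

From the scanner's side the port is open and slowly responding, which is exactly the case its timeout heuristics handle worst: abandoning the connection risks misclassifying a real slow service.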

[Figure: The Scan Bottleneck. nmap -sV -T4 against a deception grid (nmap 7.98, 50 ports, 3 runs averaged). Service emulation: protocol handshakes on thousands of IPs produce a 64x slowdown vs closed ports, without any tarpitting. Socket tarpitting: slow-drip response data holds connections open indefinitely on top of the emulation overhead, for a 92x slower scan overall. Projected to all 65,535 ports: 4.6 hours becomes 17.4 days.]

We tested this directly. Running nmap -sV with aggressive -T4 timing (version 7.98) against 50 tarpitted ports produced a 92x per-port amplification factor. In wall-clock time, nmap's parallelism (39-43 concurrent probes observed) masks some of that cost, but parallelism doesn't scale indefinitely. Projected to a full 65,535-port scan, the combination of mass data processing and tarpitting turns a 4.6-hour job into a 17.4-day ordeal.

A skeptical engineer might ask: "Why wouldn't the attacker just hardcode a 5-second timeout to avoid tarpits?" They can, but it forces a lose-lose tradeoff. Aggressive timeouts cause the scanner to drop real, slightly latent corporate assets. The agent pipeline either stalls for weeks, crashes from unhandled thread leaks (like CrewAI's GitHub Issue #4135), or misses the actual targets.

Poisoned Intelligence

Agents can't distinguish real from fake at scale. Researchers at George Mason University tested this with Mantis (arXiv:2410.20911), feeding LLM-driven agents into active deception environments. Over 95% were neutralized. As Dario Pasquini put it: "It was very, very easy for us to steer the LLM to do what we wanted."

The attacker's scan output, if it ever completes, is polluted with tens of thousands of fabricated service entries. Because the deception profiles are polymorphic, the agent can't deduplicate or filter them. Try asking GPT-4o to find five real database servers in a list of thousands of fabricated entries. It can't.
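A toy generator shows why exact-match or hash-based deduplication collapses almost nothing against polymorphic profiles. Every decoy presents a slightly different but plausible banner; the generator, the product names, and the version strings below are invented for illustration:

```python
import random

def polymorphic_banner(rng: random.Random) -> str:
    """Emit a plausible SSH banner with randomized product and version."""
    product = rng.choice(["OpenSSH", "Dropbear"])
    version = f"{rng.randint(7, 9)}.{rng.randint(0, 9)}p{rng.randint(1, 3)}"
    return f"SSH-2.0-{product}_{version}"

rng = random.Random(0)
unique = {polymorphic_banner(rng) for _ in range(1000)}
print(len(unique))  # well over 100 distinct variants from one tiny template
```

Even this two-field template defeats naive filtering. A real deception engine varies far more than a version string, so there is no signature for the agent to key on.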

Detection Telemetry and Attacker Cost Shift

While the agent is trapped in the deception grid, the defender has full visibility. Every packet that touches unused IP space is a confirmed indicator of reconnaissance. Near-zero false positives by architecture. No baseline learning period, no tuning.

Detection kicks in at the kernel level. An eBPF filter intercepts the first SYN at wire speed, before it reaches userspace. Profiling rules then track the source across connections and fire alerts once malicious intent is confirmed, not on the first packet alone. Dozens of detection rules with MITRE ATT&CK mapping identify which tool did it and catch lateral movement attempts. Everything streams as structured JSON to your existing SIEM.
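The detection logic reduces to a one-line predicate. This is a pure-Python stand-in for the flow described above (the real path intercepts SYNs in eBPF before userspace; the address range, JSON field names, and the T1046 mapping here are illustrative, not the product's schema):

```python
import ipaddress
import json

DECEPTION_RANGES = [ipaddress.ip_network("10.200.0.0/16")]  # unused IP space

def on_syn(src: str, dst: str, dport: int):
    """Flag any connection attempt into unused address space."""
    if any(ipaddress.ip_address(dst) in net for net in DECEPTION_RANGES):
        return json.dumps({
            "event": "recon_probe",
            "src": src, "dst": dst, "dport": dport,
            "mitre_technique": "T1046",  # Network Service Discovery
        })
    return None  # traffic to real hosts never reaches the sensor

print(on_syn("192.168.1.77", "10.200.14.9", 3389))
```

No legitimate client has a reason to connect into unused space, which is why the predicate needs no baseline, no tuning, and produces near-zero false positives by construction.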

While the defender collects all of this for the price of running a sensor, the attacker's costs compound at every stage:

The cost shift back to the attacker (/16 deception grid, nmap -sV -T4):

  • Scan time (/16 subnet): hours on a standard network vs. weeks on a deception-defended network (92x slower)
  • Scan data for the agent to process: thousands of tokens vs. 26M+ tokens
  • Intelligence quality: clean recon data vs. poisoned data indistinguishable from real
  • Detection risk for the attacker: passive IDS/NDR vs. profiled from the first probe

The Capability Tax Dilemma

Autonomous AI agents are getting faster, cheaper, and more capable. That trend isn't slowing down. But the vulnerabilities documented here aren't bugs that get patched in the next release. They're baked into how LLMs process information: finite context windows, degraded attention over long inputs, no way to distinguish real from fake at scale, and framework defaults that freeze on slow responses.

Deception exploits all of those properties. And there's an asymmetry that works in the defender's favor regardless of which direction AI capabilities go. More capable models that might detect deception are also more expensive to run. Cheap models get fooled easily. Smart models cost the attacker more per attempt. The attacker doesn't win that trade.

The obvious counterargument: won't future AI agents get better at detecting deception? Possibly. But context window limits are a hard constraint, not an engineering shortcoming. The "lost in the middle" problem persists even in million-token models. Scale and tarpitting exploit TCP-level behavior that no LLM improvement changes. And the capability tax applies to fingerprinting too: any scanner that tries to detect deception has to spend extra probes per host. Across 65,000 hosts, that multiplies the attacker's cost again.

One honest caveat: the 92x number is against stock nmap with default service probes. A purpose-built scanner with hardcoded timeouts would cut the tarpitting advantage significantly. And we don't yet have a good answer for agents that run multiple independent scanners in parallel with cross-validation; at some point, that kind of orchestration might reduce the poisoning effect, though the cost per attempt goes up fast. The context poisoning itself doesn't care what scanner you use. That's an LLM constraint, not a network one.

The cybersecurity teams that will handle the agentic AI threat aren't the ones buying faster signature updates. They're the ones making their networks hostile to reconnaissance by design.

By The Numbers

  • 92x measured scan slowdown (nmap 7.98, -sV -T4, 50 ports)
  • 26M+ tokens of fabricated scan data (65,536 hosts × 20 services × 20 tokens)
  • ~0% of real vs. fake hosts distinguishable at scale

The Active Deception Advantage

We built Portspoof Pro around this architecture. Everything described in this article (the protocol emulation, the tarpitting, the context poisoning, the kernel-level detection) runs on a single sensor per network segment, across AWS, Azure, GCP, or air-gapped environments. We built it because reconnaissance goes undetected in most networks until the post-breach forensics report, and we got tired of reading those reports.

See how the platform works or request a technical walkthrough.


References

  • Anthropic. "Disrupting the First Reported AI-Orchestrated Cyber Espionage Campaign." November 2025.
  • CrowdStrike. "2026 Global Threat Report." CrowdStrike, 2026.
  • Deng, G. et al. "PentestGPT: An LLM-empowered Automatic Penetration Testing Tool." USENIX Security Symposium, 2024.
  • Happe, A. et al. "hackingBuddyGPT." TU Wien, ipa-lab/hackingBuddyGPT, GitHub.
  • Xu, J. et al. "AutoAttacker: A Large Language Model Guided System to Implement Automatic Cyber-attacks." UC Irvine/Microsoft, arXiv:2403.01038, 2024.
  • Pasquini, D. et al. "Mantis: Targeted LLM Agent Attacks via Multi-Turn Prompt Injection." George Mason University, arXiv:2410.20911, 2024.
  • Liu, N. et al. "Lost in the Middle: How Language Models Use Long Contexts." Transactions of the ACL, 2024.
  • Wang, L. et al. "Automated Penetration Testing with LLM Agents and Classical Planning." arXiv:2512.11143, 2025.
  • Palisade Research. "LLM Agent Honeypot." Palisade Research, 2024-2025.
  • Carlini, N. et al. "AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses." Google DeepMind/ETH Zurich, arXiv:2503.01811, ICML 2025.
  • XBOW. "Autonomous AI Pentesting System." Black Hat USA, 2025.

Ready to see active deception in action?

Deploy Portspoof Pro in your environment and start detecting reconnaissance from day one.

View Platform