AI Hacking Agents Will Outperform Humans

I believe AI agents will outhack humans. Here’s why.

The Logic for How

There’s a finite number of known attack vectors to use when hacking something.

Most top hackers could articulate a majority of them.

The majority of bugs do not require incredible creativity. They require a methodology and intuition (contextual understanding).

Let’s assume there’s an AI hacker agent with the ability to modify requests and try different payloads.

And it has a relatively large prompt written specifically for finding one vulnerability type. That prompt is filled with the tips and strategies articulated by top hackers.

The agent will use the ideas detailed in the prompt, and because LLMs are quite good at extrapolating, it will also creatively apply ideas of its own.
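To make that concrete, here’s a minimal sketch of what such an agent’s core loop might look like. Everything in it is an assumption for illustration: the strategy prompt, the `llm_propose_payloads` placeholder (standing in for a real model call), the deliberately naive `looks_vulnerable` oracle, and the target URL and parameter.

```python
# Minimal sketch of a single-vulnerability-type agent loop.
# Hypothetical: llm_propose_payloads stands in for a real LLM call,
# and looks_vulnerable is a deliberately naive detection oracle.
import requests

STRATEGY_PROMPT = """You are hunting for SQL injection only.
Tips from top hackers: break out of quotes, try boolean-based probes,
try time-based probes, and watch for error strings and timing deltas."""

def llm_propose_payloads(prompt: str, history: list[dict]) -> list[str]:
    """Placeholder for a model call: in practice, send the strategy
    prompt plus the attempt history to an LLM and parse its suggested
    next payloads. Hardcoded here so the sketch runs."""
    return ["'", "' OR '1'='1", "1 AND SLEEP(5)-- -"]

def looks_vulnerable(resp: requests.Response) -> bool:
    # Naive oracle: error strings or a suspicious response-time delta.
    return "SQL syntax" in resp.text or resp.elapsed.total_seconds() > 4

def agent_loop(url: str, param: str, max_rounds: int = 5) -> str | None:
    history: list[dict] = []
    for _ in range(max_rounds):
        for payload in llm_propose_payloads(STRATEGY_PROMPT, history):
            resp = requests.get(url, params={param: payload}, timeout=10)
            history.append({"payload": payload, "status": resp.status_code})
            if looks_vulnerable(resp):
                return payload  # candidate finding for human triage
    return None
```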

Agents can also be scaled much wider than human hackers: compute is practically unlimited, while skilled humans are scarce.
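As a toy illustration of that width, reusing the hypothetical `agent_loop` from the sketch above (the target list is made up):

```python
# Sketch: fanning one agent out across many targets at once.
# Assumes the hypothetical agent_loop from the previous sketch;
# the target URLs are made up for illustration.
from concurrent.futures import ThreadPoolExecutor

targets = [(f"https://app{i}.example.com/search", "q") for i in range(100)]

with ThreadPoolExecutor(max_workers=32) as pool:
    findings = list(pool.map(lambda t: agent_loop(*t), targets))

for (url, _), finding in zip(targets, findings):
    if finding:
        print(f"{url}: candidate payload {finding!r}")
```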

Models are improving at an alarming rate. I mean, we’ve only had the tech to build decent agents for less than a year, and the tools being developed are already insane.

This is just the beginning

Sure, the first phase of hacking agents will only be able to find a small percentage of all vulnerabilities in an app (say, 2%). To pick another made-up number for comparison, imagine that a new program launch on a bug bounty platform would surface about 70% of all vulnerabilities in the same app. Here’s the thing: they will improve.

They’ll get better through:
- Better models
- Larger context windows
- Improved prompts
- Fine-tuning for hacking-specific tasks
- Getting built on better agent frameworks
- Better state management
- Other improvements we don’t know about yet

Maybe they’ll be able to find 5% of all bugs in an app within a year, or maybe it’ll be 10%. In 3 years, that percentage will go up more. In 5 years, it’ll go up even more.

It might take hacking agents 5 years before they surpass humans or it might take 20, but there’s no reason why they won’t surpass us eventually.

I don’t know when that will happen, but I’m excited about it.

Right now, there are so many companies that never get pentests or launch Vulnerability Disclosure Programs. Those companies will be able to spin up an AI hacking agent or purchase a much cheaper AI Pentest that will find significantly more than a vulnerability scan would.

Maybe AI hacking agents will be born out of vulnerability scanners, evolved with the power of AI to find way more.

Unknown Attack Vectors

The one caveat is “unknown attacks,” which is actually two categories. The first is new attacks on new technology, like the current prompt injection attacks on LLM systems. The other is novel attacks on existing technology, such as the research PortSwigger does: stuff like request smuggling and single-packet attacks.

Humans will have the edge in these two categories until LLMs surpass human intelligence in security research. It’ll be a while before AI agents are better than humans in those areas, but there’s no doubt we’ll get there.

Imagine GPT-100 (or the equivalent from other companies) built into an amazing agent framework (essentially an app that utilizes the AI model well for decision-making, self-reflection, and so on). I believe it will be better than us at finding flaws in completely new systems and products.
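A sketch of what “decision-making and self-reflection” might mean in practice, where `model_call` is a stub standing in for whatever model API exists by then:

```python
# Hypothetical decision/reflection loop an agent framework might wrap
# around a model; model_call is a stub standing in for a real API.
def model_call(prompt: str) -> str:
    return "stub response"  # a real framework would call the model here

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    notes: list[str] = []
    for _ in range(max_steps):
        # Decision making: ask the model what to do next, given the
        # goal and everything learned so far.
        action = model_call(f"Goal: {goal}\nNotes: {notes}\nNext action?")
        result = f"executed {action}"  # tool execution would happen here
        # Self-reflection: ask the model to critique the result and
        # adjust the approach before the next step.
        notes.append(model_call(
            f"Action: {action}\nResult: {result}\n"
            "What did we learn? Should the approach change?"
        ))
    return notes
```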

Wrapping up

One major reason I believe the arguments in this post is that there is a massive financial incentive to chip away at this task. An AI agent with hacking abilities that could find even 5% of vulnerabilities in a variety of applications would make a lot of money when scaled to all bug bounty programs and sold as a service. “AI Agent Pentests” will be a profitable market because the pricing can be lower than a bespoke manual assessment.

The future is going to be wild, but, in many ways, it’s already here. In other ways, it’s rapidly approaching.

Thanks,
Joseph

Sign up for my email list to know when I post more content like this.