Stop Feeding Your AI Agent 1.3MB JavaScript Files: How BurpQL Makes Burp Suite Data Actually Usable for AI
If you're using an AI agent to help with web app pentesting and you've connected it to Burp Suite via MCP, you've probably hit the wall. You ask your agent to search proxy history for a pattern, and it dutifully returns the full raw request and response for every match — including that 1.3MB minified main.js bundle that obliterates your context window and your token budget.
BurpQL fixes this. It sits between your AI agent and your Burp project data, providing compact, search-optimized responses that give the agent exactly what it needs and nothing more.
The Problem: MCP Returns Everything
The Burp Suite MCP server exposes three relevant tools for proxy history analysis:
- `get_proxy_http_history` — paginate through raw request/response pairs
- `get_proxy_http_history_regex` — regex search, returns full raw bodies for every match
- `get_scanner_issues` — scanner findings
That's it. There's no stats command. No endpoint listing. No header search. No way to say "just show me the metadata." Every query returns complete request and response bodies, whether you need them or not.
Here's what happened when I searched for `clientId` using MCP's regex tool with `count=2`:
- Match 1: Full request + full response from `/rest/admin/application-configuration` (reasonable size, and it contained the Google OAuth `clientId` I was looking for)
- Match 2: Full request + the entire `main.js` file (1.3MB, truncated by the agent runtime before it could even process it)
The actual finding — a Google OAuth clientId in a config endpoint — was a few hundred bytes. The rest was noise. The agent burned its context window on a minified JavaScript bundle and got nothing useful from it.
What BurpQL Does Differently
BurpQL imports your Burp project XML export into a SQLite database with full-text search (porter stemming) and trigram substring indexing. It exposes a REST API with purpose-built endpoints for the queries pentesters actually run.
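The two index types can be sketched with SQLite's FTS5 directly. This is a minimal illustration of the indexing approach, not BurpQL's actual schema, and the trigram tokenizer requires SQLite 3.34 or later:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Word-level index with porter stemming: "cache" matches "cached"/"caching"
conn.execute("CREATE VIRTUAL TABLE ft USING fts5(body, tokenize='porter')")
# Substring index with trigrams: finds "eyJ0eX" anywhere in the text
conn.execute("CREATE VIRTUAL TABLE tg USING fts5(body, tokenize='trigram')")

conn.execute("INSERT INTO ft VALUES ('responses are cached aggressively')")
conn.execute("INSERT INTO tg VALUES ('Authorization: Bearer eyJ0eXAiOiJKV1QiLCJh')")

stemmed = conn.execute("SELECT body FROM ft WHERE ft MATCH 'cache'").fetchall()
substr = conn.execute("SELECT body FROM tg WHERE tg MATCH '\"eyJ0eX\"'").fetchall()
print(len(stemmed), len(substr))  # both queries find their row
```

The stemmed index can never do substring matching inside a token (it never sees "eyJ0eX" as a token boundary), which is why two separate indexes are needed.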
The key design decisions:
- Metadata by default, bodies opt-in. Search results return request ID, method, path, status, content type, and highlighted match snippets. You only pull full bodies when you've found something worth reading.
- Built-in recon commands. `stats`, `hosts`, `endpoints`, `parameters`, `headers` — the stuff you'd normally piece together by manually scrolling through Burp's proxy history tab.
- Two search modes. `search` uses porter stemming for word-level queries ("cache" matches "cached" and "caching"); `grep` uses trigram indexing for exact substring matching ("eyJ0eX" to find JWTs, "document.cookie" to find DOM sinks).
- Highlighted snippets. Match context is wrapped in `>>>` and `<<<` markers, so the agent immediately sees what matched and where.
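That style of marker highlighting maps naturally onto FTS5's `highlight()` auxiliary function — a sketch of the idea, not necessarily how BurpQL implements it:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE ft USING fts5(body, tokenize='porter')")
conn.execute("INSERT INTO ft VALUES ('the config response is cached by the CDN')")

# highlight(table, column_index, open_marker, close_marker)
snippet = conn.execute(
    "SELECT highlight(ft, 0, '>>>', '<<<') FROM ft WHERE ft MATCH 'cache'"
).fetchone()[0]
print(snippet)  # the stemmed match is wrapped in the markers
```

FTS5 also offers `snippet()`, which additionally trims the surrounding context — useful for keeping agent-facing results compact.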
AI Agent Skill
BurpQL includes a Claude skill with a self-contained Python search tool that wraps the REST API. This avoids curl/jq shell escaping issues that commonly break AI agent terminal interactions.
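The core trick of such a wrapper is doing the URL encoding in Python rather than in the shell. A minimal sketch under assumptions — the endpoint paths and parameter names here are hypothetical, not BurpQL's documented API:

```python
import json
import urllib.parse
import urllib.request

BASE = "http://localhost:8888"  # BurpQL's default listen address

def build_url(path: str, **params) -> str:
    """Build a query URL; urlencode handles the quoting that
    routinely breaks curl/jq one-liners in agent shells."""
    qs = urllib.parse.urlencode(params)
    return f"{BASE}/{path}" + (f"?{qs}" if qs else "")

def call(path: str, **params) -> dict:
    """GET an endpoint (hypothetical path layout) and return parsed JSON."""
    with urllib.request.urlopen(build_url(path, **params)) as resp:
        return json.load(resp)

# A query like `grep "document.cookie" 10` becomes one safely encoded URL:
print(build_url("grep", q="document.cookie", limit=10))
```

Queries containing quotes, spaces, or regex metacharacters pass through as ordinary Python strings, so the agent never has to reason about shell escaping.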
Setup
The project's prerequisites are Python 3 and uv. You can run the API server locally, or via Docker if you prefer; a Dockerfile is included in the project.
First, export your scope from Burp to an XML file:
- Go to Proxy → HTTP history (or Target → Site map).
- Highlight the items you want to export (e.g., Ctrl+A to select all).
- Right-click and select Save items.
- In the save dialog, ensure the format is set to XML and that "Base64-encode requests and responses" is checked.
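For context on what the ingest step consumes: the export is a list of `<item>` elements with base64-encoded request and response bodies. A minimal parser sketch — the element names reflect Burp's XML export format, but this is not BurpQL's actual ingest code:

```python
import base64
import xml.etree.ElementTree as ET

def iter_items(xml_text: str):
    """Yield (method, url, status, raw_request_bytes) from a Burp XML export."""
    for item in ET.fromstring(xml_text).iter("item"):
        req = item.find("request")
        if req is not None and req.get("base64") == "true":
            raw = base64.b64decode(req.text or "")
        else:
            raw = (req.text or "").encode() if req is not None else b""
        yield (item.findtext("method"), item.findtext("url"),
               item.findtext("status"), raw)

# Tiny self-contained sample in the export's shape
body = base64.b64encode(b"POST /rest/user/login HTTP/1.1\r\nHost: x\r\n\r\n").decode()
sample = (f'<items><item><url>https://x/rest/user/login</url>'
          f'<method>POST</method><status>200</status>'
          f'<request base64="true">{body}</request></item></items>')
method, url, status, raw = next(iter_items(sample))
print(method, status, raw.splitlines()[0])
```

Base64 encoding is what keeps binary bodies intact inside the XML, which is why that checkbox matters in the save dialog.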
```shell
# Ingest and serve
docker compose run burpql my-project export.xml

# Or serve an already-ingested project
docker compose run burpql my-project
```

The API listens on localhost:8888. BurpQL ships with a Python CLI wrapper that handles URL encoding and JSON formatting, designed specifically for AI agent use:
```shell
uv run burpql-search.py stats
uv run burpql-search.py hosts
uv run burpql-search.py search "password" 20
uv run burpql-search.py grep "eyJ0eX" 10
```

The Recon Workflow
Here's the standard workflow I use when starting analysis on a Burp project. This is the same sequence I ran against an OWASP Juice Shop capture (69 requests) to benchmark BurpQL against raw MCP queries.
Step 1: Project Overview
```shell
uv run burpql-search.py stats
```

Returns a compact summary: total requests, unique hosts, HTTP method distribution, status code breakdown, and top content types. With MCP, you'd have to paginate through all 69 requests and count these yourself.
Step 2: Map the Attack Surface
```shell
uv run burpql-search.py hosts
uv run burpql-search.py endpoints
uv run burpql-search.py parameters
```

Three commands. You get: every host with request counts, every unique method+path combination (your API surface), and every URL parameter with example values. This is the equivalent of 20 minutes of clicking around in Burp's sitemap — compressed into structured JSON that an AI agent can reason over.
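Once the traffic is in SQLite, an endpoint listing like this is cheap — roughly a `GROUP BY` over method and path. The schema below is illustrative, not BurpQL's actual tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (id INTEGER PRIMARY KEY, host TEXT, method TEXT, path TEXT)"
)
conn.executemany(
    "INSERT INTO requests (host, method, path) VALUES (?, ?, ?)",
    [("api.example.com", "GET", "/rest/products"),
     ("api.example.com", "GET", "/rest/products"),
     ("api.example.com", "POST", "/rest/user/login")],
)
# Every unique method+path combination, with hit counts
endpoints = conn.execute(
    "SELECT method, path, COUNT(*) AS hits FROM requests "
    "GROUP BY method, path ORDER BY hits DESC"
).fetchall()
print(endpoints)
```

The same pattern (grouping on `host` or on extracted parameter names) would cover the `hosts` and `parameters` commands.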
Step 3: Security Header Audit
```shell
uv run burpql-search.py headers Set-Cookie
uv run burpql-search.py headers Content-Security-Policy
uv run burpql-search.py headers Access-Control-Allow-Origin
```

Each returns only the requests containing that header, with the header value shown inline. Against Juice Shop, this immediately surfaced:
- A JWT containing an MD5 password hash, set via `Set-Cookie`
- `unsafe-eval` in the CSP
- Wildcard CORS (`Access-Control-Allow-Origin: *`) across all API endpoints
Step 4: Search for Secrets and Sensitive Data
```shell
uv run burpql-search.py grep "password" 20
uv run burpql-search.py grep "eyJ0eX" 10
uv run burpql-search.py grep "clientId" 10
```

The grep command found the Google OAuth `clientId` leak in 2 results with compact highlighted snippets. The MCP regex equivalent returned the same 2 matches but with full bodies, consuming orders of magnitude more tokens.
Step 5: Drill Into Specific Findings
```shell
# Now that you know request #42 is interesting, get the full detail
uv run burpql-search.py request 42
```

This is where you intentionally pull the body — for a single, specific request you've already triaged as worth reading.
Side-by-Side Comparison
I ran identical searches through both BurpQL and the Burp MCP server. Here's what I found:
Recon Capabilities
| Task | BurpQL | Burp MCP |
|---|---|---|
| Project statistics | `stats` — one command | No equivalent |
| List all hosts | `hosts` — one command | No equivalent |
| List all endpoints | `endpoints` — one command | No equivalent |
| List all parameters | `parameters` — one command | No equivalent |
| Search headers | `headers <name>` — returns only matching headers | No equivalent |
| Text search | `search` (stemmed) + `grep` (substring) with snippets | `get_proxy_http_history_regex` — full bodies |
MCP has no recon primitives at all. You're stuck paginating through raw history and parsing it yourself.
Context Window Impact
The same clientId search:
- BurpQL: ~200 tokens of structured metadata + highlighted snippets
- MCP: ~50,000+ tokens of raw request/response bodies (including a truncated 1.3MB JS file)
That's not a rounding error. That's the difference between an agent that can search 20 patterns in a single conversation and one that runs out of context after 2 queries.
What MCP Does Better
BurpQL isn't a replacement for the Burp MCP server. It's a complement. MCP gives you:
- Live interaction with a running Burp instance
- Scanner results via `get_scanner_issues`
- Active testing — sending requests, triggering scans
BurpQL is a read-only snapshot. It can't interact with a live Burp session.
The Workflow That Works
Use both tools together:
- BurpQL first — Map the attack surface, search for patterns, identify interesting request IDs. Build your understanding of the target without burning context.
- Burp MCP second — Take action on specific findings. Check scanner issues. Replay requests. Interact with live Burp features for the targets you've already identified.
Think of BurpQL as the recon phase and MCP as the exploitation phase. You wouldn't run sqlmap on every endpoint blindly — you'd map the surface first, identify injection points, then test. Same principle.
Quick Reference
```shell
# Recon
uv run burpql-search.py stats
uv run burpql-search.py hosts
uv run burpql-search.py endpoints
uv run burpql-search.py parameters

# Search (metadata only by default)
uv run burpql-search.py search "authorization" 20
uv run burpql-search.py grep "api_key" 20
uv run burpql-search.py headers Set-Cookie

# Drill down (full bodies for a specific request)
uv run burpql-search.py request 42

# Filtered listing
uv run burpql-search.py requests --host api.example.com --status 200 --method POST
```

TL;DR
BurpQL gives your AI agent the ability to efficiently search and analyze Burp Suite project data without drowning in raw HTTP bodies. It's the difference between asking "which endpoints leak credentials?" and getting a crisp answer versus getting a 1.3MB JavaScript file dumped into your context window. Use it for recon and search. Use MCP for live Burp interaction. Use both and your AI-assisted pentesting workflow actually works.