feat(wordlist): add generator for sqlmap risk-classified SQLi payloads

Replaces the bulk-files approach from #1319. Adds a small Python script that
reads sqlmap's payload XMLs (boolean_blind, error_based, inline_query,
stacked_queries, time_blind) and writes one wordlist per (risk, category)
under Low-/Medium-/High-Risk-Payloads. union_query is skipped because sqlmap
generates those at runtime from a column-count range.

Refs #1011
0xBassia 2026-04-25 13:10:19 +03:00
parent a4f7d9e9f6
commit fa8c65b482
3 changed files with 145 additions and 1 deletions

@@ -1,3 +1,5 @@
# SQL Injection wordlists
> [!CAUTION]
> Many of these wordlists contain potentially destructive queries which may permanently delete data on any databases they're used on. For more information see [issue #1011](https://github.com/danielmiessler/SecLists/issues/1011)
For a safer starting set, see [`sqlmap-risk-classified/`](sqlmap-risk-classified/). It ships a small Python script that pulls sqlmap's payload XMLs and writes them to disk split by sqlmap's own risk level (1/2/3), so you can pick low-risk payloads first.

@@ -0,0 +1,54 @@
# sqlmap risk-classified SQLi payloads
This folder doesn't ship the wordlists themselves. Run `generate.py` and it
pulls sqlmap's payload XMLs and writes the wordlists into
`Low-Risk-Payloads/`, `Medium-Risk-Payloads/`, `High-Risk-Payloads/` next to
the script.
```
python3 generate.py
```
For air-gapped use, point it at a local sqlmap checkout:
```
python3 generate.py --offline /path/to/sqlmap/data/xml/payloads
```
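After a run, each risk folder holds one `.txt` file per payload category (for example `Low-Risk-Payloads/boolean_blind.txt`), one payload per line. A minimal sketch of loading them from Python follows; `read_wordlists` is a hypothetical helper for illustration, not part of `generate.py`.

```python
from pathlib import Path


def read_wordlists(root, folder="Low-Risk-Payloads"):
    """Return {category: [payload, ...]} for every .txt file in one risk folder.

    `root` is the directory the wordlists were written to (by default the
    folder that contains generate.py).
    """
    lists = {}
    for path in sorted(Path(root, folder).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        # Skip blank lines; everything else is one payload per line.
        lists[path.stem] = [line for line in lines if line]
    return lists
```

Calling `read_wordlists(".")` from this folder would return the low-risk payloads keyed by category name, such as `boolean_blind`.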
## What the risk levels mean
Same meaning as sqlmap's own `--risk` option:
* Risk 1 (Low): AND-based and CASE-WHEN payloads. Fine to fire at any
parameter you don't yet understand. This is sqlmap's default.
* Risk 2 (Medium): heavy time-based payloads. Not destructive, but they hold
the connection open and can stress the target.
* Risk 3 (High): OR-based payloads. The condition matches every row, which
matters if the parameter ends up in an UPDATE or DELETE WHERE clause.
Don't fire these at unknown endpoints. Issue #1011 has the background.
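The split by risk can be sketched as below. The XML is a hypothetical, heavily simplified sample in the shape `generate.py` reads (a `<test>` with a `<risk>` and a `<request><payload>`); real sqlmap files carry many more fields per test, and `classify` is an illustrative name, not a function in the script.

```python
from xml.etree import ElementTree as ET

# Hypothetical sample, simplified from the layout of sqlmap's payload XMLs.
SAMPLE = """<root>
  <test>
    <title>AND boolean-based blind</title>
    <risk>1</risk>
    <request><payload>AND [INFERENCE]</payload></request>
  </test>
  <test>
    <title>OR boolean-based blind</title>
    <risk>3</risk>
    <request><payload>OR [INFERENCE]</payload></request>
  </test>
</root>"""

FOLDERS = {
    1: "Low-Risk-Payloads",
    2: "Medium-Risk-Payloads",
    3: "High-Risk-Payloads",
}


def classify(xml_text):
    """Group payload strings by the folder that matches their <risk> value."""
    grouped = {}
    for test in ET.fromstring(xml_text).findall(".//test"):
        risk = int(test.findtext("risk").strip())
        payload = test.findtext("request/payload").strip()
        grouped.setdefault(FOLDERS[risk], []).append(payload)
    return grouped


result = classify(SAMPLE)
```

With this sample, the AND payload lands under `Low-Risk-Payloads` and the OR payload under `High-Risk-Payloads`; nothing is classified as medium risk.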
## Placeholders kept verbatim from sqlmap
Payloads still contain sqlmap's tokens. If you're piping into a fuzzer that
doesn't substitute them, swap them out with `sed` first.
| Token | Replace with |
| --- | --- |
| `[RANDNUM]`, `[RANDNUM1]`, `[RANDNUM2]` | any integer |
| `[RANDSTR]` | any short alphanumeric string |
| `[ORIGVALUE]` | the legitimate value of the param |
| `[SLEEPTIME]` | seconds for time-based payloads, e.g. 5 |
| `[INFERENCE]` | a boolean expression like `1=1` |
| `[GENERIC_SQL_COMMENT]` | `-- -` |
| `[DELIMITER_START]`, `[DELIMITER_STOP]` | any short unique tokens |
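If `sed` is not convenient, the same substitution can be sketched in Python. The values below are arbitrary examples chosen for illustration, and `fill` is a hypothetical helper; tokens missing from the mapping are left untouched so they stay easy to spot.

```python
import re

# Example values only; choose ones that suit the parameter being tested.
VALUES = {
    "RANDNUM": "1337",
    "RANDNUM1": "1338",
    "RANDNUM2": "1339",
    "RANDSTR": "qzxk",
    "ORIGVALUE": "1",
    "SLEEPTIME": "5",
    "INFERENCE": "1=1",
    "GENERIC_SQL_COMMENT": "-- -",
}

# Matches bracketed upper-case tokens such as [RANDNUM] or [SLEEPTIME].
TOKEN = re.compile(r"\[([A-Z0-9_]+)\]")


def fill(payload, values=VALUES):
    """Replace known tokens in one payload; keep unknown tokens as they are."""
    return TOKEN.sub(lambda m: values.get(m.group(1), m.group(0)), payload)


filled = fill("AND [RANDNUM]=[RANDNUM] AND SLEEP([SLEEPTIME])[GENERIC_SQL_COMMENT]")
```

Here `filled` is `AND 1337=1337 AND SLEEP(5)-- -`. Using a function as the replacement avoids any backslash handling that a plain replacement string would apply.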
## Why a script and not the files
So you can re-run it whenever sqlmap adds payloads upstream instead of
waiting for someone to update SecLists.

`union_query.xml` is skipped on purpose: sqlmap builds those payloads at
runtime from a column-count range, so there are no fixed strings to extract.

Source: https://github.com/sqlmapproject/sqlmap/tree/master/data/xml/payloads

License of the payload content: GPL v2 (sqlmap's license).

@@ -0,0 +1,88 @@
#!/usr/bin/env python3
# Build risk-classified SQLi wordlists from sqlmap's payload XMLs.
# Splits each XML by <risk> into Low/Medium/High folders.
# Run: python3 generate.py                    (fetches from GitHub)
#      python3 generate.py --offline /path/to/sqlmap/data/xml/payloads
#
# Issue ref: https://github.com/danielmiessler/SecLists/issues/1011

import argparse
import os
import sys
import urllib.request
from xml.etree import ElementTree as ET

RAW = "https://raw.githubusercontent.com/sqlmapproject/sqlmap/master/data/xml/payloads"

# union_query.xml is skipped on purpose: sqlmap builds those payloads from
# a column-count range at runtime, so there are no fixed strings to extract.
SOURCES = [
    "boolean_blind.xml",
    "error_based.xml",
    "inline_query.xml",
    "stacked_queries.xml",
    "time_blind.xml",
]

RISK = {1: "Low-Risk-Payloads", 2: "Medium-Risk-Payloads", 3: "High-Risk-Payloads"}


def load(name, offline):
    # Read one payload XML: from a local checkout when --offline is given,
    # otherwise from sqlmap's master branch on GitHub.
    if offline:
        with open(os.path.join(offline, name), "rb") as f:
            return f.read()
    with urllib.request.urlopen(f"{RAW}/{name}", timeout=30) as r:
        return r.read()


def payloads(xml):
    # Yield (risk, payload) for every <test> that has a usable <risk> and
    # <request><payload>; anything else is skipped.
    for t in ET.fromstring(xml).findall(".//test"):
        r = t.find("risk")
        p = t.find("request/payload")
        if r is None or r.text is None or p is None or p.text is None:
            continue
        try:
            risk = int(r.text.strip())
        except ValueError:
            continue
        if risk in RISK:
            yield risk, p.text.strip()


def main():
    ap = argparse.ArgumentParser()
    ap.add_argument("--offline", help="path to sqlmap/data/xml/payloads")
    ap.add_argument("--out", default=os.path.dirname(os.path.abspath(__file__)))
    args = ap.parse_args()

    bucket = {(r, s): [] for r in RISK for s in SOURCES}
    for src in SOURCES:
        try:
            data = load(src, args.offline)
        except Exception as e:
            print(f"failed to read {src}: {e}", file=sys.stderr)
            return 1
        for risk, payload in payloads(data):
            bucket[(risk, src)].append(payload)

    n = 0
    for (risk, src), items in bucket.items():
        if not items:
            continue
        # dedupe but keep order (same payload string appears in multiple <test>
        # blocks with different dbms hints)
        uniq = list(dict.fromkeys(items))
        d = os.path.join(args.out, RISK[risk])
        os.makedirs(d, exist_ok=True)
        out = os.path.join(d, src.replace(".xml", ".txt"))
        with open(out, "w", encoding="utf-8") as f:
            f.write("\n".join(uniq) + "\n")
        n += 1
        print(f"{RISK[risk]}/{os.path.basename(out)}: {len(uniq)}")
    print(f"wrote {n} files under {args.out}")
    return 0


if __name__ == "__main__":
    sys.exit(main())