Constable Documentation

Intelligent Security Proxy · AgileSecOps

Constable is a single-binary Go reverse proxy with WAF-like request inspection. It sits in front of one or more upstream targets and blocks or logs requests matching configurable regex rules applied to URLs, headers, and request/response bodies — then inspects, rewrites, and header-hardens the response on the way back.

The proxy binary is stdlib-only — zero external dependencies. Config is hot-reloaded from disk every 3 seconds (no restart for rules, limits, TLS domains, peers, cache, and more), and can optionally be pulled from a remote GitHub URL.

Quick Start

Install and run in two commands.

Guided Setup

Observe → review → enforce.

Request Pipeline

The 25-stage gauntlet.

Full Example

A complete config.json.

What's in the box #

Every capability below is configured in a single config.json. Enable only what you need; everything is off by default unless noted.

Per-IP rate limiting & connection limits

IP allow / block lists & GeoIP blocking

Auth: Basic / API key / JWT / mTLS

TLS termination + Let's Encrypt (HTTP-01 / DNS-01)

Multiple upstreams, load balancing, health checks

Response caching & gzip compression

Inline CVE detection with stack-aware exposure scoring

3-layer botnet detection

Adaptive learning (shadow-first anomaly scoring)

Peer-to-peer block sync across nodes

Prometheus metrics & SIEM/syslog streaming

Daily AI security report

New here?

Start with the Quick Start, then follow the Guided Setup Path — the recommended progression of start minimal → run in observe mode → review → promote rules to blocking. This reference page documents every field exhaustively.

Quick Start #

Two things are required: where to listen and where to forward.

1. Install

sudo apt install ./constable_<version>_amd64.deb
sudo systemctl start constable

The package installs the binary to /usr/bin/constable and its config to /etc/constable/config.json. The config file is checked every 3 seconds and hot-reloaded when it changes, so no restart is needed. See Install from .deb for the full package layout.

2. The bare-minimum config.json

{
  "listen_addr": "127.0.0.1:8080",
  "target_url": "http://127.0.0.1:9000"
}

That's a working reverse proxy. Every request to 127.0.0.1:8080 is forwarded to your backend. A health endpoint is available at /healthz (bypasses auth/WAF, rate-limited).

Environment variables for config loading

Variable	Description
`CONSTABLE_CONFIG_PATH`	Path to a local config file (overrides the default `config.json` location).
`CONSTABLE_REMOTE_CONFIG_URL`	Remote URL to poll for config — see Remote Config.

Tip

Keep listen_addr on 127.0.0.1 until you've finished tuning, then move to 0.0.0.0 (or put it behind your existing edge) when you go live.

Guided Setup Path #

The recommended progression: start minimal → observe with learning → review what it found → promote rules to blocking → layer on the rest.

Guiding principle

Never start in blocking mode on day one. Run in detect/observe first, look at what the proxy would have done against your real traffic, and only then turn on enforcement. This avoids blocking legitimate users while you tune.

1. Add observability early #

Turn on the metrics endpoint now — you'll need it to see what the learning modes are doing. Keep it local-only so it isn't exposed to the world:

{
  "listen_addr": "127.0.0.1:8080",
  "target_url": "http://127.0.0.1:9000",
  "metrics_addr": "127.0.0.1:9090",
  "metrics_local_only": true,
  "log_file": "/var/log/constable/constable.log",
  "log_format": "json"
}

curl -s http://127.0.0.1:9090/metrics gives you Prometheus counters; json log format is recommended for anything you'll grep or ship to a log pipeline.

2. Run both learning systems in observe mode #

Constable has two complementary learning systems. Run both, in observe/log mode first, then review.

System	What it learns	Best for
Learn Mode	The query-parameter names your app uses → a file of `mode:"log"` rules to review	Bootstrapping a static rule set from scratch
Adaptive Learning	A live behavioral profile + a risk score per request	Ongoing anomaly detection that adapts over time

Enable Learn Mode and Adaptive Learning in shadow, then drive representative, legitimate traffic through the proxy (real users, a staging suite, or a crawler).

3. Detect: review what learning found #

# Learn Mode candidate rules firing
grep '"event":"DETECT"' constable.log | grep 'learned-'

# Adaptive: what shadow mode WOULD have blocked
curl -s http://127.0.0.1:9090/metrics | grep proxy_adaptive_would_block_total
grep '"event":"ADAPTIVE"' constable.log

You are looking for two things: the would-block counter rises with known-bad traffic, and stays at (or near) zero for legitimate traffic.

4. Apply: promote rules to blocking #

Only after review, copy trusted rules from learned-rules.json into url_rules/header_rules and change "mode": "log" to "mode": "block"; then disable Learn Mode. Promote Adaptive Learning to enforce via a conditional rule with trigger_on: "adaptive_score".

5. Going to production — checklist #

listen_addr bound where you want it (behind your edge/firewall as appropriate).
log_file set, log_format: "json", rotation configured.
metrics_addr set with metrics_local_only: true.
Learn Mode disabled after rules were promoted.
Adaptive Learning ran in shadow for a full traffic cycle before enforce.
Rules promoted to block were watched in log first.
Secrets supplied via $ENV{VAR} expansion, not hard-coded.
Running under systemd with restart-on-failure.

Remote Config from GitHub #

The proxy can poll a remote URL (e.g. a raw GitHub file) and reload automatically when the content changes.

Variable	Required	Default	Description
`CONSTABLE_REMOTE_CONFIG_URL`	yes	—	Full URL to poll. Must be `https` — plaintext is refused at startup unless the allow-http flag is set. Redirects to internal/loopback/metadata addresses are blocked.
`CONSTABLE_REMOTE_CONFIG_TOKEN`	no	—	GitHub token for private repos (sent as `Authorization: token <value>`). Also raises the rate limit from 60 to 5,000 req/hour.
`CONSTABLE_REMOTE_CONFIG_INTERVAL`	no	`60`	Poll interval in seconds.
`CONSTABLE_REMOTE_CONFIG_ALLOW_HTTP`	no	—	Set to `1` to allow a plaintext `http://` URL (insecure; logs a loud warning).

export CONSTABLE_REMOTE_CONFIG_URL="https://raw.githubusercontent.com/you/repo/main/proxy.json"
export CONSTABLE_REMOTE_CONFIG_TOKEN="ghp_xxxxxxxxxxxx"   # optional, for private repos
export CONSTABLE_REMOTE_CONFIG_INTERVAL=60                # optional, default 60
./constable

How it works

The local config.json is always loaded first as the initial/fallback config.
On startup the proxy fetches the remote URL; on success it applies the remote config and writes it back to the local file.
Every interval it polls using an ETag / If-None-Match header. GitHub returns 304 Not Modified when unchanged — free requests that don't count against the rate limit.
When the file changes, the new config is validated and applied live, the local cache file is updated, and a reload line is logged.
If the remote fetch fails (network error, non-2xx, invalid JSON), the proxy logs a warning and continues with the current config unchanged.
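
One poll cycle from the steps above can be sketched as follows. This is an illustrative Python sketch, not Constable's actual implementation (the proxy itself is written in Go). The names `PollState`, `poll_once`, and `fetch` are hypothetical; `fetch` stands in for an HTTP GET that sends `If-None-Match` with the last seen ETag. Validation is reduced to "parses as a JSON object", which is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# fetch(etag) -> (status_code, new_etag, body). Hypothetical stand-in for an
# HTTP GET that sends "If-None-Match: <etag>" when etag is not None.
Fetch = Callable[[Optional[str]], Tuple[int, Optional[str], bytes]]


@dataclass
class PollState:
    config: dict                 # currently active config (starts as the local file)
    etag: Optional[str] = None   # ETag of the last applied remote config


def poll_once(fetch: Fetch, state: PollState) -> str:
    """Run one poll cycle; return 'unchanged', 'applied', or 'kept'."""
    try:
        status, etag, body = fetch(state.etag)
    except OSError:
        return "kept"            # network error: keep the current config
    if status == 304:
        return "unchanged"       # Not Modified: nothing to do
    if status != 200:
        return "kept"            # any other status: keep the current config
    try:
        new_config = json.loads(body)
    except ValueError:
        return "kept"            # invalid JSON: keep the current config
    if not isinstance(new_config, dict):
        return "kept"            # a config must be a JSON object (assumption)
    state.config, state.etag = new_config, etag
    return "applied"
```

The key property is that every failure path returns without touching `state`, so a bad fetch can never replace a working config.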

Suppressing poll log noise

Set "remote_config_silent": true to suppress routine poll / 304 lines; errors, 429 rate-limits, and 200 config applied always log. This setting is hot-reloadable.

Install from .deb (Debian/Ubuntu) #

Constable is delivered as prebuilt .deb packages for amd64 and arm64.

# install the package you were provided (match your host architecture)
sudo apt install ./constable_<version>_amd64.deb
sudo $EDITOR /etc/constable/config.json   # seeded from the shipped example on first install
sudo systemctl start constable

What the package lays down

Path	Purpose
`/usr/bin/constable`	The proxy binary.
`/etc/constable/config.json`	Live config. Seeded from the example on first install; never overwritten on upgrade.
`/etc/constable/config.json.example`	Reference schema, refreshed every upgrade (dpkg conffile).
`/lib/systemd/system/constable.service`	systemd unit. Runs as the `constable` user with `CAP_NET_BIND_SERVICE`.
`/var/lib/constable/`	State dir (good place for the Let's Encrypt cache).
`/var/log/constable/`	Log dir.

The unit sets CONSTABLE_CONFIG_PATH=/etc/constable/config.json and is enabled on install. Upgrades preserve config.json and try-restart the service. apt purge removes the system user and config dir.

Running as a systemd Service #

If you can't use the .deb directly (non-Debian distro, custom paths), this manual layout is what the package effectively does. The proxy is a single static binary, so it runs cleanly under systemd. To get the binary and example config, extract the package you were provided: dpkg-deb -x constable_<version>_amd64.deb ./extracted.

Lay out files

sudo useradd --system --home /opt/constable --shell /usr/sbin/nologin constable
sudo mkdir -p /opt/constable /var/log/constable /var/lib/constable
sudo install -m 0755 constable /opt/constable/proxy
sudo install -m 0640 config.json.example /opt/constable/config.json   # then edit it
sudo chown -R constable:constable /opt/constable /var/log/constable /var/lib/constable

Permissions

Keep config.json group-readable by the constable user (mode 0640, group constable). A root-owned 0600 file will make the service fail to read it.

systemd unit — /etc/systemd/system/constable.service

[Unit]
Description=Constable
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=constable
Group=constable
WorkingDirectory=/opt/constable
ExecStart=/opt/constable/proxy
EnvironmentFile=-/etc/constable.env
Environment=CONSTABLE_CONFIG_PATH=/opt/constable/config.json
Restart=on-failure
RestartSec=2s
LimitNOFILE=1048576

# Allow binding 80/443 as non-root
AmbientCapabilities=CAP_NET_BIND_SERVICE
CapabilityBoundingSet=CAP_NET_BIND_SERVICE

# Hardening
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ReadWritePaths=/var/log/constable /var/lib/constable /opt/constable

[Install]
WantedBy=multi-user.target

The [Install] section is required — without it systemctl enable refuses the unit.

Secrets in an env file (optional)

Only needed if you've enabled the AI report, SMTP, or peer sync:

sudo tee /etc/constable.env >/dev/null <<'EOF'
ANTHROPIC_API_KEY=sk-ant-...
SMTP_PASSWORD=...
PEER_SYNC_KEY=...
EOF
sudo chmod 0600 /etc/constable.env
sudo chown root:constable /etc/constable.env

Start and enable

sudo systemctl daemon-reload
sudo systemctl enable --now constable
sudo systemctl status constable
sudo journalctl -u constable -f          # live logs

Day-to-day

Edit the config and the proxy hot-reloads within ~3 seconds; no restart is needed for rules, rate limits, peers, cache, gzip, or Let's Encrypt domains. Run systemctl restart only when you swap the binary or change values in the env file.

Request Processing Pipeline #

Every incoming request passes through these checks in order. The first failure stops processing and returns the configured error response (default 403).

Incoming request
      │
 1.  Generate / propagate Request ID
 2.  Health check endpoint  (/healthz)
 3.  ACME HTTP-01 challenge  (Let's Encrypt)
 4.  HTTP → HTTPS redirect
 5.  Acquire worker slot  → 503 on timeout
 6.  Rate limit check (per-IP)  → 429 if exceeded
 7.  IP allow / block list
 8.  GeoIP check
 9.  Botnet detection
         Layer 1: IP reputation blocklists
         Layer 2: Behavioral (error rate + path scan)
         Layer 3: User-agent fingerprinting
10.  Conditional rules  (per-endpoint threshold blocks)
11.  Authentication  (Basic / API key / JWT)
12.  HTTP method check
13.  Header count and size limits
14.  URL rules
15.  Header rules
16.  CVE detection — URL + headers  (named signatures, stack-scoped)
17.  Per-path rules
18.  Adaptive scoring  (shadow: log only; enforce: feed conditional rules)
19.  Read request body  → 413 if over max_body_bytes
20.  Body rules  → 408 on regex timeout
21.  CVE detection — request body
22.  Select upstream  (load balancer, health-aware)
      │  ▼  Forward to upstream  ▼
23.  Passive stack fingerprint + inspect response body
24.  Apply response rewrites
25.  Inject security headers / strip removed headers
      │
 Record against conditional + adaptive windows → Return to client

Rule modes

"mode": "log" emits a [DETECT] line but does not block — the request continues. "mode": "null" silently drops the TCP connection without any response (mirrors iptables DROP), emitting [DROP]. CVE detections emit [CVE_DETECT] / [CVE_BLOCK]. Adaptive scoring never blocks inline — in shadow it only emits [ADAPTIVE].

Multi-Domain Hosting #

One proxy can serve multiple distinct domains/subdomains, each with its own upstreams, rules, auth, methods, gzip, cache, and security headers. Add a domains map where each key is the Host header to match (lowercased, port stripped).

{
  "listen_addr": ":443",
  "tls": { "listen_addr": ":443" },
  "lets_encrypt": { "enabled": true, "email": "ops@example.com", "cache_dir": "/var/lib/constable/acme" },

  "url_rules": [
    { "label": "deny .env (global)", "pattern": "(?i)\\.env($|\\?)" }
  ],

  "domains": {
    "api.example.com": {
      "upstreams": [{ "url": "http://api-backend:8080" }],
      "url_rules": [{ "label": "api: block /admin", "pattern": "(?i)^/admin" }],
      "auth": { "api_key": { "enabled": true, "header": "X-API-Key", "keys": ["$ENV{API_KEY_PROD}"] } }
    },
    "www.example.com": {
      "target_url": "http://www-backend:8080",
      "gzip": { "enabled": true, "level": 6 }
    }
  }
}

Per-Vhost vs Global #

Per-vhost (in `domains.*`)	Stays global (top-level only)
`target_url` / `upstreams` / `load_balance` / `preserve_host`	`rate_limit`
`url_rules` / `header_rules` / `body_rules` / `response_body_rules`	`botnet_detection`
`path_rules`	`conditional_rules`
`allowed_methods`	`geoip`
`auth`	`peers`
`security_headers` / `remove_response_headers`	`allowed_ips` / `blocked_ips`
`gzip` / `cache`	`tls` / `listen_addr` / `max_workers`
`inspect_get_body` / `max_body_bytes` / `allowed_redirect_hosts`	`lets_encrypt` (but SAN list auto-includes vhost keys)
`response_rewrites` / `upstream_stack`	`health_check` / `trusted_proxies` / `ai_report`

Stateful subsystems (rate-limit, botnet, conditional rules, peer sync, GeoIP) intentionally stay global — they track cross-host attack signal. CVE detection is global too, but its upstream_stack declaration is per-vhost.

Merge precedence #

Per-request merge order is per-path rules > per-domain block > top-level config. Within the per-domain block, each overridable field uses nil = inherit, non-nil = override:

Slice fields (url_rules, etc.): absent / null ⇒ inherit top-level. A non-nil slice (even []) replaces the top-level slice for that vhost.
Pointer fields (gzip, cache, auth, …): absent ⇒ inherit; present ⇒ override.
security_headers: absent map ⇒ inherit; present map ⇒ override (maps are not merged).
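
The "nil = inherit, non-nil = override" rule can be illustrated with a small Python sketch. It models only the per-domain layer over the top-level config (the per-path layer is left out), and `effective_config` is a hypothetical helper name, not part of Constable. Python's `None` stands in for an absent or null field.

```python
from typing import Any, Dict


def effective_config(top_level: Dict[str, Any], vhost: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve per-vhost settings: absent or None inherits, anything else overrides."""
    merged = dict(top_level)
    for field, value in vhost.items():
        if value is not None:      # [] and {} are non-nil, so they override too
            merged[field] = value  # whole-value replacement; maps are not merged
    return merged


top = {
    "url_rules": [{"label": "deny .env (global)"}],
    "gzip": {"enabled": False},
    "security_headers": {"X-Frame-Options": "DENY"},
}
api = {
    "url_rules": [],                                          # empty slice: replaces the global rules
    "gzip": None,                                             # null: inherit
    "security_headers": {"Referrer-Policy": "no-referrer"},   # present map: override, not merge
}
cfg = effective_config(top, api)
```

Here `cfg["url_rules"]` is empty, `cfg["gzip"]` is inherited from the top level, and `cfg["security_headers"]` contains only the vhost's map.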

Default vhost

Requests whose Host doesn't match any domains key fall through to the top-level config, which acts as the default vhost — so existing single-host configs work unchanged. To reject unknown hosts, leave target_url and upstreams empty at the top level and the proxy returns 502 for unmatched hosts.

Core Settings #

Field	Type	Default	Description
`listen_addr`	string	required	Address to listen on, e.g. `":80"` or `"127.0.0.1:8000"`.
`target_url`	string	—	Single upstream origin, e.g. `"http://localhost:8080"`. Use `upstreams` for multiple.
`max_workers`	int	`NumCPU×4`	Max concurrent requests. Excess requests queue until `queue_timeout_ms`.
`max_procs`	int	`0` (all CPUs)	`GOMAXPROCS`. `0` uses all available CPUs.
`queue_timeout_ms`	int	`5000`	Milliseconds to wait for a worker slot before returning 503.
`inspect_get_body`	bool	`false`	Apply body rules to GET requests.
`log_allowed`	bool	`false`	Emit `[ALLOW]` log lines for requests that pass all checks.
`preserve_host`	bool	`false`	Forward the original `Host` header to the upstream instead of rewriting it.

Blocking Behavior #

Field	Type	Default	Description
`block_status_code`	int	`403`	HTTP status code returned when a request is blocked.
`block_message`	string	`"Blocked by proxy policy"`	Response body text when a request is blocked.
`block_x_forwarded_for`	bool	`false`	Also apply IP allow/block checks to `X-Forwarded-For` / `X-Real-IP`. Only honored when the value parses as a real IP and arrives via a configured `trusted_proxies` hop.
`trusted_proxies`	[]string	`[]`	IPs/CIDRs of trusted upstream proxies allowed to set `X-Forwarded-For`.

Forwarding headers sent upstream

The proxy is the sole authority for the X-Forwarded-* headers it sends to the backend: it overwrites X-Forwarded-For with its own trusted client-IP view, sets X-Forwarded-Proto/X-Forwarded-Host from its own state, and strips client-supplied Forwarded, X-Forwarded-Scheme, X-Original-URL/-Host, and X-Rewrite-URL. If you place this proxy behind another L7 hop, list that hop in trusted_proxies.

IP Allow / Block Lists #

Exact IPs and CIDR ranges are both supported. If allowed_ips is non-empty, all IPs not in the list are blocked. The allow list takes precedence over the block list.

Field	Type	Default	Description
`allowed_ips`	[]string	`[]`	If non-empty, only these IPs/CIDRs are allowed through.
`blocked_ips`	[]string	`[]`	These IPs/CIDRs are always blocked.

{
  "allowed_ips": ["203.0.113.5", "198.51.100.0/24"],
  "blocked_ips": ["192.0.2.100"]
}

allowed_ips traffic takes precedence and bypasses adaptive scoring.
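
A minimal Python sketch of the allow/block decision described above, using the standard `ipaddress` module. The function names are hypothetical, and the precedence is modeled on one reading of the text: when `allowed_ips` is non-empty it alone decides, otherwise `blocked_ips` applies.

```python
import ipaddress
from typing import Iterable, Sequence


def _matches(ip: str, entries: Iterable[str]) -> bool:
    addr = ipaddress.ip_address(ip)
    # ip_network accepts both a bare IP ("203.0.113.5") and a CIDR ("198.51.100.0/24")
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in entries)


def ip_allowed(ip: str, allowed_ips: Sequence[str], blocked_ips: Sequence[str]) -> bool:
    if allowed_ips:
        # Non-empty allow list: only listed addresses pass, and a listed
        # address passes even if it also appears in blocked_ips.
        return _matches(ip, allowed_ips)
    return not _matches(ip, blocked_ips)
```

With the example config above, `ip_allowed("198.51.100.77", allowed, blocked)` is true because the address falls inside the allowed /24, while `192.0.2.100` is rejected.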

Rate Limiting #

Per-IP token bucket rate limiter. Activates as soon as requests_per_second > 0 — there is no separate enabled flag. Stale entries are cleaned up automatically.

Field	Type	Default	Description
`rate_limit.requests_per_second`	float	`0` (off)	Sustained request rate per IP.
`rate_limit.burst`	int	`0`	Maximum burst above the sustained rate.
`rate_limit.cleanup_interval_sec`	int	`300`	How often stale per-IP buckets are removed (seconds).
`rate_limit.max_concurrent_per_ip`	int	`0`	Caps simultaneously in-flight requests per IP (defeats slowloris-style connection holding). See Production Hardening.

{
  "rate_limit": {
    "requests_per_second": 100,
    "burst": 200,
    "cleanup_interval_sec": 300
  }
}
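
For readers unfamiliar with token buckets, here is a generic Python sketch of the technique. It is not Constable's code: the class name, the injectable `clock`, and the choice to use `burst` as the bucket capacity (with a floor of 1) are assumptions made for illustration.

```python
import time
from typing import Callable


class TokenBucket:
    """One bucket per client IP; tokens refill continuously at `rate` per second."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = float(max(burst, 1))   # assumption: capacity is at least 1
        self.tokens = self.capacity            # start full
        self.clock = clock
        self.last = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0                 # spend one token for this request
            return True
        return False                           # the caller would answer 429
```

A per-IP limiter would keep a dictionary of such buckets keyed by client IP and periodically drop buckets that have not been touched, which is the role `cleanup_interval_sec` plays in the config.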

Request Limits #

Field	Type	Default	Description
`max_body_bytes`	int	`10485760` (10 MB)	Maximum request body size. Returns 413 if exceeded.
`max_header_bytes`	int	`0` (unlimited)	Maximum total size of all request headers in bytes.
`max_headers`	int	`0` (unlimited)	Maximum number of request headers.
`max_url_length`	int	`0` (unlimited)	Maximum URL length in bytes.
`regex_timeout_ms`	int	`5000`	Per-request deadline for body regex scanning. Returns 408 on timeout.

Allowed HTTP Methods #

Global method whitelist. Requests using any method not in the list are blocked.

{
  "allowed_methods": ["GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"]
}

Logging #

Field	Type	Default	Description
`log_format`	string	`"text"`	`"text"` for human-readable, `"json"` for structured JSON.
`log_file`	string	`""`	Path to a log file. Logs go to both stderr and this file. Empty = stderr only.
`log_max_size_mb`	int	`0`	Max log file size in MB before rotation. `0` disables rotation.
`log_max_backups`	int	`0`	Number of rotated files to keep. `0` keeps all.

When log_max_size_mb is set, the proxy rotates the log file when it reaches the size. Rotated files are numbered sequentially (constable.log.1, .2, …) with .1 the most recent. Rotation is handled internally — no reload or restart required.

Request ID #

A unique ID is generated for every request (or propagated from an incoming header) and attached to log lines, upstream requests, and responses for end-to-end tracing.

Field	Type	Default	Description
`request_id_header`	string	`"X-Request-ID"`	Header name to read from incoming requests and write to upstream requests and responses.

Rules Overview #

Rules are regex patterns applied to specific parts of the request or response. Each rule has:

Field	Type	Required	Description
`label`	string	yes	Human-readable name shown in logs.
`pattern`	string	yes	RE2 regular expression (max 4,096 chars).
`mode`	string	no	`"block"` (default) rejects with the configured status. `"log"` emits a `[DETECT]` line but forwards. `"null"` silently drops the TCP connection (iptables DROP behavior).
`exceptions`	[]string	no	If any of these literal strings are present in the input, the rule is skipped entirely.
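
How the four fields interact can be sketched in Python. This is an illustration only: `evaluate_rule` is a hypothetical name, and Python's `re` module is used in place of RE2, so the two engines may differ on some patterns.

```python
import re
from typing import Optional


def evaluate_rule(rule: dict, text: str) -> Optional[str]:
    """Return the action a rule would take on `text`, or None if it does not fire."""
    if any(exc in text for exc in rule.get("exceptions", [])):
        return None                       # a literal exception is present: skip the rule
    if re.search(rule["pattern"], text) is None:
        return None                       # pattern did not match
    return rule.get("mode", "block")      # "block" is the documented default
```

For example, a rule with `"pattern": "(?i)/bad"` and `"exceptions": ["badtest"]` returns `"block"` for `/bad/thing` and `None` for `/badtest`.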

Decode-aware matching

url_rules are evaluated against both the raw (percent-encoded) URL and a fully-decoded view, so a payload can't slip past by percent-encoding alphanumeric characters. Before any rule runs, the proxy rejects (400) requests with encoded path separators at any decoding depth (%2f, %252f, %5c, backslash) or a .. dot-segment, so the form the WAF inspects always matches the path the upstream resolves.
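
The pre-rule rejection check can be approximated in Python as below. This is a sketch under assumptions, not the proxy's code: `has_unsafe_path` and its `max_depth` cutoff are hypothetical, and the input is assumed to be the path without the query string.

```python
from urllib.parse import unquote


def has_unsafe_path(raw_path: str, max_depth: int = 5) -> bool:
    """True if the path hides a separator behind percent-encoding, contains a
    backslash, or has a '..' segment at any decoding depth."""
    current = raw_path
    for _ in range(max_depth):
        lowered = current.lower()
        if "\\" in current or "%2f" in lowered or "%5c" in lowered:
            return True               # encoded or literal separator
        if ".." in current.split("/"):
            return True               # dot-segment
        decoded = unquote(current)
        if decoded == current:
            return False              # fully decoded and nothing unsafe found
        current = decoded             # peel one more layer of encoding
    return True                       # still changing after max_depth rounds: refuse
```

Decoding one layer at a time is what catches double encoding: `/a/%252fb` looks harmless in the first round and becomes `/a/%2fb` in the second.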

`url_rules` #

Applied to the full request URL including query string.

{
  "url_rules": [
    { "label": "block .env files",
      "pattern": "\\.env($|\\?)" },

    { "label": "SQL injection in query string",
      "pattern": "(?i)(select|insert|update|delete|drop|union).+(from|into|where|table)" },

    { "label": "path traversal",
      "pattern": "(?i)(\\.\\.[\\\\/]|%2e%2e(%2f|%5c))" },

    { "label": "sensitive file access",
      "pattern": "(?i)\\.(htaccess|htpasswd|git|svn|bak|old|swp)($|[\\?/])" },

    { "label": "debug and status endpoints",
      "pattern": "(?i)/(phpinfo|server-status|server-info|elmah\\.axd|actuator|metrics)($|[\\?/])" },

    { "label": "SSRF private IP in query param",
      "pattern": "(?i)[?&][^=]+=https?://(127\\.|10\\.|192\\.168\\.|localhost)" },

    { "label": "log access to /admin (detect only)",
      "pattern": "(?i)^/admin", "mode": "log" },

    { "label": "block /bad but allow known safe route",
      "pattern": "(?i)/bad", "exceptions": ["badtest"] },

    { "label": "silently drop /honeypot",
      "pattern": "(?i)^/honeypot", "mode": "null" }
  ]
}

`header_rules` #

Applied to the raw Key: Value string for each header.

{
  "header_rules": [
    { "label": "block scanner user-agents",
      "pattern": "(?i)User-Agent:.*(sqlmap|nikto|nmap|masscan|gobuster|ffuf|wfuzz|nuclei)" },

    { "label": "XSS via Referer or Origin",
      "pattern": "(?i)(Referer|Origin|X-Forwarded-Host):.*(…" }
  ]
}

`body_rules` #

Applied to the request body. Skipped for GET requests unless inspect_get_body is true.

{
  "body_rules": [
    { "label": "SQL injection",
      "pattern": "(?i)(select\\s.+from\\s|insert\\s+into\\s|drop\\s+table\\s|union\\s+select)" },

    { "label": "XSS script tags",
      "pattern": "(?i)<script[^>]*>" },

    { "label": "command injection",
      "pattern": "(?i)(;|\\||&&|`|\\$\\()\\s*(ls|cat|wget|curl|bash|sh|nc|whoami|id|uname)" },

    { "label": "template injection (SSTI)",
      "pattern": "(\\$\\{|\\{\\{|<%|%>|#\\{).*?(exec|import|system|eval|Runtime|getClass)" },

    { "label": "log4shell JNDI",
      "pattern": "(?i)\\$\\{jndi:(ldap|rmi|dns|iiop|corba|nds|http)://" },

    { "label": "XXE external entity",
      "pattern": "(?i)<!ENTITY[^>]*(SYSTEM|PUBLIC)" },

    { "label": "PHP deserialization",
      "pattern": "(?i)(O:\\d+:\"|a:\\d+:\\{|s:\\d+:\")" },

    { "label": "credit card numbers",
      "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})\\b" },

    { "label": "SSN pattern",
      "pattern": "\\b[0-9]{3}-[0-9]{2}-[0-9]{4}\\b" }
  ]
}
      

      
      
`response_body_rules` #

Applied to the upstream response body. Gzip-compressed responses are transparently decompressed for inspection; the original compressed payload is forwarded unchanged.

Field	Type	Default	Description
`max_response_inspect_bytes`	int	`1048576` (1 MB)	Maximum response bytes to scan. Bytes beyond this limit are not inspected.

{
  "response_body_rules": [
    { "label": "block SSN in response",
      "pattern": "\\b[0-9]{3}-[0-9]{2}-[0-9]{4}\\b" },

    { "label": "block credit card in response",
      "pattern": "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})\\b" },

    { "label": "log WordPress failed login",
      "pattern": "(?i)(The password you entered for the username|Unknown username|Invalid username)",
      "mode": "log" }
  ],
  "max_response_inspect_bytes": 1048576
}
      

      
      
Per-Path Rules #

Override or extend global rules for requests whose path matches a regex. Per-path rules are evaluated after global rules.

Field	Type	Description
`label`	string	Human-readable name.
`path_pattern`	string	RE2 regex matched against the request path (not query string).
`allowed_methods`	[]string	Method whitelist for this path only. Overrides global for matching requests.
`inspect_get_body`	bool	Override `inspect_get_body` for this path only.
`cache`	bool	Override global `cache.enabled` for this path. Omit to inherit.
`gzip`	bool	Override global `gzip.enabled` for this path. Omit to inherit.
`url_rules` / `header_rules` / `body_rules`	[]rule	Additional rules applied only to matching requests.

{
  "path_rules": [
    {
      "label": "API endpoint",
      "path_pattern": "^/api/",
      "allowed_methods": ["GET", "POST", "PUT", "DELETE"],
      "body_rules": [
        { "label": "UNION SELECT", "pattern": "(?i)\\bunion\\b.+\\bselect\\b" }
      ]
    },
    {
      "label": "upload endpoint",
      "path_pattern": "^/upload",
      "allowed_methods": ["POST"],
      "body_rules": [
        { "label": "PHP opening tag", "pattern": "(?i)<\\?(php|=)" },
        { "label": "webshell eval", "pattern": "(?i)eval\\s*\\(\\s*(base64_decode|gzinflate|str_rot13)" }
      ]
    },
    { "label": "static assets", "path_pattern": "^/static/", "cache": true },
    { "label": "auth endpoints (never cache)", "path_pattern": "^/(login|logout|auth)", "cache": false },
    { "label": "downloads (skip gzip)", "path_pattern": "^/downloads/", "gzip": false }
  ]
}
      

      
      
Conditional Rules #

Threshold-based control statements that track how many qualifying requests a single IP has made to a specific endpoint within a time window, then fire an action when that threshold is crossed. Use cases: brute-force login protection, credential-stuffing mitigation, failure-based rate limiting.

Field	Type	Default	Description
`label`	string	required	Unique human-readable name shown in logs.
`path_pattern`	string	required*	RE2 regex matched against the path. Optional for `rule_detect` (defaults to `".*"`).
`methods`	[]string	`[]` (any)	If non-empty, only these HTTP methods are counted.
`trigger_on`	string	`"failure"`	`"failure"` = any 4xx/5xx; `"status_codes"` = codes in `status_codes`; `"rule_detect"` = a rule in `trigger_by_labels` fires `[DETECT]`; `"adaptive_score"` = corroborated adaptive verdict.
`status_codes`	[]int	`[]`	Specific codes to count (required when `trigger_on` is `"status_codes"`).
`trigger_by_labels`	[]string	`[]`	Rule labels to watch for `[DETECT]` events (required for `"rule_detect"`).
`threshold`	int	required	Number of qualifying events before the action fires.
`window_sec`	int	required	Sliding window length in seconds.
`action`	string	`"block"`	`"block"` blocks the IP from `path_pattern` endpoints; `"block_global"` blocks proxy-wide.
`block_duration_min`	int	`0`	How long to block, in minutes. `0` = until the proxy restarts.
`log_only`	bool	`false`	Log without blocking. Use to tune thresholds before enforcing.

Brute-force protection on a login endpoint

{
  "conditional_rules": [
    {
      "label": "login brute-force",
      "path_pattern": "^/login$",
      "methods": ["POST"],
      "trigger_on": "failure",
      "threshold": 3,
      "window_sec": 1800,
      "action": "block",
      "block_duration_min": 60
    }
  ]
}
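
The threshold-in-a-window mechanism behind a rule like the one above can be sketched in Python. `ThresholdTracker` is a hypothetical name, and firing on the event that reaches `threshold` (rather than the one after it) is an assumption; the source only says the action fires once the threshold is crossed.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class ThresholdTracker:
    """Per-IP sliding-window counter for one conditional rule."""

    def __init__(self, threshold: int, window_sec: float):
        self.threshold = threshold
        self.window_sec = window_sec
        self.events: Dict[str, Deque[float]] = defaultdict(deque)

    def record(self, ip: str, now: float) -> bool:
        """Record one qualifying event; return True when the action should fire."""
        timestamps = self.events[ip]
        timestamps.append(now)
        # Drop events that have slid out of the window.
        while timestamps and timestamps[0] <= now - self.window_sec:
            timestamps.popleft()
        return len(timestamps) >= self.threshold
```

With `threshold=3` and `window_sec=1800`, three failed logins from one IP within 30 minutes fire the action, while three failures spread over a longer period do not.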

Trigger on rule detection (content-inspection)

{
  "response_body_rules": [
    { "label": "log WordPress failed login",
      "pattern": "(?i)(The password you entered for the username|Unknown username)",
      "mode": "log" }
  ],
  "conditional_rules": [
    {
      "label": "wordpress login brute-force",
      "trigger_on": "rule_detect",
      "trigger_by_labels": ["log WordPress failed login"],
      "threshold": 3,
      "window_sec": 1800,
      "action": "block",
      "block_duration_min": 60
    }
  ]
}
        
On config reload

Active blocks are cleared and the new rules take effect immediately, letting you adjust thresholds and unblock IPs by editing the config without restarting.
        
      

      
      
CVE Detection #

Inline, named, explainable detection of known-CVE exploitation. Where generic rules say "this looks like an attack," CVE detection says "this is the exploitation path for CVE-2021-44228 (Log4Shell), CVSS 10.0" and, with stack awareness, "…you run Contact Form 7 5.8.1, which is < 5.9, so you are EXPOSED."

Rules merge from three sources at load time:

A built-in baseline catalog (Log4Shell, Spring4Shell, Struts2 OGNL, Shellshock, Confluence OGNL, plus .env/.git harvesting and WordPress user-enumeration recon). It works offline.
A feed of CVE rules. With feed_url empty (default), a bundled cve-feed.json (~400 signatures from CISA KEV ∩ nuclei-templates) loads with no outbound request. Set feed_url to poll a live feed.
Operator custom_rules.

Matching is two-stage: a case-folding Aho-Corasick literal prefilter answers "could any rule plausibly match?" in one pass, and only survivors run their full RE2 regex.

Field	Type	Default	Description
`enabled`	bool	`false`	Master switch.
`mode`	string	`block`	Default action when a rule omits its own mode (`block`/`log`/`null`).
`require_stack_match`	bool	`false`	Only evaluate a rule when its platform/component is present behind the upstream. Unknown stacks fail open. Ubiquitous payloads (Log4Shell etc.) are never scoped out.
`block_when`	string	`always`	Gate blocking on the exposure verdict: `always`, `exposed_or_unknown`, or `exposed`.
`only_kev`	bool	`false`	Load only rules flagged `kev: true` (CISA KEV catalog).
`feed_url`	string	—	Feed of `[]CVERule` JSON. Empty = bundled feed. `http(s)://` = remote poll. Bare path / `file://` = local file.
`feed_token`	string	—	`$ENV{}`-expandable bearer token; redacted from the AI report.
`feed_interval_sec`	int	`3600`	Feed poll interval.
`feed_allow_http`	bool	`false`	Allow a plaintext `http://` feed URL (otherwise rejected).
`disable_builtins`	[]string	—	CVE ids to drop from the built-in catalog only.
`disable`	[]string	—	CVE ids to drop from any source.
`custom_rules`	[]CVERule	—	Operator-supplied rules.

{ "cve_detection": { "enabled": true, "mode": "block" } }

A block emits [CVE_BLOCK]; a log-only hit emits [CVE_DETECT] (which also feeds conditional-rule trigger_on: rule_detect). A local-only /cve-intel endpoint returns the loaded rules, provenance, feed status, and per-host fingerprinted stacks.
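
The two-stage matching idea (cheap literal prefilter, then full regex only for survivors) can be sketched in Python. To stay short, this sketch swaps the Aho-Corasick automaton for a plain per-literal substring scan, which gives the same answers but not the single-pass performance. The rule shape (`id`, `literals`, `regex`), the function name, and the Shellshock pattern are assumptions for illustration; Python's `re` stands in for RE2.

```python
import re
from typing import List

# Hypothetical rule shape: an id, literal hints for the prefilter, and the full regex.
RULES = [
    {"id": "CVE-2021-44228",
     "literals": ["${jndi:"],
     "regex": r"(?i)\$\{jndi:(ldap|rmi|dns|iiop|corba|nds|http)://"},
    {"id": "CVE-2014-6271",
     "literals": ["() {"],
     "regex": r"\(\)\s*\{\s*:;\s*\}\s*;"},
]


def match_cve(text: str, rules: List[dict]) -> List[str]:
    """Return the ids of all rules that match `text`."""
    folded = text.casefold()
    hits = []
    for rule in rules:
        # Stage 1: case-folded literal prefilter (substring scan stands in for Aho-Corasick).
        if not any(lit.casefold() in folded for lit in rule["literals"]):
            continue
        # Stage 2: full regex, run only for rules that survived the prefilter.
        if re.search(rule["regex"], text):
            hits.append(rule["id"])
    return hits
```

Most benign requests contain none of the literals, so they never reach a regex at all; that is where the approach saves time.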
      

      
      
        Stack Awareness #
        Because the proxy knows what runs behind each upstream, it (1) scopes the CVE ruleset to only what's deployed and (2) attaches an exposure verdict to every CVE hit. It's hybrid — declaration anchors, fingerprinting fills in.
        
          Declared (upstream_stack, top-level or per domains.<host>): you state the platform and optionally components with versions. Declared versions override anything detected. Set auto_detect: false to rely on the declaration alone.
          Passively detected: as upstream responses pass through, the proxy reads Server / X-Powered-By / X-Generator headers, the <meta name="generator"> tag, and WordPress plugin asset versions — keyed per host.
        
        
          
            Field Type Description
            
              platform string e.g. wordpress, apache, php.
              components []object { name, version } declared versions (override detection).
              auto_detect bool true by default; false disables passive fingerprinting for this host.
            
          
        
        Given a rule whose stack.affected is <5.9 and a host running contact-form-7 5.8.1, a probe is logged exposed="yes"; a patched 6.0.0 logs exposed="no"; an undetected version logs exposed="unknown" (never treated as not-exposed).
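        The exposure verdict in the example above can be sketched as a plain version comparison. This is a minimal Python illustration under stated assumptions, not Constable's implementation: the names parse_version, exposure, and should_block are hypothetical, only the "<X.Y" form of stack.affected is handled, and the mapping from block_when to a decision is an assumed reading of the field table.

```python
from typing import Optional, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted numeric version such as '5.8.1' into (5, 8, 1)."""
    return tuple(int(part) for part in text.split("."))


def exposure(affected: str, detected: Optional[str]) -> str:
    """Return 'yes', 'no', or 'unknown' for a '<X.Y' affected range (sketch only)."""
    if detected is None:
        return "unknown"  # an undetected version is never treated as not-exposed
    if not affected.startswith("<"):
        raise ValueError("this sketch only handles '<version' ranges")
    bound = parse_version(affected[1:])
    return "yes" if parse_version(detected) < bound else "no"


def should_block(block_when: str, verdict: str) -> bool:
    """Assumed gating for block_when: always / exposed_or_unknown / exposed."""
    if block_when == "always":
        return True
    if block_when == "exposed_or_unknown":
        return verdict in ("yes", "unknown")
    if block_when == "exposed":
        return verdict == "yes"
    raise ValueError(f"unknown block_when value: {block_when}")
```

        Under these assumptions, block_when: "exposed_or_unknown" would stop blocking probes against the patched 6.0.0 host while still blocking hosts whose version could not be fingerprinted.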
      

      
      
        Botnet Detection #
        Three-layer detection: IP reputation blocklists, behavioral auto-blocking, and user-agent fingerprinting.
        
          
            Field	Type	Default	Description
            
              enabled	bool	false	Enable botnet detection.
              log_only	bool	false	Detect but don't block; logs [DETECT]. Use to tune before enforcing.
              Layer 1: IP reputation
              ip_blocklists	[]object	[]	Remote blocklists to fetch. Each has a url and label. Duplicates auto-deduplicated.
              refresh_interval_min	int	60	Minutes between blocklist refreshes.
              Layer 2: Behavioral
              behavioral_enabled	bool	false	Enable behavioral auto-blocking.
              error_threshold	int	20	4xx/5xx responses in the window before ban.
              scan_threshold	int	100	Unique paths accessed in the window before ban.
              window_sec	int	60	Tracking window length in seconds.
              ban_duration_min	int	30	How long a behaviorally-banned IP stays blocked.
              ignore_not_found_paths	[]string	[]	RE2 path patterns exempt from the error counter on 404 (still counts toward scan).
              Layer 3: UA fingerprinting
              ua_fingerprint_enabled	bool	false	Enable user-agent pattern matching.
              ua_patterns	[]string	built-ins	RE2 patterns vs User-Agent. When non-empty, replaces the built-in list.
              custom_ua_patterns	[]string	[]	Extra patterns always appended on top.
              block_empty_ua	bool	false	Block requests with no User-Agent header.
            
          
        
        
          Tuning workflow
          Start with "log_only": true, watch [DETECT] lines to confirm patterns aren't over-matching, then switch to "log_only": false. Use \b word boundaries in UA patterns to avoid false positives — \bmozi\b won't match Mozilla.
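          The word-boundary advice can be checked offline before a pattern goes into the config. The sketch below uses Python's re module rather than Go's RE2 engine; \b behaves the same way in both for ASCII input. Case-insensitive matching is assumed, and the "Mozi/1.0" user agent is a made-up sample.

```python
import re

# Without boundaries the pattern matches inside "Mozilla"; with \b it does not.
naive = re.compile(r"mozi", re.IGNORECASE)
bounded = re.compile(r"\bmozi\b", re.IGNORECASE)

browser_ua = "Mozilla/5.0 (X11; Linux x86_64)"
bot_ua = "Mozi/1.0"  # hypothetical bot user agent, for illustration only


def matches(pattern: "re.Pattern[str]", user_agent: str) -> bool:
    """True when the pattern occurs anywhere in the User-Agent string."""
    return pattern.search(user_agent) is not None
```

          Remember that config.json is JSON, so each backslash must be doubled there: "\\bmozi\\b".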
        
      

      
      
        GeoIP Blocking #
        Block or allow requests by country code using a CIDR-to-country CSV database.
        
          
            Field Type Default Description
            
              geoip.enabled bool false Enable GeoIP checking.
              geoip.database_path string "" Path to a CSV file with columns cidr,country_code.
              geoip.blocked_countries []string [] ISO 3166-1 alpha-2 codes to block.
              geoip.allowed_countries []string [] If non-empty, only these codes are allowed through.
            
          
        
        {
  "geoip": {
    "enabled": true,
    "database_path": "/etc/constable/geoip.csv",
    "blocked_countries": ["CN", "RU", "KP"]
  }
}
        CSV format — one CIDR per line with its country code: 1.0.0.0/24,AU
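        To show how a cidr,country_code file can drive the decision, here is a minimal Python sketch. It is an illustration rather than the proxy's code: load_db, country_of, and is_allowed are hypothetical names, the most specific CIDR is assumed to win, and an address with no match is assumed to pass a block list but fail an allow list.

```python
import csv
import io
import ipaddress
from typing import List, Optional, Set, Tuple

Row = Tuple[ipaddress._BaseNetwork, str]


def load_db(text: str) -> List[Row]:
    """Parse 'cidr,country_code' lines; rows that are not two columns are skipped."""
    rows = []
    for row in csv.reader(io.StringIO(text)):
        if len(row) != 2:
            continue
        rows.append((ipaddress.ip_network(row[0].strip()), row[1].strip().upper()))
    return rows


def country_of(db: List[Row], ip: str) -> Optional[str]:
    """Return the country of the longest matching prefix, or None if nothing matches."""
    addr = ipaddress.ip_address(ip)
    best = None
    for net, code in db:
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, code)
    return best[1] if best else None


def is_allowed(country: Optional[str], blocked: Set[str], allowed: Set[str]) -> bool:
    """Assumed precedence: a non-empty allow list decides alone, else the block list."""
    if allowed:
        return country in allowed
    return country not in blocked
```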
      

      
      
        Learn Mode #
        A WAF bootstrapper for new deployments. Enable it and let real traffic teach the proxy what query parameters your application uses. After the observation window it generates mode:"log" WAF rules targeting those parameters with SQLi, XSS, and path-traversal patterns, written to a file for review. It never blocks.
        
          
            Field Default Description
            
              enabled false Enable the traffic profiler.
              window_sec 300 Observation window in seconds.
              min_requests 100 Minimum requests before generating rules.
              min_param_count 5 Minimum times a query param must appear.
              max_rules_per_type 20 Cap on how many parameters get rules.
              source_ips [] If non-empty, only learn from these source IPs / CIDRs. Empty = learn from all.
              output_file "learned-rules.json" File to write generated rules.
            
          
        
        {
  "listen_addr": ":8000",
  "target_url": "http://localhost:8080",
  "learn_mode": { "enabled": true, "source_ips": ["10.0.0.0/8", "192.168.0.0/16"] }
}
        
          Best practice — learn only from trusted traffic
          Learn Mode shapes rules from whatever it observes, so malicious traffic during the window can skew the result. If you can't guarantee the learning traffic is clean, pin learning to your trusted source ranges with source_ips.
        
      

      
      
        Adaptive Learning #
        Where Learn Mode is a one-shot rule generator, adaptive_learning is a continuous anomaly-scoring engine that keeps adapting to live traffic. It combines three signals into a per-request risk score: past attack styles, a learned profile of the application, and 4xx error indicators.
        Shadow-first, never blocks by default
        
          
            Mode Behavior
            
              off Engine not built (zero cost). Default.
              observe Learns and persists the model only. No scoring, no logs, never blocks.
              shadow Scores every request, emits [ADAPTIVE] + the proxy_adaptive_would_block_total metric, but never blocks. Run this while evaluating.
              enforce May block — but only by feeding a corroborated verdict to a conditional_rules entry.
            
          
        
        Even in enforce, a block requires all of: score >= enforce_threshold, at least min_corroborating_signals independent signal classes agreeing, and a mature path profile.
        "conditional_rules": [
  { "label": "adaptive-enforce", "trigger_on": "adaptive_score",
    "threshold": 3, "window_sec": 300, "action": "block", "block_duration_min": 30 }
],
"adaptive_learning": { "mode": "enforce", "warmup_sec": 86400 }
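        The enforce conditions listed above amount to a single predicate. The following Python sketch spells it out using the documented defaults; the function names are hypothetical, and treating maturity as "warm-up elapsed and enough clean observations" is an assumption drawn from the warmup_sec and min_path_observations descriptions.

```python
def is_mature(profile_age_sec: float, clean_observations: int,
              warmup_sec: int = 86400, min_path_observations: int = 200) -> bool:
    """Assumed maturity rule: the warm-up has elapsed and enough clean requests were seen."""
    return profile_age_sec >= warmup_sec and clean_observations >= min_path_observations


def may_block(score: float, agreeing_signal_classes: int, path_mature: bool,
              enforce_threshold: float = 0.85,
              min_corroborating_signals: int = 2) -> bool:
    """All three conditions must hold before an enforce-mode verdict can block."""
    return (score >= enforce_threshold
            and agreeing_signal_classes >= min_corroborating_signals
            and path_mature)
```

        A high score backed by only one signal class, or scored against an immature path profile, stays a would-block event rather than a block.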
        
          
            Field Default Meaning
            
              mode off off / observe / shadow / enforce.
              model_file learned-model.json Persisted model path (stats only — no raw values).
              warmup_sec 86400 Time before any path profile can be considered mature.
              min_path_observations 200 Clean requests before a path profile matures.
              min_param_observations 20 Observations before a param's value stats are trusted.
              min_confidence 0.8 Minimum attribute confidence to score a deviation.
              enforce_threshold 0.85 Combined score needed to (corroborated) block in enforce.
              min_corroborating_signals 2 Independent signal classes that must agree to block.
              decay_half_life_sec 604800 Decay half-life for profiles + attack labels (7 days).
              snapshot_interval_sec 300 Model flush + decay sweep interval.
              max_path_profiles 5000 Cap on distinct path templates (LRU).
              exempt_paths [] Regexes; matching paths are never scored or learned.
            
          
        
        Inspect the model via local-only endpoints on the metrics port: GET /anomaly-model (full dump) and GET /anomaly-model/explain?path=/api/items (one path). The model persists stats-only with decay applied for elapsed time, and survives config reloads.
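        As a rough guide to what decay_half_life_sec implies, the sketch below applies standard exponential half-life decay to a stored statistic. The exact formula Constable uses is not documented here, so treat this as an assumption; decayed is a hypothetical helper.

```python
def decayed(value: float, elapsed_sec: float, half_life_sec: float = 604800) -> float:
    """Halve the value for every half_life_sec of elapsed time (exponential decay)."""
    return value * 0.5 ** (elapsed_sec / half_life_sec)
```

        With the default of 604800 seconds, an observation weight of 100 would count as 50 after one week and 25 after two.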
      

      
      
        Upstreams & Load Balancing #
        When upstreams is set it takes precedence over target_url. Unhealthy nodes are skipped automatically when health checks are enabled.
        
          
            Field Type Default Description
            
              upstreams []object [] List of upstream targets. Each has a url and optional weight.
              load_balance string "round-robin" "round-robin", "least-conn", or "random".
            
          
        
        {
  "upstreams": [
    { "url": "http://10.0.0.1:8080", "weight": 2 },
    { "url": "http://10.0.0.2:8080", "weight": 1 }
  ],
  "load_balance": "least-conn"
}
        Upstream transport tuning
        
          
            Field Default Description
            
              upstream_dial_timeout_ms 5000 TCP dial timeout for new upstream connections.
              upstream_keepalive_sec 30 TCP keep-alive interval on dialer.
              upstream_response_header_timeout_ms 30000 Max wait for the upstream's response headers.
              upstream_max_idle_conns 1024 Total idle connections kept across all upstreams.
              upstream_max_idle_conns_per_host 256 Idle connections kept per host.
              upstream_idle_conn_timeout_sec 90 How long an idle pooled connection stays before closing.
              upstream_total_request_timeout_sec 0 When > 0, caps the entire upstream forward (headers + body). 0 = no end-to-end limit.
            
          
        
        For HTTPS upstreams, TLS sessions are automatically resumed across connections (TLS 1.2 tickets, TLS 1.3 PSK) — no configuration needed.
      

      
      
        Health Checks #
        Periodically probe upstreams and remove unhealthy nodes from rotation.
        
          
            Field Type Default Description
            
              health_check.enabled bool false Enable background health probing.
              health_check.interval_sec int 10 Seconds between probes.
              health_check.timeout_sec int 5 HTTP timeout per probe in seconds.
              health_check.path string "/" Path probed on each upstream.
              health_check.unhealthy_threshold int 3 Consecutive failures before marking a node unhealthy.
              health_check.endpoint_path string "/healthz" Path the proxy itself answers for liveness (returns 200 OK).
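        The unhealthy_threshold behavior can be pictured as a per-upstream counter of consecutive failures. This is a minimal Python sketch, not the proxy's code: HealthTracker is a hypothetical class, and it assumes that a single successful probe returns a node to rotation.

```python
from typing import Dict, List


class HealthTracker:
    """Counts consecutive probe failures per upstream URL."""

    def __init__(self, unhealthy_threshold: int = 3) -> None:
        self.unhealthy_threshold = unhealthy_threshold
        self.failures: Dict[str, int] = {}

    def record(self, upstream: str, ok: bool) -> None:
        """A success resets the streak (assumption); a failure extends it."""
        self.failures[upstream] = 0 if ok else self.failures.get(upstream, 0) + 1

    def healthy(self, upstream: str) -> bool:
        return self.failures.get(upstream, 0) < self.unhealthy_threshold

    def in_rotation(self, upstreams: List[str]) -> List[str]:
        """Upstreams the load balancer may still pick from."""
        return [u for u in upstreams if self.healthy(u)]
```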
            
          
        
      

      
      
        Response Caching #
        In-memory response cache. Serves repeated GET/HEAD requests from cache without hitting the upstream. Responses are cached only when the request carries no credentials and the response is 2xx with no Cache-Control: no-store/private or Set-Cookie.
        
          
            Field	Type	Default	Description
            
              cache.enabled	bool	false	Enable response caching.
              cache.ttl_sec	int	60	Time-to-live in seconds for each cached entry.
              cache.max_entries	int	0 → capped at 100000	Maximum cached entries. 0 is treated as a bounded 100,000 cap (with a warning) so an attacker can't OOM the cache.
              cache.max_body_bytes	int	1048576 (1 MB)	Max response body size to cache. Larger bodies are forwarded normally and never cached.
              cache.x_cache_header	bool	false	Add X-Cache: HIT/MISS to responses.
              cache.skip_cookies	[]string	[]	Cookie-name prefixes that disqualify a request from the cache (prefix match).
            
          
        
        
          Per-user cookies
          For sites that issue per-user cookies, configure skip_cookies — otherwise personalized responses can be served to other clients. e.g. ["wordpress_logged_in_", "wp-postpass_", "PHPSESSID"].
        
        Cache log events: CACHE_HIT (served from cache) and CACHE_STORE (new response stored). Config reloads that change cache settings discard the old cache and start fresh.
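        The caching conditions described in this section can be summarized as one predicate. The Python sketch below is illustrative only: is_cacheable is a hypothetical function, header dictionaries are assumed to use lowercase names, and "carries no credentials" is interpreted as "no Authorization header and no cookie matching skip_cookies".

```python
from typing import Dict, Iterable, List


def cookie_names(cookie_header: str) -> List[str]:
    """Extract cookie names from a request Cookie header."""
    names = []
    for part in cookie_header.split(";"):
        name = part.split("=", 1)[0].strip()
        if name:
            names.append(name)
    return names


def is_cacheable(method: str, req_headers: Dict[str, str], status: int,
                 resp_headers: Dict[str, str], body_len: int,
                 skip_cookies: Iterable[str] = (),
                 max_body_bytes: int = 1048576) -> bool:
    if method not in ("GET", "HEAD"):
        return False
    if "authorization" in req_headers:
        return False
    prefixes = tuple(skip_cookies)
    for name in cookie_names(req_headers.get("cookie", "")):
        if prefixes and name.startswith(prefixes):
            return False  # per-user cookie present, bypass the cache
    if not 200 <= status < 300:
        return False
    directives = {d.strip().split("=")[0].lower()
                  for d in resp_headers.get("cache-control", "").split(",")}
    if "no-store" in directives or "private" in directives:
        return False
    if "set-cookie" in resp_headers:
        return False
    return body_len <= max_body_bytes
```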
      

      
      
        Gzip Compression #
        Compresses eligible upstream responses before sending them to clients that advertise Accept-Encoding: gzip. Works alongside the response cache (the cache stores uncompressed bytes and compresses on-the-fly).
        
          
            Field	Type	Default	Description
            
              gzip.enabled	bool	false	Enable gzip compression.
              gzip.level	int	0	1 = fastest, 9 = best, 0 = Go default (level 6).
              gzip.exclude_types	[]string	[]	Content-Type substrings to skip.
              gzip.exclude_extensions	[]string	[]	Request path extensions to skip (e.g. ".jpg", ".mp4").
            
          
        
        
          Tip
          Images, video, audio, and already-compressed formats gain little from gzip and may even grow slightly. Add them to exclude_types / exclude_extensions to save CPU. If the upstream already set Content-Encoding, the proxy passes it through unchanged.
        
      

      
      
        Security Response Headers #
        Key-value pairs injected into every proxied response.
        {
  "security_headers": {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "X-XSS-Protection": "1; mode=block",
    "Referrer-Policy": "strict-origin-when-cross-origin",
    "Strict-Transport-Security": "max-age=31536000; includeSubDomains"
  }
}
      

      
      
        Remove Response Headers #
        Strip specific headers from the upstream response before forwarding to the client.
        {
  "remove_response_headers": ["Server", "X-Powered-By", "X-Generator"]
}
      

      
      
        Response Rewrites #
        String replacements applied to the response body and Location headers. Useful when an upstream returns hardcoded internal URLs or IP addresses.
        {
  "response_rewrites": [
    { "find": "http://10.0.0.5",  "replace": "https://example.com" },
    { "find": "https://10.0.0.5", "replace": "https://example.com" }
  ]
}
      

      
      
        TLS Termination #
        Manual TLS using your own certificate and key files.
        
          
            Field Type Default Description
            
              tls.listen_addr string ":443" Address for the TLS listener.
              tls.cert_file string "" Path to PEM certificate file.
              tls.key_file string "" Path to PEM private key file.
              tls.min_version string "1.2" Minimum TLS version: "1.0"–"1.3".
              tls.cipher_suites []string Go defaults Preferred cipher suites (TLS 1.0–1.2 only; TLS 1.3 fixed by Go).
              tls.client_auth string "none" mTLS: none / request / require. See Hardening.
              tls.client_ca_file string "" CA to verify client certificates against (mTLS).
            
          
        
      

      
      
        Let's Encrypt (Automatic TLS) #
        Automatically obtain and renew certificates. Certs are cached to disk and renewed within 30 days of expiry.
        
          
            Field Type Default Description
            
              lets_encrypt.enabled bool false Enable automatic certificate management.
              lets_encrypt.domains []string [] Domains to obtain a cert for. Every key in the domains map is also unioned into the SAN list automatically.
              lets_encrypt.email string "" Contact email registered with Let's Encrypt.
              lets_encrypt.cache_dir string "./certs" Directory to store the account key and certificate.
              lets_encrypt.staging bool false Use the staging environment for testing.
              lets_encrypt.challenge string "http" "http" (HTTP-01) or "dns-cloudflare" (DNS-01).
              lets_encrypt.cloudflare.api_token string "" Cloudflare token with Zone:DNS:Edit (DNS-01 only).
              lets_encrypt.cloudflare.zone_id string "" Cloudflare Zone ID (DNS-01 only).
            
          
        
        {
  "lets_encrypt": {
    "enabled": true,
    "domains": ["example.com", "*.example.com"],
    "email": "admin@example.com",
    "cache_dir": "/var/lib/constable/acme",
    "challenge": "dns-cloudflare",
    "cloudflare": { "api_token": "$ENV{CF_TOKEN}", "zone_id": "YOUR_ZONE_ID" }
  }
}
        
          Tip
          Set "staging": true first to verify the setup without hitting rate limits, then switch to false — the proxy re-issues a trusted production certificate automatically within seconds. DNS-01 works behind firewalls and supports wildcard certificates.
        
      

      
      
        HTTP & WWW Redirects #
        Issue 301 redirects from HTTP to HTTPS, and from the bare domain to www.
        
          
            Field Type Default Description
            
              https_redirect bool false Enable HTTP → HTTPS redirects.
              https_redirect_addr string ":80" Address for the redirect listener. When equal to listen_addr, redirects are handled inline on the main listener.
              www_redirect bool false Redirect example.com → www.example.com.
              allowed_redirect_hosts []string [] Explicit Host values that may appear in a redirect Location (open-redirect protection). Matched case-insensitively, port-stripped.
              allow_any_redirect_host bool false Legacy behavior of accepting any Host. Only set if an upstream layer already validates Host.
            
          
        
        
          Open-redirect protection
          Both redirects build the Location header from the request's Host. The allow-list resolves in order: allowed_redirect_hosts → lets_encrypt.domains → top-level domains map keys → otherwise deny with 400. Without an allow-list an attacker could send Host: evil.com and turn the proxy into an open redirect.
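          The resolution order above can be sketched as follows. This Python illustration makes two assumptions that the text does not confirm: the first non-empty source is used on its own (the sources are not merged), and wildcard entries such as "*.example.com" are not expanded. The function names are hypothetical.

```python
from typing import Iterable, Sequence


def strip_port(host: str) -> str:
    """Lowercase the Host value and drop any :port suffix."""
    host = host.strip().lower()
    if host.startswith("["):  # bracketed IPv6 literal such as [::1]:8443
        end = host.find("]")
        return host[: end + 1] if end != -1 else host
    return host.rsplit(":", 1)[0] if ":" in host else host


def redirect_allowed(host: str,
                     allowed_redirect_hosts: Sequence[str],
                     lets_encrypt_domains: Sequence[str],
                     domain_map_keys: Iterable[str]) -> bool:
    """Return False when the caller should answer 400 instead of redirecting."""
    for source in (allowed_redirect_hosts, lets_encrypt_domains, list(domain_map_keys)):
        if source:
            return strip_port(host) in {entry.lower() for entry in source}
    return False  # no allow-list could be resolved, so deny
```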
        
      

      
      
        Authentication #
        Three auth methods are available; only one needs to be enabled at a time. exempt_paths lists path prefixes that bypass authentication entirely.

        Basic Auth #
        Passwords are stored as sha256:<hex> hashes (unsalted) and compared in constant time.
        # Generate a password hash in the sha256:<hex> form the config expects
printf '%s' "mypassword" | sha256sum | awk '{print "sha256:" $1}'
        {
  "auth": {
    "basic": {
      "enabled": true,
      "realm": "My Proxy",
      "users": { "alice": "sha256:89e0...f45e" },
      "exempt_paths": ["/healthz", "/public"]
    }
  }
}
        
          Threat model — unsalted SHA-256
          The proxy is stdlib-only by design; bcrypt/argon2/scrypt live outside stdlib. Constant-time compare blocks online timing attacks, but a leaked config can be cracked offline at GPU speed against common passwords. Use unique, 16+-character random passwords. For high-value credentials, terminate auth at an upstream IdP (OIDC/SSO) and use API keys or JWT through this proxy.
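          For reference, the stored format and the constant-time comparison can be reproduced with a few lines of Python. This is a sketch of the scheme as described, not the proxy's Go code; hash_password and verify are hypothetical names.

```python
import hashlib
import hmac
from typing import Dict


def hash_password(password: str) -> str:
    """Produce the 'sha256:<hex>' value stored under auth.basic.users."""
    return "sha256:" + hashlib.sha256(password.encode("utf-8")).hexdigest()


def verify(users: Dict[str, str], username: str, password: str) -> bool:
    """Compare the candidate hash with the stored one in constant time."""
    stored = users.get(username)
    if stored is None:
        return False
    return hmac.compare_digest(stored.encode("ascii"),
                               hash_password(password).encode("ascii"))
```

          Because the hash is unsalted, two users with the same password end up with identical stored values, which is one more reason to use long random passwords.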
        

        API Key Auth #
        {
  "auth": {
    "api_key": {
      "enabled": true,
      "header": "X-API-Key",
      "keys": ["key-one-abc123", "key-two-def456"],
      "exempt_paths": ["/healthz"]
    }
  }
}

        JWT Auth #
        Supports HMAC algorithms HS256, HS384, HS512. The secret supports $ENV{VAR} expansion.
        {
  "auth": {
    "jwt": {
      "enabled": true,
      "secret": "$ENV{JWT_SECRET}",
      "algorithm": "HS256",
      "header": "Authorization",
      "exempt_paths": ["/healthz", "/login"]
    }
  }
}
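        To make the HMAC check concrete, here is a stdlib-only Python sketch of signing and verifying an HS256/HS384/HS512 token. It illustrates the JWT format itself and is not Constable's implementation: the function names are hypothetical, the expiry check is an assumption about which claims are validated, and stripping a "Bearer " prefix from the header value is left out.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

ALGORITHMS = {"HS256": hashlib.sha256, "HS384": hashlib.sha384, "HS512": hashlib.sha512}


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign(claims: dict, secret: bytes, algorithm: str = "HS256") -> str:
    """Build a compact JWT: base64url(header).base64url(payload).base64url(mac)."""
    parts = [{"alg": algorithm, "typ": "JWT"}, claims]
    signing_input = ".".join(
        b64url_encode(json.dumps(p, separators=(",", ":")).encode()) for p in parts)
    mac = hmac.new(secret, signing_input.encode("ascii"), ALGORITHMS[algorithm]).digest()
    return signing_input + "." + b64url_encode(mac)


def verify(token: str, secret: bytes, algorithm: str = "HS256",
           now: Optional[float] = None) -> Optional[dict]:
    """Return the claims when the signature is valid, else None."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        # Pin the algorithm to the configured one; never trust the token's choice.
        if not isinstance(header, dict) or header.get("alg") != algorithm:
            return None
        expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                            ALGORITHMS[algorithm]).digest()
        if not hmac.compare_digest(expected, b64url_decode(sig_b64)):
            return None
        claims = json.loads(b64url_decode(payload_b64))
    except ValueError:
        return None
    if not isinstance(claims, dict):
        return None
    exp = claims.get("exp")
    current = time.time() if now is None else now
    if isinstance(exp, (int, float)) and current >= exp:
        return None  # assumed expiry check
    return claims
```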
      

      
      
        Prometheus Metrics #
        Exposes a Prometheus-compatible /metrics endpoint with request counts, latency histograms, and block/allow totals — including CVE and adaptive-learning series. The same listener serves the local-only /cve-intel and /anomaly-model endpoints.
        
          
            Field	Type	Default	Description
            
              metrics_addr	string	"" (disabled)	Address for the metrics listener, e.g. "127.0.0.1:9090".
              metrics_local_only	bool	false	Restrict metrics (and /cve-intel / /anomaly-model) to loopback connections only.
            
          
        
        { "metrics_addr": "127.0.0.1:9090", "metrics_local_only": true }
      

      
      
        Peer-to-Peer Block Sync #
        When running multiple proxy instances, each node automatically propagates dynamic blocks (botnet behavioral bans and conditional-rule blocks) to its peers within seconds. Nodes also run a periodic full-reconciliation loop to catch missed events.
        Security model
        
          All messages are HMAC-SHA256 signed with a shared secret.
          Timestamp replay protection rejects messages more than 30 seconds old.
          Nonce deduplication prevents the same message from being applied twice.
          TLS transport — the listener requires a certificate (falls back to the main proxy cert).
          Optional mutual TLS for peer identity verification.
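          The first three protections in the list above combine naturally into one receive path. The Python sketch below shows the idea under stated assumptions: the message layout (a JSON body with payload, nonce, and ts fields plus a hex MAC) is invented for illustration and is not Constable's wire format, and sign_message and Receiver are hypothetical names.

```python
import hashlib
import hmac
import json
from typing import Dict

MAX_AGE_SEC = 30  # messages older than this are rejected


def sign_message(key: bytes, payload: dict, nonce: str, now: float) -> dict:
    """Serialize the message and attach an HMAC-SHA256 over the exact bytes sent."""
    body = json.dumps({"payload": payload, "nonce": nonce, "ts": now},
                      sort_keys=True).encode("utf-8")
    return {"body": body, "mac": hmac.new(key, body, hashlib.sha256).hexdigest()}


class Receiver:
    def __init__(self, key: bytes) -> None:
        self.key = key
        self.seen: Dict[str, float] = {}  # nonce -> message timestamp

    def accept(self, msg: dict, now: float) -> bool:
        expected = hmac.new(self.key, msg["body"], hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, msg["mac"]):
            return False  # wrong key or tampered body
        fields = json.loads(msg["body"])
        if now - fields["ts"] > MAX_AGE_SEC:
            return False  # too old: replay protection by timestamp
        # Nonces older than the window can be forgotten; such messages fail the age check.
        self.seen = {n: ts for n, ts in self.seen.items() if now - ts <= MAX_AGE_SEC}
        if fields["nonce"] in self.seen:
            return False  # already applied once
        self.seen[fields["nonce"]] = fields["ts"]
        return True
```

          Verifying the MAC before parsing the body keeps unauthenticated input away from the JSON decoder.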
        
        
          
            Field Type Default Description
            
              peers.enabled bool false Enable peer sync.
              peers.listen_addr string "" Address for this node's peer-sync listener. Required when enabled.
              peers.shared_key string "" HMAC-SHA256 signing key shared by all peers. Supports $ENV{VAR}. Required — an empty key is rejected.
              peers.peers []object [] List of peer nodes — each has an address and optional per-peer shared_key override.
              peers.sync_interval_sec int 60 How often each node pulls full state from peers to reconcile.
              peers.cert_file / key_file string "" TLS cert/key for the listener (falls back to tls.*).
              peers.peer_ca_cert string "" PEM CA to verify peer TLS certs (for self-signed certs).
              peers.mutual_tls bool false Present this node's cert as a client cert when connecting to peers.
              peers.allow_plaintext bool false Opt into plaintext HTTP when no TLS cert is set (trusted private networks only).
            
          
        
        
          Peer-sourced bans are bounded
          Entries received from peers are validated, capped per shard (a flooding peer can't exhaust memory), and given a bounded expiry (max 24h — never permanent), so a single compromised peer cannot pin a permanent cluster-wide ban.
        
      

      
      
        Production Hardening #
        Advanced, production-oriented controls. All are off by default, hot-reloadable, and pure stdlib.

        Client-certificate (mTLS) auth
        tls.client_auth: "require" rejects the handshake unless the client presents a cert chaining to client_ca_file; the verified subject is forwarded upstream as X-Client-Cert-Subject.

        Connection-level DDoS controls
        rate_limit.max_concurrent_per_ip caps simultaneous in-flight requests per IP. Listener timeouts are tunable for slow-POST/slowloris defense: read_header_timeout_ms (10000), read_timeout_ms (30000), write_timeout_ms (60000), idle_timeout_ms (120000).

        Structured body inspection
        inspect_structured_body: true decodes JSON / urlencoded-form / multipart bodies into individual field values and runs body_rules against each. max_json_depth (default 64) rejects pathologically nested JSON.

        Forward-auth / external authorization
        "forward_auth": { "enabled": true, "url": "http://authz:9000/auth",
  "copy_request_headers": ["Cookie","Authorization"], "copy_response_headers": ["X-Auth-User"] }
        A subrequest carries the original method/URI/host; 2xx allows, 401/403 denies, any other status or transport error fails closed.

        Runtime management API
        "management": { "enabled": true, "token": "$ENV{ASP_ADMIN_TOKEN}" }
        Bearer-token + local-only admin endpoints on the metrics listener: POST /admin/ban?ip=<ip>&minutes=<n>, POST /admin/unban?ip=<ip>, GET /admin/bans, POST /admin/purge-cache. Bans persist across config reloads.

        Outlier ejection / circuit breaking
        outlier_detection ejects an upstream after consecutive_5xx (default 5) 5xx responses for ejection_sec (default 30) — catching backends that pass active probes but fail real requests. max_ejection_percent (default 50) caps simultaneous ejections; the last upstream is never ejected.

        Upstream retries
        upstream_retry retries idempotent requests on transient transport errors (never on a returned status) up to max_retries with a backoff_ms delay.

        Adaptive concurrency / load shedding
        adaptive_concurrency sheds requests (503) when in-flight exceeds a latency-steered limit — AIMD, bounded by [min_limit, max_limit].
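        AIMD (additive increase, multiplicative decrease) can be sketched in a few lines. This Python illustration is a generic AIMD limiter, not Constable's controller: target_latency_ms and backoff are hypothetical parameters, since the text only names min_limit and max_limit.

```python
class AIMDLimit:
    """Concurrency limit that grows by 1 on fast responses and shrinks by a factor on slow ones."""

    def __init__(self, min_limit: int, max_limit: int,
                 target_latency_ms: float, backoff: float = 0.5) -> None:
        self.min_limit = min_limit
        self.max_limit = max_limit
        self.target_latency_ms = target_latency_ms  # hypothetical latency target
        self.backoff = backoff                      # hypothetical decrease factor
        self.limit = float(min_limit)

    def observe(self, latency_ms: float) -> None:
        if latency_ms <= self.target_latency_ms:
            self.limit = min(self.max_limit, self.limit + 1)             # additive increase
        else:
            self.limit = max(self.min_limit, self.limit * self.backoff)  # multiplicative decrease

    def admit(self, in_flight: int) -> bool:
        """False means the request should be shed with a 503."""
        return in_flight < int(self.limit)
```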

        Distributed tracing
        tracing.enabled propagates a W3C traceparent to the upstream and logs a TRACE event linking trace_id to request_id. No OpenTelemetry SDK — pure stdlib.
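        For readers unfamiliar with the header, a W3C traceparent value has four dash-separated fields: version, a 32-hex-digit trace ID, a 16-hex-digit parent (span) ID, and flags. The Python sketch below generates and parses one; the helper names are hypothetical and this is not the proxy's code.

```python
import re
import secrets
from typing import Optional

TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def new_traceparent(sampled: bool = True) -> str:
    """Create a version-00 traceparent with random trace and span IDs."""
    trace_id = secrets.token_hex(16)  # 32 hex characters
    span_id = secrets.token_hex(8)    # 16 hex characters
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"


def trace_id_of(header: str) -> Optional[str]:
    """Return the trace ID, or None when the header is malformed or all-zero."""
    match = TRACEPARENT_RE.match(header.strip())
    if not match or match.group(1) == "0" * 32 or match.group(2) == "0" * 16:
        return None
    return match.group(1)
```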

        Real-time alerting webhooks
        alerting POSTs to a webhook (generic JSON or Slack) when thresholds are crossed — blocks_per_interval, rate_limited_per_interval, adaptive_would_block_per_tick, or alert_on_upstream_down.

        SIEM / syslog streaming
        "syslog": { "enabled": true, "network": "tcp+tls", "address": "siem.example.com:6514" }
        Mirrors every structured log event to a remote collector as RFC5424 frames over udp / tcp / tcp+tls.

        Bot challenge (proof-of-work)
        "bot_challenge": { "enabled": true, "secret": "$ENV{ASP_BOT_SECRET}", "difficulty": 4, "cookie_ttl_sec": 3600 }
        Gates matching paths behind a JavaScript proof-of-work interstitial. Stateless (HMAC over client IP + time) — no server-side storage, no CAPTCHA service.

        Response DLP / PII redaction
        "dlp": { "enabled": true, "mode": "redact", "builtins": true,
  "patterns": [ { "label": "internal-token", "pattern": "INT-[A-Z0-9]{20}" } ] }
        Scans response bodies for secrets/PII and either redacts matches in-flight (mode: "redact") or logs them. builtins: true enables a curated set (US SSN, credit cards, AWS keys, JWTs, PEM private keys).
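        A custom pattern such as the internal-token entry above can be tried against sample bodies before it is deployed. The Python sketch below mimics the two modes; the "[REDACTED:<label>]" replacement text and the function name are assumptions, not the proxy's actual output.

```python
import re
from typing import List, Tuple

PATTERNS = [("internal-token", re.compile(r"INT-[A-Z0-9]{20}"))]


def scan(body: str, mode: str = "redact",
         patterns=PATTERNS) -> Tuple[str, List[Tuple[str, int]]]:
    """Return the (possibly redacted) body and a list of (label, match count) hits."""
    hits = []
    redacted = body
    for label, regex in patterns:
        redacted, count = regex.subn(f"[REDACTED:{label}]", redacted)
        if count:
            hits.append((label, count))
    # In any mode other than "redact" the body is passed through and only the hits are reported.
    return (redacted if mode == "redact" else body), hits
```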
      

      
      
        Daily AI Report #
        Once per day, the proxy ships the previous 24 hours of constable.log and the running config to the Claude API and emails the analysis — top threats, block/detect/error counts, per-rule performance, and concrete config-change recommendations. Disabled by default; the pipeline can never block proxy traffic.
        
          
            Setting Default Description
            
              ai_report.enabled false Master on/off switch.
              ai_report.schedule_time "08:00" Daily fire time, HH:MM 24-hour, server local time.
              ai_report.model "claude-opus-4-7" Anthropic model ID.
              ai_report.api_key_env "ANTHROPIC_API_KEY" Env var holding the Anthropic API key (never read from config).
              ai_report.max_tokens 4096 Output token cap on the Claude response.
              ai_report.smtp_host — SMTP server hostname.
              ai_report.smtp_port 587 SMTP port. STARTTLS auto-negotiated when advertised.
              ai_report.smtp_user — SMTP username. Empty disables auth.
              ai_report.smtp_password_env "SMTP_PASSWORD" Env var holding the SMTP password.
              ai_report.from / to — From: address and recipient list.
              ai_report.trigger_path / trigger_token empty Optional metrics-listener path + bearer token to run the report on demand (POST).
            
          
        
        
          Pre-send redaction
          Before the config is shipped to Claude, the report blanks auth.basic.users, auth.api_key.keys, auth.jwt.secret, every per-domain domains.*.auth block, all peers.shared_key fields, the Cloudflare API token, the SMTP user, and the trigger token.
        
      

      
      
        Complete config.json Example #
        A representative configuration touching the most common subsystems. Trim it to what you need — everything not listed inherits its default.
        {
  "listen_addr": ":80",
  "target_url": "http://localhost:8080",
  "max_workers": 64,
  "preserve_host": true,

  "block_status_code": 403,
  "block_message": "Blocked by proxy policy",

  "rate_limit": { "requests_per_second": 100, "burst": 200, "cleanup_interval_sec": 300 },

  "max_body_bytes": 10485760,
  "regex_timeout_ms": 2000,
  "allowed_methods": ["GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"],

  "log_format": "json",
  "log_file": "constable.log",
  "log_max_size_mb": 100,
  "log_max_backups": 5,

  "metrics_addr": "127.0.0.1:9090",
  "metrics_local_only": true,

  "cache": {
    "enabled": true, "ttl_sec": 300, "max_entries": 10000,
    "skip_cookies": ["wordpress_logged_in_", "PHPSESSID"]
  },
  "gzip": { "enabled": true, "level": 6, "exclude_types": ["image/", "video/", "application/pdf"] },

  "tls": { "listen_addr": ":443", "min_version": "1.2" },
  "lets_encrypt": { "enabled": true, "domains": ["example.com"], "email": "admin@example.com",
                    "cache_dir": "/var/lib/constable/acme", "challenge": "http" },
  "https_redirect": true,
  "allowed_redirect_hosts": ["example.com", "www.example.com"],

  "security_headers": {
    "X-Content-Type-Options": "nosniff",
    "X-Frame-Options": "DENY",
    "Referrer-Policy": "strict-origin-when-cross-origin"
  },
  "remove_response_headers": ["Server", "X-Powered-By"],

  "cve_detection": { "enabled": true, "mode": "block" },

  "botnet_detection": {
    "enabled": true, "log_only": false,
    "ip_blocklists": [
      { "url": "https://www.spamhaus.org/drop/drop.txt", "label": "Spamhaus DROP" }
    ],
    "behavioral_enabled": true, "error_threshold": 20, "scan_threshold": 100,
    "ua_fingerprint_enabled": true
  },

  "url_rules": [
    { "label": "block .env files",      "pattern": "\\.env($|\\?)" },
    { "label": "path traversal",        "pattern": "(?i)(\\.\\.[\\\\/]|%2e%2e(%2f|%5c))" },
    { "label": "sensitive file access", "pattern": "(?i)\\.(htaccess|htpasswd|git|svn|bak)($|[\\?/])" }
  ],
  "header_rules": [
    { "label": "block scanner user-agents", "pattern": "(?i)User-Agent:.*(sqlmap|nikto|nmap|nuclei)" }
  ],
  "body_rules": [
    { "label": "log4shell JNDI", "pattern": "(?i)\\$\\{jndi:(ldap|rmi|dns)://" }
  ],

  "conditional_rules": [
    {
      "label": "login brute-force",
      "path_pattern": "^/login$", "methods": ["POST"],
      "trigger_on": "failure", "threshold": 3, "window_sec": 1800,
      "action": "block", "block_duration_min": 60
    }
  ]
}

        
          Where things are written
          config.json (you — hot-reloaded every 3s) · learned-rules.json (Learn Mode — candidate rules to review) · learned-model.json (Adaptive Learning — stats-only model) · constable.log (the proxy — structured event log, ingested by the daily AI report).
        

        
          This documentation covers installing, configuring, and operating Constable.

Field	Default	Meaning
`mode`	`off`	`off` / `observe` / `shadow` / `enforce`.
`model_file`	`learned-model.json`	Persisted model path (stats only — no raw values).
`warmup_sec`	`86400`	Time before any path profile can be considered mature.
`min_path_observations`	`200`	Clean requests before a path profile matures.
`min_param_observations`	`20`	Observations before a param's value stats are trusted.
`min_confidence`	`0.8`	Minimum attribute confidence to score a deviation.
`enforce_threshold`	`0.85`	Combined score needed to (corroborated) block in enforce.
`min_corroborating_signals`	`2`	Independent signal classes that must agree to block.
`decay_half_life_sec`	`604800`	Decay half-life for profiles + attack labels (7 days).
`snapshot_interval_sec`	`300`	Model flush + decay sweep interval.
`max_path_profiles`	`5000`	Cap on distinct path templates (LRU).
`exempt_paths`	`[]`	Regexes; matching paths are never scored or learned.

Field	Default	Description
`upstream_dial_timeout_ms`	`5000`	TCP dial timeout for new upstream connections.
`upstream_keepalive_sec`	`30`	TCP keep-alive interval on dialer.
`upstream_response_header_timeout_ms`	`30000`	Max wait for the upstream's response headers.
`upstream_max_idle_conns`	`1024`	Total idle connections kept across all upstreams.
`upstream_max_idle_conns_per_host`	`256`	Idle connections kept per host.
`upstream_idle_conn_timeout_sec`	`90`	How long an idle pooled connection stays before closing.
`upstream_total_request_timeout_sec`	`0`	When > 0, caps the entire upstream forward (headers + body). `0` = no end-to-end limit.

Setting	Default	Description
`ai_report.enabled`	`false`	Master on/off switch.
`ai_report.schedule_time`	`"08:00"`	Daily fire time, `HH:MM` 24-hour, server local time.
`ai_report.model`	`"claude-opus-4-7"`	Anthropic model ID.
`ai_report.api_key_env`	`"ANTHROPIC_API_KEY"`	Env var holding the Anthropic API key (never read from config).
`ai_report.max_tokens`	`4096`	Output token cap on the Claude response.
`ai_report.smtp_host`	—	SMTP server hostname.
`ai_report.smtp_port`	`587`	SMTP port. STARTTLS auto-negotiated when advertised.
`ai_report.smtp_user`	—	SMTP username. Empty disables auth.
`ai_report.smtp_password_env`	`"SMTP_PASSWORD"`	Env var holding the SMTP password.
`ai_report.from` / `to`	—	`From:` address and recipient list.
`ai_report.trigger_path` / `trigger_token`	empty	Optional metrics-listener path + bearer token to run the report on demand (`POST`).
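
The settings above might be assembled like this. It is a sketch under a few assumptions: the hostnames and addresses are placeholders, and `to` is written as a JSON array because the table describes it as a recipient list.

```json
{
  "ai_report": {
    "enabled": true,
    "schedule_time": "08:00",
    "api_key_env": "ANTHROPIC_API_KEY",
    "max_tokens": 4096,
    "smtp_host": "smtp.example.com",
    "smtp_port": 587,
    "smtp_user": "constable@example.com",
    "smtp_password_env": "SMTP_PASSWORD",
    "from": "constable@example.com",
    "to": ["secops@example.com"]
  }
}
```

Both secrets stay out of the file: `api_key_env` and `smtp_password_env` name environment variables, so the values themselves would be supplied through the service environment.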

Field	Type	Description
`label`	string	Human-readable name.
`path_pattern`	string	RE2 regex matched against the request path (not query string).
`allowed_methods`	[]string	Method whitelist for this path only. Overrides global for matching requests.
`inspect_get_body`	bool	Override `inspect_get_body` for this path only.
`cache`	bool	Override global `cache.enabled` for this path. Omit to inherit.
`gzip`	bool	Override global `gzip.enabled` for this path. Omit to inherit.
`url_rules` / `header_rules` / `body_rules`	[]rule	Additional rules applied only to matching requests.
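
A single per-path entry built from the fields above could look like the following sketch. The name of the enclosing array is not shown in this table, so only the entry object is given; the label and regex are hypothetical.

```json
{
  "label": "static assets",
  "path_pattern": "^/static/",
  "allowed_methods": ["GET", "HEAD"],
  "cache": true,
  "gzip": true
}
```

Because `path_pattern` is matched against the path only, a request for `/static/app.js?v=3` would match on `/static/app.js`, and the query string plays no part.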

Field	Type	Default	Description
`enabled`	bool	`false`	Master switch.
`mode`	string	`block`	Default action when a rule omits its own mode (`block`/`log`/`null`).
`require_stack_match`	bool	`false`	Only evaluate a rule when its platform/component is present behind the upstream. Unknown stacks fail open. Ubiquitous payloads (Log4Shell etc.) are never scoped out.
`block_when`	string	`always`	Gate blocking on the exposure verdict: `always`, `exposed_or_unknown`, or `exposed`.
`only_kev`	bool	`false`	Load only rules flagged `kev: true` (CISA KEV catalog).
`feed_url`	string	—	Feed of `[]CVERule` JSON. Empty = bundled feed. `http(s)://` = remote poll. Bare path / `file://` = local file.
`feed_token`	string	—	`$ENV{}`-expandable bearer token; redacted from the AI report.
`feed_interval_sec`	int	`3600`	Feed poll interval.
`feed_allow_http`	bool	`false`	Allow a plaintext `http://` feed URL (otherwise rejected).
`disable_builtins`	[]string	—	CVE ids to drop from the built-in catalog only.
`disable`	[]string	—	CVE ids to drop from any source.
`custom_rules`	[]CVERule	—	Operator-supplied rules.
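
One cautious starting point, sketched from the fields above, is to run the bundled feed in log-only mode and restrict it to KEV-flagged rules. The parent key for this block is not shown in the table, so only the inner object is given.

```json
{
  "enabled": true,
  "mode": "log",
  "require_stack_match": true,
  "block_when": "exposed_or_unknown",
  "only_kev": true,
  "feed_interval_sec": 3600
}
```

With `feed_url` omitted, the bundled feed is used. Once the logged matches have been reviewed, switching `mode` to `block` would make `block_when` the deciding gate: `exposed_or_unknown` blocks unless the stack is known not to be exposed.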

Field	Type	Description
`platform`	string	e.g. `wordpress`, `apache`, `php`.
`components`	[]object	Declared components as `{ name, version }` objects; a declared version overrides detection.
`auto_detect`	bool	`true` by default; `false` disables passive fingerprinting for this host.
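
A declared stack could be written as in this sketch. The component name and version are hypothetical examples, and the parent key that holds this object is not shown in the table.

```json
{
  "platform": "wordpress",
  "components": [
    { "name": "php", "version": "8.2" }
  ],
  "auto_detect": false
}
```

Declaring versions this way is mainly useful when the upstream hides its version banners, since passive fingerprinting would then have little to work with.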

Field	Type	Default	Description
`geoip.enabled`	bool	`false`	Enable GeoIP checking.
`geoip.database_path`	string	`""`	Path to a CSV file with columns `cidr,country_code`.
`geoip.blocked_countries`	[]string	`[]`	ISO 3166-1 alpha-2 codes to block.
`geoip.allowed_countries`	[]string	`[]`	If non-empty, only these codes are allowed through.
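
A minimal block-list setup might look like this. The database path is a hypothetical location, and `XX` / `ZZ` are placeholder codes standing in for real ISO 3166-1 alpha-2 values.

```json
{
  "geoip": {
    "enabled": true,
    "database_path": "/etc/constable/geoip.csv",
    "blocked_countries": ["XX", "ZZ"]
  }
}
```

Each row of the CSV follows the `cidr,country_code` column layout, for example `203.0.113.0/24,XX`. To invert the policy, populate `allowed_countries` instead, since a non-empty allow list admits only the listed codes.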

Field	Default	Description
`enabled`	`false`	Enable the traffic profiler.
`window_sec`	`300`	Observation window in seconds.
`min_requests`	`100`	Minimum requests before generating rules.
`min_param_count`	`5`	Minimum number of times a query parameter must appear.
`max_rules_per_type`	`20`	Cap on how many parameters get rules.
`source_ips`	`[]`	If non-empty, only learn from these source IPs / CIDRs. Empty = learn from all.
`output_file`	`"learned-rules.json"`	File to write generated rules.
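
As a sketch, the profiler could be limited to traffic from a trusted internal range so that attack traffic does not shape the candidate rules. The CIDR is a hypothetical example, and the parent key for this block is not shown in the table.

```json
{
  "enabled": true,
  "window_sec": 300,
  "min_requests": 100,
  "min_param_count": 5,
  "max_rules_per_type": 20,
  "source_ips": ["10.0.0.0/8"],
  "output_file": "learned-rules.json"
}
```

The generated file holds candidate rules for review; nothing in this table suggests they are enforced automatically.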

Mode	Behavior
`off`	Engine not built (zero cost). Default.
`observe`	Learns and persists the model only. No scoring, no logs, never blocks.
`shadow`	Scores every request, emits `[ADAPTIVE]` + the `proxy_adaptive_would_block_total` metric, but never blocks. Run this while evaluating.
`enforce`	May block — but only by feeding a corroborated verdict to a `conditional_rules` entry.

Field	Type	Default	Description
`upstreams`	[]object	`[]`	List of upstream targets. Each has a `url` and optional `weight`.
`load_balance`	string	`"round-robin"`	`"round-robin"`, `"least-conn"`, or `"random"`.
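
Two upstreams with an optional weight on the first could be declared as follows. The addresses are placeholders, and how `weight` interacts with each strategy is not specified in this table.

```json
{
  "upstreams": [
    { "url": "http://10.0.0.11:8080", "weight": 2 },
    { "url": "http://10.0.0.12:8080" }
  ],
  "load_balance": "round-robin"
}
```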

Field	Type	Default	Description
`health_check.enabled`	bool	`false`	Enable background health probing.
`health_check.interval_sec`	int	`10`	Seconds between probes.
`health_check.timeout_sec`	int	`5`	HTTP timeout per probe in seconds.
`health_check.path`	string	`"/"`	Path probed on each upstream.
`health_check.unhealthy_threshold`	int	`3`	Consecutive failures before marking a node unhealthy.
`health_check.endpoint_path`	string	`"/healthz"`	Path the proxy itself answers for liveness (returns 200 OK).
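
A sketch of a health-check block that probes a dedicated upstream endpoint; `/health` is a hypothetical path on the upstream.

```json
{
  "health_check": {
    "enabled": true,
    "interval_sec": 10,
    "timeout_sec": 2,
    "path": "/health",
    "unhealthy_threshold": 3,
    "endpoint_path": "/healthz"
  }
}
```

With probes every 10 seconds and a threshold of 3 consecutive failures, a node that stops responding would be marked unhealthy after about three probe cycles, on the order of 30 seconds. Note that `path` is what the proxy probes on each upstream, while `endpoint_path` is what the proxy itself answers.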

Field	Type	Default	Description
`cache.enabled`	bool	`false`	Enable response caching.
`cache.ttl_sec`	int	`60`	Time-to-live in seconds for each cached entry.
`cache.max_entries`	int	`0` → capped at `100000`	Maximum cached entries. `0` is treated as a bounded 100,000 cap (with a warning) so an attacker can't OOM the cache.
`cache.max_body_bytes`	int	`1048576` (1 MB)	Max response body size to cache. Larger bodies are forwarded normally and never cached.
`cache.x_cache_header`	bool	`false`	Add `X-Cache: HIT`/`MISS` to responses.
`cache.skip_cookies`	[]string	`[]`	Cookie-name prefixes that disqualify a request from the cache (prefix match).
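
A small cache configuration might look like this sketch. The cookie-name prefixes are hypothetical examples of session cookies that should bypass the cache.

```json
{
  "cache": {
    "enabled": true,
    "ttl_sec": 60,
    "max_entries": 10000,
    "max_body_bytes": 1048576,
    "x_cache_header": true,
    "skip_cookies": ["session", "wordpress_logged_in_"]
  }
}
```

For sizing, the worst case for cached bodies is `max_entries` × `max_body_bytes`: here 10,000 × 1 MB, or roughly 10 GB, before counting headers and bookkeeping. Real usage is usually far lower, but the product is a useful upper bound when choosing both values.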

Field	Type	Default	Description
`gzip.enabled`	bool	`false`	Enable gzip compression.
`gzip.level`	int	`0`	Compression level: `1` = fastest, `9` = best compression, `0` = Go default (level 6).
`gzip.exclude_types`	[]string	`[]`	Content-Type substrings to skip.
`gzip.exclude_extensions`	[]string	`[]`	Request path extensions to skip (e.g. `".jpg"`, `".mp4"`).
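
The following sketch enables compression while skipping content that is already compressed. The excluded types and extensions are illustrative choices.

```json
{
  "gzip": {
    "enabled": true,
    "level": 6,
    "exclude_types": ["image/", "video/"],
    "exclude_extensions": [".jpg", ".png", ".mp4", ".zip"]
  }
}
```

Because `exclude_types` is matched as a substring of the Content-Type, `"image/"` covers `image/png`, `image/jpeg`, and so on.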

Field	Type	Default	Description
`tls.listen_addr`	string	`":443"`	Address for the TLS listener.
`tls.cert_file`	string	`""`	Path to PEM certificate file.
`tls.key_file`	string	`""`	Path to PEM private key file.
`tls.min_version`	string	`"1.2"`	Minimum TLS version: `"1.0"`–`"1.3"`.
`tls.cipher_suites`	[]string	Go defaults	Preferred cipher suites (TLS 1.0–1.2 only; TLS 1.3 fixed by Go).
`tls.client_auth`	string	`"none"`	mTLS: `none` / `request` / `require`. See Hardening.
`tls.client_ca_file`	string	`""`	CA to verify client certificates against (mTLS).
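
A TLS listener that also requires client certificates could be sketched as below. All three file paths are hypothetical.

```json
{
  "tls": {
    "listen_addr": ":443",
    "cert_file": "/etc/constable/tls/fullchain.pem",
    "key_file": "/etc/constable/tls/privkey.pem",
    "min_version": "1.2",
    "client_auth": "require",
    "client_ca_file": "/etc/constable/tls/client-ca.pem"
  }
}
```

Dropping `client_auth` and `client_ca_file` gives plain server-side TLS. Setting `cipher_suites` has no effect on TLS 1.3 connections, since Go fixes that list.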

Field	Type	Default	Description
`lets_encrypt.enabled`	bool	`false`	Enable automatic certificate management.
`lets_encrypt.domains`	[]string	`[]`	Domains to obtain a cert for. Every key in the `domains` map is also unioned into the SAN list automatically.
`lets_encrypt.email`	string	`""`	Contact email registered with Let's Encrypt.
`lets_encrypt.cache_dir`	string	`"./certs"`	Directory to store the account key and certificate.
`lets_encrypt.staging`	bool	`false`	Use the staging environment for testing.
`lets_encrypt.challenge`	string	`"http"`	`"http"` (HTTP-01) or `"dns-cloudflare"` (DNS-01).
`lets_encrypt.cloudflare.api_token`	string	`""`	Cloudflare token with `Zone:DNS:Edit` (DNS-01 only).
`lets_encrypt.cloudflare.zone_id`	string	`""`	Cloudflare Zone ID (DNS-01 only).
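
A DNS-01 setup through Cloudflare might be written like this sketch. The domain, email, and directory are placeholders, and the two angle-bracket values stand for credentials that are not filled in here.

```json
{
  "lets_encrypt": {
    "enabled": true,
    "domains": ["example.com", "www.example.com"],
    "email": "admin@example.com",
    "cache_dir": "/var/lib/constable/certs",
    "staging": true,
    "challenge": "dns-cloudflare",
    "cloudflare": {
      "api_token": "<cloudflare-api-token>",
      "zone_id": "<cloudflare-zone-id>"
    }
  }
}
```

Starting with `staging` set to `true` lets the flow be tested against the staging environment; it can be switched to `false` once issuance works. For HTTP-01, set `challenge` to `"http"` and omit the `cloudflare` object.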

Field	Type	Default	Description
`https_redirect`	bool	`false`	Enable HTTP → HTTPS redirects.
`https_redirect_addr`	string	`":80"`	Address for the redirect listener. When equal to `listen_addr`, redirects are handled inline on the main listener.
`www_redirect`	bool	`false`	Redirect `example.com` → `www.example.com`.
`allowed_redirect_hosts`	[]string	`[]`	Explicit `Host` values that may appear in a redirect Location (open-redirect protection). Matched case-insensitively, port-stripped.
`allow_any_redirect_host`	bool	`false`	Legacy behavior of accepting any Host. Only set if an upstream layer already validates Host.
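
A sketch of HTTPS redirection with an explicit host allow-list, using placeholder hostnames:

```json
{
  "https_redirect": true,
  "https_redirect_addr": ":80",
  "www_redirect": false,
  "allowed_redirect_hosts": ["example.com", "www.example.com"]
}
```

Since matching is case-insensitive and port-stripped, a request with `Host: Example.com:80` would match the `example.com` entry.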

Field	Type	Default	Description
`metrics_addr`	string	`""` (disabled)	Address for the metrics listener, e.g. `"127.0.0.1:9090"`.
`metrics_local_only`	bool	`false`	Restrict metrics (and `/cve-intel` / `/anomaly-model`) to loopback connections only.
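
For example, binding the metrics listener to loopback and also enforcing the loopback-only check could look like this:

```json
{
  "metrics_addr": "127.0.0.1:9090",
  "metrics_local_only": true
}
```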

Field	Type	Default	Description
`peers.enabled`	bool	`false`	Enable peer sync.
`peers.listen_addr`	string	`""`	Address for this node's peer-sync listener. Required when enabled.
`peers.shared_key`	string	`""`	HMAC-SHA256 signing key shared by all peers. Supports `$ENV{VAR}`. Required — an empty key is rejected.
`peers.peers`	[]object	`[]`	List of peer nodes — each has an `address` and optional per-peer `shared_key` override.
`peers.sync_interval_sec`	int	`60`	How often each node pulls full state from peers to reconcile.
`peers.cert_file` / `key_file`	string	`""`	TLS cert/key for the listener (falls back to `tls.*`).
`peers.peer_ca_cert`	string	`""`	PEM CA to verify peer TLS certs (for self-signed certs).
`peers.mutual_tls`	bool	`false`	Present this node's cert as a client cert when connecting to peers.
`peers.allow_plaintext`	bool	`false`	Opt into plaintext HTTP when no TLS cert is set (trusted private networks only).
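
A three-node deployment might configure each node roughly as follows. The addresses, port, and CA path are hypothetical, the `host:port` form of `address` is an assumption, and `CONSTABLE_PEER_KEY` is an illustrative environment variable name.

```json
{
  "peers": {
    "enabled": true,
    "listen_addr": ":9443",
    "shared_key": "$ENV{CONSTABLE_PEER_KEY}",
    "peers": [
      { "address": "10.0.0.12:9443" },
      { "address": "10.0.0.13:9443" }
    ],
    "sync_interval_sec": 60,
    "peer_ca_cert": "/etc/constable/peer-ca.pem",
    "mutual_tls": true
  }
}
```

Keeping the key in an environment variable keeps it out of the config file. Because `cert_file` and `key_file` are omitted, the listener would fall back to the `tls.*` certificate.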
