Introduction
mitm2openapi converts mitmproxy flow dumps and HAR files into OpenAPI 3.0 specifications. It ships as a single static binary — no Python, no virtual environment, no runtime dependencies.
It is a Rust rewrite of mitmproxy2swagger by @alufers, who pioneered the "capture traffic, extract API spec" workflow. Credit to the original project for the idea and reference implementation.
Why?
The Python original works well but requires Python, pip, and mitmproxy installed in the
environment. For CI pipelines, slim Docker images, security audits, and one-off usage, that
dependency chain is friction.
mitm2openapi ships as a single ~5 MB static binary. Drop it into any environment and run.
Same OpenAPI 3.0 output, plus first-class HAR support and glob-based filters for fully
unattended pipelines.
Features
- Fast — pure Rust, ~17× faster than the Python original (benchmarks)
- Single static binary — no Python, no venv, no pip, no runtime dependencies
- Two-format support — mitmproxy flow dumps (v19/v20/v21) and HAR 1.2
- Two-command workflow — `discover` finds endpoints, you curate, `generate` emits OpenAPI 3.0
- Glob filters — `--exclude-patterns` and `--include-patterns` for automated pipelines
- Error recovery — skips corrupt flows, continues processing
- Auto-detection — heuristic format detection from file content
- Resource limits — configurable caps prevent denial-of-service on untrusted input
- Strict mode — treat warnings as errors for CI gates
- Structured reports — `--report` outputs machine-readable JSON processing summaries
- Battle-tested — integration tests against Swagger Petstore and OWASP crAPI
- Cross-platform — Linux, macOS, Windows pre-built binaries
How it works
The tool uses a three-step workflow:
- Discover — scan captured traffic and list all observed API endpoints
- Curate — review the list and select which endpoints to include
- Generate — produce a clean OpenAPI 3.0 spec from the selected endpoints
This separates endpoint selection from spec generation, giving you full control over what ends up in the final spec.
Next steps
Continue with installation, then follow the quick start.
Installation
From binary releases
Download a pre-built binary for your platform from GitHub Releases.
Binaries are available for Linux (x86_64, aarch64), macOS (x86_64, aarch64), and Windows (x86_64).
# Example: Linux x86_64 — replace <VERSION> with the release tag (e.g. v0.5.1)
curl -L "https://github.com/Arkptz/mitm2openapi/releases/download/<VERSION>/mitm2openapi-<VERSION>-x86_64-unknown-linux-gnu.tar.gz" \
| tar xz
sudo mv mitm2openapi /usr/local/bin/
From source (via Cargo)
If you have a Rust toolchain installed:
cargo install --git https://github.com/Arkptz/mitm2openapi
Or from crates.io:
cargo install mitm2openapi
Verify installation
mitm2openapi --version
Shell completions
mitm2openapi uses clap for argument parsing. Shell completions
are not yet bundled, but you can generate them for most shells via clap_complete if building
from source.
Quick start
This walkthrough takes you from a traffic capture to a complete OpenAPI spec in under a minute.
Prerequisites
- `mitm2openapi` installed (see installation)
- A captured traffic file — either a mitmproxy `.flow` dump or a `.har` export from browser DevTools
If you do not have a capture yet, see capturing traffic for setup instructions.
Step 1: Discover endpoints
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com"
This scans every request in capture.flow that matches the prefix https://api.example.com
and writes a templates file listing all observed URL paths.
Step 2: Curate the templates
Open templates.yaml. Each path is prefixed with ignore: by default:
x-path-templates:
- ignore:/api/users
- ignore:/api/users/{id}
- ignore:/api/products
- ignore:/static/bundle.js
Remove the ignore: prefix from paths you want in the final spec:
x-path-templates:
- /api/users
- /api/users/{id}
- /api/products
- ignore:/static/bundle.js
Paths still prefixed with ignore: are excluded from the generated spec.
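The `ignore:` convention is easy to script against. A minimal Python sketch (a hypothetical helper, not part of the tool) that extracts only the activated paths from a templates file:

```python
# The activated paths are simply the entries without the "ignore:"
# prefix. Plain string handling here; a real script might parse YAML.
templates = """x-path-templates:
- /api/users
- /api/users/{id}
- /api/products
- ignore:/static/bundle.js
""".splitlines()

active = [
    line[2:] for line in templates
    if line.startswith("- ") and not line.startswith("- ignore:")
]
print(active)  # ['/api/users', '/api/users/{id}', '/api/products']
```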
Step 3: Generate the OpenAPI spec
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com"
The resulting openapi.yaml contains a valid OpenAPI 3.0 spec with paths, methods,
parameters, request bodies, and response schemas inferred from the captured traffic.
Skip the manual edit
If you already know which paths matter, use glob filters to automate curation:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--exclude-patterns '/static/**,/images/**,*.css,*.js,*.svg' \
--include-patterns '/api/**,/v2/**'
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com"
Paths matching --include-patterns are auto-activated (no ignore: prefix). Paths matching
--exclude-patterns are dropped entirely. Everything else still gets ignore: for manual
review.
See filtering endpoints for the full glob syntax reference.
HAR files
The same workflow works with HAR files — just point -i at a .har file. The format is
auto-detected:
mitm2openapi discover \
-i capture.har \
-o templates.yaml \
-p "https://api.example.com"
See HAR files for details on exporting HARs from browser DevTools.
Capturing traffic
Before you can generate an OpenAPI spec, you need a captured traffic file. This chapter covers the most common ways to capture HTTP traffic.
Option 1: mitmproxy (recommended)
mitmproxy is a free, open-source HTTPS proxy. It captures traffic
in its own binary flow format that mitm2openapi reads natively.
Install mitmproxy
# macOS
brew install mitmproxy
# Linux (pip)
pip install mitmproxy
# Or download from https://mitmproxy.org/
See the mitmproxy installation docs for platform-specific instructions.
Capture with mitmdump
mitmdump is the non-interactive version of mitmproxy, ideal for scripted captures:
# Start the proxy and write all traffic to a flow file
mitmdump -w capture.flow
# In another terminal, route your HTTP client through the proxy:
curl --proxy http://localhost:8080 https://api.example.com/users
The default proxy port is 8080. Use -p to change it:
mitmdump -w capture.flow -p 9090
Capture with mitmweb
mitmweb provides a browser-based UI for inspecting traffic in real time:
mitmweb -w capture.flow
# Open http://localhost:8081 in your browser to inspect traffic
HTTPS traffic
For HTTPS, you need to install the mitmproxy CA certificate on the client machine.
After starting mitmproxy, navigate to http://mitm.it from the proxied client to
download and install the certificate.
See the mitmproxy certificate docs for detailed instructions.
Tips
- Use `mitmdump --set flow_detail=0` for minimal console output during long captures
- Combine with `--set save_stream_filter` to capture only specific hosts
- The flow format is versioned (v19/v20/v21) — `mitm2openapi` supports all three
Option 2: Browser DevTools (HAR export)
All modern browsers can export captured network traffic as HAR (HTTP Archive) files.
Chrome / Chromium
- Open DevTools (`F12` or `Ctrl+Shift+I`)
- Switch to the Network tab
- Ensure recording is active (red circle icon)
- Perform the actions you want to capture
- Right-click in the request list → Save all as HAR with content
Firefox
- Open DevTools (`F12`)
- Switch to the Network tab
- Perform the actions you want to capture
- Click the gear icon → Save All As HAR
Safari
- Enable the Develop menu in Preferences → Advanced
- Open Web Inspector (`Cmd+Option+I`)
- Switch to the Network tab
- Perform the actions
- Click Export in the toolbar
HAR files from browser DevTools contain the full request and response bodies. Sensitive data (cookies, tokens, passwords) will be present in the export. Sanitize before sharing.
Option 3: Other HTTP proxies
Any tool that produces HAR 1.2 output works with mitm2openapi:
- Charles Proxy — export sessions as HAR via File → Export
- Fiddler — File → Export Sessions → HTTPArchive
- Proxyman — export as HAR from the session menu
What to capture
For the best OpenAPI spec, capture diverse traffic:
- Multiple endpoints — the more paths covered, the more complete the spec
- Different HTTP methods — GET, POST, PUT, DELETE on the same resource
- Various response codes — 200, 400, 404, 500 responses produce richer schemas
- Query parameters — include requests with different query strings
- Request bodies — POST/PUT with different payloads improve body schema inference
Next steps
Once you have a capture file, proceed to the quick start or learn about the full discover → curate → generate pipeline.
Discover, curate, generate
mitm2openapi uses a three-step pipeline to convert captured HTTP traffic into an OpenAPI
specification. This chapter explains each step in detail.
Overview
graph LR
A[Traffic capture] --> B[discover]
B --> C[Templates file]
C --> D[Curate]
D --> E[generate]
E --> F[OpenAPI 3.0 spec]
The pipeline separates endpoint discovery from spec generation, giving you an explicit curation step where you choose which endpoints appear in the final spec.
Step 1: Discover
The discover command scans a traffic capture and extracts all unique URL paths that match
a given prefix.
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com"
What happens internally
- The input file is read incrementally (streaming — memory usage stays bounded)
- Each request's URL is checked against the `--prefix` filter
- Matching paths are collected and deduplicated
- Path segments that look like IDs (UUIDs, numeric strings) are replaced with `{id}` placeholders (or `{id1}`, `{id2}`, ... when a path has multiple parameters)
- The result is written to the templates file
Templates file format
The output is a YAML file with path templates under an x-path-templates key:
x-path-templates:
- ignore:/api/users
- ignore:/api/users/{id}
- ignore:/api/products
- ignore:/api/products/{id}/reviews
- ignore:/static/bundle.js
Every path is prefixed with ignore: by default. This is intentional — it forces you to
explicitly opt in to each endpoint.
Automatic parameterization
The discover step detects path segments that vary across requests and replaces them with named parameters:
| Observed paths | Template |
|---|---|
| `/api/users/42`, `/api/users/99` | `/api/users/{id}` |
| `/api/orders/abc-def-123` | `/api/orders/{id}` |
UUID-like and numeric segments are detected automatically. More complex patterns require manual editing of the templates file.
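As a rough model of this detection (an illustrative sketch, not the tool's actual heuristic), numeric and UUID-shaped segments can be recognized like this:

```python
import re

# Canonical 8-4-4-4-12 hex UUID shape.
UUID_RE = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)

def parameterize(path: str) -> str:
    # Replace numeric and UUID-shaped segments with {id}; number the
    # placeholders ({id1}, {id2}, ...) when there is more than one.
    segments = path.strip("/").split("/")
    out = ["{id}" if s.isdigit() or UUID_RE.match(s) else s for s in segments]
    if out.count("{id}") > 1:
        n = 0
        for i, s in enumerate(out):
            if s == "{id}":
                n += 1
                out[i] = "{id%d}" % n
    return "/" + "/".join(out)
```

For example, `parameterize("/api/users/42")` yields `/api/users/{id}`, while a path with two variable segments gets numbered placeholders.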
Step 2: Curate
Open the templates file in any text editor. For each path:
- Remove `ignore:` to include the endpoint in the generated spec
- Leave `ignore:` to exclude it
- Delete the line to exclude it permanently
# Before curation
x-path-templates:
- ignore:/api/users
- ignore:/api/users/{id}
- ignore:/static/bundle.js
# After curation
x-path-templates:
- /api/users
- /api/users/{id}
- ignore:/static/bundle.js
You can also edit parameter names. The default {id} placeholder can be renamed to
something more descriptive like {userId}:
- /api/users/{userId}
Automating curation with glob filters
For CI pipelines or large captures, manual curation is impractical. Use --include-patterns
and --exclude-patterns during the discover step instead:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--include-patterns '/api/**' \
--exclude-patterns '/static/**,*.css,*.js'
Paths matching --include-patterns are emitted without the ignore: prefix (auto-activated).
Paths matching --exclude-patterns are dropped entirely. Everything else gets ignore: for
manual review.
See filtering endpoints for the full glob syntax.
Step 3: Generate
The generate command re-reads the traffic capture and produces an OpenAPI spec using the
curated templates as a guide:
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com"
What happens internally
- The templates file is loaded and the `ignore:` entries are filtered out
- Each template path is compiled into a regex for matching
- The traffic capture is streamed again, matching each request against the templates
- For each matched request:
- Path parameters are extracted
- Query parameters are collected
- Request body schema is inferred (JSON, form data)
- Response status code and body schema are recorded
- When multiple requests match the same template, their schemas are merged:
- Different status codes (200, 400, 404) produce separate response entries
- Request body is taken from the first observation; subsequent same-endpoint observations only contribute response schemas
- The final OpenAPI 3.0 document is written as YAML
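The merge rule described above can be sketched as follows (hypothetical helper names and a simplified schema shape; the real implementation differs):

```python
def merge_observation(op: dict, obs: dict) -> dict:
    # Request body: kept from the first observation only.
    if "requestBody" not in op and "requestBody" in obs:
        op["requestBody"] = obs["requestBody"]
    # Responses: one entry per distinct status code; the first schema
    # seen for a given status wins in this simplified model.
    responses = op.setdefault("responses", {})
    for status, schema in obs.get("responses", {}).items():
        responses.setdefault(status, schema)
    return op

op = {}
merge_observation(op, {"requestBody": {"type": "object"},
                       "responses": {"200": {"type": "object"}}})
merge_observation(op, {"responses": {"404": {"type": "string"}}})
```

After both observations, `op` has separate `200` and `404` response entries and the request body from the first observation.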
Customizing output
The generate command accepts several options to tune the output:
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com" \
--openapi-title "My API" \
--openapi-version "2.0.0" \
--exclude-headers "authorization,cookie" \
--ignore-images
See the CLI reference for all available options.
Worked example
Starting from a mitmproxy capture of a pet store API:
# Discover all endpoints under the API prefix
mitm2openapi discover \
-i petstore.flow \
-o templates.yaml \
-p "http://petstore:8080" \
--exclude-patterns '/static/**' \
--include-patterns '/api/**'
# Templates file now has API paths auto-activated:
# - /api/v3/pet
# - /api/v3/pet/{id}
# - /api/v3/pet/findByStatus
# - /api/v3/store/inventory
# - ignore:/static/swagger-ui.css
# Generate the spec
mitm2openapi generate \
-i petstore.flow \
-t templates.yaml \
-o openapi.yaml \
-p "http://petstore:8080"
# Result: openapi.yaml with paths, methods, schemas
The generated openapi.yaml is a valid OpenAPI 3.0 document that can be opened in
Swagger UI, imported into Postman, or used
as a contract for API testing.
Filtering endpoints
The discover command supports glob-based filters to automate endpoint curation.
This is useful for CI pipelines or large captures where manual editing is impractical.
Glob syntax
Filters use git-style glob patterns (powered by the globset crate):
| Pattern | Matches | Does not match |
|---|---|---|
| `*` | Single path segment | Segments with `/` |
| `**` | Any number of path segments | (matches everything) |
| `?` | Any single character | |
| `[abc]` | Character class | |
| `{a,b}` | Alternation | |
--exclude-patterns
Paths matching any exclude glob are dropped entirely — they do not appear in the templates file at all.
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--exclude-patterns '/static/**,/images/**,*.css,*.js,*.svg,*.png'
Multiple patterns are comma-separated. A path is excluded if it matches any pattern.
--include-patterns
Paths matching any include glob are emitted without the ignore: prefix — they are
auto-activated for the generate step.
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--include-patterns '/api/**,/v2/**'
Combining filters
When both are specified:
- Exclude runs first — matching paths are dropped entirely
- Include runs second — matching paths among the survivors are auto-activated
- Everything else gets the
ignore:prefix for manual review
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--exclude-patterns '/static/**,*.css,*.js' \
--include-patterns '/api/**'
Result:
- `/static/bundle.js` — excluded (dropped)
- `/api/users` — included (auto-activated)
- `/dashboard` — neither matched (gets `ignore:` prefix)
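The precedence can be sketched in Python. This models the documented semantics for `*`, `**`, and `?` only (character classes and alternation omitted); it is not the tool's actual globset-based matcher:

```python
import re

def glob_to_regex(pattern: str) -> re.Pattern:
    # Translate the documented glob subset into a regex:
    #   ** -> any number of segments, * -> within one segment,
    #   ?  -> one character (never "/").
    out, i = [], 0
    while i < len(pattern):
        if pattern[i : i + 2] == "**":
            out.append(".*")
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return re.compile("^" + "".join(out) + "$")

def classify(path, include_patterns, exclude_patterns):
    # Exclude runs first (dropped), include second (auto-activated),
    # everything else keeps the ignore: prefix.
    if any(glob_to_regex(p).match(path) for p in exclude_patterns):
        return "dropped"
    if any(glob_to_regex(p).match(path) for p in include_patterns):
        return "active"
    return "ignore"
```

With the patterns from the example above, `classify` reproduces the three outcomes: dropped, active, and ignore.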
Examples
API-only spec
--include-patterns '/api/**' \
--exclude-patterns '/api/internal/**,/api/debug/**'
Strip static assets
--exclude-patterns '/static/**,/assets/**,*.css,*.js,*.svg,*.png,*.jpg,*.gif,*.ico,*.woff,*.woff2'
Multiple API versions
--include-patterns '/v1/**,/v2/**,/v3/**'
Pattern tips
- Patterns match against the URL path only (after the prefix is stripped), not the full URL
- Leading `/` is recommended for clarity but not required
- Patterns are case-sensitive
- Use `**` sparingly — it matches everything, including deeply nested paths
Resource limits
To prevent denial-of-service when processing untrusted captures, mitm2openapi enforces
several configurable and fixed limits.
Configurable limits
These limits can be adjusted via CLI flags:
| Flag | Default | Purpose |
|---|---|---|
| `--max-input-size` | 2 GiB | Reject files larger than this before reading |
| `--max-payload-size` | 256 MiB | Cap on individual tnetstring payload allocation |
| `--max-depth` | 256 | Recursion depth limit for nested tnetstring structures |
| `--max-body-size` | 64 MiB | Maximum request/response body considered during schema inference |
| `--allow-symlinks` | off | By default, symlinked inputs are rejected |
Adjusting limits
Increase --max-input-size if you work with captures larger than 2 GiB:
mitm2openapi discover \
-i large-capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--max-input-size 8GiB
Size suffixes are supported: KiB, MiB, GiB.
The other limits rarely need tuning. The defaults are designed to handle real-world captures while rejecting pathological inputs.
Symlink rejection
By default, symlinked input files are rejected to prevent path-traversal attacks on shared CI runners. If you need to process a symlinked file:
mitm2openapi discover \
-i /path/to/symlinked-capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--allow-symlinks
Fixed per-field limits
These limits are applied unconditionally and cannot be changed via CLI flags:
| Field | Cap | Behaviour when exceeded |
|---|---|---|
| Header name | 8 KiB | Header dropped (other headers still processed) |
| Header value | 64 KiB | Value truncated to cap |
| Form fields per request | 1,000 | Excess fields ignored |
| URL scheme | http / https only | Non-HTTP flows silently skipped |
| Port number | 1–65,535 | Out-of-range port drops the request |
| HTTP status code | 100–599 | Invalid codes treated as no response |
UTF-8 validation
Identity fields (scheme, host, path, method, header names) require valid UTF-8. Flows with non-UTF-8 identity bytes are skipped to prevent data aliasing through replacement-character collisions.
Control characters (0x00–0x1F, 0x7F) in paths are stripped automatically.
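The stripping rule is equivalent to this one-liner (a sketch of the documented behaviour, not the tool's code):

```python
def strip_control_chars(path: str) -> str:
    # Drop ASCII control characters 0x00-0x1F and DEL (0x7F),
    # keeping everything else (including non-ASCII) intact.
    return "".join(ch for ch in path if ord(ch) > 0x1F and ord(ch) != 0x7F)
```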
Streaming and memory
Both mitmproxy flow files and HAR files are processed incrementally. Memory usage stays bounded regardless of input size — there is no need to load the entire capture into memory.
Peak RSS is proportional to the size of the largest single flow in the capture, not the total file size. For typical captures, expect 5–15 MB of memory usage.
When limits fire
When a per-field limit is exceeded (header too large, body too large, form fields over cap), the affected field is skipped or truncated and processing continues with the remaining data.
When a tnetstring parse error occurs, the iterator halts and the rest of the file is not processed — valid flows parsed before the error are still emitted. There is no resync because binary payloads can contain bytes that mimic valid length prefixes.
In both cases a warn-level log message is emitted with details.
Use strict mode to treat these warnings as errors, or processing reports to capture them as structured data.
Strict mode
Pass --strict to either discover or generate to treat warning-level events as
hard failures. The process exits with code 2 if the processing report records any
counted events.
Currently, the only event counter populated at runtime is parse_error — triggered when
flows cannot be deserialized (corrupt tnetstring data, malformed HAR JSON). The
cap_fired and rejected counters exist in the report schema but are not yet wired to
the reader pipelines; they will be connected in a future release.
In practice, --strict today catches:
- Parse errors during flow deserialization (tnetstring or HAR)
- Errors counted by the streaming iterator wrapper in `discover` mode
Usage
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--strict
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com" \
--strict
CI usage pattern
Strict mode is designed for CI gates where silent degradation is unacceptable:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--strict \
|| { echo "FAIL: corrupt or over-limit flows detected"; exit 1; }
Without --strict
Without the flag, parse errors are logged at warn level and processing continues with
exit code 0. Affected flows are skipped, but the output file is still produced. Other
warning-level events (cap fires, scheme rejections, etc.) are always logged but do not
currently increment the report counters that --strict checks.
Exit codes
| Code | Meaning |
|---|---|
| 0 | Success (no warnings, or --strict not set) |
| 1 | Fatal error (I/O failure, missing required arguments) |
| 2 | Strict mode violation (warnings detected with --strict) |
Combining with reports
For CI pipelines that need both strict enforcement and structured diagnostics:
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com" \
--strict \
--report report.json
The report is written even when --strict causes a non-zero exit, capturing
the full details of what went wrong.
Processing reports
Pass --report <PATH> to either discover or generate to write a JSON processing
summary. This is useful for CI pipelines that need structured data instead of log scraping.
Usage
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--report report.json
Report schema
{
"report_version": 1,
"tool_version": "0.5.1",
"input": {
"path": "capture.flow",
"format": "Auto",
"size_bytes": 102400
},
"result": {
"flows_read": 150,
"flows_emitted": 148,
"paths_in_spec": 12
},
"events": {
"parse_error": {
"TNetString parse error at byte 98304: unexpected end of input": 1
}
}
}
Fields
| Field | Type | Description |
|---|---|---|
| `report_version` | integer | Schema version (currently 1) |
| `tool_version` | string | mitm2openapi version that produced the report |
| `input.path` | string | Input file path |
| `input.format` | string | Detected or specified format (`Auto`, `Mitmproxy`, `Har`) |
| `input.size_bytes` | integer | Input file size in bytes |
| `result.flows_read` | integer | Total flows/entries parsed from input |
| `result.flows_emitted` | integer | Flows that passed all filters and were processed |
| `result.paths_in_spec` | integer | Unique paths in the output (for generate) |
| `events` | object | Map of event categories to message counts |
Event categories
| Category | Meaning | Status |
|---|---|---|
| `parse_error` | Corrupt data encountered (tnetstring errors, malformed HAR entries) | Populated |
| `cap_fired` | A resource limit was triggered (body too large, depth exceeded) | Reserved — not yet populated at runtime |
| `rejected` | A flow was skipped (invalid UTF-8, unsupported scheme, bad port/status) | Reserved — not yet populated at runtime |
The cap_fired and rejected categories are present in the report schema and will be
connected to the reader pipelines in a future release. Currently, only parse_error
events are counted.
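Because the report is plain JSON, a CI script can make decisions from it directly. A minimal Python sketch (the JSON is inlined here for self-containment; a real pipeline would read the `--report` file, and the variable names are invented):

```python
import json

# Inlined sample matching the schema above; normally loaded from
# the file written by --report.
report = json.loads("""
{
  "result": {"flows_read": 150, "flows_emitted": 148, "paths_in_spec": 12},
  "events": {"parse_error": {"TNetString parse error at byte 98304: unexpected end of input": 1}}
}
""")

# Total events across categories (each category maps message -> count).
error_count = sum(sum(msgs.values()) for msgs in report["events"].values())
# Flows read but not emitted were skipped by filters or error recovery.
dropped = report["result"]["flows_read"] - report["result"]["flows_emitted"]
print(error_count, dropped)  # 1 2
```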
CI integration
Parse the report in CI to make decisions based on processing quality:
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com" \
--report report.json
# Check if any events occurred
if jq -e '.events | length > 0' report.json > /dev/null 2>&1; then
echo "Warning: processing had events"
jq '.events' report.json
fi
Report with strict mode
The report is written even when --strict causes a non-zero exit code. This lets you
capture full diagnostics while still failing the CI job:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--strict \
--report report.json \
|| { jq '.' report.json; exit 1; }
CLI reference
This reference was last synced with mitm2openapi --help at version 0.5.1.
If you notice a flag missing from your local --help output, the tool may be ahead of these
docs. Open an issue to prompt an update.
mitm2openapi discover
Scan captured traffic and produce a templates file listing all observed endpoints.
mitm2openapi discover [OPTIONS] -i <INPUT> -o <OUTPUT> -p <PREFIX>
Required arguments
| Option | Description |
|---|---|
| `-i, --input <PATH>` | Input file (flow dump or HAR) |
| `-o, --output <PATH>` | Output YAML templates file |
| `-p, --prefix <URL>` | API prefix URL to filter requests |
Optional arguments
| Option | Default | Description |
|---|---|---|
| `--format <FORMAT>` | auto | Input format: auto, har, mitmproxy |
| `--exclude-patterns <GLOBS>` | | Comma-separated globs; matching paths are dropped entirely |
| `--include-patterns <GLOBS>` | | Comma-separated globs; matching paths are auto-activated |
| `--max-input-size <BYTES>` | 2 GiB | Maximum input file size. Accepts KiB, MiB, GiB suffixes |
| `--allow-symlinks` | off | Allow symlinked input files |
| `--strict` | off | Treat warnings as errors (exit code 2) |
| `--report <PATH>` | | Write structured JSON processing report |
mitm2openapi generate
Generate an OpenAPI 3.0 spec from captured traffic using a curated templates file.
mitm2openapi generate [OPTIONS] -i <INPUT> -t <TEMPLATES> -o <OUTPUT> -p <PREFIX>
Required arguments
| Option | Description |
|---|---|
| `-i, --input <PATH>` | Input file (flow dump or HAR) |
| `-t, --templates <PATH>` | Templates YAML file (from discover) |
| `-o, --output <PATH>` | Output OpenAPI YAML file |
| `-p, --prefix <URL>` | API prefix URL |
Optional arguments
| Option | Default | Description |
|---|---|---|
| `--format <FORMAT>` | auto | Input format: auto, har, mitmproxy |
| `--openapi-title <TITLE>` | | Custom title for the spec |
| `--openapi-version <VER>` | 1.0.0 | Custom spec version |
| `--exclude-headers <LIST>` | | Comma-separated headers to exclude from spec |
| `--exclude-cookies <LIST>` | | Comma-separated cookies to exclude from spec |
| `--include-headers` | off | Include request headers in the spec |
| `--ignore-images` | off | Ignore image content types |
| `--suppress-params` | off | Suppress parameter suggestions |
| `--tags-overrides <JSON>` | | JSON string for tag overrides |
| `--max-input-size <BYTES>` | 2 GiB | Maximum input file size |
| `--max-payload-size <BYTES>` | 256 MiB | Maximum tnetstring payload size |
| `--max-depth <N>` | 256 | Maximum tnetstring nesting depth |
| `--max-body-size <BYTES>` | 64 MiB | Maximum request/response body size |
| `--allow-symlinks` | off | Allow symlinked input files |
| `--strict` | off | Treat warnings as errors (exit code 2) |
| `--report <PATH>` | | Write structured JSON processing report |
Common flag details
--format
By default, the input format is auto-detected from a combination of file extension and content sniffing:
- `.flow` extension or content starting with a tnetstring length prefix → mitmproxy format
- `.har` extension or content starting with `{` → HAR format
Use --format mitmproxy or --format har to override auto-detection.
--prefix
The prefix URL filters which requests are processed. Only requests whose URL starts with the prefix are included. The prefix is stripped from paths in the generated spec.
Example: with --prefix https://api.example.com, a request to
https://api.example.com/users/42 produces path /users/42 in the spec.
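A sketch of this behaviour (a model of the documented semantics, with an invented helper name):

```python
def spec_path(url: str, prefix: str):
    # Requests outside the prefix are filtered out entirely (None);
    # otherwise the prefix is stripped and the remainder is the spec path.
    if not url.startswith(prefix):
        return None
    return url[len(prefix):] or "/"
```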
--strict
See strict mode for details on exit codes and CI usage.
--report
See processing reports for the JSON schema and CI integration examples.
Exit codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Fatal error (I/O failure, missing arguments, invalid input) |
| 2 | Strict mode violation (warnings with --strict enabled) |
Environment variables
| Variable | Description |
|---|---|
| `RUST_LOG` | Controls log verbosity. Default: warn. Set to info or debug for more output. |
RUST_LOG=info mitm2openapi discover -i capture.flow -o templates.yaml -p "https://api.example.com"
mitmproxy flow dumps
mitm2openapi reads mitmproxy's native binary flow format. This is the recommended input
format — it captures the richest data and is produced directly by mitmdump and mitmweb.
Supported versions
| Flow format version | mitmproxy version | Status |
|---|---|---|
| v19 | mitmproxy 8.x | Supported |
| v20 | mitmproxy 9.x | Supported |
| v21 | mitmproxy 10.x | Supported |
The flow format is auto-detected from file content. No version flag is needed.
How flow files work
Flow files use the tnetstring serialization format. Each flow is a sequence of key-value pairs representing a complete HTTP request-response cycle.
A typical flow contains:
- Request: method, URL (scheme, host, port, path), headers, body
- Response: status code, headers, body
- Metadata: timestamps, flow ID, client/server addresses
mitm2openapi extracts the request and response data relevant to OpenAPI spec generation
and discards metadata.
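To make the framing concrete, here is a toy tnetstring reader in Python covering strings, integers, and dicts only — an illustration of the `LENGTH:PAYLOAD<tag>` format, not the tool's parser:

```python
def parse_tnetstring(data: bytes):
    # Each value is "LENGTH:PAYLOAD" followed by a one-byte type tag.
    # Returns (value, remaining bytes).
    head, _, rest = data.partition(b":")
    length = int(head)
    payload = rest[:length]
    tag = rest[length : length + 1]
    remainder = rest[length + 1 :]
    if tag == b",":            # raw string / bytes
        return payload, remainder
    if tag == b"#":            # integer
        return int(payload), remainder
    if tag == b"}":            # dict: alternating key/value tnetstrings
        value = {}
        while payload:
            key, payload = parse_tnetstring(payload)
            val, payload = parse_tnetstring(payload)
            value[key] = val
        return value, remainder
    raise ValueError(f"unsupported type tag {tag!r}")
```

For example, `b"12:3:foo,3:bar,}"` decodes to the dict `{b"foo": b"bar"}`.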
Capturing flow files
# Record all traffic through the proxy
mitmdump -w capture.flow
# Record only traffic to a specific host
mitmdump -w capture.flow --set flow_detail=0 \
--set save_stream_filter='~d api.example.com'
See capturing traffic for full setup instructions.
Directory input
If you pass a directory path to -i, mitm2openapi reads all .flow files in that
directory (non-recursive). This is useful when you have traffic split across multiple
capture sessions.
Known limitations
- No WebSocket frames — WebSocket upgrade requests are captured, but frame-level data is not used for spec generation
- No gRPC — binary protocol buffers inside HTTP/2 frames are not decoded
- Corrupt files — when the tnetstring parser hits corruption, it stops and reports the byte offset. No resync is attempted because binary payloads can contain bytes that mimic valid tnetstring length prefixes. See diagnostics for details.
- Large payloads — individual tnetstring payloads are capped at 256 MiB by default (adjustable via `--max-payload-size`)
HAR files
mitm2openapi reads HAR (HTTP Archive)
files — the standard format for exporting browser network traffic. HAR version 1.2 is supported.
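The HAR 1.2 shape the tool consumes is plain JSON: a `log` object with an `entries` array, each entry holding a `request` and a `response`. A minimal sketch (entry inlined for self-containment; a real capture would be read from the `.har` file):

```python
import json

# Single-entry HAR illustrating the fields relevant to spec generation.
har = json.loads("""
{"log": {"version": "1.2", "entries": [
  {"request": {"method": "GET", "url": "https://api.example.com/users"},
   "response": {"status": 200}}
]}}
""")

for entry in har["log"]["entries"]:
    req, resp = entry["request"], entry["response"]
    print(req["method"], req["url"], "->", resp["status"])
```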
Producing HAR files
Browser DevTools
All modern browsers export HAR from their Network tab:
- Chrome/Chromium: DevTools → Network → right-click → "Save all as HAR with content"
- Firefox: DevTools → Network → gear icon → "Save All As HAR"
- Safari: Web Inspector → Network → Export button
HTTP proxies
Several proxy tools export HAR:
- Charles Proxy — File → Export Session → HAR
- Fiddler — File → Export Sessions → HTTPArchive
- Proxyman — Export as HAR
Programmatic generation
Libraries like puppeteer and playwright
can produce HAR files from automated browser sessions:
// Playwright example
const context = await browser.newContext({
recordHar: { path: 'capture.har' }
});
// ... run your test
await context.close(); // HAR is written on close
Usage
mitm2openapi discover \
-i capture.har \
-o templates.yaml \
-p "https://api.example.com"
Format is auto-detected. Use --format har to force HAR parsing if auto-detection fails.
HAR vs mitmproxy flows
| Aspect | mitmproxy flow | HAR |
|---|---|---|
| Source | mitmproxy proxy | Browser DevTools, HTTP proxies |
| Format | Binary (tnetstring) | JSON |
| Response bodies | Always present | Sometimes base64-encoded |
| HTTPS | Decrypted by proxy | Decrypted by browser |
| File size | Compact binary | Larger (JSON overhead) |
| Streaming | Native | Incremental JSON parsing |
Both formats produce equivalent OpenAPI specs. Choose based on your capture workflow:
- mitmproxy flows for server-side proxying, CI pipelines, and automated captures
- HAR files for browser-based testing, manual exploration, and when you already have DevTools open
Incremental parsing
HAR files are parsed incrementally — the entire JSON is not loaded into memory at once. This means memory usage stays bounded even for large HAR exports (hundreds of megabytes).
Known limitations
- Base64-encoded bodies — some HAR exporters base64-encode response bodies. Decode failures are logged as warnings and the body is skipped (not silently dropped).
- Compressed content — if the HAR exporter did not decompress response bodies, `mitm2openapi` sees the compressed bytes. Most browser DevTools decompress automatically.
- Timing data — HAR timing information (DNS, connect, TLS) is ignored; only request and response data is used for spec generation.
Performance & Benchmarks
Results are regenerated weekly by the benchmark workflow. See the workflow for the reproducible methodology.
Benchmark results
Run: 2026-04-22 22:31 UTC, commit 22ef2faa, runner: Linux 6.17.0-1011-azure
Fixture: 89 MB, 40k requests across 8 endpoint shapes (bench-fixtures-v1).
Timing
| Command | Mean (s) | Min (s) | Max (s) | Relative |
|---|---|---|---|---|
| Python mitmproxy2swagger | 44.757 ± 0.219 | 44.384 | 44.965 | 16.80 ± 0.26 |
| Rust mitm2openapi | 2.663 ± 0.039 | 2.618 | 2.712 | 1.00 |
Peak RSS
| Tool | RSS |
|---|---|
| Python mitmproxy2swagger | 46104 KB |
| Rust mitm2openapi | 6168 KB |
Security model
- Threat model
- Input validation layers
- Streaming architecture
- Glob pattern safety
- Recommendations
- Related
mitm2openapi processes untrusted binary input (traffic captures from unknown sources).
The security model is designed to prevent denial-of-service, data corruption, and
information leakage when handling adversarial input.
Threat model
The primary threat is a malicious capture file — a .flow or .har file crafted to
exploit the parser. Scenarios include:
- CI pipelines processing captures from untrusted contributors
- Shared analysis servers where multiple users submit captures
- Automated pipelines where the capture source is not fully controlled
Input validation layers
File-level checks
Before reading any content:
- File type — only regular files are accepted. Symlinks, FIFOs, device files, and directories are rejected unless `--allow-symlinks` is explicitly set.
- File size — files exceeding `--max-input-size` (default 2 GiB) are rejected before any bytes are read.
- TOCTOU caveat — file metadata is checked via the path before reading to reject symlinks, non-regular files, and oversized inputs. There is a small TOCTOU window between the metadata check and the file open; mitigation via fd-based recheck after open is a future enhancement.
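The same pre-read gate can be approximated in shell — an illustrative sketch assuming GNU coreutils (`stat -c`); mitm2openapi performs these checks internally before opening the file:

```shell
# Reject symlinks, non-regular files, and oversized inputs before reading.
check_input() {
  f=$1
  [ -L "$f" ] && { echo "reject: symlink"; return 1; }
  [ -f "$f" ] || { echo "reject: not a regular file"; return 1; }
  max=$((2 * 1024 * 1024 * 1024))   # 2 GiB, the --max-input-size default
  size=$(stat -c %s "$f")           # GNU stat
  [ "$size" -le "$max" ] || { echo "reject: exceeds max input size"; return 1; }
  echo "ok"
}
```

Usage: `check_input capture.flow` prints `ok` or a rejection reason and returns a nonzero status on rejection.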
Parser-level caps
During parsing:
| Cap | Default | Purpose |
|---|---|---|
| Payload size | 256 MiB | Prevents OOM from oversized tnetstring values |
| Nesting depth | 256 | Prevents stack overflow from deeply nested structures |
| JSON depth | 64 | Prevents stack overflow in schema inference |
| Body size | 64 MiB | Limits memory for individual request/response bodies |
These caps trigger warn-level events and skip the affected data. Use --strict to
treat them as hard errors.
Field-level validation
For every flow:
- Scheme whitelist — only `http` and `https` are accepted. Other schemes (e.g., `javascript:`, `data:`) are silently skipped.
- UTF-8 strictness — identity fields (method, scheme, host, path, header names) must be valid UTF-8. Invalid bytes cause the flow to be skipped, preventing data aliasing through replacement-character collisions.
- Port range — port numbers must be 1–65535. Out-of-range values drop the request.
- Status code range — HTTP status codes must be 100–599.
- Control character stripping — `0x00`–`0x1F` and `0x7F` in URL paths are removed.
- Header caps — header names over 8 KiB are dropped; values over 64 KiB are truncated.
- Form field count — at most 1,000 form fields per request are processed.
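The aliasing risk behind strict UTF-8 handling is easy to demonstrate in shell. Here lossy clean-up with `tr` stands in for replacement-character decoding (illustrative only; mitm2openapi skips such flows rather than sanitizing them):

```shell
# Two different raw byte sequences for a host field collapse to the same
# string once invalid bytes are dropped — the collision that strict
# UTF-8 validation prevents.
a=$(printf 'h\377ost' | tr -cd 'a-z')   # 0xFF dropped
b=$(printf 'h\376ost' | tr -cd 'a-z')   # 0xFE dropped
[ "$a" = "$b" ] && echo "aliased: $a"   # → aliased: host
```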
Output safety
- Atomic writes — output files are written via a temporary file and renamed. If the write fails (disk full, permission denied), the target path is left untouched.
- No resync on corruption — when the tnetstring parser encounters corrupt data, it halts immediately. It does not scan forward looking for the next valid frame, because binary payloads can contain bytes that look like valid length prefixes.
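The atomic-write behavior is the standard tempfile-plus-rename idiom, sketched here in shell for illustration (the function name is made up for this example):

```shell
# Write stdin to a temp file in the target's directory, then rename into
# place. rename(2) is atomic on a single filesystem, so readers see either
# the old file or the complete new one — never a partial write.
atomic_write() {
  target=$1
  tmp=$(mktemp "${target}.XXXXXX") || return 1
  cat > "$tmp" && mv -f "$tmp" "$target"
}

printf 'openapi: 3.0.0\n' | atomic_write openapi.yaml
```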
Streaming architecture
Both mitmproxy and HAR inputs are processed incrementally. At no point is the entire capture loaded into memory. This bounds peak RSS to the size of the largest single flow, regardless of total file size.
Glob pattern safety
The --exclude-patterns and --include-patterns flags use the
globset crate, which compiles patterns into a DFA. This eliminates
exponential backtracking that was possible with the original recursive glob matcher.
Recommendations
For processing untrusted captures:
- Do not use `--allow-symlinks` unless you control the filesystem
- Keep `--max-input-size` at the default (2 GiB) or lower
- Run with `--strict` to fail fast on any anomaly
- Use `--report` to capture processing diagnostics for audit trails
- Run in a sandboxed environment (container, VM) when processing captures from unknown sources
Related
- Resource limits — configuring the caps
- Strict mode — CI enforcement
- Diagnostics — interpreting warnings and errors
Diagnostics
mitm2openapi uses structured logging to report issues during processing. This chapter
covers how to interpret warnings, errors, and the structured report output.
Log levels
Control verbosity with the RUST_LOG environment variable:
# Default: warnings only
mitm2openapi discover -i capture.flow -o templates.yaml -p "https://api.example.com"
# More detail
RUST_LOG=info mitm2openapi discover -i capture.flow -o templates.yaml -p "https://api.example.com"
# Full debug output
RUST_LOG=debug mitm2openapi discover -i capture.flow -o templates.yaml -p "https://api.example.com"
Common warnings
Parse errors (tnetstring)
WARN TNetString parse error at byte 98304: unexpected end of input (148 flows parsed successfully)
This means the mitmproxy flow file contains corrupt data starting at byte 98,304. The parser halts immediately and the remaining bytes in the file are not processed. The 148 flows parsed before the corruption are still emitted.
No resync is attempted. Binary payloads can contain bytes that mimic valid tnetstring length prefixes, so scanning forward would produce phantom flows with fabricated data.
What to do:
- If the file was truncated during transfer, re-capture or re-download
- The 148 successfully parsed flows are still usable
- Use `--report` to capture the exact byte offset for debugging
Cap-fired events
WARN body size 68157440 exceeds cap 67108864, truncating
WARN header name exceeds 8192 bytes, dropping
WARN form field count 1247 exceeds cap 1000, ignoring excess
These indicate that a specific field in a flow exceeded the built-in or configured limit. The affected field is truncated or dropped, but processing continues.
What to do:
- Usually safe to ignore — the caps exist to prevent abuse, not normal traffic
- If you need the full data, increase the relevant `--max-*` flag
- Use `--strict` to fail on these if you need guaranteed completeness
Flow rejection events
WARN skipping flow: scheme "javascript" not in whitelist [http, https]
WARN skipping flow: invalid UTF-8 in host field
WARN skipping flow: port 0 out of valid range 1-65535
These mean an entire flow was skipped because it failed validation.
What to do:
- Non-HTTP flows (WebSocket upgrades, CONNECT tunnels) are expected to be skipped
- UTF-8 errors suggest the capture contains binary protocol data, not HTTP traffic
- Invalid port/status usually indicates corrupt flow data
Structured reports
For machine-readable diagnostics, use --report:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--report report.json
See processing reports for the full JSON schema.
Event categories in reports
| Category | Examples |
|---|---|
| `parse_error` | Tnetstring corruption, HAR JSON syntax errors |
| `cap_fired` | Body too large, depth exceeded, form field count exceeded |
| `rejected` | Invalid scheme, non-UTF-8 identity fields, bad port/status |
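A per-category event count can be pulled with jq. Only the field names used in this chapter's examples are assumed here; the full layout is documented in the processing reports chapter. The `report.json` below is a minimal mock:

```shell
# Minimal mock report for illustration.
cat > report.json <<'EOF'
{"events":{"parse_error":[],"cap_fired":[{"cap":"body_size"}],"rejected":[]},
 "result":{"flows_read":150,"flows_emitted":148}}
EOF

# Count events per category.
jq '{parse_errors: (.events.parse_error | length),
     caps_fired:   (.events.cap_fired   | length),
     rejected:     (.events.rejected    | length)}' report.json
```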
Using reports in CI
```shell
# Fail if any parse errors occurred
if jq -e '.events.parse_error | length > 0' report.json > /dev/null 2>&1; then
  echo "Parse errors detected"
  exit 1
fi

# Check flows-read vs flows-emitted ratio (integer math, no bc needed;
# also guards against division by zero when flows_read is 0)
READ=$(jq '.result.flows_read' report.json)
EMITTED=$(jq '.result.flows_emitted' report.json)
if [ "$READ" -gt 0 ] && [ $((EMITTED * 10)) -lt $((READ * 9)) ]; then
  echo "Warning: more than 10% of flows were dropped"
fi
```
Strict mode interaction
With --strict, any warning-level event causes exit code 2. This converts the
"informational" diagnostics above into hard failures:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--strict \
--report report.json
# Exit code 2 if ANY warning was emitted
# report.json still written for post-mortem
See strict mode for details.
Changelog
All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
Unreleased
0.5.2 - 2026-04-24
Fixed
- (har) apply header size caps consistent with mitmproxy reader
- (reader) reject symlinked directory inputs and entries
Other
- (security) cover symlink directory and entry rejection
- (readme) trim content migrated to book, add docs badge
- (book) add mdBook scaffold with book.toml and all chapter content
- adjust CHANGELOG/CONTRIBUTING headings for mdBook inclusion
- regenerate demo.gif skip ci
0.5.1 - 2026-04-22
Other
- (bench) refresh benchmark results
- (bench) drop small fixture tier
- (readme) add benchmarks section linking to automated results
- (bench) seed benchmarks.md with methodology and placeholders
- regenerate demo.gif skip ci
0.5.0 - 2026-04-22
Added
- (cli) add --strict flag to escalate warnings to errors
Other
- (readme) document --strict flag and benchmark CI
- (strict) verify strict mode exit codes
0.4.1 - 2026-04-22
Fixed
- (builder) use .get() in dedup_schema_variants to satisfy indexing_slicing lint
- (reader) warn on skipped directory entries and malformed overrides
- (schema) union array element schemas and tighten dict heuristic
Other
- (lint) deny clippy::indexing_slicing at crate level
- extract is_numeric_string and is_uuid to shared module
- (output) lazy-init regex via LazyLock
- (error) replace guarded unwrap sites with pattern matching
0.4.0 - 2026-04-22
Added
- feat!(builder): merge response schemas per status code
- feat!(cli): remove unused --param-regex flag
Other
- (readme) remove --param-regex mention from CLI reference
- (cli) verify --param-regex is rejected as unknown argument
- (builder) cover multi-status response aggregation
- refactor!(error): mark Error enum as non_exhaustive
- regenerate demo.gif skip ci
0.3.0 - 2026-04-22
Added
- (report) track cap firings and parse errors in processing report
- (cli) add --report flag for structured processing summary
- (tnetstring) emit byte offset and error kind on parse halt
Other
- (readme) document --report flag and parse halt diagnostics
- (report) verify report file schema and contents
- (tnetstring) verify parse halt diagnostics and no-resync on binary payload
0.2.6 - 2026-04-22
Fixed
- (test) gate Unix-specific path-failure test behind cfg(unix)
- (output) write YAML via tempfile and atomic rename
Other
- (output) verify atomic write preserves target on failure
- (deps) move tempfile to runtime dependencies
0.2.5 - 2026-04-22
Fixed
- (builder) skip requests with unknown HTTP methods instead of aliasing to GET
Other
- (builder) verify unknown method is skipped and standard methods preserved
0.2.4 - 2026-04-22
Fixed
- (params) preserve multi-byte UTF-8 in urlencoding_decode
Other
- (params) add UTF-8 roundtrip and overlong rejection cases
- regenerate demo.gif skip ci
0.2.3 - 2026-04-22
Fixed
- (builder) cap form-field count per request at 1000
- (har) validate schemes and status codes, log base64 failures, cap bodies
- (reader) validate port/status ranges, enforce strict UTF-8, and cap field sizes
Other
- (readme) document per-field size and validation limits
0.2.2 - 2026-04-22
Added
- (har) add streaming HAR entry iterator
Other
- (readme) mention HAR streaming in resource limits and supported formats
- (har) verify streaming does not materialize all entries
- (reader) switch HAR dispatch to streaming iterator
- regenerate demo.gif skip ci
0.2.1 - 2026-04-22
Added
- (reader) add stream_mitmproxy_file and stream_mitmproxy_dir
- (tnetstring) add streaming iterator TNetStringIter
Other
- (readme) document resource-limit flags and streaming behavior
- (main) switch discover and generate to streaming pipeline
- (path_matching) cache compiled regexes in CompiledTemplates
- (builder) add discover_paths_streaming variant
- regenerate demo.gif skip ci
0.2.0 - 2026-04-22
Added
- (path_matching) validate path parameter identifiers
- (cli) expose --max-input-size, --max-payload-size, --max-depth, --max-body-size, --allow-symlinks
- (reader) reject symlinks, non-regular files, and oversized inputs
- (schema) enforce 64-level JSON recursion depth limit
- (tnetstring) enforce 256-level recursion depth limit
- (tnetstring) cap payload size at 256 MiB
- (error) add typed variants for parse and input limits
Fixed
- (test) gate symlink and FIFO tests behind cfg(unix)
Other
- update Cargo.lock for globset dependency
- (security) cover symlink, FIFO, and oversize input rejection
- (har) bound format-detection read to 4 KiB
- (builder) replace custom glob matcher with globset
0.1.2 - 2026-04-22
Other
- add tests/fixtures/poc placeholder directory (P0.2)
- regenerate demo.gif skip ci
0.1.1 - 2026-04-20
Other
- (readme) add Why? section explaining the Python-vs-Rust trade-off
- (deps) bump assert_cmd from 2.2.0 to 2.2.1 in the all-cargo group (#7)
- regenerate demo.gif skip ci
0.1.0 - TBD
Initial release.
Contributing
This document covers how to run the three test tracks locally.
Prerequisites
| Tool | Required for | Install |
|---|---|---|
| Rust toolchain | Build + unit tests | rustup.rs |
| Docker + Compose | All integration tests | docs.docker.com |
| oasdiff | Level 1 diff validation | `go install github.com/tufin/oasdiff@latest` or `brew install oasdiff` |
| Node.js + npm | Level 2 | nodejs.org |
| Playwright | Level 2 | npx playwright install --with-deps chromium |
| VHS | Demo GIF | brew install vhs or charm apt repo |
| ffmpeg, gifski, gifsicle | Demo GIF optimization | System package manager |
Build
cargo build --release
# Binary: target/release/mitm2openapi
Unit Tests
cargo test
Level 1 — Petstore Golden Test (~2 min)
Full pipeline (compose up, seed, discover, generate, diff, teardown):
tests/integration/level1/run.sh
Strict mode (--fail-on WARN instead of BREAKING):
tests/integration/level1/run.sh --strict
Manual step-by-step:
cd tests/integration/level1
docker compose up -d
# Wait for petstore healthcheck...
../../ci/petstore/seed.sh
# Run mitm2openapi discover/generate against the proxy
# ...
docker compose down -v
Gotcha: The seed script sends requests through the mitmproxy proxy to `petstore:8080` (the Docker service name), not `localhost`. This is intentional — traffic must flow through the proxy to be captured.
Level 2 — crAPI + Playwright (~8 min)
cd tests/integration/level2
# Start crAPI stack (identity + community + workshop + web + mongo + postgres + mailhog + mitmproxy)
make up
# No seed needed — crAPI auto-seeds on first boot
# Run Playwright scenarios
npm install
npx playwright install --with-deps chromium
npx playwright test
# Cleanup
make down
Port conflict: Level 1 and Level 2 both use port 8080 (for different services). Do not run both stacks simultaneously.
Demo GIF (Phase 2 terminal recording)
cd ci/demo
make phase2 # VHS recording
make gif # gifski + gifsicle optimization
make clean # remove outputs
Phase 2 uses a real capture, not a committed fixture. The GHA workflow copies `tests/integration/level2/out/crapi.flow` (produced by Phase 1) into `ci/demo/out/demo.flow` before running the tape. Locally, do the same: run Phase 1 first, then `cp tests/integration/level2/out/crapi.flow ci/demo/out/demo.flow`.
Filtering discover output
Captures from real apps include static assets (/static/css/main.*.css,
/images/*.svg, etc.) which bloat the generated OpenAPI spec. Two flags
on discover handle this:
mitm2openapi discover \
-i capture.flow -o templates.yaml -p http://api.example.com \
--exclude-patterns '/static/**,/images/**,*.css,*.js,*.svg,*.png,*.jpg' \
--include-patterns '/api/**,/v2/**'
- `--exclude-patterns`: paths matching any glob are dropped entirely (not even emitted as `ignore:`).
- `--include-patterns`: paths matching any glob are emitted without the `ignore:` prefix (i.e. auto-activated for `generate`). Everything else still gets `ignore:` for manual review.
Globs: `*` matches a single path segment, `**` matches any subtree.
Together they let `generate` run with no intermediate `sed` or editor
step — useful for automated pipelines like the demo GIF.
Ports Reference
| Stack | Service | Port |
|---|---|---|
| Level 1 | Petstore | 8080 |
| Level 1 | mitmproxy | 8081 |
| Level 2 | crAPI web | 8888 |
| Level 2 | mailhog | 8025 |
| Level 2 | mitmproxy | 8080 |
| Demo | Swagger UI | 8088 |
Cleanup
All compose stacks use docker compose down -v to remove containers and volumes.
CI Workflows
| Workflow | Trigger | Notes |
|---|---|---|
| `integration-level1.yml` | Every PR | Naive (required) + strict (informational) |
| `integration-level2.yml` | Nightly + manual dispatch | Full crAPI + Playwright |
| `demo-gif.yml` | Push to main (path-filtered) + manual dispatch | Terminal recording |