Discover, curate, generate
- Overview
- Step 1: Discover
- Step 2: Curate
- Step 3: Generate
- Worked example
- Generating stable operationIds
- Organizing operations with tags
- MEXC-style envelope APIs
- Enriching generated specs
mitm2openapi uses a three-step pipeline to convert captured HTTP traffic into an OpenAPI
specification. This chapter explains each step in detail.
Overview
graph LR
A[Traffic capture] --> B[discover]
B --> C[Templates file]
C --> D[Curate]
D --> E[generate]
E --> F[OpenAPI 3.0 spec]
The pipeline separates endpoint discovery from spec generation, giving you an explicit curation step where you choose which endpoints appear in the final spec.
Step 1: Discover
The discover command scans a traffic capture and extracts all unique URL paths that match
a given prefix.
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com"
What happens internally
- The input file is read incrementally (streaming, so memory usage stays bounded)
- Each request's URL is checked against the --prefix filter
- Matching paths are collected and deduplicated
- Path segments that look like IDs (UUIDs, numeric strings) are replaced with {id} placeholders (or {id1}, {id2}, ... when a path has multiple parameters)
- The result is written to the templates file
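The steps above can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the function names templatize and discover and the two regexes are assumptions made for the example, and the real ID detection may be stricter or looser.

```python
import re
from urllib.parse import urlsplit

# Assumed patterns for "looks like an ID"; the tool's real heuristics may differ.
UUID_RE = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")
NUMERIC_RE = re.compile(r"^\d+$")


def templatize(path):
    """Replace ID-like segments with {id}, or {id1}, {id2}, ... if several."""
    segments = path.split("/")
    id_positions = [
        i for i, seg in enumerate(segments)
        if UUID_RE.match(seg) or NUMERIC_RE.match(seg)
    ]
    for n, i in enumerate(id_positions, start=1):
        segments[i] = "{id}" if len(id_positions) == 1 else f"{{id{n}}}"
    return "/".join(segments)


def discover(urls, prefix):
    """Return sorted, deduplicated templates, each marked ignore: by default."""
    templates = set()
    for url in urls:
        if not url.startswith(prefix):
            continue  # request is outside the requested prefix
        templates.add(templatize(urlsplit(url).path))
    return [f"ignore:{t}" for t in sorted(templates)]
```

Calling discover with a list of captured URLs and the prefix "https://api.example.com" yields entries in the same shape as the templates file described in the next section.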
Templates file format
The output is a YAML file with path templates under an x-path-templates key:
x-path-templates:
- ignore:/api/users
- ignore:/api/users/{id}
- ignore:/api/products
- ignore:/api/products/{id}/reviews
- ignore:/static/bundle.js
Every path is prefixed with ignore: by default. This is intentional — it forces you to
explicitly opt in to each endpoint.
Automatic parameterization
The discover step detects path segments that vary across requests and replaces them with named parameters:
| Observed paths | Template |
|---|---|
| /api/users/42, /api/users/99 | /api/users/{id} |
| /api/orders/abc-def-123 | /api/orders/{id} |
UUID-like and numeric segments are detected automatically. The following patterns are also recognized:
- UPPER_CASE slugs: e.g. BTC_USDT, ETH_BTC
- Hex strings: segments starting with 0x
- Base58 identifiers: alphanumeric segments 20+ characters long
- Cross-request variability: segments with 3 or more distinct values across requests
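As a rough illustration of the single-segment checks in this list, the sketch below classifies one path segment at a time. The regexes are guesses based on the descriptions above (the function name looks_like_param is hypothetical), and the cross-request variability check is deliberately left out because it depends on how the tool groups sibling paths.

```python
import re

# Assumed approximations of the listed heuristics, not the tool's own regexes.
UPPER_SLUG_RE = re.compile(r"^[A-Z0-9]+(_[A-Z0-9]+)+$")  # BTC_USDT, ETH_BTC
HEX_RE = re.compile(r"^0x[0-9a-fA-F]+$")                 # segments starting with 0x
LONG_ALNUM_RE = re.compile(r"^[A-Za-z0-9]{20,}$")        # 20+ alphanumeric characters


def looks_like_param(segment):
    """Return True if a single path segment matches one of the patterns."""
    return any(
        regex.match(segment)
        for regex in (UPPER_SLUG_RE, HEX_RE, LONG_ALNUM_RE)
    )
```

Under these assumptions, looks_like_param("BTC_USDT") is true while looks_like_param("fair_price") is false, since lowercase words are treated as literal path segments.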
For patterns not covered by the built-in heuristics, use --param-regex to supply a custom
regex. Any path segment matching the regex is treated as a parameter:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--param-regex '[A-Z]{2,}_[A-Z]{2,}'
Step 2: Curate
Open the templates file in any text editor. For each path:
- Remove ignore: to include the endpoint in the generated spec
- Leave ignore: to exclude it
- Delete the line to exclude it permanently
# Before curation
x-path-templates:
- ignore:/api/users
- ignore:/api/users/{id}
- ignore:/static/bundle.js
# After curation
x-path-templates:
- /api/users
- /api/users/{id}
- ignore:/static/bundle.js
You can also edit parameter names. The default {id} placeholder can be renamed to
something more descriptive like {userId}:
- /api/users/{userId}
Automating curation with glob filters
For CI pipelines or large captures, manual curation is impractical. Use --include-patterns
and --exclude-patterns during the discover step instead:
mitm2openapi discover \
-i capture.flow \
-o templates.yaml \
-p "https://api.example.com" \
--include-patterns '/api/**' \
--exclude-patterns '/static/**,*.css,*.js'
Paths matching --include-patterns are emitted without the ignore: prefix (auto-activated).
Paths matching --exclude-patterns are dropped entirely. Everything else gets ignore: for
manual review.
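The three outcomes (auto-activated, dropped, left for review) can be pictured with a small Python sketch. It is an approximation under stated assumptions: Python's fnmatchcase lets * cross "/" boundaries, so it only loosely mimics ** globs, and checking exclude patterns before include patterns is an assumption about precedence, not documented behaviour. The function name classify is hypothetical.

```python
from fnmatch import fnmatchcase


def classify(path, include, exclude):
    """Return the templates-file entry for a path, or None if it is dropped.

    Assumption: exclude patterns are checked before include patterns.
    """
    if any(fnmatchcase(path, pat) for pat in exclude):
        return None  # dropped entirely
    if any(fnmatchcase(path, pat) for pat in include):
        return path  # auto-activated, no ignore: prefix
    return f"ignore:{path}"  # left for manual review


include = ["/api/**"]
exclude = ["/static/**", "*.css", "*.js"]
for p in ["/api/users", "/static/bundle.js", "/health"]:
    print(p, "->", classify(p, include, exclude))
```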
See filtering endpoints for the full glob syntax.
Step 3: Generate
The generate command re-reads the traffic capture and produces an OpenAPI spec using the
curated templates as a guide:
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com"
What happens internally
- The templates file is loaded and the ignore: entries are filtered out
- Each template path is compiled into a regex for matching
- The traffic capture is streamed again, matching each request against the templates
- For each matched request:
  - Path parameters are extracted
  - Query parameters are collected
  - Request body schema is inferred (JSON, form data)
  - Response status code and body schema are recorded
- When multiple requests match the same template, their schemas are merged:
  - Different status codes (200, 400, 404) produce separate response entries
  - Request body is taken from the first observation; subsequent same-endpoint observations only contribute response schemas
- The final OpenAPI 3.0 document is written as YAML
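The first steps of this list (filtering ignore: entries, compiling templates to regexes, extracting path parameters) might look roughly like the Python sketch below. The helper names are hypothetical, the templates file is parsed line by line to stay dependency-free, and "first matching template wins" is an assumption about ordering.

```python
import re


def load_active(lines):
    """Keep "- <template>" entries that are not marked ignore:."""
    entries = [ln.strip()[2:] for ln in lines if ln.strip().startswith("- ")]
    return [e for e in entries if not e.startswith("ignore:")]


def compile_template(template):
    """Turn /api/users/{id} into a regex with one named group per parameter."""
    # re.split with a capture group alternates literal text and parameter names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = "".join(
        f"(?P<{part}>[^/]+)" if i % 2 else re.escape(part)
        for i, part in enumerate(parts)
    )
    return re.compile(f"^{pattern}$")


def match_request(path, templates):
    """Return (template, path_params) for the first matching template, else None."""
    for template in templates:
        m = compile_template(template).match(path)
        if m:
            return template, m.groupdict()
    return None
```

With the curated file from Step 2, a request for /api/users/42 would match /api/users/{userId} and yield the parameter userId with the value "42", while /static/bundle.js would match nothing because its entry is still ignored.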
Customizing output
The generate command accepts several options to tune the output:
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com" \
--openapi-title "My API" \
--openapi-version "2.0.0" \
--exclude-headers "authorization,cookie" \
--ignore-images
See the CLI reference for all available options.
Response and request examples
The generate step captures actual request and response bodies as named examples in the
OpenAPI spec. Each unique response per endpoint and status code is stored as a separate
example, up to the limit set by --max-examples (default: 5).
When multiple requests hit the same endpoint with different request bodies, the schemas are
merged using oneOf to represent all observed variants.
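To make the example cap concrete, here is a minimal Python sketch of collecting unique bodies for one endpoint and status code. The function name and the example1, example2, ... key names are hypothetical; only the limit of 5 comes from the documented --max-examples default.

```python
import json


def collect_examples(bodies, max_examples=5):
    """Keep the first max_examples distinct bodies as named examples."""
    examples = {}
    seen = set()
    for body in bodies:
        key = json.dumps(body, sort_keys=True)  # canonical form for deduplication
        if key in seen:
            continue
        if len(examples) >= max_examples:
            break  # limit reached, later bodies are not stored
        seen.add(key)
        examples[f"example{len(examples) + 1}"] = body  # hypothetical naming
    return examples
```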
Redacting sensitive data
Production captures often contain tokens, passwords, or PII. Use --redact-patterns and
--redact-fields to scrub sensitive values from examples before they land in the spec:
mitm2openapi generate \
-i capture.flow \
-t templates.yaml \
-o openapi.yaml \
-p "https://api.example.com" \
--redact-patterns 'eyJ[\w-]+' \
--redact-patterns 'sk-[a-zA-Z0-9]+' \
--redact-fields 'password,token,secret,authorization'
--redact-patterns takes one regex per flag — repeat the flag for multiple patterns.
Regexes with quantifiers like {32,} work correctly.
--redact-fields accepts comma-separated field names whose values are replaced with
"[REDACTED]".
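Conceptually, the two flags combine as in the Python sketch below, which walks a decoded JSON body. It is a sketch under assumptions: matching field names case-insensitively and replacing regex matches with the same "[REDACTED]" marker are guesses, and the function name redact is hypothetical.

```python
import re

REDACTED = "[REDACTED]"


def redact(value, patterns, fields):
    """Recursively scrub a decoded JSON value.

    Assumptions: field names match case-insensitively, and regex matches
    inside strings are replaced with the same "[REDACTED]" marker.
    """
    if isinstance(value, dict):
        return {
            k: REDACTED if k.lower() in fields else redact(v, patterns, fields)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v, patterns, fields) for v in value]
    if isinstance(value, str):
        for pattern in patterns:
            value = pattern.sub(REDACTED, value)
        return value
    return value  # numbers, booleans, and null pass through unchanged


patterns = [re.compile(r"eyJ[\w-]+"), re.compile(r"sk-[a-zA-Z0-9]+")]
fields = {"password", "token", "secret", "authorization"}
print(redact({"user": "bob", "password": "hunter2"}, patterns, fields))
```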
Filtering OPTIONS requests
Both discover and generate accept --skip-options to exclude HTTP OPTIONS requests
(typically CORS preflight) from processing:
mitm2openapi discover -i capture.flow -o templates.yaml -p "https://api.example.com" --skip-options
Worked example
Starting from a mitmproxy capture of a pet store API:
# Discover all endpoints under the API prefix
mitm2openapi discover \
-i petstore.flow \
-o templates.yaml \
-p "http://petstore:8080" \
--exclude-patterns '/static/**' \
--include-patterns '/api/**'
# Templates file now has API paths auto-activated:
# - /api/v3/pet
# - /api/v3/pet/{id}
# - /api/v3/pet/findByStatus
# - /api/v3/store/inventory
# - ignore:/static/swagger-ui.css
# Generate the spec
mitm2openapi generate \
-i petstore.flow \
-t templates.yaml \
-o openapi.yaml \
-p "http://petstore:8080"
# Result: openapi.yaml with paths, methods, schemas
The generated openapi.yaml is a valid OpenAPI 3.0 document that can be opened in
Swagger UI, imported into Postman, or used
as a contract for API testing.
Generating stable operationIds
Use --operation-id-strategy path to generate camelCase operationIds that openapi-generator converts to readable Rust method names:
mitm2openapi generate -i capture.har -t templates.yaml -o openapi.yaml -p https://api.example.com \
--operation-id-strategy path
This produces ids like listUsers, getUser, createOrder, placeOrder.
Override specific operations with a YAML file:
# overrides.yaml
"GET /api/v1/contract/fair_price/{symbol}": getFairPrice
"POST /api/v1/private/order/place": placeOrder
mitm2openapi generate ... --operation-id-strategy path --operation-id-overrides overrides.yaml
Organizing operations with tags
Tags group operations into modules (one Rust source file per tag in openapi-generator). Use regex-based rules:
# tag-rules.yaml
rules:
- match: "^/api/v1/contract/"
tag: Contract
- match: "^/api/v1/private/"
tag: Private
default: Market
mitm2openapi generate ... --tag-rules tag-rules.yaml
Or use a fixed path segment as the tag:
mitm2openapi generate ... --tag-strategy path-segment --tag-segment-index 2
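For the regex-based rules, the lookup can be pictured as the Python sketch below. It mirrors tag-rules.yaml as plain Python data to avoid a YAML dependency; trying rules in order with the first match winning is an assumption, and tag_for is a hypothetical name.

```python
import re

# Mirrors tag-rules.yaml above, written as Python data to stay dependency-free.
RULES = [
    (re.compile(r"^/api/v1/contract/"), "Contract"),
    (re.compile(r"^/api/v1/private/"), "Private"),
]
DEFAULT_TAG = "Market"


def tag_for(path):
    """Assumption: rules are tried in order and the first match wins."""
    for regex, tag in RULES:
        if regex.search(path):
            return tag
    return DEFAULT_TAG
```

Under these rules, /api/v1/private/order/place is tagged Private, and any path that matches no rule falls back to Market.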
MEXC-style envelope APIs
MEXC and similar exchange APIs always return HTTP 200 with a success boolean:
{"success": true, "data": {"price": 42000.5}}
{"success": false, "code": 1, "message": "Invalid symbol"}
Use --envelope-discriminator to split captured bodies into typed schemas:
mitm2openapi generate \
-i capture.har -t templates.yaml -o openapi.yaml \
-p https://api.example.com \
--operation-id-strategy path \
--tag-rules tag-rules.yaml \
--envelope-discriminator success
The generated spec will include:
- A shared components/schemas/ApiError (inferred from all error bodies)
- Per-operation {OperationId}Success schemas
- oneOf(SuccessSchema, ApiError) for operations with mixed bodies
Supply your own error schema instead of inferring:
mitm2openapi generate ... \
--envelope-discriminator success \
--envelope-error-shape api-error.yaml
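The idea behind --envelope-discriminator success can be sketched in Python as a partition of captured bodies by the boolean field, after which each group would feed its own schema inference. How the tool treats bodies without the field is not documented here, so skipping them is an assumption; the function name is hypothetical.

```python
import json


def split_by_discriminator(bodies, field="success"):
    """Partition raw JSON bodies into (successes, errors) by a boolean field.

    Assumption: bodies lacking the field, or with a non-boolean value,
    are left out of both groups.
    """
    successes, errors = [], []
    for raw in bodies:
        body = json.loads(raw)
        flag = body.get(field) if isinstance(body, dict) else None
        if flag is True:
            successes.append(body)
        elif flag is False:
            errors.append(body)
    return successes, errors


ok, failed = split_by_discriminator([
    '{"success": true, "data": {"price": 42000.5}}',
    '{"success": false, "code": 1, "message": "Invalid symbol"}',
])
```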
Enriching generated specs
Auto-generated summaries like GET /api/v1/contract/fair_price/{symbol} aren't ideal for documentation or SDK generation. Use --enrichments to apply a YAML overlay with human-written metadata:
# enrichments.yaml
info:
description: |
Reverse-engineered MEXC web API.
Source: captured browser traffic.
operations:
getFairPrice:
summary: Get fair price for a futures contract
description: |
Returns the mark price used for liquidation calculations.
x-requires-auth: false
x-rate-limit: "10/s"
responses:
"200":
description: Fair price payload
getAssets:
summary: List futures account balances
x-requires-auth: true
components:
schemas:
ApiError:
description: |
MEXC envelope error response.
HTTP status is always 200; failure is signalled by success=false.
mitm2openapi generate \
-i capture.har -t templates.yaml -o openapi.yaml \
-p https://api.example.com \
--operation-id-strategy path \
--enrichments enrichments.yaml
Merge semantics
| Scope | Rule |
|---|---|
| info.* | Overlay wins per-key (title, description, version) |
| operations.<opId>.summary, description, deprecated | Overlay wins |
| operations.<opId>.tags | Overlay replaces entire list |
| operations.<opId>.x-* | Passed through verbatim |
| operations.<opId>.responses.<status>.description | Overlay wins |
| components.schemas.<name>.description | Overlay wins (properties/type untouched) |
| Operation in overlay but not in spec | Warning (error under --strict) |
| Operation in spec but not in overlay | Left untouched |
Note: operationIds in the overlay must match the final IDs after collision resolution. If two operations produce the same base ID, one gets a _2 suffix. Run generate once without --enrichments to see the resolved IDs.
The --enrichments flag requires --operation-id-strategy to be set (not none), since the overlay keys operations by operationId.
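The operation-level rows of the merge table can be read as the following Python sketch over plain dicts keyed by operationId. It is an illustration only: the function names are hypothetical, and patching only response statuses that already exist in the generated spec is an assumption.

```python
def enrich_operation(operation, overlay):
    """Apply one overlay entry to one generated operation (both plain dicts)."""
    merged = dict(operation)
    for key, value in overlay.items():
        if key in ("summary", "description", "deprecated", "tags"):
            merged[key] = value  # overlay wins; tags replace the whole list
        elif key.startswith("x-"):
            merged[key] = value  # extensions pass through verbatim
        elif key == "responses":
            responses = {s: dict(r) for s, r in merged.get("responses", {}).items()}
            for status, patch in value.items():
                # Assumption: only statuses already in the spec are patched.
                if status in responses and "description" in patch:
                    responses[status]["description"] = patch["description"]
            merged["responses"] = responses
    return merged


def enrich(operations, overlay_ops, strict=False):
    """Merge overlay entries into operations; report overlay-only operationIds."""
    unknown = sorted(set(overlay_ops) - set(operations))
    if unknown and strict:
        raise ValueError(f"overlay references unknown operations: {unknown}")
    result = {
        op_id: enrich_operation(op, overlay_ops.get(op_id, {}))
        for op_id, op in operations.items()
    }
    return result, unknown  # callers would log a warning for each unknown id
```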