feat(doccano-django): keploy compat lane sample + Python line coverage gate #101

Open
AkashKumar7902 wants to merge 8 commits into main from feat/doccano-django-sample

AkashKumar7902 commented Apr 30, 2026

Summary

Adds a new doccano-django/ sample that owns end-to-end orchestration (compose / bootstrap / traffic / noise filter / coverage) for the doccano + postgres-v3 compat lane. The keploy CI lanes in keploy/integrations and keploy/enterprise consume it as a thin wrapper.

The sample also doubles as the falsifying reproducer for the postgres-v3 simple-Query bind-collapse bug (keploy/integrations#177, issue #178).

Layout

doccano-django/
├── Dockerfile             # FROM doccano/doccano:backend (base; uninstrumented)
├── Dockerfile.coverage    # extends base, adds coverage[toml] + auto-start .pth
├── docker-compose.yml     # postgres-13 + doccano backend, fixed subnet, env-driven
├── docker-compose.coverage.yml  # overlay; sets COVERAGE_PROCESS_START + /coverage mount
├── .coveragerc            # parallel + sigterm; source = /backend
├── flow.sh                # bootstrap | record-traffic | coverage
├── keploy.yml.template    # globalNoise for body.created_at/updated_at, header.Date/Expires
└── README.md              # contract + run modes

Coverage architecture

Real Python line coverage via coverage.py 7.6.1, validated locally end-to-end at 59% (2150/3660 lines) on the existing record-traffic surface.

Critical: the base Dockerfile and docker-compose.yml are untouched. Coverage instrumentation lives in a separate Dockerfile.coverage + docker-compose.coverage.yml overlay, applied only by the standalone GH Actions coverage workflow. The keploy/integrations and keploy/enterprise CI lanes consume the base compose unchanged and pay zero coverage-instrumentation cost.

How it works: the overlay installs coverage[toml] and drops a coverage_subprocess.pth file into site-packages so each forked gunicorn worker auto-starts coverage.process_startup(). .coveragerc sets parallel = true + sigterm = true so per-worker .coverage.<pid> files flush on graceful shutdown. flow.sh coverage shells into the running backend container, runs coverage combine + coverage report, and emits Covered N/M (XX.X%) for the GH Actions gate.
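
As a rough illustration of the auto-start hook described above, the sketch below writes a `.pth` file of the kind coverage.py documents for subprocess measurement. The file name comes from this PR; the one-line body and the `install_pth` helper are assumptions, since the contents of Dockerfile.coverage are not shown here.

```python
import tempfile
from pathlib import Path

# File name as given in this PR.
PTH_NAME = "coverage_subprocess.pth"
# Assumption: the one-liner coverage.py's docs suggest for subprocess
# measurement; the PR's actual file body is not shown.
PTH_BODY = "import coverage; coverage.process_startup()\n"


def install_pth(site_packages: Path) -> Path:
    """Write the auto-start hook into a site-packages-like directory."""
    target = site_packages / PTH_NAME
    target.write_text(PTH_BODY, encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demonstrate against a throwaway directory instead of a real site-packages.
    with tempfile.TemporaryDirectory() as tmp:
        path = install_pth(Path(tmp))
        print(path.name, "->", path.read_text(encoding="utf-8").strip())
```

Python executes `import` lines found in `.pth` files at interpreter startup, and `coverage.process_startup()` only begins measuring when `COVERAGE_PROCESS_START` is set, which is why the overlay sets that variable.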

flow.sh coverage against the base image (no overlay) is a graceful no-op, so enterprise lanes' flow.sh coverage || true informational calls keep working.

Coverage gate

.github/workflows/doccano-django.yml triggers ONLY on changes under doccano-django/**. It runs build (PR HEAD) and release (base ref) coverage in parallel and fails the PR if the build's line coverage drops more than 1.0pp below release. The threshold is overridable via vars.DOCCANO_COVERAGE_THRESHOLD, and the reports are attached as artifacts.
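
The gate rule can be sketched in a few lines of Python. This is a minimal sketch of the comparison only, not the workflow's actual script; the function names `coverage_drop` and `gate_passes` are hypothetical.

```python
def coverage_drop(build_pct: float, release_pct: float) -> float:
    """Percentage points by which the PR build trails the release baseline."""
    return release_pct - build_pct


def gate_passes(build_pct: float, release_pct: float, threshold_pp: float = 1.0) -> bool:
    """Pass unless the build drops more than threshold_pp below release."""
    return coverage_drop(build_pct, release_pct) <= threshold_pp


if __name__ == "__main__":
    # Figures reported on this PR: base 0.0%, this PR 59.0%.
    print(gate_passes(59.0, 0.0))
```

A drop of exactly the threshold passes under this reading of "drops more than 1.0pp"; whether the real gate treats the boundary the same way is an assumption.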

Run modes

  • Smoke check (without keploy): docker compose up -d && bash flow.sh bootstrap && bash flow.sh record-traffic — exactly what the keploy CI lanes wrap.
  • Real coverage (without keploy): apply the overlay, run bootstrap + record-traffic + SIGTERM the backend + flow.sh coverage.
  • With keploy: lane scripts in keploy/integrations and keploy/enterprise wrap docker compose up in keploy record / keploy test against the base compose.

See README for full commands.

Consumers

  • keploy/integrations .woodpecker/doccano-postgres.yml — three-cell record/replay matrix that falsifies the postgres-v3 bind-collapse bug.
  • keploy/enterprise .woodpecker/doccano-linux.yml — same three-cell matrix wired to the enterprise compat-matrix harness.

Test plan

  • docker compose up -d boots postgres + doccano cleanly
  • flow.sh bootstrap 240 returns admin + fixed token within 240s
  • flow.sh record-traffic runs the 16+ call /v1/me warmup hammer, then exercises the project / metrics / labels surface
  • Coverage validated locally end-to-end: 59% line coverage of /backend
  • flow.sh coverage against base compose exits 0 cleanly with INFO message

AkashKumar7902 force-pushed the feat/doccano-django-sample branch from 9d9f78f to 6f19648 on April 30, 2026 23:24
… regression)

Minimum reproducer for the polymorphic-resourcetype failure that
motivated keploy/integrations#177. Wraps doccano v1.8.5 +
django-rest-polymorphic + postgres 13.3-alpine — the same shape
the bug originally surfaced on (keploy/enterprise PRs #1889 / #1964,
pipelines 3556 / 3572).

Per the keploy-ci-debug skill, the sample owns ALL orchestration
the lane scripts in keploy/integrations and keploy/enterprise need:
the docker-compose, the admin-bootstrap flow, the API traffic loop,
the noise filter (via keploy.yml.template), and a coverage-report
helper. Future lanes that exercise the same backend re-use this
directory; they don't redefine compose / bootstrap / traffic in
their own scripts. The intent is to migrate
enterprise/.ci/scripts/doccano-linux.sh from its current ~400-line
inlined-everything shape down to a thin "clone sample → wrap in
keploy → assert" wrapper in a follow-up PR.

Layout:

* `Dockerfile` — `FROM doccano/doccano:backend`. Wrapper exists
  so a future doccano patch (or a backport of an upstream fix that
  changes the bug-triggering shape) is a one-line edit here, not
  scattered across lane scripts.
* `docker-compose.yml` — postgres + doccano backend on a fixed
  subnet, every name fully env-driven (DOCCANO_BACKEND_CONTAINER /
  DOCCANO_DB_CONTAINER / DOCCANO_APP_PORT / DOCCANO_DB_IP /
  DOCCANO_NETWORK_SUBNET). Lane scripts running multiple matrix
  cells in parallel pass per-cell values so the cells don't
  collide on container names. Two-phase boot
  (DOCCANO_SKIP_BOOTSTRAP=0 → migrations + admin; named volume
  retained; DOCCANO_SKIP_BOOTSTRAP=1 → gunicorn-only against the
  populated volume) so record/replay see a deterministic state.
* `flow.sh` — four subcommands:
    bootstrap      — log in as admin, install the deterministic
                     authtoken_token row so record-time and
                     replay-time Authorization headers match.
    record-traffic — drive the API: 16-call /v1/me warmup hammer
                     (gunicorn worker contenttypes-cache warmup,
                     necessary for the SIGINT-driven shutdown
                     pattern lanes use), POST a polymorphic
                     TextClassificationProject, GET / PATCH it,
                     plus dependent category-types / examples /
                     categories / metrics reads that exercise the
                     multi-bind django_content_type lookups the
                     fix targets. Fire-and-forget; keploy is the
                     assertion layer at replay.
    coverage       — walk the running backend's URL resolver
                     (introspecting actual served methods, not
                     Django's permissive http_method_names default)
                     and the just-recorded keploy/test-set-*
                     tests; emit a (method, path) coverage
                     percentage for the v1/projects + accessory
                     surface.
    list-routes    — print the route table the coverage report
                     uses as its denominator (diagnostic).
* `keploy.yml.template` — globalNoise filter for the inherently
  non-deterministic fields (Date/Expires headers, created_at/
  updated_at body fields). Centralised here so a future doccano
  version that adds another auto-timestamp field is one edit
  rather than a fan-out across lane scripts. Lane scripts
  envsubst this template into the per-cell run dir.
* `README.md` — bug shape, local-run instructions, lane pointers.
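
To picture how per-cell values keep parallel matrix cells from colliding, here is a small Python sketch. The environment variable names are the ones listed above; the name suffixes, port offsets, 172.x subnets, and the `cell_env` helper are hypothetical values chosen for illustration, not what the lane scripts actually pass.

```python
import ipaddress
from typing import Dict


def cell_env(cell: int, base_port: int = 8000) -> Dict[str, str]:
    """Build a distinct compose environment for one matrix cell."""
    if not 0 <= cell < 100:
        raise ValueError("cell index out of the range this sketch supports")
    # One /24 per cell so fixed IPs never overlap between cells.
    subnet = ipaddress.ip_network(f"172.{30 + cell}.0.0/24")
    db_ip = subnet.network_address + 10
    return {
        "DOCCANO_BACKEND_CONTAINER": f"doccano_backend_{cell}",
        "DOCCANO_DB_CONTAINER": f"doccano_db_{cell}",
        "DOCCANO_APP_PORT": str(base_port + cell),
        "DOCCANO_DB_IP": str(db_ip),
        "DOCCANO_NETWORK_SUBNET": str(subnet),
    }


if __name__ == "__main__":
    for i in range(3):
        print(cell_env(i))
```

Each dictionary could be merged into the environment of a `docker compose up` call for that cell, so every container name, port, and subnet differs across cells.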

Sample is keploy-independent: `docker compose up && bash flow.sh
bootstrap && bash flow.sh record-traffic` works against bare
doccano. Verified locally: 25/25 calls return expected status,
polymorphic resourcetype is `TextClassificationProject` end-to-end.
The route walker emits 144 (method, path) pairs for the v1/projects
+ /v1/me + /v1/users + /v1/health + /v1/auth surface; coverage
matching against synthetic recorded tests rounds correctly.

Lanes that pin to this sample (pinned to the
feat/doccano-django-sample branch via --branch until this PR
merges):

* keploy/integrations `.woodpecker/doccano-postgres.yml` —
  three-way matrix (record-build × replay-build, record-latest ×
  replay-build, record-build × replay-latest); depends_on
  prepare-and-run.
* keploy/enterprise `.woodpecker/doccano-linux.yml` — being
  migrated to consume this sample in a follow-up PR; today still
  uses inline compose generation.
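
The three-way matrix above can be enumerated as every record/replay pairing that includes the build under test. The sketch below is illustrative only; the filtering rule is an inference from the three cells listed, and `matrix_cells` is a hypothetical helper.

```python
from itertools import product
from typing import List, Tuple

VARIANTS = ("build", "latest")


def matrix_cells() -> List[Tuple[str, str]]:
    """Return (record, replay) pairs, skipping the latest x latest pairing."""
    return [
        (record, replay)
        for record, replay in product(VARIANTS, repeat=2)
        if "build" in (record, replay)
    ]


if __name__ == "__main__":
    for record, replay in matrix_cells():
        print(f"record-{record} x replay-{replay}")
```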

Signed-off-by: Akash Kumar <meakash7902@gmail.com>
AkashKumar7902 force-pushed the feat/doccano-django-sample branch from 6f19648 to 2cc92c3 on April 30, 2026 23:57
Pipeline 3597 / 909 (post-compose-render fix) failed at:

  Container ... Error dependency postgres failed to start

Misleading. Real cause: doccano_record_traffic fired its very
first POST /v1/projects against a backend whose port was open but
gunicorn was still booting; the 5xx response failed `curl -fsS`,
set -e killed the script silently, the lane saw a zero-second
"traffic done", SIGINTed keploy ~3s later, and the recording
captured nothing. The "dependency postgres failed" line in the
log is downstream noise from the SIGINT compose-down.

Fix: gate doccano_record_traffic on doccano_wait_for_fixed_token
before any curl fires. /v1/me with the fixed Authorization
header is a stronger readiness signal than wait_for_port: it
proves gunicorn is past boot, auth is wired, the named-volume
token is loaded, and the DB is responsive — all four guarantees
the first POST needs.
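
The readiness gate described above amounts to polling an authenticated endpoint until it answers, rather than trusting an open port. Below is a Python sketch of that pattern, not the flow.sh implementation: `wait_until_ready` and `me_probe` are hypothetical names, and the `Token <key>` header scheme is an assumption based on Django REST framework's token authentication.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def me_probe(base_url: str, token: str) -> bool:
    """Return True only when GET /v1/me answers 200 with the fixed token."""
    request = urllib.request.Request(
        f"{base_url}/v1/me",
        headers={"Authorization": f"Token {token}"},  # assumed header scheme
    )
    try:
        with urllib.request.urlopen(request, timeout=5) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, 4xx/5xx, or timeout: backend is not ready yet.
        return False


def wait_until_ready(
    probe: Callable[[], bool],
    timeout_s: float = 240.0,
    interval_s: float = 2.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe() until it succeeds or the timeout elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A caller would pass something like `lambda: me_probe("http://localhost:8000", token)` as the probe and refuse to fire any traffic when the function returns False.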

Lane scripts can keep their own port-level wait (wait_for_port),
but the sample's flow.sh now refuses to fire traffic until the
backend is genuinely serving. Local smoke-test pattern is
unchanged: bootstrap + record-traffic still work standalone.

Signed-off-by: Akash Kumar <meakash7902@gmail.com>
Copilot AI review requested due to automatic review settings May 1, 2026 00:24

Copilot AI left a comment

Pull request overview

Adds a new doccano-django/ sample app intended to act as a minimal, deterministic reproducer for Keploy’s postgres-v3 simple-Query bind regression (using upstream doccano backend + Postgres).

Changes:

  • Introduces a docker-compose orchestration for doccano backend + postgres on a fixed subnet.
  • Adds flow.sh to bootstrap a deterministic auth token, drive the minimal HTTP traffic, and report route coverage.
  • Adds documentation and a Keploy noise-filter template to reduce timestamp-related diffs.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 11 comments.

| File | Description |
| --- | --- |
| doccano-django/Dockerfile | Wrapper around doccano backend image (intended version pin). |
| doccano-django/docker-compose.yml | Compose stack for doccano backend + postgres with fixed IP/subnet support. |
| doccano-django/flow.sh | Bootstrap/token setup, traffic driver, and coverage reporting utilities. |
| doccano-django/keploy.yml.template | Shared Keploy noise configuration for nondeterministic timestamp fields. |
| doccano-django/README.md | Explains the regression shape and provides local run instructions + lane references. |

Comment thread doccano-django/flow.sh Outdated
Comment on lines +301 to +314
local f method route
while IFS= read -r f; do
method=$(awk '/^ method:/{print $2; exit}' "$f")
route=$(awk '/^ url:/{print $2; exit}' "$f")
route="${route%%\?*}"
case "$route" in
http://*|https://*)
route="/${route#*://*/}"
;;
esac
if [ -n "$method" ] && [ -n "$route" ]; then
echo "$method $route"
fi
done < <(find keploy -type f -path '*/tests/*.yaml' 2>/dev/null) | sort -u

Copilot AI May 1, 2026

doccano_list_recorded_routes searches for method: and url: lines with exactly 4 leading spaces, but Keploy-generated Http testcase YAMLs in this repo use deeper indentation (e.g. spec.req.method is indented 8 spaces). As a result, method/route will often be empty and the coverage report will incorrectly show 0 covered routes. Consider matching ^\s*method: / ^\s*url: or parsing the YAML more robustly (e.g. with yq).
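
The indentation-tolerant matching suggested here can be pictured with a short Python sketch. It is not the flow.sh code; the sample document, the regexes, and the `recorded_route` helper are hypothetical, and a real YAML parser would be more robust than line matching.

```python
import re
from typing import Optional, Tuple
from urllib.parse import urlsplit

# Match the key at any indentation depth instead of a fixed number of spaces.
METHOD_RE = re.compile(r"^[ \t]*method:[ \t]*(\S+)", re.MULTILINE)
URL_RE = re.compile(r"^[ \t]*url:[ \t]*(\S+)", re.MULTILINE)

# Hypothetical test-case fragment with deeper indentation.
SAMPLE = """\
kind: Http
spec:
    req:
        method: GET
        url: http://localhost:8000/v1/projects?limit=5
"""


def recorded_route(yaml_text: str) -> Optional[Tuple[str, str]]:
    """Return (method, path) from the first method/url pair, or None."""
    method = METHOD_RE.search(yaml_text)
    url = URL_RE.search(yaml_text)
    if method is None or url is None:
        return None
    # Drop scheme, host, and query string so only the route path remains.
    path = urlsplit(url.group(1)).path or "/"
    return method.group(1), path


if __name__ == "__main__":
    print(recorded_route(SAMPLE))
```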

Comment thread doccano-django/flow.sh Outdated
Comment on lines +247 to +254
methods = set()
# Generic / mixin DRF views: handler methods named after HTTP
# verbs (get / post / put / patch / delete / head / options).
# Filter to ones the class itself defines (or any non-base
# ancestor — `not in object`s vars is too narrow because a
# mixin like ListModelMixin defines `list`, not `get`).
for m in ("get", "post", "put", "patch", "delete"):
if hasattr(view, m):

Copilot AI May 1, 2026

The comment says the route table “reports only methods the view class actually overrides”, but actual_methods uses hasattr(view, m) which will include inherited handlers from base classes (potentially inflating the denominator with methods the view doesn’t implement). Either update the comment or tighten the check (e.g., inspect view.__mro__ / __dict__ to ensure the method is defined on a non-base ancestor).

Suggested change
methods = set()
# Generic / mixin DRF views: handler methods named after HTTP
# verbs (get / post / put / patch / delete / head / options).
# Filter to ones the class itself defines (or any non-base
# ancestor — `not in object`s vars is too narrow because a
# mixin like ListModelMixin defines `list`, not `get`).
for m in ("get", "post", "put", "patch", "delete"):
if hasattr(view, m):
def defines_method_in_mro(cls, method_name):
for base in getattr(cls, "__mro__", (cls,)):
if base is object:
break
if method_name in getattr(base, "__dict__", {}):
return True
return False
methods = set()
# Generic / mixin DRF views: handler methods named after HTTP
# verbs (get / post / put / patch / delete / head / options).
# Filter to ones explicitly defined on the class or a non-base
# ancestor in the MRO, rather than inherited via attribute
# lookup alone.
for m in ("get", "post", "put", "patch", "delete"):
if defines_method_in_mro(view, m):

Comment thread doccano-django/flow.sh Outdated

{
echo "================ doccano API coverage ================"
echo "Covered ${covered}/${total} (${method:+}${pct}%)"

Copilot AI May 1, 2026

The coverage summary line uses (${method:+}${pct}%). Because the alternate value in ${method:+} is empty, the expansion produces an empty string whether or not $method is set, so it contributes nothing to the output. It looks like a leftover and makes the line harder to read; print just ${pct}% (or add a separate field if you want to show the method).

Suggested change
echo "Covered ${covered}/${total} (${method:+}${pct}%)"
echo "Covered ${covered}/${total} (${pct}%)"

Comment thread doccano-django/Dockerfile
# v1.8.5 was the version exercised on keploy/enterprise pipeline 3556
# (PR #1889) and pipeline 3572 (PR #1964 minimal repro) where the
# bug originally manifested.
FROM doccano/doccano:backend

Copilot AI May 1, 2026

The Dockerfile claims the sample is pinned to doccano v1.8.5, but FROM doccano/doccano:backend is a rolling tag. This undermines reproducibility of the regression sample (base image contents can change over time). Pin the image to a specific tag that corresponds to v1.8.5 and/or a digest (@sha256:...) so the sample stays deterministic.

Suggested change
FROM doccano/doccano:backend
FROM doccano/doccano:1.8.5

Comment thread doccano-django/flow.sh Outdated
"v1/me",
"v1/users",
"v1/health",
"v1/auth",

Copilot AI May 1, 2026

The comment above says route filtering includes /v1/fp/* (filepond) endpoints, but SCOPE_PREFIXES doesn’t include v1/fp. If the intent is to include filepond endpoints in the coverage denominator, add that prefix; otherwise update the comment so the scope matches what’s actually enumerated.

Suggested change
"v1/auth",
"v1/auth",
"v1/fp",

Comment thread doccano-django/flow.sh Outdated
Comment on lines +330 to +333
# Only counted as "covered" if the recorded test passed at record
# time — a 5xx that landed in the test set still adds to the
# denominator, but its method-path pair counts as covered only if
# the response status was 2xx/3xx. Filter implemented inline below.

Copilot AI May 1, 2026

The coverage report comment says only requests with 2xx/3xx responses count as covered, but the implementation only checks whether a (method, path) appears in the recorded tests file and does not filter by resp.status_code. This makes coverage numbers misleading if any recorded tests captured 4xx/5xx responses. Either implement the status_code filter or adjust the comment/documentation to match the actual behavior.

Suggested change
# Only counted as "covered" if the recorded test passed at record
# time — a 5xx that landed in the test set still adds to the
# denominator, but its method-path pair counts as covered only if
# the response status was 2xx/3xx. Filter implemented inline below.
# Coverage is based only on whether a normalised method-path pair
# appears in the recorded test set. This report does not distinguish
# between 2xx/3xx and 4xx/5xx responses; any recorded match counts
# as covered.

Comment thread doccano-django/flow.sh Outdated
Comment on lines +340 to +345
echo "WARNING: could not enumerate doccano routes (is the backend container '${DOCCANO_BACKEND_CONTAINER:-doccano_backend}' running?)" >&2
rm -f "$routes_file" "$recorded_file"
return 0
fi
if [ ! -s "$routes_file" ]; then
echo "WARNING: route enumeration produced no rows; skipping coverage report" >&2

Copilot AI May 1, 2026

The echo "WARNING: ..." messages add new “WARNING” logs. If this output is consumed by CI/lane harnesses, it can be interpreted as a warning signal and the project guidelines indicate warnings shouldn’t be introduced. Consider emitting a clear error with a next step (e.g., “failed to enumerate routes; verify the backend container name via DOCCANO_BACKEND_CONTAINER”) or a neutral info/debug message without the “WARNING” prefix.

Suggested change
echo "WARNING: could not enumerate doccano routes (is the backend container '${DOCCANO_BACKEND_CONTAINER:-doccano_backend}' running?)" >&2
rm -f "$routes_file" "$recorded_file"
return 0
fi
if [ ! -s "$routes_file" ]; then
echo "WARNING: route enumeration produced no rows; skipping coverage report" >&2
echo "Skipping coverage report: failed to enumerate doccano routes; verify that backend container '${DOCCANO_BACKEND_CONTAINER:-doccano_backend}' is running and that DOCCANO_BACKEND_CONTAINER names the correct container." >&2
rm -f "$routes_file" "$recorded_file"
return 0
fi
if [ ! -s "$routes_file" ]; then
echo "Skipping coverage report: route enumeration returned no rows; verify that backend container '${DOCCANO_BACKEND_CONTAINER:-doccano_backend}' exposes routes and that DOCCANO_BACKEND_CONTAINER points to the correct container." >&2

Comment thread doccano-django/flow.sh Outdated
Comment on lines +340 to +345
echo "WARNING: could not enumerate doccano routes (is the backend container '${DOCCANO_BACKEND_CONTAINER:-doccano_backend}' running?)" >&2
rm -f "$routes_file" "$recorded_file"
return 0
fi
if [ ! -s "$routes_file" ]; then
echo "WARNING: route enumeration produced no rows; skipping coverage report" >&2

Copilot AI May 1, 2026

The echo "WARNING: ..." message here introduces a new warning-style log line. Prefer either a neutral message or a hard error with an actionable next step, depending on whether skipping the coverage report is expected in normal operation.

Suggested change
echo "WARNING: could not enumerate doccano routes (is the backend container '${DOCCANO_BACKEND_CONTAINER:-doccano_backend}' running?)" >&2
rm -f "$routes_file" "$recorded_file"
return 0
fi
if [ ! -s "$routes_file" ]; then
echo "WARNING: route enumeration produced no rows; skipping coverage report" >&2
echo "Skipping coverage report because doccano routes could not be enumerated. Start the backend container '${DOCCANO_BACKEND_CONTAINER:-doccano_backend}' and rerun the script if you need a coverage report." >&2
rm -f "$routes_file" "$recorded_file"
return 0
fi
if [ ! -s "$routes_file" ]; then
echo "Skipping coverage report because route enumeration produced no rows. Verify that the backend is serving routes and rerun the script if you expected coverage output." >&2

Comment thread doccano-django/README.md
Comment on lines +46 to +49
* `keploy.yml.template` — keploy config skeleton (proxy port, DNS
port, container name placeholders) that lane scripts in
`keploy/integrations` and `keploy/enterprise` `envsubst` into a
per-job copy.

Copilot AI May 1, 2026

README says keploy.yml.template is a “config skeleton (proxy port, DNS port, container name placeholders) … envsubst into a per-job copy”, but the checked-in template currently only contains test.globalNoise entries and no placeholders. Either update the README to describe it as a shared noise-filter snippet, or expand the template to match what the README claims lane scripts consume.

Suggested change
* `keploy.yml.template` — keploy config skeleton (proxy port, DNS
port, container name placeholders) that lane scripts in
`keploy/integrations` and `keploy/enterprise` `envsubst` into a
per-job copy.
* `keploy.yml.template` — shared keploy noise-filter snippet
(`test.globalNoise` entries) consumed by lane scripts in
`keploy/integrations` and `keploy/enterprise`; it is not a
placeholder-based per-job config skeleton.

Comment thread doccano-django/flow.sh
Comment on lines +51 to +58
while true; do
local code
code=$(curl -sS -o /tmp/doccano-login.json -w '%{http_code}' \
-H 'Content-Type: application/json' \
-X POST "${base}/v1/auth/login/" \
-d "{\"username\":\"${DOCCANO_ADMIN_USER}\",\"password\":\"${DOCCANO_ADMIN_PASSWORD}\"}" || true)
if [ "$code" = "200" ] && jq -e '.key' /tmp/doccano-login.json >/dev/null 2>&1; then
break

Copilot AI May 1, 2026

This function writes curl responses to fixed paths under /tmp (/tmp/doccano-login.json, /tmp/doccano-me.json). In CI matrix runs (or multiple local runs in parallel), concurrent invocations can clobber each other’s files and cause flaky jq/curl checks. Use mktemp for these files and ensure they’re cleaned up (e.g., via a trap).

Adds .github/workflows/doccano-django.yml — runs ONLY on changes
under doccano-django/ (or this workflow file) so unrelated
samples in this repo don't pay the doccano runtime cost.

Three jobs:

* `build-coverage` — checks out the PR's HEAD ref, brings up
  the sample's compose, drives flow.sh bootstrap +
  record-traffic with a per-call audit log enabled, runs
  flow.sh coverage. Captures the percentage as a job output.
* `release-coverage` — same end-to-end against
  github.event.pull_request.base.ref (typically main) so we
  have a baseline to compare against. Skipped on direct push
  events to main (no baseline to diff against — main IS the
  baseline).
* `coverage-gate` — fails the PR if build's coverage drops
  more than COVERAGE_THRESHOLD pp below release.
  COVERAGE_THRESHOLD defaults to 1.0pp; override with the
  `DOCCANO_COVERAGE_THRESHOLD` actions variable per-repo.
  Sticky-comments the PR with the diff via
  marocchino/sticky-pull-request-comment so reviewers see the
  delta inline.

The two measurement jobs share their body via
.github/workflows/scripts/run-and-measure.sh — same script,
different ref. Lifting it out of the YAML keeps the YAML focused
on orchestration (matrix / outputs / artifacts) and the bash on
the actual workflow logic.

Coverage source uses flow.sh's per-call audit log
(DOCCANO_FIRED_ROUTES_FILE). That makes the measurement genuinely
keploy-independent: the workflow doesn't run keploy at all,
doesn't compare against recorded test sets, just measures what
the sample's flow.sh ACTUALLY exercises against doccano's URL
resolver. Lane scripts in keploy/integrations and keploy/enterprise
consume the same flow.sh but use the keploy/test-set-*/tests/*.yaml
tree as their numerator (authoritative — only calls keploy actually
captured count). Both modes are wired into
flow.sh::doccano_list_recorded_routes via the
DOCCANO_FIRED_ROUTES_FILE fallback.

Sample-side changes:

* flow.sh::doccano_wait_for_fixed_token extracted as its own
  function (was inlined into doccano_bootstrap_token, broke
  doccano_record_traffic's forward reference and silently
  fail-fasted under set -e).
* flow.sh::doccano_record_traffic gates on
  doccano_wait_for_fixed_token before any curl fires —
  port-open isn't a sufficient readiness signal under
  SIGINT-driven shutdown, the very first curl -fsS POST would
  5xx on a still-booting gunicorn and silently kill the script.
* flow.sh::log_fired writes (METHOD, URL) to
  DOCCANO_FIRED_ROUTES_FILE before each curl in
  doccano_record_traffic. Cheap, optional (no-op when env var
  unset), and keeps the audit log adjacent to the curl that
  produces it so future contributors can't add a curl without
  also adding the log entry.
* flow.sh::doccano_list_recorded_routes falls back to the audit
  log when no keploy/test-set-*/tests/*.yaml exists — the
  standalone-mode numerator the workflow needs.

Verified locally: workflow body (`run-and-measure.sh`) runs
end-to-end against bare doccano in ~3 minutes, captures 16
unique (method, path) pairs, emits coverage=11.1% to
GITHUB_OUTPUT. The gate logic itself is plain bash + python3
arithmetic; no codecov/coveralls dependency, no hosted service
needed.

Signed-off-by: Akash Kumar <meakash7902@gmail.com>
Run 25196349264 (the very PR introducing doccano-django/) failed
in release-coverage with:

  An error occurred trying to start process '/usr/bin/bash' with
  working directory '.../doccano-django'. No such file or directory

Expected: the workflow checks out the PR's base ref to compute
the baseline coverage, but on the introducing PR there's no
baseline — `doccano-django/` doesn't exist on main yet.

Fix: a `detect` step inspects whether `doccano-django/flow.sh`
exists on the checked-out base ref. If yes, the measurement
runs as before. If no (first-PR-bootstrap case), an
`empty-baseline` step emits coverage=0.0 onto the job output,
the measurement step is skipped via `if:`, and the upload-
artifact step is also skipped (so we don't claim a non-existent
report file). The job's `outputs.coverage` falls back through
`||` so the gate sees 0.0 either way.
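
The detect-then-fallback logic can be summarised in a few lines of Python. This is a sketch of the decision only, under the assumption that the baseline is represented as a float; the workflow itself does this with step conditions and job outputs, and `baseline_coverage` is a hypothetical helper.

```python
import tempfile
from pathlib import Path
from typing import Callable


def baseline_coverage(checkout: Path, measure: Callable[[Path], float]) -> float:
    """Measure the base ref, or report 0.0 when the sample does not exist there."""
    sample_entrypoint = checkout / "doccano-django" / "flow.sh"
    if not sample_entrypoint.is_file():
        # First-PR-bootstrap case: nothing to measure on the base ref.
        return 0.0
    return measure(checkout)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # An empty checkout stands in for a base ref without the sample.
        print(baseline_coverage(Path(tmp), lambda path: 59.0))
```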

Net effect on the introducing PR: build's coverage (currently
~11%) is compared against 0%, gate trivially passes. After
this PR merges and a future PR edits doccano-django/, the
detect step finds the sample on main, real measurement runs,
real diff applies.

Signed-off-by: Akash Kumar <meakash7902@gmail.com>

github-actions Bot commented May 1, 2026

doccano-django sample coverage

| ref | coverage |
| --- | --- |
| base (main) | 0.0% |
| this PR | 59.0% |
Threshold: PR may not drop coverage by more than 1.0pp. Override per-repo via the DOCCANO_COVERAGE_THRESHOLD actions variable.

Replaces the prior API-route-surface "coverage" (which counted
fired routes / known routes — a proxy that read like real coverage
but didn't measure code execution) with actual line coverage via
coverage.py 7.6.1.

Architecture:
  - `Dockerfile.coverage` extends `doccano/doccano:backend` to
    install coverage[toml] and drop a `coverage_subprocess.pth`
    file into site-packages, so every gunicorn worker that forks
    auto-starts `coverage.process_startup()`.
  - `.coveragerc` runs in parallel mode (one .coverage.<pid> per
    worker) with sigterm = true so flushing happens on graceful
    shutdown.
  - `docker-compose.coverage.yml` is an OVERLAY: the GH Actions
    coverage workflow applies it via `-f docker-compose.yml -f
    docker-compose.coverage.yml`. The base `Dockerfile` and
    `docker-compose.yml` are untouched, so keploy/integrations and
    keploy/enterprise CI lanes consume the base compose and pay
    zero coverage-instrumentation cost.
  - `flow.sh::doccano_report_coverage` shells into the running
    backend, runs `coverage combine` + `coverage report
    --format=total`, emits `Covered N/M (XX.X%)` matching the
    helper script's regex. When called against the base image
    (no overlay) it prints "INFO: ... uninstrumented" and exits 0
    so enterprise lanes' `flow.sh coverage || true` informational
    calls keep working.
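
To make the `Covered N/M (XX.X%)` contract concrete, here is a Python sketch that formats and parses such a line. The helper script's actual regex is not shown in this PR, so `SUMMARY_RE`, `format_summary`, and `parse_summary` are assumptions about one shape that would match the documented format.

```python
import re
from typing import Optional, Tuple

# Assumed pattern for the documented "Covered N/M (XX.X%)" summary line.
SUMMARY_RE = re.compile(r"Covered (\d+)/(\d+) \((\d+\.\d)%\)")


def format_summary(covered: int, total: int) -> str:
    """Render the summary line with one decimal place."""
    pct = 100.0 * covered / total if total else 0.0
    return f"Covered {covered}/{total} ({pct:.1f}%)"


def parse_summary(text: str) -> Optional[Tuple[int, int, float]]:
    """Extract (covered, total, percent) from report output, or None."""
    match = SUMMARY_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


if __name__ == "__main__":
    line = format_summary(1, 4)
    print(line, parse_summary(line))
```

Returning None when no summary line is present mirrors the uninstrumented case, where only an INFO message is printed and there is no percentage to gate on.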

Removed:
  - `doccano_list_routes` (the Django URL-resolver walk).
  - `doccano_list_recorded_routes` (the keploy-tests / fired-routes
    reader).
  - The legacy route-surface `doccano_report_coverage` body.
  - `list-routes` subcommand (was diagnostic only for the surface
    metric).

Validated locally: e2e run produced `coverage=59.0` to
GITHUB_OUTPUT against a clean stack (gunicorn 4 workers, traffic
loop fired, SIGTERM flush, combine+report inside container).
59% reflects bootstrap + the sample's small traffic surface;
adding curls to flow.sh::doccano_record_traffic moves the
number up.

Signed-off-by: Akash Kumar <meakash7902@gmail.com>
Signed-off-by: Akash Kumar <meakash7902@gmail.com>
…y modes

Signed-off-by: Akash Kumar <meakash7902@gmail.com>
AkashKumar7902 changed the title from "feat: add doccano-django sample for keploy postgres-v3 bind regression" to "feat(doccano-django): keploy compat lane sample + Python line coverage gate" on May 1, 2026
Signed-off-by: Akash Kumar <meakash7902@gmail.com>