Interpreting Results#

gh-velocity produces output in four formats: pretty (default), JSON, markdown, and HTML. Each format contains the same data, structured for different consumers. This guide explains how to read the output, what healthy metrics look like, and what common patterns mean.

Reading your output#

Output formats at a glance#

FormatBest forFlag
PrettyTerminal reading, quick checks--results pretty (default)
JSONAgents, scripts, jq pipelines, CI--results json
MarkdownPasting into issues, PRs, Discussions--results markdown
HTMLSelf-contained report files--results html

All commands accept --results (or -r). If you omit it, you get pretty.

Three metric states#

Every timing metric (lead time, cycle time) can be in one of three states:

StatePretty outputJSONMeaning
Completed2d 4h"duration_seconds": 187200Work started and finished
In progressin progress"started_at": "...", "duration_seconds": nullWork started, clock still running
N/AN/A"started_at": null, "duration_seconds": nullNo start signal found

N/A usually means your cycle time strategy has no signal for that issue. Possible causes:

  • No lifecycle.in-progress.match configured — no label to look for, so every issue is N/A
  • Configured but no matching label event found — the label exists in config but was never applied to this issue
  • Issue is in backlog — matches a backlog label, excluded from cycle time
  • Negative cycle time was filtered — start after close, discarded as invalid

See Cycle Time Setup to fix this.

Output layers#

Every command produces up to four layers of output:

LayerWhat it containsExample
StatsAggregate numbers (mean, median, P90, count)Lead time median: 5d
DetailPer-item breakdownsIssue #42: 2d cycle time
InsightsHuman-readable observations with severity"Low classification coverage: 60% are unclassified"
ProvenanceHow the output was produced (command, config, scope)Reproducibility metadata

The report command shows stats only (one line per metric). Standalone commands like flow cycle-time and quality release show all four layers. In JSON, each layer is a top-level key.

Reading pretty output#

Pretty output is designed for a terminal. Here is an example from quality release:

Release  v1.2.0  owner/repo
Previous tag: v1.1.0 (14d ago)
Cadence: 14 days
Hotfix: no

Composition (12 issues):
  bug: 3 (25%)  feature: 7 (58%)  chore: 1 (8%)  other: 1 (8%)
  Bug ratio: 25%

  #   Title                          Lead     Cycle    Lag   Category
  ─── ────────────────────────────── ──────── ──────── ───── ────────
  42  Fix auth timeout               2d       1d       0d    bug
  38  Add dark mode                  45d      5d       3d    feature
  51  Upgrade deps                   120d     1d       0d    chore      OUTLIER
  ...

Aggregates:
  Lead time   — median: 8d, mean: 22d, P90: 45d, P95: 120d, stddev: 35d
  Cycle time  — median: 2d, mean: 3d, P90: 5d, P95: 5d
  Release lag — median: 1d, mean: 2d
  Outliers: 1 (threshold: 98d)

Key elements:

  • Per-issue rows -- individual metrics per issue. Statistical outliers are marked OUTLIER.
  • Aggregates -- distribution summary. Median is the primary number; mean is shown for comparison.
  • Composition -- category breakdown and bug ratio.
  • Cadence and hotfix -- release rhythm.

Reading JSON output#

JSON is the richest format -- every pretty-output field is present, plus additional fields for programmatic use.

gh velocity quality release v1.2.0 --results json | jq '.aggregates.lead_time'
{
  "count": 12,
  "mean_seconds": 1900800,
  "median_seconds": 691200,
  "stddev_seconds": 3024000,
  "p90_seconds": 3888000,
  "p95_seconds": 10368000,
  "outlier_cutoff_seconds": 8467200,
  "outlier_count": 1
}

Durations are always in seconds. Divide by 86400 to get days. Ratios are floats between 0 and 1. Booleans flag outlier status per issue:

gh velocity quality release v1.2.0 --results json | \
  jq '[.issues[] | select(.lead_time_outlier) | {number, title, lead_time_seconds}]'

In JSON mode, errors appear as structured ErrorEnvelope objects on stderr. See Agent Integration for parsing details.

Reading markdown output#

Markdown output is ready to paste into GitHub Issues, PRs, or Discussions:

gh velocity quality release v1.2.0 --results markdown

Use with --post to have gh-velocity post it automatically, or pipe into gh issue comment:

gh velocity quality release v1.2.0 --results markdown | \
  gh issue comment 100 --body-file -

What healthy metrics look like#

There are no universal benchmarks -- what matters is your trend over time and whether the numbers match your team's experience. These patterns indicate healthy delivery:

Lead time#

  • Median under 30 days for most product teams -- a typical issue goes from creation to close within a month.
  • P95 under 90 days. A P95 over 90 days means old issues are being closed alongside new work -- normal, but worth understanding.
  • Predictability label is not "low". Derived from the coefficient of variation (CV = stddev / mean). "Low" means individual item durations are hard to predict. See Understanding Statistics.

Cycle time#

  • Median under 7 days means active work on a typical issue completes within a week.
  • Cycle time much shorter than lead time is normal. The gap is backlog time (waiting before work starts).
  • N/A cycle times mean the configured strategy has no signal for those issues. See Cycle Time Setup.

Release lag#

  • Median under 7 days means completed work reaches users within a week.
  • High release lag with low cadence means completed work sits waiting for a release. Consider releasing more frequently.

Composition and bug ratio#

  • Bug ratio under 30% is typical for product-focused teams.
  • High "other" count means issues lack classification labels. Run gh velocity config preflight to generate category matchers, or label issues before releasing.

Velocity#

  • Stable velocity across iterations indicates predictable delivery. Look at the trend column in the history table.
  • Completion rate above 80% means the team commits to a realistic amount of work per iteration.
  • High standard deviation in velocity across iterations suggests scope changes or inconsistent estimation.

Throughput#

  • Steady or growing item count means consistent output. A sudden drop may indicate blockers or context switching.

Common patterns and what they mean#

Large gap between mean and median lead time#

Your data is right-skewed -- a few old issues are pulling the mean up. The median is the better measure of "typical." See Understanding Statistics.

Example from a real repo:

  • Mean lead time: 280 days
  • Median lead time: 60 days

Two issues open for 4+ years were closed in the release. The median tells you the typical issue takes about 2 months.

Many outliers in one release#

Outliers are flagged using the IQR method. Multiple outliers in a single release often indicate a backlog cleanup alongside normal work. Check whether they are old issues finally closed or genuinely slow work.

Cycle time is N/A for most issues#

N/A means "no start signal found" — not "zero." Check your strategy configuration.

Common causes:

  • Issue strategy: No lifecycle.in-progress.match configured, or labels are not applied to issues. Fix by adding lifecycle labels.
  • PR strategy: PRs do not reference issues with "Closes #N" or "Fixes #N". Fix by adding closing keywords to PR descriptions.

See Troubleshooting for detailed resolution steps.

High release lag despite fast cycle time#

Work finishes quickly but waits before release. This typically indicates a batch release process. Release more often, or automate releases on merge.

Bug ratio of 0%#

Either there are genuinely no bugs, or your bug category matchers are misconfigured. Check that your config has a category named bug with appropriate match rules. See Troubleshooting.

"Not assessed" items in velocity output#

Items matching no effort query are excluded from velocity and committed totals, reported separately. A high not-assessed count means your effort matchers need tuning. Run gh velocity config validate --velocity to see unmatched issues.

Why noise exclusion matters#

Repos with spam, duplicate, or invalid issues produce misleading metrics. The preflight command detects these labels and adds -label: exclusions to the scope query automatically. Example from cli/cli (30-day window):

MetricBefore (no exclusions)After (noise excluded)
Issues closed11257
Lead time median8 minutes35 days
Bug ratio25%35%
Bug fix speed"slower than other work""faster than other work"
Hotfix count7217
Predictability (CV)2.71.8

Half the closed issues were spam or duplicates closed in under 60 seconds. These instant closures dragged the median lead time to 8 minutes, made bug fixes look slower than "other work" (which was mostly spam), and inflated the hotfix count by counting every instant closure as a "hotfix."

After excluding noise labels, every metric became meaningful: lead time reflects actual delivery time, bug ratio reflects real workload composition, and the insight about bug fix speed reversed — bugs are actually resolved faster than features.

Checking your scope for noise#

Run preflight to regenerate your config and inspect the scope:

gh velocity config preflight -R owner/repo --write

If noise labels are detected, the generated config will include exclusions:

scope:
  query: "repo:cli/cli -label:duplicate -label:invalid -label:suspected-spam"
  # Excluded 3 noise label(s) detected in this repo: duplicate, invalid, suspected-spam

Detected patterns: spam, duplicate, and invalid (matched at word boundaries). For different noise labels, add manual exclusions to the scope query.

Comparing across releases#

To spot trends, run the same command for multiple releases and compare:

gh velocity quality release v2.0.0 --results json > v2.json
gh velocity quality release v1.9.0 --results json > v1.json

echo "v1.9.0 median: $(jq -r '.aggregates.lead_time.median_seconds / 86400 | round | "\(.)d"' v1.json)"
echo "v2.0.0 median: $(jq -r '.aggregates.lead_time.median_seconds / 86400 | round | "\(.)d"' v2.json)"

For automated trend tracking, see the CI Setup: Scheduled trend reports pattern.

Next steps#