# Quality evidence and gate results
This page provides the complete quality evidence generated by the VeriOps pipeline. It covers KPI metrics, protocol gate results, all 32 automated quality checks, and RAG retrieval readiness.
## KPI metrics
These metrics come from `reports/kpi-wall.json`:
| Metric | Value | Target | Status |
|---|---|---|---|
| Quality score | 100% | 80% | Excellent |
| Total documents | 12 | -- | Indexed across all content types |
| Stale pages | 0 | 0 | No stale pages |
| Documentation gaps | 0 | 0 | Full coverage |
| Metadata completeness | 100% | 100% | All frontmatter fields present and valid |
| Frontmatter errors | 0 | 0 | All pages pass schema validation |
## Protocol contract gate results
Each protocol passes through eight pipeline stages: ingest, lint, regression, docs generation, frontmatter gate, snippet lint, test assets, and publish.
| Protocol | Contract source | Status | Stages passed | Notes |
|---|---|---|---|---|
| REST | `api/openapi.yaml` (OpenAPI 3.0) | PASS | 8/8 | 14 endpoints validated |
| GraphQL | `contracts/graphql.schema.graphql` | PASS | 8/8 | 3 operation types validated |
| gRPC | `contracts/grpc/veriops.proto` (Proto3) | PASS | 8/8 | 3 RPC methods validated |
| AsyncAPI | `contracts/asyncapi.yaml` (v2.6.0) | PASS | 8/8 | 3 channels validated |
| WebSocket | `contracts/websocket.yaml` | PASS | 8/8 | 3 channels validated |
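The eight-stage gate above can be sketched as a sequential runner that stops at the first failing stage. The stage names come from this page; the `run_gate` function and its stage callables are illustrative assumptions, not the pipeline's actual API.

```python
# Illustrative sketch of the eight-stage protocol gate.
# Stage names mirror the table above; the runner itself is hypothetical.
STAGES = [
    "ingest", "lint", "regression", "docs_generation",
    "frontmatter_gate", "snippet_lint", "test_assets", "publish",
]

def run_gate(protocol, stage_fns):
    """Run each stage in order; stop at the first failure."""
    passed = 0
    for stage in STAGES:
        if not stage_fns[stage](protocol):
            return {"protocol": protocol, "status": "FAIL",
                    "stages_passed": f"{passed}/{len(STAGES)}"}
        passed += 1
    return {"protocol": protocol, "status": "PASS",
            "stages_passed": f"{passed}/{len(STAGES)}"}
```

A protocol reports `8/8` only when every stage callable returns success, which matches the "Stages passed" column above.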
## Quality checks enforced
The pipeline runs 32 automated checks on every documentation page. These checks are organized into four categories.
### GEO checks (8 checks -- LLM and AI search optimization)
| Check ID | Rule | Severity | Threshold | Purpose |
|---|---|---|---|---|
| GEO-1 | Meta description present | Error | Must exist | Ensures every page has a description for search snippets |
| GEO-1b | Meta description minimum length | Warning | 50 characters | Prevents truncated search snippets |
| GEO-1c | Meta description maximum length | Warning | 160 characters | Prevents overflow in search results |
| GEO-2 | First paragraph length | Warning | 60 words max | Ensures concise opening for LLM extraction |
| GEO-3 | First paragraph definition | Suggestion | Contains definition verb | Helps LLMs identify what the page is about |
| GEO-4 | Heading specificity | Warning | No generic headings | Prevents vague headings like "Overview" or "Setup" |
| GEO-5 | Heading hierarchy | Error | No skipped levels | Ensures proper H2-H3-H4 nesting |
| GEO-6 | Fact density | Warning | 1 fact per 200 words | Keeps content information-rich for AI retrieval |
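The GEO-1 family of checks can be illustrated with a small validator. The check IDs, severities, and thresholds below come from the table above; the function name and finding format are illustrative assumptions about how such a check might be written.

```python
# Hypothetical sketch of the GEO-1 meta description checks.
# Thresholds (50 and 160 characters) match the table above.
def check_meta_description(description):
    """Return a list of (check_id, severity, message) findings."""
    findings = []
    if not description or not description.strip():
        findings.append(("GEO-1", "error", "meta description missing"))
        return findings
    length = len(description.strip())
    if length < 50:
        findings.append(("GEO-1b", "warning",
                         f"description is {length} chars; minimum is 50"))
    if length > 160:
        findings.append(("GEO-1c", "warning",
                         f"description is {length} chars; maximum is 160"))
    return findings
```

A description between 50 and 160 characters produces no findings; a missing description short-circuits to a single GEO-1 error.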
### SEO checks (14 checks -- search engine optimization)
| Check ID | Rule | Severity | Threshold | Purpose |
|---|---|---|---|---|
| SEO-01 | Title length | Error/Warning | 10-70 characters | Optimal title display in search results |
| SEO-02 | Title keyword match | Suggestion | 50% overlap | Aligns title with filename keywords |
| SEO-03 | URL depth | Warning | Max 4 levels | Prevents deep URLs that search engines deprioritize |
| SEO-04 | URL naming | Warning | Kebab-case | Consistent, readable URL structure |
| SEO-05 | Image alt text | Warning | 100% coverage | Accessibility and image search visibility |
| SEO-06 | Internal links | Suggestion | Min 1 link | Cross-references improve crawlability |
| SEO-07 | Bare URLs | Warning | Zero bare URLs | Requires descriptive link text |
| SEO-08 | Path characters | Warning | Alphanumeric + hyphens | Prevents encoding issues in URLs |
| SEO-09 | Line length | Warning | Max 120 characters | Mobile readability |
| SEO-10 | Heading keywords | Suggestion | Shared with title | Signals relevance to search engines |
| SEO-11 | Freshness signal | Suggestion | Date in frontmatter | Indicates content currency |
| SEO-12 | Content depth | Warning | Min 100 words | Prevents thin content penalties |
| SEO-13 | Duplicate headings | Warning | Zero duplicates | Unique headings for anchor links |
| SEO-14 | Structured data | Suggestion | Min 1 element | Tables, code blocks, or lists for rich snippets |
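Two of the simpler SEO checks can be sketched directly from their thresholds. The 10-70 character range (SEO-01) and kebab-case rule (SEO-04) come from the table above; the function names and the exact kebab-case pattern are illustrative assumptions.

```python
import re

# Illustrative versions of SEO-01 and SEO-04; thresholds match the
# table above, but these function names are not the pipeline's API.
def check_title_length(title):
    """SEO-01: title must be 10-70 characters."""
    return 10 <= len(title) <= 70

def check_kebab_case(slug):
    """SEO-04: URL segments are lowercase words joined by single hyphens."""
    return bool(re.fullmatch(r"[a-z0-9]+(-[a-z0-9]+)*", slug))
```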
### Style checks (6 checks -- tone and voice consistency)
| Rule | Enforced by | Description |
|---|---|---|
| American English | Vale + AmericanEnglish | Use "color" not "colour," "optimize" not "optimise" |
| Active voice | Vale + write-good | "Configure the webhook" not "The webhook should be configured" |
| No weasel words | Vale + write-good | No "simple," "easy," "just," "many," "various" |
| No contractions | Vale + Google style | "do not" not "don't," "cannot" not "can't" |
| Second person | Vale + Google style | "you" not "the user" or "one" |
| Present tense | Vale + Google style | "sends" not "will send" for current features |
### Contract checks (4 checks -- per-protocol technical accuracy)
| Rule | Applies to | Description |
|---|---|---|
| Schema validation | All 5 protocols | Validates contract syntax and semantics against protocol spec |
| Regression detection | All 5 protocols | Detects breaking changes against baseline snapshot |
| Snippet lint | All 5 protocols | Validates code examples match contract schemas |
| Self-verification | All 5 protocols | Runs examples against live/mock endpoints and verifies responses |
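The regression-detection check compares the current contract against a baseline snapshot. A minimal sketch for the REST case, treating a removed path or method as breaking: the `find_breaking_changes` name and the `{path: set(methods)}` snapshot shape are assumptions for illustration, not the pipeline's actual data model.

```python
# Hedged sketch of breaking-change detection against a baseline snapshot.
# Real regression logic covers far more (schemas, params, responses).
def find_breaking_changes(baseline_paths, current_paths):
    """Compare {path: set(methods)} snapshots; return breaking changes."""
    breaking = []
    for path, methods in baseline_paths.items():
        if path not in current_paths:
            breaking.append(f"removed path: {path}")
            continue
        for method in methods - current_paths[path]:
            breaking.append(f"removed method: {method.upper()} {path}")
    return breaking
```

Additions (new paths or methods) do not appear in the output, since adding surface area is non-breaking.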
## RAG retrieval pipeline
The pipeline generates a knowledge retrieval index that powers AI-driven search and support agents.
| Metric | Value | Target | Status |
|---|---|---|---|
| Knowledge graph nodes | 957 | -- | Topics, entities, and concepts extracted |
| Knowledge graph edges | 817 | -- | Relationships between nodes |
| Knowledge modules | 124 | -- | Auto-extracted topic chunks |
| Retrieval precision@3 | 0.58 | 0.5 | Pass |
| Retrieval recall@3 | 0.93 | 0.5 | Pass |
| Hallucination rate | 0.0 | 0.1 | Pass (all retrieved docs exist in corpus) |
| Evaluation queries | 60 | -- | Curated queries across 12 document categories |
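The precision@3 and recall@3 metrics above follow the standard top-k definitions; the sketch below shows the computation. The function names are illustrative, not the evaluation script's API.

```python
# Standard top-k retrieval metrics, as reported in the table above.
def precision_at_k(retrieved, relevant, k=3):
    """Fraction of the top-k retrieved docs that are relevant."""
    top_k = retrieved[:k]
    if not top_k:
        return 0.0
    return sum(1 for doc in top_k if doc in relevant) / len(top_k)

def recall_at_k(retrieved, relevant, k=3):
    """Fraction of the relevant docs that appear in the top-k results."""
    if not relevant:
        return 0.0
    return sum(1 for doc in relevant if doc in retrieved[:k]) / len(relevant)
```

Note the asymmetry visible in the table: with few relevant docs per query, recall@3 (0.93) can be much higher than precision@3 (0.58), since the top 3 slots may contain non-relevant docs even when every relevant doc is retrieved.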
### Advanced retrieval features (enabled by default)
Six advanced features maximize production retrieval quality:
| Feature | Description | Expected impact |
|---|---|---|
| Token-aware chunking | Splits modules into 750-token chunks with 100-token overlap (`cl100k_base`) | Improves recall for long documents |
| Hybrid search (RRF) | Fuses semantic (FAISS) and token-overlap rankings with Reciprocal Rank Fusion (k=60) | Higher recall for mixed queries |
| HyDE query expansion | Generates a hypothetical doc passage via `gpt-4.1-mini` before embedding | Better retrieval for vague queries |
| Cross-encoder reranking | Rescores top 20 candidates with `cross-encoder/ms-marco-MiniLM-L-6-v2` | Higher precision in top-N |
| Embedding cache | In-memory LRU cache (TTL: 3,600 seconds, max: 512 entries) for query embeddings | Reduced latency and API costs |
| Multi-mode evaluation | Compares token, semantic, hybrid, and hybrid+rerank modes across 50 curated queries | Data-driven mode selection |
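Reciprocal Rank Fusion, used by the hybrid search feature above, is simple enough to sketch in full. Each document scores the sum of `1 / (k + rank)` over every ranking it appears in; the `k=60` constant matches the table, while the `rrf_fuse` name is an illustrative assumption.

```python
# Reciprocal Rank Fusion: combine ranked lists without score calibration.
def rrf_fuse(rankings, k=60):
    """Fuse ranked doc lists; each doc scores sum(1 / (k + rank)).

    Ranks are 1-based; a higher fused score ranks higher. k=60 damps
    the influence of any single list's top positions.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF only uses rank positions, it fuses FAISS cosine scores and token-overlap scores without needing to normalize them onto a common scale.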
Run a full retrieval comparison across all four modes:
```shell
python3 scripts/run_retrieval_evals.py \
  --mode all \
  --dataset config/retrieval_eval_dataset.yml \
  --report reports/retrieval_comparison.json
```
## Quality score formula
Quality score measures documentation health and is independent of RAG retrieval metrics:
`score = 100 - metadata_penalty - stale_penalty - gap_penalty`

- `metadata_penalty`: deduction for missing or invalid frontmatter fields
- `stale_penalty`: deduction for pages not reviewed within the freshness window
- `gap_penalty`: deduction for documented gaps in coverage
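The formula translates directly to code. The zero floor below is an assumption added for illustration (the page does not say what happens if penalties exceed 100), as is the function name.

```python
# Sketch of the quality score formula from this page.
# The max(0, ...) floor is an assumption, not documented behavior.
def quality_score(metadata_penalty, stale_penalty, gap_penalty):
    """score = 100 - metadata_penalty - stale_penalty - gap_penalty."""
    return max(0, 100 - metadata_penalty - stale_penalty - gap_penalty)
```

With zero penalties across the board, the score is 100%, matching the KPI table at the top of this page.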
## Pipeline artifacts
All artifacts generated by the pipeline for this site:
| Artifact | Status | Description |
|---|---|---|
| Multi-protocol contract report | Generated | Protocol validation results for all 5 protocols |
| KPI wall | Generated | Quality metrics dashboard |
| Lifecycle report | Generated | Document freshness and staleness tracking |
| Glossary sync report | Generated | Terminology consistency verification |
| Retrieval eval report | Generated | RAG precision, recall, and hallucination metrics |
| Knowledge graph report | Generated | Node and edge counts for knowledge graph |
| Pipeline stage summary | Generated | Stage execution details and timing |
## Next steps
- How-to: keep docs aligned with every release for the release-day workflow
- Troubleshooting: common pipeline issues if pipeline stages fail