This worksheet is part of the Agentic SDLC — How to Deliver with Confidence guide. It is designed to be used independently. No prior reading is required.
Use this worksheet to determine your team’s current governance maturity level for AI-generated code. Answer each question honestly based on what your team actually does today — not what you plan to do or aspire to do.
How to score: Start at Level 0 and work upward, stopping at the first level where any answer is “no.” Your maturity level is the highest level at which you — and every level below it — answered “yes” to all questions. You cannot skip a level.
Level 0: Unstructured
Every team starts here or above. Answer these questions to confirm your baseline.
- Does your team use AI code generation (any tool, any frequency)?
- Are AI-generated PRs reviewed by at least one human before merging?
If you answered “yes” to both: You are at least Level 0. Proceed to Level 1. If you answered “no” to either: You are not yet using AI code generation in a reviewable workflow. Start by establishing a basic PR review requirement for all generated code.
Level 1: Documented
- Does your team have written architecture decision records (or equivalent documents that capture technical decisions)?
- Does your team have written coding conventions or style guides?
- Are these documents stored in a known, accessible location (not scattered across personal notes or chat history)?
- Do new team members know where to find these documents?
- Have the documents been reviewed or updated in the last 6 months?
If all “yes”: You are at least Level 1. Proceed to Level 2. If any “no”: You are at Level 0. Recommended next actions:
- Capture the top 10 architecture decisions your team follows implicitly
- Write down your coding conventions, starting with naming and file structure
- Store everything in a single, team-accessible location
- Schedule a review of these documents within 90 days
Level 2: Enforced
- Are your most important architecture decisions compiled into constraints that the AI generator receives at generation time (rule files, system prompts, context files)?
- Do you have automated checks (pre-merge gates, linting rules, or scanning tools) that verify compliance with at least some of your documented decisions?
- When an automated check finds a violation, is the violation blocked or flagged before merge — not just logged?
- Can you identify which of your documented decisions are compiled (enforced) versus documentation-only (unenforced)?
- Have you tested your enforcement: deliberately violated a compiled constraint and confirmed it was caught?
If all “yes”: You are at least Level 2. Proceed to Level 3. If any “no”: You are at Level 1. Recommended next actions:
- Select your 5 highest-impact decisions and invariants
- Compile them into rule files the generator can consume
- Add pre-merge checks for your hardest constraints (invariants)
- Test each enforcement mechanism with a deliberate violation
- Track which decisions are compiled and which are not
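To make the compile-and-test steps above concrete, here is a minimal sketch of a pre-merge gate: a script that scans files for a pattern a compiled rule forbids and exits nonzero so CI blocks the merge. The specific rule (`ADR-007`, banning direct `eval()`) and its ID are hypothetical examples, not rules the guide prescribes; your own compiled constraints would replace them.

```python
#!/usr/bin/env python3
"""Minimal pre-merge gate: fail if any checked file violates a compiled rule.

Hypothetical rule: ADR-007 bans direct use of eval(). The rule ID,
pattern, and message are illustrative placeholders.
"""
import re
import sys

# Each compiled rule: (rule ID, compiled regex, human-readable message)
RULES = [
    ("ADR-007", re.compile(r"\beval\s*\("), "direct eval() is banned"),
]

def check(paths):
    """Return a list of violation strings, one per rule hit."""
    violations = []
    for path in paths:
        with open(path, encoding="utf-8") as f:
            for lineno, line in enumerate(f, start=1):
                for rule_id, pattern, msg in RULES:
                    if pattern.search(line):
                        violations.append(f"{path}:{lineno}: {rule_id}: {msg}")
    return violations

if __name__ == "__main__" and len(sys.argv) > 1:
    found = check(sys.argv[1:])
    for v in found:
        print(v)
    sys.exit(1 if found else 0)  # nonzero exit blocks the merge in CI
```

This also supports the “test each enforcement mechanism with a deliberate violation” step: commit a file containing the banned pattern and confirm the gate fails before removing it.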
Level 3: Governed
- For non-trivial tasks, does your team write a technical specification before generating code?
- Are specifications peer-reviewed by at least one person other than the author before generation begins?
- Does the specification review evaluate architecture alignment, security considerations, and testability of acceptance criteria?
- After generation, do you run drift detection (comparison of generated code against the specification)?
- Are PR reviews conducted with the specification, plan, and verification results attached as context?
- Do you have defined specialist review domains (e.g., database, security, API) with clear criteria for when each is triggered?
If all “yes”: You are at least Level 3. Proceed to Level 4. If any “no”: You are at Level 2. Recommended next actions:
- Introduce specifications for any task estimated at more than one day of work
- Establish a peer review step for specifications before generation
- Set up drift detection that compares generated output to specification requirements
- Attach specifications and verification results to every PR of generated code
- Define at least 3 specialist review domains and their trigger criteria
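One lightweight form of the drift detection recommended above: verify that every acceptance criterion listed in the specification is referenced by at least one test. The spec convention assumed here — criteria labeled `AC-<n>` and tests referencing those IDs in names or comments — is an illustrative assumption, not a format the guide mandates.

```python
import re

# Assumed convention: the spec lists acceptance criteria as "AC-<n>: ..."
# and each test references the criterion ID it covers.
CRITERION = re.compile(r"\bAC-\d+\b")

def drift_report(spec_text: str, test_text: str) -> list[str]:
    """Return acceptance criteria present in the spec but never
    referenced in the tests -- a coarse signal of spec/code drift."""
    required = set(CRITERION.findall(spec_text))
    covered = set(CRITERION.findall(test_text))
    return sorted(required - covered)
```

A real setup would compare richer artifacts (API shapes, schema changes, architectural constraints) as a post-generation CI step; this sketch only catches criteria with no test coverage at all.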
Level 4: Adaptive
- After each implementation cycle, does your team conduct a governance sweep to identify uncaptured decisions?
- Are new decisions surfaced during implementation captured and compiled into enforcement before the next cycle?
- Does your team actively manage the balance between tightly constraining the generator and leaving room for discovery?
- Are your governance constraints versioned — can you see how they have evolved over time?
- Is there a named owner (role or individual) responsible for maintaining governance quality?
- Has your governance set been updated in the last 30 days based on implementation findings?
If all “yes”: You are at Level 4. If any “no”: You are at Level 3. Recommended next actions:
- Add a governance sweep as a standard step after each major feature ships
- Assign a governance owner: someone who reviews the decision records, invariants, and compiled rules regularly
- Track governance changes in version control so evolution is visible
- Schedule a monthly review of whether the governance set still reflects the actual codebase
- Identify one area where the generator is under-constrained and compile a new rule for it
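The tracking steps above imply some ledger of decisions with their compilation status and last review date. As a sketch under that assumption, the helper below flags decisions that are either uncompiled or stale against the 30-day window from the Level 4 questions; the ledger shape and decision IDs are hypothetical.

```python
from datetime import date, timedelta

def stale_entries(ledger: dict[str, tuple[bool, date]],
                  today: date, max_age_days: int = 30) -> list[str]:
    """Given a hypothetical ledger mapping decision ID ->
    (compiled into enforcement?, last review date), return decisions
    that are uncompiled or unreviewed within the freshness window --
    candidates for the next governance sweep."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(decision_id
                  for decision_id, (compiled, reviewed) in ledger.items()
                  if not compiled or reviewed < cutoff)
```

Keeping the ledger itself in version control satisfies the “track governance changes” step: its history shows when each decision was compiled and last reviewed.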
Summary
| Level | Name | Your Status |
|---|---|---|
| 0 | Unstructured | [ ] |
| 1 | Documented | [ ] |
| 2 | Enforced | [ ] |
| 3 | Governed | [ ] |
| 4 | Adaptive | [ ] |
Your current level: _
Next actions (from the recommendations above):