You know the move: ship fast, punt the “small stuff,” and soothe yourself with a tech-debt ticket. In 2026, that’s lazy thinking—especially when an LLM can refactor, add tests, and tidy in minutes.
Problem
Teams treat tech debt like a safe deposit box for future you. It isn’t. It’s a tax on every change that touches that corner of the codebase.
When I say “tech debt” here, I mean the everyday stuff:
- naming and structure
- tests
- error handling and edge cases
- observability (logs/metrics/tracing)
- small schema tweaks and light infra work
Not massive infrastructure shifts or architectural rewrites.
Why It Matters
- Risk compounds: Edge cases you skip today become outages tomorrow.
- Queues lie: “We’ll get to it” rarely happens. Tickets become permission to forget.
- AI changed the cost curve: If you can explain what’s wrong, a model can reshape code and generate tests quickly. The code itself is the spec.
- Asset thinking: You’re responsible for the asset value of the codebase, not just throughput this sprint.
The Line: How Much Time To Steal
Use time as the governor: take up to 2× the “quick-and-dirty” estimate if that’s what it costs to do the job properly. Often it won’t take that much. But if doing it right doubles the effort and keeps the asset healthy, do it. If the hack would take two days, quote up to four. IMO, honest estimates beat padding: include the quality work silently and give one number.
Not deception: This isn’t sandbagging; it’s a single, honest number that already includes tests, edge handling, and observability. The real deception is shipping half a change and calling it done.
Guardrails to prevent bloat:
- Anchor to a plausible quick‑and‑dirty baseline and cap at ≤2×.
- Calibrate with history: compare estimates vs actuals in retro; tighten if you’re off.
- If you keep bumping into the cap, stop and split scope or run a spike.
What To Do (Playbook)
Estimate Honestly
- Include tests, edge handling, observability, and small refactors. Give a single date. No “with/without debt” menu.
- Keep an internal ≤2× guardrail relative to the quick‑and‑dirty path.
- Calibrate with past work: review estimate vs actual in retro and adjust.
- Log assumptions and risks in the PR, not as negotiable line items in planning.
Stay Silent, Stay Firm
- Don’t invite negotiation with line items. If challenged, discuss scope and non-functional requirements, not “can we skip tests?”
- This increases trust: dates stop slipping from rework. Transparency lives in PRs—call out what reliability work was done without turning it into bargaining chips.
Use Spikes Liberally
- Uncertainty? Time-box a spike. Prove the seam to refactor, the boundary to extract, the test points to lock in. Then estimate.
Exploit AI Where It’s Strong
- Feed the model the current code + a crisp critique (“why this is bad”). Ask for a refactor plan, patch, and unit tests.
- Keep AI changes safe: limit to small, reviewable diffs; add tests; run static analysis; require green CI before merge.
- Compliance: use an approved internal model or a redaction proxy if external use is restricted; log prompts and diffs in the PR for auditability.
- Ownership: never auto-merge model output. You review, you own.
Prioritise The Nagging Stuff
- When you feel the itch to create “error handling” tickets—think retries, idempotency, edge-case validation, and observability—that’s your cue to do it now.
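As a concrete example, here’s a minimal Python sketch of that bundle: retries with exponential backoff, an idempotency key, and a log line per attempt. The endpoint URL, the `charge` function, and the header name are assumptions for illustration (many providers accept an `Idempotency-Key` header, but check yours).

```python
import logging
import time
import uuid

import requests  # assumes the `requests` HTTP client is available

log = logging.getLogger(__name__)

def charge(amount_cents: int, account_id: str, max_attempts: int = 4) -> dict:
    """Call a hypothetical external payment API with retries made safe.

    One idempotency key covers all attempts, so a retry after a timeout
    cannot double-charge if the provider deduplicates on that key.
    """
    idempotency_key = str(uuid.uuid4())
    for attempt in range(1, max_attempts + 1):
        try:
            resp = requests.post(
                "https://payments.example.com/v1/charges",  # illustrative URL
                json={"amount_cents": amount_cents, "account_id": account_id},
                headers={"Idempotency-Key": idempotency_key},
                timeout=5,
            )
            if resp.status_code < 500:
                resp.raise_for_status()  # 4xx is our bug: fail fast, don't retry
                return resp.json()
            log.warning("charge: HTTP %s on attempt %d", resp.status_code, attempt)
        except (requests.ConnectionError, requests.Timeout) as exc:
            log.warning("charge: transient error on attempt %d: %s", attempt, exc)
        if attempt < max_attempts:
            time.sleep(0.1 * 2 ** attempt)  # backoff: 0.2s, 0.4s, 0.8s, ...
    raise RuntimeError(f"charge failed after {max_attempts} attempts")
```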
Negotiate With Risk, Not Virtue
- If a PM wants it earlier: the earliest date is the date you gave. To go faster, cut scope or loosen non-functionals (e.g., “acceptable to be slower”, “defer analytics”, “basic UX, polish later”).
- Options live in scope and non-functionals, not in skipping basic quality.
Keep Improvements Surgical
- Scope refactors to the surface you’re already touching. No hero rewrites. If it balloons, stop and propose a separate initiative.
Scripts For Pushback
- “This is the four-day date including reliability and tests. If you want faster, we can cut scope or relax non-functionals. Which lever do you prefer?”
- “We can ship a brittle version in three days, but defect risk and rework increase. Your call—do we want smaller scope now, or higher risk?”
- “Options live in scope and non-functionals, not in skipping basic quality.”
- “We optimise for long-lived code. This estimate reflects that.”
AI Workflow You Can Use Tomorrow
- Refactor brief: “Read these files. Smells: duplicate parsing, side-effects in validators, tight coupling to HTTP. Propose a plan to extract a pure core and thin adapters.” (A sketch of the target shape follows this list.)
- Patch request: “Apply step 1 of your plan. Keep behaviour identical. Add unit tests for branches A/B/C and property-based tests for invariants X/Y.”
- Edge handling: “Add retries with backoff for outbound calls, make operations idempotent, and emit structured logs with correlation IDs.”
- Review assist: “Explain this diff, list risky changes, and suggest extra tests.”
You still own the review. As always, own the code that AI produces for you.
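For shape, here’s a minimal Python sketch of the “pure core and thin adapters” target the refactor brief asks for; the `Order` type, field names, and handler are invented for illustration.

```python
from dataclasses import dataclass

# --- Pure core: no I/O, no HTTP types, trivially unit-testable ----------

@dataclass(frozen=True)
class Order:
    sku: str
    quantity: int

class ValidationError(Exception):
    """Raised by the core instead of writing HTTP responses."""

def parse_order(payload: dict) -> Order:
    """Pure parsing and validation: same input, same output, no side-effects."""
    sku = payload.get("sku")
    quantity = payload.get("quantity")
    if not isinstance(sku, str) or not sku:
        raise ValidationError("sku must be a non-empty string")
    if not isinstance(quantity, int) or quantity <= 0:
        raise ValidationError("quantity must be a positive integer")
    return Order(sku=sku, quantity=quantity)

# --- Thin adapter: translates between HTTP and the core -----------------

def handle_create_order(request_json: dict) -> tuple[int, dict]:
    """Framework-agnostic adapter; bind it to Flask/FastAPI at the edge."""
    try:
        order = parse_order(request_json)
    except ValidationError as exc:
        return 400, {"error": str(exc)}
    return 201, {"sku": order.sku, "quantity": order.quantity}
```

The pure core is where the model’s generated unit and property-based tests get cheap: no mocks, no transport, just inputs and outputs.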
Where This Applies (And Doesn’t)
- Use this: Teams building long-lived codebases—scale-ups, enterprise product teams, platforms that will survive multiple roadmaps.
- Think twice: Early-stage startups or high-pivot contexts where the codebase may be discarded. Ship learning, not legacy.
IMO, this applies across industries. Even regulated teams benefit when “done” includes correctness by default; you’ll pass audits more easily. Keep full traceability: PR checklist, test artefacts, and change‑control notes.
Legacy Modules: Use Judgement
- Don’t rewrite everything you see.
- If touching a gnarly area, stabilise the seam you need: add tests, extract one function, add logging, stop. (A characterisation-test sketch follows this list.)
- If “doing it right” spills beyond your change, spike it and propose a separate track.
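A minimal sketch of “stabilise the seam” with pytest, assuming a hypothetical legacy `compute_discount` you need to touch; the module path, tiers, and expected values are all invented.

```python
import pytest

from billing.legacy import compute_discount  # hypothetical module under test

# Characterisation tests: pin what the code does TODAY, quirks included,
# so the one extraction you're about to make can't silently change behaviour.
@pytest.mark.parametrize(
    ("total_cents", "tier", "expected"),
    [
        (10_000, "gold", 9_000),     # documented path: 10% off
        (10_000, "bronze", 10_000),  # no discount for bronze
        (0, "gold", 0),              # edge: empty basket
        (-500, "gold", -500),        # quirk: negatives pass through; preserve it
    ],
)
def test_compute_discount_current_behaviour(total_cents, tier, expected):
    assert compute_discount(total_cents, tier) == expected
```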
Metrics That Matter
- Defects: escaped defects, reopen rate. Aim downwards.
- Human signals: developer happiness / sprint health checks. Aim upwards.
(Yes, cycle time might hold steady or even rise slightly—because you’re buying lower rework and saner on-call later. If defects and pager minutes go down while cycle time stays flat, your effective velocity has improved.)
Pitfalls To Avoid
- Secret buffers vs honest scope: Be honest with yourself about what the time is buying; outwardly, the estimate already includes quality. Don’t invent slack for procrastination.
- Refactor sprawl: If you can’t finish inside your task, stop. Propose a spike.
- Checklist theatre: A rigid Definition of Done becomes a bargaining chip. Prefer a few non-negotiable principles (“tests for touched logic, key errors handled, observable by default”) over a thousand boxes.
- Inflated baselines: Don’t secretly enlarge the “quick-and-dirty” number. Use past data (estimates vs actuals) to keep yourself honest.
- Junior gold-plating: Require code-owner sign-off for non-trivial cleanups, and time-box refactors.
Minimal Non-Negotiables (Not A Full DoD)
- Tests for any logic you touched (happy + edge paths).
- Error handling for external calls (retries/backoff, idempotency).
- Observability where it matters (structured logs/metrics/traces); see the sketch after this list.
- Small refactors that reduce obvious duplication or coupling.
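“Observable by default” can be as small as a correlation ID on every log line. A standard-library-only Python sketch; the entry point and header name are illustrative.

```python
import json
import logging
import uuid
from contextvars import ContextVar

# The correlation ID rides a context variable, so every log line emitted
# anywhere in the request's call tree carries it automatically.
correlation_id = ContextVar("correlation_id", default="-")

class JsonFormatter(logging.Formatter):
    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
            "correlation_id": correlation_id.get(),
        })

def configure_logging() -> None:
    handler = logging.StreamHandler()
    handler.setFormatter(JsonFormatter())
    logging.basicConfig(level=logging.INFO, handlers=[handler], force=True)

def handle_request(headers: dict) -> None:
    """Illustrative entry point: set the ID once, then log structured lines."""
    correlation_id.set(headers.get("x-correlation-id") or str(uuid.uuid4()))
    logging.getLogger("orders").info("order received")
```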
Call It What It Is
This is not “padding.” It’s honest estimates that include the cost of durability. You’re reducing future risk and protecting roadmap velocity by making the codebase a better asset today.
Summary Of Takeaways
- Default to one estimate that includes quality—tests, edges, observability, small refactors.
- Steal the time up to 2× if that’s what “right” requires—usually it’s less.
- Use spikes to kill uncertainty; AI to compress refactors and tests.
- Negotiate with scope and non-functionals, not skipped quality.
- Keep refactors surgical; don’t rewrite the world.
- Measure defects and developer happiness to validate the approach.