I’ve worked with AI coding agents daily for 2 years. The first year I used them naively; the second year I built discipline. This post is that discipline, made concrete.
The exact speedup doesn’t matter; what matters is that the quality of the code didn’t drop. If anything, I produced fewer bugs than the juniors coming into the codebase. That’s not a brag, and it’s not an argument against using agents; it’s an observation about what they’re actually good for.
1. Make the agent plan first, approve it, then have it write
“Write me an auth module” = uncontrolled. The flow that works:
- “Draft a plan for the auth module. Endpoints, DB tables, edge cases.”
- Read the plan, ask questions, correct it.
- “Write it per the plan” — but step by step on large modules.
Don’t grant permission to write code without an approved plan. Ever.
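One way to make that rule mechanical is a small gate in whatever wrapper script drives the agent. The sketch below is hypothetical: `Plan`, `approve`, and `request_code` are illustrative names, not part of any agent’s API, and `send_to_agent` stands in for however you actually talk to your agent.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Plan:
    """A plan drafted by the agent, pending human review (hypothetical structure)."""
    module: str
    steps: List[str]
    approved: bool = False
    reviewer_notes: List[str] = field(default_factory=list)


class PlanNotApproved(Exception):
    """Raised when code is requested before a human has approved the plan."""


def approve(plan: Plan, note: str = "") -> Plan:
    """Record the human sign-off; only a person should call this."""
    if note:
        plan.reviewer_notes.append(note)
    plan.approved = True
    return plan


def request_code(plan: Plan, send_to_agent: Callable[[str], str]) -> List[str]:
    """Ask the agent for code one step at a time, and only for an approved plan."""
    if not plan.approved:
        raise PlanNotApproved(f"Plan for {plan.module!r} has not been approved")
    # Step by step, so each piece can be reviewed before the next is written.
    return [
        send_to_agent(f"Implement step {i} of the approved plan: {step}")
        for i, step in enumerate(plan.steps, start=1)
    ]
```

Calling `request_code` on a fresh `Plan` raises `PlanNotApproved`; it only reaches the agent after `approve(plan)` has been called.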
2. Grant limited access
An agent doesn’t need to be able to run rm -rf. Prefer a sandbox:
- Filesystem read-only outside specific mounts.
- Network access only to specific endpoints.
- No access to production secrets, ever. Use a local mock or a scrubbed copy.
This isn’t a “the agent is malicious” assumption; it’s an “it can make mistakes” assumption.
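As a rough illustration of the allowlist idea, here is a minimal sketch of the checks a tool wrapper could run before letting the agent write a file or call a URL. The function names and the example mount and host values are assumptions for illustration. A check like this complements isolation enforced by a container or the OS; it does not replace it.

```python
from pathlib import Path
from typing import Iterable
from urllib.parse import urlparse


def is_write_allowed(target: str, writable_mounts: Iterable[str]) -> bool:
    """True only if `target` resolves to a location inside a writable mount."""
    resolved = Path(target).resolve()  # collapses ".." and follows symlinks
    for mount in writable_mounts:
        root = Path(mount).resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False


def is_host_allowed(url: str, allowed_hosts: Iterable[str]) -> bool:
    """True only if the URL's host is on the explicit allowlist."""
    host = urlparse(url).hostname
    return host is not None and host in set(allowed_hosts)
```

With `writable_mounts=["/workspace/project"]` (a made-up path), a write to `/workspace/project/../../etc/passwd` is rejected because the path is resolved before it is compared.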
3. Have it test, run it, roll it back
The agent wrote code. Now:
- Have it write tests (or run the existing ones).
- Run the tests. Its claim that they pass isn’t enough — actually see it.
- Do a manual smoke test. Trust your eyes, not the agent’s claims.
- When in doubt, `git checkout .` and try again.
The agent’s summary isn’t reality. It’s not lying — it just assumes its own output is good. Verification is a human job.
4. Don’t skip the code review step
Code an agent produces should be reviewed like code a junior engineer produces. Don’t fall for the “the agent already checked it” fallacy. Especially:
- Security. SQL injection, XSS, auth bypass.
- Performance. N+1 queries, needless cache invalidation.
- Boundary conditions. Empty input, very large input, Unicode.
Static analysis is good for catching these, but the human eye is indispensable.
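To make the security item concrete, here is the kind of diff a reviewer should catch. The example uses Python’s built-in `sqlite3` module and a made-up `users` table: the first function builds SQL by string interpolation, the second passes the value as a bound parameter.

```python
import sqlite3


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    """Anti-pattern: user input is spliced into the SQL text."""
    query = f"SELECT id, name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_user_safe(conn: sqlite3.Connection, name: str) -> list:
    """Parameterized query: the driver passes `name` as data, never as SQL."""
    return conn.execute(
        "SELECT id, name FROM users WHERE name = ?", (name,)
    ).fetchall()


def demo() -> tuple:
    """Run the same malicious input through both versions."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    payload = "nobody' OR '1'='1"
    return find_user_unsafe(conn, payload), find_user_safe(conn, payload)
```

With that payload, the unsafe version returns every row in the table, while the safe version returns an empty list because no user has that literal name.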
5. Keep the context
Don’t forget the agent “has no past.” Every session, you have to explain why you reached each decision. To automate that:
- I keep a live snapshot of state in the repo, in something like `CLAUDE.md` or `.ssot/`.
- I write the important decisions there: “We use PostgreSQL because…, Redis runs in cache-only mode because…”.
- The agent reads this at the start of the session and makes more consistent decisions.
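If you assemble the agent’s prompt yourself, loading that snapshot can be a few lines. The sketch below assumes a layout that goes beyond what is described above: a `CLAUDE.md` at the repo root plus Markdown files directly under `.ssot/`. The function name `load_project_context` is hypothetical.

```python
from pathlib import Path
from typing import List


def load_project_context(repo_root: str) -> str:
    """Concatenate the decision notes the agent should read at session start."""
    root = Path(repo_root)
    sources: List[Path] = []
    claude_md = root / "CLAUDE.md"
    if claude_md.is_file():
        sources.append(claude_md)
    ssot_dir = root / ".ssot"
    if ssot_dir.is_dir():
        # Assumption: decisions are stored as Markdown files directly under .ssot/
        sources.extend(sorted(ssot_dir.glob("*.md")))
    sections = []
    for path in sources:
        label = path.relative_to(root).as_posix()
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {label}\n{body}")
    return "\n\n".join(sections)
```

The returned string can be prepended to the first message of a session, so each decision and its reason travel with the request.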
6. Don’t put the agent in place of a senior engineer
Putting an agent in place of a senior engineer is wrong. Using the agent as an extension of a senior engineer is right. Telling juniors to just “have the agent do it” is usually a disaster — because:
- They don’t know how to read the agent’s output critically.
- They haven’t yet learned to ask the right question.
- They can’t yet feel the difference between “it works” and “it works correctly.”
The task you give a junior can be to audit the agent — but not to direct it.
7. You are responsible for every line of code that reaches production
This is the most important one. If you merged a SQL injection hole the agent wrote, the responsibility is yours. “The AI wrote it” is no excuse in a PR review.
In practice: don’t merge anything the agent produced until you’re sure it’s code you’ve actually read, understood, and could defend if you had to.
Conclusion
Agents are incredibly powerful tools. Treating them as “magic wands that ship a feature in 30 seconds” leads to failure. Treating them as “an intern that extends a senior engineer” leads to success. Which side you pick determines the quality of the software that comes out.
Two years on, I still follow the same discipline because it works.