Dawid Rycerz

Me and coding agents

I’ve been using AI for development since OpenAI’s Codex (the GPT-3-based code model) was released back in 2021. I tried many tools: browser UIs where you paste code back and forth, open-source CLIs, more capable agent-style tools running in containers through editor plugins, and full “AI editors”, together with the most capable models of their times.

Through all that time I kept polishing my setup to get the most from those tools and enhance my work. My latest experiments were with CLI-based coding agents - Gemini, Copilot, Codex and Claude Code - and I got the best results with my Claude Code setup.

I’ve been using Claude Code for a few months now, and I think I’ve nailed it. It delivers roughly 80% of what I want, helps with boring planning tasks, prepares estimates based on my recent task history, and handles bureaucratic chores, so I can focus on processes, technology, architecture and leadership.

Below are parts of a presentation I made for my team colleagues during one of our KT sessions.

Context Management

This is the most important skill when working with Claude Code. I call it a skill because it requires experience. You need to size your tasks for a single session: not too small (you waste tokens on planning overhead) and not too big (it won’t fit in the context window and won’t finish).

  • Check context usage with the /context command
  • Don’t let it self-compact - you lose control over what stays and what goes
  • Run /compact yourself when you’re getting close to the limit
  • Use /clear between unrelated tasks

Think of context as your agent’s working memory. A polluted context means a confused agent and bad output.
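To make that concrete, a typical between-task sequence in my sessions looks something like this (as far as I can tell, /compact also accepts optional instructions about what to keep):

/context     # check how much of the window is already used
/compact keep the API client changes, drop the exploration notes
... finish and commit the task ...
/clear       # hard reset before the next, unrelated task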

Models - Pick the Right One

In the console you can pick the model that will run your session with /model, or you can ask Claude to run a specific task with a specific model.

Model    Best for                                    Cost
Opus     Complex reasoning, planning, deep analysis  $$$
Sonnet   Most everyday coding tasks                  $$
Haiku    Exploration, searches, simple subtasks      $

All models share the same ~200k context window (as of writing: 11 Feb 2026). Opus is better at reasoning within that window, not a bigger one. Sometimes there are longer-context releases too, which can be great for long-running tasks.

My approach: use Opus for the main flow and spawn subagents with Haiku or Sonnet for subtasks. This saves money and keeps the main context clean.
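A sketch of how I phrase that delegation (free-form wording, not a fixed syntax):

Use a subagent running on haiku to search the codebase for every
place we construct an HTTP client, and report back with file paths
and a one-line summary for each.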

Commands and Skills

Claude Code supports saved commands and skills (custom prompts). I wrote a couple of my own and reused some found on awesomeclaude.ai. Here are the ones I use most:

/plan-task - plan a task implementation
/commit-smart - atomic git commits
/interview-me - interview for plan details
/code-review - review a merge request
/compact - manual context compaction
/clear - reset context
/delegate - spawn a codex/gemini session for delegated work

Keep plugins minimal. Every plugin adds noise to the context.
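Custom commands like these are just markdown prompt files under .claude/commands/ - the filename becomes the slash command, and $ARGUMENTS is replaced with whatever you type after it. A minimal sketch of a /plan-task style file (the wording is illustrative):

Plan the implementation of: $ARGUMENTS
1. Spawn subagents on a cheaper model to explore the relevant code.
2. Produce a step-by-step plan: files to touch, risks, test strategy.
3. Do not write any code yet - wait for my approval.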

Memory and Knowledge

Claude Code has several knowledge sources:

  • AGENTS.md / CLAUDE.md - local project knowledge (conventions, structure, tools)
  • Memory (~/.claude/memory/) - things to remember across sessions
  • MCP servers - external search, docs, APIs
  • Web search - up-to-date documentation and references

AGENTS.md is “here’s how this project works”. Memory is “here’s what I learned across sessions”. MCP/Search are external sources.
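A minimal CLAUDE.md sketch (the contents are illustrative - adapt them to your project):

# Project: payments-service
- Go 1.22; run tests with: go test ./...
- Lint with golangci-lint run before committing
- Conventions: table-driven tests, wrap errors with %w
- Never touch migrations/ without asking first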

MCP Servers

I don’t use MCP much. I prefer giving Claude simple CLI tools plus a short explanation of how to use them. The one server I do use regularly is:

docs-mcp-server - it builds a local vector DB with preloaded docs for the libraries I use, so the agent can search up-to-date documentation.
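MCP servers are wired up once in a project-level .mcp.json. A sketch of such a config (the package name and args here are from memory - check the docs-mcp-server README for the real ones):

{
  "mcpServers": {
    "docs": {
      "command": "npx",
      "args": ["-y", "@arabold/docs-mcp-server"]
    }
  }
}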

Subagents

Subagents run subtasks that share your session but have their own context window. Use them to:

  • Explore the codebase without polluting the main context - they return with a summary of their findings.
  • Run searches with cheaper models (Haiku/Sonnet).
  • Delegate research while you focus on the plan.

Think of subagents as interns/juniors you send to gather information while you think about the bigger picture.
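Claude Code also lets you define reusable subagents as markdown files under .claude/agents/. A sketch of a cheap read-only “scout” (field names are from memory - verify them against the docs):

---
name: scout
description: Read-only codebase explorer for searches and summaries.
model: haiku
tools: Read, Grep, Glob
---
You explore the codebase and report findings concisely.
Never edit files. Return file paths and short summaries only.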

Task Sizing

Finding the right task size matters a lot:

  • Too small: “change line 42” - a waste of resources; you could do it faster yourself
  • Too big: “refactor the entire authentication system” - won’t fit in a single context session
  • Just right: “add retry logic to the API client with exponential backoff, update tests” - clear scope, achievable, testable

Plan First, Always

This is my biggest takeaway. Before any coding:

  1. If the task is too big - split it into smaller pieces
  2. Keep a SPRINT.md file with todo/done tasks
  3. During planning, spawn subagents with cheaper models for exploration
  4. Read the plan carefully - adjust, question, discuss details

The plan must be clear to you and accepted before a single line of code is written.

Here’s the SPRINT.md template I use:

# Sprint Backlog

## Current Sprint: [Sprint Name/Number]
Started: [Date]
Goal: [One-line sprint goal]

## In Progress
- [ ] **[TASK-001]** Task description
  - Assignee: Claude
  - Started: [Date]
  - Notes: Any blockers or context

## Backlog (Prioritized)
- [ ] **[TASK-002]** High priority task
- [ ] **[TASK-003]** Medium priority task
- [ ] **[TASK-004]** Lower priority task

## Completed This Sprint
- [x] **[TASK-000]** Initial setup
  - Completed: [Date]
  - PR/Commit: [reference]

## Blocked
<!-- Tasks waiting on external dependencies or decisions -->

## Sprint History
### Sprint 1 - [Date Range]
- Completed: 5 tasks
- Carried over: 2 tasks
- Notes: [retrospective notes]

Give Clear Instructions

Always provide:

  • Clear goal - what should the end result look like?
  • Location - where to make changes (if you know)
  • Tools - what tools to use (pytest, npm, etc.)
  • Docs/Links - any relevant documentation
  • Success criteria - how to test, what “done” means

Here are some real examples from my sessions:

Multi-perspective code review - clear roles, clear deliverable:

Spawn team of teammates to analyze state of code:
- Golang master to check golang code state and conventions
- Security analyst to check security of that project
- QA engineer to check testcases state here
- Technical writer to see documentation state.
On the end write me some summary from review.

Early design exploration - context, location, diverse perspectives:

I'm designing a CLI tool - it's in very early design phase
written down in README.md.
Create an agent team to explore this from different angles:
one teammate on UX, one on technical architecture,
one playing devil's advocate.

Content update with constraints - source material, translation requirement:

Update Polityka Prywatności site with new statement that
we host on statichost.eu. English statement is here,
but on this website must be in Polish.
This website is hosted on statichost.eu, a privacy-focused
hosting platform that does not store any personal data.
[rest of English text to translate]
Full details: https://www.statichost.eu/privacy/

E2E testing - shared task file, environment, scope:

Check SPRINT.md e2e tests tasks and plan usecases.
I'd like to run on debian 13. Part of plan should be
preparation of scripts in tmp/ dir (gitignored).
Your task will be to prepare scripts and run them
inside container.

My Daily Workflow

Claude Code has three modes that I switch between during work:

Mode       What it does
Regular    Ask permission for each edit
Auto Edit  Apply edits automatically, still ask before running commands
Plan       Research and plan only - no edits until the plan is approved

Phase 1: Sprint Planning

I start by describing the feature I want to build. I give Claude my high-level goal, what I know about the codebase and my idea for implementation. Then I ask it to create a SPRINT.md with tasks, sorted by priority. Early on I also ask it to set up testing and safety nets - pre-commit hooks with gitleaks for secret scanning, extensive linters, and test execution.
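For the gitleaks part, a sketch of the .pre-commit-config.yaml I’d use (pin rev to whatever release is current):

repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.18.0
    hooks:
      - id: gitleaks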

Phase 2: Task Execution

For each task from the sprint:

  1. /clear - fresh context
  2. /plan-task - plan in Plan mode, with parallel “scout” subagents gathering context (see the example prompt after this list)
  3. Read the plan carefully, ask for adjustments
  4. /interview-me - fill in gaps
  5. Accept plan, switch to Auto Edit mode
  6. Watch it work (or do something else in the meantime - another Claude session and another sprint!)
  7. Make sure tests pass
  8. Update SPRINT.md
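The kickoff prompt in step 2 is usually one short message, something like (the paths are illustrative):

/plan-task TASK-003: add retry logic to the API client with
exponential backoff. Relevant code lives in internal/client/.
Done means: unit tests cover timeout and 5xx retry paths.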

Phase 3: Commit and Push

  1. /commit-smart - atomic commit
  2. Ask “What’s the next task?”
  3. Repeat Phase 2 until sprint is done
  4. Remember: /clear or /compact between tasks!
  5. Clean up commit history if needed
  6. Push to remote

Phase 4: Review and Ship

  1. Check CI pipelines
  2. /clear - fresh context
  3. Run /code-review, give it the MR ID, and tell it what to focus on (example below)
  4. Fix any findings
  5. Send to human reviewer
  6. Done
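The review request in step 3 can be a single line (the MR ID is a placeholder):

/code-review MR 1234 - focus on error handling in the retry logic
and whether the new tests actually cover the backoff behavior.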

What if it all fails?

Failing is part of the process. This is where git pays off: all those small commits and keeping new features on branches are crucial. If a task fails, just check out the last commit that worked, create a new branch, and try again. If it fails again, try once more, this time giving the agent its previous failures as context. If it fails a third time… it’s time for you to do your “20%” of the job :)
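In git terms the recovery loop is just (the commit hash is a placeholder):

git log --oneline                  # find the last commit that worked
git checkout -b retry-2 abc1234    # branch off from it
# then start a fresh session and paste the previous failure summary as context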

Security

Last but not least: keep the blast radius small. Run Claude in an isolated environment - a VM, a container, some sort of sandbox where it has access only to the things you want it to access. I personally use Incus VMs and devcontainers for that.

If it needs to reach an external API, give it the minimal required access, preferably read-only. Treat a coding agent like an external contractor brought in for a specific task - would you give them access to your personal laptop?
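As a sketch, spinning up a disposable Incus VM for an agent session looks roughly like this (the image alias may differ on your remote):

incus launch images:debian/13 claude-sandbox --vm
incus exec claude-sandbox -- bash
# install Claude Code inside and share only the project you’re working on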

The best way to learn Claude Code is to use Claude Code to learn Claude Code.


Further Reading

  • Claude Code Workflow by silennai - great breakdown of the five pillars of agentic coding
  • The 80% Problem in Agentic Coding by Addy Osmani - why AI generates 80% of code but the last 20% is where the real work is
  • Just Talk to It by Peter Steinberger - simple and practical advice on working with AI agents
  • the-startup - spec-driven development framework for Claude Code with multi-agent workflows
  • Claude limits - quick notes about Claude limits / context and practical constraints