Context Engineering
How to convert team criteria into persistent, operational instructions — AGENTS.md, skills, and context hygiene for AI-assisted development.
The quality of an agent’s output depends heavily on the quality of the context it receives. That makes it an engineering problem, not just a prompt-writing one: context needs to be designed, maintained, and versioned like any other artifact.
This page is about the practical side: how to encode your team’s standards into persistent, reusable instructions that the agent loads automatically, every session.
Context Engineering vs. Prompt Engineering
Prompt engineering is per-session: you write a good prompt for a specific task. Context engineering is structural: you design what the agent always knows, so you don’t have to repeat it. AGENTS.md is context engineering; the message you type to start a task is prompt engineering.
AGENTS.md — the foundation
AGENTS.md is a file placed at the project root. Several agents (Cursor and Codex, for example) load it automatically at the start of every session. Support varies by tool and version, and some tools read their own file instead (Claude Code, for instance, looks for CLAUDE.md), so check your tool’s documentation. It’s one of the highest-leverage context investments you can make: write it once and stop repeating yourself.
A well-crafted AGENTS.md contains everything the agent needs to work within your team’s standards without being reminded:
- Tech stack and quick start — language, frameworks, how to run and test the project
- Architecture rules — layer structure, dependency direction, port/adapter conventions
- Naming conventions — classes, files, events, commands
- Commit format — Conventional Commits types and scope rules
- Testing requirements — what must be tested, at which layer, in which format
- Negative instructions — what the agent must never do (the most underused and most valuable section)
- Quality checklist — what to verify before marking a task complete
Negative instructions
Negative instructions are consistently the highest-signal lines in an AGENTS.md. They encode patterns you’ve already been burned by — failures the agent will repeat forever without explicit prohibition.
## Never do this
- NEVER use `any` types in TypeScript — use `unknown` and narrow with type guards
- NEVER add infrastructure imports inside src/domain/ or src/application/
- NEVER commit .env files or secrets
- NEVER use console.log in application code — use the injected logger
- NEVER modify files in /config/secrets/
- NEVER write tests that call private methods directly
- NEVER create a service that does more than one thing
Each line on this list came from a real error. That’s why it works.
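To make two of these rules concrete, here is a minimal TypeScript sketch of what compliant code might look like: `unknown` narrowed with a type guard instead of `any`, and an injected logger instead of `console.log`. The names `Logger`, `OrderPayload`, `isOrderPayload`, and `parseOrder` are hypothetical and exist only for illustration.

```typescript
// Hypothetical logger interface; a real project would inject its own.
interface Logger {
  info(message: string): void;
  warn(message: string): void;
}

// Hypothetical payload shape used only for this example.
interface OrderPayload {
  id: string;
  total: number;
}

// Type guard: narrows `unknown` to OrderPayload without resorting to `any`.
function isOrderPayload(value: unknown): value is OrderPayload {
  if (typeof value !== "object" || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate["id"] === "string" && typeof candidate["total"] === "number";
}

// The logger is passed in as a dependency rather than calling console.log.
function parseOrder(raw: string, logger: Logger): OrderPayload | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    logger.warn("payload is not valid JSON");
    return null;
  }
  if (!isOrderPayload(data)) {
    logger.warn("payload does not match OrderPayload");
    return null;
  }
  logger.info(`parsed order ${data.id}`);
  return data;
}
```

Showing the agent the preferred alternative next to each prohibition, as the first rule in the list does, tends to be more useful than the prohibition alone.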
Full AGENTS.md template
# Agent Instructions
## Tech Stack
[Language + version, main framework, database, package manager]
## Quick Start
[install → dev → test → build commands]
## Architecture Rules
- Domain: src/domain/ — business logic, no external imports ever
- Application: src/application/ — use cases and port interfaces
- Infrastructure: src/infrastructure/ — port implementations
- Dependencies always point inward: infrastructure → application → domain
- All external dependencies go through Port interfaces in src/application/ports/
- NEVER instantiate DB clients or HTTP clients inside domain or application
## Naming Conventions
- Classes: PascalCase | Interfaces: PascalCase (no I prefix)
- Files: kebab-case | Domain folders: kebab-case, plural
- Events: past tense (OrderCreated) | Commands: imperative (CreateOrder)
- Constants: SCREAMING_SNAKE_CASE
## Commit Format
All commits MUST use Conventional Commits:
type(scope): subject
Types: feat | fix | refactor | test | docs | chore | perf | ci | style
## Testing Requirements
- Domain logic: unit tested, no database or HTTP
- Repository implementations: integration tested with real DB
- Every OpenSpec scenario: has a corresponding Gherkin feature file
- Test behaviour, not implementation — never call private methods in tests
## Never do this
- NEVER use `any` types
- NEVER add infrastructure imports to domain or application layers
- NEVER commit .env files
- NEVER use console.log — use the logger service
- NEVER write tests that verify SQL queries or internal method calls
## Quality Checklist (verify before marking task complete)
- [ ] No infrastructure imports in domain or application
- [ ] All dependencies are constructor-injected as interfaces
- [ ] Commit messages follow Conventional Commits format
- [ ] Tests cover the behaviour implemented, not the implementation
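As an illustration of the Architecture Rules in the template above, the following sketch shows one possible shape for a domain class, a port interface, a use case with a constructor-injected dependency, and an adapter. All names (`Order`, `OrderRepository`, `CreateOrder`, `InMemoryOrderRepository`) and the file paths in the comments are assumptions. The methods are synchronous for brevity; a real port would probably return Promises.

```typescript
// src/domain/order.ts (hypothetical): business logic only, no imports.
class Order {
  readonly id: string;
  readonly totalCents: number;

  constructor(id: string, totalCents: number) {
    if (totalCents < 0) throw new Error("Order total cannot be negative");
    this.id = id;
    this.totalCents = totalCents;
  }
}

// src/application/ports/order-repository.ts (hypothetical): the port.
interface OrderRepository {
  save(order: Order): void;
  findById(id: string): Order | undefined;
}

// src/application/create-order.ts (hypothetical): the use case depends
// on the port interface, never on a concrete database client.
class CreateOrder {
  private readonly orders: OrderRepository;

  constructor(orders: OrderRepository) {
    this.orders = orders;
  }

  execute(id: string, totalCents: number): Order {
    const order = new Order(id, totalCents);
    this.orders.save(order);
    return order;
  }
}

// src/infrastructure/in-memory-order-repository.ts (hypothetical): an
// adapter implementing the port. A database-backed adapter would sit
// in the same layer.
class InMemoryOrderRepository implements OrderRepository {
  private readonly store = new Map<string, Order>();

  save(order: Order): void {
    this.store.set(order.id, order);
  }

  findById(id: string): Order | undefined {
    return this.store.get(id);
  }
}

// Wiring happens at the outer edge, so dependencies point inward.
const createOrder = new CreateOrder(new InMemoryOrderRepository());
createOrder.execute("order-1", 1500);
```

Because `CreateOrder` only sees the interface, the domain and application layers can be unit tested without a database, which is what the Testing Requirements section asks for.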
Claude-specific: CLAUDE.md
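For Claude Code specifically, place a CLAUDE.md in the project root (or at .claude/CLAUDE.md). Claude Code reads this file at session start, and an `@AGENTS.md` import line inside it is one way to pull in the shared rules. The file then adds Claude-specific behaviour on top of the general AGENTS.md rules: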
# Claude Instructions
You are a senior engineer at Aircury following our internal Framework.
## Before implementing any task
1. Read the relevant spec in openspec/changes/<name>/specs/
2. Check design.md for architecture decisions specific to this change
3. Verify your plan against AGENTS.md architecture rules before writing code
## When creating files
- Follow the Hexagonal Architecture structure from design.md
- Domain classes: no infrastructure imports, ever
- All external dependencies: constructor-injected as interfaces
## When writing tests
- Unit tests for domain logic (no database, no HTTP)
- Integration tests for adapter implementations (use test database)
- BDD/Gherkin for user-facing scenarios from the spec file
## Commit discipline
- Commit after each logical unit of work — don't batch everything at the end
- If a task touches multiple layers, commit each layer separately
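Commit rules like these are easier to keep when something checks them mechanically. Below is a small sketch of a header validator for the commit types listed in the AGENTS.md template. The function name `isValidCommitHeader`, the optional scope, the optional `!` breaking-change marker, and the lowercase scope character set are assumptions based on the general Conventional Commits shape, not rules stated by this framework.

```typescript
// Commit types taken from the AGENTS.md template.
const COMMIT_TYPES = [
  "feat", "fix", "refactor", "test", "docs", "chore", "perf", "ci", "style",
] as const;

// type, optional "(scope)", optional "!" for breaking changes, then ": subject".
// Assumption: scopes are lowercase letters, digits, and hyphens.
const HEADER_PATTERN = new RegExp(
  `^(${COMMIT_TYPES.join("|")})(\\([a-z0-9-]+\\))?!?: \\S.*$`,
);

// Validates only the first line of the commit message.
function isValidCommitHeader(message: string): boolean {
  const [header = ""] = message.split("\n");
  return HEADER_PATTERN.test(header);
}

const examples = [
  "feat(orders): add cancellation endpoint", // valid
  "fix: handle empty cart",                  // valid, no scope
  "Added some stuff",                        // invalid, no type
];
const invalid = examples.filter((message) => !isValidCommitHeader(message));
```

A check like this could run in a commit-msg hook or in CI, so the agent gets the same feedback a human contributor would.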
Skills — team criteria as reusable instructions
Skills are named instruction sets you can invoke by name, rather than typing context from scratch each session. Where AGENTS.md covers what’s always true, skills cover what’s true for a specific type of task.
Examples of skills worth encoding:
| Skill | What it does |
|---|---|
| /propose | Scaffolds an OpenSpec proposal from a one-line description |
| /spec | Generates spec files from a proposal with SHALL/WHEN/THEN format |
| /review | Reviews a diff against the AGENTS.md architecture checklist |
| /adr | Creates an ADR from a design decision made in the current session |
| /feature-flag | Wraps a new feature in a flag following the project’s feature flag pattern |
Skills encode the judgement the team has accumulated — the steps you’d take anyway, made repeatable and invokable. They’re the difference between an agent that understands your process and one that makes up a process on the fly.
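Conceptually, a skill is a lookup from a name to stored instructions. The sketch below models that idea in TypeScript. It does not reproduce how any particular agent stores or loads skills; the `Skill` shape, the `expandSkill` function, and the instruction text are hypothetical.

```typescript
// Hypothetical model of a skill: a name plus the team's stored instructions.
interface Skill {
  name: string;         // invoked as "/<name>"
  description: string;
  instructions: string; // the steps the team would otherwise retype
}

// Illustrative registry; the instruction text is invented for this example.
const skills: Skill[] = [
  {
    name: "review",
    description: "Reviews a diff against the AGENTS.md architecture checklist",
    instructions:
      "Check the diff for infrastructure imports in domain or application. " +
      "Check that all dependencies are constructor-injected as interfaces.",
  },
  {
    name: "adr",
    description: "Creates an ADR from a design decision made in the current session",
    instructions: "Summarise the decision, the alternatives considered, and the consequences.",
  },
];

// Expands "/name optional arguments" into the full instruction text.
// Returns null when the input is not a known skill invocation.
function expandSkill(input: string, registry: Skill[]): string | null {
  const match = /^\/([a-z-]+)(?:\s+([\s\S]*))?$/.exec(input.trim());
  if (match === null) return null;
  const [, name, args = ""] = match;
  const skill = registry.find((candidate) => candidate.name === name);
  if (skill === undefined) return null;
  return args === "" ? skill.instructions : `${skill.instructions}\n\nInput: ${args}`;
}

const expanded = expandSkill("/review src/orders.diff", skills);
```

The point of the model is that the instructions live in one versioned place, so improving a skill improves every future invocation.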
Context hygiene
Context decay
After a long session, agents can lose track of constraints defined early in the conversation. Architecture rules agreed in message 1 may be silently violated by message 40. This is one of the main sources of quality drift in long AI sessions.
Rules for managing context over long sessions:
- Put rules in AGENTS.md, not in chat. Rules given only in chat can be truncated or summarised away as the context window fills. AGENTS.md is loaded again at the start of every session.
- Start fresh context for each major task. Don’t carry context from a completed feature into a new one.
- Re-state key constraints when switching files. “Now implementing the repository adapter — architecture rules apply as defined in AGENTS.md” is enough.
- Check architecture early, not just at the end. By the time the full feature is implemented, architectural mistakes are expensive to unwind.
- Commit incrementally. Committing after each task forces a review point and keeps the implementation scope narrow.
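Checking architecture early is cheaper when part of the check is automated. The following sketch flags infrastructure imports in files under src/domain/ or src/application/. It is a regex heuristic under stated assumptions: the directory names come from the AGENTS.md template, `findLayerViolations` is a hypothetical helper, and the pattern will miss dynamic imports and path aliases.

```typescript
interface Violation {
  file: string;
  importPath: string;
}

// Layers that must not import from the infrastructure layer.
const INNER_LAYERS = ["src/domain/", "src/application/"];

// Matches `from "..."` in import/export statements and bare `import "..."`.
const IMPORT_PATTERN = /(?:from|import)\s+["']([^"']+)["']/;

// True when an import path contains an "infrastructure" path segment.
const INFRASTRUCTURE_SEGMENT = /(^|\/)infrastructure(\/|$)/;

// Returns every infrastructure import found in an inner-layer file.
// Files outside the inner layers are never reported.
function findLayerViolations(file: string, source: string): Violation[] {
  if (!INNER_LAYERS.some((layer) => file.startsWith(layer))) return [];
  const violations: Violation[] = [];
  // Fresh global regex per call so no match state leaks between calls.
  const pattern = new RegExp(IMPORT_PATTERN.source, "g");
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const importPath = match[1] ?? "";
    if (INFRASTRUCTURE_SEGMENT.test(importPath)) {
      violations.push({ file, importPath });
    }
  }
  return violations;
}

const sample = [
  'import { Pool } from "../infrastructure/db/pool";',
  'import { Money } from "./money";',
].join("\n");
const found = findLayerViolations("src/domain/order.ts", sample);
```

Run against changed files after each task, a check like this surfaces a layering mistake at the commit where it was introduced rather than at the end of the feature.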
The context pipeline
There are four layers of context the agent uses, in order of persistence:
| Layer | What it is | Persistence |
|---|---|---|
| AGENTS.md | Project-wide standards and rules | Permanent |
| Design document | Architecture decisions for this change | Per-change |
| Spec file | What the system must do | Per-capability |
| Task prompt | What to implement right now | Per-session |
Each layer narrows the agent’s solution space further. AGENTS.md rules out entire classes of architectural mistakes. The design document defines the specific structure for this change. The spec defines the behaviour. The task prompt gives the immediate scope.
When all four layers are present and consistent, the agent’s output becomes much more predictable. When any layer is missing or contradicts another, variance increases.
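The layering can be sketched as a function that assembles the four layers from most to least persistent and reports any that are missing. This is only a model of the idea, not a description of how an agent actually loads context; `assembleContext` and the sample layer contents are hypothetical.

```typescript
type LayerName = "AGENTS.md" | "Design document" | "Spec file" | "Task prompt";

// Order taken from the table above: most persistent first.
const LAYER_ORDER: LayerName[] = [
  "AGENTS.md",
  "Design document",
  "Spec file",
  "Task prompt",
];

interface AssembledContext {
  text: string;         // layers concatenated in persistence order
  missing: LayerName[]; // layers that were absent or empty
}

// Builds one context string and records which layers are missing,
// so a gap is visible before the agent starts working.
function assembleContext(layers: Partial<Record<LayerName, string>>): AssembledContext {
  const sections: string[] = [];
  const missing: LayerName[] = [];
  for (const name of LAYER_ORDER) {
    const content = layers[name]?.trim();
    if (content === undefined || content === "") {
      missing.push(name);
      continue;
    }
    sections.push(`# ${name}\n${content}`);
  }
  return { text: sections.join("\n\n"), missing };
}

// Illustrative contents only.
const context = assembleContext({
  "AGENTS.md": "Dependencies always point inward.",
  "Spec file": "The system SHALL reject negative order totals.",
  "Task prompt": "Implement the CreateOrder use case.",
});
// context.missing contains "Design document" because that layer was not supplied.
```

Treating a missing layer as a warning before starting a task is one way to act on the observation that variance rises when a layer is absent.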