The /start command is a 1,400-line markdown state machine with 15 numbered steps that takes a Linear ticket ID and produces a ready-to-test implementation. I type /start AUTM-123 and walk away. When I come back, there's a feature branch, an implementation plan posted to Linear, working code, and passing tests.
The previous post covered optimizing /commit for the 1M context window. This one covers /start - the command that runs before commit, and the one where defensive overhead had the most room to shrink.
What /start Actually Does
The 15 steps, in sequence:
0. Parse arguments, check for existing sessions
0.5. Create 9 workflow tasks with dependency wiring
1. Detect project type from working directory
2. Pre-flight validation (clean worktree, valid repo)
3. Validate ticket ID format
4. Fetch ticket from Linear
5. Fetch comments (for reopened ticket detection)
6. Detect reopened tickets with feedback context
7. Validate team-repository match, auto-switch if needed
7.1. Load Workflow Profile from CLAUDE.md
7.5. Detect session rules from user message
8. Create implementation plan (via Plan subagent on Sonnet)
9. Get user approval on plan
10. Create feature branch
11. Update Linear to In Progress, post plan comment
12. Confirm branch ready
13. Implement changes (the actual coding)
14. Run quality gates
14.5. Verification gate
15. Hand off for user testing
Steps 4 through 7.1 were the problem. Five sequential operations, each waiting for the previous one to complete, despite most of them being independent.
The Sequential Tax
I mapped the data dependencies between steps. The question was simple: which operations actually need results from previous operations, and which are just sequential because I wrote them that way?
Step 4 (fetch ticket) returns the UUID and team prefix. Step 5 (fetch comments) needs the UUID. Step 7 (validate team-repo) needs the prefix. Step 7.1 (load profile) needs TARGET_DIR, which Step 7 might change. So there’s a real dependency chain: 4 → 5, 4 → 7 → 7.1.
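That dependency chain can be derived mechanically. Here's a sketch (the step names and edges come from the analysis above; the batching function is my illustration, not code from the command file):

```python
# Data dependencies between the slow middle steps of /start.
# deps[b] is the set of steps whose results step b needs.
deps = {
    "4_fetch_ticket": set(),
    "5_fetch_comments": {"4_fetch_ticket"},        # needs the UUID
    "7_validate_team_repo": {"4_fetch_ticket"},    # needs the team prefix
    "7.1_load_profile": {"7_validate_team_repo"},  # needs TARGET_DIR
}

def parallel_batches(deps):
    """Group steps into waves; everything within a wave can run concurrently."""
    remaining, done, batches = dict(deps), set(), []
    while remaining:
        ready = sorted(s for s, d in remaining.items() if d <= done)
        batches.append(ready)
        done |= set(ready)
        for s in ready:
            del remaining[s]
    return batches

# Three waves instead of four serial steps: the fetch, then comments plus
# team-repo validation together, then the profile load.
print(parallel_batches(deps))
```

The strict graph yields three waves. The next observation is about which edge in that graph is actually soft.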
But Step 7.1 (reading CLAUDE.md) doesn’t actually need Step 7’s result in most cases. Most of the time, you’re already in the right repo. The profile read could start immediately - and if Step 7 later discovers a repo mismatch, the profile gets re-read from the correct directory.
This gave me the restructured flow.
Five Changes
1. Parallel Ticket Fetch + Profile Load
Before: Five sequential MCP/file calls across Steps 4, 5, and 7.1.
After: Two parallel messages.
Message 1 dispatches three calls simultaneously:
- `get_issue` (Linear MCP)
- `Read(CLAUDE.md)` (profile load, moved from Step 7.1)
- Project detection (already happened in Step 1, but formalizes the parallel structure)
Message 2 dispatches two calls that need results from Message 1:
- `list_comments` (needs the UUID)
- Team-repo validation (needs the team prefix)
The total wall-clock time drops from about 8 seconds of serial calls to about 3 seconds of two parallel batches. The profile load that previously waited until Step 7.1 now runs concurrently with the ticket fetch. If Step 7 later discovers a repo mismatch (rare), the profile gets re-loaded from the new TARGET_DIR - a small cost paid only in the uncommon case.
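The two-message shape looks like this in asyncio terms. The call names follow the tools mentioned above, but the async wrappers and their return shapes are stand-ins I made up to show the structure, with sleeps simulating call latency:

```python
import asyncio

# Stand-ins for the real Linear MCP tools and file reads.
async def get_issue(ticket_id):
    await asyncio.sleep(0.01)
    return {"uuid": "uuid-123", "prefix": ticket_id.split("-")[0]}

async def read_claude_md(target_dir):
    await asyncio.sleep(0.01)
    return {"base_branch": "main"}  # the Workflow Profile

async def detect_project(target_dir):
    await asyncio.sleep(0.01)
    return "node"

async def list_comments(uuid):
    await asyncio.sleep(0.01)
    return []

async def validate_team_repo(prefix):
    await asyncio.sleep(0.01)
    return True

async def start_fetch(ticket_id, target_dir):
    # Message 1: three independent calls, dispatched together.
    issue, profile, project = await asyncio.gather(
        get_issue(ticket_id),
        read_claude_md(target_dir),   # profile load, moved up from Step 7.1
        detect_project(target_dir),
    )
    # Message 2: two calls that need Message 1's results.
    comments, repo_ok = await asyncio.gather(
        list_comments(issue["uuid"]),
        validate_team_repo(issue["prefix"]),
    )
    return issue, profile, comments, repo_ok
```

Two `gather` calls, two round trips - the serial chain of five becomes a depth-two pipeline.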
2. Conditional Task Creation
Step 0.5 creates 9 tasks with full dependency wiring. That’s 2 messages and 17 tool calls. At 200K tokens, this made sense - tasks survive context compaction, so they’re the recovery mechanism when Claude loses its place mid-workflow.
At 1M tokens, a single /start workflow rarely compacts. The 9 tasks are overhead for the majority of tickets.
I added conditions. Tasks are created when:
- Chain mode is active (multiple tickets need per-ticket tracking)
- The ticket is an epic child (complex, multi-file work)
- Resuming a previous session (tasks already exist)
For a simple single-ticket /start - the most common case - task creation is skipped entirely. If the plan turns out to have 5+ implementation tasks, tasks get created retroactively at that point.
The `TaskUpdate` calls throughout subsequent steps guard against the no-tasks case: `if (t1) TaskUpdate(t1.id, ...)`. This is a no-op pattern - it adds no overhead when tasks exist, and silently skips when they don't.
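The creation condition reduces to a small predicate. A sketch (the parameter names are my own labels; the command file expresses this in prose):

```python
def should_create_tasks(chain_mode: bool, epic_child: bool,
                        resuming: bool, planned_task_count: int = 0) -> bool:
    """Create the 9 workflow tasks only when tracking pays for itself."""
    if chain_mode or epic_child or resuming:
        return True
    # Retroactive case: a "simple" ticket whose plan grew to 5+ tasks.
    return planned_task_count >= 5

assert not should_create_tasks(False, False, False)   # common case: skip
assert should_create_tasks(chain_mode=True, epic_child=False, resuming=False)
assert should_create_tasks(False, False, False, planned_task_count=6)
```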
3. Reduced Checkpoints
The old /start wrote session state after Steps 7, 7.5, 8, 9, 10, 11, 12, 13, 14, and 15. That’s roughly 10 file writes during a single workflow run. Each write is cheap individually, but collectively they add up - and more importantly, they represent 10 assumptions that context might compact between any two adjacent steps.
With 1M tokens, I reduced to 3 critical checkpoints:
- After plan approval (Step 9) - the first irreversible decision
- After branch creation (Step 10) - git state is now established
- At handoff (Step 15) - implementation complete
Everything between these points can be reconstructed from git state and the Linear API if context does compact. The plan file is on disk. The branch exists in git. The ticket status is in Linear. The session file doesn’t need to track what’s already tracked elsewhere.
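The checkpoint policy can be sketched as a guard. The step numbers come from the list above; the write mechanism here is a stand-in for the actual session-file write:

```python
# The only three points where session state hits disk.
CRITICAL_CHECKPOINTS = {9, 10, 15}  # plan approved, branch created, handoff

def checkpoint(step: float, session: dict, writes: list) -> None:
    """Write session state only at irreversible transitions; everything
    else is reconstructable from git state and the Linear API."""
    if step in CRITICAL_CHECKPOINTS:
        writes.append((step, dict(session)))  # stand-in for the file write

writes = []
for step in [7, 7.5, 8, 9, 10, 11, 12, 13, 14, 15]:
    checkpoint(step, {"ticket": "AUTM-123"}, writes)
assert len(writes) == 3  # down from 10
```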
4. Inline Plans for Simple Tickets
Step 8 always dispatched a Plan subagent on Sonnet. The subagent explores the codebase, reads relevant files, and synthesizes an implementation plan. This keeps verbose search output out of the main context window - a significant concern at 200K tokens, less so at 1M.
For tickets where the description clearly scopes to 1-3 files - “fix the button color on the settings page” or “add a loading spinner to the dashboard” - the subagent dispatch overhead (10-15 seconds) isn’t justified. The main context has plenty of room for a few file reads and a short plan.
The new logic is conditional:
- 3 or fewer files in scope AND clear description → generate the plan inline using Glob/Grep/Read directly
- 4+ files, ambiguous description, reopened ticket, epic child → dispatch the Plan subagent as before
This trades a small amount of Opus context (the inline plan uses the more expensive model) for 10-15 seconds of wall-clock time on simple tickets. Since simple tickets are the majority, the aggregate savings are meaningful.
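As a predicate (thresholds from the rules above; the signal names are my own labels for what the command file checks):

```python
def plan_strategy(files_in_scope: int, clear_description: bool,
                  reopened: bool, epic_child: bool) -> str:
    """Choose between an inline plan (Opus, main context) and the
    Plan subagent (Sonnet, isolated context)."""
    if reopened or epic_child:
        return "subagent"   # extra context always warrants isolation
    if files_in_scope <= 3 and clear_description:
        return "inline"
    return "subagent"

assert plan_strategy(2, True, False, False) == "inline"    # "fix the button color"
assert plan_strategy(6, True, False, False) == "subagent"  # wide scope
assert plan_strategy(2, True, True, False) == "subagent"   # reopened ticket
```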
5. Early Profile Load
This isn’t a separate optimization - it falls out of Change 1. But it’s worth calling out because it changes the step ordering in a way that matters for the rest of the workflow.
The Workflow Profile (base branch, quality gates, review settings, deploy hints) previously loaded at Step 7.1 - after team-repo validation. Now it loads in Message 1 of the parallel fetch, alongside the ticket fetch. The Step 7.1 heading still exists in the command file for documentation purposes, but it notes that execution moved to the parallel fetch.
This means the profile is available earlier. Steps 2-3 (pre-flight validation) still use the basic PROJECT_TYPE from Step 1, but anything from Step 4 onward has full profile access. No behavior change in normal flow - the profile was always available by Step 8 when it was first needed - but the earlier load eliminates a category of bugs where profile fields are referenced before the profile is parsed.
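For reference, the profile's shape as a sketch - field names are inferred from the description above, not the actual CLAUDE.md schema:

```python
from dataclasses import dataclass, field

@dataclass
class WorkflowProfile:
    """Per-repo settings loaded from CLAUDE.md, now available from Step 4 on."""
    base_branch: str = "main"
    quality_gates: list[str] = field(default_factory=list)  # e.g. lint, tests
    review_settings: dict = field(default_factory=dict)
    deploy_hints: dict = field(default_factory=dict)

profile = WorkflowProfile(quality_gates=["lint", "unit-tests"])
assert profile.base_branch == "main"
```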
What I Didn’t Change
The session state schema stays the same. All those fields - stashCreated, stashOriginalBranch, chainInvokingDir, ticketContext - are still valuable when context does compact. The schema isn’t the problem. Writing it 10 times per workflow was.
The Plan subagent stays for complex tickets. The 1M window is large but not infinite. A thorough codebase exploration for a 10-file feature change produces thousands of lines of Grep and Read output. Keeping that in a subagent’s context (on cheaper Sonnet) rather than the main context (on Opus) is still the right trade-off for complex work.
The verification gate at Step 14.5 stays mandatory. This is the one that catches skipped unit tests - a real problem I documented after a production incident. Context compaction making Claude skip tests was the original motivation, and even with 1M tokens reducing compaction frequency, the verification gate costs seconds and prevents hours of debugging.
Performance Target
| Metric | Before | After |
|---|---|---|
| Ticket fetch + profile load | ~8s serial | ~3s parallel |
| Task creation (simple tickets) | ~4s | 0s (skipped) |
| Checkpointing overhead | ~5s (10 writes) | ~2s (3 writes) |
| Plan for small tickets | ~25s (subagent) | ~10s (inline) |
| Total | - | ~12-22s saved per /start |
The range depends on ticket complexity. Simple tickets save the most (inline plan + no tasks = ~22s). Complex tickets save the least (subagent + tasks + full checkpoints = ~12s from the parallel fetch alone).
The Pattern
The same lesson from the /commit optimization
applies here: defensive machinery built for one constraint persists after the constraint changes. The difference with /start is that the machinery is more deeply embedded. Session checkpoints aren’t a single function call - they’re woven into the control flow between every pair of steps. Removing them required reasoning about what’s reconstructable from external state (git, Linear) versus what exists only in the session file.
The next two posts cover /staging and /production - the deployment commands. The pattern shifts there from “remove defensive overhead” to “parallelize independent external checks” - Sentry queries, smoke tests, worktree resets. Different shape of optimization, same underlying principle.