
So, you’ve tried running multiple Codex agents at once, and suddenly you’re staring at your repo thinking, “Which one just messed everything up?” Yeah… I’ve been there. I’m Hanks — I throw AI tools into real projects, break stuff on purpose, track the failures, and rebuild systems from scratch. The past two weeks, I ran Codex macOS App through parallel feature builds, hotfixes, and background automations — not demos, not toy examples — real multi-file migrations where one wrong move leaves you hunting through three branches to figure out which agent touched what.
My real question wasn’t “can Codex run multiple agents?” — the docs already say yes. What I needed to know: can I actually trust this setup when agents overlap on the same code, or am I asking for merge hell? After 47 parallel tasks across six repos, here’s what actually stuck.

I kept coming back to this: if you can't instantly answer "which agent is editing what right now," your setup is already broken.
Codex gives you two core isolation tools: threads and worktrees.
Here's the distinction that saved me: threads don't automatically isolate your changes. If you start three threads all targeting "Local" (your main checkout), they're all touching the same files. That's where worktrees come in.
A Git worktree is just a separate directory checkout of your repo. Same history, different working copy. Each worktree can be on a different branch, in a detached HEAD state, or wherever you need it — and changes stay isolated until you explicitly merge.
In Codex App (launched February 2, 2026), when you start a thread on a worktree, it spins up a fresh directory in $CODEX_HOME/worktrees/, checks out your target branch (or creates a detached HEAD), and runs there. Your main checkout? Untouched.
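To see the same isolation with plain Git, here is a minimal sketch. The helper name `new_task_worktree` and the `../<task>` location are my own illustration, not what Codex App does internally (it manages its own directories under $CODEX_HOME/worktrees/).

```shell
# Sketch: create an isolated worktree for one task.
# new_task_worktree is a hypothetical helper; run it from inside your repo.
new_task_worktree() {
  task="$1"
  # Creates the sibling directory ../<task> on a new branch named <task>,
  # starting from the current HEAD. The main checkout is left as it is.
  git worktree add -b "$task" "../$task"
}
```

Running `new_task_worktree refactor-auth` and then editing files under `../refactor-auth` should leave `git status` in your main checkout unchanged.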

Testing moment: I had three agents running — one refactoring auth logic, one updating API responses, one fixing linter issues. First run, I set all three to "Local." Twenty minutes in, I couldn't tell which changes belonged to which task. Second run, I used worktrees. Each agent's diff stayed clean, isolated, and traceable.
That's when I stopped treating threads as "parallel tasks" and started treating them as "parallel sandboxes."

Not every task needs isolation. If you're running a quick one-off ("fix this typo," "update that docstring"), spinning up a worktree is overhead you don't need.
But worktrees become essential when:
Real test: I ran a dependency upgrade (agent A on worktree) while fixing test failures (agent B on local). Agent A took 28 minutes. Without worktrees, I'd have been blocked from committing B's fixes until A finished. With worktrees, I merged B's PR, then came back to review A's changes when it landed in the review queue.
Here's what I learned the hard way: worktrees don't prevent conflicts — they just delay the merge point. If two agents modify the same function, you'll still hit a conflict when you try to combine them.
The fix isn't better tooling. It's better task boundaries.
My splitting rule: If I can't describe the task scope without saying "and also," it's two tasks.
Good task split:
Bad task split:
Agent C will touch both boundaries. If another agent is working on either side, you're setting up a collision.
I started writing task prompts like this:
Scope: Only files in src/auth/*
Constraint: Do not touch src/api/* or shared error types
Success: Auth tests pass, no changes outside src/auth/
Codex respects scope constraints better than I expected — but only if you're explicit.
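Being explicit in the prompt is one half. Checking the result is the other. A rough sketch of how the "no changes outside src/auth/" success criterion could be verified after the agent finishes, assuming its work is committed on the current branch. The function name `out_of_scope` is hypothetical.

```shell
# Sketch: print files changed on the current branch that fall outside
# an allowed path prefix. Empty output means the agent stayed in scope.
# Usage: out_of_scope <base-branch> <allowed-prefix>
out_of_scope() {
  base="$1"
  prefix="$2"
  # Three-dot diff: changes made since the branch diverged from <base>.
  # The prefix is used as a grep pattern anchored at the start of the path.
  git diff --name-only "$base...HEAD" | grep -v "^$prefix" || true
}
```

For the prompt above, `out_of_scope main src/auth/` should print nothing if the agent respected the constraint. Uncommitted edits are not included, so commit first.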
One failure case: I asked an agent to "clean up unused imports across the codebase" while another was adding new imports for a feature. Both agents touched the same files. Merge conflict. The lesson: never run cleanup tasks in parallel with feature work unless you're fine with manual resolution.
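A collision like this can be spotted before the merge by comparing which files each branch touched. This is only a sketch, assuming both agents' work sits on named branches. `overlapping_files` is a name I made up, and a shared file is a warning sign, not proof of a conflict.

```shell
# Sketch: list files modified on both branches since each diverged from <base>.
# Usage: overlapping_files <base> <branch-a> <branch-b>
overlapping_files() {
  base="$1"
  a="$2"
  b="$3"
  tmp_a=$(mktemp)
  tmp_b=$(mktemp)
  git diff --name-only "$base...$a" | sort > "$tmp_a"
  git diff --name-only "$base...$b" | sort > "$tmp_b"
  # comm -12 keeps only the lines present in both sorted lists
  comm -12 "$tmp_a" "$tmp_b"
  rm -f "$tmp_a" "$tmp_b"
}
```

If `overlapping_files main cleanup-imports feature-x` prints anything, those are the files to review by hand before combining the two branches.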
When you start a new thread in Codex App, you pick a target:
I messed this up twice before building a decision tree:
Testing moment: I was running a database schema migration agent. First attempt: Local target. Agent generated migration files, but then I needed to test them, which meant running my local dev server. But the agent was still running commands in the same directory. I had to cancel, stash, test manually, then restart.
Second attempt: Worktree target. Agent ran in isolation. I could test the migration in the worktree's terminal, see it fail, give feedback, let the agent iterate — all while my main checkout kept running my normal dev environment.
According to OpenAI's announcement, each agent on a worktree can run for up to 30 minutes, isolated from your main workspace. That time limit matters — if your task needs longer, you might hit the wall and have to restart.
Let's say you started a thread on Local, but halfway through you realize it should've been a worktree. What now?
Option 1: Commit and restart. If the agent hasn't made breaking changes:
git add .
git commit -m "WIP: [task name]"
git worktree add ../task-name-worktree # New branch starts at HEAD, so it already contains the WIP commit (no cherry-pick needed)
Then start a new thread targeting that worktree.
Option 2: Stash and isolate. If changes are incomplete:
git stash -u # -u also stashes untracked files the agent created
git worktree add ../task-name-worktree [branch]
cd ../task-name-worktree
git stash pop # Stashes are shared across worktrees, so the changes apply here
Resume the task in the worktree context.
What I actually did: I just let the local thread finish (it was 80% done), reviewed the diff, committed to a feature branch, then deleted the local changes. Not elegant, but practical.
The real fix is deciding target before you start. I now have a 10-second checklist:
Worktrees accumulate fast. After a week of testing, I had 11 worktree directories scattered across $CODEX_HOME/worktrees/. Half were finished tasks I'd forgotten to clean up.
Git's official worktree documentation recommends removing worktrees as soon as you're done with them, especially if they contain uncommitted work. Codex App automates some of this — if you don't "pin" a worktree by creating a branch on it, Codex will eventually clean it up. But "eventually" isn't a strategy.
I started naming worktrees by task type + timestamp:
refactor-auth-0203
bugfix-api-timeout-0204
experiment-cache-layer-0203
This made it obvious which worktrees were stale. If I saw experiment-cache-layer-0203 on February 5th and couldn't remember what it was testing, I knew it was safe to delete.
Codex App's default: It creates worktrees in detached HEAD state with auto-generated names. That's fine for throwaway work, but if you want to keep the worktree around (for iteration or handoff), use the "Create branch here" button in the thread header. This converts the worktree into a real branch, pins it, and adds it to your sidebar for easy access.
List active worktrees:
git worktree list
Remove a finished worktree:
git worktree remove ../worktree-name
If it has uncommitted changes, add --force:
git worktree remove --force ../worktree-name

Testing moment: I had a worktree where the agent made changes, I reviewed them, but never committed. When I tried to remove it, Git refused (protecting uncommitted work). I had to either commit the changes or force-remove. The lesson: always commit or explicitly decide to discard before cleanup.
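To make that decision before running the remove command, it helps to see which worktrees still hold uncommitted work. A small sketch follows. `worktree_status` is a hypothetical helper, and it assumes a normal (non-bare) repository.

```shell
# Sketch: label every worktree as clean or dirty before cleanup.
worktree_status() {
  # --porcelain prints one "worktree <path>" line per worktree
  git worktree list --porcelain | sed -n 's/^worktree //p' |
  while IFS= read -r dir; do
    # Any output from status --porcelain means uncommitted or untracked changes
    if [ -n "$(git -C "$dir" status --porcelain)" ]; then
      echo "dirty $dir"
    else
      echo "clean $dir"
    fi
  done
}
```

Anything marked clean can go through a plain `git worktree remove`. Anything marked dirty needs a commit or a deliberate `--force`.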
Sometimes you start work on a worktree, but then need to bring changes back to your main checkout (e.g., to run integration tests that require your full local environment).
Option 1: Merge the branch. If you created a branch on the worktree:
git checkout main # In your local checkout
git merge worktree-branch-name
Option 2: Cherry-pick specific commits. If you only want some changes:
git log # In the worktree, find the commit hash
git checkout main # In your local checkout
git cherry-pick <commit-hash>
What worked best: I treated worktrees as staging grounds. Agent finishes work → I review the diff in the worktree → if it's good, I create a branch → open a PR → delete the worktree after merge.
If I needed to test changes locally before merging, I'd check out the worktree's branch in my main directory. Git refuses to check out a branch that is already checked out in another worktree, so detach at that branch's current commit instead:
git checkout --detach worktree-branch-name
Then test, then switch back:
git checkout main
This keeps the worktree clean for the agent's continued iteration, while giving me local access when I need it.
After two weeks, here's what stuck:
refactor-auth > feature/auth-refactor > random hash. Future you will thank present you.
Run git worktree list daily and process finished tasks.
At Macaron, we see this pattern all the time: people start with conversation-style task delegation, then hit the reality that ideas need structure to become work. Our agent helps translate context into execution steps, not by running code for you, but by building the scaffold so you know what needs doing and in what order. If you're juggling parallel tasks and want a system that keeps your execution plans from fragmenting, you can test it inside real workflows without locking into anything permanent.