Tech Insights 29 January 2026 14 min read

How to Orchestrate AI CLI Agents with Claude Code

Claude Code can run Gemini CLI, Claude CLI, and Codex CLI as background tasks in parallel, turning hours of manual research into minutes of automated, cross-validated output. This step-by-step guide shows how to set up the orchestration pattern and build reusable workflows.

Mark McNeece Founder, 365i
Claude Code as a central orchestrator coordinating Gemini CLI, Claude CLI, and Codex CLI agents in parallel
At a Glance
  • Claude Code can orchestrate Gemini CLI, Claude CLI, and Codex CLI as parallel background agents, reducing research tasks from hours to 10-15 minutes.
  • Each agent has a specialism: Gemini for web research with Google search, Claude CLI for local file analysis and validation, Codex for code generation and review.
  • Cross-validation between agents catches hallucinated data and broken source URLs before they reach your final output.
  • Two of the three tools are free to use: Gemini CLI gives 1,000 requests/day and Codex CLI is free and open source. Claude Code requires a Pro subscription.
  • Orchestration workflows can be saved as reusable Claude Code custom commands for repeatable tasks like competitor analysis or technical audits.

If you have used Claude Code for any length of time, you have probably noticed it can run Bash commands in the background while continuing to work on other things. That is not just a convenience feature. It is the foundation for turning Claude Code into a full orchestrator, a central supervisor that delegates specialist tasks to other AI CLI tools running in parallel, collects the results, validates them, and synthesises everything into a finished deliverable.

This guide walks you through the practical workflow of orchestrating Gemini CLI, Claude CLI, and OpenAI's Codex CLI from within Claude Code. Not because it is a neat party trick, but because it collapses hours of manual research and validation into minutes of parallel AI execution.

Why Orchestrate Instead of Using One Tool?

A single AI tool is good. Three specialist tools running simultaneously changes everything. Here is the practical difference:

Single Agent vs Orchestrated Workflow
Approach | Research Time | Validation | Quality
Manual research | 3-4 hours | Manual cross-checking | Depends on fatigue
Single AI agent | 30-45 minutes | None (trust the output) | Risk of hallucination
Orchestrated agents | 10-15 minutes | Automated cross-validation | Multiple perspectives

The speed gain alone is worth it. But the real advantage is quality: when Gemini researches a topic and Claude CLI independently validates the findings, you catch hallucinated statistics and broken source URLs before they reach your final output. One agent checking another agent's work is far more reliable than trusting a single source of AI-generated information. If you are deciding which AI tools to include in your toolkit, our sister site's Claude vs ChatGPT comparison covers the coding and creative writing trade-offs between the two leading models.

Total time: About 1 hour (first time), 15 minutes once familiar
Estimated cost: Free for Gemini CLI and Codex CLI; Claude Code requires a Claude Pro subscription

Tools you'll need

  • Claude Code: The orchestrator (desktop app or CLI)
  • Gemini CLI: Web research agent with built-in Google search
  • Claude CLI: File analysis and validation (bundled with Claude Code)
  • Codex CLI: Code analysis and generation (free, open source)
  • Terminal / PowerShell: For running commands

What you'll need

  • Anthropic credentials for Claude Code (an API key or a Claude Pro subscription)
  • Google API key for Gemini (free tier: 60 requests/minute)
  • Internet connection for API access

Setup

Before you can orchestrate anything, you need the three CLI tools installed and verified. This takes about ten minutes.

Step 1: Install Your AI CLI Toolkit

Three terminal windows showing Claude Code, Gemini CLI, and Codex CLI installed and responding to test prompts
Each AI CLI tool installed and responding to a basic test prompt, confirming connectivity.

Install each tool and verify it works with a simple test prompt. Claude CLI comes bundled with Claude Code, so you only need to install three things separately.

Claude Code: Download from code.claude.com or install via npm:

npm install -g @anthropic-ai/claude-code

Gemini CLI: Install via npm (requires Node.js 18+):

npm install -g @google/gemini-cli
# Then verify:
gemini -p "What is 2+2?"

Codex CLI: Install from the OpenAI Codex GitHub repository. To enable web search (useful for research tasks), add this to your ~/.codex/config.toml:

[features]
web_search_request = true

Or use the --search flag when running commands. Full feature reference: developers.openai.com/codex/cli/features.

Tip: Run a test prompt on each tool before proceeding. gemini -p "Hello" and codex exec "Hello" should both return responses within seconds. If either fails, check your API key configuration.
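Before running the test prompts, it can help to confirm that each binary is actually on your PATH. The following is a minimal POSIX shell sketch. The function name check_tools is hypothetical; the binary names claude, gemini, and codex are the ones used in the commands throughout this guide.

```shell
# check_tools NAME...: report which of the named commands are on PATH.
# Returns 0 if every command was found, 1 if any is missing.
# (check_tools is a hypothetical helper name, not part of any CLI.)
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Example usage (binary names as used elsewhere in this guide):
#   check_tools claude gemini codex
```

A "found" line only shows that the command is installed. It does not prove the API key works, so the test prompts in the tip above are still worth running.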

Step 2: Understand What Each Agent Does Best

Diagram showing four AI agents with their specialist capabilities: Claude Code for orchestration, Gemini for research, Claude CLI for validation, Codex for code
Each agent has distinct strengths that make it the best choice for specific task types.

Effective orchestration means giving each agent the tasks it handles best. Here is how the capabilities break down:

AI CLI Agent Capabilities Comparison
Agent | Best For | Non-Interactive Syntax | Key Advantage
Claude Code | Orchestration, file editing, complex reasoning | Interactive (the orchestrator) | Background tasks, multi-step workflows
Gemini CLI | Web research, real-time data, large context | gemini -p "prompt" | Built-in Google search, 1M token context
Claude CLI | Local file analysis, validation, structured output | claude -p "prompt" | Local file access, WebFetch for URL checking
Codex CLI | Code generation, review, refactoring | codex exec "prompt" | Sandboxed code execution, web search with --search

"I think AI agent workflows will drive massive AI progress this year, perhaps even more than the next generation of foundation models."

- Andrew Ng, Founder, DeepLearning.AI, The Batch, March 2024

Ng's insight is exactly what makes orchestration valuable in practice. Rather than waiting for one AI model to become perfect at everything, you get better results right now by connecting specialist agents that each do one thing well. Gemini is excellent at web research because it has native Google search. Claude CLI is excellent at reading your local project files and validating data. Codex excels at code-level analysis. Claude Code ties them together. Anthropic's CEO predicted at Davos that AI would handle most software engineering within months, and multi-agent orchestration of this kind is what makes that prediction plausible. Each model upgrade, like Claude Opus 4.6's improved reasoning and writing, also makes the agents built on that model more capable.

Design Your Workflow

Before launching any agents, plan which tasks can run in parallel and which depend on earlier results. This is where most of the productivity gain comes from.

Step 3: Map Your Tasks to the Right Agent

Flowchart showing different task types being routed to their optimal AI agent based on requirements
Route each task type to the agent best equipped to handle it.

Take your overall project and break it into discrete tasks. For each task, ask: does this need web access, local file access, code analysis, or coordination?

Here is a practical example. Suppose you are researching a topic for a blog post. Your task breakdown might look like this:

  • Web research on the topic → Gemini CLI (has Google search)
  • Find expert quotes with source URLs → Gemini CLI
  • Check for duplicate content on your site → Claude CLI (reads local files)
  • Find internal link opportunities → Claude CLI (reads your articles database)
  • Validate quotes and URLs from research → Claude CLI (WebFetch to verify)
  • Analyse competitor code/structure → Codex CLI
  • Write and coordinate the final output → Claude Code (the orchestrator)

The first four tasks can all run simultaneously, and the Codex analysis can join them because it does not depend on the research. The validation task depends on research completing first. The writing depends on everything else. This dependency mapping tells you what to launch in parallel and what to hold back.
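The dependency mapping can be sketched as phases in plain shell. This is an illustration under assumptions, not a prescribed implementation: every function below is a hypothetical placeholder that only prints a message, and you would swap each body for the real gemini, claude, or codex command.

```shell
# Placeholder tasks: each stands in for one delegated agent call.
# All names here are hypothetical; replace the echo lines with real commands.
web_research()      { echo "research done"; }
expert_quotes()     { echo "quotes done"; }
duplicate_check()   { echo "duplicate check done"; }
internal_links()    { echo "link check done"; }
validate_research() { echo "validation done"; }

run_plan() {
  # Phase 1: tasks with no upstream dependency run side by side.
  # (The Codex analysis has no upstream dependency either, so it
  # could be added to this phase.)
  web_research &
  expert_quotes &
  duplicate_check &
  internal_links &
  wait   # block until every phase 1 task has finished

  # Phase 2: validation needs the research output, so it runs afterwards.
  validate_research

  # Phase 3: the orchestrator writes the deliverable once everything is in.
  echo "ready for synthesis"
}
```

The order of the phase 1 messages can vary from run to run, but the validation line always appears after them because of the wait.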

Step 4: Write Effective Delegation Prompts

Code editor showing a well-structured delegation prompt with annotations highlighting key sections
A delegation prompt must be entirely self-contained. The receiving agent has no access to your conversation history.

This is the most important step, and the one most people get wrong. Each delegated agent runs in complete isolation. It has no access to your Claude Code conversation, no knowledge of your project, and no memory of previous tasks. Every prompt must be entirely self-contained. If you want to automate context gathering for these prompts, Dynamic Context Injection can inject live project data into your skill files before Claude even sees them.

A Gemini research prompt:

gemini -p "Research the current state of AI CLI tool adoption
in 2026. Find 5-10 recent statistics with source URLs.
Include data about developer productivity gains from
multi-agent workflows. Format each statistic as:
'[Stat]' - [Source Name] ([Source URL])"

A Claude CLI validation prompt:

claude -p "Read the file at
'website/includes/data/articles.php'. Check if any existing
article covers AI orchestration or multi-agent workflows.
Return: DUPLICATE FOUND or NO DUPLICATE with details." --allowedTools "Read,WebFetch"

A Codex CLI code analysis prompt:

codex exec "Review the JavaScript files in src/components/
for accessibility issues. List each issue with file path,
line number, and suggested fix."

"The Unix command line turns out to be the perfect environment to play around with this new cutting edge technology, because the Unix philosophy has always been about tools that output things that get piped into other tools as input."

- Simon Willison, Creator of Datasette, LLMs on the Command Line, 2023

Willison nails it. The command line was built for exactly this kind of tool composition. Each AI CLI tool takes a prompt as input and returns structured output, the same pipe-and-filter pattern that has powered Unix for fifty years. Claude Code just adds an intelligent coordination layer on top.

Tip: Always specify the output format you want. If you need structured data, say so explicitly: "Return results as a numbered list" or "Format as JSON." This makes it much easier for Claude Code to parse and combine results from multiple agents.
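One way to keep delegation prompts self-contained and consistently formatted is to build them from a small template. The sketch below assumes a hypothetical helper called build_research_prompt; its wording mirrors the Gemini research prompt shown earlier in this step and is only an example.

```shell
# build_research_prompt TOPIC: print a self-contained research prompt
# that states the task and the required output format.
# (Hypothetical helper; adjust the wording to your own workflow.)
build_research_prompt() {
  topic="$1"
  printf '%s\n' \
    "Research the following topic: $topic." \
    "Find 5-10 recent statistics with source URLs." \
    "Format each statistic as:" \
    "'[Stat]' - [Source Name] ([Source URL])" \
    "Return results as a numbered list."
}

# Example usage:
#   gemini -p "$(build_research_prompt 'AI CLI tool adoption in 2026')"
```

Because the output format lives in one place, every agent that receives a prompt from this template returns results in the same shape, which makes them easier to combine.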

Run the Orchestration

With your tasks mapped and prompts written, it is time to launch everything in parallel.

Step 5: Launch Parallel Background Tasks

Terminal showing multiple AI agent tasks launched simultaneously with background process indicators
Three research tasks running in parallel (Gemini researching, Claude CLI checking files, Codex analysing code) while Claude Code continues working.

In Claude Code, you launch background tasks using the Bash tool with run_in_background: true. This tells Claude Code to start the command, give you a task ID, and continue working on other things while it runs.

Here is what that looks like in practice. From within a Claude Code session, you would launch all your research tasks simultaneously:

# Task 1: Gemini does web research (background)
gemini -p "Research [your topic]..." &

# Task 2: Claude CLI checks local files (background)
claude -p "Check for..." --allowedTools "Read,Grep,Glob" &

# Task 3: Codex analyses code (background)
codex exec "Review the code in..." &

Claude Code tracks each background task with a unique ID. While all three agents work simultaneously, Claude Code can continue with its own tasks: reading your project documentation, planning the content structure, or preparing templates. When a background task completes, the output is available for collection.

This is where the speed advantage becomes concrete. Three research tasks that take 15 minutes each would need 45 minutes run one after another; in parallel they complete in about 15 minutes, the duration of the slowest task. The more independent tasks you can parallelise, the greater the time saving.
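Inside Claude Code, background task tracking is handled for you. Outside it, the same fan-out and collect pattern could look roughly like the following POSIX shell sketch. The helper names launch and wait_all, the OUT_DIR variable, and the ./agent-output directory are assumptions made for illustration and are not part of any of the CLIs.

```shell
OUT_DIR="${OUT_DIR:-./agent-output}"   # hypothetical output directory
PIDS=""

# launch NAME COMMAND...: run COMMAND in the background, save its stdout
# and stderr to $OUT_DIR/NAME.txt, and remember its process ID.
launch() {
  name="$1"; shift
  mkdir -p "$OUT_DIR"
  "$@" >"$OUT_DIR/$name.txt" 2>&1 &
  PIDS="$PIDS $!"
}

# wait_all: block until every launched task has finished.
# Returns 1 if any task exited with a non-zero status.
wait_all() {
  failed=0
  for pid in $PIDS; do
    wait "$pid" || failed=1
  done
  PIDS=""
  return "$failed"
}

# Example usage with the commands from this guide:
#   launch research gemini -p "Research [your topic]..."
#   launch duplicates claude -p "Check for..." --allowedTools "Read,Grep,Glob"
#   launch review codex exec "Review the code in..."
#   wait_all || echo "at least one agent failed; check $OUT_DIR"
```

Writing each agent's output to its own file keeps the results separate, so the validation step can read them one at a time.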

Step 6: Collect and Cross-Validate Results

Split screen showing research output on one side and validation results with pass and fail indicators on the other
Cross-validation catches errors: one agent checks another's output for accuracy before the data reaches your final deliverable.

When background tasks complete, collect the outputs and run a validation pass. This is the step that separates orchestration from just "running three things at once."

The validation pattern works like this: take the research Gemini produced (statistics, quotes, URLs) and delegate a checking task to Claude CLI. Claude CLI uses WebFetch to visit the source URLs and verify the quotes actually exist on those pages. It flags anything that fails verification.

claude -p "Verify these claims:
1. URL: [url] - Does it exist? Does it contain: '[quote]'?
2. Statistic: [stat] - Source: [source_url]
Return PASS or FAIL for each with explanation." --allowedTools "WebFetch"

In my experience building these workflows, about 15-20% of AI-generated research contains inaccuracies: wrong URLs, slightly misquoted sources, or outdated statistics. The cross-validation step catches these before they become embarrassing errors in your published content.
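If the validation output is saved to a file, a short script can tally the verdicts so failures are hard to miss. This sketch assumes the agent followed the prompt and put one PASS or FAIL verdict on each line; real output may need looser matching. The function name summarise_validation is hypothetical.

```shell
# summarise_validation FILE: count PASS and FAIL verdicts in a saved
# validation report. Prints a one-line summary and returns non-zero
# if any claim failed. Assumes one verdict per line.
summarise_validation() {
  file="$1"
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  pass=$(grep -c -w 'PASS' "$file" || true)
  fail=$(grep -c -w 'FAIL' "$file" || true)
  echo "passed: $pass, failed: $fail"
  [ "$fail" -eq 0 ]
}

# Example usage (validation.txt is a hypothetical file name):
#   summarise_validation validation.txt || echo "review the failed claims"
```

The non-zero return code lets a larger script stop before synthesis whenever a claim has failed verification.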

Tip: Build a fallback sequence. If an agent fails: retry once → try a different agent with the same prompt → flag for manual review. Never let a single failure block your entire workflow.
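The fallback sequence in the tip can be sketched as a small shell function. This is one possible shape, assuming each agent call fits in a single command string; the name run_with_fallback is hypothetical.

```shell
# run_with_fallback PRIMARY FALLBACK: each argument is a complete command
# line held in one string. Tries PRIMARY, retries it once, then tries
# FALLBACK. If all three attempts fail, flags the task for manual review
# on stderr and returns 1.
run_with_fallback() {
  primary="$1"
  fallback="$2"
  sh -c "$primary" && return 0
  echo "primary failed, retrying once" >&2
  sh -c "$primary" && return 0
  echo "retry failed, trying fallback agent" >&2
  sh -c "$fallback" && return 0
  echo "MANUAL REVIEW NEEDED: $primary" >&2
  return 1
}

# Example usage (prompts shortened):
#   run_with_fallback 'gemini -p "Research..."' 'claude -p "Research..."'
```

A failed attempt may still print partial output before it exits, so in a real workflow you may prefer to write each attempt to its own file and keep only the one that succeeded.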

Deliver Results

With validated research in hand, Claude Code brings everything together.

Step 7: Synthesise Findings into Your Deliverable

Diagram showing multiple research streams from different AI agents flowing into Claude Code and producing a unified final document
Claude Code combines validated research from all agents into a coherent final output with proper source attribution.

This is Claude Code's home territory. It takes the validated outputs from Gemini, Claude CLI, and Codex, and weaves them into a coherent deliverable, whether that is a blog post, a report, a competitive analysis, or technical documentation.

The key is that Claude Code has all the context: your project files, the dynamic context from your CLAUDE.md, the research outputs, and the validation results. It knows which statistics passed verification and which were flagged. It knows which internal links are relevant because Claude CLI already checked your content database.

For content creation specifically, the orchestration pattern means your articles come with built-in fact-checking. Every quote has a verified source URL. Every statistic has been cross-referenced. That level of accuracy is what separates professional content from AI-generated filler, and it is what Google's E-E-A-T guidelines reward.

Step 8: Build Reusable Workflows

File structure showing a Claude Code custom command directory with reusable orchestration prompt templates
Save your orchestration pattern as a custom command so it runs on any topic with a single invocation.

Once you have a working orchestration pattern, save it as a Claude Code custom command. Create a Markdown file in your project's .claude/commands/ folder (or ~/.claude/commands/ for global access) that describes the workflow with parameterised prompts.

Your command file defines the research agents to launch, the validation steps to run, and the output structure to produce. When you invoke it with a new topic, the entire orchestration runs automatically (research, validation, and synthesis), producing a fact-checked deliverable in minutes instead of hours.

The same pattern transfers to other workflows. Claude Code's rapid growth has been driven partly by non-technical users discovering these automation patterns. Competitor analysis, technical audits, market research reports, SEO content briefs: any task that combines research, validation, and synthesis benefits from multi-agent orchestration.

Tip: Start with one workflow that you run frequently. Perfect it, then expand. Trying to orchestrate everything at once leads to overly complex commands that are harder to debug than the manual process they replaced.

Frequently Asked Questions

Do I need coding experience to orchestrate AI CLI agents?

You need basic comfort with a terminal: running commands, navigating folders, and reading output. You do not need to be a developer. Claude Code handles the complex orchestration logic. If you can copy and paste commands, you can orchestrate AI agents.

How much does it cost to run Claude Code with Gemini and Codex?

Two of the three tools are free. Gemini CLI gives you 60 requests per minute and 1,000 per day at no cost, and Codex CLI is free and open source. Claude Code requires a Claude Pro subscription (around £18/month). For most orchestration workflows, you will stay within the free limits of the other two tools.

What happens if one AI agent fails during orchestration?

The other agents continue running independently. When Claude Code detects a failure, it can retry the failed task, delegate to a fallback agent, or flag it for manual review. The key is building retry logic into your workflow so a single failure does not block everything.

Can I run more than three AI agents in parallel?

Yes. Claude Code can launch as many background tasks as your system supports. In practice, though, three or four agents is a sensible ceiling for effective coordination: more agents means more outputs to validate and more potential for conflicts.

Which AI CLI tool is best for web research?

Gemini CLI is the strongest choice for web research because it has built-in Google search via the google_web_search tool. It returns processed summaries with citations rather than raw search results. Claude CLI can also do web searches via WebSearch and WebFetch tools, making it a good fallback.

How do I verify AI research outputs are accurate?

Use one agent to verify another. After Gemini returns research, delegate a validation task to Claude CLI that checks source URLs, verifies quotes exist at cited pages, and confirms statistics match their claimed sources. This cross-validation catches hallucinated data before it reaches your final output.

Can Claude Code orchestrate agents on Windows?

Yes. Claude Code, Gemini CLI, and Codex CLI all run on Windows, macOS, and Linux. On Windows, use PowerShell or Windows Terminal. The background task functionality works the same across all platforms.

What tasks should I not delegate to AI agents?

Do not delegate tasks requiring access to sensitive credentials, destructive operations like database modifications, or final editorial decisions. AI agents are excellent researchers and validators, but a human should always review the synthesised output before publication or deployment.
