Parker Rex, July 6, 2025

I Spent $10K+ to Test Every AI Coding Agent (Augment, Claude, Cursor, Devin, OpenAI)

Spent $10K+ testing AI coding assistants (Augment, Claude, Cursor, Devin, OpenAI) across 6 core use cases—honest, practical insights.

Show Notes

I spent over $10K testing a wide mix of AI coding agents to see what actually moves the needle across real product workflows. Here are the patterns, the tools, and the practical takeaways you can apply today.

Approach and context

  • Two camps exist: the skeptic who’s burned through tools, and the builder who uses AI daily and wants to squeeze more value. This video targets both by laying out concrete patterns that scale.
  • Six common use cases when shipping product with AI agents: bug fixes, stale branches/PR merges, planning/PRDs, code execution, design (0→1), and design fixes.
  • Two modes of AI coding: active (at the desk, watching 1–3 threads in real time) and passive (remote/background agents with heavy up-front planning).

The six AI coding use cases

  • Bug fixes
    • Start with good observability (logs, traces) before asking AI to fix anything.
    • For small bugs, use Cursor Bugbot (and o3) to surface and fix them quickly.
    • For bigger fixes, bring in Augment with a focused prompt and logs to guide the change.
    • Example workflow: open a PR and have Cursor/o3 propose a patch; use Augment for deeper analysis if needed.
    • Quick note: keep prompts scoped and avoid over-optimizing the prompt itself; context is king.
    • Actionable prompts tip:
      • Use a prompt like: “Reflect on the five to seven probable sources of the problem. Distill to the one or two most likely, then add logs to validate assumptions before implementing the fix.”
      • Adapt it by tagging the specific logs and files that are relevant to the bug.
  • Merging PRs / handling merge conflicts
    • Augment is the go-to “merge master” for large, codebase-heavy merges; it reliably resolves conflicts and speeds up the process.
    • If bigger or more delicate changes are involved, pair Augment with a local agent (like Cursor/o3) for final polish.
    • Observation: you can wake up to several merge-related PR changes; a solid merge assistant saves hours.
  • Planning and PRDs (product requirements)
    • Two robust planning flows:
      • Claude Code issues prompt (for small features): type a quick feature/bug note, let it scan the repo, align with templates, and generate a concrete issue.
      • Cursor planning (RST-based, phase-driven): break work into phases, write a plan file that includes visuals, icons, and a clear phase breakdown. This is especially valuable for bigger efforts.
    • Core principle: start with the end result (the why) and work backward to the plan (the how).
    • Takeaway: build a feature tree and capture the inputs/outputs for each phase before you touch code.
  • Code execution and automation
    • Active vs passive execution: run plans actively when you can supervise; use passive remote agents for long-running or background tasks.
    • Common stack approach: Opus for execution diffs; Augment for heavy lifting; Devin for remote environments that spin up clean VMs; Cursor for diffs and real-time feedback.
    • If you need a quick, repeatable loop, use a stable baseline (e.g., Opus) and only branch out when you’re testing something new.
  • Design (UI/UX from zero to one)
    • Start from existing patterns you like; use design references and remix them (the video uses v0 as a basis to prototype quickly).
    • Iterate by pulling in inspiration from brands/products you admire; plan the layout with screenshots and concrete UI elements (icons, spacing, typography).
    • The design plan should clearly separate form and function: what the UI should look like and how the data flows through it.
  • Design fixes and iteration
    • If the UI looks off (too much white space, odd outlines), capture references, and build a new plan with visuals before coding.
    • Use examples from other sites to anchor changes (frequency, layout patterns, component behavior).
    • The key: plan inputs (screenshots, icons, CSS globals) to drive the actual UI changes, not just “make it prettier.”
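
The bug-fix flow above (logs first, then a scoped prompt) can be sketched as a small helper that assembles the prompt from captured log lines. This is a minimal illustration, not the workflow's actual tooling: `BugReport`, `buildDebugPrompt`, and the 20-line default are hypothetical names and values; only the instruction text is taken from the notes.

```typescript
// Hypothetical helper (not from the video): assembles a scoped debugging
// prompt from a bug summary, tagged files, and recent log lines.

interface BugReport {
  summary: string;    // one-line description of the symptom
  logLines: string[]; // raw log or trace lines captured around the failure
  files: string[];    // paths the agent should focus on
}

// Instruction text quoted from the show notes.
const REFLECT_INSTRUCTION =
  "Reflect on the five to seven probable sources of the problem. " +
  "Distill to the one or two most likely, then add logs to validate " +
  "assumptions before implementing the fix.";

function buildDebugPrompt(report: BugReport, maxLogLines: number = 20): string {
  // Keep only the most recent lines so the context stays tight.
  // The default of 20 is an arbitrary assumption.
  const recentLogs =
    maxLogLines > 0 ? report.logLines.slice(-maxLogLines) : [];

  const sections = [
    "Bug: " + report.summary,
    "Relevant files:\n" + report.files.map((f) => "- " + f).join("\n"),
    "Logs (last " + recentLogs.length + " lines):\n" + recentLogs.join("\n"),
    REFLECT_INSTRUCTION,
  ];
  return sections.join("\n\n");
}

// Example with made-up data.
const examplePrompt = buildDebugPrompt({
  summary: "Checkout returns 500 after applying a coupon",
  logLines: [
    "coupon lookup ok",
    "cart total recalculated",
    "TypeError: cannot read properties of undefined",
  ],
  files: ["src/checkout/applyCoupon.ts"],
});
```

Trimming to the tail of the log is one way to keep the prompt scoped; which lines matter will depend on the bug.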

Active AI coding vs. Passive background agents

  • Active AI coding
    • You’re at the keyboard, watching multiple threads, performing real-time code review.
    • Best practice: require tight context, frequent checks, and fast feedback loops.
  • Passive background agents
    • Heavily plan up front, rely on remote agents to execute, and come back with results later.
    • Best practice: invest more in planning artifacts (plans, PRDs, issue templates) so the agent can work with minimal supervision.
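
One way to make the active/passive split concrete is a small decision helper. The rule below is an assumption distilled from these notes (supervised, short tasks run actively; long-running or unsupervised tasks run passively and need planning artifacts first); `TaskProfile` and `chooseExecutionMode` are hypothetical names.

```typescript
// Hypothetical sketch: pick an execution mode and check whether the task
// is ready to hand off. The decision rule is an assumption, not a quote.

type ExecutionMode = "active" | "passive";

interface TaskProfile {
  canSupervise: boolean;   // someone is at the keyboard to review in real time
  longRunning: boolean;    // expected to run in the background for a while
  planArtifacts: string[]; // plans, PRDs, issue templates prepared up front
}

interface ModeDecision {
  mode: ExecutionMode;
  ready: boolean;
  reason: string;
}

function chooseExecutionMode(task: TaskProfile): ModeDecision {
  if (task.canSupervise && !task.longRunning) {
    return {
      mode: "active",
      ready: true,
      reason: "short task with a reviewer present",
    };
  }
  // Passive work depends on up-front planning, so require at least one artifact.
  const hasPlan = task.planArtifacts.length > 0;
  return {
    mode: "passive",
    ready: hasPlan,
    reason: hasPlan
      ? "background task with planning artifacts in place"
      : "background task is missing planning artifacts",
  };
}
```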

Tooling and workflow highlights

  • Devin: remote environment setup that spins up VMs from a snapshot; great for waking up to work that’s already prepped.
  • Augment: reliable for larger-scale code changes and merges; acts as a merge master for complex PRs.
  • Cursor (and Cursor Pro/Bugbot): fast, in-repo assistance for fixes, diffs, and smaller tasks; strong for iterative improvements.
  • Claude prompts in Augment: used for deeper reasoning in debugging and planning, with prompts that guide reasoning before implementing changes.
  • OpenAI prompts: paired with planning and design prompts for iterative exploration.
  • Convex: used to simplify the tech stack and reduce boilerplate; helps keep the architecture approachable.
  • v0: design tool to explore UI concepts quickly; used to prototype design directions before coding.
  • Every Claude Code (and related planning prompts): inspiration and proven prompts for planning flows.
  • Observability basics: Pino/logging and tracing to give AI a solid signal when diagnosing issues.
  • Community angle: VI (Vibe with AI) community with engineers from Microsoft, Google, and startups exchanging patterns and learnings.
  • Ship or Skip: Parker’s new show for testing what to adopt and what to drop.
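
To illustrate the observability point, here is a dependency-free sketch of structured, trace-tagged log lines. The notes name Pino; this is not Pino's API, just a plain formatter showing the kind of signal that helps an agent diagnose an issue. `formatLogLine`, `LogContext`, and the field names are hypothetical.

```typescript
// Hypothetical sketch: emit one JSON object per log line, tagged with a
// trace ID so related events can be grouped when handed to an agent.

type LogLevel = "debug" | "info" | "warn" | "error";

interface LogContext {
  traceId: string;   // correlates all lines from one request or job
  component: string; // where in the codebase the line was emitted
}

function formatLogLine(
  level: LogLevel,
  ctx: LogContext,
  msg: string,
  fields: Record<string, unknown> = {},
  now: Date = new Date()
): string {
  // Extra fields are spread first so they cannot overwrite the core keys.
  const entry = {
    ...fields,
    time: now.toISOString(),
    level: level,
    traceId: ctx.traceId,
    component: ctx.component,
    msg: msg,
  };
  return JSON.stringify(entry);
}

// Example with made-up data; pass the result to whatever sink you use.
const exampleLine = formatLogLine(
  "error",
  { traceId: "req-123", component: "checkout" },
  "coupon lookup failed",
  { couponCode: "SPRING" }
);
```

A real setup would more likely configure an existing logger than hand-roll one; the point is that each line carries enough context to be read on its own.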

Concrete takeaways you can apply now

  • Start with the end in mind: define the desired result for each feature before touching code.
  • Build a feature tree for any major initiative; don’t skip the planning phase.
  • Use simple tech stacks where possible; lean on tools you already know well (npm, ESLint) to speed up adoption.
  • Separate planning from execution: use a small, fast flow for small features (Claude Code issues prompt) and a more thorough, phase-based plan for larger efforts (Cursor planning with RST).
  • Use two execution modes:
    • Active: stay engaged, review results in real time, maintain strict context.
    • Passive: plan thoroughly up front, let remote agents work through planned phases.
  • For merges and large codebase changes, rely on Augment as your merge master; reserve Cursor/o3 for smaller verifications.
  • Prototyping UI? Start with design inspiration (like v0), capture visuals, then translate into code with a clear plan for UI components and CSS/global tokens.
  • Don’t over-hype tools; focus on enduring skills: planning, systems design, and understanding core technologies—these won’t change as quickly as models do.
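
The feature tree with per-phase inputs and outputs can be modeled as a small data structure. The sketch below is one possible shape, not the format used in the video: `FeatureNode`, `Phase`, and `findUnplannedPhases` are hypothetical. It flags phases that still lack inputs or outputs, meaning planning is not finished.

```typescript
// Hypothetical sketch of a feature tree: each node has a goal (the "why")
// and phases that list their inputs and outputs (the "how").

interface Phase {
  name: string;
  inputs: string[];  // e.g. screenshots, PRD sections, existing files
  outputs: string[]; // e.g. new components, migrations, tests
}

interface FeatureNode {
  title: string;
  goal: string;
  phases: Phase[];
  children: FeatureNode[];
}

// Returns a readable path for every phase missing inputs or outputs.
function findUnplannedPhases(node: FeatureNode, path: string[] = []): string[] {
  const here = path.concat(node.title);
  let gaps: string[] = [];
  for (let i = 0; i < node.phases.length; i++) {
    const phase = node.phases[i];
    if (phase.inputs.length === 0 || phase.outputs.length === 0) {
      gaps.push(here.join(" > ") + " :: " + phase.name);
    }
  }
  for (let i = 0; i < node.children.length; i++) {
    gaps = gaps.concat(findUnplannedPhases(node.children[i], here));
  }
  return gaps;
}
```

Running a check like this before handing work to an agent is one way to enforce "don’t skip the planning phase".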

Actionable next steps

  • Pick two use cases you work with most (e.g., bug fixes and planning) and implement a two-flow setup: Claude Code issues prompt for small tasks; Cursor planning (RST) for larger features.
  • Create a simple feature tree for your next project and a planning document with phase-based milestones and visuals.
  • Start with a minimal tech stack you’re comfortable with (e.g., npm + ESLint) to reduce friction when introducing AI agents.
  • Try a remote agent workflow with Devin for a week to see how much time you save on environment setup and PR churn.
  • If you’re prototyping UI, use a design tool like v0 to sketch and gather screenshots before coding.
