YouTube Deep Summary



Claude Code + GitHub WORKFLOW for Complex Apps

Greg Baugues • 2025-06-26 • 18:40 • YouTube


🤖 AI-Generated Summary:

Unlocking AI Superpowers: A Workflow for AI-Assisted Coding with Claude Code and GitHub

Over the past few weeks, I have been developing a new web app using an AI-assisted coding workflow that leverages Claude Code and GitHub. This approach has unlocked new superpowers in my development process. In this post, I'll share the workflow I use, why it works, and dive into the details of each phase: planning, creating, testing, and deploying code with AI assistance.


The High-Level AI Coding Workflow

At its core, my workflow follows a simple cycle:

  1. Plan: I create GitHub issues that define the work to be done. Claude Code processes these issues via a detailed custom slash command, breaking large tasks down into small, atomic steps using scratchpads.
  2. Create: Claude Code (the AI assistant) writes the code based on the plan, commits the changes, and opens a pull request (PR).
  3. Test: Testing happens in two ways — running automated test suites and using Puppeteer to simulate browser interactions for UI changes.
  4. Deploy: After tests pass, the PR is reviewed and merged, triggering continuous integration (CI) workflows and deployment.

Once an issue is completed and merged, I clear the AI’s context window to start fresh on the next issue, keeping the process clean and efficient.
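
To make the cycle concrete, here is roughly what one iteration looks like at the terminal. The issue number and the slash-command name are hypothetical placeholders; the command itself is covered in Phase 3 below.

```bash
# Start a Claude Code session in the project repo
claude

# Inside the session, kick off the custom command on an issue
# (hypothetical command name; see Phase 3 for what it does):
#   > /process-issue 42

# Claude plans, codes, tests, commits, and opens a PR.
# After the PR is reviewed and merged on GitHub:
#   > /clear        # wipe the context window before the next issue
```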


Why Use a Workflow Like This?

Writing code is just one part of the software development life cycle (SDLC), which traditionally follows the phases of plan, create, test, and deploy. AI coding assistants like Claude Code excel when integrated into a well-defined workflow that mirrors these phases.

My workflow is heavily inspired by GitHub Flow, a lightweight branching model designed for small teams—which, in this case, is one human developer plus one AI assistant. This approach ensures that AI-generated code is managed, reviewed, and deployed systematically, reducing errors and maintaining code quality.


Phase 1: Creating and Refining GitHub Issues

The process begins with well-crafted GitHub issues outlining precise, granular tasks. Initially, I used dictation tools and Claude to convert high-level ideas into a requirements document, then generated GitHub issues from there.

Key insight: The quality of issues directly impacts AI performance. Early on, I learned that vague or large issues led to suboptimal results. Breaking down work into very specific, atomic issues made it easier for Claude Code to produce accurate and reliable code. This phase requires human attention, acting much like an engineering manager—refining specs, prioritizing tasks, and clarifying requirements.
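
For a sense of the granularity involved, an atomic issue might look something like this hypothetical example, created with the GitHub CLI:

```bash
# A deliberately small, self-contained issue (hypothetical example)
gh issue create \
  --title "Show empty-state message on the projects index page" \
  --body "When a user has zero projects, the index page renders a blank
table. Replace it with an empty-state message and a 'New project' button.
Acceptance criteria: a system test covering the zero-projects case passes."
```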


Phase 2: Setting Up Your Foundation

Before rapid development can begin, you need a solid foundation:

  • Test Suite & Continuous Integration: I set up automated tests and CI using GitHub Actions, ensuring that tests run with every commit Claude makes.
  • Puppeteer for UI Testing: Puppeteer allows Claude to interact with the app’s UI in a browser, simulating clicks and form submissions to verify front-end changes.
  • GitHub CLI: Installing the GitHub CLI enables Cloud Code to interact programmatically with GitHub, handling issues, commits, and PRs.

This setup ensures that AI-generated code is continuously validated, reducing bugs and maintaining stability.
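
Here is the minimal CI sketch referenced above. It assumes a Rails app (as in the video) with a RuboCop linter step; adapt the steps to your own stack.

```yaml
# .github/workflows/ci.yml — a minimal sketch, assuming Ruby/Rails
name: CI
on:
  push:
    branches: [main]
  pull_request:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: ruby/setup-ruby@v1
        with:
          bundler-cache: true   # runs bundle install and caches gems
      - name: Run linter
        run: bundle exec rubocop
      - name: Run test suite
        run: bundle exec rails test
```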


Phase 3: Planning with Custom Slash Commands

The heart of the workflow is a custom slash command I created for Claude Code to process issues. This command breaks down into four parts: plan, create code, test, and deploy (a condensed sketch of such a command follows the list below).

  • The planning stage is the most critical and involved. Claude Code uses the GitHub CLI to fetch issue details and searches previous work and pull requests to avoid duplication.
  • I leverage “think harder” prompt techniques to encourage the AI to deeply analyze the task and break it into manageable subtasks documented in scratch pads.
  • This detailed planning guides the AI’s code generation and testing, improving accuracy and relevance.
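
Custom slash commands are Markdown files placed in the `.claude/commands/` directory; the `$ARGUMENTS` placeholder is replaced with whatever you type after the command. The file below is a condensed, hypothetical sketch of such an issue-processing command, not the exact one from the video:

```markdown
<!-- .claude/commands/process-issue.md (hypothetical sketch) -->
Please work on GitHub issue #$ARGUMENTS.

## Plan
- Run `gh issue view $ARGUMENTS` to read the issue.
- Search the scratchpads directory and previous PRs for related prior work.
- Think harder about how to break the issue into small, manageable tasks.
- Write the plan to a new scratchpad and include a link to the issue.

## Create code
- Implement the plan on a feature branch, committing as you go.

## Test
- Run the test suite; use Puppeteer to verify any UI changes.

## Deploy
- Open a pull request with `gh pr create`, referencing the issue.
```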

Phase 4: Creating, Testing, and Deploying Code

Once the plan is ready, Claude Code writes the code and commits changes directly to a feature branch. Then:

  • The test suite runs automatically via CI.
  • Puppeteer performs UI tests if needed.
  • Claude opens a pull request for review.
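
Under the hood, this phase is mostly GitHub CLI calls, which look roughly like this (Claude typically runs the first one itself):

```bash
# Open a PR from the current feature branch, using commit
# messages to pre-fill the title and body
gh pr create --fill

# Watch the CI checks on the PR until they finish
gh pr checks --watch
```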

Human Review & AI Collaboration

While Claude can handle commits and PR creation, human oversight remains essential. I read PRs, leave comments, and sometimes use another custom slash command to have Claude review the PR in the style of coding experts like Sandy Metz, which helps identify maintainability improvements and catch subtle issues.
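
A review command can be as simple as another prompt template. This is a hypothetical sketch, not the exact command from the video; per the video, it works best run from a fresh Claude Code session so the review isn't polluted by the context of the work being reviewed:

```markdown
<!-- .claude/commands/review-pr.md (hypothetical sketch) -->
Fetch the diff for pull request #$ARGUMENTS with `gh pr diff $ARGUMENTS`.

Review it in the style of Sandy Metz: flag objects with too many
responsibilities, unclear names, missing tests, and chances to make the
code smaller and more maintainable. Post your findings as a PR comment
using `gh pr comment`.
```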

There have been moments when I trusted Claude to handle everything and merged PRs after tests passed, but I remain cautious. Good tests are my safety net, catching regressions and ensuring that new features don’t break existing functionality.


Managing Context and Workflow Efficiency

After merging a PR, I use a /clear command to wipe Claude’s context window. This “cold start” approach means each issue must be self-contained, with all necessary information included. It keeps token usage efficient and reduces the risk of confusion from lingering context.


Exploring GitHub Actions and Parallel Work with Work Trees

Anthropic recently introduced Claude integration via GitHub Actions, letting you tag Claude directly within GitHub to handle small tasks like copy edits or minor tweaks. However, for large feature development, I prefer the console version of Claude Code: it gives better results, and on a Claude Max plan it avoids the metered API billing that the GitHub Actions integration currently incurs.
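
For reference, the tag-Claude setup is a GitHub Actions workflow built on Anthropic's claude-code-action. The sketch below is approximate; check the action's README for the current trigger and input names:

```yaml
# .github/workflows/claude.yml (approximate sketch)
name: Claude
on:
  issue_comment:
    types: [created]

jobs:
  claude:
    # Only respond when someone mentions @claude in a comment
    if: contains(github.event.comment.body, '@claude')
    runs-on: ubuntu-latest
    steps:
      - uses: anthropics/claude-code-action@beta
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
```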

Work Trees for Parallel AI Agents

Work trees (Git's worktree feature) allow running multiple instances of Claude on different branches in parallel, like multi-tabling in online poker. Each instance works independently in its own directory, enabling concurrent feature development.

In practice, I found this approach a bit cumbersome due to repeated permission approvals and extra management overhead. For my current project size, a single AI instance handling one issue at a time works best.
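
For anyone who wants to try the parallel setup anyway, the Git side is just a few commands (paths and branch names are illustrative):

```bash
# Check out a second working copy of the repo on its own branch
git worktree add ../myapp-issue-7 -b issue-7

# Run an independent Claude Code session inside it
cd ../myapp-issue-7 && claude

# After the branch is merged, clean up the extra directory
git worktree remove ../myapp-issue-7
```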


Final Thoughts: Finding Your Balance Between Human and AI Roles

This workflow highlights the complementary roles of human developers and AI assistants:

  • Humans: Spend significant effort in planning, refining issues, and reviewing code to maintain quality and direction.
  • AI: Handles code creation, testing, committing, and even PR reviews, accelerating development speed.

Finding the right balance depends on your project size, complexity, and trust in AI-generated code. With a strong foundation and clear processes, you can empower AI to take on the heavy lifting while you focus on high-level decisions and quality assurance.


Want to Learn More?

If you found this workflow interesting, you might also enjoy my video on Claude Code Pro Tips, where I share deeper insights and practical advice for working effectively with AI coding assistants.


Harnessing AI coding assistants like Claude Code within a structured workflow can transform how you build software—unlocking new efficiencies and superpowers you never thought possible. Give this approach a try, and see how AI can become your ultimate coding partner.



📝 Transcript (503 entries):

## Overview of the AI Coding Workflow [00:00]

In this video, I want to talk to you about the workflow that I've been using with Claude Code and GitHub to build a new web app over the last couple weeks. And I feel like it's sort of unlocked new superpowers for me. So, first let me just go through the workflow at a high level to give you a taste in case you don't have much time. Then we're going to circle back, talk about why you might need a workflow like this, and then we'll dive into each of the steps in a little bit more detail. So, here's how it works. I create GitHub issues for all the work I want to have done on the app. In Claude Code, I have a detailed slash command with instructions on how to process issues. At a high level, I want it to first plan its work using scratchpads to break down the big issue into small atomic tasks. Then once it's planned its work, it can create the code. After it's created the code, it can then test its work. It can do this in two different ways: one, running the test suite; and second, it can use Puppeteer to click in a browser if it's made any UI changes. Then once it has tested its work, it will commit its work to GitHub and open up a pull request, which is then reviewed. Sometimes I review that PR; sometimes I have Claude Code review the PR with a different slash command that I've written. Also, I have continuous integration set up on GitHub via GitHub Actions, so that anytime a commit is made, we run the test suite and we run a linter and we check to make sure that it is safe to merge the commits into the main branch. And then in Claude Code, I use /clear to wipe away the context window. And then I have it tackle the next issue and repeat the cycle.

## Software Development Life Cycle [01:39]

Now, I don't want to pretend like I've invented the wheel here, because what I've just described to you could be summed up as a cycle of plan, create, test, deploy, which are generally considered to be the four phases of the software development life cycle. So, why do you need a cycle like this if you have such powerful software coding agents? Well, the software industry has known for a long time that writing code is just one phase of what's required to ship and maintain complex software. Turns out that some of the processes and systems that we built to manage the creation of software work really well with these AI coding assistants, and in particular Claude Code. Now, to be even more specific, the workflow I've just described is based heavily upon GitHub Flow, which is a workflow first published by Scott Chacon, who is one of the co-founders of GitHub. He published this about 13 or 14 years ago, when GitHub was just about 35 employees. So this is a workflow that's well known and works really great for small teams. Say, if your team was, I don't know, approximately the size of one human and one AI coding assistant. Let's go back through and talk about each of those four phases in a little bit more detail: plan, create, test, and deploy.

## Creating and Refining GitHub Issues [03:01]

Let's start off with creating the issues. When I very first started working on this app, I started with a dictation session via Super Whisper. And then I just worked with Claude to turn that into a requirements document. And then once I had those steps, I told Claude Code to create GitHub issues from there. Now, you also need a way for Claude Code to interact with GitHub, and Anthropic's recommended way of doing so is to install the GitHub CLI. This allows Claude Code to run gh via Bash to interact with GitHub. If for some reason you can't install that CLI, you could use the MCP server, but the CLI is the recommended way of doing so. Now, I would say the first mistake I really made here was that I had it create those issues, probably about 30 or 40 of them, and then I just had it start working on them. It was overly optimistic of me to assume that we could go straight from the GitHub issues that it created to writing software. In reality, my job perhaps got a little bit less fun, because instead of writing code now, I really needed to go and make sure that I was being very specific in those issues and really refining them. And I'd say the more granular, the more specific, the more atomic those issues got, the better results I had. And I had a couple false starts where I kind of had to throw the whole project away and really go back and spend time in GitHub and say, "Okay, what do we do first? What do we do second? And how do we break this down and keep it really tightly scoped so that we're setting ourselves up for success?" In fact, it's kind of funny. I was at Twilio for 9 years. I was a manager for a lot of that time. And I feel like I got a little burned out on being a manager, and I have really been enjoying writing code over the last couple years. And these last couple weeks, I feel like I had to put my manager hat back on. I've written very little code myself, and instead I've spent most of my time writing really detailed specs, reviewing code that was written by someone else, leaving comments and saying things like, "Hmm, this is not quite good enough. Please try again." Or, "Actually, I thought I wanted that, but now that I see it, that's not quite what I want." Or, "Throw away all your work; I don't actually want this at all." And so if you want to roleplay as an engineering manager, this process is actually a pretty good way to do that. The first couple issues that we worked on were setting up the test suite and continuous integration. Most of the work that I've done over the last 10 years has been in Python, but anytime I'm building a more complex web app and need a users table, I find myself reaching back for Rails. I also think there's something about the MVC framework, which is not unique to Rails (Django has this too, and lots of frameworks use the model-view-controller pattern). I think there's something about modularizing your codebase that makes it easier for coding agents to work with, because they can focus on code that's related to one idea as opposed to, say, a main.py or an index.js that's a thousand lines long.

## Setting Up Your Foundation [05:54]

Rails has a really nicely integrated testing framework, and it was really important to me from the beginning to get my test suite up and running so that I could set up GitHub's continuous integration, so that I could have my tests run automatically every time Claude Code was pushing commits. Now, along the same lines, I also set up the Puppeteer local MCP server, and Puppeteer allows Claude Code to use a web browser to test the local changes to your app. I've actually found this to be really useful as I've started in on redesigning the app. It's also good for testing to see if buttons work or forms work. It's actually very surprising and very satisfying to watch Claude Code click around in a browser to test the work that it's already done. So, I'd say before you can really get moving with rapid iterative feature development, you need some really well-defined issues, you need your app set up on a GitHub repository, and you need continuous integration set up with a really good test suite, and Puppeteer helps a lot as well. But once you have that foundation in place, now you're ready to go.

## Plan: Custom Slash Commands [07:10]

All right, so I have some issues here. Let me talk through what happens when I have Claude Code work on an issue. The most important thing here is you're going to create a slash command. You can do this in the .claude/commands directory. A slash command is basically a prompt template, and you can add command line arguments to that. So the argument that we're going to be passing into this one is the issue number. Now, for my slash command for processing issues, I started with the one that came from the Anthropic post on best practices for agentic coding. That was a post written by Boris, who is the original creator of Claude Code. And I started there and then I just iterated over time; I added more to it. And you can see I've broken it up into four parts: plan, create code, test, and deploy. Plan is the biggest one, and it's perhaps the most important. I'm telling Claude Code to use the GitHub CLI to view the issue. I also then ask it to go dig up some prior art on this. I do have it use what's called a scratchpad: it basically has a directory in the codebase where Claude Code can plan all of its work. And I ask it to search those scratchpads for previous work related to this issue. I ask it to look through previous PRs in GitHub to see if it can find other work that's been done on this issue, so it can figure out what's been done and why. I use the "think harder" prompt here to trigger thinking mode. Anthropic has several of these: you can do think hard and think harder, and I think you can do think hardest and ultrathink. I cannot tell you why I've settled on think harder; it seems to be working well. Maybe I need to bump this up to ultrathink in the future, I don't know. But the key here is that I want it to break the issue down into small, manageable tasks. Then I ask it to write that plan in a new scratchpad and to include a link to the issue there.

## Create, Test, Deploy [08:59]

Now, Claude Code's going to write the code, and after it's written some code, it's going to commit the code. Or is it? I think one of the biggest questions that's going to come out of this workflow is: do you have Claude Code write the commit for you, or is it your responsibility to do that? I have been convicted by Thomas Ptacek. He wrote this post a few weeks ago called "My AI Skeptic Friends Are All Nuts." It was super popular. It's probably the best piece of writing that I've read on AI-assisted coding. The link's in the description here; I encourage you to just read it. It's an amazing piece of writing. And there's a section where he's going through all of the criticisms or objections from his AI-skeptic friends about why you shouldn't use AI-assisted coding. So the objection here is, "But you have no idea what the code is." And Thomas replies, "Are you a vibe coding YouTuber? Can you not read code? If so, astute point. Otherwise, what the [ __ ] is wrong with you? You've always been responsible for what you merge to main. You were five years ago and you are tomorrow, whether or not you use an LLM. If you build something with an LLM that people will depend on, read the code. In fact, you'll probably do more than that. You'll spend 5 to 10 minutes knocking it back into your own style." And in fact, as I talk to engineer friends who are working at large companies using Claude Code there, they will actually not even let Claude do the commit, even though it's really great at writing commit messages; instead, they will open up all of its changes in an IDE such as Cursor and review them all. I've not really been doing either of those things on this project. I really did start there, and I was being very diligent, opening up all the code in Cursor. At some point, I have to admit, I started getting lazy. So maybe I've fallen back into the vibe coding YouTuber genre, I guess. But I have been letting Claude do all of the commits, and then I do try to read the PRs. Although I will say, and we'll get to this in a second, sometimes I just have Claude read the PR. But let me tell you what makes me feel a little bit better about having Claude do that, and that's tests. When I started this project, I wanted to be really sure that I had a good test suite, because I do feel like in other projects, such as the games I built for my daughters, I often run into an issue where things are working pretty well, and then Claude makes a change, sometimes a seemingly simple or benign one, and it breaks everything. I'm not looking for necessarily 100% code coverage, but I do want to have high confidence that Claude can work on one feature without breaking the stuff it's done before. All right. Finally, we have planned, we have created code, we have tested the code. Now it is time to deploy. I personally deploy to Render. I like it for a lot of the stuff I've been building lately, both the Python and Rails apps. Render will look for pushes to your main branch on GitHub and then automatically deploy your new app. So in this workflow, merging a branch into the main branch in GitHub is approximately the same as deploying to production. And the way that we set up a branch to merge it into main is by opening up a pull request. So, as the human here working with the AI, let's assume that you have had Claude make the commits, and let's assume that you have had Claude open the PR. This is the place where you really can get in and review the changes that it's made, and you can leave comments on the changes that Claude has made, and then you can go back into the console and ask Claude to view those comments and to make changes based on them. You can also set up a separate slash command to ask Claude to do a PR review for you. Now, if you do have a slash command for doing a PR review, what I would encourage you to do is to open up Claude Code in a completely new shell and run it fresh, so that it doesn't have the context pollution of the work that it's already done. I have a slash command for doing PR reviews where I ask it to review in the style of Sandy Metz. Sandy Metz is one of my heroes from the Rails world. She has some great principles for writing beautiful, maintainable code. When I have Claude review the code in the style of Sandy Metz, it reveals places where we can make things more maintainable or more readable that I would have missed, and certainly that Claude missed on its first pass. Now, I will admit there have been more than a few times over the last couple weeks when I've had Claude write the code, I've had Claude do the PR review, I've ensured that the tests pass, and I'm like, "Looks good to me," and I click the button to merge the pull request.

## Your job vs. AI's job [13:29]

So again, this video is not intended to be prescriptive about the workflow, but I think the high-level bits here make a lot of sense. And then you've got to figure out where in those individual steps of plan, create, test, and deploy you're going to get hyper-involved as the human. For me personally, I have been hyper-involved in the planning phase, and I've found it really difficult to delegate anything other than just cleaning up my ideas or my prose to Claude. I think the planning is where I've been spending a whole lot of time. And for this app, given the size of the app and the size of the codebase, I've been able to delegate a lot of the creating, testing, and deploying, and the reviewing of the code, to Claude.

## Context Management with /clear [14:32]

All right. So finally, now that I have merged my PR, here's what I do. I go back to Claude and I run /clear. This completely wipes away the context window. I am not compacting the window; I am clearing the window. The idea here is that each issue should contain all of the information that Claude needs to perform that work. It should be able to work on the issue from a cold start. And thanks to the scratchpads, and thanks to its ability to review PRs and all the previous work that's been done on the codebase, that issue should be descriptive enough for it to tackle it with no working memory. And this also frees up your context window. It will help you get better results while using fewer tokens.

## Claude via GitHub Actions [15:07]

Now, let me address a quick question, because you probably saw that Anthropic launched Claude via GitHub Actions, and this is a really cool feature that lets you tag Claude directly from GitHub and have it work on some stuff. So I have been playing with that a little bit. The primary reason why I'm not using it is because, as of today, that usage of the GitHub Actions is billed with metered billing against your API, even if you're on a Claude Max plan. So, I have upgraded now to the $200-a-month Claude Max plan. I am finding it is totally worth it to get the Claude 4 Opus usage. I've just been thrilled with the value I'm getting there, but then I was kind of bummed to get a $50 API bill from Anthropic after I had been tagging Claude in GitHub. And so I was like, man, if I'm already getting unlimited access, I might as well just do it in the console. And candidly, I think I'm getting much better insight and results from using Claude Code in the console. I actually talked to a friend, Martin, who works at Anthropic, and his suggestion was: use Claude in the GitHub Actions when you're, say, doing a PR review and there's a small change, perhaps a copy change or just something tiny that needs to be tweaked, but you don't necessarily want to go into the codebase and do it yourself. It's really good for those smaller fixes, but you probably don't want to be using GitHub Actions for really large, meaningful changes to your codebase.

## Work Trees: Run Parallel Agents [16:28]

Finally, let me just talk about work trees, because Anthropic talks about this quite a bit. The best analogy that I have for work trees would be multi-tabling in poker. You know, you start playing online poker on a single table, and then you realize you're just kind of clicking buttons every once in a while; you could probably play two tables at a time. And then at some point you've bought a bigger monitor and you're playing four or eight tables at a time. That's sort of what running Claude with work trees feels like. Instead of having different poker tables up, you're just tabbing between different tabs in the terminal. And generally, I think that the industry as a whole is excited about running coding agents in parallel or in the background. And work trees are the mechanism you can use with Git to run multiple instances of Claude working on multiple issues at the same time. I personally ran into two issues with it. The first is that because I'm just getting started building this app, there's so much work that simply needs to be done iteratively; there aren't a lot of features that can be developed in parallel where the codebases don't touch each other. And I found the interface for working with work trees to be a little bit clunky. The general idea behind a work tree is that you create copies of your Git repo in separate subdirectories, and then you have one version of Claude running in, you know, subdirectory A on, let's call it, branch A, and then you have another one running on branch B, and they're running in parallel in two different directories on your computer. The issue that I had was that when I spun up a new Claude session, I didn't have the same permissions that I had already approved in that first session of Claude. And so every time I created a new branch, I was having to approve all the permissions again, and I just felt like I was having to babysit it a lot more. And then what happens is, after you have finished work on that issue or that branch, you're supposed to delete that directory and then create a new work tree again. And so every time you're creating a new work tree, you're reapproving those permissions. And it just felt like I was doing more babysitting and more cleaning up merge conflicts than it was really worth. I found that just working with a single Claude instance is sufficient for me. Now, if you made it this far, you'd probably also enjoy the video I did on Claude Code pro tips. So check that one out.