YouTube Deep Summary

How the 1% ACTUALLY Build Apps with Cursor's Context Engineering

AI LABS • 2025-07-09 • 13:29 minutes • YouTube

📝 Transcript (403 entries):

You probably know about vibe coding, but it turns out that when the term was coined by Andrej Karpathy, it wasn't like he invented the practice. He just gave a name to something people had been doing for months. And now he has done it again. Karpathy, who was a founding member of OpenAI, has unknowingly coined another term: context engineering. And again, just like vibe coding, it's nothing new. Many people have been doing this already. But one thing he is right about is that this is absolutely necessary, and it's the way we should be coding with AI. Now, this is not just an explainer video. We will not only go hands-on with what context engineering is and how you would prepare the context, but I will also show you how to properly use that context. And this is what most of you are completely missing. Now, the first thing to understand is that all models have context windows. A context window is the amount of text the model can keep in view at one time. With the prompts we used to give LLMs, we were phrasing things in a specific way to get a single good answer. But here, with context engineering, we're giving the model all the relevant facts, rules, tools, and information and filling its context window so that there is far less room for hallucination and the model knows what it is doing. This way, we're actually working on what the model needs to remember in order to accomplish what we want. Now, if we look at the tweet, in the first part he tells us how we're shifting from prompt engineering to context engineering and what context engineering essentially consists of. There's also this diagram I found from another person that explains pretty nicely how context engineering isn't just a new form of prompt engineering. It's a broader term that includes everything from RAG to memory, and it also includes prompt engineering within it. So this whole art has now been termed context engineering.
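
To make the idea of a context window a little more concrete, here is a minimal sketch of checking whether a set of context pieces fits in a window. It is only an illustration: the 4-characters-per-token ratio is a rough rule of thumb I am assuming, not a real tokenizer, and the function names and window sizes are hypothetical.

```python
# Assumption: about 4 characters per token. This is a crude heuristic;
# real tokenizers vary by model and by the kind of text.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Very rough token estimate for a piece of text."""
    if not text:
        return 0
    return max(1, len(text) // CHARS_PER_TOKEN)


def fits_in_window(pieces: dict[str, str], window_tokens: int,
                   reserve_tokens: int = 0) -> tuple[bool, int]:
    """Return (fits, used_tokens) for the given context pieces.

    reserve_tokens is headroom kept free for the model's own output.
    """
    used = sum(estimate_tokens(text) for text in pieces.values())
    return used + reserve_tokens <= window_tokens, used
```

A check like this is only useful as a sanity test before loading several documents at once; the actual limit depends on the model and the app.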
In the second part of the tweet, Andrej tells us that it's not only the context we need to look at, it's also the app we're using, because the LLM app isn't just a ChatGPT wrapper anymore. It doesn't just contain an LLM. It uses that LLM and gives you tools and workflows that are actually useful. He specifies that the LLM app needs to have the components necessary for context engineering, and that apps like Cursor, Claude Code, and other coding agents aren't just ChatGPT wrappers anymore. They are important components of context engineering. Now, on the topic of the LLM apps we need to use, we have Cursor and Claude Code. Both have their own strengths, but right now Claude Code is much more powerful as an agent. Cursor has been catching up though, recently adding features like the to-do lists that Claude Code already had. Bottom line, the context engineering workflow I'm about to show you works in either app, so you can use whichever one you've purchased. Now, I've explained what context engineering is, and you're all probably excited, thinking we should just go ahead and give everything to the model to get the exact results we want. But here's the thing with these coding models. Remember the context window I told you about? Well, once that fills up, the chances of hallucination increase rather than things getting more accurate. So efficient management of the context window is crucial. You can't just dump everything into one file. You need to break it down into pieces and only give each piece to the model when it's needed. So, right now, I'm going to explain my workflow with context engineering. I've been doing this since long before the term was coined. It's just a new trend now. But there is something new I learned while watching a video by Cole Medin, which was actually pretty great. It introduced the idea of including external documentation in the context window as well. So I got inspired by that and updated my workflow. Coming back to my workflow.
First we start with a PRD, which is the project requirements document. In it we list the features we want. Based on that, the model can decide what's best for us. If you're a developer, you can add specific requirements to the PRD as well. For example, I've mentioned that I want Next.js for the front end and FastAPI for the back end. But even if you don't know what you want, the workflow I'm about to show you can configure all of that automatically and get you a ready-made app. Now, let's come to the part of the workflow that actually holds the context for the models: the documentation folder. These four files are the most important: the implementation plan, the project structure (which is currently empty because it's still being generated), the UI and UX documentation, and finally bug tracking. These files are the different components that the AI model needs to complete the project. That is the context the model will use, but the model should also know how to use it. For that, I've set up two rules, the generate rule and the workflow rule. First, the generate rule converts the PRD into all the other files. It basically generates the full context for the development process. Once all that context has been generated, the model's context limit is full for that session and I won't be able to generate quality code further on. You can see this in Cursor because it uses models with limited context windows. They fill up quickly. If I were using Claude Code, this wouldn't happen as soon. But once I have all four files generated and ready, that becomes our complete context. Now, when the model starts working on the project, it doesn't need to keep all of that loaded in its context. Otherwise, it'll just hallucinate more and more. That's why we move to an implementation plan, so we can work through everything step by step. Now you might ask, how does Cursor even know how to use these files? That's where the workflow rule comes in.
It's always attached in Cursor and tells it exactly how to use each file. When implementing the project, it looks at the implementation file. When working on UI and UX, it refers to the UI and UX documentation. If it's about to create something new or run a command, it checks the project structure to make sure it stays consistent. And when there's an error or a bug, it first looks in the bug tracking file to see whether it has already been documented. This workflow rule regulates that entire process. I've purposely kept it small. You can see it's way smaller than the generate file, which is really long. The implementation plan is even longer. This is so that the workflow file, which is always in the context, takes up as little space there as possible. In the implementation plan, we also have task lists. These are broader task lists, and each has its own subtasks. You might say that Cursor and Claude Code already have their own task lists. Yes, they do. But here, when a subtask comes up, the model decides what to do with it: if the subtask is too long, it creates a new task list in the LLM app to break it down, and if it's simple, it just follows the steps outlined in the current one. For example, take the core features stage. And by the way, this implementation plan is for the entire app, not just the MVP, and it proceeds step by step. You can narrow it down to just the MVP if you want. Right now, the full app's development is estimated at 3 to 4 weeks. If it were just the MVP, it would be a matter of hours. When it comes to something like designing and implementing the database and schema, Cursor would break that task down into further tasks. This is where Andrej's advice about the LLM app being good enough comes in. And Cursor is good enough that it can decide on its own.
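
The routing that the workflow rule describes can be sketched as a small lookup: each kind of task maps to the one documentation file that should be loaded for it. This is only an illustration of the idea, not the actual rule file from the video. The folder name, file names, and task labels below are all hypothetical.

```python
from pathlib import Path

# Hypothetical folder and file names; the video does not spell them out.
DOCS_DIR = Path("Docs")

# One documentation file per kind of task, mirroring the workflow rule:
# implementation -> plan, UI work -> UI/UX doc, new files or commands ->
# project structure, errors -> bug tracking.
CONTEXT_FOR_TASK = {
    "implement": "Implementation.md",
    "ui": "UI_UX_doc.md",
    "structure": "project_structure.md",
    "bug": "Bug_tracking.md",
}


def context_file_for(task_kind: str) -> Path:
    """Return the single documentation file relevant to this kind of task."""
    try:
        return DOCS_DIR / CONTEXT_FOR_TASK[task_kind]
    except KeyError:
        raise ValueError(f"unknown task kind: {task_kind!r}") from None


def load_context(task_kind: str) -> str:
    """Load only the relevant file; return an empty string if it is missing."""
    path = context_file_for(task_kind)
    return path.read_text(encoding="utf-8") if path.exists() else ""
```

The point of the design is that only one file enters the context per step, so the always-attached rule can stay short while the longer documents are read on demand.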
And if you think that when you open a new chat, meaning the context window resets, Cursor will forget what the project was about, you don't have to worry, because everything's already written down in the implementation file. That's the core idea in context engineering. Of course, both of these files will be in the description, and you can generate these documentation files for yourself. But again, I encourage you to create your own workflows. The important thing isn't that you got these implementation files from me. The important thing is that you understand what context engineering actually is. And with that understanding, you can build your own implementation, your own generation workflows, and the exact set of files that Cursor or Claude Code needs to follow. Oh, and if you're enjoying the content we're making, I'd really appreciate it if you hit that subscribe button. We're also testing out channel memberships. We launched the first tier as a test and 90 people have joined so far. The support's been incredible, so we're thinking about launching additional tiers. Right now, members get priority replies to their comments, which is perfect if you need feedback or have questions. Now, there's another important point you need to understand about context engineering. You can see that I wanted to make the implementation plan for an MVP, but it's been taken to an advanced level because of my generate prompt. Even though I mentioned that I wanted an MVP, the generate file said that the whole application should be developed, and its example stages were those of a whole app rather than an MVP. So it didn't really take the MVP scope into account. This brings me to the crucial point. You need to be very careful and read everything you give to these AI models, because they will follow instructions blindly. If there are any conflicts or contradictions, there's no telling which one they'll follow.
In my case, I specifically told it in the generate file that it should first build the foundations and do the setup, then include advanced features. So it is following that, because that's what it was told to do. Whether it's a file, a config, or anything else you generate from Claude or ChatGPT models, please don't just blindly accept it. You need to read through everything carefully and adjust it to your own workflow. I highly recommend taking an hour to go through everything that needs to be done so that you don't face problems later on. Now, another important thing I recommend is this: decide the tech stack yourself. Even though the whole workflow may look automated, eventually it's your decision, because what if it integrates something that works with the PRD but doesn't work for you? For example, you have access to OpenAI models but it integrates Claude models. Both work in the project, but not for you. So instead of integrating the tech stack discovery process into your entire context ecosystem, I recommend researching it yourself. Now let me show you the workflow in action. As you can see on the left side, there's a process going on. I've asked it to start building the app and begin with stage one. You might notice I started a completely new chat. It doesn't know anything about the project, but it gets the context from the implementation plan. Everything's written at the top: what we're building, the tech stack, and so on. Let me show you what it did. It went ahead and created the to-dos. Pretty simple stuff. It just picked up what I had written and started implementing it. It didn't need to divide the subtasks itself. This new feature in Cursor is something I really like. It just copied exactly what I had specified and started executing it. As you can see, it's implementing them one by one. It's installing everything mentioned and going step by step.
This was already available in Claude Code, but now that it's in Cursor, it makes things a lot easier. The model knows exactly what it's doing. The context is right there and it's following the instructions step by step. And you can see that, as I told it to make all the folders, they're starting to take shape. If I collapse the view, we can already see our back end, our front end, our scripts, and our shared folders all starting to form. The whole project is now being built from the ground up. You can see that everything is being generated now. All my basic foundations are being set up. If I go into the back end, you can see that the Python app is being configured, because the model already knows how to set it up. It understands what tech stack is going to be used, so it didn't need to think about that. Even if you do ask it to choose the stack, that's not a problem. Deciding it yourself was just my recommendation. Now that it knows what it needs to connect, it's using all the APIs and laying down whatever foundation is required. Everything it sets up now will be included in that initial structure. One thing with software development is that you can't just go ahead and implement any feature at any time. You need the basic structure in place first. If that foundation isn't there, either you'll end up restarting everything from scratch or the amount of modification needed later will be too much. Reconfiguring and adding features without proper scalability in mind, well, that kind of project just isn't good. Anyway, as you can see with this new to-do feature, even though it's moving slowly, the advantage is that everything stays on track. It doesn't forget what needs to be done, and everything is being completed thoroughly. The model won't move on until it verifies the current step. So now you can see it's progressing. This will take some time, and stage one is just about setting everything up. But here's what I want you to take away.
While these implementation plans will be available in the description below, you should create your own. If you want to use Claude Code, you can use these files with Claude Code as well. Just drag them in, and when you enter the slash command, it'll generate everything for you. For example, I created this custom command called generate implementation. All I need to do is copy the generate workflow file over here. That's it. It'll generate everything for me. Another thing I really like about Claude Code is that it includes a CLAUDE.md file, which holds all the codebase documentation. But for managing the context window, using this documentation approach with multiple files is way better than dumping everything into one single file. Also, if I ask it to do something that requires multiple agents, Claude Code will spin up those agents. This is where Claude Code has a slight advantage, because all the agents can work at once. But that only helps in tasks where you don't need to go step by step. For example, this entire implementation plan needs to be done step by step. Each part has to connect sequentially. You can't just install everything in one go. Sure, for things like installing dependencies or setting up packages, that parallelism can help. But for most of these workflows, it has to be done one step at a time. Now, there are some things, like generating UI variations, where multiple agents really shine. They can each create different variations for you. And for that, we actually have a separate video dedicated to Claude Code that also falls under context management, because in that video I've used these rule files as well. So definitely go check that out. That brings us to the end of this video. If you'd like to support the channel and help us keep making videos like this, you can do so by using the Super Thanks button below. As always, thank you for watching and I'll see you in the next
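
As a rough sketch of the custom command step mentioned above: to my understanding, Claude Code reads project-level custom slash commands from Markdown files under `.claude/commands/`, with the file name becoming the command name. Treat that location as an assumption to verify against the current documentation. The helper name and the hyphenated command name `generate-implementation` are hypothetical.

```python
from pathlib import Path


def install_command(rule_text: str, project_root: Path,
                    name: str = "generate-implementation") -> Path:
    """Write the generate rule as a Markdown command file and return its path.

    Assumption: project-level commands live in <project>/.claude/commands/.
    """
    commands_dir = project_root / ".claude" / "commands"
    commands_dir.mkdir(parents=True, exist_ok=True)
    target = commands_dir / f"{name}.md"
    target.write_text(rule_text, encoding="utf-8")
    return target
```

With the file in place, the command would be invoked from the chat by its name, prefixed with a slash.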