[00:00] (0.24s)
You probably know about vibe coding, but
[00:02] (2.24s)
turns out that when it was coined by
[00:03] (3.92s)
Andrej Karpathy, it wasn't like he
[00:05] (5.84s)
invented it. He just coined something
[00:07] (7.52s)
that people had been doing for months.
[00:09] (9.36s)
And now he has done it again. Karpathy,
[00:11] (11.52s)
who was a founding member at OpenAI, has
[00:14] (14.00s)
unknowingly coined another term, context
[00:16] (16.72s)
engineering. And again, just like vibe
[00:18] (18.80s)
coding, it's nothing new. Many people
[00:20] (20.72s)
have been doing this practice. But one
[00:22] (22.64s)
thing that he is right about is that
[00:24] (24.32s)
this is absolutely necessary and this is
[00:26] (26.72s)
the way that we should be coding with
[00:28] (28.48s)
AI. Now, this is not just an explainer
[00:31] (31.04s)
video. We will not only be going
[00:32] (32.56s)
hands-on with what context engineering
[00:34] (34.88s)
is and how you would prepare the
[00:36] (36.48s)
context, but I will also show you how to
[00:38] (38.56s)
properly use that context. And this is
[00:40] (40.64s)
what most of you are completely missing.
[00:42] (42.72s)
Now, the first thing to understand is
[00:44] (44.40s)
that all models have context windows. It
[00:46] (46.88s)
is the amount of text that they can
[00:48] (48.56s)
currently remember. With the prompts
[00:50] (50.16s)
that we were giving LLMs, we were
[00:52] (52.00s)
phrasing things in a specific way to get
[00:54] (54.32s)
a single good answer from the LLM. But
[00:56] (56.72s)
here with context engineering, we're
[00:58] (58.56s)
giving all relevant facts, rules, tools,
[01:01] (61.52s)
and information and filling in the
[01:03] (63.52s)
model's context window so that the
[01:05] (65.68s)
chance of hallucination drops and the
[01:07] (67.92s)
model knows what it is doing. This way,
[01:10] (70.24s)
we're actually working on what the model
[01:12] (72.24s)
needs to remember in order to accomplish
[01:14] (74.32s)
what we want. Now, if we look at the
[01:16] (76.32s)
tweet in the first part, he tells us
[01:18] (78.32s)
about how we're now shifting from prompt
[01:20] (80.48s)
engineering to context engineering and
[01:22] (82.80s)
essentially what context engineering
[01:24] (84.56s)
consists of. There's also this diagram I
[01:26] (86.80s)
found from another person that pretty
[01:28] (88.48s)
nicely explains how context engineering
[01:30] (90.96s)
isn't just a new form of prompt
[01:32] (92.96s)
engineering. It's a broader term that
[01:34] (94.80s)
includes everything from RAG to memory
[01:37] (97.04s)
and also includes prompt engineering
[01:39] (99.04s)
within it. So this whole art has now
[01:40] (100.88s)
been termed context engineering. In
[01:42] (102.96s)
the second part of the tweet, Andrej
[01:44] (104.64s)
actually tells us that it's not only the
[01:46] (106.56s)
context we need to look at, it's also
[01:48] (108.56s)
the app we're using because the LLM app
[01:51] (111.04s)
isn't just a ChatGPT wrapper anymore.
[01:53] (113.44s)
It doesn't just contain an LLM. It uses
[01:55] (115.84s)
that LLM and gives you tools and
[01:58] (118.00s)
workflows that are actually useful. He
[02:00] (120.24s)
specifies that the LLM app needs to have
[02:02] (122.64s)
the components necessary for context
[02:04] (124.80s)
engineering and that apps like Cursor,
[02:07] (127.12s)
Claude Code, and other coding agents
[02:09] (129.28s)
aren't just ChatGPT wrappers anymore.
[02:11] (131.60s)
They are actually important components
[02:13] (133.36s)
in context engineering. Now, on the
[02:15] (135.84s)
topic of the LLM apps we need to use, we
[02:18] (138.16s)
have Cursor and Claude Code. Both have
[02:20] (140.24s)
their own strengths, but right now
[02:21] (141.84s)
Claude Code is much more powerful as an
[02:23] (143.84s)
agent. Cursor has been catching up
[02:25] (145.76s)
though, recently adding features like
[02:27] (147.60s)
the to-do lists that Claude Code already
[02:30] (150.08s)
had. Bottom line, the context
[02:32] (152.00s)
engineering workflow I'm about to show
[02:33] (153.92s)
you works in either app. So, you can use
[02:36] (156.48s)
whichever one you've purchased. Now,
[02:38] (158.40s)
I've explained what context engineering
[02:40] (160.40s)
is, and you're all probably excited
[02:42] (162.40s)
thinking we should just go ahead and
[02:44] (164.08s)
give everything to the model to get the
[02:46] (166.00s)
exact results we want. But here's the
[02:47] (167.84s)
thing with these coding models. Remember
[02:49] (169.60s)
the context window I told you about?
[02:51] (171.44s)
Well, once that fills up, the chances of
[02:53] (173.68s)
hallucination increase rather than
[02:55] (175.76s)
things getting more accurate. So,
[02:57] (177.44s)
efficient management of the context
[02:59] (179.12s)
window is crucial. You can't just dump
[03:01] (181.12s)
everything into one file. You need to
[03:02] (182.96s)
break it down into pieces and only give
[03:05] (185.04s)
it to the model when it's needed. So,
[03:06] (186.88s)
right now, I'm going to explain my
[03:08] (188.64s)
workflow with context engineering. I've
[03:10] (190.96s)
been doing this long before the term was
[03:12] (192.72s)
even coined. It's just a new trend now.
[03:14] (194.72s)
But there is something new I learned
[03:16] (196.56s)
while watching a video by Cole Medin
[03:18] (198.56s)
which was actually pretty great. It
[03:20] (200.08s)
introduced the idea of including
[03:22] (202.08s)
external documentation in the context
[03:24] (204.40s)
window as well. So I got inspired by
[03:26] (206.32s)
that and updated my workflow. Coming
[03:28] (208.24s)
back to my workflow. First we start with
[03:30] (210.40s)
a PRD, which is the product requirements
[03:32] (212.72s)
document. In that we list the features
[03:34] (214.88s)
we want. Based on that the model can
[03:37] (217.20s)
decide what's best for us. If you're a
[03:39] (219.20s)
developer you can add specific
[03:40] (220.88s)
requirements to the PRD as well. For
[03:42] (222.96s)
example, I've mentioned that I want
[03:44] (224.72s)
Next.js for the front end and FastAPI
[03:47] (227.36s)
for the back end. But even if you don't
[03:49] (229.20s)
know what you want, the workflow I'm
[03:51] (231.04s)
about to show you can automatically
[03:52] (232.88s)
configure all of that and get you a
[03:54] (234.80s)
ready-made app.
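To give you a rough idea, a minimal PRD for this kind of workflow might look something like the sketch below; the headings and features are just placeholders, not my exact file, and only the Next.js and FastAPI choices come from the example above:

```markdown
# PRD: Example App (illustrative sketch)

## Overview
One or two sentences on what the app does and who it is for.

## Core Features
- User accounts and authentication
- The main workflow the user performs in the app
- A dashboard showing activity and results

## Tech Stack (optional, only if you already know what you want)
- Frontend: Next.js
- Backend: FastAPI
- Database: leave open and let the planning step propose one

## Scope
Build the MVP first; advanced features only after the foundation is in place.
```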
[03:56] (236.64s)
Now, let's come to the part of the engineering workflow that
[03:58] (238.56s)
actually has the context for the models,
[04:00] (240.64s)
the documentation folder. These four
[04:02] (242.72s)
files are the most important: the
[04:04] (244.56s)
implementation plan, the project
[04:06] (246.24s)
structure which is currently empty
[04:08] (248.00s)
because it's still being generated, the
[04:09] (249.92s)
UI and UX documentation, and finally bug
[04:12] (252.72s)
tracking. These files are the different
[04:14] (254.80s)
components that the AI model needs to
[04:17] (257.28s)
complete the project.
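As a rough picture, the documentation folder can be as simple as a handful of markdown files; the folder and file names here are placeholders, not my exact ones:

```
Docs/
├── Implementation.md       # staged plan with task lists and subtasks
├── project_structure.md    # folder layout, filled in as the app takes shape
├── UI_UX_doc.md            # design system, key screens, user flows
└── Bug_tracking.md         # known issues and how they were resolved
```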
[04:18] (258.96s)
Now, this was context that the model will use, but the
[04:21] (261.04s)
model should also know how to use it.
[04:22] (262.96s)
For that, I've set up two rules, the
[04:24] (264.96s)
generate rule and the workflow rule. First,
[04:27] (267.20s)
the generate rule converts the PRD into
[04:29] (269.52s)
all the other files. It basically
[04:31] (271.28s)
generates full context for the
[04:32] (272.96s)
development process.
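A heavily trimmed sketch of what such a generate rule might contain, again using the placeholder file names, could read like this:

```markdown
# Generate rule (illustrative sketch)

Read the PRD, then produce the context documents for this project:

1. Docs/Implementation.md: break the work into stages, each with a task list
   and a rough time estimate; foundation and setup first, advanced features later.
2. Docs/project_structure.md: propose the folder layout for the chosen stack.
3. Docs/UI_UX_doc.md: describe the design system, key screens, and user flows.
4. Docs/Bug_tracking.md: create it empty, with a template for logging issues.

Do not write application code yet; only generate these documents.
```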
[04:34] (274.72s)
Once all that context has been generated, the model's
[04:36] (276.88s)
context window fills up for that session
[04:39] (279.04s)
and I won't be able to generate quality
[04:40] (280.96s)
code further on. You can see this in
[04:42] (282.80s)
Cursor because it uses models with
[04:44] (284.80s)
limited context windows. They fill up
[04:46] (286.88s)
quickly. If I were using Claude Code,
[04:49] (289.04s)
this wouldn't happen as soon. But once I
[04:51] (291.12s)
have all four files generated and ready,
[04:53] (293.52s)
that becomes our complete context. Now
[04:55] (295.60s)
if the model starts working on the
[04:57] (297.20s)
project, it doesn't need to keep all of
[04:59] (299.12s)
that loaded in its context. Otherwise,
[05:01] (301.36s)
it'll just hallucinate more and more.
[05:03] (303.28s)
That's why we move to an implementation
[05:05] (305.20s)
plan so we can work through everything
[05:07] (307.28s)
step by step. Now you might ask, how
[05:09] (309.52s)
does Cursor even know how to use these
[05:11] (311.52s)
files? That's where the workflow rule
[05:13] (313.52s)
comes in. It's always attached to Cursor
[05:15] (315.68s)
and tells it exactly how to use each
[05:17] (317.68s)
file. When implementing the project, it
[05:19] (319.60s)
looks at the implementation file. When
[05:21] (321.52s)
working on UI and UX, it refers to the
[05:23] (323.92s)
UI and UX documentation. If it's about
[05:26] (326.32s)
to create something new or run a
[05:28] (328.08s)
command, it checks the project structure
[05:30] (330.00s)
to make sure it's consistent. And when
[05:31] (331.84s)
there's an error or a bug, it first
[05:33] (333.76s)
looks into the bug tracking file to make
[05:35] (335.84s)
sure it wasn't already documented. This
[05:37] (337.92s)
workflow rule regulates that entire
[05:40] (340.08s)
process. I've purposely kept it small.
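As an illustrative sketch, using the placeholder file names from above, an always-attached workflow rule can be as short as this:

```markdown
# Workflow rule (illustrative sketch, always attached)

- Before implementing anything, read the current stage in Docs/Implementation.md
  and work through its tasks in order.
- For any UI or UX work, follow Docs/UI_UX_doc.md.
- Before creating files or running commands, check Docs/project_structure.md
  and keep it consistent.
- When an error or bug appears, check Docs/Bug_tracking.md first; if it is new,
  document it there along with the fix.
- Mark each task complete in Docs/Implementation.md before moving on.
```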
[05:42] (342.32s)
You can see it's way smaller than the
[05:44] (344.00s)
generate file, which is really long. The
[05:46] (346.00s)
implementation plan is even longer. This
[05:48] (348.08s)
is so that the workflow file, which is always
[05:50] (350.24s)
in the context, takes up as little space
[05:52] (352.32s)
there as possible. In the
[05:53] (353.92s)
implementation plan, we also have task
[05:56] (356.08s)
lists. These are broader task lists and
[05:58] (358.48s)
then they have their own subtasks. And
[06:00] (360.40s)
you might say that Cursor and Claude Code
[06:02] (362.16s)
already have their task lists. Yes, they
[06:04] (364.24s)
do. But here when a subtask comes up, it
[06:06] (366.64s)
decides whether to create a new task
[06:08] (368.48s)
list in the LLM app to break that
[06:10] (370.72s)
specific subtask if it is too long, or, if
[06:13] (373.20s)
it's simple, to just follow the steps
[06:15] (375.12s)
outlined in the current one.
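To give you a feel for how those task lists are laid out, here's an illustrative sketch of two stages; the stage names, tasks, and estimates are placeholders:

```markdown
## Stage 1: Foundation & Setup (estimate: ~1 week)
- [ ] Initialize the Next.js frontend and FastAPI backend projects
- [ ] Set up shared folders, environment configuration, and linting
- [ ] Add a database connection and a health-check endpoint

## Stage 2: Core Features (estimate: ~2 weeks)
- [ ] Design and implement the database schema
      (large enough that the agent may break it into its own to-do list)
- [ ] Implement authentication
- [ ] Build the main user workflow end to end
```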
[06:16] (376.96s)
For example, when we reach the core feature
[06:18] (378.88s)
stage (and by the way, this
[06:20] (380.24s)
implementation plan is for the entire app,
[06:22] (382.40s)
not just the MVP), it proceeds step by
[06:24] (384.88s)
step. You can narrow it down to just the
[06:26] (386.80s)
MVP if you want. Right now, the full
[06:29] (389.20s)
app's development is estimated at 3 to 4
[06:31] (391.68s)
weeks. If it were just the MVP, it would
[06:33] (393.92s)
be a matter of hours. When it comes to
[06:35] (395.76s)
something like designing and
[06:37] (397.20s)
implementing the database and schema,
[06:39] (399.28s)
that task would have been broken down
[06:40] (400.96s)
into further tasks by Cursor. This is
[06:43] (403.28s)
where Andrej's advice on the LLM app
[06:45] (405.36s)
being good enough comes in. And Cursor
[06:47] (407.12s)
is good enough that it can decide on its
[06:48] (408.88s)
own. And if you think that when you open
[06:50] (410.80s)
a new chat, meaning the context window
[06:53] (413.04s)
resets, Cursor will forget what the
[06:55] (415.04s)
project was about, you don't have to
[06:56] (416.64s)
worry about Cursor forgetting because
[06:58] (418.40s)
everything's already written down in the
[07:00] (420.32s)
implementation file. That's the core
[07:02] (422.16s)
idea in context engineering. Of course,
[07:04] (424.56s)
both of these files will be in the
[07:06] (426.16s)
description and you can generate these
[07:08] (428.00s)
documentation files for yourself. But
[07:10] (430.08s)
again, I encourage you to create your
[07:11] (431.92s)
own workflows. The important thing isn't
[07:14] (434.00s)
that you got these implementation files
[07:16] (436.00s)
from me. The important thing is that you
[07:17] (437.92s)
understand what context engineering
[07:19] (439.76s)
actually is. And with that
[07:21] (441.04s)
understanding, you can build your own
[07:22] (442.88s)
implementation, your own generation
[07:24] (444.80s)
workflows, and the exact set of files
[07:26] (446.80s)
that Cursor or Claude Code needs to
[07:28] (448.80s)
follow. Oh, and if you're enjoying the
[07:30] (450.96s)
content we're making, I'd really
[07:32] (452.64s)
appreciate it if you hit that subscribe
[07:34] (454.48s)
button. We're also testing out channel
[07:36] (456.40s)
memberships. We launched the first tier as
[07:38] (458.32s)
a test and 90 people have joined so far.
[07:40] (460.80s)
The support's been incredible. So, we're
[07:42] (462.80s)
thinking about launching additional
[07:44] (464.16s)
tiers. Right now, members get priority
[07:46] (466.80s)
replies to your comments. Perfect if you
[07:48] (468.88s)
need feedback or have questions. Now,
[07:50] (470.96s)
there's another important point you need
[07:52] (472.64s)
to understand about context engineering.
[07:54] (474.96s)
You can see that I wanted to make the
[07:56] (476.56s)
implementation plan for an MVP, but it's
[07:59] (479.04s)
been taken to an advanced level because
[08:00] (480.96s)
of my generate prompt. Even though I
[08:02] (482.80s)
mentioned that I wanted an MVP, since it
[08:05] (485.04s)
was written in the generate file that
[08:06] (486.80s)
the whole application should be
[08:08] (488.16s)
developed, with the example stages covering a
[08:10] (490.32s)
whole app rather than an MVP, it didn't
[08:12] (492.56s)
really take the MVP scope into account.
[08:14] (494.96s)
This brings me to the crucial point. You
[08:16] (496.88s)
need to be very careful and read
[08:18] (498.56s)
everything you give to these AI models
[08:20] (500.64s)
because they will follow instructions
[08:22] (502.24s)
blindly. If there are any conflicts or
[08:24] (504.40s)
contradictions, there's no telling which
[08:26] (506.16s)
one they'll follow. In my case, I
[08:27] (507.92s)
specifically told it in the generate
[08:29] (509.68s)
file that it should first build the
[08:31] (511.36s)
foundations and do the setup, then
[08:33] (513.28s)
include advanced features. So, it is
[08:35] (515.44s)
following that because that's what it
[08:37] (517.04s)
was told to do. Whether it's a file, a
[08:39] (519.20s)
config, or anything else you generate
[08:41] (521.12s)
from Claude or ChatGPT models, please
[08:43] (523.36s)
don't just blindly accept it. You need
[08:45] (525.20s)
to read through everything carefully and
[08:47] (527.36s)
adjust it to your own workflow. I highly
[08:49] (529.76s)
recommend taking an hour to go through
[08:51] (531.92s)
everything that needs to be done so that
[08:53] (533.92s)
you don't face problems moving on. Now,
[08:56] (536.08s)
another important thing that I recommend
[08:57] (537.92s)
doing is this. I suggest you decide the
[09:00] (540.48s)
tech stack yourself because even though
[09:02] (542.56s)
the whole workflow may look automated,
[09:04] (544.96s)
ultimately it's your decision, because
[09:06] (546.80s)
what if it integrates something that
[09:08] (548.56s)
works with the PRD but doesn't work with
[09:10] (550.64s)
you? For example, you have access to
[09:12] (552.56s)
OpenAI models but it integrates Claude
[09:15] (555.12s)
models. Both work in the project but not
[09:17] (557.52s)
for you. So instead of integrating the
[09:19] (559.52s)
tech stack discovery process into your
[09:22] (562.00s)
entire context ecosystem, I recommend
[09:24] (564.48s)
researching it yourself. Now let me show
[09:26] (566.24s)
you the workflow in action. As you can
[09:28] (568.24s)
see on the left side, there's a process
[09:30] (570.16s)
going on. I've asked it to start
[09:31] (571.76s)
building the app and begin with stage
[09:33] (573.68s)
one. You might notice I started a
[09:35] (575.52s)
completely new chat. It doesn't know
[09:37] (577.28s)
anything about the project, but it gets
[09:39] (579.04s)
the context from the implementation
[09:40] (580.80s)
plan. Everything's written at the top,
[09:42] (582.64s)
what we're building, the tech stack, and
[09:44] (584.72s)
so on. Let me show you what it did. It
[09:46] (586.48s)
went ahead and created the to-dos.
[09:48] (588.16s)
Pretty simple stuff. It just picked up
[09:49] (589.84s)
what I had written and started
[09:51] (591.12s)
implementing them. It didn't need to
[09:52] (592.56s)
divide the subtasks itself. This new
[09:54] (594.64s)
feature in Cursor is something I really
[09:56] (596.40s)
like. It just copied exactly what I had
[09:58] (598.48s)
specified and started executing it. As
[10:00] (600.48s)
you can see, it's implementing them one
[10:02] (602.16s)
by one. It's installing everything
[10:03] (603.92s)
mentioned and going step by step. This
[10:06] (606.00s)
was already available in Claude Code,
[10:07] (607.84s)
but now that it's in Cursor, it makes
[10:09] (609.76s)
things a lot easier. The model knows
[10:11] (611.84s)
exactly what it's doing. The context is
[10:13] (613.92s)
right there and it's following the
[10:15] (615.44s)
instructions step by step. And you can
[10:17] (617.28s)
see that as I told it to make all the
[10:19] (619.20s)
folders, they're starting to take shape.
[10:21] (621.20s)
If I collapse the view, we can already
[10:23] (623.20s)
see our back end, our front end, our
[10:25] (625.44s)
scripts, and our shared folders all
[10:27] (627.52s)
starting to form.
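At this point, the project structure document can start filling in too. A sketch of what it might record, using the folders you can see here (the file name and notes are placeholders):

```markdown
# Docs/project_structure.md (illustrative sketch)

root/
├── backend/    # FastAPI app: routers, models, services
├── frontend/   # Next.js app: pages, components, styles
├── scripts/    # setup and maintenance scripts
└── shared/     # types and utilities used by both sides
```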
[10:29] (629.52s)
The whole project is now being built from the ground up. You
[10:31] (631.76s)
can see that now everything is being
[10:33] (633.36s)
generated. All my basic foundations are
[10:35] (635.60s)
being set up. If I go into the back end,
[10:37] (637.76s)
you can see that the Python app is being
[10:39] (639.76s)
configured because the model already
[10:41] (641.60s)
knows how to set it up. It understands
[10:43] (643.52s)
what tech stack is going to be used. So,
[10:45] (645.68s)
it didn't need to think about that. Even
[10:47] (647.44s)
if you do ask it explicitly, it's not a
[10:49] (649.84s)
problem. But that was just my
[10:51] (651.12s)
recommendation. Now that it knows what
[10:52] (652.96s)
it needs to connect, it's using all the
[10:55] (655.04s)
APIs and laying down whatever foundation
[10:57] (657.68s)
is required. Everything it sets up now
[10:59] (659.84s)
will be included in that initial
[11:01] (661.60s)
structure. One thing with software
[11:03] (663.20s)
development is that you can't just go
[11:04] (664.88s)
ahead and implement any feature at any
[11:07] (667.12s)
time. You need the basic structure in
[11:09] (669.12s)
place first. If that foundation isn't
[11:11] (671.04s)
there, either you'll end up restarting
[11:13] (673.12s)
everything from scratch or the amount of
[11:15] (675.12s)
modification needed later will be too
[11:17] (677.12s)
much. And reconfiguring and adding features
[11:19] (679.36s)
without proper scalability in mind?
[11:21] (681.52s)
Well, that kind of project just isn't
[11:23] (683.36s)
good. Anyway, as you can see, this new
[11:25] (685.44s)
to-do feature, even though it's moving
[11:27] (687.44s)
slowly, the advantage is that everything
[11:29] (689.68s)
stays on track. It doesn't forget what
[11:31] (691.76s)
needs to be done and everything is being
[11:33] (693.76s)
completed thoroughly. The model won't
[11:35] (695.68s)
move on until it verifies the current
[11:37] (697.76s)
step. So now you can see it's
[11:39] (699.28s)
progressing. This will take some time
[11:41] (701.52s)
and stage one is just about setting
[11:43] (703.68s)
everything up. But here's what I want
[11:45] (705.20s)
you to take away. While these
[11:46] (706.64s)
implementation plans will be available
[11:48] (708.64s)
in the description below, you should
[11:50] (710.48s)
create your own. If you want to use
[11:52] (712.24s)
Claude Code, you can use these files
[11:54] (714.24s)
with Claude Code as well. Just drag them
[11:56] (716.32s)
in and when you enter the slash command,
[11:58] (718.48s)
it'll generate everything for you. For
[12:00] (720.32s)
example, I created this custom command
[12:02] (722.48s)
called generate implementation. And all
[12:04] (724.64s)
I need to do is copy the generate
[12:06] (726.32s)
workflow file over here. That's it.
[12:08] (728.24s)
It'll generate everything for me.
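On disk, a Claude Code custom command is just a markdown file under the project's .claude/commands/ folder, and the command name comes from the file name. Here's a rough sketch of how the generate workflow could be wired up that way; the file name and wording are placeholders:

```markdown
<!-- .claude/commands/generate-implementation.md (illustrative sketch) -->
Read the PRD in this repository, then generate the context documents:
Docs/Implementation.md, Docs/project_structure.md, Docs/UI_UX_doc.md,
and Docs/Bug_tracking.md, following the rules below.

[contents of the generate workflow file pasted here]
```

Typing /generate-implementation in the chat then runs that prompt.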
[12:10] (730.00s)
Another thing I really like about Claude
[12:11] (731.84s)
Code is that it includes a CLAUDE.md file
[12:14] (734.72s)
which holds all the codebase
[12:16] (736.08s)
documentation. But for managing the
[12:18] (738.00s)
context window, using this documentation
[12:20] (740.40s)
approach with multiple files is way
[12:22] (742.64s)
better than dumping everything into one
[12:24] (744.64s)
single file.
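One way to get the best of both is to keep CLAUDE.md tiny and have it point at the split documentation instead of holding everything itself; this is a sketch, not a prescription:

```markdown
<!-- CLAUDE.md (illustrative sketch) -->
This project is driven by the documents in Docs/:

- Docs/Implementation.md: current stage and task lists; work through it in order.
- Docs/project_structure.md: check before creating files or running commands.
- Docs/UI_UX_doc.md: follow this for any UI work.
- Docs/Bug_tracking.md: check it first on any error; log new bugs and fixes there.

Only pull the file relevant to the current task into context.
```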
[12:27] (747.12s)
Also, if I ask it to do something that requires multiple agents,
[12:29] (749.28s)
Claude Code will spin up those agents.
[12:31] (751.52s)
This is where Claude Code has a slight
[12:33] (753.44s)
advantage because all the agents can
[12:35] (755.60s)
work at once. But that only helps in
[12:37] (757.60s)
tasks where you don't need to go step by
[12:39] (759.92s)
step. For example, this entire
[12:41] (761.76s)
implementation plan needs to be done
[12:43] (763.92s)
step by step. Each part has to connect
[12:46] (766.32s)
sequentially. You can't just install
[12:48] (768.08s)
everything in one go. Sure, for things
[12:50] (770.00s)
like installing dependencies or setting
[12:52] (772.08s)
up packages, that parallelism can help.
[12:54] (774.48s)
But for most of these workflows, it has
[12:56] (776.48s)
to be done one step at a time. Now,
[12:58] (778.40s)
there are some things like generating UI
[13:00] (780.48s)
variations where multiple agents really
[13:02] (782.40s)
shine. They can each create different
[13:04] (784.32s)
variations for you. And for that, we
[13:06] (786.40s)
actually have a separate video dedicated
[13:08] (788.32s)
to Claude Code that also falls under
[13:10] (790.48s)
context management because in that
[13:12] (792.32s)
video, I've used these rule files as
[13:14] (794.24s)
well. So, definitely go check that out.
[13:16] (796.40s)
That brings us to the end of this video.
[13:18] (798.32s)
If you'd like to support the channel and
[13:20] (800.08s)
help us keep making videos like this,
[13:22] (802.16s)
you can do so by using the super thanks
[13:24] (804.16s)
button below. As always, thank you for
[13:26] (806.48s)
watching and I'll see you in the next