[00:00] (0.12s)
this is going to sound dramatic but if
[00:01] (1.76s)
you're in tech, if you're a programmer
[00:03] (3.28s)
building apps or just interested I
[00:05] (5.80s)
really think if you don't understand
[00:07] (7.84s)
this then you're going to get completely
[00:10] (10.16s)
left behind in the coming years. Y
[00:12] (12.28s)
Combinator is saying this is going to be
[00:14] (14.24s)
10 times bigger than SaaS. It actually led
[00:16] (16.88s)
to me switching to Windows for the first
[00:19] (19.16s)
time ever and buying a new laptop when
[00:21] (21.24s)
you think of AI apps you probably think
[00:22] (22.72s)
of, okay, ChatGPT, OpenAI, but once or twice
[00:26] (26.40s)
you might have heard of AI agents and
[00:28] (28.44s)
this article by Anthropic lays it out in
[00:30] (30.88s)
a really solid way you can think of
[00:33] (33.04s)
Agents as composable building blocks
[00:35] (35.92s)
similar to patterns in programming, and with
[00:38] (38.44s)
these building blocks you can either
[00:39] (39.64s)
build workflows which augment your code
[00:42] (42.96s)
and replace functions and can do a set
[00:45] (45.00s)
of actions or you have agents which are
[00:46] (46.92s)
a level higher they orchestrate your
[00:50] (50.04s)
workflows your functions so basically
[00:51] (51.96s)
they choose what actions to take and
[00:54] (54.40s)
then based on the result of that action
[00:55] (55.92s)
they choose what to do next.
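To make that distinction concrete, here is a minimal agent-loop sketch in Python; call_llm and the two workflow functions are hypothetical placeholders, not code from Anthropic's article or from any specific library.

```python
# Minimal agent-loop sketch: the model picks the next workflow based on past results.
# call_llm() and both workflow functions are hypothetical stand-ins.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # plug in OpenAI, Ollama, etc.

def search_google(query: str) -> str:
    return f"search results for {query!r}"  # placeholder

def crawl_site(url: str) -> str:
    return f"page text from {url}"  # placeholder

WORKFLOWS = {"search_google": search_google, "crawl_site": crawl_site}

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    history: list[str] = []
    for _ in range(max_steps):
        # The agent chooses which action (workflow) to take next, given past results.
        choice = call_llm(
            f"Goal: {goal}\nResults so far: {history}\n"
            f"Reply with one of {list(WORKFLOWS)} plus an argument, or DONE."
        )
        if choice.strip().upper().startswith("DONE"):
            break
        name, _, arg = choice.partition(" ")
        history.append(WORKFLOWS[name](arg))  # the result feeds the next decision
    return history
```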
[00:57] (57.80s)
Now these building blocks I spoke about, they're
[00:59] (59.36s)
very similar to concepts you already
[01:01] (61.40s)
know if you know anything about
[01:02] (62.64s)
programming you have prompt chaining
[01:04] (64.72s)
which is the same as doing multiple
[01:06] (66.44s)
function calls with optional error
[01:08] (68.36s)
handling. Evaluator-optimizer is just a
[01:11] (71.24s)
loop. Routing is like parallel, concurrent,
[01:14] (74.60s)
async programming which helps improve
[01:16] (76.76s)
your performance by running multiple
[01:18] (78.24s)
things at once and orchestration and
[01:20] (80.36s)
synthesis is basically what data engineers
[01:22] (82.28s)
do: you take large data sets and you
[01:24] (84.00s)
transform them into a more useful
[01:25] (85.72s)
structured format.
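As a rough illustration of prompt chaining being just sequential calls, here's a minimal sketch; call_llm is again a hypothetical stand-in for whatever model you're calling.

```python
# Prompt-chaining sketch: each LLM call's output feeds the next call,
# with an optional check ("gate") in between. call_llm() is a hypothetical helper.

def call_llm(prompt: str) -> str:
    raise NotImplementedError  # plug in any model here

def chained_summary(topic: str) -> str:
    outline = call_llm(f"Write a 3-point outline about {topic}.")
    if len(outline.splitlines()) < 3:  # optional error handling between steps
        raise ValueError("Outline too short; retry or fall back")
    draft = call_llm(f"Expand this outline into a short summary:\n{outline}")
    return call_llm(f"Tighten the wording of this summary:\n{draft}")
```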
[01:27] (87.76s)
As we can see, there's a big catch with these
[01:29] (89.92s)
types of workflows and systems: they often trade
[01:31] (91.84s)
latency and cost for better task
[01:33] (93.96s)
performance so agents are expensive in
[01:36] (96.12s)
time and money; they take a while to run
[01:37] (97.76s)
because you're doing all these back-to-back
[01:39] (99.28s)
tasks and you're doing tons of LLM calls,
[01:41] (101.92s)
maybe recursively in a loop and feeding
[01:44] (104.12s)
in large context because your agent
[01:45] (105.96s)
needs to understand the past actions
[01:48] (108.20s)
that it's already taken but and this is
[01:50] (110.08s)
super interesting, there is a way
[01:52] (112.08s)
completely around this and it's why I
[01:53] (113.68s)
bought the new laptop it allows you to
[01:55] (115.84s)
run LLMs on your computer for free and a
[01:58] (118.52s)
lot faster. Everyone is saying it: it's
[02:00] (120.40s)
not AI that's going to take your job
[02:02] (122.04s)
it's someone that knows AI better than
[02:03] (123.76s)
you do if that's true at all what we're
[02:05] (125.44s)
doing here learning is absolutely key
[02:08] (128.20s)
for securing your future career in other
[02:10] (130.16s)
words you want to thrive you got to
[02:11] (131.40s)
learn this stuff as much as possible now
[02:13] (133.44s)
the most serious courses that I've come
[02:15] (135.20s)
across when I was searching around
[02:17] (137.08s)
trying to learn are Simplilearn's AI and
[02:19] (139.68s)
machine learning courses give me just a
[02:21] (141.68s)
minute, because if you want to go deep I think
[02:23] (143.12s)
they're worth checking out and just a
[02:24] (144.84s)
heads up this video is sponsored by
[02:26] (146.88s)
Simplilearn. One of the best ones I came
[02:28] (148.84s)
across was the Microsoft-backed AI
[02:30] (150.84s)
engineer course because it covers
[02:32] (152.68s)
everything from generative AI to deep
[02:34] (154.72s)
learning prompt engineering and more
[02:36] (156.96s)
there's over 25 projects and a Capstone
[02:39] (159.96s)
so you'll walk away with a lot of
[02:41] (161.32s)
hands-on experience then there's
[02:42] (162.92s)
electives which really let you
[02:44] (164.52s)
specialize: advanced generative AI, NLP, and
[02:48] (168.56s)
even preparing for the Microsoft Azure
[02:50] (170.60s)
certification exam and in the end you
[02:52] (172.44s)
even get a certificate from Microsoft
[02:54] (174.28s)
and if you're curious about reviews 4.5
[02:56] (176.36s)
on SwitchUp, 4.4 on Career Karma; you can
[02:59] (179.24s)
check those out they've also got
[03:00] (180.52s)
financing options so I would encourage
[03:02] (182.20s)
you if this sounds interesting at all at
[03:03] (183.76s)
least check out Simplilearn's website,
[03:05] (185.76s)
evaluate some of the different courses
[03:07] (187.44s)
and this is a really structured way just
[03:09] (189.20s)
to get fully immersed in AI so if you're
[03:11] (191.64s)
interested, check out the pinned comment or
[03:13] (193.64s)
link in the description. Thanks again to
[03:15] (195.20s)
Simplilearn for the sponsorship. Back to the
[03:17] (197.52s)
video all you need to do is go to
[03:19] (199.76s)
ollama.com and you can download a bunch of
[03:22] (202.52s)
different ones for free so running
[03:24] (204.44s)
through this really quick you just go to
[03:26] (206.12s)
the models Tab and you can see a full
[03:28] (208.72s)
list here; it goes on and on. Now the
[03:31] (211.24s)
most important part when you're running
[03:32] (212.80s)
it locally your model size has to be
[03:35] (215.72s)
less than your GPU VRAM. So this
[03:39] (219.32s)
particular card, an RTX 4070, has 8 GB, so
[03:44] (224.32s)
I have to check how big is the model not
[03:47] (227.16s)
in terms of uh parameters like this one
[03:50] (230.12s)
has 70 billion but in terms of the
[03:52] (232.40s)
actual let's say file or uh trained
[03:55] (235.36s)
model size so I can go into for example
[03:58] (238.08s)
Llama 3.3; it's going to be too big, I
[04:00] (240.52s)
already know with the 70 and the 405 um
[04:04] (244.60s)
billion parameters so what I do is I
[04:06] (246.80s)
just go into tags and I can see like
[04:09] (249.12s)
yeah, they're 40, 49, 53 GB and so on. I
[04:12] (252.88s)
already have a few installed and again
[04:15] (255.24s)
you can install a new one just by
[04:17] (257.96s)
running this command and it will
[04:19] (259.72s)
immediately start running it but I can
[04:22] (262.04s)
uh show you which ones I have installed
[04:24] (264.08s)
so ollama
[04:25] (265.52s)
ls, and you'll see that I have Qwen; I
[04:29] (269.04s)
have Llama 3 (I
[04:30] (270.24s)
actually have a compressed version right
[04:31] (271.72s)
now, just because I was testing it) and
[04:34] (274.36s)
I have LLaVA, which is for image analysis.
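If you'd rather do this size check from a script than from the CLI, Ollama also exposes a local REST API (port 11434 by default); this is a rough sketch assuming the requests package is installed and an 8 GB card.

```python
# Sketch: list locally installed Ollama models (same data as `ollama ls`)
# and compare their on-disk size to your VRAM. 8 GB is an assumption here;
# note the loaded size in memory ends up somewhat larger than the file size.
import requests

VRAM_GB = 8  # adjust for your GPU

models = requests.get("http://localhost:11434/api/tags").json().get("models", [])
for m in models:
    size_gb = m["size"] / 1e9
    verdict = "should fit" if size_gb < VRAM_GB else "too big"
    print(f"{m['name']}: {size_gb:.1f} GB -> {verdict}")
```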
[04:38] (278.52s)
So these Qwen models, you'll notice
[04:40] (280.44s)
they're actually even the compressed
[04:42] (282.60s)
ones are under 8 GB, but actually when I
[04:45] (285.32s)
run it it is not fully using my GPU so
[04:49] (289.16s)
if I run the Qwen instruct Q2 model, let me
[04:53] (293.76s)
just show you what I
[04:55] (295.96s)
mean so once you run that you get a
[04:58] (298.08s)
command prompt and you can kind of test
[04:59] (299.44s)
the speed by typing in a prompt like hello,
[05:01] (301.84s)
and even for that really simple one
[05:03] (303.20s)
you'll see there was a bit of a delay.
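A rough way to measure that delay from a script, assuming Ollama's default local API; the model name below is only an example, so use whichever one you've actually pulled.

```python
# Sketch: time a simple "hello" prompt against a locally running Ollama model.
import time
import requests

payload = {"model": "llama3.2", "prompt": "hello", "stream": False}  # example model name
start = time.time()
resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=300)
print(f"{time.time() - start:.1f}s: {resp.json()['response'][:80]}")
```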
[05:05] (305.04s)
Now if I go to a new window and I run ollama
[05:07] (307.12s)
ps, I can see the reason is that
[05:10] (310.20s)
actually when this model is running the
[05:12] (312.24s)
size expands to 7.5 GB because it needs
[05:15] (315.08s)
a little bit of extra space and then on
[05:18] (318.04s)
top of that my GPU needs some extra
[05:20] (320.20s)
space to run normal processes so while
[05:22] (322.44s)
it's able to fit 91% into the GPU that's
[05:24] (324.92s)
still going to be a pretty big
[05:26] (326.48s)
performance hit because it has to
[05:27] (327.84s)
offload things to the CPU, which is just
[05:30] (330.04s)
like exponentially slower and it's what
[05:31] (331.96s)
you'd have to do if you don't have a GPU
[05:33] (333.72s)
locally so let's just kill this one and
[05:36] (336.64s)
we'll instead run the Llama
[05:40] (340.00s)
3.2 so I just copy that model
[05:43] (343.76s)
name just running it again because it
[05:45] (345.92s)
didn't fully stop the previous one and
[05:48] (348.24s)
now when we type hello boom instant
[05:50] (350.12s)
response that's what we want now if we
[05:51] (351.92s)
check the utilization, we can see
[05:54] (354.80s)
it's 100% GPU. It expanded to 5.4 GB, but
[05:58] (358.16s)
that's okay, as long as you have this
[06:00] (360.36s)
when you run ollama ps. Companies
[06:02] (362.72s)
like this Origami Agents; it's in Y
[06:05] (365.00s)
Combinator, and in the first month I think
[06:07] (367.00s)
they're doing $100K recurring revenue
[06:09] (369.56s)
already and let's just take a look at
[06:11] (371.16s)
this company and try to build a simple
[06:13] (373.24s)
version so they do business lead
[06:15] (375.32s)
generation with custom prompting and
[06:17] (377.36s)
their whole pitch is something I've
[06:18] (378.44s)
described in the past few videos you
[06:19] (379.84s)
have virtually unlimited unstructured
[06:21] (381.84s)
data on the web that maybe doesn't exist
[06:24] (384.40s)
in a database you have access to and if
[06:26] (386.44s)
you can extract and structure this in a
[06:28] (388.24s)
useful way this is a huge opportunity
[06:30] (390.76s)
even if you're building a very Niche
[06:32] (392.24s)
agent for a very specific industry so if
[06:34] (394.84s)
we scroll down we can see some of the
[06:36] (396.16s)
queries people are able to run find
[06:37] (397.64s)
WooCommerce store owners who sell
[06:39] (399.76s)
products covered by US health
[06:42] (402.12s)
insurance here's another one you can
[06:44] (404.24s)
visit the site if you want to see them
[06:45] (405.72s)
all but let's take a look at this one
[06:47] (407.88s)
specifically. So for this one, if we
[06:50] (410.24s)
think in terms of agentic patterns,
[06:52] (412.20s)
how would we actually
[06:53] (413.92s)
achieve this with
[06:55] (415.80s)
various workflows? To make this a bit
[06:57] (417.28s)
more concrete, we have the orchestrator
[06:58] (418.72s)
at the top which first chooses what
[07:00] (420.76s)
steps we want to run so first find
[07:03] (423.08s)
products second find stores and then
[07:05] (425.04s)
third, see if they're WooCommerce or not.
[07:07] (427.88s)
and within each step we'd have sub
[07:09] (429.16s)
workflows, so we might just have a
[07:10] (430.92s)
search Google workflow and a scrape-site
[07:14] (434.00s)
workflow to extract the information we need.
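Each of those sub-workflows can just be an ordinary function the orchestrator calls; here's a bare-bones sketch, with the search backend left out and all names hypothetical.

```python
# Sketch: sub-workflows as plain functions. The search implementation is omitted
# on purpose; crawl_site just fetches raw HTML for an LLM to extract fields from.
import requests

def search_google(query: str) -> list[str]:
    """Return result URLs for a query (plug in whatever search API or scraper you use)."""
    raise NotImplementedError

def crawl_site(url: str) -> str:
    """Fetch a page so a later LLM step can pull out the structured info we need."""
    return requests.get(url, timeout=10).text
```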
[07:16] (436.28s)
so I've actually coded a simple version
[07:17] (437.92s)
of an app that's very similar to what a
[07:20] (440.48s)
B2B agent like Origami would do, but I've
[07:23] (443.96s)
given you the choice here between
[07:25] (445.28s)
running it locally for free or using the
[07:27] (447.48s)
OpenAI API. But again, I'll emphasize
[07:30] (450.04s)
that agents specifically can do a
[07:32] (452.48s)
lot of LLM calls and consume a lot of tokens.
[07:34] (454.52s)
so you can see my component files map to
[07:36] (456.68s)
the architecture that we talked about
[07:38] (458.20s)
first I have the orchestrator which has
[07:40] (460.20s)
a lot of different methods and again if
[07:41] (461.68s)
you really want to get into it just
[07:43] (463.00s)
check out the code but at the highest
[07:45] (465.24s)
level first we're generating the tasks
[07:48] (468.04s)
upfront first do this then do this then
[07:50] (470.32s)
do this, and use sub-workflows to
[07:52] (472.24s)
accomplish them then another important
[07:54] (474.00s)
method we have a prompt to select the
[07:55] (475.96s)
next workflow from the workflow
[07:57] (477.96s)
definitions in these folders which are
[07:59] (479.84s)
basically the same as just programming
[08:01] (481.92s)
functions for our workflows.
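The project's actual files aren't reproduced here, but the flow just described, plan the tasks up front and then prompt the model to pick a workflow for each one, might look roughly like this sketch; every name in it is hypothetical.

```python
# Rough sketch of the orchestrator flow described above: plan tasks up front,
# then for each task prompt the model to pick one of the workflow definitions.
# call_llm() and the workflow stubs are hypothetical, not the actual project code.

def call_llm(prompt: str) -> str:
    raise NotImplementedError

def search_google(query: str) -> str: ...
def crawl_site(url: str) -> str: ...

WORKFLOWS = {"search_google": search_google, "crawl_site": crawl_site}

def orchestrate(user_request: str) -> list:
    # 1. Generate the tasks up front: "first do this, then do this, then do this".
    tasks = call_llm(f"Break this request into ordered tasks, one per line:\n{user_request}")
    results = []
    for task in tasks.splitlines():
        # 2. A prompt selects the next workflow from the available definitions.
        choice = call_llm(
            f"Task: {task}\nResults so far: {results}\n"
            f"Pick one workflow from {list(WORKFLOWS)} and give its input."
        )
        name, _, arg = choice.partition(" ")
        results.append(WORKFLOWS[name](arg))  # run the chosen sub-workflow
    return results
```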
[08:03] (483.68s)
I could have gone super generic and just written a
[08:05] (485.44s)
crawler with a custom prompt but I
[08:07] (487.56s)
wanted to make it a little bit more
[08:08] (488.80s)
specific, where I have one
[08:10] (490.92s)
just for search Google where I can
[08:12] (492.80s)
ensure that the search results are
[08:14] (494.44s)
getting crawled and extracted correctly
[08:16] (496.60s)
so I've also added a little bit of
[08:18] (498.24s)
custom scraping code here if we're doing
[08:20] (500.24s)
B2B having something that finds LinkedIn
[08:22] (502.36s)
profiles is very useful and I've
[08:25] (505.08s)
specified here that we want to just pull
[08:26] (506.84s)
the SERPs, or Google search results, for,
[08:29] (509.08s)
let's say, a name plus company name. Let's say we
[08:31] (511.40s)
do a prompt like find me all the coding
[08:33] (513.64s)
boot camp owners in the USA and send me
[08:36] (516.12s)
their name and their LinkedIn profile so
[08:38] (518.44s)
I can message them that would be an
[08:39] (519.88s)
example use case there and then of
[08:41] (521.60s)
course we have our generic crawl site so
[08:43] (523.68s)
we have these search results for Google
[08:45] (525.92s)
then our orchestrator will decide which
[08:48] (528.52s)
ones are worth visiting that might have
[08:50] (530.52s)
the information we want so I've run this
[08:52] (532.32s)
a few times already and I'll show you
[08:53] (533.72s)
the output I put in find 10 Facebook
[08:55] (535.64s)
software engineer names and their
[08:56] (536.84s)
LinkedIn profiles and we can see just
[08:58] (538.80s)
from this prompt the model completely
[09:00] (540.32s)
ran and extracted exactly what we need
[09:02] (542.52s)
we have name profile URL and position so
[09:05] (545.08s)
software engineer software engineer yeah
[09:07] (547.80s)
pretty much all software Engineers with
[09:09] (549.92s)
direct links let me just show you
[09:11] (551.52s)
another example of this let's say you
[09:12] (552.84s)
want to find Shopify apps to Market a
[09:15] (555.52s)
specific product to them all I have to
[09:18] (558.20s)
do is say find me 10 Shopify
[09:21] (561.44s)
apps, then find their founders/owners
[09:27] (567.68s)
and their
[09:30] (570.60s)
names and LinkedIn
[09:33] (573.72s)
URLs now I don't want this video to get
[09:35] (575.84s)
too long but let's just see what it does
[09:37] (577.80s)
first. So here our orchestrator has
[09:39] (579.68s)
broken it up into two key tasks first
[09:42] (582.20s)
find the Shopify apps and names from a
[09:43] (583.84s)
Google search then get founder names and
[09:45] (585.76s)
Linkedin URLs so it's going to complete
[09:47] (587.88s)
the first one before the second one then
[09:49] (589.60s)
it selected the search Google workflow
[09:51] (591.16s)
and we can see the search query here
[09:52] (592.84s)
that'll run for a little bit and then
[09:54] (594.08s)
we'll see that our step was complete
[09:55] (595.80s)
with a summary. So it found the first
[09:57] (597.36s)
Shopify app, which is named Klaviyo, and it
[09:59] (599.72s)
is now doing another search
[10:01] (601.60s)
query for Klaviyo founders' LinkedIn, and
[10:04] (604.12s)
so far it found two results now it's
[10:06] (606.20s)
doing the same for the other apps like Privy, and it's
[10:09] (609.24s)
just going down through the list of URLs
[10:11] (611.40s)
that we found so this is going to
[10:12] (612.84s)
continue to run let me just show you the
[10:14] (614.36s)
end output with that very simple prompt
[10:16] (616.28s)
that we put in in the beginning so here
[10:18] (618.00s)
we can see a result summary and it also
[10:19] (619.72s)
saved us a JSON file. Let's open that
[10:22] (622.40s)
file and first we can see it got all the
[10:25] (625.04s)
app names and URLs. We can also
[10:27] (627.68s)
look at the summary down here
[10:28] (628.84s)
Unfortunately, it didn't 100% understand
[10:30] (630.80s)
this because for certain apps it found
[10:32] (632.72s)
more than one person so of course you
[10:34] (634.12s)
could further refine the prompt that
[10:35] (635.52s)
you're inputting to get a different
[10:37] (637.48s)
result but let's just take a look at one
[10:39] (639.64s)
of these URLs to see if it's correct and
[10:41] (641.92s)
yeah for this one at least we got the
[10:43] (643.28s)
co-founder at Sprocket. Let's just check
[10:45] (645.32s)
another one, Tomer Tagrin for
[10:48] (648.32s)
Yotpo, and yeah, we got the CEO there.
[10:51] (651.52s)
so of course it's still not perfect I
[10:52] (652.84s)
coded this agent in about 1 day but you
[10:54] (654.92s)
can probably start to see the power of
[10:57] (657.04s)
composing things together doing custom
[10:58] (658.68s)
workflows and having a really solid
[11:00] (660.64s)
orchestrator that being said if you like
[11:02] (662.24s)
this video please leave a comment so I
[11:03] (663.92s)
can make more agent videos or AI videos
[11:07] (667.28s)
and with that being said I'll see you
[11:08] (668.44s)
guys in the next one