YouTube Deep Summary

Learn AI Agents - How they Work & Build Your Own

Aaron Jack • 11:10 minutes • YouTube

🤖 AI-Generated Summary:

Why Understanding AI Agents is Crucial for Programmers and Tech Enthusiasts in 2024

If you're in tech, whether you're a programmer building apps or simply interested in the latest trends, this may sound dramatic, but understanding AI agents is becoming essential. Y Combinator is saying the AI agent market could be 10 times bigger than SaaS. That was enough to convince me to switch to Windows for the first time and buy a new laptop.

What Are AI Agents?

When most people think of AI apps, they picture chatbots like ChatGPT. AI agents go beyond that. An article by Anthropic lays out a set of composable building blocks, much like patterns in programming. With these blocks you can build workflows, which augment your code, replace functions, and carry out a set of actions, or agents, which sit a level higher: they orchestrate your workflows, choosing which action to take and then deciding what to do next based on the result.

This is similar to concepts programmers already know:
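- Prompt chaining is like multiple function calls with optional error handling, where each call's output feeds the next.
- The evaluator-optimizer pattern is just a loop: generate a result, evaluate it, and retry until it is good enough.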
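- Routing and parallelization resemble dispatching and concurrent or async programming: send each input to the right handler, and run independent calls at once to improve performance.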
- Orchestration and synthesis resemble data engineering tasks where raw data is transformed into useful, structured formats.
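
The first three patterns above can be sketched in a few lines of Python. This is a minimal illustration, not code from the video or from Anthropic's article: `fake_llm` is a hypothetical stand-in that only echoes its prompt, and the function names and signatures are assumptions chosen for the example. Routing is shown here as a simple dispatch to one handler.

```python
from typing import Callable, Optional


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; it only echoes the prompt."""
    return f"[answer to: {prompt}]"


def prompt_chain(steps: list[str], llm: Callable[[str], str], text: str) -> str:
    """Prompt chaining: each step's output becomes the next step's input."""
    for step in steps:
        text = llm(f"{step}\n\n{text}")
    return text


def evaluator_optimizer(
    task: str,
    generate: Callable[[str, Optional[str]], str],
    evaluate: Callable[[str], tuple[bool, str]],
    max_rounds: int = 3,
) -> str:
    """Evaluator-optimizer: loop until the evaluator accepts or rounds run out."""
    draft = generate(task, None)
    for _ in range(max_rounds):
        accepted, feedback = evaluate(draft)
        if accepted:
            return draft
        draft = generate(task, feedback)  # retry using the evaluator's feedback
    return draft


def route(
    query: str,
    handlers: dict[str, Callable[[str], str]],
    classify: Callable[[str], str],
) -> str:
    """Routing: classify the input, then dispatch to one specialised handler."""
    return handlers[classify(query)](query)
```

In a real agent, `llm`, `generate`, `evaluate`, and `classify` would each wrap a model call; keeping them as plain callables makes the control flow easy to test without a model.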

The Trade-Off: Cost and Latency vs. Performance

AI agents excel at complex tasks, but there is a big catch: they trade latency and cost for better task performance. A single run can involve many calls to large language models (LLMs), sometimes recursively in a loop, and each call carries a large context because the agent has to keep track of every action it has already taken. The result is agents that are expensive in both time and money.
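
To see why keeping the full history gets expensive, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which every step re-sends the base prompt plus all earlier results; the token counts are made-up illustrative numbers, not measurements from the video.

```python
def total_input_tokens(steps: int, base_prompt: int, tokens_per_step: int) -> int:
    """Sum the input tokens sent over a run where each call re-sends the history."""
    total = 0
    context = base_prompt
    for _ in range(steps):
        total += context            # the whole history goes in as input again
        context += tokens_per_step  # this step's result is appended to the history
    return total


if __name__ == "__main__":
    # Hypothetical run: 10 steps, 500-token base prompt, 300 tokens added per step.
    print(total_input_tokens(10, 500, 300))  # 18500
```

Under these assumptions, ten steps send 18,500 input tokens, compared with 5,000 if each call carried only the base prompt. The total grows roughly with the square of the number of steps.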

Running LLMs Locally: A Game Changer

Here's the exciting part: you can sidestep much of that cost and latency by running LLMs directly on your own machine, which is what prompted my laptop upgrade. Ollama (ollama.com) offers free downloads of many models you can run locally, as long as the model fits in your GPU's VRAM. What matters is the size of the model file rather than the parameter count alone, and a running model takes up more memory than the file does. My RTX 4070 has 8 GB of VRAM, so I use compressed or smaller models such as Llama 3.2, which runs entirely on the GPU and responds instantly.

Running models locally reduces dependency on paid APIs and speeds up development, but you must manage hardware limitations and model sizes carefully.
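
A quick way to reason about whether a model will fit is sketched below. It assumes the common rule of thumb that the weights take about parameters × bits per weight ÷ 8 bytes, which is only a lower bound because real model files and running models need extra space. The 1 GB `reserved_gb` default is an assumption for illustration. The 7.5 GB and 5.4 GB loaded sizes are the ones reported in the transcript for an 8 GB card.

```python
def estimate_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Lower-bound size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_vram(loaded_gb: float, vram_gb: float, reserved_gb: float = 1.0) -> bool:
    """True if the loaded model plus memory reserved for other GPU work fits."""
    return loaded_gb + reserved_gb <= vram_gb


if __name__ == "__main__":
    print(estimate_weights_gb(70, 4))  # 35.0: a 70B model at 4 bits is far over 8 GB
    print(fits_in_vram(7.5, 8.0))      # False: spills over and offloads to the CPU
    print(fits_in_vram(5.4, 8.0))      # True: runs fully on the GPU
```

The estimate is useful only for ruling models out quickly. The size a model actually occupies once loaded is what decides whether it runs entirely on the GPU.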

Real-World Applications: Building a B2B Agent for Lead Generation

One promising use case for AI agents is business lead generation. Origami Agents, a Y Combinator-backed startup, reportedly reached about $100K in recurring revenue in its first month with an AI agent that queries unstructured web data to find niche leads, such as WooCommerce store owners who sell products covered by US health insurance.

I built a simplified version of this kind of agent to demonstrate the power of AI workflows:
- Orchestrator: Coordinates the sequence of tasks like finding products, finding stores, and verifying store types.
- Workflows: Include Google search scraping, extracting LinkedIn profiles, and crawling websites.
- Prompt engineering: Custom prompts help target specific queries, like “Find 10 Facebook software engineer names with LinkedIn profiles” or “Find Shopify app founders and their LinkedIn URLs.”

The agent runs these workflows sequentially, scraping and structuring data into JSON files. While not perfect, it already produces valuable, actionable data with minimal manual effort.
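
The overall shape of such an agent can be sketched as follows. This is not the author's code: the workflow functions are stubs that return placeholder rows, and `plan_tasks` and `select_workflow` use simple string rules where the video describes LLM prompts. All names here are hypothetical.

```python
import json
from typing import Callable, Optional

Workflow = Callable[[str], list[dict]]


def search_google(task: str) -> list[dict]:
    """Stub workflow; a real one would fetch and parse search results."""
    return [{"task": task, "source": "search_google"}]


def find_linkedin(task: str) -> list[dict]:
    """Stub workflow; a real one would look up profile URLs."""
    return [{"task": task, "source": "find_linkedin"}]


def crawl_site(task: str) -> list[dict]:
    """Stub workflow; a real one would crawl a page and extract fields."""
    return [{"task": task, "source": "crawl_site"}]


WORKFLOWS: dict[str, Workflow] = {
    "search_google": search_google,
    "find_linkedin": find_linkedin,
    "crawl_site": crawl_site,
}


def plan_tasks(prompt: str) -> list[str]:
    """Stand-in for the planning prompt: split the request on ' then '."""
    return [part.strip() for part in prompt.split(" then ") if part.strip()]


def select_workflow(task: str) -> str:
    """Stand-in for the workflow-selection prompt: pick by keyword."""
    lowered = task.lower()
    if "linkedin" in lowered:
        return "find_linkedin"
    if "http" in lowered:
        return "crawl_site"
    return "search_google"


def run_agent(prompt: str, out_path: Optional[str] = None) -> dict:
    """Run each planned task in order and optionally save the report as JSON."""
    steps = []
    for task in plan_tasks(prompt):
        name = select_workflow(task)
        steps.append({"task": task, "workflow": name, "rows": WORKFLOWS[name](task)})
    report = {"prompt": prompt, "steps": steps}
    if out_path is not None:
        with open(out_path, "w", encoding="utf-8") as f:
            json.dump(report, f, indent=2)
    return report
```

Calling `run_agent("Find 10 Shopify apps then find their founders and LinkedIn URLs")` plans two tasks and runs them in order. Swapping the string rules for model calls would leave the surrounding loop unchanged.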

Why You Should Care and How to Get Started

AI agents are rapidly changing how software is built and how automation works. If you don’t learn these concepts, you risk being left behind.

To go deeper, a structured AI and machine learning course can help. Simplilearn, which sponsored the video, offers several, including a Microsoft-backed AI Engineer course that covers generative AI, deep learning, prompt engineering, and more. The program includes more than 25 hands-on projects and a capstone, a certificate from Microsoft, and financing options.

Check out their offerings if you want a structured and comprehensive path into AI.

Final Thoughts

  • AI agents represent the next frontier beyond chatbots.
  • They enable complex, multi-step automation by orchestrating workflows.
  • Running LLMs locally can save cost and improve speed but requires suitable hardware.
  • Building custom agents can unlock powerful business applications like lead generation.
  • Learning AI agent development is crucial for future-proofing your career.

If you want to explore AI agents yourself, start by experimenting with local LLMs using a tool like Ollama, then move on to building simple orchestrators and workflows. If you want to see my agent code or have questions, leave a comment. I'd love to share more!


Stay ahead in tech by mastering AI agents—the future has never looked more exciting.

Links & Resources:
- Ollama model library: https://ollama.com
- Simplilearn AI Engineer Course: [Link in Description]
- Origami Agents (Y Combinator startup): https://origami.agents


Thanks for reading! If you enjoyed this post, share it with your network and subscribe for more AI insights.


📝 Transcript (312 entries):

This is going to sound dramatic, but if you're in tech, if you're a programmer building apps or just interested, I really think if you don't understand this then you're going to get completely left behind in the coming years. Y Combinator is saying this is going to be 10 times bigger than SaaS. It actually led to me switching to Windows for the first time ever and buying a new laptop.

When you think of AI apps you probably think of, okay, ChatGPT, OpenAI, but once or twice you might have heard of AI agents, and this article by Anthropic lays it out in a really solid way. You can think of agents as composable building blocks, similar to patterns in programming. With these building blocks you can either build workflows, which augment your code and replace functions and can do a set of actions, or you have agents, which are a level higher: they orchestrate your workflows, your functions. So basically they choose what actions to take, and then based on the result of that action they choose what to do next.

Now these building blocks I spoke about, they're very similar to concepts you already know if you know anything about programming. You have prompt chaining, which is the same as doing multiple function calls with optional error handling. Evaluator-optimizer, it's just a loop. Routing is like parallel, concurrent, async programming, which helps improve your performance by running multiple things at once. And orchestration and synthesis is basically what data engineers do: you take large data sets and you transform it into a more useful, structured format.

As we can see, there's a big catch with these types of workflows and systems: they often trade latency and cost for better task performance. So agents are expensive in time and money. They take a while to run because you're doing all these back-to-back tasks and you're doing tons of LLM calls, maybe recursively in a loop, and feeding in large context, because your agent needs to understand the past actions that it's already taken. But, and this is super interesting, there is a way completely around this, and it's why I bought the new laptop: it allows you to run LLMs on your computer for free and a lot faster.

Everyone is saying it: it's not AI that's going to take your job, it's someone that knows AI better than you do. If that's true at all, what we're doing here, learning, is absolutely key for securing your future career. In other words, you want to thrive? You've got to learn this stuff as much as possible. Now the most serious courses that I've come across when I was searching around trying to learn are Simplilearn's AI and machine learning courses. Give me just a minute, because if you want to go deep I think they're worth checking out, and just a heads up, this video is sponsored by Simplilearn. One of the best ones I came across was the Microsoft-backed AI Engineer course, because it covers everything from generative AI to deep learning, prompt engineering, and more. There's over 25 projects and a capstone, so you'll walk away with a lot of hands-on experience. Then there's electives, which really let you specialize: advanced generative AI, NLP, and even preparing for the Microsoft Azure certification exam. And in the end you even get a certificate from Microsoft. If you're curious about reviews, 4.5 on SwitchUp, 4.4 on Career Karma, you can check those out. They've also got financing options. So I would encourage you, if this sounds interesting at all, at least check out Simplilearn's website, evaluate some of the different courses, and this is a really structured way just to get fully immersed in AI. So if you're interested, check out the pinned comment or link in description. Thanks again to Simplilearn for the sponsor. Back to the video.

All you need to do is go to ollama.com and you can download a bunch of different ones for free. So running through this really quick, you just go to the Models tab and you can see a full list here; it goes on and on. Now, most important part: when you're running it locally, your model size has to be less than your GPU VRAM. So this particular card, RTX 4070, it has 8 GB, so I have to check how big the model is, not in terms of parameters, like this one has 70 billion, but in terms of the actual, let's say, file or trained model size. So I can go into, for example, Llama 3.3. It's going to be too big, I already know, with the 70 and the 405 billion parameters. So what I do is I just go into tags and I can see, like, yeah, they're 40, 49, 53, and so on.

I already have a few installed, and again you can install a new one just by running this command and it will immediately start running it. But I can show you which ones I have installed, so ollama ls, and you'll see that I have Qwen, I have Llama 3, I actually have a compressed version right now just because I was testing it, and I have LLaVA, which is image analysis. So these Qwen models, you'll notice, even the compressed ones are under eight, but actually when I run it, it is not fully using my GPU. So if I run Qwen instruct Q2, let me just show you what I mean. Once you run that you get a command prompt, and you can kind of test the speed by typing in a command, hello, and even for that really simple one you'll see there was a bit of a delay.

Now if I go to a new window and I do ollama ps, I can see the reason is because, actually, when this model is running the size expands to 7.5 GB, because it needs a little bit of extra space, and then on top of that my GPU needs some extra space to run normal processes. So while it's able to fit 91% into the GPU, that's still going to be a pretty big performance hit, because it has to offload things to the CPU, which is just, like, exponentially slower, and it's what you'd have to do if you don't have a GPU locally. So let's just kill this one and we'll instead run the Llama 3.2. So I just copy that model name, just running it again because it didn't fully stop the previous one, and now when we type hello, boom, instant response. That's what we want. Now if we check the utilization we can see it's 100% GPU. It expanded to 5.4, but that's okay, as long as you have this when you run ollama ps.

Companies like this, Origami Agents, it's in Y Combinator, in the first month I think they're doing 100K recurring revenue already, and let's just take a look at this company and try to build a simple version. So they do business lead generation with custom prompting, and their whole pitch is something I've described in the past few videos: you have virtually unlimited unstructured data on the web that maybe doesn't exist in a database you have access to, and if you can extract and structure this in a useful way, this is a huge opportunity, even if you're building a very niche agent for a very specific industry. So if we scroll down we can see some of the queries people are able to run: find WooCommerce store owners who sell products covered by US health insurance. Here's another one. You can visit the site if you want to see them all, but let's take a look at this one specifically.

So for this one, if we think in terms of agentic patterns, how would we actually achieve this with various workflows? To make this a bit more concrete, we have the orchestrator at the top, which first chooses what steps we want to run: so first find products, second find stores, and then third see if they're WooCommerce or not. And within each step we'd have sub-workflows, so we might just have a search Google workflow, a scrape article site to extract the information we need.

So I've actually coded a simple version of an app that's very similar to what a B2B agent like Origami would do, but I've given you the choice here between running it locally for free or using the OpenAI API. But again, I'll emphasize that agents specifically, they can do a lot of LLM calls, consume a lot of tokens. So you can see my component files map to the architecture that we talked about. First I have the orchestrator, which has a lot of different methods, and again, if you really want to get into it just check out the code, but at the highest level, first we're generating the tasks upfront: first do this, then do this, then do this, and use sub-workflows to accomplish them. Then, another important method, we have a prompt to select the next workflow from the workflow definitions in these folders, which are basically the same as just programming functions.

For our workflows, I could have gone super generic and just written a crawler with a custom prompt, but I wanted to make it a little bit more specific where I can, where I have one just for search Google, where I can ensure that the search results are getting crawled and extracted correctly, so I've also added a little bit of custom scraping code here. If we're doing B2B, having something that finds LinkedIn profiles is very useful, and I've specified here that we want to just pull the SERPs, or Google search results, for, let's say, name, company name. Let's say we do a prompt like: find me all the coding boot camp owners in the USA and send me their name and their LinkedIn profile so I can message them. That would be an example use case there. And then of course we have our generic crawl site. So we have these search results for Google, then our orchestrator will decide which ones are worth visiting that might have the information we want.

So I've run this a few times already and I'll show you the output. I put in: find 10 Facebook software engineer names and their LinkedIn profiles. And we can see, just from this prompt, the model completely ran and extracted exactly what we need. We have name, profile URL, and position, so software engineer, software engineer, yeah, pretty much all software engineers with direct links. Let me just show you another example of this. Let's say you want to find Shopify apps to market a specific product to them. All I have to do is say: find me 10 Shopify apps, then find their founders/owners and their names and LinkedIn URLs. Now I don't want this video to get too long, but let's just see what it does first.

So here our orchestrator has broken it up into two key tasks: first find the Shopify apps and names from a Google search, then get founder names and LinkedIn URLs. So it's going to complete the first one before the second one. Then it selected the search Google workflow, and we can see the search query here. That'll run for a little bit, and then we'll see that our step was complete with a summary. So it found the first Shopify app, which is named Klaviyo, and it's doing another search query for Klaviyo founders LinkedIn, and so far it found two results. Now it's doing the same for oero, Privy, and it's just going down through the list of URLs that we found. So this is going to continue to run.

Let me just show you the end output with that very simple prompt that we put in in the beginning. So here we can see a result summary, and it also saved us a JSON file. Let's open that file, and first we can see it got all the app names and URLs, and we can also look at the summary down here. Unfortunately it didn't 100% understand this, because for certain apps it found more than one person, so of course you could further refine the prompt that you're inputting to get a different result. But let's just take a look at one of these URLs to see if it's correct, and yeah, for this one at least we got the co-founder at Sprocket. Let's just check another one, Tomer Tagrin for Yotpo, and yeah, so we got the CEO there.

So of course it's still not perfect, I coded this agent in about 1 day, but you can probably start to see the power of composing things together, doing custom workflows, and having a really solid orchestrator. That being said, if you like this video please leave a comment so I can make more agent videos or AI videos, and with that being said, I'll see you guys in the next one.