
Emergent Garden

The Chaos of AI Agents

Overview

The video explores the use of AI agents—command-line chatbots that can control computers autonomously—to create generative art through code. The creator experiments with different AI models like Claude and Gemini, letting them self-direct their creative processes, collaborate, and continuously modify outputs, while reflecting on the challenges, costs, and potential of such autonomous AI agents.

Main Topics Covered

  • Introduction to AI command-line agents controlling computers
  • Differences between AI agents: Claude, Gemini, and OpenAI’s Codex
  • Generative art creation via AI-written code and image feedback loops
  • Autonomous AI behavior vs. guided AI coding
  • Multi-agent collaboration and communication challenges
  • Concept of AI “role-playing” and creativity
  • Experimentation with evolutionary art refinement
  • Limitations and hallucinations in AI self-assessment
  • Cost considerations of running AI agents extensively
  • Reflections on the future potential and current limitations of AI agents

Key Takeaways & Insights

  • AI agents can autonomously generate code that creates images, then analyze those images to iteratively improve their output without human intervention (this loop is sketched after the list).
  • Claude and Gemini are better suited for image-based feedback loops since they can read image files; Codex lacks this capability.
  • AI agents tend to shortcut open-ended tasks by generating a single script to endlessly create random images, which conflicts with the goal of active iterative creativity.
  • Multi-agent collaboration is currently chaotic and error-prone, with agents overwriting each other’s work and failing to maintain coherence.
  • AI models fundamentally operate as advanced next-token predictors, which is powerful but different from human intelligence. Their “role-playing” ability allows them to simulate creative personas.
  • Agents often produce grandiose, overblown descriptions and invented statistics, reflecting a lack of self-awareness and critical reflection.
  • Running these AI agents, especially with more capable models like Claude Opus, is expensive.
  • Current AI agents excel at clear, well-defined coding tasks with human oversight but struggle with truly open-ended, creative, and autonomous projects.
  • Multi-agent communication and coordination require more than just smart prompting; fundamental model improvements are needed.
  • The ideal vision of a “country of genius AI agents” working together remains distant.
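
The image feedback loop above is the core mechanic of the video. Here is a minimal sketch of how such a loop can be wired up, assuming a hypothetical `ask_model` helper that wraps whichever image-capable model API you use (Claude or Gemini); the agents in the video drive this loop themselves from the command line rather than through a script like this:

```python
import subprocess
from pathlib import Path

def ask_model(prompt: str, image_path: Path | None = None) -> str:
    """Hypothetical wrapper around an image-capable model API (Claude or
    Gemini). Plug in your provider's SDK here; for image input, encode the
    file and attach it to the request."""
    raise NotImplementedError("wire this to your model provider")

def feedback_loop(iterations: int = 5) -> None:
    # Ask the model for an initial generative-art script.
    script = ask_model("Write a Python script that draws a generative image "
                       "and saves it as out.png. Reply with code only.")
    for _ in range(iterations):
        Path("draw.py").write_text(script)
        # Best run inside an isolated VM or container: model-written code
        # can hog resources and freeze the host (see Warnings below).
        subprocess.run(["python", "draw.py"], check=True, timeout=60)
        # Feed the rendered image back so the model can critique and revise.
        script = ask_model(
            "Here is the image your last script produced. Critique it and "
            "reply with an improved version of the script, code only.",
            image_path=Path("out.png"),
        )
```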

Actionable Strategies

  • Use AI agents that can read and write files, including images, to enable iterative feedback loops in creative projects.
  • Implement selection or evolutionary steps where the AI chooses preferred outputs and generates variations to promote refinement.
  • Run AI agents in isolated virtual environments to prevent system crashes and resource overuse.
  • Facilitate communication between multiple agents by creating shared text files for messaging, with mechanisms for conflict resolution like file locking or retrying (a sketch follows this list).
  • Save intermediate outputs regularly to avoid losing work overwritten by autonomous agents.
  • Provide clear, carefully crafted prompts to guide AI agents effectively and discourage shortcuts.
  • Combine multiple agents cautiously, understanding the current limitations of coordination and potential for destructive interference.
  • Expect to manually review and touch up AI-generated outputs, especially for public-facing materials like thumbnails.
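
The shared text file was the only real coordination channel the agents had. Below is a rough sketch of one way to guard it, using a lock file acquired atomically with O_CREAT | O_EXCL; the specific locking scheme is my assumption, as the video only notes that write conflicts had to be handled somehow:

```python
import os
import time
from pathlib import Path

MESSAGES = Path("plan.txt")   # shared file all agents read and append to
LOCK = Path("plan.txt.lock")  # crude cross-process lock

def append_message(agent: str, text: str, retries: int = 50) -> None:
    """Append one line to the shared file under an exclusive lock so
    concurrent agents don't interleave or clobber each other's writes."""
    for _ in range(retries):
        try:
            # O_CREAT | O_EXCL is atomic: it fails if the lock file exists.
            fd = os.open(LOCK, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            time.sleep(0.1)  # another agent holds the lock; retry shortly
            continue
        try:
            with MESSAGES.open("a") as f:
                f.write(f"[{agent}] {text}\n")
            return
        finally:
            os.close(fd)
            LOCK.unlink()  # release the lock
    raise TimeoutError("could not acquire lock on shared message file")

def read_messages() -> list[str]:
    return MESSAGES.read_text().splitlines() if MESSAGES.exists() else []
```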

Specific Details & Examples

  • Claude Opus is described as probably the best but also the most expensive coding model; running it for a few hours cost around $34.
  • A full day of multiple Claude Sonnet instances (cheaper, faster, less capable) cost about $20.
  • Gemini was cheaper but subject to API usage limits; Google currently prices it artificially low.
  • The feedback loop involved generating an image via Python code, then reading the image to inform the next iteration.
  • An evolutionary refinement process was tested: generating two images, selecting the preferred one, and creating variations on it (see the sketch after this list).
  • A multi-agent city-building project involved four Claude Sonnet agents communicating via a shared plan.txt file; it produced a chaotic, incoherent image with alien-invasion themes.
  • Agents frequently created fanciful project names like “meta evolution engine” and “quantum field evolutionary organisms environment” but mostly produced random images or text.
  • Cuter outputs included little people and a dog, though elements sometimes floated unrealistically within the image.
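
The evolutionary step reduces to a select-and-mutate loop. A toy sketch under heavy assumptions: a small parameter "genome" stands in for the full scripts the agents actually evolved, and `model_prefers`, which would render both candidates and ask an image-capable model to pick one, is stubbed with a coin flip:

```python
import random
from dataclasses import dataclass, replace

@dataclass
class Genome:
    """Toy stand-in: the video's agents evolved whole scripts, but a few
    render parameters keep this sketch short."""
    hue: float
    density: float
    seed: int

def mutate(g: Genome) -> Genome:
    """Produce a variation of the preferred candidate."""
    return replace(
        g,
        hue=(g.hue + random.uniform(-0.1, 0.1)) % 1.0,
        density=max(0.0, g.density + random.uniform(-0.2, 0.2)),
        seed=random.randrange(1_000_000),
    )

def model_prefers(a: Genome, b: Genome) -> Genome:
    """Hypothetical judge: render both genomes to images, show them to an
    image-capable model, and return its pick. Stubbed with a coin flip."""
    return random.choice([a, b])

def evolve(generations: int = 20) -> Genome:
    best = Genome(hue=0.5, density=1.0, seed=0)
    for _ in range(generations):
        best = model_prefers(best, mutate(best))  # select, then vary again
    return best
```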

Warnings & Common Mistakes

  • AI agents often try to bypass open-ended tasks by creating scripts that loop infinitely rather than iteratively generating and critiquing outputs.
  • Running AI agents outside of virtual environments risks freezing or crashing the host machine due to heavy resource use.
  • Multiple agents working on the same files can overwrite and destroy each other's work without proper coordination.
  • AI agents tend to hallucinate or fabricate plausible-sounding but false information, including fake statistics and exaggerated descriptions of their own creativity.
  • Lack of self-reflection and critical assessment in AI outputs means users must remain skeptical and oversee results.
  • API limits and costs can constrain experimentation and scalability.
  • Open-ended creative tasks remain a challenge, revealing the gap between current AI capabilities and true general intelligence.

Resources & Next Steps

  • The video creator provides prompt files for the AI agents in the video description or on GitHub for viewers to reuse.
  • Patreon and Ko-fi pages are available to support the creator’s work and access additional interactive experiences like a Minecraft server with AI bots.
  • Viewers are encouraged to experiment with autonomous AI coding agents themselves, using virtual environments and multiple models like Claude and Gemini.
  • Future improvements may come from more advanced AI models better suited for multi-agent collaboration and open-ended creativity.
  • Monitoring ongoing developments in AI agent frameworks and multimodal capabilities (e.g., vision tools for Codex) is suggested.