[00:00] (0.00s)
When ChatGPT released image generation,
[00:01] (1.92s)
it took the world by storm. But besides
[00:03] (3.84s)
all the fun cartoons and stuff you can
[00:05] (5.52s)
make, it's actually very useful for
[00:07] (7.04s)
business use cases like stock photos,
[00:09] (9.36s)
product shots, and marketing material.
[00:11] (11.92s)
It's really good. And now it's also
[00:13] (13.28s)
available in the API, unlike Midjourney,
[00:15] (15.04s)
which doesn't offer an API at all. Now
[00:16] (16.48s)
you can do that with OpenAI and the
[00:18] (18.08s)
gpt-image-1 model. In
[00:20] (20.80s)
this video, I'm going to show you how to
[00:21] (21.92s)
use it. I'm going to show you all the
[00:23] (23.40s)
parameters, how it works, and what it
[00:26] (26.00s)
costs. Let's go. Using the
[00:27] (27.76s)
images API from OpenAI is actually very
[00:30] (30.08s)
easy. So once you have your OpenAI
[00:31] (31.76s)
client set up, you can just say images.
[00:34] (34.08s)
And from there you can have an edit
[00:35] (35.36s)
function or a generate function. So
[00:37] (37.12s)
let's start with the generate function.
[00:38] (38.40s)
And then you have to pass in your
[00:39] (39.60s)
generation options. So I put them all up
[00:41] (41.36s)
here. And the first one is the model. So
[00:43] (43.12s)
you want to use gpt-image-1. This is
[00:46] (46.08s)
the new one where they just figured
[00:47] (47.36s)
something out. It just works so much
[00:48] (48.88s)
better than DALL·E 2 or DALL·E 3. I mean,
[00:50] (50.88s)
you could try those. They work okay, but
[00:52] (52.80s)
this one's really next level. And I'll
[00:54] (54.40s)
show you all the parameters one by one
[00:55] (55.84s)
in this application I built to test the
[00:57] (57.76s)
API. First one would be the image size.
[00:59] (59.60s)
So by default it's automatic. So the AI
[01:01] (61.92s)
basically determines the resolution of
[01:03] (63.44s)
the image it's going to create but you
[01:05] (65.36s)
can override that. You can make it a
[01:06] (66.80s)
square image or you can make it
[01:08] (68.56s)
landscape or portrait mode. Those are
[01:10] (70.32s)
the three options you have if you want
[01:11] (71.60s)
to specify what you want. And then
[01:13] (73.12s)
number of images, which can go from
[01:15] (75.44s)
1 to 10. 10 is the max for this new
[01:18] (78.24s)
GPT image model. In terms of image
[01:20] (80.00s)
format, it defaults to PNG. You can also
[01:22] (82.72s)
change it to JPEG or WEBP format. One
[01:24] (84.96s)
thing to keep in mind here, so if you
[01:26] (86.40s)
use WEBP or PNG, you can actually make
[01:29] (89.60s)
the image transparent. JPEG doesn't
[01:32] (92.16s)
support transparency, but when you do
[01:33] (93.76s)
use JPEG, you can actually set the
[01:35] (95.84s)
compression you want and send it right
[01:38] (98.00s)
to the API, so you can control that. I'm
[01:39] (99.68s)
going to stick with PNG, which is the
[01:41] (101.04s)
default. Then for background type, you
[01:42] (102.72s)
can actually set it to be transparent.
[01:44] (104.64s)
And so you pass the parameter of
[01:46] (106.24s)
transparent, and then the API will try
[01:48] (108.16s)
to make the image background
[01:50] (110.28s)
transparent. So if you have an image of
[01:52] (112.24s)
a person, you can have the background be
[01:53] (113.92s)
transparent. So you can just put onto a
[01:55] (115.28s)
web page fairly easily. Next up is
[01:56] (116.88s)
image quality. The three accepted values
[01:58] (118.48s)
are high, medium, and low. And you
[02:00] (120.64s)
might be tempted to try out low. From my
[02:03] (123.36s)
experience, it gives you really bad
[02:04] (124.56s)
results. I'll show you that in a second.
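It's worth noting up front that quality, together with image size, is what drives the per-image price. A small lookup table makes the tradeoff concrete; the rates below are a snapshot of OpenAI's published per-image pricing and may be out of date, so treat them as illustrative and check the current pricing page:

```javascript
// Approximate per-image USD rates for gpt-image-1 by quality and size.
// These figures are a snapshot and may have changed; always check
// OpenAI's pricing page before budgeting real workloads.
const COST_PER_IMAGE_USD = {
  low:    { "1024x1024": 0.011, "1536x1024": 0.016, "1024x1536": 0.016 },
  medium: { "1024x1024": 0.042, "1536x1024": 0.063, "1024x1536": 0.063 },
  high:   { "1024x1024": 0.167, "1536x1024": 0.25,  "1024x1536": 0.25 },
};

// Estimate the cost of one API call that generates n images.
function estimateCostUSD(quality, size, n = 1) {
  return COST_PER_IMAGE_USD[quality][size] * n;
}
```

So a low-quality square image runs about a cent, while ten high-quality landscape images in a single call come to $2.50.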
[02:06] (126.08s)
The last one is content moderation. And
[02:08] (128.08s)
I always put this to low because I just
[02:09] (129.76s)
want as little moderation as possible,
[02:11] (131.60s)
but by default, it's actually automatic.
[02:13] (133.36s)
It'll probably be higher than that. So I
[02:15] (135.04s)
would always override that to be low.
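Putting those parameters together, a generate call looks roughly like this. This is a sketch: the prompt is made up, and the option names reflect the current images API, so double-check them against the OpenAI reference before relying on them:

```javascript
// Options for client.images.generate with gpt-image-1. Values shown are
// the ones discussed above; "auto" is the default for size, quality,
// background, and moderation.
const generationOptions = {
  model: "gpt-image-1",
  prompt: "An infographic of the bread-making process for a bakery",
  size: "1024x1024",     // or "1536x1024" (landscape), "1024x1536" (portrait), "auto"
  n: 1,                  // 1 to 10 images per call
  output_format: "png",  // "png" | "jpeg" | "webp"; transparency needs png or webp
  background: "auto",    // set to "transparent" for a transparent background
  quality: "medium",     // "low" | "medium" | "high" | "auto"
  moderation: "low",     // "auto" (default) or "low"
};

// With an OpenAI client in scope, the call itself is:
//   const result = await client.images.generate(generationOptions);
```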
[02:16] (136.88s)
So say I want to create an infographic
[02:18] (138.72s)
of the bread making process for my
[02:20] (140.80s)
bakery. So I'm going to set it to low
[02:22] (142.48s)
here. We'll turn the background type
[02:23] (143.84s)
back to auto. We'll just say generate
[02:26] (146.08s)
image. And from my experience, when you
[02:28] (148.08s)
do set it to low, a generation
[02:30] (150.32s)
is very cheap. So like for example here,
[02:32] (152.24s)
the estimated cost is only 1 cent. But
[02:34] (154.16s)
the problem is it always generates with
[02:35] (155.44s)
much lower quality. So it's like
[02:37] (157.68s)
basically unusable. In this case, I
[02:39] (159.20s)
actually got the words right, which I'm
[02:40] (160.56s)
a bit surprised by. Other times I did
[02:42] (162.16s)
this and didn't even get the words
[02:43] (163.44s)
right. They're all scrambled. But the
[02:45] (165.36s)
images themselves are very poor quality.
[02:47] (167.52s)
They're not usable. So we do have the
[02:49] (169.04s)
option. I guess you could use it just to
[02:50] (170.80s)
test that you got the connection working
[02:52] (172.16s)
and everything, but for an actual image,
[02:53] (173.52s)
I would never use it. Change that back
[02:55] (175.20s)
to medium. We'll try medium. And then
[02:57] (177.28s)
we'll set the background type now to
[02:59] (179.28s)
transparent. So you can see what that
[03:00] (180.88s)
looks like. So I'm going to say stock
[03:03] (183.12s)
photo of a realistic person with a hat
[03:05] (185.36s)
that says Recharge Landscaping on it.
[03:07] (187.28s)
There it is. But as you can see, didn't
[03:08] (188.56s)
get the letters totally right in the
[03:09] (189.92s)
hat. It got the recharge right, but it
[03:11] (191.52s)
didn't get the landscaping right. So
[03:12] (192.80s)
what I would do in this case is actually
[03:14] (194.24s)
upload an image of your logo as a
[03:16] (196.16s)
reference, and then I could use that to
[03:17] (197.76s)
put on the hat. You might have to try a
[03:19] (199.36s)
couple times to get the text right,
[03:20] (200.72s)
which is where the cost can kind of add
[03:22] (202.24s)
up. But what it did do was actually made
[03:23] (203.92s)
a transparent background. So if you see
[03:25] (205.52s)
the white in the background, if I
[03:27] (207.68s)
download that and open it up in my
[03:29] (209.36s)
preview here now, you can see that
[03:30] (210.88s)
actually is a perfectly
[03:32] (212.32s)
transparent background. So it's kind of
[03:34] (214.32s)
a handy feature to be able to pass in
[03:35] (215.92s)
that transparent background type. It
[03:37] (217.60s)
just saves you a few minutes from having
[03:38] (218.88s)
to go to a background remover app. I do
[03:41] (221.28s)
have an estimated cost to this
[03:43] (223.44s)
application. So based on the various
[03:45] (225.36s)
settings, it'll tell you the cost, which
[03:47] (227.44s)
does actually dramatically change based
[03:49] (229.52s)
on what you choose for this stuff. So
[03:51] (231.60s)
for example, you could say like a low
[03:53] (233.36s)
quality image with a square resolution
[03:57] (237.20s)
of 1024 x 1024, it only costs
[04:00] (240.20s)
1.1 cents. If we change that to landscape with
[04:03] (243.52s)
image quality high, now we go up to 25
[04:05] (245.92s)
cents. Basically 23 times more
[04:08] (248.08s)
expensive. And then obviously if you
[04:09] (249.76s)
bump up the number of images you're
[04:11] (251.04s)
producing, like to 10, now it's $2.50 for
[04:13] (253.76s)
one shot of the API. So you do have to
[04:15] (255.84s)
be a bit careful because you can spend a
[04:17] (257.12s)
lot of money with this thing. So then
[04:18] (258.32s)
back in the code it's actually really
[04:19] (259.52s)
easy. So you just have to make a backend
[04:21] (261.36s)
process. You can do this in Next.js and
[04:23] (263.28s)
have an automatic backend created. You
[04:25] (265.60s)
can do it in React using Express as a
[04:27] (267.28s)
backend. All kinds of different options.
[04:29] (269.28s)
But basically like I was showing you
[04:30] (270.40s)
before, all you have to do is say
[04:31] (271.44s)
images.generate then pass in your
[04:33] (273.12s)
options. And all those options I just
[04:35] (275.04s)
showed you they're all right here. So, I
[04:37] (277.76s)
put some documentation in the video
[04:39] (279.36s)
description so you can look at some
[04:40] (280.56s)
references, but it's really easy to use.
[04:42] (282.24s)
And then the images.edit is very much
[04:44] (284.40s)
the same. The only difference is now
[04:46] (286.48s)
instead of just passing in the prompt
[04:48] (288.40s)
and those parameters. So now you pass
[04:50] (290.24s)
all that, but additionally you pass in
[04:52] (292.24s)
what they call an image. So this is the
[04:54] (294.16s)
image you want to edit. So if I
[04:55] (295.84s)
just generated one like I did just then,
[04:57] (297.92s)
I could just pass that back in as an
[05:00] (300.08s)
image. But they also have a really
[05:01] (301.20s)
interesting use case here. They've taken
[05:02] (302.80s)
these four image files. So they have
[05:05] (305.12s)
like the body lotion, the soap, bubble
[05:07] (307.12s)
bath, etc. And they put it all together
[05:10] (310.24s)
into a bunch of image files. So they
[05:12] (312.80s)
have four of them here. So they packed
[05:14] (314.32s)
them up into an array and actually passed
[05:15] (315.76s)
the array in and said, "Create a lovely
[05:19] (319.28s)
gift basket with these four items in
[05:20] (320.96s)
it." And then generated a gift basket
[05:23] (323.20s)
and put all the items that it had picked
[05:24] (324.80s)
out of these other images right in.
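The shape of that multi-image edit call, sketched with hypothetical file names (in the real openai package, each entry of image would be an uploaded file object, for example built with its toFile helper, not a bare string):

```javascript
// Hypothetical reference images for the gift-basket example; in a real
// call each entry would be a file object, not a plain filename string.
const referenceImages = [
  "body-lotion.png",
  "soap.png",
  "bubble-bath.png",
  "candle.png",
];

const editOptions = {
  model: "gpt-image-1",
  // images.edit takes the same options as generate, plus the image(s) to edit
  image: referenceImages,
  prompt: "Create a lovely gift basket with these four items in it",
};

// With the client set up:
//   const result = await client.images.edit(editOptions);
```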
[05:27] (327.60s)
So, I can think of so many use cases for
[05:29] (329.12s)
something like this where you can just
[05:30] (330.64s)
take a bunch of different images or even
[05:32] (332.48s)
like take an image of your branding, put
[05:34] (334.32s)
it onto a person's t-shirt, put it on
[05:36] (336.24s)
their hat. And if you're interested in
[05:37] (337.44s)
AI, in particular, AI for software
[05:39] (339.12s)
development, make sure you subscribe to
[05:40] (340.40s)
my newsletter, AI Unleashed. It's
[05:42] (342.24s)
the first link in the description. I
[05:43] (343.60s)
hope to see you there. Thanks for
[05:45] (345.28s)
watching the video. I hope you're having
[05:46] (346.56s)
an amazing day. I'll talk to you in the